March 3, 2026

Open Access

A Demonstration of mLoRA: A Parallelism-Efficient LLM Fine-Tuning System

Key Points

mLoRA is a system for parallel, efficient fine-tuning of Large Language Models (LLMs) with Low-Rank Adaptation (LoRA).
Introduces LoRAPP, a zero-bubble pipeline parallelism mechanism that exploits the independence of LoRA adapters to maximize utilization across multiple GPUs.
Uses BatchLoRA, a custom operator that consolidates multiple LoRA tasks into batched matrix operations to reduce kernel launch overhead.
Includes a memory-aware task scheduler for efficient resource allocation.
Achieves 30–45% faster training than existing parallel methods.
Demonstrated on database tasks such as Text2SQL and LLM-based data preprocessing (LLM4DP).

Abstract

This paper presents a demonstration of mLoRA, a system for parallel and efficient fine-tuning of Large Language Models (LLMs) using Low-Rank Adaptation (LoRA). mLoRA introduces two core components: LoRAPP, a zero-bubble pipeline parallelism mechanism that leverages the independence of LoRA adapters to maximize GPU utilization across multiple GPUs, and BatchLoRA, a custom operator that consolidates multiple LoRA tasks into batched matrix operations to reduce kernel launch overhead. The system also includes a memory-aware task scheduler for efficient resource allocation. Demonstrated on database-related tasks including Text2SQL and LLM-based data preprocessing (LLM4DP), mLoRA achieves 30–45% faster training compared to existing parallel methods and has been deployed in production at AntGroup. This demo paper was submitted to the PVLDB 2025 Demo Track and serves as a companion to the full research paper accepted at VLDB 2025.
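
The idea behind consolidating several LoRA tasks into batched matrix operations can be illustrated with a small NumPy sketch. This is not the BatchLoRA operator itself: the function names, tensor shapes, and the simplification that every task shares the same batch size and rank are assumptions made for illustration. It shows one shared frozen weight `w` applied to all tasks in a single call, with the per-task low-rank factors `a` and `b` stacked along a leading task dimension, and it checks the result against a per-task loop.

```python
import numpy as np


def batched_lora_forward(x, w, a, b, scale=1.0):
    """Forward pass for T LoRA tasks sharing one frozen base weight.

    Hypothetical shapes (assumed, not taken from mLoRA):
      x: (T, n, d_in)   inputs, one mini-batch of n rows per task
      w: (d_in, d_out)  frozen base weight shared by all tasks
      a: (T, d_in, r)   per-task LoRA down-projection
      b: (T, r, d_out)  per-task LoRA up-projection
    """
    base = x @ w            # w is broadcast across the task dimension
    delta = (x @ a) @ b     # two batched matmuls cover every task at once
    return base + scale * delta


def looped_lora_forward(x, w, a, b, scale=1.0):
    """Reference version: one set of matmuls per task."""
    outputs = [
        x[t] @ w + scale * (x[t] @ a[t]) @ b[t]
        for t in range(x.shape[0])
    ]
    return np.stack(outputs)


rng = np.random.default_rng(0)
num_tasks, n, d_in, d_out, rank = 3, 4, 8, 6, 2

x = rng.standard_normal((num_tasks, n, d_in))
w = rng.standard_normal((d_in, d_out))
a = rng.standard_normal((num_tasks, d_in, rank))
b = rng.standard_normal((num_tasks, rank, d_out))

y_batched = batched_lora_forward(x, w, a, b, scale=0.5)
y_looped = looped_lora_forward(x, w, a, b, scale=0.5)

# Both paths compute the same values; the batched one issues fewer calls.
print(y_batched.shape, np.allclose(y_batched, y_looped))
```

On a GPU, each call in the looped version would correspond to a separate kernel launch per task, which is the overhead that batching is meant to reduce. A real operator would also need to handle tasks with different batch sizes, sequence lengths, and ranks, which this sketch does not attempt.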

Authors

Zelong Huang

Zhengmao Ye

Salma Filali

Institutions

Cornell University

Sichuan University

The University of Texas at Arlington
