Feature request
This request proposes integrating Lily (Low-Rank Interconnected Adaptation across Layers), accepted to ACL 2025 Findings, into the PEFT library.
Paper: https://arxiv.org/pdf/2407.09946
Repo: https://github.com/yibozhong/lily
Motivation
Lily aims to increase the rank of each individual adapter under the same parameter budget, since many papers have shown that higher ranks benefit PEFT performance. It achieves this by breaking LoRA's one-A-B-pair-per-layer constraint: instead of giving each layer a dedicated pair of A and B, all Bs are decoupled from the layers, and when adapting at a given layer, a weighted sum of these shared Bs serves as that layer's B. The weights are computed by a lightweight trainable router, which is currently data-dependent.
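For concreteness, here is a minimal PyTorch sketch of what such an adapted linear layer could look like, based on the description above. The class and attribute names (`LilyLinearSketch`, `lily_A`, `shared_bs`, `router`) are illustrative assumptions, not the actual implementation from the paper or repo.

```python
import torch
import torch.nn as nn


class LilyLinearSketch(nn.Module):
    """Illustrative Lily-style adapted linear layer (not the official implementation)."""

    def __init__(self, base_layer: nn.Linear, shared_bs: nn.ParameterList, rank: int, scaling: float = 1.0):
        super().__init__()
        self.base_layer = base_layer
        # Layer-local down-projection A (in the paper, As are shared layer-wise).
        self.lily_A = nn.Parameter(torch.randn(base_layer.in_features, rank) * 0.01)
        # Pool of up-projections B_1..B_K shared across all adapted layers.
        self.shared_bs = shared_bs  # each of shape (rank, out_features)
        # Lightweight router producing weights over the shared Bs.
        self.router = nn.Linear(base_layer.in_features, len(shared_bs), bias=False)
        self.scaling = scaling

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.router(x), dim=-1)      # (..., K), data-dependent
        hidden = x @ self.lily_A                              # (..., rank)
        b_stack = torch.stack(list(self.shared_bs), dim=0)    # (K, rank, out_features)
        # The weighted sum of the shared Bs acts as this layer's B, per token.
        up = torch.einsum("...k,krd->...rd", weights, b_stack)
        delta = torch.einsum("...r,...rd->...d", hidden, up)
        return self.base_layer(x) + self.scaling * delta


# Example wiring: one shared pool of B matrices reused by every adapted layer.
shared_bs = nn.ParameterList([nn.Parameter(torch.randn(8, 64) * 0.01) for _ in range(4)])
layer = LilyLinearSketch(nn.Linear(64, 64), shared_bs, rank=8)
out = layer(torch.randn(2, 10, 64))
```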
Several points worth noting:
- The method is structurally somewhat similar to MosLoRA, but it operates at the model level, and its aim is to increase the individual rank of each adapter through dynamic adaptation.
- The paper currently uses a data-dependent router, which makes it tricky to merge the weights. I have not observed notable inference latency, possibly due to the small model sizes, but an option for a non-data-dependent router could be included to enable easy weight merging (see the sketch after this list).
- The As are currently still tied to fixed layers (using layer-wise sharing to reduce parameters). However, they can also be decoupled, simply by providing two routers that weight the As and Bs respectively, rather than the single router for B in the current setup. This is a more elegant design and follows the same principle as Lily. Once I run quick experiments demonstrating its effectiveness, I can integrate this setup into my current code as Lily v2.
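To illustrate the merging point, here is a rough sketch of how a merge could work if the router were non-data-dependent. The function and argument names are hypothetical and assume the layer layout from the sketch above, with `router_logits` being plain learnable parameters rather than a function of the input.

```python
import torch


@torch.no_grad()
def merge_lily_weights(base_weight, lily_A, shared_bs, router_logits, scaling=1.0):
    """Hypothetical merge path assuming a non-data-dependent router.

    With input-independent router logits, the update collapses to a static
    low-rank matrix that can be folded into the base weight, like a LoRA merge.
    """
    weights = torch.softmax(router_logits, dim=-1)            # (K,)
    b_stack = torch.stack(list(shared_bs), dim=0)             # (K, rank, out_features)
    merged_B = torch.einsum("k,krd->rd", weights, b_stack)    # (rank, out_features)
    delta_w = (lily_A @ merged_B).T                           # (out_features, in_features)
    return base_weight + scaling * delta_w
```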
Your contribution
Implement Lily (reference implementation: https://github.com/yibozhong/lily).