This repository contains the code for the paper:
Title: The Surprising Effectiveness of Randomness in LLM Pruning
Authors: Shuyao Xu, Jiayao Liu, Zhenfeng He, Cheng Peng, Weidi Xu
Conference: ICLR 2025 Workshop on Sparsity in LLMs (SLLM)
Paper URL: https://openreview.net/forum?id=YncWrbIxnN
This paper investigates structured pruning of LLMs. We find that random pruning is a surprisingly effective baseline at lower pruning ratios. We propose Random Clustering + Activation L2 Pruning (RC+A), a simple and efficient method that combines randomness with activation magnitude and achieves performance comparable to gradient-based methods while being significantly faster (up to 50x).
This code implements and evaluates various structured pruning techniques for LLM MLP layers, focusing on the effectiveness of randomness. Key implemented methods include:
- Random Pruning
- Activation L2 Pruning
- Taylor Pruning (Gradient-based)
- RC+A (Ours): Random Clustering + Activation L2 Pruning (see the sketch after this list)
- Similarity Clustering + Activation L2 Pruning
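To make the RC+A idea concrete, here is a minimal, illustrative sketch for a single MLP layer. This is not the repository's implementation: the function name, the `group_size` choice, and the clustering details are assumptions made for this example; see `scripts/run_experiment.py` for the actual code.

```python
# Illustrative sketch only -- not the repository's implementation.
# One plausible reading of RC+A for a single MLP layer: randomly cluster the
# intermediate neurons, score each neuron by the L2 norm of its calibration
# activations, and prune the lowest-scoring neurons within each cluster.
import torch

def rcpa_keep_indices(acts: torch.Tensor, ratio: float,
                      group_size: int = 4, seed: int = 0) -> torch.Tensor:
    """acts: (num_tokens, num_neurons) activations from a calibration set.
    ratio: fraction of neurons to prune. Returns indices of neurons to keep."""
    num_neurons = acts.shape[1]
    scores = acts.float().norm(dim=0)                  # activation L2 per neuron
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_neurons, generator=gen)  # random clustering
    keep_per_group = group_size - round(ratio * group_size)
    kept = []
    for start in range(0, num_neurons, group_size):
        group = perm[start:start + group_size]
        top = scores[group].topk(min(keep_per_group, len(group))).indices
        kept.append(group[top])
    return torch.cat(kept).sort().values

# e.g., prune 25% of 11008 neurons using a tiny random "calibration" batch
kept = rcpa_keep_indices(torch.randn(32, 11008), ratio=0.25)
print(kept.numel())  # ~75% of the neurons remain
```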
We recommend using Conda:
conda env create -f environment.yaml
conda activate llm-neuron-compression
(Alternatively, use `pip install -r environment.txt` after ensuring PyTorch with CUDA is installed.)
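To confirm that a CUDA-enabled PyTorch is available before launching experiments, a quick optional check can be run:

```python
# Optional sanity check: verify the PyTorch version and CUDA availability.
import torch
print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```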
Experiments are run via `scripts/run_experiment.py`, preferably launched with `accelerate`.
Key Arguments:
- `--model`: Hugging Face model ID (e.g., `"Qwen/Qwen2.5-7B-Instruct"`).
- `--method`: Pruning strategy (see below).
- `--ratio`: Pruning ratio (e.g., `0.25`).
- `--layers`: Comma-separated layer indices (e.g., `"5,6,7,...26"`) or `"all"`.
- `--eval-tasks`: Comma-separated `lm-eval-harness` tasks (e.g., `"wikitext,mmlu,hellaswag"`).
- `--apply-mode prune`: Required for the RC+A and Similarity+ActivationL2 methods.
Methods (`--method`) corresponding to paper results:
- `random-prune`: Random Pruning
- `gradient-magnitude`: Taylor Pruning
- `squared-magnitude`: Activation L2 Pruning
- `weight-l2`: Weight L2 Pruning
- `random-merge`: Use with `--apply-mode prune` for RC+A (Ours)
- `activation-merge`: Use with `--apply-mode prune` for Similarity+ActivationL2 (Modulated-Act)
- `post-activation-merge`: Use with `--apply-mode prune` for Similarity+ActivationL2 (Post-Act)
Example: Running RC+A (Ours) at 25% Pruning on Qwen2.5-7B-Instruct
# Define layers and tasks (adjust as needed)
LAYERS_TO_PRUNE="5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26"
EVAL_TASKS="wikitext,mmlu,arc_easy,arc_challenge,winogrande,hellaswag,openbookqa"
accelerate launch scripts/run_experiment.py \
--model "Qwen/Qwen2.5-7B-Instruct" \
--method random-merge \
--apply-mode prune \
--ratio 0.25 \
--layers $LAYERS_TO_PRUNE \
--dataset bookcorpus --num-calib-samples 10 --calib-seq-len 128 \
--eval-tasks $EVAL_TASKS
(Adapt the `--method` and `--ratio` arguments for other experiments. See `scripts/*.sh` for more examples.)
Evaluation uses `lm-evaluation-harness`. Results (metrics, config, timing) are saved as JSON in `results/<model_name>/<method>/logs/`.
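The saved logs can be gathered with a short script like the sketch below. The glob pattern assumes `<model_name>` and `<method>` each map to a single directory component, and the JSON key layout is not documented here, so the snippet only inspects what each file contains.

```python
# Minimal sketch for gathering result logs. The glob pattern and the JSON
# structure are assumptions, not a documented schema.
import glob
import json

for path in sorted(glob.glob("results/*/*/logs/*.json")):
    with open(path) as f:
        run = json.load(f)
    print(path, list(run.keys()))  # inspect what each log actually contains
```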
If you find this work useful, please cite:
@inproceedings{
xu2025the,
title={The Surprising Effectiveness of Randomness in {LLM} Pruning},
author={Shuyao Xu and Jiayao Liu and Zhenfeng He and Cheng Peng and Weidi Xu},
booktitle={Sparsity in LLMs (SLLM): Deep Dive into Mixture of Experts, Quantization, Hardware, and Inference},
year={2025},
url={https://openreview.net/forum?id=YncWrbIxnN}
}