DeepSeek‑R1‑Medical‑COT

DeepSeek‑R1‑Medical‑COT is a 4‑bit fine‑tuned causal language model optimised for medical reasoning and clinical‑scenario interpretation.
It is based on unsloth/DeepSeek‑R1‑Distill‑Llama‑8B and fine‑tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset to provide structured, step‑by‑step clinical reasoning and evidence‑based conclusions.

Repository contents

├── README.md

├── Fine_Tune_DeepSeek_R1_Medical_COT.ipynb # notebook with full training workflow

└── requirements.txt # Python dependencies

The Jupyter notebook contains all of the steps: model loading, prompt formatting, data preparation, LoRA configuration, training, validation, saving and pushing to the Hugging Face hub.

Requirements

Create a virtual environment and install the required packages:

pip install unsloth trl peft accelerate bitsandbytes datasets
pip install transformers gradio huggingface-hub

Usage

Load and run the model as shown below:

from unsloth import FastLanguageModel
from transformers import AutoTokenizer

model_name = "DeepSeek-R1-Medical-COT"

# load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name, load_in_4bit=True
)

prompt = """
### Clinical Scenario:
A 54-year-old man complains of frequent urinary urgency, nocturia,
and a weak urinary stream. His prostate is moderately enlarged.
Predict likely cystometric findings.
"""

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=500
)
print(tokenizer.decode(outputs[0]))

Training

Follow the instructions in Fine_Tune_DeepSeek_R1_Medical_COT.ipynb:

Load unsloth/DeepSeek-R1-Distill-Llama-8B with load_in_4bit=True.
Define a CoT‑style prompt template for medical questions.
Load the FreedomIntelligence/medical-o1-reasoning-SFT dataset and format examples.
Apply LoRA to selected attention and feed‑forward modules.
Configure TrainingArguments (batch 1 + accumulation, 4‑bit AdamW, FP16/BF16, 200 steps, etc.).
Train with trl.SFTTrainer.
Validate on a held‑out slice, save validation_results.csv and training_logs.csv.
Save the fine‑tuned model locally and push to the Hugging Face hub.
You can run the notebook interactively with Jupyter or convert it to a script with jupyter nbconvert.

Validation

The notebook creates a small validation split (train[500:550]), runs inference on a few examples, and stores outputs/lengths.

Pushing to Hugging Face

Set your token in an environment variable:

export HF_TOKEN="your_token_here"

Then run (see notebook for full code):

from huggingface_hub import login, HfApi
import os

login(token=os.environ["HF_TOKEN"])
api = HfApi()
api.create_repo("DeepSeek-R1-Medical-COT", repo_type="model",
                exist_ok=True)
model.push_to_hub("DeepSeek-R1-Medical-COT",
                  tokenizer=tokenizer,
                  save_method="merged_16bit")

After this the model can be loaded with FastLanguageModel.from_pretrained("DeepSeek-R1-Medical-COT").

Simple Gradio demo

The notebook includes an example Gradio interface that accepts a medical scenario and returns step‑by‑step reasoning plus a final conclusion.

Model details

Model type: Causal Language Model (LLM)
Base model: unsloth/DeepSeek-R1-Distill-Llama-8B
Fine‑tuned for: Medical instruction‑following and clinical reasoning

Sources

Model hub: https://huggingface.co/MohamedASAK/DeepSeek-R1-Medical-COT
Training dataset: https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT

Training details

Dataset: FreedomIntelligence/medical-o1-reasoning-SFT
Preprocessing: Prompts formatted with CoT style (grounded) for step-by-step reasoning
Fine‑tuning method: LoRA applied to attention and feedforward modules
Hyperparameters:
- Batch size: 1 (gradient accumulation 8)
- Max steps: 200
- Learning rate: 2 × 10⁻⁴
- Mixed precision: FP16 / BF16 depending on GPU support
- Optimizer: 8‑bit AdamW

Evaluation

Evaluated on a subset of medical reasoning questions
Metrics: correctness of reasoning, coherence, and answer accuracy
Results indicate improved structured reasoning over the base model

Limitations and Risks

Limited to the quality and scope of the training dataset
May not cover rare or highly specialized medical cases
Should not replace clinical judgment; intended for educational and reasoning support

Recommendation: Always review model outputs with a qualified healthcare professional.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DeepSeek‑R1‑Medical‑COT

Repository contents

Requirements

Usage

Training

Validation

Pushing to Hugging Face

Simple Gradio demo

Model details

Sources

Training details

Evaluation

Limitations and Risks

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
Fine_Tune_DeepSeek_R1_Medical_COT.ipynb		Fine_Tune_DeepSeek_R1_Medical_COT.ipynb
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

DeepSeek‑R1‑Medical‑COT

Repository contents

Requirements

Usage

Training

Validation

Pushing to Hugging Face

Simple Gradio demo

Model details

Sources

Training details

Evaluation

Limitations and Risks

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Pushing to Hugging Face

Packages