Skip to content

MohamedASAK/DeepSeek-R1-Medical-COT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

DeepSeek‑R1‑Medical‑COT

DeepSeek‑R1‑Medical‑COT is a 4‑bit fine‑tuned causal language model optimised for medical reasoning and clinical‑scenario interpretation.
It is based on unsloth/DeepSeek‑R1‑Distill‑Llama‑8B and fine‑tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset to provide structured, step‑by‑step clinical reasoning and evidence‑based conclusions.


Repository contents

├── README.md

├── Fine_Tune_DeepSeek_R1_Medical_COT.ipynb # notebook with full training workflow

└── requirements.txt # Python dependencies

The Jupyter notebook contains all of the steps: model loading, prompt formatting, data preparation, LoRA configuration, training, validation, saving and pushing to the Hugging Face hub.


Requirements

Create a virtual environment and install the required packages:

pip install unsloth trl peft accelerate bitsandbytes datasets
pip install transformers gradio huggingface-hub

Usage

Load and run the model as shown below:

from unsloth import FastLanguageModel
from transformers import AutoTokenizer

model_name = "DeepSeek-R1-Medical-COT"

# load model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name, load_in_4bit=True
)

prompt = """
### Clinical Scenario:
A 54-year-old man complains of frequent urinary urgency, nocturia,
and a weak urinary stream. His prostate is moderately enlarged.
Predict likely cystometric findings.
"""

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
outputs = model.generate(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_new_tokens=500
)
print(tokenizer.decode(outputs[0]))

Training

Follow the instructions in Fine_Tune_DeepSeek_R1_Medical_COT.ipynb:

  • Load unsloth/DeepSeek-R1-Distill-Llama-8B with load_in_4bit=True.
  • Define a CoT‑style prompt template for medical questions.
  • Load the FreedomIntelligence/medical-o1-reasoning-SFT dataset and format examples.
  • Apply LoRA to selected attention and feed‑forward modules.
  • Configure TrainingArguments (batch 1 + accumulation, 4‑bit AdamW, FP16/BF16, 200 steps, etc.).
  • Train with trl.SFTTrainer.
  • Validate on a held‑out slice, save validation_results.csv and training_logs.csv.
  • Save the fine‑tuned model locally and push to the Hugging Face hub.
  • You can run the notebook interactively with Jupyter or convert it to a script with jupyter nbconvert.

Validation

The notebook creates a small validation split (train[500:550]), runs inference on a few examples, and stores outputs/lengths.


Pushing to Hugging Face

Set your token in an environment variable:

export HF_TOKEN="your_token_here"

Then run (see notebook for full code):

from huggingface_hub import login, HfApi
import os

login(token=os.environ["HF_TOKEN"])
api = HfApi()
api.create_repo("DeepSeek-R1-Medical-COT", repo_type="model",
                exist_ok=True)
model.push_to_hub("DeepSeek-R1-Medical-COT",
                  tokenizer=tokenizer,
                  save_method="merged_16bit")

After this the model can be loaded with FastLanguageModel.from_pretrained("DeepSeek-R1-Medical-COT").


Simple Gradio demo

The notebook includes an example Gradio interface that accepts a medical scenario and returns step‑by‑step reasoning plus a final conclusion.


Model details

  • Model type: Causal Language Model (LLM)
  • Base model: unsloth/DeepSeek-R1-Distill-Llama-8B
  • Fine‑tuned for: Medical instruction‑following and clinical reasoning

Sources


Training details

  • Dataset: FreedomIntelligence/medical-o1-reasoning-SFT
  • Preprocessing: Prompts formatted with CoT style (grounded) for step-by-step reasoning
  • Fine‑tuning method: LoRA applied to attention and feedforward modules
  • Hyperparameters:
    • Batch size: 1 (gradient accumulation 8)
    • Max steps: 200
    • Learning rate: 2 × 10⁻⁴
    • Mixed precision: FP16 / BF16 depending on GPU support
    • Optimizer: 8‑bit AdamW

Evaluation

  • Evaluated on a subset of medical reasoning questions
  • Metrics: correctness of reasoning, coherence, and answer accuracy
  • Results indicate improved structured reasoning over the base model

Limitations and Risks

  • Limited to the quality and scope of the training dataset
  • May not cover rare or highly specialized medical cases
  • Should not replace clinical judgment; intended for educational and reasoning support

Recommendation: Always review model outputs with a qualified healthcare professional.

About

DeepSeek‑R1‑Medical‑COT is a 4‑bit fine‑tuned causal language model optimised for medical reasoning and clinical‑scenario interpretation.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors