- [2025.03.18] Released the training, evaluation, and serving code for StreamMind.
Basic Dependencies:
- Python >= 3.10
- PyTorch >= 2.5.1
- CUDA Version >= 11.8
- transformers >= 4.44.2 (for mistral tokenizer)
- tokenizers >= 0.19.1 (for mistral tokenizer)
[Online Mode] Install required packages (better for development):
git clone https://github.com/xinding-sys/StreamMind
cd StreamMind
pip install -r requirements.txt
pip install flash-attn==2.5.8 --no-build-isolation
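After installation, the package minimums listed under Basic Dependencies can be checked with a short script. This is a minimal sketch, not part of the StreamMind repository: the helper names `version_tuple` and `check` are hypothetical, it assumes PyTorch is installed under the distribution name `torch`, and it does not verify the CUDA version.

```python
import sys
from importlib import metadata

# Minimum versions copied from the Basic Dependencies list.
# Assumption: PyTorch is installed under the distribution name "torch".
MINIMUMS = {
    "torch": "2.5.1",
    "transformers": "4.44.2",
    "tokenizers": "0.19.1",
}


def version_tuple(text):
    """Turn '2.5.1+cu118' into (2, 5, 1); local and pre-release suffixes are ignored."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check(minimums=MINIMUMS):
    """Return a list of problems; an empty list means every minimum is met."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (need >= {minimum})")
            continue
        if version_tuple(installed) < version_tuple(minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if sys.version_info < (3, 10):
    print("Python is older than 3.10")
for line in check() or ["All checked packages meet the listed minimums."]:
    print(line)
```

The comparison is done on numeric tuples rather than strings, so that `4.44.2` is correctly treated as newer than `4.5.0` would be under plain string ordering.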
- Training Data Structure:
StreamMind
├── Online_datasets
│   ├── ego4d
│   │   └── v2
│   │       ├── annotations
│   │       └── full_scale
│   └── MatchTime
│       ├── SN-caption
│       └── Video
└── Offline_datasets
    ├── videollava_pt
    │   ├── llava_image/
    │   ├── valley/
    │   └── valley_llavaimage.json
    └── videollava_sft
        ├── llava_image_tune/
        ├── videochatgpt_tune/
        └── videochatgpt_llavaimage_tune.json
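Before launching training, it can help to confirm that the layout above is in place. The following is a minimal sketch under the assumption that it is run from the `StreamMind` repository root. The function name `missing_paths` is hypothetical and not part of the repository; the paths are copied from the tree above.

```python
from pathlib import Path

# Relative paths taken from the directory tree above.
EXPECTED = [
    "Online_datasets/ego4d/v2/annotations",
    "Online_datasets/ego4d/v2/full_scale",
    "Online_datasets/MatchTime/SN-caption",
    "Online_datasets/MatchTime/Video",
    "Offline_datasets/videollava_pt/llava_image",
    "Offline_datasets/videollava_pt/valley",
    "Offline_datasets/videollava_pt/valley_llavaimage.json",
    "Offline_datasets/videollava_sft/llava_image_tune",
    "Offline_datasets/videollava_sft/videochatgpt_tune",
    "Offline_datasets/videollava_sft/videochatgpt_llavaimage_tune.json",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


# Check the current directory, assumed to be the repository root.
missing = missing_paths(".")
if missing:
    print("Missing dataset paths:")
    for rel in missing:
        print("  " + rel)
else:
    print("All expected dataset paths are present.")
```

The check only tests for existence; it does not validate the contents of the annotation files or video folders.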
- Command:
# StreamMind training, stage 1
bash scripts/custom/finetune_stage1.sh
# StreamMind training, stage 2
bash scripts/custom/finetune_stage2.sh
# StreamMind evaluation
bash scripts/custom/eval/evaluate.sh
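The three commands above can also be chained so that a failure in one stage stops the later ones. This is a sketch only: `run_steps` is a hypothetical helper, not part of the repository, and it assumes the scripts are meant to be run in the listed order from the repository root.

```python
import subprocess

# Script paths as listed in the commands above; run from the repository root.
STEPS = [
    ["bash", "scripts/custom/finetune_stage1.sh"],
    ["bash", "scripts/custom/finetune_stage2.sh"],
    ["bash", "scripts/custom/eval/evaluate.sh"],
]


def run_steps(steps=STEPS, runner=subprocess.run):
    """Run each command in order and return the commands that completed.

    With the default runner, check=True makes subprocess.run raise
    CalledProcessError on a non-zero exit, so later steps are skipped.
    """
    completed = []
    for command in steps:
        print("Running:", " ".join(command))
        runner(command, check=True)
        completed.append(command)
    return completed
```

Calling `run_steps()` launches all three stages. Passing a shorter list, for example `run_steps(STEPS[2:])`, runs the evaluation alone.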
If you find StreamMind useful for your research and applications, please cite using this BibTeX:
@article{ding2025streammind,
title={StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition},
author={Ding, Xin and Wu, Hao and Yang, Yifan and Jiang, Shiqi and Bai, Donglin and Chen, Zhibo and Cao, Ting},
journal={arXiv preprint arXiv:2503.06220},
year={2025}
}
The StreamMind codebase is adapted from VideoLLaMA 2. We are also grateful to the following projects, from which StreamMind arose:
- Videollm-online, LLaMA 2, Mistral-7B, OpenAI CLIP, Honeybee.
- Video-ChatGPT, Video-LLaVA.
- WebVid, Panda-70M, LanguageBind, InternVid.
- VideoChat2, Valley, VTimeLLM, ShareGPT4V.
This project is released under the Apache 2.0 license as found in the LICENSE file. The service is a research preview intended for non-commercial use ONLY, subject to the model Licenses of LLaMA and Mistral, Terms of Use of the data generated by OpenAI, and Privacy Practices of ShareGPT. Please get in touch with us if you find any potential violations.