Releases · oobabooga/text-generation-webui
1.6.1
What's Changed
- Use call for conda deactivate in Windows installer by @jllllll in #4042
- [extensions/openai] Fix error when preparing cache for embedding models by @wangcx18 in #3995
- Create alternative requirements.txt with AMD and Metal wheels by @oobabooga in #4052
- Add a grammar editor to the UI by @oobabooga in #4061
- Avoid importing torch in one-click-installer by @jllllll in #4064
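The grammar editor added in #4061 works with GBNF grammars as used by llama.cpp-based loaders. For context, a minimal sketch of the same grammar format used outside the UI, via llama-cpp-python (the model path is a placeholder and this is not code from the release itself):

```python
from llama_cpp import Llama, LlamaGrammar

# A GBNF grammar restricting output to "yes" or "no".
grammar = LlamaGrammar.from_string('root ::= "yes" | "no"')

# Placeholder model path; any GGUF model works here.
llm = Llama(model_path="models/your-model.gguf")
out = llm("Is water wet? Answer yes or no: ", grammar=grammar, max_tokens=4)
print(out["choices"][0]["text"])
```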
Full Changelog: v1.6...1.6.1
v1.6
The one-click-installers have been merged into the repository. Migration instructions can be found here.
The updated one-click install features an installation size that is several GB smaller and a more reliable update procedure.
What's Changed
- sd_api_pictures: Widen sliders for image size minimum and maximum by @GuizzyQC in #3326
- Bump exllama module to 0.0.9 by @jllllll in #3338
- Add an extension that makes chat replies longer by @oobabooga in #3363
- add chat instruction config for BaiChuan-chat model by @CrazyShipOne in #3332
- [extensions/openai] +Array input (batched) , +Fixes by @matatonic in #3309
- Add a scrollbar to notebook/default textboxes, improve chat scrollbar style by @jparmstr in #3403
- Add auto_max_new_tokens parameter by @oobabooga in #3419
- Add the --cpu option for llama.cpp to prevent CUDA from being used by @oobabooga in #3432
- Use character settings from API properties if present by @rafa-9 in #3428
- Add standalone Dockerfile for NVIDIA Jetson by @toolboc in #3336
- More models: +StableBeluga2 by @matatonic in #3415
- [extensions/openai] include content-length for json replies by @matatonic in #3416
- Fix llama.cpp truncation by @jparmstr in #3400
- Remove unnecessary chat.js by @missionfloyd in #3445
- Add back silero preview (originally by @missionfloyd) by @oobabooga in #3446
- Add SSL certificate support by @oobabooga in #3453
- Bump bitsandbytes to 0.41.1 by @jllllll in #3457
- [Bug fix] Remove HTML tags from the prompt sent to Stable Diffusion by @SodaPrettyCold in #3151
- Fix: Mirostat fails on models split across multiple GPUs. by @Ph0rk0z in #3465
- Bump exllama wheels to 0.0.10 by @jllllll in #3467
- Create logs dir if missing when saving history by @jllllll in #3462
- Fix chat message order by @missionfloyd in #3461
- Add Classifier Free Guidance (CFG) for Transformers/ExLlama by @oobabooga in #3325 (see the sketch at the end of this list)
- Refactor everything by @oobabooga in #3481
- Use chat_instruct_command in API by @jllllll in #3482
- Make dockerfile respect specified cuda version by @sammcj in #3474
- Fix a typo that prevented the llama.cpp parameter "rms_norm_eps" from being displayed correctly by @berkut1 in #3494
- Add option for named cloudflare tunnels by @Fredddi43 in #3364
- Fix superbooga when using regenerate by @oderwat in #3362
- Added the logic for starchat model series by @giprime in #3185
- Streamline GPTQ-for-LLaMa support by @jllllll in #3526
- Add Vicuna-v1.5 detection by @berkut1 in #3524
- ctransformers: another attempt by @cal066 in #3313
- Bump ctransformers wheel version by @jllllll in #3558
- ctransformers: move thread and seed parameters by @cal066 in #3543
- Unify the 3 interface modes by @oobabooga in #3554
- Various ctransformers fixes by @netrunnereve in #3556
- Add "save defaults to settings.yaml" button by @oobabooga in #3574
- Add the --disable_exllama option for AutoGPTQ by @clefever in #3545
- ctransformers: Fix up model_type name consistency by @cal066 in #3567
- Add a "Show controls" button to chat UI by @oobabooga in #3590
- Improved chat scrolling by @oobabooga in #3601
- fixes error when not specifying tunnel id by @ausboss in #3606
- Fix print CSS by @missionfloyd in #3608
- Bump llama-cpp-python by @oobabooga in #3610
- Bump llama_cpp_python_cuda to 0.1.78 by @jllllll in #3614
- Refactor the training tab by @oobabooga in #3619
- llama.cpp: make Stop button work with streaming disabled by @cebtenzzre in #3620
- Unescape last message by @missionfloyd in #3623
- Improve readability of download-model.py by @Thutmose3 in #3497
- Add probability dropdown to perplexity_colors extension by @SeanScripts in #3148
- Add a simple logit viewer by @oobabooga in #3636
- Fix whitespace formatting in perplexity_colors extension. by @tdrussell in #3643
- ctransformers: add mlock and no-mmap options by @cal066 in #3649
- Update requirements.txt by @Tkbit in #3651
- Add missing extensions to Dockerfile by @sammcj in #3544
- Implement CFG for ExLlama_HF by @oobabooga in #3666
- Add CFG to llamacpp_HF (second attempt) by @oobabooga in #3678
- ctransformers: gguf support by @cal066 in #3685
- Fix ctransformers threads auto-detection by @jllllll in #3688
- Use separate llama-cpp-python packages for GGML support by @jllllll in #3697
- GGUF by @oobabooga in #3695
- Fix ctransformers model unload by @marella in #3711
- Add ffmpeg to the Docker image by @kelvie in #3664
- accept floating-point alpha value on the command line by @cebtenzzre in #3712
- Bump llama-cpp-python to 0.1.81 by @jllllll in #3716
- Make it possible to scroll during streaming by @oobabooga in #3721
- Bump llama-cpp-python to 0.1.82 by @jllllll in #3730
- Bump ctransformers to 0.2.25 by @jllllll in #3740
- Add max_tokens_second param by @oobabooga in #3533
- Update requirements.txt by @VishwasKukreti in #3725
- Update llama.cpp.md by @q5sys in #3702
- Bump llama-cpp-python to 0.1.83 by @jllllll in #3745
- Update download-model.py (Allow single file download) by @bet0x in #3732
- Allow downloading single file from UI by @missionfloyd in #3737
- Bump exllama to 0.0.14 by @jllllll in #3758
- Bump llama-cpp-python to 0.1.84 by @jllllll in #3854
- Update transformers requirement from ==4.32.* to ==4.33.* by @dependabot in #3865
- Bump exllama to 0.1.17 by @jllllll in #3847
- Exllama new rope settings by @Ph0rk0z in #3852
- fix lora training with alpaca_lora_4bit by @johnsmith...
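The CFG additions above (#3325, #3666, #3678) apply the standard classifier-free guidance formula to next-token logits. A general, library-agnostic sketch of that formula, not the webui's internal code:

```python
import torch

def cfg_mix(cond_logits: torch.Tensor, uncond_logits: torch.Tensor, guidance_scale: float) -> torch.Tensor:
    # Classifier-Free Guidance: extrapolate from the logits produced with the
    # negative prompt toward the logits produced with the real prompt.
    # guidance_scale == 1.0 returns the conditional logits unchanged.
    return uncond_logits + guidance_scale * (cond_logits - uncond_logits)

# Tiny worked example over a 4-token vocabulary:
cond = torch.tensor([2.0, 0.5, -1.0, 0.0])
uncond = torch.tensor([1.0, 1.0, -1.0, 0.0])
print(cfg_mix(cond, uncond, guidance_scale=1.5))  # tensor([2.5000, 0.2500, -1.0000, 0.0000])
```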
v1.5
What's Changed
- Add a detailed extension example and update the extension docs. The example can be found here: example/script.py.
- Introduce a new `chat_input_modifier` extension function and deprecate the old `input_hijack` (a minimal sketch of the new hook follows this list).
- Change rms_norm_eps to 5e-6 for ~~llama-2-70b ggml~~ all llama-2 models -- this value reduces the perplexities of the models.
- Remove FlexGen support. It has been made obsolete by the lack of Llama support and the emergence of llama.cpp and 4-bit quantization. I can add it back if it ever gets updated.
- Use the dark theme by default.
- Set the correct instruction template for the model when switching from default/notebook modes to chat mode.
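A minimal sketch of the new `chat_input_modifier` hook, assuming the signature shown in the updated extension docs; the extension folder name is a placeholder and this is not a verbatim copy of example/script.py:

```python
# extensions/my_extension/script.py

def chat_input_modifier(text, visible_text, state):
    """
    Runs on every chat message before generation.
    'text' is what the model receives, 'visible_text' is what the UI shows,
    and 'state' holds the current generation parameters.
    """
    # Example: append a reminder to the model-facing text only.
    text = f"{text}\n\n(Please answer concisely.)"
    return text, visible_text
```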
Bug fixes
- [extensions/openai] Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. by @matatonic in #3122 (usage sketch after this list)
- Fix typo in README.md by @eltociear in #3286
- README updates and improvements by @netrunnereve in #3198
- Ignore values in training.py which are not string by @Foxtr0t1337 in #3287
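For reference, the openai extension exposes an OpenAI-compatible HTTP API. A minimal usage sketch; the host and port are assumptions and depend on the flags used when launching the webui:

```python
import requests

# Assumed local endpoint of the openai extension; adjust host/port as needed.
API_URL = "http://127.0.0.1:5001/v1/chat/completions"

payload = {
    "messages": [{"role": "user", "content": "Summarize what a LoRA is in one sentence."}],
    "max_tokens": 80,
}

resp = requests.post(API_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```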
v1.4
What's Changed
- Add llama-2-70b GGML support by @oobabooga in #3285
- Bump bitsandbytes to 0.41.0 by @jllllll in #3258 -- faster speeds
- Bump exllama module to 0.0.8 by @jllllll in #3256 -- expanded LoRA support
Bug fixes
Extensions
- [extensions/openai] Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. by @matatonic in #3122
v1.3.1
v1.3
Changes
- Llama-v2: add instruction template, autodetect the truncation length, add conversion documentation
- [GGML] Support for customizable RoPE by @randoentity in #3083
- Optimize llamacpp_hf (a bit)
- Add Airoboros-v1.2 template
- Disable "Autoload the model" by default
- Disable auto-loading at startup when only one model is available by @jllllll in #3187
- Don't unset the LoRA menu when loading a model
- Bump accelerate to 0.21.0
- Bump bitsandbytes to 0.40.2 (Windows wheels provided by @jllllll in #3186)
- Bump AutoGPTQ to 0.3.0 (loading LoRAs is now supported out of the box)
- Update LLaMA-v1 documentation
Bug fixes
v1.2
Changes
- Create llamacpp_HF loader by @oobabooga in #3062
- Make it possible to evaluate exllama perplexity by @oobabooga in #3138
- Add support for logits processors in extensions by @cyberfox in #3029 (see the sketch after this list)
- Bump bitsandbytes to 0.40.1.post1 by @jllllll in #3156
- Bump llama cpp version by @ofirkris in #3160
- Increase alpha value limit for NTK RoPE scaling for exllama/exllama_HF by @Panchovix in #3149
- Decrease download timeout
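A minimal sketch of the logits-processor extension hook added in #3029, assuming the `logits_processor_modifier` name from the extension docs; the extension folder name and the banned token id are placeholders:

```python
# extensions/my_logits_extension/script.py
import torch
from transformers import LogitsProcessor

class BanTokenProcessor(LogitsProcessor):
    """Masks out a single token id so it can never be sampled."""
    def __init__(self, banned_id: int):
        self.banned_id = banned_id

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        scores[:, self.banned_id] = float("-inf")
        return scores

def logits_processor_modifier(processor_list, input_ids):
    # Called by the webui before generation; extensions can append their own
    # transformers LogitsProcessor objects to the list.
    processor_list.append(BanTokenProcessor(banned_id=0))
    return processor_list
```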
Bug fixes
- Fix reload screen background color in dark mode
Extensions
- Color tokens by probability and/or perplexity by @SeanScripts in #3078
v1.1.1
v1.1
Changes
- Bump bitsandbytes Windows wheel by @jllllll in #3097 -- `--load-in-4bit` is now a lot faster
- Add support for low VRAM mode on llama.cpp module by @gabriel-pena in #3076
- Add links/reference to new multimodal instructblip-pipeline in multimodal readme by @kjerk in #2947
- Add token authorization for downloading model by @fahadh4ilyas in #3067
- Add default environment variable values to docker compose file by @Josh-XT in #3102
- models/config.yaml: +platypus/gplatty, +longchat, +vicuna-33b, +Redmond-Hermes-Coder, +wizardcoder, +more by @matatonic in #2928
- Add context_instruct to API. Load default model instruction template … by @atriantafy in #2688
- Chat history download creates more detailed file names by @UnskilledWolf in #3051
- Disable wandb remote HTTP requests
- Add Feature to Log Sample of Training Dataset for Inspection by @practicaldreamer in #1711
- Add ability to load all text files from a subdirectory for training by @kizinfo in #1997
- Add Tensorboard/Weights and biases integration for training by @kabachuha in #2624
- Fix: Fixed the tokenization process of a raw dataset and improved its efficiency by @Nan-Do in #3035
- More robust and error-tolerant training by @FartyPants in #3058
Bug fixes
- [Fixed] wbits and groupsize values from model not shown by @set-soft in #2977
- Fix API example for loading models by @vadi2 in #3101
- google flan T5 tokenizer download fix by @FartyPants in #3080
- Changed FormComponent to IOComponent by @ricardopinto in #3017
- respect model dir for downloads by @micsthepick in #3079
Extensions
- Fix send_pictures extension
- Elevenlabs tts fixes by @set-soft in #2959
- [extensions/openai]: Major openai extension updates & fixes by @matatonic in #3049
- Substitute superbooga's Beautiful Soup parser by @juhenriquez in #2996