Releases · oobabooga/text-generation-webui
1.6.1
What's Changed
- Use call for conda deactivate in Windows installer by @jllllll in #4042
- [extensions/openai] Fix error when preparing cache for embedding models by @wangcx18 in #3995
- Create alternative requirements.txt with AMD and Metal wheels by @oobabooga in #4052
- Add a grammar editor to the UI by @oobabooga in #4061
- Avoid importing torch in one-click-installer by @jllllll in #4064
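The grammar editor added in #4061 works with GBNF grammars as used by llama.cpp-based loaders. For context, a minimal sketch of the same grammar format used outside the UI, via llama-cpp-python (the model path is a placeholder and this is not code from the release itself):

```python
from llama_cpp import Llama, LlamaGrammar

# A GBNF grammar restricting output to "yes" or "no".
grammar = LlamaGrammar.from_string('root ::= "yes" | "no"')

# Placeholder model path; any GGUF model works here.
llm = Llama(model_path="models/your-model.gguf")
out = llm("Is water wet? Answer yes or no: ", grammar=grammar, max_tokens=4)
print(out["choices"][0]["text"])
```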
Full Changelog: v1.6...1.6.1
v1.6
The one-click-installers have been merged into the repository. Migration instructions can be found here.
The updated one-click install features an installation size that is several GB smaller and a more reliable update procedure.
What's Changed
- sd_api_pictures: Widen sliders for image size minimum and maximum by @GuizzyQC in #3326
- Bump exllama module to 0.0.9 by @jllllll in #3338
- Add an extension that makes chat replies longer by @oobabooga in #3363
- add chat instruction config for BaiChuan-chat model by @CrazyShipOne in #3332
- [extensions/openai] +Array input (batched) , +Fixes by @matatonic in #3309
- Add a scrollbar to notebook/default textboxes, improve chat scrollbar style by @jparmstr in #3403
- Add auto_max_new_tokens parameter by @oobabooga in #3419
- Add the --cpu option for llama.cpp to prevent CUDA from being used by @oobabooga in #3432
- Use character settings from API properties if present by @rafa-9 in #3428
- Add standalone Dockerfile for NVIDIA Jetson by @toolboc in #3336
- More models: +StableBeluga2 by @matatonic in #3415
- [extensions/openai] include content-length for json replies by @matatonic in #3416
- Fix llama.cpp truncation by @jparmstr in #3400
- Remove unnecessary chat.js by @missionfloyd in #3445
- Add back silero preview (originally by @missionfloyd) by @oobabooga in #3446
- Add SSL certificate support by @oobabooga in #3453
- Bump bitsandbytes to 0.41.1 by @jllllll in #3457
- [Bug fix] Remove HTML tags from the prompt sent to Stable Diffusion by @SodaPrettyCold in #3151
- Fix: Mirostat fails on models split across multiple GPUs. by @Ph0rk0z in #3465
- Bump exllama wheels to 0.0.10 by @jllllll in #3467
- Create logs dir if missing when saving history by @jllllll in #3462
- Fix chat message order by @missionfloyd in #3461
- Add Classifier Free Guidance (CFG) for Transformers/ExLlama by @oobabooga in #3325 (see the sketch at the end of this list)
- Refactor everything by @oobabooga in #3481
- Use chat_instruct_command in API by @jllllll in #3482
- Make dockerfile respect specified cuda version by @sammcj in #3474
- Fix a typo that prevented the llama.cpp parameter "rms_norm_eps" from being displayed correctly by @berkut1 in #3494
- Add option for named cloudflare tunnels by @Fredddi43 in #3364
- Fix superbooga when using regenerate by @oderwat in #3362
- Added the logic for starchat model series by @giprime in #3185
- Streamline GPTQ-for-LLaMa support by @jllllll in #3526
- Add Vicuna-v1.5 detection by @berkut1 in #3524
- ctransformers: another attempt by @cal066 in #3313
- Bump ctransformers wheel version by @jllllll in #3558
- ctransformers: move thread and seed parameters by @cal066 in #3543
- Unify the 3 interface modes by @oobabooga in #3554
- Various ctransformers fixes by @netrunnereve in #3556
- Add "save defaults to settings.yaml" button by @oobabooga in #3574
- Add the --disable_exllama option for AutoGPTQ by @clefever in #3545
- ctransformers: Fix up model_type name consistency by @cal066 in #3567
- Add a "Show controls" button to chat UI by @oobabooga in #3590
- Improved chat scrolling by @oobabooga in #3601
- fixes error when not specifying tunnel id by @ausboss in #3606
- Fix print CSS by @missionfloyd in #3608
- Bump llama-cpp-python by @oobabooga in #3610
- Bump llama_cpp_python_cuda to 0.1.78 by @jllllll in #3614
- Refactor the training tab by @oobabooga in #3619
- llama.cpp: make Stop button work with streaming disabled by @cebtenzzre in #3620
- Unescape last message by @missionfloyd in #3623
- Improve readability of download-model.py by @Thutmose3 in #3497
- Add probability dropdown to perplexity_colors extension by @SeanScripts in #3148
- Add a simple logit viewer by @oobabooga in #3636
- Fix whitespace formatting in perplexity_colors extension. by @tdrussell in #3643
- ctransformers: add mlock and no-mmap options by @cal066 in #3649
- Update requirements.txt by @Tkbit in #3651
- Add missing extensions to Dockerfile by @sammcj in #3544
- Implement CFG for ExLlama_HF by @oobabooga in #3666
- Add CFG to llamacpp_HF (second attempt) by @oobabooga in #3678
- ctransformers: gguf support by @cal066 in #3685
- Fix ctransformers threads auto-detection by @jllllll in #3688
- Use separate llama-cpp-python packages for GGML support by @jllllll in #3697
- GGUF by @oobabooga in #3695
- Fix ctransformers model unload by @marella in #3711
- Add ffmpeg to the Docker image by @kelvie in #3664
- accept floating-point alpha value on the command line by @cebtenzzre in #3712
- Bump llama-cpp-python to 0.1.81 by @jllllll in #3716
- Make it possible to scroll during streaming by @oobabooga in #3721
- Bump llama-cpp-python to 0.1.82 by @jllllll in #3730
- Bump ctransformers to 0.2.25 by @jllllll in #3740
- Add max_tokens_second param by @oobabooga in #3533
- Update requirements.txt by @VishwasKukreti in #3725
- Update llama.cpp.md by @q5sys in #3702
- Bump llama-cpp-python to 0.1.83 by @jllllll in #3745
- Update download-model.py (Allow single file download) by @bet0x in #3732
- Allow downloading single file from UI by @missionfloyd in #3737
- Bump exllama to 0.0.14 by @jllllll in #3758
- Bump llama-cpp-python to 0.1.84 by @jllllll in #3854
- Update transformers requirement from ==4.32.* to ==4.33.* by @dependabot in #3865
- Bump exllama to 0.1.17 by @jllllll in #3847
- Exllama new rope settings by @Ph0rk0z in #3852
- fix lora training with alpaca_lora_4bit by @johnsmith...
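The CFG additions above (#3325, #3666, #3678) apply the standard classifier-free guidance formula to next-token logits. A general, library-agnostic sketch of that formula, not the webui's internal code:

```python
import torch

def cfg_mix(cond_logits: torch.Tensor, uncond_logits: torch.Tensor, guidance_scale: float) -> torch.Tensor:
    # Classifier-Free Guidance: extrapolate from the logits produced with the
    # negative prompt toward the logits produced with the real prompt.
    # guidance_scale == 1.0 returns the conditional logits unchanged.
    return uncond_logits + guidance_scale * (cond_logits - uncond_logits)

# Tiny worked example over a 4-token vocabulary:
cond = torch.tensor([2.0, 0.5, -1.0, 0.0])
uncond = torch.tensor([1.0, 1.0, -1.0, 0.0])
print(cfg_mix(cond, uncond, guidance_scale=1.5))  # tensor([2.5000, 0.2500, -1.0000, 0.0000])
```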
v1.5
What's Changed
- Add a detailed extension example and update the extension docs. The example can be found here: example/script.py.
- Introduce a new `chat_input_modifier` extension function and deprecate the old `input_hijack` (a minimal sketch of the new hook follows this list).
- Change rms_norm_eps to 5e-6 for ~~llama-2-70b ggml~~ all llama-2 models -- this value reduces the perplexities of the models.
- Remove FlexGen support. It has been made obsolete by the lack of Llama support and the emergence of llama.cpp and 4-bit quantization. I can add it back if it ever gets updated.
- Use the dark theme by default.
- Set the correct instruction template for the model when switching from default/notebook modes to chat mode.
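A minimal sketch of the new `chat_input_modifier` hook, assuming the signature shown in the updated extension docs; the extension folder name is a placeholder and this is not a verbatim copy of example/script.py:

```python
# extensions/my_extension/script.py

def chat_input_modifier(text, visible_text, state):
    """
    Runs on every chat message before generation.
    'text' is what the model receives, 'visible_text' is what the UI shows,
    and 'state' holds the current generation parameters.
    """
    # Example: append a reminder to the model-facing text only.
    text = f"{text}\n\n(Please answer concisely.)"
    return text, visible_text
```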
Bug fixes
- [extensions/openai] Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. by @matatonic in #3122 (usage sketch after this list)
- Fix typo in README.md by @eltociear in #3286
- README updates and improvements by @netrunnereve in #3198
- Ignore values in training.py which are not string by @Foxtr0t1337 in #3287
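For reference, the openai extension exposes an OpenAI-compatible HTTP API. A minimal usage sketch; the host and port are assumptions and depend on the flags used when launching the webui:

```python
import requests

# Assumed local endpoint of the openai extension; adjust host/port as needed.
API_URL = "http://127.0.0.1:5001/v1/chat/completions"

payload = {
    "messages": [{"role": "user", "content": "Summarize what a LoRA is in one sentence."}],
    "max_tokens": 80,
}

resp = requests.post(API_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```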
v1.4
What's Changed
- Add llama-2-70b GGML support by @oobabooga in #3285
- Bump bitsandbytes to 0.41.0 by @jllllll in #3258 -- faster speeds
- Bump exllama module to 0.0.8 by @jllllll in #3256 -- expanded LoRA support
Bug fixes
Extensions
- [extensions/openai] Fixes for: embeddings, tokens, better errors. +Docs update, +Images, +logit_bias/logprobs, +more. by @matatonic in #3122
v1.3.1
v1.3
Changes
- Llama-v2: add instruction template, autodetect the truncation length, add conversion documentation
- [GGML] Support for customizable RoPE by @randoentity in #3083
- Optimize llamacpp_hf (a bit)
- Add Airoboros-v1.2 template
- Disable "Autoload the model" by default
- Disable auto-loading at startup when only one model is available by @jllllll in #3187
- Don't unset the LoRA menu when loading a model
- Bump accelerate to 0.21.0
- Bump bitsandbytes to 0.40.2 (Windows wheels provided by @jllllll in #3186)
- Bump AutoGPTQ to 0.3.0 (loading LoRAs is now supported out of the box)
- Update LLaMA-v1 documentation
Bug fixes
v1.2
Changes
- Create llamacpp_HF loader by @oobabooga in #3062
- Make it possible to evaluate exllama perplexity by @oobabooga in #3138
- Add support for logits processors in extensions by @cyberfox in #3029 (see the sketch after this list)
- Bump bitsandbytes to 0.40.1.post1 by @jllllll in #3156
- Bump llama cpp version by @ofirkris in #3160
- Increase alpha value limit for NTK RoPE scaling for exllama/exllama_HF by @Panchovix in #3149
- Decrease download timeout
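A minimal sketch of the logits-processor extension hook added in #3029, assuming the `logits_processor_modifier` name from the extension docs; the extension folder name and the banned token id are placeholders:

```python
# extensions/my_logits_extension/script.py
import torch
from transformers import LogitsProcessor

class BanTokenProcessor(LogitsProcessor):
    """Masks out a single token id so it can never be sampled."""
    def __init__(self, banned_id: int):
        self.banned_id = banned_id

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        scores[:, self.banned_id] = float("-inf")
        return scores

def logits_processor_modifier(processor_list, input_ids):
    # Called by the webui before generation; extensions can append their own
    # transformers LogitsProcessor objects to the list.
    processor_list.append(BanTokenProcessor(banned_id=0))
    return processor_list
```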
Bug fixes
- Fix reload screen background color in dark mode
Extensions
- Color tokens by probability and/or perplexity by @SeanScripts in #3078
v1.1.1
v1.1
Changes
- Bump bitsandbytes Windows wheel by @jllllll in #3097 -- `--load-in-4bit` is now a lot faster
- Add support for low VRAM mode on llama.cpp module by @gabriel-pena in #3076
- Add links/reference to new multimodal instructblip-pipeline in multimodal readme by @kjerk in #2947
- Add token authorization for downloading model by @fahadh4ilyas in #3067
- Add default environment variable values to docker compose file by @Josh-XT in #3102
- models/config.yaml: +platypus/gplatty, +longchat, +vicuna-33b, +Redmond-Hermes-Coder, +wizardcoder, +more by @matatonic in #2928
- Add context_instruct to API. Load default model instruction template … by @atriantafy in #2688
- Chat history download creates more detailed file names by @UnskilledWolf in #3051
- Disable wandb remote HTTP requests
- Add Feature to Log Sample of Training Dataset for Inspection by @practicaldreamer in #1711
- Add ability to load all text files from a subdirectory for training by @kizinfo in #1997
- Add Tensorboard/Weights and biases integration for training by @kabachuha in #2624
- Fix: Fixed the tokenization process of a raw dataset and improved its efficiency by @Nan-Do in #3035
- More robust and error-tolerant training by @FartyPants in #3058
Bug fixes
- [Fixed] wbits and groupsize values from model not shown by @set-soft in #2977
- Fix API example for loading models by @vadi2 in #3101
- google flan T5 tokenizer download fix by @FartyPants in #3080
- Changed FormComponent to IOComponent by @ricardopinto in #3017
- respect model dir for downloads by @micsthepick in #3079
Extensions
- Fix send_pictures extension
- Elevenlabs tts fixes by @set-soft in #2959
- [extensions/openai]: Major openai extension updates & fixes by @matatonic in #3049
- Substitute superbooga's Beautiful Soup parser by @juhenriquez in #2996