llm_load_tensors: ggml ctx size = 0.16 MB
llm_load_tensors: using CUDA for GPU acceleration
llm_load_tensors: mem required = 9363.40 MB
llm_load_tensors: offloading 6 repeating layers to GPU
llm_load_tensors: offloaded 6/43 layers to GPU
llm_load_tensors: VRAM used: 1637.37 MB
.................................................................................
GGML_ASSERT: D:\a\llama-cpp-python-cuBLAS-wheels\llama-cpp-python-cuBLAS-wheels\vendor\llama.cpp\ggml-cuda.cu:5925: false
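For context, the path in the assert shows this is a cuBLAS build of llama-cpp-python, and the "offloaded 6/43 layers" line corresponds to the n_gpu_layers setting passed at load time. A minimal sketch of the kind of call that produces this log is below; the model path is a placeholder, not taken from the log:

from llama_cpp import Llama

# Hypothetical model path; only n_gpu_layers=6 is inferred from the
# "offloaded 6/43 layers to GPU" line in the log above.
llm = Llama(
    model_path="models/model.gguf",
    n_gpu_layers=6,  # number of repeating layers offloaded to VRAM
)

The GGML_ASSERT fires inside the CUDA backend (ggml-cuda.cu) during or just after tensor loading, so the Python call never returns a usable model object.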