Releases · kvcache-ai/ktransformers
v0.2.1.post1
- Fix a precision bug introduced in 0.2.1, add MMLU/MMLU-Pro tests, and fix the server. #413
v0.2.1
v0.2.0
- Support DeepSeek-R1 and V3 on a single GPU (24GB VRAM) or multiple GPUs, with 382GB DRAM
- Support dual-socket CPUs
v0.1.4
v0.1.3
v0.1.2
- Support Windows natively. #4
- Support multiple GPUs. #8
- Support llamafile as a linear backend.
- Support new models: Mixtral 8x7B and 8x22B
- Support q2k, q3k, and q5k dequantization on GPU. #16
- Add a GitHub Actions workflow to build precompiled packages
- Support shared memory across different operators
- Fix some bugs when building from source. #23
v0.1.1
- Support precompiled wheel packages for multiple CPU architectures
- Precompiled wheel packages support multiple TORCH_CUDA_ARCH_LIST targets, e.g. "8.0;8.6;8.7;8.9" (see the sketch after this list)
- Test and support Python 3.10
- Add a Dockerfile to build the Docker image
- Update README.md with Docker usage (in progress: upload the Docker image)
- Update version to 0.1.1
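
TORCH_CUDA_ARCH_LIST above is PyTorch's standard environment variable for selecting target GPU compute capabilities. Below is a minimal Python sketch of how a build script might parse it into compiler flags; the default list and the helper name `cuda_arch_flags` are illustrative assumptions, not the project's actual build code.

```python
# Illustrative only: honoring TORCH_CUDA_ARCH_LIST in a build script.
# The ";"-separated format follows PyTorch's convention; the fallback
# default list below is an assumption, not the project's actual default.
import os

def cuda_arch_flags():
    archs = os.environ.get("TORCH_CUDA_ARCH_LIST", "8.0;8.6;8.7;8.9")
    flags = []
    for arch in archs.split(";"):
        num = arch.strip().replace(".", "")  # e.g. "8.6" -> "86"
        if num:
            flags.append(f"-gencode=arch=compute_{num},code=sm_{num}")
    return flags

if __name__ == "__main__":
    print(cuda_arch_flags())
```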
v0.1.0
- Complete the submission information for PyPI.
- Dynamically detect the client's environment at install time; if a matching precompiled package is available, download and install it instead of building from source. (Adapted from flash-attn; see the sketch below.)
- Modified the installation process in the README.
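
The environment-detection item above follows the flash-attn pattern: probe the client's Python version, CUDA version, and platform during installation, and prefer a matching prebuilt wheel over a source build. The sketch below shows that idea only; the URL template, wheel naming, and helper names are hypothetical and do not reflect ktransformers' actual release layout.

```python
# Minimal sketch of a flash-attn-style install flow: probe the environment,
# prefer a matching precompiled wheel, otherwise fall back to a source build.
# WHEEL_URL_TEMPLATE and prebuilt_wheel_url are hypothetical names.
import platform
import sys

import torch

WHEEL_URL_TEMPLATE = (  # hypothetical release-asset URL pattern
    "https://github.com/kvcache-ai/ktransformers/releases/download/"
    "v0.1.0/ktransformers-0.1.0+cu{cuda}-cp{py}-cp{py}-{plat}.whl"
)

def prebuilt_wheel_url():
    """Return a candidate wheel URL for this environment, or None."""
    if not torch.cuda.is_available():
        return None
    cuda = torch.version.cuda.replace(".", "")                 # e.g. "121"
    py = f"{sys.version_info.major}{sys.version_info.minor}"   # e.g. "310"
    plat = "win_amd64" if platform.system() == "Windows" else "linux_x86_64"
    return WHEEL_URL_TEMPLATE.format(cuda=cuda, py=py, plat=plat)

if __name__ == "__main__":
    url = prebuilt_wheel_url()
    print(url or "no matching precompiled wheel; falling back to source build")
```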