25 Commits

SHA1        Message                                          Date
ffc079fb21  llama-cpp: use gpt-oss-20b-mxfp4                 2025-08-05 19:53:20 -07:00
7f7dc03a20  extend nixpkgs's lib instead                     2025-07-11 20:40:27 -07:00
ce7801795f  DeepSeek-R1-0528-Qwen3-8B                        2025-06-06 20:05:42 -07:00
7c217d6ead  llama-cpp: use q8 quantization instead of q4     2025-05-28 21:20:42 -07:00
4dc577fdcb  llama-cpp: disable gpu                           2025-05-28 21:09:45 -07:00
109e132497  llama-cpp: vulkan broken                         2025-05-28 21:04:34 -07:00
d0da2591a3  llama-cpp: disable flash attn                    2025-05-28 21:00:08 -07:00
a292c2fc75  llama-cpp: nvidia-acereason-7b                   2025-05-28 20:59:45 -07:00
fb4043712e  llm: use vulkan                                  2025-04-30 12:59:07 -04:00
843b64a644  llm: use xiomo model                             2025-04-30 11:22:52 -04:00
b00ad9a33a  deepcoder 14b                                    2025-04-14 13:11:40 -04:00
db78740db3  change llm model                                 2025-04-10 11:15:18 -04:00
fe85083810  llm: model stuff                                 2025-04-08 00:22:12 -04:00
3653e06c7d  create single function to optimize for system    2025-04-07 14:33:34 -04:00
b764d2de45  move optimizeWithFlags                           2025-04-07 14:31:56 -04:00
a688d9e264  fmt                                              2025-04-02 23:19:06 -04:00
11164f0859  llm: use finetuned model                         2025-04-02 10:11:41 -04:00
06feb4e1e2  gemma-3 27b                                      2025-03-31 21:52:14 -04:00
2d47c441fe  llm: use Q4_0 quants (faster)                    2025-03-31 18:33:24 -04:00
c31635bdd7  format                                           2025-03-31 17:04:41 -04:00
1482429a00  llm: enable AVX2                                 2025-03-31 12:02:38 -04:00
6cc3d96362  llama-cpp: compiler optimizations                2025-03-31 11:17:56 -04:00
d5ac5c8cd8  gemma-3 12b                                      2025-03-31 10:31:29 -04:00
d774568e01  auth for llm                                     2025-03-31 10:29:36 -04:00
d34793c18f  add llama-server                                 2025-03-31 03:59:54 -04:00