Qwen3.5
7 articles tagged with this topic
Gemma 4 and Qwen 3.5 GGUFs: Detailed Analysis by oobabooga
Oobabooga published 5 benchmark reports covering 70-90 GGUF quants each for Gemma 4 and Qwen 3.5 models using a KL divergence methodology.
Qwen3.5-9B GGUF Quant Rankings: Q8_0 Dominates KLD Scores
KLD benchmarks across community GGUF quants show Q8_0 variants cluster near 0.001 KLD, with quality degrading sharply below Q5.
DFlash speculative decoding on Apple Silicon: 4.1x on Qwen3.5-9B, now open source (MLX, M5 Max)
Open-source DFlash achieves 4.13x speedup on Qwen3.5-9B using MLX on M5 Max with 89.4% token acceptance rate.
Hitoku, open-source local macOS context aware assistant with Qwen3.5/Gemma4
Open-source macOS assistant runs Gemma 4 and Qwen 3.5 fully on-device with screen and document context.
Qwen 3.5 35B Benchmarks: Vulkan vs ROCm on AMD Strix Halo
Vulkan wins token generation (~57.5 t/s) while ROCm leads prompt processing (~1052 t/s) on AMD Ryzen AI MAX+ 395.
Gemma-4 E4B Vision Benchmarked: Scores 0.27 vs Qwen3.5-4B's 0.5
Community testing shows Gemma-4 E4B scores 0.27 on 100 vision tasks vs Qwen3.5-4B's baseline 0.5, raising red flags for multimodal use.
Qwen3.5 vs Gemma4 vs Cloud LLMs: Python Turtle Drawing Benchmark
A Reddit user benchmarks local and cloud LLMs on Python turtle graphics, revealing that Gemma4 and Gemini share a visual style.