Linux latency measurements and compositor tuning [KWin Wayland]
2 points, 1 comment on Hacker News
3 headlines
2 points, 1 comment on Hacker News
My MTP post showed multi-token prediction roughly doubling Qwen3.6-27B's generation on a 3090. A reader asked the question I'd skipped: what about prompt processing at long context?
1 point, 0 comments on Hacker News