Granite 4.1: IBM's 8B Model Matching 32B MoE
- 2ndorderthought - 4195 seconds ago
I test-drove it yesterday. It's pretty impressive at 8B, and it runs quickly on commodity hardware.
Qwen3.6 35b a3b is still my local champion, but I may use this for autocomplete and small tasks. It has recent training data, which is nice. If the other small models were fine-tuned on recent data I don't know whether I would use this at all, but that alone makes it pretty decent.
The 4B they released was not good for my needs, but it could probably handle tool calls or something.
- cbg0 - 2571 seconds ago
The real "sleeper" might be https://huggingface.co/ibm-granite/granite-vision-4.1-4b, if the benchmarks hold up: a model that small competing with frontier models on table and semantic k:v extraction.
- Havoc - 4160 seconds ago
Interesting to see a pivot away from MoE by both IBM and Mistral, while the larger classes of SOTA models all seem to be sticking with it.
Quick vibe check of the 8B @ Q6: seems promising. Bit of a clinical tone, but I can see that being useful for data processing and similar tasks. Sometimes you really don't want an LLM that spams you with emojis...
- tosh - 1797 seconds ago
IBM announcement: https://research.ibm.com/blog/granite-4-1-ai-foundation-mode...
- agunapal - 1793 seconds ago
If you really think about why MoE came into existence, it's to save significant cost during training; I don't think there was ever concrete evidence of performance gains for comparable MoE vs. dense models. Over the years, I believe all the new techniques employed in post-training are what have made the models better.
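The training-cost argument above can be sketched with the common 6·N·D rule of thumb, where compute scales with the parameters active per token rather than total parameters. A back-of-envelope sketch only; the token count and model sizes below are hypothetical, chosen to mirror a "32B total, ~3B active" MoE versus a 32B dense model:

```python
# Back-of-envelope: why MoE saves training compute.
# Training FLOPs for a transformer are roughly 6 * N * D
# (forward + backward passes over D tokens with N active params),
# so an MoE pays only for the experts activated per token.
# All numbers here are hypothetical, for illustration only.

def training_flops(active_params: float, tokens: float) -> float:
    """Approximate training FLOPs via the 6*N*D rule of thumb."""
    return 6 * active_params * tokens

tokens = 10e12           # 10T training tokens (hypothetical)
dense_params = 32e9      # dense model: all 32B params active
moe_active = 3e9         # MoE: 32B total, only ~3B active per token

dense_cost = training_flops(dense_params, tokens)
moe_cost = training_flops(moe_active, tokens)
print(f"dense/MoE training-compute ratio: {dense_cost / moe_cost:.1f}x")
```

Under these assumptions the MoE trains for roughly a tenth of the dense model's compute, which is the saving the comment is pointing at; whether quality is actually comparable is the open question.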
- 100ms - 3051 seconds ago
> Full stop.
Why do people leave in obvious sloppification and still expect to have readers left?
- RugnirViking - 4372 seconds ago
Sounds interesting. Here's hoping they release a 32B model; that's a pretty good sweet spot for feasibility of home setups.
edit: I just realised they do actually have a 30B release alongside this. Haven't tried it yet.
- mdp2021 - 4477 seconds ago
Wish they had also released an embedding model, in line with their previous ones: compact (while good)...
- whalesalad - 645 seconds ago
[flagged]