Michael Yang
adff143bcd
fix: mllama quality (#10807)
* fix mllama convert
- transform attn_gate and ffn_gate
- swap attention heads for vision models
* fix mllama: the MLP gate was applied in the wrong place
2025-05-22 11:30:49 -07:00
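
The "swap attention heads" step in the convert fix above refers to the row permutation that HF-style Q/K projection weights typically need before they match GGML's rotary-embedding layout: HF stores each head's rotary dimensions as two contiguous half-blocks, while GGML expects the halves interleaved per position. A minimal sketch of that permutation in Go, assuming a row-major [rows, cols] float32 weight; permuteQK and its signature are hypothetical illustrations, not the actual function in ollama's convert package:

```go
package main

import "fmt"

// permuteQK regroups the rows of a row-major [rows, cols] weight matrix.
// Viewed over the rows, the HF layout is [nHeads, 2, half] (two contiguous
// rotary half-blocks per head); the target layout is [nHeads, half, 2]
// (the halves interleaved per rotary position).
func permuteQK(w []float32, rows, cols, nHeads int) []float32 {
	out := make([]float32, len(w))
	half := rows / nHeads / 2
	for h := 0; h < nHeads; h++ {
		for i := 0; i < half; i++ {
			for j := 0; j < 2; j++ {
				src := ((h*2+j)*half + i) * cols // row in the [nHeads, 2, half] view
				dst := ((h*half+i)*2 + j) * cols // row in the [nHeads, half, 2] view
				copy(out[dst:dst+cols], w[src:src+cols])
			}
		}
	}
	return out
}

func main() {
	// Toy example: 1 head, head dim 4, 1 column -> rows 0,1,2,3 become 0,2,1,3.
	w := []float32{0, 1, 2, 3}
	fmt.Println(permuteQK(w, 4, 1, 1)) // [0 2 1 3]
}
```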
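The other two fixes concern the tanh gates that mllama's cross-attention blocks use to scale their attention and MLP branches (the convert fix reportedly pre-transforms attn_gate and ffn_gate; the tanh is shown at inference time here for clarity). A minimal sketch of the intended placement, assuming the gate multiplies only the branch output just before the residual add; applyGate, hidden, and branch are hypothetical names for illustration:

```go
package main

import (
	"fmt"
	"math"
)

// applyGate scales a branch output by tanh(gate) and adds it into the
// residual stream in place: hidden[i] += tanh(gate) * branch[i].
// The gate must scale only the branch, not the residual stream itself;
// applying it after the add would attenuate the whole hidden state,
// which is the kind of misplacement the commit describes.
func applyGate(hidden, branch []float32, gate float32) {
	g := float32(math.Tanh(float64(gate)))
	for i := range hidden {
		hidden[i] += g * branch[i]
	}
}

func main() {
	hidden := []float32{1, 1}
	attnOut := []float32{2, 4} // stand-in for a cross-attention branch output
	applyGate(hidden, attnOut, 0.5)
	fmt.Println(hidden) // ≈ [1.9242 2.8485], since tanh(0.5) ≈ 0.4621
}
```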