
Yeah, I looked up some models I have actually run locally on my Strix Halo laptop, and it's saying I should get much lower performance than I actually see on the models I've tested.

For MoE models, it should be using the active parameters in the memory bandwidth computation, not the total parameters.
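
To make the difference concrete, here is a rough sketch of the usual bandwidth-bound estimate for decode speed, assuming each generated token has to stream the relevant weights from memory once. The function name and all the numbers (256 GB/s of bandwidth, a 30B-total / 3B-active MoE, 4-bit weights) are illustrative assumptions on my part, not measurements, and I don't know how the tool actually computes its figure.

```python
def decode_tokens_per_second(bandwidth_gb_s, params_billions, bytes_per_param):
    """Rough upper bound on decode speed when memory bandwidth is the bottleneck.

    Assumes every generated token reads `params_billions` worth of weights once
    and ignores KV cache traffic, compute limits, and other overheads.
    """
    bytes_per_token = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Illustrative assumptions, not measured values.
bandwidth_gb_s = 256.0   # assumed memory bandwidth in GB/s
total_params_b = 30.0    # hypothetical MoE: total parameters, in billions
active_params_b = 3.0    # hypothetical MoE: parameters active per token, in billions
bytes_per_param = 0.5    # roughly 4-bit quantized weights

# Estimate using total parameters (what the tool appears to be doing).
naive = decode_tokens_per_second(bandwidth_gb_s, total_params_b, bytes_per_param)

# Estimate using only the parameters actually read per token.
moe_aware = decode_tokens_per_second(bandwidth_gb_s, active_params_b, bytes_per_param)

print(f"total params:  {naive:.1f} tok/s")
print(f"active params: {moe_aware:.1f} tok/s")
```

With those made-up numbers the two estimates differ by the total-to-active ratio (10x here), which is the kind of gap I'm seeing between the predicted and real numbers. Real throughput will land below the active-parameter bound, but it is a much closer ceiling than the total-parameter one.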
