Hacker News

I'd love to hear more about what you're running, and on what hardware. Also, what is your use case? Thanks!


So I am running Ollama on Windows with a 10700K and a 3080 Ti. I'm using models like Qwen3-Coder (4B/8B), Qwen2.5-Coder 14B, Llama 3 Instruct, etc. These models are very fast on my machine (~25-100 tokens per second, depending on the model).

My use case is custom software that I build and host that leverages LLMs, for example for home automation, where I use Apple Watch shortcuts to issue commands. I also created a VS2022 extension called Bropilot to replace Copilot with my locally hosted LLMs. I'm currently looking at fine-tuning these types of models for my job as a senior dev in finance.
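For context on how a setup like this is wired together: Ollama serves a local HTTP API (on port 11434 by default), so a shortcut or custom app can send it a chat request and act on the reply. A minimal sketch of building such a request body, with the model name, system prompt, and command all illustrative assumptions rather than the commenter's actual code:

```python
import json

def build_chat_request(command: str) -> str:
    """Build the JSON body for a POST to Ollama's /api/chat endpoint.

    The model tag and system prompt here are illustrative; any locally
    pulled model tag would work the same way.
    """
    payload = {
        "model": "qwen2.5-coder:14b",  # assumed model tag
        "messages": [
            {"role": "system",
             "content": "Translate the spoken command into a device action."},
            {"role": "user", "content": command},
        ],
        "stream": False,  # ask for one complete JSON response, not a stream
    }
    return json.dumps(payload)

# A home-automation shortcut would POST this to http://localhost:11434/api/chat
body = build_chat_request("turn off the living room lights")
print(body)
```

The `"stream": False` flag keeps the sketch simple: the server returns a single JSON object instead of newline-delimited chunks, which is easier to parse from a one-shot shortcut.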


Thank you. I'll take a look at Bropilot when I get set up locally.

Have a great week.



