
Congrats, guys. This is a really neat hack, really impressive.

Did you guys think about building it out further to provide a GPU load balancer for multiple frontend machines running CUDA / OpenCL?



Hmm, can you elaborate? Do you mean having multiple smaller instances talk to a single GPU instance?


I'm talking about time-sharing. It doesn't matter if it's smaller instances sharing a single GPU instance or many instances sharing many GPU instances. Essentially N:M sharing (with some scheduling).

Since the GPU client is now abstracted from the GPU devices by placing the GPUs across the network, it seems like time-sharing should be the next logical step.
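
A minimal sketch of the N:M time-sharing idea, assuming a plain round-robin policy over fixed time slices. The function name, client names, and GPU names are all hypothetical and only illustrate the scheduling concept; this is not how the project discussed here actually schedules work.

```python
from collections import deque


def round_robin(clients, gpus, num_slices):
    """Assign N clients to M GPUs over a number of time slices.

    Hypothetical illustration: in each slice, the clients at the front
    of the queue each get one GPU, then go to the back of the queue.
    """
    queue = deque(clients)
    timeline = []
    for _ in range(num_slices):
        # At most one GPU per client per slice.
        active = min(len(gpus), len(queue))
        chosen = [queue.popleft() for _ in range(active)]
        # zip stops at the shorter list, so spare GPUs stay idle.
        timeline.append(dict(zip(gpus, chosen)))
        queue.extend(chosen)
    return timeline


if __name__ == "__main__":
    # Three clients sharing two GPUs (N=3, M=2) for three slices.
    for t, assignment in enumerate(round_robin(["a", "b", "c"], ["g0", "g1"], 3)):
        print(t, assignment)
```

With three clients and two GPUs, each client ends up with two of the six available GPU slots over three slices, which is the kind of fair sharing a real scheduler would refine with priorities or load awareness.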


Got it. This is actually already supported. At the very end of the blog post there is a link to create a custom configuration. You can create any N:M configuration, that is, any number of clients to any number of servers, and therefore choose the level of performance scaling or GPU pooling.

Check it out: https://console.aws.amazon.com/cloudformation/home?region=us...


Great, thanks for clearing that up.



