This is a blatant violation of patient privacy. That the output is often hallucinated doesn't even matter here. If the hospital wants to use LLMs, it should at least deploy them on-premise or on a trusted network.
I can already see the Nextdoor post: "Watch out for this man who is knocking doors around 10th street! He knocked on mine claiming to be my nephew and even looked the part. Already called the police but they arrived late."
I am very interested in this and would like an authoritative answer. I even went as far as buying some books on code optimization in the context of HFT, and I was not impressed: not a single snippet of assembly. How are you optimizing anything if you don't look at what the compiler produces?
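For the JVM specifically (which the rest of this comment is about), you can at least look at what the JIT produces. A minimal sketch, assuming a stock HotSpot JVM; HotLoop is just a made-up example, not from any of those books:

    // Hypothetical hot loop. Run with:
    //   java -XX:+PrintCompilation HotLoop.java   (JDK 11+, shows what gets JIT-compiled)
    //   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly HotLoop.java
    //     (dumps the generated assembly; needs the hsdis plugin installed)
    public class HotLoop {
        public static void main(String[] args) {
            long sum = 0;
            for (int i = 0; i < 100_000_000; i++) {
                sum += i;  // the loop the JIT will compile
            }
            System.out.println(sum);
        }
    }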
But on Java specifically: every Java object still has a 24-byte overhead. How does that not thrash your cache?
The advice on avoiding allocations in Java also results in terrible code. For example, in math libraries, you'll often see void Add(Vector3 a, Vector3 b, Vector3 out) as opposed to the more natural Vector3 Add(Vector3 a, Vector3 b). There you go, function composition goes out the window and the resulting code is garbage to read and write. Not even C is that bad; the compiler will optimize the temporaries away. So you end up with Java that is worse than a low-level imperative language.
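To make the two styles concrete, here's a minimal sketch (Vector3 is a made-up class, not any particular library's API):

    // Hypothetical Vector3, just to show the two API styles.
    final class Vector3 {
        double x, y, z;
        Vector3(double x, double y, double z) { this.x = x; this.y = y; this.z = z; }

        // Allocation-free, out-parameter style: caller supplies the destination.
        static void add(Vector3 a, Vector3 b, Vector3 out) {
            out.x = a.x + b.x;
            out.y = a.y + b.y;
            out.z = a.z + b.z;
        }

        // Natural, allocating style: returns a fresh temporary.
        static Vector3 add(Vector3 a, Vector3 b) {
            return new Vector3(a.x + b.x, a.y + b.y, a.z + b.z);
        }
    }

With the allocating style, Vector3.add(Vector3.add(a, b), c) reads like the math. With the out-parameter style you have to pre-allocate and hand-sequence every temporary yourself, which is exactly the composition problem above.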
And, as far as I know, the best GCs for Java still incur pauses of no less than 1 ms? I think the stock ones can be as bad as 10 ms. How anyone does low-latency anything in Java, then, boggles my mind.
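For what it's worth, pause behaviour depends heavily on which collector the JVM is actually running (G1 is the stock default; ZGC and Shenandoah are the low-pause ones you have to ask for on builds that include them). A minimal sketch of how to see what's active and how much time it has spent collecting:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    // Prints the active collectors and their cumulative collection counts/times.
    // Run with e.g. -XX:+UseZGC to compare collectors.
    public class GcInfo {
        public static void main(String[] args) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %d collections, %d ms total%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
        }
    }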
This is obvious to anyone with a brain. I'm not familiar with scam logistics or the videos you mentioned, yet the exact same line you put in quotes is what first came to my mind.
tl;dr of this post is that Google wants to lock down Android and be its gatekeeper. Every other point of discussion is just a distraction.
UMAs aren't made for speed, but for power savings. You are ignoring the fact that a discrete GPU accesses VRAM and its caches at much higher bandwidth (and power) than an iGPU accesses system RAM. Shared memory also comes at the cost of keeping it coherent between the CPU and GPU. So you can't just look at one part of the system and claim that UMAs must be faster because there are no data transfers.
And by the way, even on UMAs, the iGPU can still have a dedicated segment of memory not readable by the CPU. Therefore UMA does not imply there won't be data transfers.
This is really not the right comparison to make. An OS will use memory liberally. Give it more and it'll use more. Give it less and it'll swap to disk. So the real question is how long a given workload takes to complete, or whether you can multi-task without thrashing to/from disk every time you switch windows. "My OS uses X amount of RAM" is an entirely meaningless and irrelevant statement.
> the real question is how long a given workload takes to complete
The memory eaters most people are complaining about are not workloads, but shitty communication apps that keep all those cat pictures from the last 4 months uncompressed in RAM...
This is exactly it. The people who release stuff under the GPL do so precisely because they want the software and derivatives to stay free. The software has strings attached; the AI removes them. What's so hard to understand here?
Carmack's argument makes no sense, but I guess it has "Carmack" in it so obviously it must be on the front page of HN.