Why do so many companies insist on shipping their logs via Kafka? I can't imagine delivery guarantees are necessary for logs, and if they are, that data probably shouldn't be in your logs?
Kafka is a big dumb pipe that moves the bytes real fast, which makes it ideal for shipping logs. It accepts huge volumes of tiny writes without breaking a sweat, which is exactly what you want: get the logs off the box ASAP and persist them somewhere else durably (e.g. replicated).
My experience has been a mixture of "when all you have is a hammer ..." and the fact that Pointy Haired Bosses LOVE Kafka, and tend to default to it because it's what all their Pointy Haired Boss friends are using.
In a more generous take, using some buffered ingest does help you avoid choosing between a c500.128xl ingest machine and dropping messages, but I would never advocate standing up Kafka just for log buffering.
At that point you are likely slowing down your applications. I think a basic OpenTelemetry Collector mostly solves this, and if you overflow the buffer available there, then dropping is the appropriate choice for application logs.
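The "buffer a bit, then shed load" behavior is easy to picture as code. A minimal Python sketch of the idea (the class and names are my own illustration, not the OpenTelemetry Collector's actual API, which does this via its memory/batch processors):

```python
import queue


class DroppingLogBuffer:
    """Bounded in-process log buffer: when full, new lines are
    dropped (load shedding) instead of blocking the application."""

    def __init__(self, max_lines=10_000):
        self._q = queue.Queue(maxsize=max_lines)
        self.dropped = 0

    def push(self, line):
        try:
            self._q.put_nowait(line)  # never blocks the hot path
            return True
        except queue.Full:
            self.dropped += 1  # shed the line rather than slow the app
            return False

    def drain(self, max_batch=500):
        """Pull up to max_batch buffered lines for the exporter."""
        batch = []
        while len(batch) < max_batch:
            try:
                batch.append(self._q.get_nowait())
            except queue.Empty:
                break
        return batch


buf = DroppingLogBuffer(max_lines=3)
for i in range(5):
    buf.push(f"line {i}")
print(buf.dropped)  # -> 2: the buffer held 3 lines, the rest were shed
print(buf.drain())  # -> ['line 0', 'line 1', 'line 2']
```

The key trade-off is in `push`: it never blocks, so a slow downstream costs you log lines, not application latency.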
Dropping may be an unacceptable choice for some applications, though. For example, dropping request logs is really bad, because now you have no idea who is interacting with your service. If a security breach happens and your answer is "like, bro, idk what happened man, we load shedded the logs away", that's not a great look...
In log-shipping cases it's good as a buffer so you can batch writes to the underlying SIEM. Instead of tons of small API calls with a few hundred or thousand log lines each, Kafka takes all the small writes, and the SIEM can subscribe and turn them into much larger batches to write to the underlying storage (e.g. S3).
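A sketch of the consumer-side coalescing described above, assuming simple newline-delimited payloads; `rebatch` and `target_bytes` are illustrative names I made up, not any real SIEM or Kafka consumer API:

```python
def rebatch(messages, target_bytes=5 * 1024 * 1024):
    """Coalesce many small log payloads (bytes) into a few large
    batches, so the downstream store sees one big write instead
    of thousands of tiny API calls."""
    batches, current, size = [], [], 0
    for msg in messages:
        # Flush the current batch once adding msg would exceed the target.
        if current and size + len(msg) > target_bytes:
            batches.append(b"\n".join(current))
            current, size = [], 0
        current.append(msg)
        size += len(msg)
    if current:
        batches.append(b"\n".join(current))
    return batches


# 10,000 tiny "API calls" worth of log lines...
small_calls = [f"log line {i}".encode() for i in range(10_000)]
# ...become a handful of large writes to S3 (or whatever the store is).
batches = rebatch(small_calls, target_bytes=64 * 1024)
print(len(batches))
```

In a real deployment the consumer would also flush on a timer so a quiet topic doesn't hold a partial batch forever, but the size-based coalescing is the part that saves you the per-call overhead.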
Don’t forget about all the added cost. I never got it, since many shops can tolerate data loss for their MELT (metrics, events, logs, traces) data. So long as it’s collected 99.9% of the time, it’s good enough.
That is not a great test. Maybe it's just me, but I don't often write code with the proper code style (or even semicolons, depending on the language); I let the IDE handle all of that.
This was my introduction to K3s as well. RPis running K3s with enough resources left over to actually do some small tasks. I hosted a small data pipeline that analyzed trading data from an MMO. It was as fun as it was impractical, and I learned quite a bit.
Damn, 30 days is quick. I found out about https://neon.tech but then quickly ran into a major bug, and then thankfully found out about bit.io, which is what I use for https://dittoed.app.
Looks like I will have to go back to Neon (they fixed the bug).
If anyone has other ideas, I'm all ears. Project is hosted on Cloudflare and they have D1 now, but Dittoed uses a little bit of PostGIS.
Neon sounds like a good fit for you. I'd wager any kind of managed database is fine, so the question is whether you value the features/cost savings Neon brings. Either way, I cannot recommend a managed DB enough; that's the best 20 bucks you're gonna spend.