In practice most projects seem to use Yjs rather than Automerge. Is there an up-to-date comparison of the two? Has anyone here chosen Automerge over Yjs?


I’m quite familiar with both, having spent some time building a crdt library of my own. The authors of both projects are lovely humans. There are quite a lot of small differences that might matter to some people:

- Yjs is mostly made by a single author (Kevin Jahns). It does not store the full document history, but it does support arbitrarily many checkpoints which you can rewind a document to. Yjs is written in JavaScript. There’s a rust rewrite (Yrs) but it’s significantly slower than the JavaScript version for some reason. (5-10x slower last I checked).

- Automerge was started by Martin Kleppmann, Cambridge professor and author of Designing Data Intensive Applications. They have some funding now and as I understand it there are people working on it full time. To me it feels a bit more researchy - for example the team has been working on Byzantine fault tolerance features, rich text and other interesting but novel stuff. These days it’s written in rust, with wasm builds for the web. Automerge stores the entire history of a document, so unlike Yjs, deleted items are stored forever - with the costs and benefits that brings. Automerge is also significantly slower and less memory efficient than Yjs for large text documents. (It takes ~10 seconds & 200mb of ram to load a 100 page document in my tests.) I’m assured the team is working on optimisations; which is good because I would very much like to see more attention in that area.

They’re both good projects, but honestly both could use a lot of love. I’d love to have a “SQLite of local first software”. I think we’re close, but not quite there yet.

(There are some much faster text-based CRDTs around if that’s your jam. Aside from my own work, Cola is also very impressive and clean - and orders of magnitude faster than Yjs and Automerge.)


> I’d love to have a “SQLite of local first software”

We have recently published a new research paper on replicating SQLite [1] in a local-first manner. We think it goes a step closer to that goal.

[1] https://inria.hal.science/hal-04580135/document


It looks very similar to Evolu (https://github.com/evoluhq/evolu)


cr-sqlite https://github.com/vlcn-io/cr-sqlite :

> Convergent, Replicated SQLite. Multi-writer and CRDT support for SQLite

From "SQLedge: Replicate Postgres to SQLite on the Edge" (2023) https://news.ycombinator.com/item?id=37063238#37067980 :

>> In technical terms: cr-sqlite adds multi-master replication and partition tolerance to SQLite via conflict free replicated data types (CRDTs) and/or causally ordered event logs


this also looks promising: https://braid.org/ working with the IETF to standardize.


I helped coauthor some of the early drafts of Braid. Braid isn’t an attempt to make a local first, crdt based eventually consistent data store. It’s just a protocol.

Braid aims to make it easy for such systems, as they’re built, to be able to talk to each other.


I remember you describing Ropey's author as a "lovely human" too, and want to say that "it takes one to know one/real recognises real". :)


> (5-10x slower last I checked)

This was a thing around 2 years ago. Nowadays the speed is the same or in favor of Rust, depending on the benchmark in question.


It was still much slower ~6 months ago when I benchmarked it. I’ll rerun my benchmarks and confirm one way or another.


Oh amazing - it looks like the GP commenter is right. Yrs is significantly faster now than it was when I benchmarked it a few months ago. I'd update my comment above, but it's too late.

For example, in one of my tests I'm seeing these times:

Yjs: 74ms

Yrs: 9.5ms

That's exceptionally fast.

This speedup seems to be consistent throughout my testing data. For comparison, automerge takes 1100ms to load the same editing history from disk, using its own file format. I'd really love to see automerge be competitive here.

(The test is loading / merging a saved text editing trace from a yjs file, recorded from a single user typing about 100 pages of text).


Sooo we’re building “SQLite for local-first development”, it’s here! Uses CRDTs, can be a partially replicated db, peer to peer networking and discovery.

Bruinen.co

Shoot me a note if you want an early build! Or if interested in building with us :)

tevon [at] bruinen.co


Using a closed-source DB is a hard sell.


Agreed, especially for a local first app.

Making the app work without an internet connection is step one. Making it repairable without an internet connection is step two.

Step two is blocked if you can't keep the code for all of the app's dependencies near enough at hand that it's still accessible after the network partitions.


How do you deal with persistence with the various solutions? What do you actually have to serialize to a db?


I use Yjs myself, and you can choose to serialise anything you like; most solutions allow saving snapshots of the document state. You can also store incoming changes for more fine-grained undos and redos. AFAIK the state in typical solutions is a binary representation.


Yjs produces binary blobs for everything.
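A minimal sketch of the "snapshot plus incoming changes" persistence pattern described above, assuming the CRDT library hands you opaque binary blobs. In Yjs these would typically come from `Y.encodeStateAsUpdate(doc)` for a snapshot and from the document's `update` event for incremental changes. The table layout and function names below are hypothetical, and the blobs are placeholder bytes rather than real Yjs output.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store holding opaque CRDT blobs per document."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS doc_blobs ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " doc_id TEXT NOT NULL,"
        " kind TEXT NOT NULL CHECK (kind IN ('snapshot', 'update')),"
        " payload BLOB NOT NULL)"
    )
    return db


def append_update(db, doc_id, payload):
    """Append one incremental update blob for a document."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO doc_blobs (doc_id, kind, payload) VALUES (?, 'update', ?)",
            (doc_id, payload),
        )


def save_snapshot(db, doc_id, payload):
    """Compact: replace everything stored so far with a single snapshot.

    Assumes the caller encoded the snapshot after applying every stored
    update, so no information is lost by deleting the older rows.
    """
    with db:
        db.execute("DELETE FROM doc_blobs WHERE doc_id = ?", (doc_id,))
        db.execute(
            "INSERT INTO doc_blobs (doc_id, kind, payload) VALUES (?, 'snapshot', ?)",
            (doc_id, payload),
        )


def load_blobs(db, doc_id):
    """Return (kind, payload) pairs in the order they should be applied."""
    rows = db.execute(
        "SELECT kind, payload FROM doc_blobs WHERE doc_id = ? ORDER BY seq",
        (doc_id,),
    ).fetchall()
    return [(kind, bytes(payload)) for kind, payload in rows]
```

On load, each blob would be applied to a fresh document in order (with Yjs, via `Y.applyUpdate`). Compacting into a snapshot trades away the per-update granularity, so an app that wants fine-grained undo history might keep the update rows instead of deleting them.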


I'm curious to hear your thoughts on loro


Many projects use Yjs for its collaborative rich-text editing (e.g. Linear: https://x.com/artman/status/1733419888654291102). Yjs makes this easy by providing "bindings" to various rich-text editor GUIs, which sync between the editor's internal data structures and Yjs - something that involves a lot of detail work. Automerge's rich-text support is more recent (~last year), and so far they only have one editor binding (ProseMirror), so Yjs is naturally more popular here.

For non-text collaboration, there is a more crowded "market", because it is an easier problem to solve - at least when your app has a central server. Tools range from hosted platforms like Firebase RTDB to DIY solutions like Figma's (https://www.figma.com/blog/how-figmas-multiplayer-technology...). Meanwhile, Automerge's target niche is decentralized collaborative apps, which are rarer.


I’ve been working on a personal finance tracker that uses automerge as the primary backing store.

One trick we’ve been pulling is tailing the automerge contents into a sqlite db in-browser for more complex querying.

(some notes on how/why here: https://tender.run/blog/tender-and-crdts)
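A rough sketch of the general idea of mirroring CRDT contents into SQLite for querying. This is not the approach from the linked post: a plain Python dict stands in for the Automerge document, and the schema, field names and function names are all hypothetical. The CRDT document stays the source of truth and the SQL table is a disposable read model that is refreshed after each change.

```python
import sqlite3


def open_index():
    """Create an in-memory SQL read model for the document's transactions."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE txns ("
        " id TEXT PRIMARY KEY,"
        " account TEXT NOT NULL,"
        " amount_cents INTEGER NOT NULL)"
    )
    return db


def project(db, doc):
    """Mirror doc["transactions"] into the txns table.

    `doc` is a stand-in for the materialised CRDT document: a dict mapping
    transaction id -> {"account": str, "amount_cents": int}.
    """
    txns = doc["transactions"]
    with db:
        # Insert new rows and overwrite rows whose contents changed.
        db.executemany(
            "INSERT OR REPLACE INTO txns (id, account, amount_cents) VALUES (?, ?, ?)",
            [(tid, t["account"], t["amount_cents"]) for tid, t in txns.items()],
        )
        # Drop rows for transactions that were deleted from the document.
        existing = [row[0] for row in db.execute("SELECT id FROM txns")]
        stale = [(tid,) for tid in existing if tid not in txns]
        db.executemany("DELETE FROM txns WHERE id = ?", stale)


def totals_by_account(db):
    """Example of a query that is awkward on the raw document but easy in SQL."""
    return dict(
        db.execute("SELECT account, SUM(amount_cents) FROM txns GROUP BY account")
    )
```

Re-projecting the whole document on every change is the simplest version. A real implementation would more likely apply only the changed entries, which is presumably what "tailing" refers to.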


In a recent (abandoned) project, I used Reflect: https://reflect.net/

It was by far the most developer-friendly experience I've had trying to implement collaborative editing. The one thing it didn't have that Yjs did was built-in undo/redo.


what about PartyKit?

okay looks like partykit hides some extra metered costs with a call us button

reflect's pricing seems a lot easier to understand!

just trying to think how this will work with cloudflare + fastify


Interesting, hadn't heard of PartyKit. FWIW Reflect seems to be pivoting sometime over the next 6 months, and they're going to open source their code, along with instructions on how to self-host. So if you're looking for long-term cost reduction, that might not be a bad choice.


Yjs has the WebRTC adapter and appears to still win out in the edit benchmarks. I’ve used Yjs in two projects: once just for presence chat and syncing menus for a coaching site, and once in an overlay graphics app for a livestream.

The biggest problem with Yjs for me has been the ergonomics when I use it with React. There’s a third party project called synced store that I used for the stream overlay, but it has some strange behavior.

With first-party React support in Automerge, I think it’s worth a shot for my rewrite.


The big yjs problem for me is the documentation. A few of the most important sections are just “todo” placeholders, like the “how to write a provider” section.

It’s great at first, but woefully underdocumented if you want to use it in a way that doesn’t have off the shelf support, and the code is tough to parse (for me at least). Same with subdocuments.

I wanted to use it with Lexical a while back, but the Yjs plugin was too tightly coupled to a single data model and was too complicated to DIY.


Yjs has a bootstrap problem. To get it to sync you need to define the root schema and add data, otherwise updates ignore arbitrary keys.

The best way to manage this is to start with a raw empty doc, define all the keys, then update that with a custom blob.

Once you have this doc on your clients, it's just a matter of applying the latest updates.

But yeah, Yjs has an eccentric code structure which is impossible to parse without understanding the ops required.

I'd look at the indexeddb and webrtc plugins.


In my case, my problem was wanting to sync key A’s data to/from Alice, and key B’s to/from both Alice and Bob. It was unclear to me where and when to apply which updates involved in the Yjs protocols (which I couldn’t find any documentation on).
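One way to frame the problem above, as a sketch only: assume each key lives in its own document, so that updates can be routed per document rather than filtered inside a single shared one. The `Router` class, its method names and the access table are hypothetical and not part of Yjs. Updates are treated as opaque bytes.

```python
from collections import defaultdict


class Router:
    """Decides which peers may send or receive updates for which document."""

    def __init__(self, access):
        # access: doc_id -> iterable of peer names allowed to sync that document
        self.access = {doc_id: set(peers) for doc_id, peers in access.items()}
        # peer name -> list of (doc_id, update) waiting to be sent to that peer
        self.outbox = defaultdict(list)

    def accept(self, doc_id, sender):
        """Return True if an inbound update for doc_id from sender may be applied."""
        return sender in self.access.get(doc_id, ())

    def broadcast(self, doc_id, update, sender=None):
        """Queue an update for every peer with access, except its sender."""
        for peer in sorted(self.access.get(doc_id, ())):
            if peer != sender:
                self.outbox[peer].append((doc_id, update))


# Key A is shared with Alice only; key B with both Alice and Bob.
router = Router({"A": {"alice"}, "B": {"alice", "bob"}})
router.broadcast("A", b"local-change-to-A")
router.broadcast("B", b"change-from-bob", sender="bob")
```

With this split, Alice's outbox receives both updates and Bob's receives none, because his own change to B is not echoed back to him. Whether separate documents fit a given app depends on whether the keys ever need to change together.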


There's also Microsoft's Fluid Framework and Azure Fluid Relay which is powering their O365/SharePoint product.



