
The concurrency model in Node is essentially the reactor pattern. While there are lots of concurrency patterns you should learn, the reactor pattern is one of them. The big thing to keep in mind is when you should use it and when you should not. After you spend time learning about concurrency in Erlang and Go (and practicing it), I recommend doing a mental kata and asking, "When would the reactor pattern be more appropriate than this?" If your answer is "never", then you should go back to Node (or simply implement the pattern in whatever language you like) and practice until you know what it's good for. (Hint: it is quite useful when implementing UIs or anything else that has a series of messages/events that have to be processed in linear order, or when you want to do something else at the same time while keeping full control over how many resources you are using...)
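To make the hint concrete, here is a minimal sketch of the reactor idea (all names are illustrative, not Node internals): handlers are registered per event type, and a single loop dequeues events and dispatches each one in arrival order.

```javascript
// Minimal reactor sketch: one loop, handlers keyed by event type,
// events processed strictly in the order they arrived.
class Reactor {
  constructor() {
    this.handlers = new Map(); // event type -> handler function
    this.queue = [];           // pending events
  }
  register(type, handler) {
    this.handlers.set(type, handler);
  }
  dispatch(type, payload) {
    this.queue.push({ type, payload });
  }
  run() {
    while (this.queue.length > 0) {
      const { type, payload } = this.queue.shift();
      const handler = this.handlers.get(type);
      if (handler) handler(payload);
    }
  }
}

const log = [];
const reactor = new Reactor();
reactor.register('click', (p) => log.push(`click:${p}`));
reactor.register('key', (p) => log.push(`key:${p}`));
reactor.dispatch('click', 'ok');
reactor.dispatch('key', 'a');
reactor.run();
// log is now ['click:ok', 'key:a'] -- handled in linear order
```

The single dispatch loop is what gives you the "full control over resources" part: nothing runs until the loop hands it an event.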


A few places where I could see using the reactor pattern over the "concurrency unit (i.e. green process, thread, co-routine) per connection" model are very high-performance scenarios and cases where callback chains are very short.

Think proxies (haproxy), routers, forwarders, web servers (nginx), etc., where memory per connection should be minimal and everything should be as close as possible to the select/poll/epoll loop.

Funny enough, this also includes demos, quick example scripts, and benchmarks. I wonder if that's what hooked most people on the reactor pattern -- small examples in Twisted, Node, etc. look pretty easy and simple. But when you start adding business logic, callback chains evolve into callback/errback trees 10 levels deep, and things get very scary.
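As a sketch of how those trees form, here is a hypothetical three-step lookup in classic Node callback style. The helpers (loadUser, loadOrders, loadInvoice) are invented for illustration and answer synchronously only to keep the example self-contained; note how every level repeats the error-forwarding dance.

```javascript
// Hypothetical helpers following the Node (err, result) callback convention.
// They call back synchronously here purely to keep the sketch self-contained.
function loadUser(id, cb)    { cb(null, { id }); }
function loadOrders(user, cb) { cb(null, [1, 2]); }
function loadInvoice(orderId, cb) { cb(null, `inv-${orderId}`); }

function firstInvoice(userId, cb) {
  loadUser(userId, (err, user) => {
    if (err) return cb(err);              // every level repeats this dance
    loadOrders(user, (err, orders) => {
      if (err) return cb(err);
      loadInvoice(orders[0], (err, invoice) => {
        if (err) return cb(err);
        cb(null, invoice);                // three levels deep already
      });
    });
  });
}

let result;
firstInvoice(42, (err, invoice) => { result = invoice; });
// result is 'inv-1'
```

Three trivial steps already produce a pyramid; add retries, branching, and per-step error handling and you get the 10-level trees described above.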


You make a very important point here. When you're creating a very high-throughput network server, you want to use as little memory per connection as possible. Reactor and proactor I/O make it very easy to reason about execution order and, more importantly, memory usage. This kind of code usually gets written in C (or at the very best C++), and minimum memory usage per socket is carefully planned.

Co-routine context and green-thread stack size can sometimes be tweaked and optimized, but if you want precise control over memory allocated per connection, reactor/proactor is hard to beat.

Besides, to be 100% honest, C and C++ just don't have native support for any other concurrency model. That's probably the main reason Node.js was designed to use callbacks. I'm pretty sure it would use generators or async functions if it were designed today.
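A sketch of what that same kind of flow looks like with async functions (the helpers are hypothetical, as before): errors now propagate as exceptions instead of being threaded through every callback.

```javascript
// Hypothetical async helpers standing in for real I/O calls.
async function loadUser(id)        { return { id }; }
async function loadOrders(user)    { return [1, 2]; }
async function loadInvoice(orderId) { return `inv-${orderId}`; }

async function firstInvoice(userId) {
  // Flat sequential code: no pyramid, and a thrown error from any
  // step rejects the returned promise automatically.
  const user = await loadUser(userId);
  const orders = await loadOrders(user);
  return loadInvoice(orders[0]);
}

firstInvoice(42).then((invoice) => {
  // invoice === 'inv-1'
});
```

The control flow is still cooperative and single-threaded underneath; only the notation changed.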


Funny thing about "high-performance scenarios": the reactor pattern is often slower than other patterns. If you want to use the reactor pattern, you have to use asynchronous I/O, and asynchronous I/O involves making more syscalls. Blocking I/O is actually rather fast. The performance differences are going to depend on a lot of particulars.


If someone has told you synchronous I/O is slow, they've obviously misunderstood the C10k argument that has been going on for a while.

The problem with synchronous I/O is not speed, but blocking. Blocking means the only possible concurrency model is based on processes or threads. Consequently, most performance problems with synchronous I/O stem from the thread model: context switching, synchronization, memory overhead, thread-pool starvation and so on.


> Blocking IO is actually rather fast.

Right. Yeah, there was a presentation about Java concurrency patterns on exactly that, basically dispelling the myth that everything non-blocking and asynchronous will be faster than the old-school blocking thread-per-socket model.

Interestingly, I believe haproxy and nginx use a hybrid model. They have a worker per CPU, all listening on the same socket using epoll from multiple threads! When data arrives they all get woken up and then use a shared mutex to decide which one will handle the request. (Later kernels fixed this: one can now have exclusive wake-up on the same socket.)


Nginx's implementation may have changed since I last delved into it (and I delved pretty deep), but at least two years ago that was not the case. Nginx did have some vestiges of an abandoned multi-threaded implementation attempt from the 0.x days behind compiler flags, but beyond that it was thoroughly single-threaded, with a master process that forks itself into multiple worker processes.

There is a so-called accept_mutex, but I'm pretty sure you can't avoid that if you want multiple cores handling connections from the same port. Even Erlang would have to do that somewhere behind the scenes. Newer Linux kernel versions support SO_REUSEPORT, which is meant to address this situation -- I guess this is what you're referring to as exclusive wake-up across the same socket?


Interestingly, Java 7’s "native non-blocking IO" actually uses exactly that – you have a ServerSocket, and, if a user connects to the server, it spawns a Thread and gives it a normal SocketChannel.


This completely depends on the kernel.


> proxies (haproxy), routers, forwarders, web servers (nginx)

...all stuff that people should not implement in javascript in the first place.


Someone at a place I used to work got the Node.js fever and stuck node's http proxy in front of our public facing servers. That was an epic disaster from multiple points of view: stability, performance, debuggability etc.


Or you use promises/observables (with arrow functions) or generators or async functions, and they don't. 2014 called, they want their node problems back.

Basically almost none of the criticism in this whole thread is true today. The only remaining true bit is that node still relies on cooperative multitasking, so one of the tasks can hog the CPU of a single worker. Which isn't very good, but still way better than, say, the good old Rails one-request-per-process model.

Someone should probably do a proper benchmark to show how different numbers of workers at different CPU workloads affect a node service's response time / latency. Especially with multiple processes (cluster), I would bet the effect would be much better than what people expect it to be.


We should remember to distinguish between concurrent, parallel, and asynchronous programming. UI is a bad example for the reactor pattern, since the reactor pattern is for concurrent processing and UIs should usually process events sequentially (but asynchronously). That's just a bog-standard event loop, not the reactor pattern.

"Practicing" different concurrency patterns sounds like a lot of tedious effort, and the suggestion to "go back to Node" seems a bit high-handed to me. I would encourage everyone to pick and choose what expertise they want to develop.


Actually, UIs process data synchronously; that's why Qt, for example, doesn't need to std::mutex everything.


When I say "asynchronous" I'm talking about the arrival of events. It's not a particularly well-defined word, though.


> The concurrency model in Node is essentially the reactor pattern. While there are lots of concurrency patterns that you should learn, reactor pattern is also one of them.

A usable reactor pattern requires green threads (or threads). Node.js does classic I/O multiplexing (i.e. what has been available in C since the introduction of select()). It's not bad, but don't delude yourself into believing Node.js does anything new.


Um... you can get the "reactor pattern" (I really hate SV rebranding of stuff that's existed forever by another name) with gen_event and gen_server in Erlang.

The pattern isn't special or strongly tied to Node.js.


Not sure how long your "forever" is, but FWIW, the name Reactor Pattern is at least as old as the ACE network library (which, if it didn't invent it, at least popularized it), i.e. circa 1995.

That predates the open sourcing of Erlang by a few years.

Also ACE didn't originate in SV.



