Living in a state of accord.

LMAX Disruptor – High Performance, Low Latency and Simple Too

The LMAX disruptor is an ultra-high performance, low-latency message exchange between threads. It's a bit like a queue on steroids (but quite a lot of steroids) and is one of the key innovations used to make the LMAX exchange run so fast. There is a rapidly growing set of information about what the disruptor is, why it's important and how it works – a good place to start is the list of articles and for the on-going stuff, follow LMAX Blogs. For really detailed stuff, there's also the white paper (PDF).

While the disruptor pattern is ultimately very simple to work with, setting up multiple consumers with the dependencies between them can require a bit too much boilerplate code for my liking. To make it quick and easy for 99% of cases, I've whipped up a simple DSL for the disruptor pattern. For example, to wire up a "diamond pattern" of consumers:

Diamond pattern of consumers. C3 depends on both C1 and C2 completing.

(Image blatantly stolen from Trisha Gee's excellent series explaining the disruptor pattern)

In this scenario, consumers C1 and C2 can process entries as soon as the producer (P1) puts them on the ring buffer (in parallel). However, consumer C3 has to wait for both C1 and C2 to complete before it processes the entries. In real life this might be because we need to both journal the data to disk (C1) and validate the data (C2) before we do the actual business logic (C3).

With the raw disruptor syntax, these consumers would be created with the following code:

Executor executor = Executors.newCachedThreadPool();
BatchHandler handler1 = new MyBatchHandler1();
BatchHandler handler2 = new MyBatchHandler2();
BatchHandler handler3 = new MyBatchHandler3()
RingBuffer ringBuffer = new RingBuffer(ENTRY_FACTORY, RING_BUFFER_SIZE);
ConsumerBarrier consumerBarrier1 = ringBuffer.createConsumerBarrier();
BatchConsumer consumer1 = new BatchConsumer(consumerBarrier1, handler1);
BatchConsumer consumer2 = new BatchConsumer(consumerBarrier1, handler2);
ConsumerBarrier consumerBarrier2 =
ringBuffer.createConsumerBarrier(consumer1, consumer2);
BatchConsumer consumer3 = new BatchConsumer(consumerBarrier2, handler3);
ProducerBarrier producerBarrier =

We have to create our actual handlers (the two instances of MyBatchHandler), plus consumer barriers, BatchConsumer instances and actually execute the consumers on their own threads. The DSL can handle pretty much all of that setup work for us with the end result being:

Executor executor = Executors.newCachedThreadPool();
BatchHandler handler1 = new MyBatchHandler1();
BatchHandler handler2 = new MyBatchHandler2();
BatchHandler handler3 = new MyBatchHandler3();
DisruptorWizard dw = new DisruptorWizard(ENTRY_FACTORY, RING_BUFFER_SIZE, executor);
dw.consumeWith(handler1, handler2).then(handler3);
ProducerBarrier producerBarrier = dw.createProducerBarrier();

We can even build parallel chains of consumers in a diamond pattern: Two chains of consumers running in parallel with a final consumer dependent on both.

(Thanks to Trish for using her fancy graphics tablet to create a decent version of this image instead of my original finger painting on an iPad…)

dw.consumeWith(handler1a, handler2a);
dw.after(handler1b, handler2b).consumeWith(handler3);
ProducerBarrier producerBarrier = dw.createProducerBarrier();

The DSL is quite new so any feedback on it would be greatly appreciated and of course feel free to fork it on GitHub and improve it.

  • Trisha says:

    I’ll get right on it. After my holiday :)

    July 11, 2011 at 11:19 am
  • Anton Tagunov says:

    Hi Adrian,

    do you happen to know if this stuff has got anything in common with HawtDispatch?
    I think both are “marketed” under the same monicers


    July 12, 2011 at 12:06 am
  • Adrian Sutton says:

    They are completely separate. From a quick glance at HawtDispatch they also attempt to solve slightly different problems. The disruptor is really about communication between producers and consumers – the fact that they are on separate threads is important and useful but not the key aim. LMAX hasn’t done a good job so far of describing exactly what the disruptor is and what it solves but we’re working on getting a good succinct description of it.

    July 12, 2011 at 7:09 am
  • Anton Tagunov says:

    I see.

    Is my understanding correct that the chief source of groundbreaking
    performance are the busy-waits?

    (Yes you have a different policy for consumers which does put threads to sleep
    but then I assume you loose the miraculous throughput I suppose.)

    (Yes the other source of high performance is the design which uses a single cyclic
    buffer to orchestrate a whole diamond shaped thing or a chain.)

    (Yes it’s a novelle thing for me that false sharing is to be avoided to
    make best use of CPU caches and reduce contention, kudos for pointing that out!)

    But the chief thing which makes a single producer-single consumer so fast are the busy waits.
    Is this right?

    For record I like this approach and I see where it has its place – if you do want the fastest thing in the world :)

    July 12, 2011 at 11:27 am
  • Anton Tagunov says:

    Yes another source of high performance is judicial use of memory (pre-allocation of adjacent areas in heap at jvm start).

    But if it hadn’t been for the busy-waiting you wouldn’t have been able to beat ArrayBlockingQueue in the Unicast (1P – 1C) scenario would you?

    July 12, 2011 at 11:57 am
  • Adrian Sutton says:

    The busy wait is an important part of the performance of the disruptor but by itself it’s not really a big innovation. Using compare and set primitives in a busy wait loop existed prior to the disruptor coming along. The busy loop is important because with a queue implementation the main bottleneck is the cost of locks (even when they’re uncontested). Using the locking wait strategy with the disruptor will still be faster than a queue but not by order of magnitudes.

    What really makes the disruptor perform well is the combination of all the things you mentioned. Two things are most often commented on around the office as key techniques:

    1. Strict separation of concerns and the two phase commit. . This enables a lot of the lock free behaviour as well as most of the smaller performance wins you note.

    2. Efficient batching behaviour that doesn’t increase latency. The consumers can run a lot faster because if they get behind a little they can process a whole batch of entries without needing to re-read the barrier location for each entry. That means less usage of shared state and the consumer runs faster. Latency is preserved by still incrementing the consumers indestructible after each entry is processed. This is again enabled by the separation of concerns so that the central ring buffer itself isn’t tracking where each consumer is up to like a queue would.

    It’s worth noting that I’m very much a student of the disruptor but have the pleasure of working with the masters who designed it. The white paper covers the design and benefits in quite good detail Bautista if you still have questions the google code mailing kist would be an excellent place to ask.

    July 12, 2011 at 1:19 pm
  • JW says:

    From a quick glance, and without knowing the details it appears this

    ConsumerBarrier consumerBarrier2 =
    ringBuffer.createConsumerBarrier(consumer1, consumer2);
    BatchConsumer consumer3 = new BatchConsumer(consumerBarrier1, handler3);

    should be

    BatchConsumer consumer3 = new BatchConsumer(consumerBarrier2, handler3);


    If so this mistake was also carried over into Trisha’s latest

    July 22, 2011 at 7:48 pm
  • Adrian Sutton says:

    You’re right, it should have been consumerBarrier2. In my defense, it was actually me that copied the mistake from Trish. :)

    July 22, 2011 at 8:17 pm

Your email address will not be published. Required fields are marked *