Theron - C++ concurrency library

More performance results

Published today: some performance results for Theron running on a "proper" machine, specifically a four-core Intel Xeon X5550 at 2.67GHz.

http://theron.ashtonmason.net/index.php?t=page&p=performance

Like the earlier results, these are timing measurements for the ThreadRing benchmark included in Theron-2.05.00. In this benchmark a "token" message is passed around a ring of 503 actors, with each actor forwarding the message to the next in the ring. The time for 50 million actor-to-actor "hops" is around 8 seconds, which is comparable to the times for the same benchmark in Erlang. The results confirm that Theron is fast.
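
For readers who haven't met ThreadRing before, the shape of the benchmark can be sketched in a few lines of plain C++. This is only an illustration and not Theron code: RingActor, run_ring and nanoseconds_per_hop are made-up names, and the "actors" here are just array entries visited by a simple loop. It also shows the arithmetic behind the headline figure: 50 million hops in roughly 8 seconds works out at about 160 nanoseconds per hop.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an actor: it only knows its successor in the ring.
struct RingActor {
    std::size_t next;        // index of the actor to forward the token to
    std::uint64_t received;  // number of times this actor has forwarded the token
};

// Passes a token around a ring of `num_actors` actors for `hops` hops,
// starting at actor 0. Returns the index of the actor holding the token
// once the hop counter reaches zero.
std::size_t run_ring(std::size_t num_actors, std::uint64_t hops) {
    if (num_actors == 0) {
        return 0;  // nothing to do for an empty ring
    }
    std::vector<RingActor> ring(num_actors);
    for (std::size_t i = 0; i < num_actors; ++i) {
        ring[i].next = (i + 1) % num_actors;  // the last actor links back to the first
        ring[i].received = 0;
    }
    std::size_t current = 0;
    for (std::uint64_t token = hops; token > 0; --token) {
        ++ring[current].received;      // this actor handles the token...
        current = ring[current].next;  // ...and forwards it: one "hop"
    }
    return current;
}

// Average cost of one hop in nanoseconds, given a measured wall-clock time.
double nanoseconds_per_hop(double seconds, std::uint64_t hops) {
    if (hops == 0) {
        return 0.0;
    }
    return seconds * 1e9 / static_cast<double>(hops);
}
```

Calling run_ring(503, 50000000) walks the token round the ring the way the benchmark does, and nanoseconds_per_hop(8.0, 50000000) gives the per-hop cost for the measured time.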

A few caveats apply, of course. For one, Theron implements a considerably stripped-down version of the Actor Model and doesn't offer the full functionality of Erlang. It also isn't a language, and so brings with it the hazards of C++ -- you can shoot yourself in the foot, basically. But for those who want the convenience of the Actor Model in a C++ environment, Theron provides the raw performance.

Another thing to note is that Theron is effectively running on a single thread in the ThreadRing benchmark, due to the use of tail-call optimization. Sending a message from one actor to the next around a ring is essentially a serial operation, with no opportunity for significant parallelism. The smart way to do it is therefore to use a single thread which "executes" each actor in turn, and that's what Theron does in this case. You can specify more worker threads, but the results are the same: in this very specialized case the additional threads offer no advantage, and so just spend the whole benchmark asleep.

In that sense ThreadRing is quite limited as a test of parallelism. From Theron's point of view, it's a test of message-passing overheads. The key to running it fast is to make the queueing of messages and the dispatch of "dirty" actors to the worker threads as cheap as possible. The results suggest Theron's implementation is pretty efficient.
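
To make the idea of queueing messages and dispatching "dirty" actors a little more concrete, here is a rough single-threaded sketch in standard C++. It is emphatically not Theron's API or its internals: Actor, Scheduler, send, run, work_queue and the rest are hypothetical names, and a real implementation would need locking and actual worker threads. The sketch assumes an actor becomes "dirty" when a message arrives in its mailbox, and that a worker picks dirty actors off a queue and runs them one at a time.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical names throughout; this is not Theron's API or its internals.
struct Actor {
    std::deque<std::uint64_t> mailbox;  // pending token values
    bool dirty = false;                 // true while the actor is queued or running
};

struct Scheduler {
    std::vector<Actor> actors;
    std::deque<std::size_t> work_queue;  // indices of dirty actors awaiting a worker
    std::size_t max_queue_length = 0;    // peak number of actors waiting at once
    std::uint64_t hops_done = 0;         // actor-to-actor forwards performed

    explicit Scheduler(std::size_t count) : actors(count) {}

    // Queue a message, and mark the receiver dirty if it is not already scheduled.
    void send(std::size_t to, std::uint64_t token) {
        Actor& receiver = actors[to];
        receiver.mailbox.push_back(token);
        if (!receiver.dirty) {
            receiver.dirty = true;
            work_queue.push_back(to);
            if (work_queue.size() > max_queue_length) {
                max_queue_length = work_queue.size();
            }
        }
    }

    // A single "worker" drains the queue, executing each dirty actor in turn.
    void run() {
        while (!work_queue.empty()) {
            const std::size_t index = work_queue.front();
            work_queue.pop_front();
            Actor& actor = actors[index];
            while (!actor.mailbox.empty()) {
                const std::uint64_t token = actor.mailbox.front();
                actor.mailbox.pop_front();
                if (token > 0) {
                    ++hops_done;
                    // Ring behaviour: forward a decremented token to the next actor.
                    send((index + 1) % actors.size(), token - 1);
                }
            }
            actor.dirty = false;  // mailbox drained; the actor is clean again
        }
    }
};
```

Seeding the ring with a call like scheduler.send(0, hops) followed by scheduler.run() passes the token round until it reaches zero. In this sketch max_queue_length never rises above 1 for a ring, since only one actor ever has a message waiting; that is the same reason extra worker threads would find nothing to do.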

A useful next step would be to write a benchmark which shows true parallelism, and how it can be exploited easily and to good effect in Theron.

Story published 5 July 2010.