Theron - C++ concurrency library

Version 6.00.02 released

Just a patch release with a couple of build fixes thanks to Josh Blum. One of them fixes build errors in Visual Studio 2012.

Story published 18 January 2014.

Version 6.00.01 released

Released a 6.00.01 patch to fix the broken C++11 GCC build. Thanks to Dmitry Sobinov for the fix.

I also noticed that the CustomAllocator tutorial sample is currently broken, and the use of custom allocators is likely broken too. Aiming to fix this in the next few days.

Story published 22 October 2013.

Removed the forum

I've removed the forum from the website. Thanks to those who contributed to it, but because it never seemed to reach its potential as a community, and because I've found that personal email actually works better, I've decided to remove it and save myself the maintenance. Please feel free to mail me directly in future, and I'll be using this news feed for announcements as before.

Story published 20 October 2013.

Version 6.00.00 released

After a period of inactivity, I've released Theron 6. This version re-introduces support for condition-variable-based thread synchronization, last seen in Theron 3.

The spinlock-based synchronization introduced in Theron 4 is still available.

The hope is that by providing both options (with the ability to specify the strategy on a per-Framework basis), Theron can support both those who value throughput and those who value low latency. While spinlocks are a useful tool for low latency coders, they don't necessarily represent a good default strategy for general users. See Framework::Parameters for more info.
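
To make the trade-off concrete, here is a minimal sketch in standard C++ of the two waiting styles. It is not Theron's internal code: SpinFlag, BlockingFlag and WaitForWorker are hypothetical names invented for this illustration. The spinning waiter keeps a core busy but reacts quickly, while the condition-variable waiter sleeps until signalled and leaves the core free for other work.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical illustration only; these are not Theron classes.

// Spinning waiter: polls an atomic flag, trading CPU time for low latency.
class SpinFlag {
public:
    SpinFlag() : mReady(false) {}
    void Signal() { mReady.store(true, std::memory_order_release); }
    void Wait() const {
        while (!mReady.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // stay runnable, never sleep on a lock
        }
    }
private:
    std::atomic<bool> mReady;
};

// Blocking waiter: sleeps on a condition variable until signalled.
class BlockingFlag {
public:
    BlockingFlag() : mReady(false) {}
    void Signal() {
        {
            std::lock_guard<std::mutex> lock(mMutex);
            mReady = true;
        }
        mCondition.notify_all();
    }
    void Wait() {
        std::unique_lock<std::mutex> lock(mMutex);
        mCondition.wait(lock, [this] { return mReady; });
    }
private:
    std::mutex mMutex;
    std::condition_variable mCondition;
    bool mReady;
};

// Starts a worker that publishes a value and signals the flag, then
// returns the value observed by the waiting thread.
template <class Flag>
int WaitForWorker() {
    Flag flag;
    int value = 0;
    std::thread worker([&] { value = 42; flag.Signal(); });
    flag.Wait();
    const int seen = value;
    worker.join();
    return seen;
}
```

Both flags give the same result; they differ only in what the waiting thread does in the meantime, which is the choice the per-Framework strategy is meant to expose.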

Theron 6 also fixes several long-standing bugs reported by users. Thanks to them and also to Josh Blum, without whom this release would never have happened. Lastly, if you spot any errors, let me know. Thanks.

Story published 20 October 2013.

Can't use Theron classes as global variables

A user has pointed out that in Theron 5, Theron classes can't be instantiated as global/static variables. In other words, they must always be constructed as local variables or dynamically via new(). This is true for basically all Theron classes, right from Theron::Address, through Theron::Actor, to Theron::EndPoint. I need to decide how I want to fix this, but for now, note that using Theron classes as global variables is out. Thanks to Geir Horn for reporting the bug.
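
Until this is fixed, one way to keep an object globally reachable is to hold only a pointer at global scope and construct the object dynamically once main() is running. The sketch below uses a hypothetical stand-in type, Service, rather than a real Theron class, and the Startup and Shutdown function names are also assumptions made for the example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a library class that must not be constructed
// during static initialization. Not a Theron type.
struct Service {
    explicit Service(int id) : mId(id) {}
    int mId;
};

// Only a pointer lives at global scope, so no Service object is
// constructed before main() starts.
std::unique_ptr<Service> gService;

// Call near the start of main(): constructs the object dynamically.
void Startup() { gService.reset(new Service(7)); }

// Call before main() returns: destroys the object while the program
// is still in a well-defined state.
void Shutdown() { gService.reset(); }
```

The same pattern applies with a raw pointer and explicit new and delete if std::unique_ptr isn't available.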

Story published 7 March 2013.

Version 5.01.01 released

I released a patch for the 5.01 release, 5.01.01, which fixes some build warnings in Visual Studio x64 builds. Thanks to Brandon Rampersad for pointing these out!

Story published 14 December 2012.

Version 5.01.00 released

Theron 5.01 has been released, containing some new features and some bugfixes. Take a look at this forum post for more info:

http://forum.theron-library.com/viewtopic.php?f=8&t=39

Story published 10 November 2012.

Version 5.00.03 released

Yet another patch release for build breaks - apologies for the churn. I'm having to test a few different configurations, and seem to be forgetting to test some of them when I make fixes! This patch fixes a couple of build errors in Visual Studio builds.

Story published 20 September 2012.

Version 5.00.02 released

I've released another patch, 5.00.02, for the recent 5.0 release. The patch fixes some build errors which showed up in some situations.

Story published 17 September 2012.

Version 5.00.01 released

I've released a 5.00.01 patch for the recent 5.0 release. The patch fixes a bug where null addresses couldn't be used as the 'from' address when sending a message to a remote actor.

If you try out the remote/distributed actor support introduced in version 5 then let me know how it goes! For details see this page:

http://www.theron-library.com/index.php?t=page&p=distributed%20computing

Story published 16 September 2012.

Version 5.00.00 released: Distributed actor support!

Version 5 adds support for remote and distributed actors. Multiple Theron-based applications can be connected over a network, and messages can then be sent between actors (and receivers) on remote hosts. This release adds initial support: there are some limitations and there may also be bugs. See the release notes link below and please report any bugs by email to ash@ashtonmason.net or in the forum.

http://www.theron-library.com/index.php?t=notes&p=43

Story published 12 September 2012.

Version 4.02.02 released

Theron 4.02.02 now out: a patch release with some minor bugfixes. The most important is an improvement to the worst-case message response latency, especially in the case of sending messages between actors in different frameworks. Thanks to Josh Blum for alerting me to this.

Story published 4 September 2012.

Version 4.02.01 released

I patched the 4.02 release to fix a distinct lack of API reference documentation for the Framework class.

Story published 20 August 2012.

Version 4.02.00 released

Version 4.02 adds the API from version 3 that was initially missing from version 4. This includes functions for dynamically managing the size of a Framework's internal thread pool, functions for querying per-Framework event counters, and functions for querying the number of unprocessed messages queued at an actor.

Story published 20 August 2012.

Version 4.01.01 released

I've released a patch, 4.01.01, for the recent 4.01 release. The patch fixes a warning about unreachable code in ActorRef::Push().

Story published 19 August 2012.

Version 4.01.00 released

I've released version 4.01, which adds backwards compatibility with existing (3.x) code in the area of actor creation. See this forum post for details:
http://forum.theron-library.com/viewtopic.php?f=8&t=31

Story published 16 August 2012.

Version 4.00.00 released

Today I've released version 4.00.00, the first release of Theron 4. Version 4 represents a major change, with some significant API changes and a complete internal redesign. Treat 4.00.00 as an initial release: some features are missing and there may still be bugs.

See this forum announcement for more details:
http://forum.theron-library.com/viewtopic.php?f=8&t=28

Story published 15 August 2012.

Version 3.05.01 released

I just released a patch for version 3.05. The patch fixes a threading issue that could show up, I believe, when messages were sent to actors just as they were created or destroyed. Apologies to anyone affected.

Story published 17 April 2012.

Version 3.05.00 released

Out today, version 3.05.00. This release includes an improvement to Theron's internal memory block caching, and adds separate build configurations for the Just Thread threading library. The memory access pattern improvements give a significant speedup in benchmarks on some platforms.

Story published 11 April 2012.

Version 3.04.01 released

Release 3.04.01 is a patch release with a few minor bugfixes, mainly to gcc builds. I've added the -pthread option to the link flags in Boost builds, since it seems to be required in my testing on Linux. I've also fixed a bunch of cast warnings that showed up with gcc. Finally, the PingPong benchmark wasn't being built in the makefile, so I added it.

Story published 9 April 2012.

Build changes and the Getting Started page

As you may have seen, the 3.04 release of Theron introduces support for the new standard thread component of C++11.

As part of this, there have been some changes to the builds in both Linux and Windows. The makefile now supports a threads=std option, and the Visual Studio solution now has multiple configurations that allow the thread library to be selected from the Configuration drop-down listbox within the IDE. The available configurations are:

Boost Debug
Boost Release

Windows Debug
Windows Release

Std Debug
Std Release

On both platforms there are complications arising from the dependencies on external libraries. With Boost.Thread there is the dependency on Boost, which can live anywhere but typically is found in different ways on Linux and Windows. With std::thread there is the limited support for C++11 features in even recent compilers. And with Windows threads one of course needs to be using Windows!

When using std::thread in GCC builds you need to be using a recent version of GCC with support for the latest C++ features. When using Visual Studio your options are more limited since AFAIK only the beta of Visual Studio 2011 has support for the standard thread library. Since I'm using Visual Studio 2010 at present I'm testing std::thread support with Just Thread, which is a third-party implementation. For that reason the Std Debug and Std Release configurations in the Visual Studio solution are currently set up to expect Just Thread. All of this will simplify once std::thread becomes more widely supported, whereupon I may be able to remove support for Boost and Windows threads completely.

For now, take a look at the Getting Started page on the website as you begin to use the 3.04 release. I've tried hard to make it a more useful introduction to building with Theron, and it now covers all this stuff in some useful detail.
http://www.theron-library.com/index.php?t=page&p=getting%20started

Story published 1 April 2012.

Version 3.04.00 released

Version 3.04 adds support for std::thread, the new standard threading component of C++. This means that Theron can now use std::thread as its underlying thread library, as well as the existing support for Boost.Thread and Windows threads, which are both still supported.

The std::thread support consists of new implementations of the simple four-class API with which Theron wraps its underlying thread library, hiding the differences between different libraries. The new Std implementations were contributed by Chinasaur and are similar to those already provided for Boost.Thread, due to the basis of std::thread on Boost.Thread.

Support for std::thread is enabled via a new THERON_USE_STD_THREADS define, which defaults to 0. As well as the code support, support for std::thread extends to the included makefile and Visual Studio solution. The makefile now responds to threads=std as well as the existing threads=boost and threads=windows. The Visual Studio solution now includes separate configurations for building with the three supported thread libraries. The configurations are implemented using the new Property Sheet mechanism introduced in Visual Studio 2010, which allows the settings for a configuration to be specified consistently in one place for multiple projects within a solution.

The Getting Started page on the website has been completely rewritten to reflect these build changes, and is now much more comprehensive. Consult it for help with the new build options. It is expected that some tweaking of build settings such as include paths, library paths and library names may be required.

Finally, some notes. The location of Boost is assumed to be /usr/include and /usr/lib in makefile builds, and $(BOOST_LOCATION) in Visual Studio builds. Support for std::thread in Visual Studio is currently tested against Just Thread, a commercial third-party implementation of std::thread, and the library and include paths are set up accordingly. Makefile builds, on the other hand, have no dependency on Just Thread and instead assume a recent version of GCC with support for C++11 features such as std::thread.

Story published 1 April 2012.

Version 3.03.02 released

The 3.03.02 patch release is out of beta and has now been released. The released version is similar to the beta but does contain an additional fix to an issue introduced in the beta.

The 3.03.02 patch fixes several important bugs. The most important is switching to boost::mutex rather than the expensive boost::recursive_mutex in Boost.Thread builds. Originally I believed recursive mutex was needed to allow recursive construction of actors by other actors, but more recently I found a way to do it with the standard boost::mutex. As well as seeming to be considerably faster, the standard boost::mutex has the advantage of not hiding unintentional repeated locks of the same mutex. Several such cases have been tracked down and fixed.
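
The change itself isn't described here, but one common way to drop a recursive mutex is to have public methods take the lock exactly once and delegate to private helpers that assume the lock is already held. The following is a sketch of that general technique only, using std::mutex as a stand-in for boost::mutex. Registry and its methods are hypothetical names, not Theron code.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical example class; not part of Theron.
class Registry {
public:
    // Public entry points lock the mutex exactly once.
    void Add(int value) {
        std::lock_guard<std::mutex> lock(mMutex);
        AddLocked(value);
    }

    // Adds a value and its double. Calls the private helper rather than
    // Add(), so the already-held mutex is never locked a second time.
    void AddPair(int value) {
        std::lock_guard<std::mutex> lock(mMutex);
        AddLocked(value);
        AddLocked(value * 2);
    }

    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mMutex);
        return mValues.size();
    }

private:
    // Precondition: the caller holds mMutex.
    void AddLocked(int value) { mValues.push_back(value); }

    mutable std::mutex mMutex;
    std::vector<int> mValues;
};
```

With a recursive mutex, AddPair() could have called Add() directly and the double lock would have gone unnoticed; with a plain mutex that mistake surfaces immediately.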

I had hoped that the switch to boost::mutex would resolve a severe performance issue reported by Neverlord, where Theron's performance seemed to degrade rapidly as the number of cores (and hardware threads) in the system increased. In the end the reported issue appears to be limited to systems with multiple physical CPUs -- systems with multiple cores on a single CPU are unaffected. But the move to boost::mutex doesn't fix the issue, and attempts to debug it have so far proved unsuccessful.

The issue shows up in both GCC (Boost.Thread) builds on Linux and Visual Studio (Windows Threads) builds on Windows. But as I say, systems with only one physical CPU are unaffected. My best guess is that it's caused by memory sharing (or perhaps false sharing?) causing excessive cache synchronization between the CPUs.
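
For readers unfamiliar with false sharing, here is a small sketch of the usual mitigation: padding per-thread data so that each item occupies its own cache line. It illustrates the concept only and is not a diagnosis of, or a fix for, the issue above. The 64-byte line size and the names PaddedCounter and CountInParallel are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Assumed cache line size; the real value depends on the hardware.
const std::size_t kCacheLine = 64;

// Each counter is aligned (and therefore padded) to a full cache line, so
// two threads updating neighbouring counters don't invalidate each
// other's cache lines.
struct alignas(kCacheLine) PaddedCounter {
    PaddedCounter() : value(0) {}
    std::atomic<long> value;
};

// Two threads each increment their own counter; returns the combined total.
long CountInParallel(long perThread) {
    PaddedCounter counters[2];
    std::thread first([&] {
        for (long i = 0; i < perThread; ++i) {
            counters[0].value.fetch_add(1, std::memory_order_relaxed);
        }
    });
    std::thread second([&] {
        for (long i = 0; i < perThread; ++i) {
            counters[1].value.fetch_add(1, std::memory_order_relaxed);
        }
    });
    first.join();
    second.join();
    return counters[0].value.load() + counters[1].value.load();
}
```

Without the alignment, both counters would typically sit in the same cache line and every increment on one thread would force the other thread's copy of that line to be refreshed.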

The patch does fix a number of other smaller issues, including two reported helpfully by Dan Newmarch. The first broke support for Actor types that derived from several base classes with multiple inheritance, and where Theron::Actor wasn't the first baseclass in the list. The second broke support for messages of zero size (empty structs) on ARM platforms, resulting in crashes.
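
The multiple-inheritance case is easy to get wrong because a base class that isn't first in the list generally lives at a non-zero offset inside the derived object. The sketch below shows the effect with stand-in types: ActorBase plays the role of Theron::Actor, and all the names are hypothetical. The exact layout is ABI-dependent, but on common compilers the second base is offset as shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; ActorBase plays the role of Theron::Actor.
struct Mixin {
    Mixin() : mixinData(1) {}
    virtual ~Mixin() {}
    int mixinData;
};

struct ActorBase {
    ActorBase() : actorData(2) {}
    virtual ~ActorBase() {}
    int actorData;
};

// ActorBase is deliberately NOT the first base class.
struct MyActor : public Mixin, public ActorBase {};

// True if the ActorBase subobject starts at a different address from the
// MyActor object itself (the case on common ABIs for this layout).
bool BaseIsOffset(MyActor *actor) {
    ActorBase *const base = static_cast<ActorBase *>(actor);
    return reinterpret_cast<std::uintptr_t>(base) !=
           reinterpret_cast<std::uintptr_t>(actor);
}

// Converting with static_cast in both directions applies the pointer
// adjustment, so the original address is recovered.
MyActor *RoundTrip(MyActor *actor) {
    ActorBase *const base = actor;       // implicit upcast adjusts the pointer
    return static_cast<MyActor *>(base); // downcast adjusts it back
}
```

Any code that treats the derived pointer and the base pointer as interchangeable addresses, for example via reinterpret_cast or a void pointer, skips this adjustment and ends up pointing at the wrong subobject.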

Story published 31 March 2012.

Version 3.03.02 (beta)

I've uploaded a beta version of a 3.03.02 patch release. This is an early version which I've uploaded in case it's useful to anyone who's encountered one of the several bugs it claims to fix.

The main fix is switching to use boost::mutex rather than boost::recursive_mutex in Boost.Thread builds, which I hope will cure a strange performance issue, reported by libcppa author Neverlord, where performance would degrade rapidly as the number of system cores increased.

I haven't yet managed to confirm the fix, but it looks promising. There are also fixes for several other issues, including some pointed out usefully by users. See the release notes for info. More details to follow with the actual release, which I expect will be in the next few days.

Story published 29 March 2012.

Version 3.03.01 released

Version 3.03.01 is a patch release to fix a pointer alignment bug that caused occasional crashes on 64-bit platforms. Thanks to Josh Blum and Nicolas Thomasson for pointing it out.

Story published 12 March 2012.

Theron forum

The Theron website now has a forum! I hope this will allow us to share the knowledge, help each other out, and get to know one another a bit better.

Drop by and sign up at http://forum.theron-library.com

Story published 23 October 2011.

Version 3.03: Optimizations, bug-fixes, new features, plus x64 support

Version 3.03 is a mixture of bug-fixes, optimizations and new features.

Theron now supports x64 builds in Visual Studio, and comes with a Visual Studio 2010 solution.

Raw message processing performance has been improved by between 5% and 10%, plus a further 20% reduction in benchmark times due to message type registration and x64 support. See the performance page on the website.

New features include reporting of undelivered and unhandled messages. Users can now register a per-framework 'fallback' handler which is executed for unhandled messages. The default handler asserts and dumps the details of the message to aid debugging.

It's also now possible to query an actor for the number of messages it has pending, useful for load balancing within actor subsystems.

Thanks to 4xCoder for helpful suggestions.

Story published 23 October 2011.

Version 3.02.01: Fixes for x64 warnings

Version 3.02.01 is a patch release, and fixes some build warnings seen in x64 builds. In future Theron will be adding full support for 64-bit builds, including a Visual Studio 2010 solution file for Windows users. But for now this patch just fixes the warnings. Thanks to 4xCoder for pointing these out.

Story published 17 October 2011.

Improvements to API documentation

In tandem with the release of 3.02, some work has been done to substantially improve the quality of the included API reference documentation, which is also available online:
http://www.theron-library.com/docs/3.02/

Story published 16 October 2011.

Version 3.02.00 released

Another week, another release. Version 3.02 is all about performance, with a new PingPong benchmark and reductions in message processing overheads of 15% or more compared to 3.01. See the Performance page on the website for the juicy details:
http://www.theron-library.com/index.php?t=page&p=performance

Story published 16 October 2011.

Version 3.01.00 released

Version 3.01 is a minor release and improves the default allocator, used by default within Theron for all of its allocations (including actors and internal copies of messages).

The default allocator, previously called Theron::Detail::SimpleAllocator, has been made a public class Theron::DefaultAllocator.

DefaultAllocator now supports aligned allocations, so can be used with aligned actor and message types (in conjunction with the alignment markup macros THERON_ALIGN_ACTOR and THERON_ALIGN_MESSAGE).

The checking performed by the DefaultAllocator in debug builds (by default, and under the control of a define) has been extended. The allocator is now able to report current and peak memory usage, and the checks for memory leaks now count bytes allocated rather than just allocations. All allocations are now guardband-checked, allowing for detection of overflows. The included benchmarks now report peak memory usage, in debug builds.
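
As a rough picture of what guardband checking and byte counting involve, here is a minimal sketch of an allocator that surrounds each block with a known byte pattern and verifies it on free. This is not the DefaultAllocator implementation. GuardedAllocator, its method names, the guard size and the fill pattern are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical sketch; sizes and pattern are arbitrary choices.
const std::size_t kGuardBytes = 8;         // bytes of guard on each side
const std::size_t kHeaderBytes = 16;       // stored size + front guard
const unsigned char kGuardPattern = 0xFD;  // fill value for guard bytes

class GuardedAllocator {
public:
    GuardedAllocator() : mBytesAllocated(0), mPeakBytes(0) {}

    // Layout: [size][front guard][user bytes][back guard]
    void *Allocate(std::size_t size) {
        unsigned char *const raw = static_cast<unsigned char *>(
            std::malloc(kHeaderBytes + size + kGuardBytes));
        if (raw == 0) return 0;
        std::memcpy(raw, &size, sizeof(size));
        std::memset(raw + kHeaderBytes - kGuardBytes, kGuardPattern, kGuardBytes);
        std::memset(raw + kHeaderBytes + size, kGuardPattern, kGuardBytes);
        mBytesAllocated += size;
        if (mBytesAllocated > mPeakBytes) mPeakBytes = mBytesAllocated;
        return raw + kHeaderBytes;
    }

    // Frees the block; returns false if either guard band was overwritten.
    bool Free(void *memory) {
        unsigned char *const user = static_cast<unsigned char *>(memory);
        unsigned char *const raw = user - kHeaderBytes;
        std::size_t size = 0;
        std::memcpy(&size, raw, sizeof(size));
        const bool intact =
            GuardIntact(user - kGuardBytes) && GuardIntact(user + size);
        mBytesAllocated -= size;
        std::free(raw);
        return intact;
    }

    std::size_t BytesAllocated() const { return mBytesAllocated; }  // leak check
    std::size_t PeakBytes() const { return mPeakBytes; }

private:
    static bool GuardIntact(const unsigned char *guard) {
        for (std::size_t i = 0; i < kGuardBytes; ++i) {
            if (guard[i] != kGuardPattern) return false;
        }
        return true;
    }

    std::size_t mBytesAllocated;
    std::size_t mPeakBytes;
};
```

A non-zero BytesAllocated() at shutdown indicates a leak measured in bytes, and a false return from Free() indicates that something wrote past one end of the block.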

The define that enables checking of allocations has been renamed from THERON_ENABLE_SIMPLEALLOCATOR_CHECKS to THERON_ENABLE_DEFAULTALLOCATOR_CHECKS. The legacy name is still supported, for now.

Lastly the release includes a couple of small bugfixes, neither of which seemed to make much difference in reality.

Story published 9 October 2011.

Theron and Matlab

I thought I'd share some work done recently by Peter Li to use Theron within Matlab, for parallelization of scientific computing within that environment.

Peter presents two posts on his blog, Absurdly Certain. The first is a discussion of concurrent programming within Matlab, the second a HOWTO guide aimed at helping other Matlab users wanting to add concurrency support via Theron.

http://absurdlycertain.blogspot.com/2011/09/simpler-concurrent-matlab-programming.html

http://absurdlycertain.blogspot.com/2011/09/preamble-what-follows-is-guide.html

In the process, Peter overcomes some issues related to shared libraries in Matlab and a confusing array of different Boost versions, but in the end gets Theron working with Matlab in both Mac and Linux environments. He reports useful speedups of around 4.5x, using 6 cores, in his particular use case.

(Apologies for the cut-and-paste links -- my news feed doesn't currently support HTML, for no good reason).

Story published 5 October 2011.

Version 3.00.00 released

I'm happy to announce the release of version 3.0 of Theron.

Version 3.0 brings a couple of major changes, but relatively few API changes, so should be a mostly painless upgrade.

Arguably the biggest change in 3.0 is to the license under which Theron is licensed. As of version 3.0, Theron is distributed under the MIT license. Earlier versions continue to be distributed under the Creative Commons Attribution 3.0 license, as before.

The reason for the license change is to address a long-standing issue kindly pointed out by Michel Boto: The Creative Commons licenses are explicitly not recommended for use in licensing software products.

As well as the license change, Theron 3.0 brings performance improvements to the tune of a 10-to-30 percent reduction in benchmark times. A lot of work has been done to streamline memory usage, mainly along the lines of a more data-oriented design where actor objects are stripped down to just 60 bytes -- with 32 bytes of that in stripped-down 'core' objects that are pool-allocated to improve cache coherence. The biggest speedups are in benchmarks that test real concurrency between multiple competing actors, but even the basic ThreadRing benchmark shows a 10-15 percent reduction of raw message passing overheads.

A new set of performance results is available, detailing the improvements over the previous version, 2.11. See the Performance section of the website for all the details.

The 3.0 release does bring a couple of API changes, some of them significant. On the plus side they shouldn't affect many users. Mainly, the semantics of the Address class have changed. Valid addresses can now only be derived from 'real' addressable objects, ie. actors and receivers. Default-constructed addresses are now null, ie. equal to Address::Null(). This differs from previous releases where a valid address could be generated just by default-constructing an address object. Accordingly, the Framework::CreateActorAtAddress() methods have been removed. Actor addresses are now entirely under the control of Theron, and it's no longer possible to default-construct a unique, valid address and then explicitly create an actor at that address. I hope this change doesn't cause anyone much trouble, but give me a shout if you need a hand updating.

See the release notes for more details of API changes in the 3.0 release.

Story published 2 October 2011.

Performance results for Theron-2.11.00

The "performance" page on the website has been updated with fresh performance results for the 2.11.00 release of Theron, measured using the three benchmarks included in that release. Some analysis is included.

Aside from being more up-to-date than the earlier results for 2.05.00, these results include measurements for benchmarks aimed at more "realistic" scenarios in which multiple actors are sending messages at once, leading to contention for resources.

Note that all results were recorded with Windows threads in a Visual Studio 2008 build. From what I've seen, results with Boost threads under gcc are currently somewhat slower.

Story published 13 September 2011.

Version 2.11.00 released

Released today, version 2.11 of Theron. This release fixes a potential race-condition in static initialization, but is otherwise focused on cleaning up the included set of samples and demos. It also adds some new benchmarks, with a view to later publishing a wider and more representative range of benchmark results.

Previously the Theron distribution included both a Samples folder and a Demos folder. The distinction was a bit hazy, but essentially samples were intended to be small, self-contained code examples aimed purely at education, while demos were intended to illustrate bigger stories and were allowed to take command-line arguments and perform real computation.

Some of the demos looked suspiciously like benchmarks, and others looked suspiciously like samples. And some of them seemed a little pointless. So the Demos folder has been broken up: the 'demos' it contained have been either renamed as Samples or Benchmarks, or removed altogether.

There's now a Benchmarks folder which contains a small collection of benchmarks, some of which are new.

The ThreadRing benchmark, which was previously called a Demo, is now there. The functionality of ThreadRing has been extended to allow multiple tokens to be created at once, rather than just one. As before, the created tokens are integers and are passed around the ring of actors, being decremented with each hop until they reach zero. The intention of introducing multiple tokens at once is to allow ThreadRing to be used to benchmark a more realistic scenario in which multiple actors are all trying to send and receive messages at the same time. While the existing ThreadRing behavior was interesting (and also comparable to other published results for the same configuration), it was fundamentally limited in that it contained very little parallelism.
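
The message flow of the multi-token ring can be modelled in a few lines. The sketch below is a single-threaded simulation of the token passing only, not the benchmark itself, which runs real actors on multiple threads. SimulateThreadRing is a hypothetical name, and the initial placement of one token per consecutive actor is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Simulates tokens circulating around a ring of numActors actors
// (numActors must be greater than zero). Each token is an integer that
// is decremented on every hop and dropped when it reaches zero.
// Returns the total number of hops (messages forwarded).
std::size_t SimulateThreadRing(std::size_t numActors,
                               const std::vector<int> &tokens) {
    // Pending messages as (receiving actor index, token value).
    std::queue<std::pair<std::size_t, int> > pending;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        pending.push(std::make_pair(i % numActors, tokens[i]));
    }

    std::size_t hops = 0;
    while (!pending.empty()) {
        const std::pair<std::size_t, int> message = pending.front();
        pending.pop();
        if (message.second > 0) {
            // Forward the decremented token to the next actor in the ring.
            pending.push(std::make_pair((message.first + 1) % numActors,
                                        message.second - 1));
            ++hops;
        }
    }
    return hops;
}
```

With a single token only one message is ever in flight, which is why the original configuration offered so little parallelism; with several tokens the pending queue holds several messages at once, and in the real benchmark those can be processed concurrently.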

Another new benchmark added in 2.11 is ProducerConsumer, which is a versatile recreation of a typical multiprocessing scenario in which multiple producers are all trying to send messages to a single consumer at the same time. This also is intended to stress the message passing mechanism more than the old ThreadRing.

Lastly 2.11 adds a new CountingActor benchmark, which is a port of a benchmark I've seen online. In this scenario, a single Counter actor is sent a series of consecutive messages, each of which adds a passed integer value to an internal count stored in the Counter (hence the name). Finally the counter value is queried and reset via another message, and the final count is returned to the caller, where it is inspected. This benchmark also suffers from a lack of parallelism (since all the messages are passed in series) -- and is little different from a specialized usage of ProducerConsumer -- however it's useful to be able to score Theron against other Actor Model implementations, for which published results exist online.

In the next week or so I hope to publish some results for these new benchmarks, with pretty graphs etc -- and on a faster machine than the one I have at home! In the meantime feel free to download 2.11 and try them out yourself.

Story published 10 September 2011.

New domain: www.theron-library.com

As of today, the Theron project has a new home: www.theron-library.com

If you've really been paying attention you might also note that the website has had a minor nip and tuck -- but you may struggle to spot the differences unless you're some kind of weird stalker type.

More changes are afoot, too. The documentation could use a restructuring to make it clearer what's what, and where you are at any point. It's easy to get a bit lost at the moment.

Any feedback appreciated. Is the font too small?

Story published 9 September 2011.

Version 2.10.00 released

Another release of Theron today. Version 2.10 brings support for Boost 1.47, which amounted to adding a new define that seems to be required to prevent boost::thread from defaulting to dynamic/shared library calling conventions, in gcc builds. Weird.

There are also some improvements to the included makefile, making it more compatible, out-of-the-box, with typical Linux-based environments. Thanks to Peter Li for feedback on compatibility issues, and various helpful suggestions.

Otherwise, the release is mainly aimed at optimizations and improvements to the quality of the core actor processing code. Among the improvements is a "fix" for the dubious prior decision to allow the internal memory block caches to grow without limit (and never shrink). I've now reintroduced the checks that limit the sizes of the caches.

Currently the size limits are hardcoded and I'm still debating whether it's wise to expose them in a future release, given they are implementation details and may well change again later. The size limits of the memory block caches associated with the per-actor message queues are now quite small -- just three blocks -- with a view to reducing the per-actor memory overhead. The block cache for the core message queue pre-loads itself with memory blocks to encourage cache-coherence.

If the reintroduction of cache size limits had any negative effect on performance then that was offset by other optimizations, at least judging by the ThreadRing benchmark.

I've noticed that optimized builds using Boost threads are significantly slower, in the included benchmarks, than optimized builds using native Windows threads. The difference is quite marked, in the ThreadRing benchmark, but is likely to be less significant in real applications that do actual work rather than stress-test the thread system. It's possible that the Boost-based build was slowed (further?) by the switch to boost::recursive_mutex and boost::condition_variable_any (rather than the faster but non-recursive boost::mutex and boost::condition_variable), in Theron 2.09. Upgrading Boost from 1.43 to 1.47 didn't improve anything. More investigation required.

Story published 3 September 2011.

Version 2.09.00 released

Version 2.09 improves various features; in particular, it is now possible to create actors, and send messages, from within the constructors of other actors. Thanks to Francesco for reminding me that this wasn't possible before.

2.09 also adds support for testing whether ActorRef objects are null, useful for testing whether a call to CreateActor succeeded. It is also now possible to compare two ActorRef objects to see whether they reference the same actor.

A couple of new samples are included in the release. The NonTrivialMessage sample shows the use of non-POD types such as std::vectors as messages, and EnvelopeMessages shows the use of simple envelope classes that act as lightweight references to owned objects, and can be sent safely as messages without introducing shared memory.

The release also fixes some important threading bugs, mainly one which caused an assert or crash in rare situations on destroying an actor.

Finally 2.09 adds a comprehensive set of new tests aimed at validation and regression testing of supported features.

Story published 21 August 2011.

Version 2.08.02 released

Another patch release today, this time fixing a build warning introduced in the previous patch. The warning shows up in gcc builds and was caused by using an inline keyword on a pure-virtual function declaration.

Story published 19 June 2011.

Version 2.08.01 released

Theron 2.08.01 is a patch release that fixes a serious bug in the copying of messages (or rather, the construction of the copies). This bug was introduced in 2.07.00, I believe. It broke the use of complex/abstract data types such as STL containers as message values -- basically any type that requires explicit construction and can't be allocated trivially via reinterpret_cast on a block of memory.
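
The distinction matters because a type like std::vector owns resources and must be properly constructed, whereas a raw block of memory merely cast to the type has never had a constructor run on it. Here is a minimal sketch of copying a value into pre-allocated memory with placement new. The helper names are hypothetical and this is not Theron's message-copying code.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <vector>

// Copy-constructs a T inside a caller-supplied memory block, which must
// be large enough and suitably aligned for T. Merely casting the block
// to T* would skip the constructor and leave the object uninitialized.
template <class T>
T *CopyIntoBlock(void *block, const T &value) {
    return new (block) T(value);  // placement new runs the copy constructor
}

// Objects created with placement new must be destroyed explicitly
// before the underlying block is released.
template <class T>
void DestroyInBlock(T *object) {
    object->~T();
}
```

A typical use is to malloc a block of sizeof(T) bytes, call CopyIntoBlock, use the copy, then call DestroyInBlock followed by free.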

Thanks to Thomas for reporting this!

Story published 18 June 2011.

Version 2.08.00 released

Theron version 2.08 sees the introduction of new APIs for dynamic control over the threadpools used by Theron frameworks, allowing the number of threads to be varied at runtime.

Rather than implementing a particular thread management policy within Theron, the changes expose two new APIs by which users can implement their own. One API provides methods for querying and setting the number of threads used by each framework instance. Another provides methods for querying internal counters that track thread utilization. Together, these two APIs can be used to implement custom thread management schemes, expressed as actors.

The thread count API specifies the size of the threadpool indirectly by means of minimum and maximum bounds on the number of threads. The intention of this is to allow negotiation of the actual threadpool size between multiple concurrent subsystems, each with its own requirements. If two or more subsystems each specify a range of permissible threadpool sizes, the actual size is guaranteed to be within all the ranges, if the ranges overlap. Otherwise, the last call wins.
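
The negotiation rule can be read as an interval intersection with a fallback. The sketch below models that reading in isolation; it is not the Theron API. ThreadRangeNegotiator, Request, Lower and Upper are hypothetical names, and how a concrete thread count is then picked inside the range is left out.

```cpp
#include <cassert>

// Hypothetical model of the rule described above; not a Theron class.
class ThreadRangeNegotiator {
public:
    ThreadRangeNegotiator() : mHasRange(false), mLower(0), mUpper(0) {}

    // Registers one subsystem's permissible range [lower, upper].
    void Request(unsigned lower, unsigned upper) {
        if (mHasRange && lower <= mUpper && upper >= mLower) {
            // Ranges overlap: narrow to the intersection so the result
            // stays inside every range requested so far.
            if (lower > mLower) mLower = lower;
            if (upper < mUpper) mUpper = upper;
        } else {
            // First request, or no overlap: the last call wins.
            mLower = lower;
            mUpper = upper;
            mHasRange = true;
        }
    }

    unsigned Lower() const { return mLower; }
    unsigned Upper() const { return mUpper; }

private:
    bool mHasRange;
    unsigned mLower;
    unsigned mUpper;
};
```

For example, requests of 2 to 8 and then 4 to 16 leave a range of 4 to 8, while a later request of 10 to 12 doesn't overlap and simply replaces it.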

The performance counter API exposes a set of counters (currently only three), which can be queried to measure the processing done by the threads in the current threadpool, and so to estimate the usefulness of using more or fewer threads. The main counters track the number of times a thread is 'pulsed' in response to an arrived message, and the number of times a thread was actually woken (indicating how often a sleeping thread was available to do the work). By comparing the two counters, we can get an idea of how many times message processing was delayed due to all the software threads being busy.
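
As an illustration of comparing the two counters, the following sketch turns a pair of counter readings into a single figure: the fraction of pulses for which no sleeping thread was available. The function name and the interpretation as a simple ratio are assumptions; the counter values themselves would come from the query API.

```cpp
#include <cassert>

// Estimates the fraction of arrived messages that found every thread
// busy, given the number of pulses and the number of threads actually
// woken over the same period. Hypothetical helper; not a Theron function.
double BusyFraction(unsigned long pulsed, unsigned long woken) {
    if (pulsed == 0) {
        return 0.0;  // no messages arrived, so nothing was delayed
    }
    // Pulses that didn't wake a thread found no sleeping thread available.
    const unsigned long missed = (pulsed > woken) ? (pulsed - woken) : 0;
    return static_cast<double>(missed) / static_cast<double>(pulsed);
}
```

A manager actor could sample this periodically and raise the thread count while the fraction stays high, or lower it when the fraction stays near zero.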

Legacy applications which make no use of the new thread count API behave exactly as before, with the size of the threadpool being defined on construction of the framework (either explicitly or by default). Only if you wish to control the threadpool size dynamically do you need to use the new API.

Two new samples have been added. One demonstrates the counter query API, the other the thread count API. In addition a new demo shows how to tie the two APIs together to write a simple threadpool manager actor. The two new samples are both discussed by new tutorial lessons on the website.

Thanks to various people, in particular Carl Sturtivant, for helpful suggestions.

Story published 15 May 2011.

Version 2.07.01 released

Patch release, fixes an issue where the Win32 implementation of the Thread class was calling the CreateThread() Win32 API function directly, rather than via _beginthreadex().

Apparently CreateThread() shouldn't be called directly because it doesn't handle some per-thread resources correctly, leading potentially to memory leaks. Thanks to Jim Morris for pointing that out!

Story published 7 May 2011.

Version 2.07.00 released

As promised, version 2.07 has now been released, with support for aligned allocations of actors and messages.

Alignment of actors and messages is a specialized feature aimed mainly at users wishing to use Theron in embedded or games console environments where hardware restrictions may require certain types (eg. math vectors) to be allocated strictly on 16 or 128-byte boundaries, in order to be used with a hardware feature (DMA, vector processor, hardware cache, etc). The ability to mark message and actor types as requiring non-default alignment, together with the ability to replace the custom allocator with one that supports aligned allocations, allows Theron to be used in such environments. Actors and messages can contain aligned types as members.

As mentioned in the previous news story, the internal memory caching system has needed to be extensively overhauled to support caching of aligned memory blocks. Mainly this has resulted in a simplification of the caching, and some options have been removed in order to make the caching simpler (and faster).

As a result of the overhead of alignment checking (and perhaps the redesign of the caching system) I'm seeing a slight slowdown in raw message processing speed, as measured by the ThreadRing demo. The slowdown is of the order of around 5% in my tests. Note however that this represents raw message processing speed (in a demo that is doing nothing but sending millions of messages), and in practice real applications are not expected to see a significant slowdown. Let me know if I'm wrong.

Part of the simplification of the internal memory block caching has been the removal of fixed limits on the sizes of the caches. Removing the limits makes the implementation simpler and faster, and potentially makes the caches more effective. But it does invite the possibility that the caches grow larger than people would like. In practice I don't think they will, for the simple reason that new allocations tend to be serviced from the cache -- but in 'burst' situations the caches could grow large and then not shrink again. I'll revisit this if it turns out to be a problem, but my feeling is it won't.

I'd like to thank Brian Meidell for useful discussions regarding the support for alignment and optional disabling of RTTI.

Story published 18 February 2011.

Changes planned for 2.07

An update on some work I've been doing for a forthcoming 2.07.00 release.

Mainly I've been adding support for aligned allocation of actor and message objects, another feature useful in embedded and games console environments. The changes allow messages and actors to be 'marked up' with their memory alignment requirements. The markup is by means of macros, similar to the existing markup for type names.

// A message type that requires alignment.
struct THERON_PREALIGN(16) AlignedMessage
{
    // contents here
} THERON_POSTALIGN(16);

THERON_ALIGN_MESSAGE(AlignedMessage, 16);

When the marked objects are allocated by Theron, the system requests aligned memory blocks from the global allocator. The SimpleAllocator used by default doesn't actually support alignment, and just ignores alignment requests. But users can supply their own allocator, which does.
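
For anyone wondering what an alignment-aware allocator has to do underneath, here is a minimal sketch of one common technique: over-allocate, round the address up, and stash the original pointer just before the aligned block. AllocateAligned and FreeAligned are names invented for this example; they are not Theron's allocator interface, and a real allocator would plug the same idea into whatever interface the library expects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Allocates 'size' bytes aligned to 'alignment' (which must be a power of two).
inline void *AllocateAligned(std::size_t size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    // Keep the stashed pointer itself properly aligned.
    if (alignment < sizeof(void *))
    {
        alignment = sizeof(void *);
    }

    // Room for the payload, worst-case padding, and the stashed pointer.
    void *const raw = std::malloc(size + alignment + sizeof(void *));
    if (raw == nullptr)
    {
        return nullptr;
    }

    // Round up to the next aligned address after the space for the stashed pointer.
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t aligned = (start + mask) & ~mask;

    // Remember the original pointer so FreeAligned can release it.
    reinterpret_cast<void **>(aligned)[-1] = raw;
    return reinterpret_cast<void *>(aligned);
}

// Frees a block previously returned by AllocateAligned.
inline void FreeAligned(void *block)
{
    if (block != nullptr)
    {
        std::free(static_cast<void **>(block)[-1]);
    }
}
```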

On games consoles it is common for math vectors to need 16-byte alignment, meaning they must always be allocated at 16-byte boundaries in memory. Some objects must also be aligned to 128-byte cache line boundaries. The changes allow Theron to be used with message and actor objects containing such types.

In the process I've been reworking the memory block caching. As well as the allocation system, the caching system needs to be made alignment aware. Fortunately the changes generally amount to a simplification to the caching -- arguably one of the more overly complex areas of Theron.

So far there is a small slowdown in the ThreadRing benchmark, of around 10%. For example, sending 50 million messages on my home machine now takes around 29 seconds, whereas before it took around 27 seconds. I'm still optimizing and it may be possible to improve the slowdown; it's not clear yet how much of it is genuinely due to the alignment overhead and how much due to my forced rework of the caching.

Anyway 5% or 10% seems to me a worthwhile price to pay for the benefit of supporting aligned allocations. It's important to remember that's a 10% change in raw message processing speed, not in the overall speed of a real system, which is doing real processing and not just message handling.

Story published 13 February 2011.

Version 2.06.02 released

I've released another important patch release, fixing another bug in the enabling of debug functionality in debug builds.

Two errors are fixed: (1) THERON_DEBUG was being incorrectly tested with #ifdef instead of #if, causing debug functionality such as asserts to effectively always be enabled; (2) NDEBUG wasn't being defined in release mode, in makefile builds. This seems to be a tricky area for me :)
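
The first error is easy to reproduce in isolation. In this small sketch MY_DEBUG is a stand-in macro name chosen for the example (not the real Theron define), set to 0 as a release build might do:

```cpp
#include <cassert>

// Debug explicitly turned off.
#define MY_DEBUG 0

// Wrong: #ifdef only asks whether the macro exists, so a value of 0 still counts.
inline bool DebugEnabledViaIfdef()
{
#ifdef MY_DEBUG
    return true;
#else
    return false;
#endif
}

// Right: #if tests the macro's value.
inline bool DebugEnabledViaIf()
{
#if MY_DEBUG
    return true;
#else
    return false;
#endif
}

// Quick self-check of the two functions above.
inline void SelfCheck()
{
    assert(DebugEnabledViaIfdef());   // "enabled" even though MY_DEBUG is 0
    assert(!DebugEnabledViaIf());     // correctly disabled
}
```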

Definitely grab this patch release if you're on 2.06.01. It also fixes some smaller and fairly unimportant errors, see the release notes for details.

Story published 13 February 2011.

Version 2.06.01 released

This patch release fixes an important bug: asserts were completely disabled, even in debug builds. This was due to a missing include; now fixed. A couple of trivial fixes to build bugs but no other changes. Sorry to anyone bitten by this.

Story published 22 October 2010.

Version 2.06.00 released

Version 2.06 adds the ability to turn off C++ Runtime Type Information (RTTI). By default, Theron uses RTTI (via dynamic_cast) to match messages with message handlers by message type. Now, by registering the message types used in an application, users can cause Theron to use the registered names of the types instead. This allows RTTI to be turned off (by means of a compiler option).
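
To give a feel for how matching by registered name can work without RTTI, here is a minimal sketch. Everything in it (MessageName, SKETCH_REGISTER_MESSAGE, ErasedMessage, Wrap, TryCast) is invented for the example and is not Theron's actual registration macro or internals; it only assumes that each registered type gets a name and that names are compared in place of dynamic_cast.

```cpp
#include <cassert>
#include <cstring>

// Default: no name registered for the type.
template <class T>
struct MessageName
{
    static const char *Get() { return nullptr; }
};

// Hypothetical registration macro for this sketch: associates a type with its name.
#define SKETCH_REGISTER_MESSAGE(Type) \
    template <> struct MessageName<Type> { static const char *Get() { return #Type; } }

struct Ping { int value; };
struct Pong { int value; };

SKETCH_REGISTER_MESSAGE(Ping);
SKETCH_REGISTER_MESSAGE(Pong);

// A type-erased message carrying the registered name of its payload type.
struct ErasedMessage
{
    const char *name;
    const void *payload;
};

// Wraps a value of a registered type. The caller keeps the value alive.
template <class T>
ErasedMessage Wrap(const T &value)
{
    assert(MessageName<T>::Get() != nullptr);   // the type must be registered
    return ErasedMessage{MessageName<T>::Get(), &value};
}

// Returns the payload as a T if the names match, without dynamic_cast or typeid.
template <class T>
const T *TryCast(const ErasedMessage &message)
{
    const char *const wanted = MessageName<T>::Get();
    if (wanted != nullptr && message.name != nullptr && std::strcmp(message.name, wanted) == 0)
    {
        return static_cast<const T *>(message.payload);
    }

    return nullptr;
}
```

Because nothing here uses typeid or dynamic_cast, a sketch like this compiles with RTTI switched off.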

Turning off RTTI is an advanced feature and is useful mainly in embedded environments, such as games consoles, where memory is tightly constrained. Disabling RTTI avoids introducing a hidden storage overhead into every class.

Users with no need to disable RTTI can continue to not explicitly register their message types as before.

A new RegisteringMessages sample shows off the new feature. In addition, the online tutorial has been extended with lessons covering the RegisteringMessages and MultipleFrameworks samples. See the documentation for details.

Story published 17 October 2010.

Version 2.05.02 released

I've just released a patch release for Theron, version 2.05.02. This release fixes a fairly minor threading issue (a race-condition where two addresses created at the same time could theoretically land up with the same value), and makes some improvements to the API docs.

Story published 13 October 2010.

FAQ page

I've added an FAQ page, linked from the documentation menu on the website.

http://theron.ashtonmason.net/index.php?t=page&p=documentation

It's pretty basic so far with answers to only a few questions. If you have any questions you'd like addressed let me know.

Story published 6 August 2010.

Version 2.05.01 released

This patch release fixes a couple of important build issues. Most significantly, the debug builds of the Boost thread libraries were previously used for release builds, when built via the included makefile. Visual Studio builds were unaffected.

As an incidental change, the makefile now assumes the use of GCC 4.4 rather than 3.4. That came about because I've been experimenting with Qt, and Qt Creator provides GCC 4.4 as part of its distribution. Turns out it's vital to build all libraries with the same release - or at least libraries built with 3.4 and 4.4 seem to be incompatible. The pre-built libraries included with Theron are now also built using GCC 4.4.

If you're using GCC 3.4 yourself then just switch the Boost library names back to "mgw34" in the makefile.

The general advice is to build Boost with your favourite build configuration, build Theron the same way, and update the included makefile to point at your Boost libraries correctly (the names reflect how they were built). See the release notes for more information.

Story published 15 July 2010.

More performance results

Published today, some performance results for Theron running on a "proper" machine, specifically a four-core Intel Xeon X5550 2.67GHz.

http://theron.ashtonmason.net/index.php?t=page&p=performance

Like the earlier results, these are timing measurements for the ThreadRing benchmark included in Theron-2.05.00. They confirm that Theron is fast. A "token" message is passed around a ring of 503 actors, with each actor forwarding the message to the next in the ring. The time for 50 million actor-to-actor "hops" is around 8 seconds, which compares with the times for the same benchmark in Erlang.

A few caveats apply, of course. For one, Theron implements a considerably stripped-down version of the Actor Model and doesn't offer the full functionality of Erlang. It also isn't a language and so brings with it the hazards of C++ -- you can shoot yourself in the foot, basically. But for those that want the convenience of the Actor Model in a C++ environment, Theron provides the raw performance.

Another thing to note is that Theron is effectively running on a single thread in the ThreadRing benchmark, due to the use of tail-call optimization. The process of sending a message from one actor to another around a ring is essentially a serial operation with no opportunity for significant parallelism. Therefore the smart way to do it is using a single thread which "executes" each actor in turn. That's what Theron does, in this case. You can specify more worker threads, but the results are the same since the additional threads offer no advantage, in this very specialized case, and so just spend the whole benchmark asleep.

In that sense ThreadRing is quite limited, as a test of parallelism. It's a test of message passing overheads, from Theron's point of view. The key to running it fast is to make the queueing of messages and the dispatch of "dirty" actors to the worker threads as cheap as possible. The results suggest Theron's implementation is pretty efficient.

A useful next step would be to write a benchmark which shows true parallelism, and how it can be exploited easily and to good effect in Theron.

Story published 5 July 2010.

Performance results

I've already blogged about this in the blurb about the 2.05.00 release, but I thought I'd highlight that there are now some performance results for Theron online.

http://theron.ashtonmason.net/index.php?t=page&p=performance

Although tested on a somewhat humble machine, the results establish Theron's raw performance as being as good as (or better than) well-known Actor Model implementations such as Erlang and ActorFoundry.

(That said, it should be noted that those implementations are more fully-featured; Theron is quite lightweight by comparison and pretty "C++" in its design intentions).

More results, from a more representative four-core machine, to come in the near future.

Story published 3 July 2010.

Version 2.05.00 released

Version 2.05.00 is a minor release aimed at establishing some performance benchmarks.

The 'TokenRing' demo included in earlier releases has been renamed to 'ThreadRing', and rewritten to correctly implement the well-known thread-ring benchmark used for performance comparison of concurrent systems. The demo now matches the benchmark's traditional definition:

- create 503 linked threads (named 1 to 503)
- thread 503 should be linked to thread 1, forming an unbroken ring
- pass a token to thread 1
- pass the token from thread to thread N times
- print the name of the last thread (1 to 503) to take the token
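
The steps above can be sketched as a single-threaded simulation in plain C++. This is not the ThreadRing demo itself and involves no actors or threads; RunThreadRing is a made-up function that just follows the links of the ring, which is handy for checking the expected answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates the thread-ring benchmark on one thread.
// Returns the name (1-based) of the last ring member to take the token.
inline std::size_t RunThreadRing(std::size_t ringSize, unsigned long hops)
{
    assert(ringSize > 0);

    // next[i] is the index of the member that member i is linked to.
    std::vector<std::size_t> next(ringSize);
    for (std::size_t i = 0; i < ringSize; ++i)
    {
        next[i] = (i + 1) % ringSize;   // the last member links back to the first
    }

    std::size_t holder = 0;             // the token starts at member 1 (index 0)
    for (unsigned long h = 0; h < hops; ++h)
    {
        holder = next[holder];          // one hop
    }

    return holder + 1;
}
```

For a ring of 503 and 1000 hops the simulation returns 498, since 1000 leaves a remainder of 497 when divided by 503.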

Performance results for the new 'ThreadRing' benchmark are listed here:

http://theron.ashtonmason.net/index.php?t=page&p=performance

You can download the 2.05.00 release and read the release notes in full at this link:

http://theron.ashtonmason.net/index.php?t=downloads

Story published 3 July 2010.

Version 2.04.00 released

Version 2.04.00 of Theron is now available.

This release brings two important bugfixes and several significant optimizations. The TokenRing message processing speed benchmark is now faster by a factor of around two (ie. around a 50% reduction in execution times).

Hit the downloads page to grab a copy.

Story published 30 June 2010.

Version 2.03.00 released

Version 2.03.00 of Theron is now available for download. Another performance and benchmarking release, 2.03.00 sees a further 40% improvement in raw message processing speed (as shown by the execution times of the TokenRing benchmark).

The release adds a couple of new benchmarks/demos, plus improvements to the metrics feature useful in profiling.

Of the new demos and benchmarks, Stack is aimed at stress-testing the message processing architecture of the Theron 2.0 framework. Executing a large number of push and pop operations on a single shared stack actor in parallel, the demo stresses the thread-safety and synchronization of the underlying code, and gives reason to think that the current implementation is pretty robust.

Under the hood, a bunch of work has been done to reduce the allocation overhead of message passing. Used message buffers are now cached in a succession of free lists and pools. Each actor now contains a local free list, and a global free list catches message buffers not cached by individual actors.
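
The basic idea of a free list can be sketched in a few lines. This is a generic illustration rather than Theron's caching code: the FreeList class is invented for the example, handles a single block size, and is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// A simple free list of equally sized memory blocks.
// Freed blocks are threaded into a singly linked list and reused by later allocations.
class FreeList
{
public:
    explicit FreeList(std::size_t blockSize)
      : mBlockSize(blockSize < sizeof(Node) ? sizeof(Node) : blockSize),
        mHead(nullptr),
        mCount(0)
    {
    }

    FreeList(const FreeList &) = delete;
    FreeList &operator=(const FreeList &) = delete;

    // Releases any blocks still held in the cache.
    ~FreeList()
    {
        while (mHead != nullptr)
        {
            Node *const node = mHead;
            mHead = node->next;
            std::free(node);
        }
    }

    // Reuses a cached block if there is one, otherwise falls back to malloc.
    void *Allocate()
    {
        if (mHead != nullptr)
        {
            Node *const node = mHead;
            mHead = node->next;
            --mCount;
            return node;
        }

        return std::malloc(mBlockSize);
    }

    // Returns a block to the cache instead of freeing it.
    void Free(void *block)
    {
        assert(block != nullptr);
        Node *const node = static_cast<Node *>(block);
        node->next = mHead;
        mHead = node;
        ++mCount;
    }

    std::size_t CachedCount() const { return mCount; }

private:
    struct Node { Node *next; };

    std::size_t mBlockSize;
    Node *mHead;
    std::size_t mCount;
};
```

A block returned with Free() is handed straight back by the next Allocate(), so steady-state message traffic can run without touching the general-purpose allocator.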

Feature-wise, the 2.03.00 release adds support for a new Actor::TailSend() method that exploits tail-call optimization to avoid thread switching overhead in the common situation where a message handler sends a message as its last operation. Instead of waking a separate thread to process the receiving actor in parallel, the receiving actor is typically processed by the thread processing the sending actor.

Story published 27 June 2010.

Version 2.02.00 released

Hot on the heels of the 2.01.00 release comes 2.02.00, another minor release with bugfixes and optimizations.

Various optimizations show a time reduction of around 40 percent in the included TokenRing benchmark demo (in the two worker thread case).

Bugfixes include an important fix to handling of messages sent to actors during garbage collection of those actors.

A new Theron::Metrics class allows capture of event counts, recording the number of messages passed, the number of actors processed, and so on. A new #define and makefile option enable the collection of metrics, which is off by default (naturally).

As always let me know if you find any bugs or have any questions or feedback.

Story published 22 June 2010.

Version 2.01.00 released

Version 2.01.00 is a minor release providing bugfixes and optimizations to the 2.00.00 release. It also adds support for forced function inlining and a new TokenRing demo, which serves as a benchmark of message sending and processing speed.

More optimizations are planned.

Story published 19 June 2010.

Version 2.00.00 released

Version 2.00.00 of Theron is now available for download.

This version represents a complete redesign and was rewritten pretty much from scratch. The intention of the redesign was to address a bunch of design issues highlighted by user feedback (and a rethink). Accordingly much of the API has changed, and there's no backwards compatibility with previous versions.

The good news is that the new design is much simpler and easier to use. A lot of the slightly weird stuff from the previous releases has gone. The core public API now consists of just seven classes.

Writing an actor is now as simple as deriving from the Theron::Actor baseclass. Everything else is up to you. Instead of a single catch-all Receive() method, you now register message handlers which are associated with the type of message they accept, using the C++ type system in an obvious way. Multiple handlers can be registered for the same message type. Handlers can be registered in the actor constructor, rather than just in a specialized Initialize() method as previously.

Actors no longer have input and output ports - the whole concept has been removed. There's no concept of an actor being "connected" to another actor; knowing the address of an actor is enough to be able to send it a message.

An important benefit is that the handling of messages within actors is now much easier to understand. Messages arriving at an actor are processed in strict arrival order. The processing of a message consists of executing any message handlers registered for the message type.

Message handlers can be safely registered, deregistered, and re-registered, from within any message handler, even themselves. So handlers can be registered or deregistered in response to received messages, changing the message interface of the actor dynamically.
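
To make the handler model concrete, here is a small stand-alone sketch of dispatching by message type. It deliberately does not use the Theron API: SketchActor, RegisterHandler, DeregisterHandlers and Deliver are names made up for the example, and the sketch uses modern C++ (std::function and std::type_index) for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <vector>

class SketchActor
{
public:
    // Registers a handler for messages of type T. Several handlers may share a type.
    template <class T>
    void RegisterHandler(std::function<void(const T &)> handler)
    {
        assert(static_cast<bool>(handler));
        mHandlers[std::type_index(typeid(T))].push_back(
            [handler](const void *message) { handler(*static_cast<const T *>(message)); });
    }

    // Removes all handlers registered for messages of type T.
    template <class T>
    void DeregisterHandlers()
    {
        mHandlers.erase(std::type_index(typeid(T)));
    }

    // Runs every handler registered for T, in registration order.
    // Returns the number of handlers executed.
    template <class T>
    std::size_t Deliver(const T &message)
    {
        const auto it = mHandlers.find(std::type_index(typeid(T)));
        if (it == mHandlers.end())
        {
            return 0;
        }

        // Work on a copy so handlers may register or deregister while we iterate.
        const std::vector<Erased> handlers(it->second);
        for (const Erased &handler : handlers)
        {
            handler(&message);
        }

        return handlers.size();
    }

private:
    typedef std::function<void(const void *)> Erased;

    std::map<std::type_index, std::vector<Erased> > mHandlers;
};
```

Iterating over a copy of the handler list is one simple way to let a handler change the registrations, including its own, while a message is being processed.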

Internally, the thread synchronization has been simplified by the removal of input ports.

Perhaps best of all, Theron now includes a makefile and builds out-of-the-box with gcc within, for example, MinGW. A VisualStudio solution is still provided.

The dependency on Windows threads has been resolved by adding support for using Boost Threads as the underlying threading implementation. The makefile build uses Boost Threads by default. Building with Boost Threads requires a built installation of Boost. Native Windows threads are still supported, so that the external dependency on Boost is avoided on Windows machines.

Finally, Theron now comes with a suite of twelve easy-to-follow samples, written from scratch and targeted at learning individual API features one at a time. An online tutorial on the website walks you through the samples. Plus, a new online user guide should help to make learning Theron far less painful than before.

As this is the first release of version two of Theron, it should be treated as a pre-release. Although tested fairly extensively by unit tests and samples, the new code hasn't yet been put to serious use in a large project. Therefore it may not yet be 100% production ready. Having said that it seems pretty stable and I don't know of any issues.

Be sure to get back to me with any bug reports or feedback. I'd love to hear what you build with Theron. I'd also like to thank the various people who have got in touch about previous releases, in particular Scott Gregory, Barak Amar, Geoff Burns, Jeffery Olson, Haimo Zobernig, James Osburn and Deepak Poondi.

Story published 12 June 2010.

Version 2.0

If you've been here before you'll notice that the website has been jigged around a bit. There must be a pretty good reason to get this guy off his ass, you think.

Well, you're right. I've been working on version 2.0 of Theron, which is going to be a complete rewrite and redesign. The new version massively simplifies a bunch of things - removing input and output ports and moving closer in spirit to the original Actor Model - so I hope it'll be a lot easier to figure out and more fun to use.

So a new website seems in order - maybe even one which actually tells you something useful about the library 8) I've been working this weekend on a set of online tutorials, which I think are looking promising.

I'm hoping to release at least an initial version in the next week or two. Subscribe to the RSS feed if you want to be kept informed.

Story published 30 May 2010.

More development

Lately I've been doing a bit of work on Theron in my spare time again. I don't want to get anyone's hopes up, but there may be another release in the pipeline at some point. I'm mainly working to improve the API and resolve some slight weirdnesses that a couple of helpful people have pointed out.

I've seen a couple of references to Theron on the web, and I've had a couple of interesting conversations with people, but I still don't know very much about who (if anyone) is actually using it, and what they like or don't like about it. So if you have anything to report, let me know.

I think the website could also use a serious overhaul. Now if I can just take a couple of weeks off work...

Story published 26 January 2010.

Inviting feedback

It's been a while since I made any changes at all to Theron or the website (theron.ashtonmason.net, in case you've forgotten), so some of you may be wondering what's going on.

The answer, predictably, is not much. After an initial burst of enthusiasm work on Theron has pretty much been on the back-burner. Mainly because nothing is driving the work. I'm not aware of anyone using Theron in earnest and frankly I have little idea what people make of it.

If you've downloaded Theron and found it interesting -- or even if you didn't -- I'd be interested, of course, to hear any and all feedback you might have. Likewise if you have any suggestions or requests, let me know. See the contact page on the site for contact info.

Story published 23 December 2008.

Fixed broken downloads

The source zipfile downloads are now working again.

Story published 10 October 2008.

Downloads are broken

I've managed to break the file downloads during some maintenance. Hopefully I'll have them fixed tomorrow. Apologies for that.

Story published 9 October 2008.

New website

Welcome to the new Theron website. If you've been here before you may notice that much has been removed that was on the old site. This is somewhat temporary: the site has moved to a new server, and in the process it has been redeveloped from scratch. In time I plan to extend and improve it.

The last release of Theron was version 1.02.00 on 14 October 2007. Since then not a lot has happened. Basically I've been working on other things -- some web development and also my real job (games developer at EA). Theron, like all my projects, sees occasional rushes of enthusiasm followed by months of languishing on the proverbial backburner.

As far as I'm aware, the 1.02.00 release is pretty functional and stable. But I can't claim to have used it in earnest much myself -- and I certainly haven't had much feedback either. If you've downloaded Theron (or even if you haven't) and have some thoughts to share then by all means drop me an email.

Story published 18 May 2008.

Version 1.02.00 released

This version adds support for runtime type-checking of message types by actor input and output ports, and improves the API reference documentation. The online documentation has also been updated and now matches the source code. In the process the search function has been fixed and the missing pages added.

Story published 14 October 2007.

Added online documentation

I've added online HTML documentation to the site. You can find it under 'documentation' in the menu. From what I've seen one or two pages are missing, plus in a couple of areas the documentation is more up to date than the code, in that it refers to unreleased code changes. But these things can and will be fixed.

Story published 8 October 2007.

Version 1.01.00 released

This is a relatively minor update that mainly completes actor garbage collection: previously unreferenced actors were only actually destroyed when their owning framework was destructed. Now they are continuously destroyed by a housekeeping thread (which otherwise spends most of its time asleep). See the release notes for more details.

The MessagesPerSecond benchmark that I added in the previous release now measures the message processing speed as around 300k messages per second. Previously I claimed this was 500k -- but even with the earlier code I now can't reproduce that. Strange. There is some variability naturally, but perhaps there's also some larger dependency on what is happening elsewhere in the OS. Or perhaps I was just deluded before :)

Story published 7 October 2007.

Version 1.00.00 released

This version is the first 'proper' release and adds significant new features and improvements.

You can now create multiple frameworks, each with their own independent pool of threads. Actors are now reference counted and automatically garbage collected once they become unreferenced (although note that deletion is currently deferred until destruction of the owning framework).

Memory allocation can be customized and controlled by the provision of custom allocators, including a built-in pool allocator for internal message queues. There have also been significant optimizations, and Theron now processes over half a million messages per second on my PC. Some work has been done to remove dependencies on Visual C things like Microsoft STL, and to reduce the dependency on Win32 threads. Lastly, the samples have been extended and a set of unit tests added.

See the release notes for full details.

Story published 4 October 2007.

Version 00.00.02 released

Like the previous release, this is a minor update. It adds API reference documentation, something that was sorely lacking from the previous releases.

Story published 8 September 2007.