Damian Hickey

Mostly software and .NET related. Mostly.

Testing EventStore storage engines, part 2 - MongoDB

A follower on twitter, @red_square, ran my tests from my previous blog post using MongoDB as a storage engine that gave some interesting event read rates.

I incorporated this into my test:

Compared to the RDBMs, these reads speads are blistering: 7 times faster than SQL Server and over 20 times faster than PostgreSQL. There is a cost to this, and we are comparing apples and oranges here. There is plenty on the web comparing MongoDB and other NoSql storage tech to the traditional so I'm not going to regurgitate them here. 

For my case, I think I'd like to still present the option of using an RDMBS for our customers. I think they are just more comfortable with it. But it does open the door to an interesing optimization strategy. There are cases, such as deploying new projection schema, upgrading an exisiting one, recovering from a crash, or replaying to point-in-time for analysis, where you want a pipeline that can rebuild the projections as quickly as possible. So perhaps using MongoDB as a secondary event store (with lower reliabilty requirements) which gives you a super fast read pipeline could be feasible. This secondary store can be a local async mirror of the primary store that will get you 99% of the required events for a projection store re-build (it may not have 100% of events due to it being an async mirror). The remaining events can the come from the slower primary event store at the end of the rebuild step. If the MongoDB mirror is ever lost, such as in disaster recovery, it can just be re-mirrored. I'd hope this would be a very rare occurance

(Big thanks to Steve Flitcroft for the MongoDB help)

Edit: The specific numbers are not representative of real world numbers. The interesting bit is the relative difference between them, ceteris paribus.

Testing EventStore storage engines

From the project's home page: "The EventStore is a persistence library used to abstract different storage implementations when using event sourcing as storage mechanism." From the same page, you can see it either currently or will in the future, support a myriad of storage engines.

But which one performs the best?

My typical usage scenario is:

  1. Handle commands in parallel, where each command generally results in a single event being appended to a single stream.
  2. Read all / many events when I need to fully / partially rebuild a projection.

I've upload a project to github the test 3 engines. (I've only just hacked this up, so any and all feedback is welcome). The test isn't meant to be very scientific, there are a lot factors to consider. At this point I'm really only interested in getting a feeling for the relative differences between each engine.

Test server details:

  1. Windows 2008 R2 VM running in Virtual Box.
  2. 8GB RAM
  3. 4 CPU @ 4.3Ghz
  4. HD the VDI is on is a 750GB WD SATA3.
  5. SqlServer 2008R2 SP1, PostgreSql(9.1), and MySql(5.2) installed and running as services. Installations were clean and settings were left as default where possible.
  6. Comms between dev machine and server via BridgedAdapter.

Each test is run 3 times.

Results

Running test app on test server:

| DB                 | Run # | Stream Count | Events/Stream | Total Events | Insert Time (s) | Insert Rate (events/s) | Read Rate (events/s) |
| SqlServerPerfTest  | 0     | 100          | 100           | 10000        | 18.9264927      | 528                    | 10955                |
| SqlServerPerfTest  | 1     | 100          | 100           | 10000        | 19.2425916      | 520                    | 10655                |
| SqlServerPerfTest  | 2     | 100          | 100           | 10000        | 20.0126372      | 500                    | 12518                |
| PostgreSqlPerfTest | 0     | 100          | 100           | 10000        | 82.9118079      | 121                    | 4366                 |
| PostgreSqlPerfTest | 1     | 100          | 100           | 10000        | 79.6413285      | 126                    | 4396                 |
| PostgreSqlPerfTest | 2     | 100          | 100           | 10000        | 79.3978265      | 126                    | 4383                 |
| MySqlPerfTest      | 0     | 100          | 100           | 10000        | 33.5853395      | 298                    | 9633                 |
| MySqlPerfTest      | 1     | 100          | 100           | 10000        | 33.4475381      | 299                    | 10340                |
| MySqlPerfTest      | 2     | 100          | 100           | 10000        | 27.4845492      | 364                    | 12302                |

Running the app from dev machine, communicating via BridgedAdapter:

| DB                 | Run # | Stream Count | Events/Stream | Total Events | Insert Time (s) | Insert Rate (events/s) | Read Rate (events/s) |
| SqlServerPerfTest  | 0     | 100          | 100           | 10000        | 14.8824588      | 672                    | 14748                |
| SqlServerPerfTest  | 1     | 100          | 100           | 10000        | 16.6571154      | 600                    | 14805                |
| SqlServerPerfTest  | 2     | 100          | 100           | 10000        | 17.720558       | 564                    | 14749                |
| PostgreSqlPerfTest | 0     | 100          | 100           | 10000        | 69.212226       | 144                    | 4643                 |
| PostgreSqlPerfTest | 1     | 100          | 100           | 10000        | 72.2258403      | 138                    | 4647                 |
| PostgreSqlPerfTest | 2     | 100          | 100           | 10000        | 73.3332329      | 136                    | 4633                 |
| MySqlPerfTest      | 0     | 100          | 100           | 10000        | 34.1016615      | 293                    | 14624                |
| MySqlPerfTest      | 1     | 100          | 100           | 10000        | 34.7092424      | 288                    | 14585                |
| MySqlPerfTest      | 2     | 100          | 100           | 10000        | 34.4671897      | 290                    | 14595                |

Finally, running from laptop over Gb LAN:

| DB                 | Run # | Stream Count | Events/Stream | Total Events | Insert Time (s) | Insert Rate (events/s) | Read Rate (events/s) |
| SqlServerPerfTest  | 0     | 100          | 100           | 10000        | 27.4017972      | 365                    | 4293                 |
| SqlServerPerfTest  | 1     | 100          | 100           | 10000        | 27.5854037      | 363                    | 5282                 |
| SqlServerPerfTest  | 2     | 100          | 100           | 10000        | 30.104196       | 332                    | 5417                 |
| PostgreSqlPerfTest | 0     | 100          | 100           | 10000        | 133.9792362     | 75                     | 1942                 |
| PostgreSqlPerfTest | 1     | 100          | 100           | 10000        | 145.4393013     | 69                     | 2654                 |
| PostgreSqlPerfTest | 2     | 100          | 100           | 10000        | 154.967134      | 65                     | 2442                 |
| MySqlPerfTest      | 0     | 100          | 100           | 10000        | 44.8102286      | 223                    | 4756                 |
| MySqlPerfTest      | 1     | 100          | 100           | 10000        | 44.735568       | 224                    | 4837                 |
| MySqlPerfTest      | 2     | 100          | 100           | 10000        | 45.8796399      | 218                    | 4540                 |

The laptop is no where near as powerful as the desktop. During the exectution of the test, it's CPU was pegging around 90-95 percent which was a limiting factor. I expected that and the network I/O to give lower numbers, which it did. I also expected the numbers to level out a bit as the bottle neck can no longer be completely attributed to be the database. But that didn't happen - PostgreSql was slower by the same factor as when running the test app on the latptop as on the server itself. This leads me to suspect the npgsql data provider is not up to scratch.

Anyway, these are just preliminary numbers, and I'm sure there are lots of holes here. They're nonetheless intesting.

Our package, build and deployment pipeline (NuGet, MyGet, ProGet, TeamCity and OctopusDeploy Heaven)

I don't have this fully baked yet, some of the software is barely V1 and there are still some kinks to work out, but I am finding this relatively easy to manage. Compared to a couple of years ago, when there was virtually nothing like this in the .net space (MSIs do not count).