Damian Hickey

Mostly software and .NET related. Mostly.

RavenDB NuGet Packages: Coarse grained vs fine grained and sub-dependencies

This topic has recently come up on the RavenDB list (1, 2) and Ayende’s blog. I've been down the road of fine-grained packages (internal code) and back again, so this is my opinion based on recent experience. The current position of the RavenDB team is that they want to avoid having tens of NuGet packages.

So, are tens of NuGet packages really a problem and, if so, for whom and how?

The Package Consumer

From the package consumer's side, fine-grained packages allow them to pick and choose precisely what they want without crufting up their project's references with extraneous entries. (Unnecessary references are a pet hate of mine.) A number of OSS projects appear to be doing this successfully, such as Ninject, ServiceStack and NServiceBus.

One of the consumer's concerns, if they do pick two packages where one depends on the other, is package versioning and updating. If they were to pick RavenDB-Client and a (hypothetical) RavenDB-Client.Debug, they would expect that at the precise moment one package is updated, the other is updated too, so that updating a solution is easy. That is, unless the RavenDB team is exercising flawless semantic versioning, which I doubt.

The other concern, regardless of a coarse-grained or fine-grained packaging strategy, is that of package sub-dependencies. Despite the best intentions of authors with semver and specifying package version ranges in nuspec files, this is an area of brittleness, as a recent log4net package update demonstrated. Also, specifying a package dependency because your package uses it internally unfairly makes your consumer dependent on it. Introduce another package that has the same dependency, but perhaps a different version, and they are at risk of runtime exceptions, deployment conflicts and having to perform brittle assembly redirect hacks.

Currently, adding a RavenDB-Client package to a class library adds the following 8 references:

[Image: references added to the class library by the RavenDB-Client package]

… and the following package dependencies:

[Image: package dependencies pulled in by RavenDB-Client]

My fresh class library is now dependent on a specific logging framework, some MVC stuff that has nothing to do with what I am doing, and a Community Technology Preview library that may or may not have redistribution licence concerns. This isn’t a great experience. A brief analysis:

  1. AsyncCtpLibrary’s usage is entirely internal to Raven.Client.Lightweight, so it could be ILMerged and internalized. Example projects that take this approach include AutoMapper and Moq.
  2. Newtonsoft.Json is exposed through Raven.Client.Lightweight’s exported types, so it is a real dependency.
  3. NLog? There is a better way (see the sketch after this list).
  4. Raven.Abstractions appears to contain both client-side and server-side concerns. The client-side ones could be extracted and placed into Raven.Client.Lightweight and referenced by the server-side assemblies. (Perhaps; I don't know enough to say for sure.)
  5. Raven.Client.MvcIntegration and .Debug are entirely optional and could be provided by separate packages, if I wanted them.
  6. System.ComponentModel.Composition could probably be removed if the server side abstractions were not part of the client package.
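
To expand on point 3: one better way (a sketch of the sort of thing I have in mind; the ILog and LogProvider names below are purely illustrative, not anything RavenDB ships) is for the library to expose its own minimal logging abstraction and let the host adapt it to whatever logging framework it already uses:

using System;

// Illustrative only: a minimal logging abstraction a library could ship instead of
// referencing NLog directly. The host application plugs in an adapter at startup.
public interface ILog
{
    void Debug(string message);
    void Error(string message, Exception exception = null);
}

public static class LogProvider
{
    // Defaults to a no-op logger, so the library works even when no logging
    // framework is present.
    public static Func<string, ILog> GetLogger = name => NullLog.Instance;

    private sealed class NullLog : ILog
    {
        public static readonly ILog Instance = new NullLog();
        public void Debug(string message) { }
        public void Error(string message, Exception exception = null) { }
    }
}

A host that uses NLog simply sets LogProvider.GetLogger to a small NLog-backed adapter at startup, and the package itself never has to declare an NLog dependency.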

The end result, in my opinion, should look like this:

[Image: the desired, minimal set of project references]

If the concerns of minimizing sub-package dependencies and lock-stepping of package releases are addressed, then I believe that fine-grained packages are desirable to a package consumer.

The Package Producer

The primary concern on the producer side is one of maintenance. Adding additional fine-grained package specifications to a single solution does have a cost, but I’d argue that it’s worth it when considering benefits to the consumer.

Where things do get difficult fast for the producer, though, is when the fine-grained packages are independently versioned. Previously I said I doubted Raven is doing flawless semantic versioning; I doubt anyone is doing it flawlessly, because there is no tooling available to enforce it and you can’t rely on humans. I’ve tried the automatic CI-based package updating “ripple”, where Solution B, which produces Package B but depends on Package A from Solution A, automatically updates itself when a new version of Package A is produced. It didn’t work reliably at all. If the producer has a lot of fine-grained solutions and they have a lot of momentum, package management quickly becomes a mess and a massive time sink.

But if the package producer is using a single solution (as is the case with RavenDB) and all the fine-grained packages are released concurrently, the cost of supporting fine-grained packages is not prohibitive. This is the approach currently taken by ServiceStack.

Extraneous Project References

It’s an attention-to-detail thing, and for me it’s one of those telling indicators when assessing a code base and those who wrote it.

This is the default set of references when creating a class library in VS2010:

[Image: default references in a new VS2010 class library]

Tell me, have you ever used System.Data.DataSetExtensions? When was the last time you actually used ADO.NET DataSets? 2008? 2005?

Having a reference is explicitly stating “I depend on this and I can’t and won’t work without it.”

When I review a project I haven’t seen before, the very first thing I do is check its references. This instantly gives me an indication of how coupled the application or library is to other assemblies and what potential difficulties and pain there will be in versioning, maintenance and deployment. I also take into account the assembly type. Applications are at the top of the stack, so their output is unlikely to be depended on. For these, I am less concerned about the number of references, but would still be concerned about diamond dependencies from a maintenance perspective.

Framework assemblies, such as ORMs, loggers, data clients, etc., are further down the stack, and I am far more critical of these. Any sort of non-BCL dependency in these will undoubtedly cause pain for the application developer. Your 3rd party ORM has a dependency on a certain version of one logger, but someone else’s data client uses a different version? Then you are in a world of odd runtime exceptions, deploy-time conflicts and assembly redirect hacks.

And forget about relying on accurate semantic versioning. There isn’t the tooling in .NET land to analyse assemblies and ‘break the build’ if there has been a breaking change without a corresponding bump in the version number. You have to rely on the authors of your 3rd party assemblies to be extra vigilant. Good luck with that.

In the end though, if there is a reference there and it’s not even being used, well, that is just sheer laziness.

SystemClock

I work on a financial application where operations and calculations are often time-sensitive, which makes for tricky unit and acceptance tests. As a result, I never use DateTime.UtcNow (or DateTime.Now) anywhere in my code; instead, I use Ayende's SystemTime abstraction. For this abstraction to really work, all components and systems in your application need to support it. Two core components in my application, RavenDB and EventStore, have recently been updated to include this abstraction (thanks for accepting the pull requests!) and a third, Quartz.NET, already has it built in. I have an outstanding pull request for NLog to support it, but in the meantime I am using a custom build.

If you are a library or component developer and you use DateTime.UtcNow anywhere, I strongly recommend that you add support for SystemTime.
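
For those who haven't seen it, the abstraction is tiny. A minimal sketch of the idea (per Ayende's post; the exact member names in each library vary) looks like this:

using System;

public static class SystemTime
{
    // Production code calls SystemTime.UtcNow() wherever it would otherwise call
    // DateTime.UtcNow; tests (or the SystemClock below) swap this delegate out.
    public static Func<DateTime> UtcNow = () => DateTime.UtcNow;
}

The library calls the delegate; the host or the test decides what time it is.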

Now that I have several SystemTime classes sitting in different libraries and frameworks, I'd like one place to set them all. While one can set an exact, specific time via SystemTime.UtcNow, I usually like the time to move forward as usual; it's not desirable for all of my log messages to have the same timestamp, for instance.

SystemClock

This is the one place to get the current time. It moves forward as normal and is also optimized with a tick-count cache (an idea I pinched from NLog):

using System;

public class SystemClock
{
    // When set, the clock reports this time plus however much real time has
    // elapsed since it was set.
    private DateTime? _systemUtcTime;
    private DateTime _start;

    // Cache the last computed value per Environment.TickCount so we don't hit
    // DateTime.UtcNow on every call (the optimization pinched from NLog).
    private int _lastTicks = -1;
    private DateTime _lastUtcDateTime = DateTime.MinValue;

    public DateTime UtcNow
    {
        get
        {
            int tickCount = Environment.TickCount;
            if (tickCount == _lastTicks)
            {
                return _lastUtcDateTime;
            }
            if (_systemUtcTime == null)
            {
                // No override set: report the real clock.
                _lastUtcDateTime = DateTime.UtcNow;
            }
            else
            {
                // Override set: report the override plus elapsed real time,
                // so time keeps moving forward.
                var progressed = DateTime.UtcNow - _start;
                _lastUtcDateTime = _systemUtcTime.Value + progressed;
            }
            _lastTicks = tickCount;
            return _lastUtcDateTime;
        }
    }

    public void SetSystemUtcTime(DateTime systemUtcTime)
    {
        _start = DateTime.UtcNow;
        _systemUtcTime = systemUtcTime;
        _lastTicks = -1; // invalidate the cached value
    }
}

From a container perspective, SystemClock's activation scope is singleton. Now all you do is point your libraries' SystemTime classes, and your own, at this clock at startup:

SystemClock systemClock = container.Resolve<SystemClock>();

// Exact member names vary per library, but the pattern is the same:
// point each SystemTime delegate at this clock.
SystemTime.UtcNow = () => systemClock.UtcNow;
EventStore.SystemTime.UtcNow = () => systemClock.UtcNow;
Raven.Abstractions.SystemTime.UtcNow = () => systemClock.UtcNow;
Quartz.SystemTime.UtcNow = () => systemClock.UtcNow;
NLog.SystemTime.UtcNow = () => systemClock.UtcNow;
...
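
The singleton registration itself is a one-liner in most containers. With Castle Windsor, for example (I'm not assuming any particular container here; adjust for whichever one you use), it looks like this:

using Castle.MicroKernel.Registration;
using Castle.Windsor;

var container = new WindsorContainer();
// Singleton is Windsor's default lifestyle anyway, but being explicit does no harm.
container.Register(Component.For<SystemClock>().LifestyleSingleton());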

Then set the SystemClock's time and you're all sync'd up:

systemClock.SetSystemUtcTime(new DateTime(1999, 12, 31, 23, 59, 59));
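
As a quick sketch of the behaviour this gives you, time keeps moving forward from whatever point you set, rather than standing still:

var clock = new SystemClock();
clock.SetSystemUtcTime(new DateTime(1999, 12, 31, 23, 59, 59, DateTimeKind.Utc));

var before = clock.UtcNow;               // just before midnight, 1999
System.Threading.Thread.Sleep(50);
var after = clock.UtcNow;                // roughly 50ms later, still anchored to the fake start time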

With this, I hope to be able to do some more interesting things, such as accelerating time and jumping into the future, which will allow me to simulate long-running acceptance tests in shorter timeframes. More on that another time.

Further reading:

Kick the DateTime.Now addiction

SystemTime versus ISystemClock – dependencies revisited

Update: The irony is not lost on me that to effectively test this class, one must abstract over Environment.TickCount :)

Extension Methods, Guard Clauses and Code Analysis

Code Analysis, when encountering this:

public static class FooExtensions
{
    public static void Bar(this Foo instance)
    {
        Guard.Against(instance != null, () => new ArgumentNullException("instance"));
    }
}

...results in a warning:

CA2208 : Microsoft.Usage : Method 'FooExtensions.Bar(this Foo)' passes 'instance' as the 'paramName' argument to a 'ArgumentNullException' constructor. Replace this argument with one of the method's parameter names. Note that the provided parameter name should have the exact casing as declared on the method.

The problem here is that the 'this Foo instance' parameter isn't a parameter in the usual sense, so I can't use ArgumentNullException or ArgumentException.

Perhaps NullReferenceException then? Nope...

CA2201 : Microsoft.Usage : 'FooExtensions.Bar(this Foo)' creates an exception of type 'NullReferenceException', an exception type that is reserved by the runtime and should never be raised by managed code. If this exception instance might be thrown, use a different exception type.

So I've settled on using InvalidOperationException.


public static class FooExtensions
{
    public static void Bar(this Foo instance)
    {
        Guard.Against(instance != null, () => new InvalidOperationException("instance"));
    }
}