Setting Index Options for IDictionary in RavenDB

This post has been updated to work with the latest stable build of RavenDB (Build 573).

In my opinion, RavenDB is the best NoSQL option for .NET applications. Some time ago, I recommended it to one of my clients and they are planning to use it in a major greenfield project involving the re-architecture of their customer and administrative web applications. They are currently working on a series of technical spikes/proofs of concept to better understand technical risks, put together budgets and demonstrate key application capabilities to the project stakeholders. As part of their effort, I’ve been working with them to put together some demonstrations around search and how it can be extended into driving the configuration of custom landing pages for various marketing campaigns.

One of the biggest issues they face is that their products have different configurable options and properties. These are typically contained in an IDictionary<string, string>.  Users need to be able to search on the various properties.  For example, a user might want to find all products that have a property named “color” with a value of “red”.   To make matters more interesting, many of the property values have synonyms that must be usable in search too.  For example, they might need a search on color=maroon to match products where color is red.   RavenDB can do this, but there are some implementation details that are not well documented. This article outlines the solution that worked for us.

The first part of the solution is fairly well documented in the RavenDB Google Group. Take, for example, my colleague’s original post. Given an object MyDocument with a property of type IDictionary<string, string> named AttributeValues, you create an index entry for each name/value pair as follows:

public class MyDocument_ByProperty : AbstractIndexCreationTask<MyDocument>
{
    public MyDocument_ByProperty()
    {
        Map = docs => from doc in docs
                      select new
                      {
                          // One dynamic index field per name/value pair.
                          _ = from prop in doc.AttributeValues
                              select new Field(prop.Key, prop.Value, Field.Store.NO, Field.Index.ANALYZED)
                      };
    }
}

Although this gave us the ability to search by key/value, it does not handle the synonym requirement. For our proof of concept, we adapted the synonym analyzer described on Code Project. Since RavenDB provides a way to set the analyzer for a field, it should have been easy to configure it to use our synonym analyzer for the various name/values. Unfortunately, the method shown in the documented examples and discussed in the group only allows you to set an analyzer using LINQ. Since the fields in this index are the result of a projection, we could not use it to set the analyzer for the projected fields.
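To make the synonym requirement concrete, here is a minimal sketch in plain C# of what a synonym-aware analyzer does conceptually at index time: each value is indexed together with its synonyms, so a query for any of them finds the document. This is not the Code Project analyzer and it uses no Lucene or RavenDB API. SynonymExpander and its synonym table are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: expands a term into itself plus its configured synonyms.
public static class SynonymExpander
{
    // Illustrative synonym table; a real system would load this from configuration.
    private static readonly Dictionary<string, string[]> Synonyms =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "red", new[] { "maroon" } }
        };

    // Returns the lower-cased term followed by any synonyms registered for it.
    public static IEnumerable<string> Expand(string term)
    {
        yield return term.ToLowerInvariant();

        string[] extra;
        if (Synonyms.TryGetValue(term, out extra))
        {
            foreach (var synonym in extra)
                yield return synonym;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Index time: the stored value "Red" also produces the token "maroon".
        var indexedTokens = new HashSet<string>(SynonymExpander.Expand("Red"));

        // Query time: a search for color=maroon now matches the "red" product.
        Console.WriteLine(indexedTokens.Contains("maroon")); // prints "True"
    }
}
```

Expanding at index time keeps queries simple, at the cost of re-indexing whenever the synonym table changes.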

Based on Ayende’s suggestion in the post referenced above, I took a look at the RavenDB source thinking I needed to create a plugin or some other extension to make this possible. As it turned out, the capability was already present. All we had to do was override another method of the AbstractIndexCreationTask as follows:

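public class MyDocument_ByProperty : AbstractIndexCreationTask<MyDocument>
{
    public MyDocument_ByProperty()
    {
        Map = docs => from doc in docs
                      select new
                      {
                          // One dynamic index field per name/value pair.
                          _ = from prop in doc.AttributeValues
                              select new Field(prop.Key, prop.Value, Field.Store.NO, Field.Index.ANALYZED)
                      };
    }

    public override IndexDefinition CreateIndexDefinition()
    {
        // Build the definition once, then register the analyzer for each field.
        var indexDefinition = base.CreateIndexDefinition();

        // propertyNames is not defined in this listing; it is assumed to be a
        // collection of the attribute names that need synonym matching.
        foreach (var propertyName in propertyNames)
        {
            indexDefinition.Analyzers.Add(propertyName,
                "Eleanor.Analyzers.SynonymAnalyzer, Eleanor.Analyzers");
        }

        return indexDefinition;
    }
}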

This illustrates yet another reason why I am a big advocate of dual-source licensing for commercial programming libraries and tools. The availability of RavenDB source code made it possible for us to get the most out of the product. It also means that as long as the project goes forward we will buy some RavenDB licenses. That’s a win-win outcome especially when you consider that without source we may have been forced to go in another direction, which would have meant the loss of licensing revenue for the developers of RavenDB.

Blogging on JDF Tools and Techniques at the JDF Blog

My passion is building systems that tie together supply chains. For the last several years, I have focused my efforts on the commercial printing industry and the industry’s integration standard, JDF. As my company gets closer to releasing FluentJDF, an open source JDF library for .NET, I will be posting on JDF tools and techniques at the JDF Blog. I will continue to post here on general programming and entrepreneurship.

NServiceBus Fluent Interface is Not All That Fluent

I am only getting started with NServiceBus after having used Rhino ESB for some time.  Overall, I’m liking the functionality.  However, at least in the 2.5 release, configuration is a little sensitive and often doesn’t provide any useful information when things go wrong.  Take, for example, the following configuration for a web application:

NServiceBus.Configure.WithWeb()
    .XmlSerializer()
    .Log4Net()
    .CastleWindsorBuilder()
    .MsmqTransport()
        .IsTransactional(false)
        .PurgeOnStartup(false)
    .UnicastBus()
    .ImpersonateSender(false)
    .CreateBus()
    .Start();

When you put this in your Application_Start method, it throws a null reference exception in the NServiceBus configuration routine. As it turns out, the code was supposed to look like this instead:

NServiceBus.Configure.WithWeb()
    .Log4Net()
    .CastleWindsorBuilder()
    .XmlSerializer()
    .MsmqTransport()
        .IsTransactional(false)
        .PurgeOnStartup(false)
    .UnicastBus()
    .ImpersonateSender(false)
    .CreateBus()
    .Start();

Did you spot the difference?  The issue is you can’t tell it which serializer to use until after you tell it how to configure the container.  Seems kind of fragile if you ask me.  This certainly doesn’t make NServiceBus a bad library, but it does make it quite a bit harder to get started.   Anyway, thanks to this being open source I was able to debug into the offending routine and figure out what was going wrong.
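As an illustration of the kind of ordering dependency described above, here is a small sketch of a fluent configuration class where one step relies on state set by an earlier one. It is not NServiceBus code: BusConfig, UseContainer and UseXmlSerializer are hypothetical names. The guard clause shows one way such an API could report the missing step instead of failing with a null reference.

```csharp
using System;

// Hypothetical fluent configuration; NOT the NServiceBus implementation.
public class BusConfig
{
    private object container; // assigned by UseContainer()

    public BusConfig UseContainer()
    {
        container = new object();
        return this;
    }

    public BusConfig UseXmlSerializer()
    {
        // Fail fast with a message that names the missing step, rather than
        // letting a null reference surface somewhere deeper in the setup.
        if (container == null)
        {
            throw new InvalidOperationException(
                "Call UseContainer() before UseXmlSerializer().");
        }

        return this;
    }
}

public static class Program
{
    public static void Main()
    {
        try
        {
            // Wrong order: the serializer is configured before the container.
            new BusConfig().UseXmlSerializer().UseContainer();
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

With a check like this, the wrong ordering would still fail, but the error would say how to fix it.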

Impressions of Fitnesse With .NET for Acceptance Testing

After using FitNesse for the last several months I can say the following:

  • The documentation is quite limited, especially when it comes to working with .NET.  Lots of trial and error involved for any novice.
  • How lucky am I to have Mike Stockdale, the principal developer of FitSharp, working on the project to show the team a variety of useful tricks?  I don’t think we would have been successful with FitNesse without him.
  • Technical product owners are able to write and troubleshoot their own tests using the wiki once the right test fixtures are in place.  Very nice.
  • Our tests generate lots of XML that the product owners review from time to time, so I decided to add syntax highlighting via Google’s prettifier JavaScript.  FitNesse uses Velocity templates so it should have been easy to do.  Although I was able to get syntax highlighting working on the test history page, Velocity is not used to generate the live test results so I couldn’t get it working there.  Bummer.  Have to find time to contribute a fix given that the Velocity feature is no longer in active development.
  • Integrating FitNesse with TeamCity is easy as long as you don’t care about integrating the test counts.  Wrote a little MSBuild step that takes care of this.  Note to self: document and release as open source to help others.
  • Integrating FitNesse with TeamCity’s built-in code coverage has proved impossible thanks to the tests running under Java.  Oh well.
  • Database setup and FitNesse add substantial overhead to the acceptance test suite, so it takes several minutes to run.  Our extensive unit test suite remains fast, partially because integration/acceptance tests run under FitNesse, so this is not a big deal.
  • I have looked at alternatives like SpecFlow but remain convinced that FitNesse is about the only automated acceptance testing tool that is approachable for non-programmers.  For example, although  most product owners can write Gherkin specs for SpecFlow,  I don’t think they could easily run and troubleshoot tests like they can with FitNesse.  Therefore, I will continue to use FitNesse for acceptance testing on future projects.

Put Your Apps on the TopShelf

Many of my projects end up using a Windows service or three to host background processes.  Over the years, I’ve developed a common-sense strategy: I set up a server class that contains the functionality and exposes start and stop methods.  I then create minimal command-line and Windows service hosts to instantiate the server class and call start and stop when appropriate.  This gives me a command-line server that can be conveniently started from the debugger and a Windows service application for use in the production environment.  Of course, this also means using InstallUtil when it comes time to install the service.
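The server-class strategy described above can be sketched roughly as follows. MyServer and ConsoleHost are hypothetical names, and the bodies of Start and Stop are placeholders for whatever background work the service performs.

```csharp
using System;

// Hypothetical server class holding the background-processing logic.
public class MyServer
{
    public bool IsRunning { get; private set; }

    public void Start()
    {
        // Start timers, queue listeners, worker threads, etc. here.
        IsRunning = true;
    }

    public void Stop()
    {
        // Signal workers to finish and release resources here.
        IsRunning = false;
    }
}

// Minimal command-line host: convenient to launch from the debugger.
public static class ConsoleHost
{
    public static void Main()
    {
        var server = new MyServer();
        server.Start();

        Console.WriteLine("Server started. Press Enter to stop.");
        Console.ReadLine();

        server.Stop();
    }
}
```

The Windows service host would be an equally thin wrapper: a class derived from System.ServiceProcess.ServiceBase that calls the same Start and Stop methods from its OnStart and OnStop overrides.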

Today I stumbled across a much nicer solution in the open source TopShelf project.  It lets me build a console application using about ten lines of code that hosts my server for development and provides a command-line to install as a Windows service so InstallUtil is not required.   Highly recommended!

Open Source and Unreasonable Expectations

I like Sharp Architecture. Anybody who’s glanced at this blog must have spotted this by now. I flirted with the .NET Entity Framework (too much hoo ha in .NET 3.5 and .NET 4.0 not ready yet) and Telerik’s ORM (not used widely enough for my tastes) before settling on NHibernate. Once I did that, it was pretty easy to decide on using Sharp Architecture since it provided a more or less complete architectural framework built around NHibernate. I started out with pre 1.0 and upgraded to 1.0 when it became available.

I don’t agree with every choice Sharp Architecture made. For example, I prefer the Spark view engine over the standard .NET ASPX engine. I also rather like the xVal client-side validation library. None of this was a problem since I had the source. I modified the original code generation templates to generate Spark views instead of ASPX views. While I was at it, I had them generate the little bit of xVal stuff needed and also gave them the ability to read object properties from my existing database tables. I also had to work through issues related to the new file handling mechanisms in the latest version of the T4 template engine.

Most recently, I wanted to upgrade to the latest version of Fluent NHibernate because it fixed some annoying bugs. I checked with the SharpArch group and discovered it wouldn’t be upgraded for a while yet. However, someone was kind enough to offer a set of steps for performing the upgrade. Perfect, I thought, and started to work through them. Turned out there were some breaking changes in Fluent that were not mentioned in the steps. No big deal. They got me 90% of the way there. I worked through the rest of the issues one by one and within a couple of hours I had everything working. Thanks again SharpArch community!

The next morning someone posted a question about upgrading to Fluent that I actually knew how to answer. I started to draft up a reply but quickly realized there was too much detail to stick in a post to the group. Instead, I made a long blog post and sent in a reply with a link to the blog. A few questions later I decided to post the upgraded SharpArch binaries as well. Before I knew it, I had spent more time helping others than I had spent upgrading my solution in the first place. Again, no problem. It was the least I could do to start to pay back all the thousands of hours of work that had been contributed by others to make SharpArch possible in the first place.

So what does this experience teach? Well, open source is a community effort. Don’t expect the community to jump in and solve your problems if you are unwilling or unable to take on some of the necessary work. Usually, it took a whole lot more effort than you realize to get the library to its present state and community members will not always have the time or inclination to immediately do what you want them to do. If you are willing to roll up your sleeves, open source does give you the ability to do what you want when you want it since you will always have access to the code. This is rarely the case with commercial products. Finally, when you do add some capability, contribute it back to the community to help make the library better for all.

An Open Source Project to Watch

If you use .NET open source libraries in your development, you need to take a look at hornget – The Horn Package Management Project. Its initial goal is to provide a way to download and build popular .NET open source libraries, like Sharp Architecture and NHibernate, automatically resolving all necessary dependencies. The project is in very early stages and very rough around the edges. However, when you consider the couple of hours it takes to do something like upgrading Sharp Architecture to Fluent NHibernate 1.0 (as I did recently), you can see why something like hornget would be quite useful. I certainly intend to keep my eye on this one and, if time permits, at least contribute some testing effort.
