Setting Index Options for IDictionary in RavenDB

This post has been updated to work with the latest stable build of RavenDB (Build 573).

In my opinion, RavenDB is the best NoSQL option for .NET applications. Some time ago, I recommended it to one of my clients and they are planning to use it in a major greenfield project involving the re-architecture of their customer and administrative web applications. They are currently working on a series of technical spikes/proofs of concept to better understand technical risks, put together budgets and demonstrate key application capabilities to the project stakeholders. As part of their effort, I’ve been working with them to put together some demonstrations around search and how it can be extended into driving the configuration of custom landing pages for various marketing campaigns.

One of the biggest issues they face is that their products have different configurable options and properties. These are typically contained in an IDictionary<string, string>. Users need to be able to search on the various properties. For example, a user might want to find all products that have a property named “color” with a value of “red”. To make matters more interesting, many of the property values have synonyms that must be usable in search too. For example, they might need a search on color=maroon to match products where color is red. RavenDB can do this, but there are some implementation details that are not well documented. This article outlines the solution that worked for us.

The first part of the solution is fairly well documented in the RavenDB Google Group. Take, for example, my colleague’s original post. Given an object MyDocument with a property of type IDictionary<string, string> named AttributeValues, you create an index entry for each name/value pair as follows:

public class MyDocument_ByProperty : AbstractIndexCreationTask<MyDocument>
{
    public MyDocument_ByProperty()
    {
        // Emit one Lucene field per key/value pair, named after the key.
        Map = docs => from doc in docs
                      select new
                      {
                          _ = from prop in doc.AttributeValues
                              select new Field(prop.Key, prop.Value, Field.Store.NO, Field.Index.ANALYZED)
                      };
    }
}

Although this gave us the ability to search by key/value, it does not handle the synonym requirement. For our proof of concept, we adapted the synonym analyzer described on Code Project. Since RavenDB provides a way to set the analyzer for a field, it should have been easy to configure it to use our synonym analyzer for the various name/value pairs. Unfortunately, the method shown in the documented examples and discussed in the group only allows you to set an analyzer using LINQ. Since the fields in this index are the result of a projection, we could not use it to set the analyzer for the projected fields.
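To make the role of the analyzer concrete, here is a minimal sketch in plain C# of what index-time synonym expansion amounts to. It is not the Code Project analyzer and it uses neither Lucene nor RavenDB; the class SynonymIndexSketch, the synonym table and the "field:term" key format are all hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SynonymIndexSketch
{
    // Hypothetical synonym table: stored value -> extra terms to index.
    static readonly Dictionary<string, string[]> Synonyms =
        new Dictionary<string, string[]>(StringComparer.OrdinalIgnoreCase)
        {
            { "red", new[] { "maroon", "crimson" } }
        };

    // Emits the lower-cased value followed by any synonyms registered for it.
    public static IEnumerable<string> Expand(string value)
    {
        var term = value.ToLowerInvariant();
        yield return term;

        string[] extra;
        if (Synonyms.TryGetValue(term, out extra))
        {
            foreach (var synonym in extra)
                yield return synonym;
        }
    }

    // Builds a toy inverted index: "field:term" -> ids of matching documents.
    // Each attribute contributes one entry per expanded term.
    public static Dictionary<string, HashSet<string>> BuildIndex(
        IDictionary<string, IDictionary<string, string>> docs)
    {
        var index = new Dictionary<string, HashSet<string>>();
        foreach (var doc in docs)
        {
            foreach (var attr in doc.Value)
            {
                foreach (var term in Expand(attr.Value))
                {
                    var key = attr.Key + ":" + term;
                    HashSet<string> ids;
                    if (!index.TryGetValue(key, out ids))
                        index[key] = ids = new HashSet<string>();
                    ids.Add(doc.Key);
                }
            }
        }
        return index;
    }
}
```

With a document whose attributes contain color=red, the toy index ends up with entries for color:red, color:maroon and color:crimson, so a lookup on color:maroon finds that document. A real analyzer does the equivalent work on the token stream of each field, which is why it has to be attached to the projected fields by name.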

Based on Ayende’s suggestion in the post referenced above, I took a look at the RavenDB source thinking I needed to create a plugin or some other extension to make this possible. As it turned out, the capability was already present. All we had to do was override another method of the AbstractIndexCreationTask as follows:

public class MyDocument_ByProperty : AbstractIndexCreationTask<MyDocument>
{
    public MyDocument_ByProperty()
    {
        Map = docs => from doc in docs
                      select new
                      {
                          _ = from prop in doc.AttributeValues
                              select new Field(prop.Key, prop.Value, Field.Store.NO, Field.Index.ANALYZED)
                      };
    }

    public override IndexDefinition CreateIndexDefinition()
    {
        // Build the definition once, then register the analyzer per field.
        var indexDefinition = base.CreateIndexDefinition();

        // propertyNames: the attribute keys that need synonym matching
        // (e.g. "color"); it is not defined in this snippet.
        foreach (var propertyName in propertyNames)
        {
            indexDefinition.Analyzers.Add(propertyName,
                "Eleanor.Analyzers.SynonymAnalyzer, Eleanor.Analyzers");
        }

        return indexDefinition;
    }
}

This illustrates yet another reason why I am a big advocate of dual-source licensing for commercial programming libraries and tools. The availability of the RavenDB source code made it possible for us to get the most out of the product. It also means that, as long as the project goes forward, we will buy some RavenDB licenses. That’s a win-win outcome, especially when you consider that without the source we might have been forced to go in another direction, which would have meant the loss of licensing revenue for the developers of RavenDB.
