
Examine with Azure Directory (Blob Storage)

Shannon Deminick edited this page Jul 21, 2020 · 11 revisions

*** This is legacy documentation *** Here are the links to the current docs:

Tip: The source code contains many unit tests that can be used as examples of how to do things. There is also a test web project with plenty of examples of how to configure indexes and search them.


Overview

(Examine 1.0.0 is not yet supported)

The purpose of this library is to support load balancing in Azure.

Working with Lucene on Azure means hosting your Lucene files on the fast local drive (%temp%), because Lucene doesn't work well when it reads from a remote network drive, which is how storage works on Azure Web Apps. Examine has a few options to support working on the fast local drive, such as SyncTempEnvDirectoryFactory. When using Lucene on Azure, it would be ideal to have the 'master' (write-only) index stored in Blob Storage.

Azure can move your site to a new server at any time. This is why SyncTempEnvDirectoryFactory exists: the 'master' index is stored in App_Data, while the local read index is lazily built from the files it needs from the 'master'. Without this syncing, the index wouldn't be there at all when Azure moves your site, and it would need to be rebuilt on startup. Rather than living on the remote file share in App_Data, the master index could be stored in Blob Storage. The same index could then be shared between multiple Web Apps, which makes load balancing with Lucene indexes much easier. When scaling out, a new worker would just lazily build its local index from the Blob Storage 'master' index.
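For comparison, this is roughly what the existing App_Data-based syncing setup looks like in an index provider definition. It is a minimal sketch: the fully qualified type name is assumed from Examine 0.1.x releases and should be checked against your installed version, and every other attribute a real index definition needs is omitted.

```xml
<!-- Sketch only: other attributes of the index definition are omitted. -->
<!-- Assumption: type name as shipped in Examine 0.1.x; verify against your version. -->
<add name="InternalIndexer"
     directoryFactory="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory, Examine" />
```

With this factory the 'master' index stays in App_Data. The Blob Storage approach described above swaps in a different directoryFactory but keeps the same master/local split.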

For this to work, however, only a single worker can ever write to the indexes. In Umbraco load balancing, this is achieved by designating a single, non-scaled Web App as the master CMS server; all other Web Apps serve front-end requests only.

A package called AzureDirectory already exists, but it is built only for Lucene 3.x, not Lucene 2.9. I have contributed to the original project and have also found some bugs in it. Examine's version of AzureDirectory is a port of the original code, brought up to date with various bug fixes and built against Lucene 2.9. Examine's version also implements only simple file directory locking rather than a native FS file lock, which the original AzureDirectory attempts to implement using Blob leases. Simple file directory locking works fine (at least with Umbraco) because it is guaranteed that only a single process ever writes to the index at one time.

Installation

The NuGet package can be found at https://www.nuget.org/packages/Examine.AzureDirectory

 Install-Package Examine.AzureDirectory

To activate it, add these settings to the appSettings section of your web.config:

<add key="examine:AzureStorageConnString" value="YOUR-STORAGE-CONNECTION-STRING" />
<add key="examine:AzureStorageContainer" value="YOUR-CONTAINER-NAME" />
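As a sketch of how these two keys might look in context, the fragment below places them inside appSettings. The connection string follows the general Azure Storage account format. The account name, key, and container name are placeholders, not values from this project.

```xml
<appSettings>
  <!-- Placeholder values: substitute your own storage account name and key. -->
  <add key="examine:AzureStorageConnString"
       value="DefaultEndpointsProtocol=https;AccountName=YOUR-ACCOUNT-NAME;AccountKey=YOUR-ACCOUNT-KEY;EndpointSuffix=core.windows.net" />
  <!-- Hypothetical container name, used for illustration only. -->
  <add key="examine:AzureStorageContainer" value="examine-indexes" />
</appSettings>
```

Since the master and front-end servers share one index, they would presumably all point at the same storage account and container.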

On your master server, this directoryFactory attribute needs to be added to each of your indexes in the ExamineIndexProviders section:

directoryFactory="Examine.AzureDirectory.AzureDirectoryFactory, Examine.AzureDirectory"

On your front-end/readonly/slave servers, this directoryFactory attribute needs to be added to each of your indexes in the ExamineIndexProviders section:

directoryFactory="Examine.AzureDirectory.ReadOnlyAzureDirectoryFactory, Examine.AzureDirectory"

For example:

<add name="InternalIndexer" directoryFactory="Examine.AzureDirectory.AzureDirectoryFactory, Examine.AzureDirectory"/>
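The matching entry on a front-end (read-only) server could look like the sketch below. The ExamineIndexProviders/providers wrapper reflects the usual Umbraco ExamineSettings.config layout and is an assumption here. As in the example above, all other attributes of the index definition are omitted.

```xml
<ExamineIndexProviders>
  <providers>
    <!-- Front-end server: reads the shared index and never writes to it. -->
    <!-- Sketch only: other index attributes are omitted. -->
    <add name="InternalIndexer"
         directoryFactory="Examine.AzureDirectory.ReadOnlyAzureDirectoryFactory, Examine.AzureDirectory" />
  </providers>
</ExamineIndexProviders>
```

The only difference between the two server roles is the factory type: AzureDirectoryFactory on the single master, ReadOnlyAzureDirectoryFactory everywhere else.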