Build a Sitecore Module With External Big Data through Solr

Sitecore stores and manages content as Sitecore items, which are persisted in a relational SQL Server database. Content managed in Sitecore is normally best created as Sitecore items. But in today’s enterprise ecosystem, a lot of external data cannot reasonably be created in Sitecore because:

  • Existing applications already manage the external data, so there is no need to build a save function in Sitecore. An example is a PIM (Product Information/Inventory Management) system.
  • The external data is huge, and an ETL process might take a long time to sync it to Sitecore. Examples are product inventory by SKU in retail (Home Depot sells 300K+, BestBuy sells 725K+) and vacation packages in a travel booking system (Sunwing sells 10M+).
  • The external data changes very frequently. Examples are dynamic prices for flight tickets and hotel rooms, and real-time merchandise inventory.

So the right approach is to:

  • Integrate with external data, no ETL to Sitecore
  • Conduct real time search against external data through Solr

Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene. Installing Solr is easy; see Learn more about Solr. Here is an example with the key points for creating a core in Solr and configuring its index against a table in SQL Server.


  • solrconfig:
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" /> 
    
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="sqljdbc42\.jar" />
  • managed-schema:
    <field name="ProductID" type="long" indexed="true" required="false" stored="true"/>
    
    <field name="Name" type="string" docValues="true" indexed="true" stored="true"/>
    
    <field name="ProductNumber" type="string" docValues="true" indexed="true" stored="true"/>
  • db-data-config:
    <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://localhost:1433;databaseName=Northwind" user="sa" password="xxxxxx" />
    
    <entity name="item"  query="select * from Products"  deltaQuery="select ProductID from Products where updateDate > '${dataimporter.last_index_time}'"  deltaImportQuery="" deletedPkQuery="" transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer"> </entity>
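
The fragments above omit two pieces that a working setup usually needs: registering the DataImportHandler in solrconfig, and the surrounding `dataConfig`/`document` elements in db-data-config. The following is a minimal sketch, not the original configuration. The file name `db-data-config.xml`, the `pk` attribute, and the `deltaImportQuery` text are assumptions for illustration; `deletedPkQuery` is left out because it depends on how deletions are tracked in the source database.

```xml
<!-- solrconfig.xml: register the DataImportHandler and point it at the data config.
     The file name db-data-config.xml is an assumption. -->
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

<!-- db-data-config.xml: the dataSource and entity sit inside dataConfig/document. -->
<dataConfig>
  <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost:1433;databaseName=Northwind"
              user="sa" password="xxxxxx" />
  <document>
    <!-- pk and deltaImportQuery are illustrative assumptions:
         deltaQuery finds the changed keys, deltaImportQuery reloads each changed row. -->
    <entity name="item" pk="ProductID"
            query="select * from Products"
            deltaQuery="select ProductID from Products where updateDate > '${dataimporter.last_index_time}'"
            deltaImportQuery="select * from Products where ProductID = '${dataimporter.delta.ProductID}'">
    </entity>
  </document>
</dataConfig>
```

With a handler registered this way, an import is typically triggered by requesting the handler with `command=full-import` or `command=delta-import`, for example `/dataimport?command=full-import` under the core's URL.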

SolrNet is an Apache Solr client for .NET that interacts with the Solr search engine. Sitecore officially supports Lucene for native indexing and Solr for distributed indexing. Since Sitecore 8.2, SolrNet has been included in the installation package, which means SolrNet has been fully tested by Sitecore. Programming with SolrNet is easy. Here is an example with steps:

  • Initialize:
    Startup.Init<Bike>("http://localhost:8983/solr/Bike");
    
    ISolrOperations<Bike> solr = ServiceLocator.Current.GetInstance<ISolrOperations<Bike>>();
  • Query in SolrNet:
    var qSize = new SolrQueryByRange<int>("Size", 40, 52);
    
    var qSubCat = new SolrQueryByField("ProductSubcategory", "Mountain Bikes");
    
    var qColor = new SolrQueryInList("Color", "Silver", "Black");
    
    SolrQueryResults<Bike> bikes = solr.Query(qSize && qSubCat && qColor);
  • Query converted to the Solr REST API:
    http://localhost:8983/solr/Bike/select?fq=Color:((Silver)%20OR%20(Black))&fq=ProductSubcategory:%20%22Mountain%20Bikes%22&fq=Size:[40%20TO%2052]&indent=on&q=*:*&wt=json
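
The SolrNet snippets above rely on a `Bike` class mapped to the index fields, which is not shown. Below is a minimal sketch of what such a class might look like. The property list, the property types, and the choice of `ProductID` as the unique key are assumptions based on the field names used above. The sketch also shows one way to send the three conditions as filter queries (`fq`), as in the REST URL above, using `QueryOptions.FilterQueries`. The `BikeSearch` class and its `Search` method are hypothetical names.

```csharp
using System;
using Microsoft.Practices.ServiceLocation; // CommonServiceLocator in newer SolrNet versions
using SolrNet;
using SolrNet.Attributes;
using SolrNet.Commands.Parameters;

// Maps Solr fields to properties. Field names follow the examples above;
// the types and the unique key are assumptions.
public class Bike
{
    [SolrUniqueKey("ProductID")]
    public long ProductID { get; set; }

    [SolrField("Name")]
    public string Name { get; set; }

    [SolrField("ProductNumber")]
    public string ProductNumber { get; set; }

    [SolrField("ProductSubcategory")]
    public string ProductSubcategory { get; set; }

    [SolrField("Color")]
    public string Color { get; set; }

    [SolrField("Size")]
    public int? Size { get; set; }
}

public static class BikeSearch
{
    // Hypothetical helper: runs the three conditions as filter queries against *:*.
    public static SolrQueryResults<Bike> Search()
    {
        // Startup.Init must be called only once per document type and process.
        Startup.Init<Bike>("http://localhost:8983/solr/Bike");
        var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Bike>>();

        var options = new QueryOptions
        {
            // Each entry is sent as its own fq parameter.
            FilterQueries = new ISolrQuery[]
            {
                new SolrQueryByRange<int>("Size", 40, 52),
                new SolrQueryByField("ProductSubcategory", "Mountain Bikes"),
                new SolrQueryInList("Color", "Silver", "Black")
            }
        };

        SolrQueryResults<Bike> bikes = solr.Query(SolrQuery.All, options);

        Console.WriteLine("Total matches: " + bikes.NumFound);
        foreach (Bike bike in bikes)
        {
            Console.WriteLine(bike.ProductNumber + " " + bike.Name);
        }
        return bikes;
    }
}
```

Filter queries are cached by Solr independently of the main query, which is a common reason to prefer them for conditions like color or size that are reused across many searches.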