Episerver.Search: Prevent xml FTS indexing?


Hi

For some godforsaken reason, Episerver.Search indexes XML and CSV files for full-text search. When the XML files are large, this takes a lot of CPU on the server, and I think it's just ridiculous to assume that editors would wish to search XML file content.

Documentation for Episerver.Search is a joke. Is there an easy way to prevent indexing of XML and CSV files by extension or by file size?

#116992
Feb 09, 2015 15:23

Maybe you could find some inspiration in Ted's blog post about EPiServer Search:

http://tedgustaf.com/blog/2013/4/add-custom-fields-to-the-episerver-search-index-with-episerver-7/

#116997
Feb 09, 2015 15:52

Thanks Henrik! That pointed me in the right direction.

It turned out that the method EPiServer.Search.IndexingService.IndexingServiceHandler.HandleDataUri was the main culprit. It receives a DataUri parameter, which is a path to the file content, and handles reading the file and populating the item. All of this is very slow with big files. The method runs before the IndexingService.DocumentAdding event is raised, so handling events does little good here.

The easiest way to solve this seemed to be overriding EPiServer.Search.SearchHandler so that it does not pass the DataUri parameter at all, and then configuring StructureMap to use our own implementation instead of the default.

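using EPiServer.Search;

namespace Solita.Web.Utils.Performance
{
    // SearchHandler replacement that strips the file reference from every
    // index request, so the indexing service never reads file content.
    public class NonFileIndexingSearchHandler : SearchHandler
    {
        public override void UpdateIndex(IndexRequestItem item)
        {
            // Delegate to the two-argument overload so DataUri is cleared here too.
            UpdateIndex(item, null);
        }

        public override void UpdateIndex(IndexRequestItem item, string namedIndexingService)
        {
            // Never index data content; it's too slow.
            // Without a DataUri the indexing service has no file to read.
            item.DataUri = null;

            base.UpdateIndex(item, namedIndexingService);
        }
    }
}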
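
The StructureMap registration itself is not shown above. A minimal sketch of how it might look follows, assuming a configurable initialization module is used to swap the implementation. The class name SearchHandlerConfiguration and its namespace are made up for illustration, and the exact members of the module interface (for example Preload) depend on the EPiServer version, so treat this as a starting point rather than the exact code that was used.

```csharp
using EPiServer.Framework;
using EPiServer.Framework.Initialization;
using EPiServer.Search;
using EPiServer.ServiceLocation;
using Solita.Web.Utils.Performance;

namespace Solita.Web.Utils.Initialization // hypothetical namespace
{
    // Hypothetical module: replaces the default SearchHandler in the container.
    [InitializableModule]
    [ModuleDependency(typeof(EPiServer.Web.InitializationModule))]
    public class SearchHandlerConfiguration : IConfigurableModule
    {
        public void ConfigureContainer(ServiceConfigurationContext context)
        {
            // Resolve SearchHandler requests to the non-file-indexing version.
            context.Container.Configure(c =>
                c.For<SearchHandler>().Use<NonFileIndexingSearchHandler>());
        }

        public void Initialize(InitializationEngine context) { }

        public void Uninitialize(InitializationEngine context) { }

        // Assumption: required by the module interface in EPiServer 7.x;
        // later versions may not have this member.
        public void Preload(string[] parameters) { }
    }
}
```

Note that this only affects code that resolves SearchHandler through the container; anything that creates the handler directly would still use the default.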



#117006
Edited, Feb 09, 2015 23:30
 
#117010
Feb 10, 2015 4:56

Great! And thanks for sharing the solution to your problem!

#117013
Feb 10, 2015 6:00