Cannot get SearchDataSource to return Files


I'm having real trouble getting files returned in the search results. I think I have everything set up correctly, but I'm not sure whether I need the EPiServer Indexing Service running, the Microsoft Indexing Service running, or neither.

We are using the VirtualPathVersioningProvider for the file system. The indexingServiceCatalog attribute is set to "Web" for all three entries.

I have uploaded a PDF into the root of the Global folder of the file system. When I use part of the file name to search, I get no results back.

I have the EPiServer Indexing Service pointed at the VPP folders and running, but still no luck.

The Search Control code is:

SearchDataSourceCtrl.PageLink = PageReference.StartPage;
SearchDataSourceCtrl.SearchFiles = true;
SearchDataSourceCtrl.SearchLocations = "~/Global/,~/Documents/,~/PageFiles/";
SearchDataSourceCtrl.SearchQuery = Session["SearchTerm"].ToString();

 

When I search for phrases found in pages, the pages are returned, but files from the file system are not.

Any ideas?

Thanks, 

Jim.

#20353
May 27, 2008 17:00
Hi Jim,

When you are using the VirtualPathVersioningProvider you don't need to configure anything in Microsoft's Indexing Service.

Is it only PDF files that you cannot find? If you upload a .txt file, can you find that? Have you installed an IFilter to index PDF files?

Can you find files if you search in the file manager in edit mode?

You can also read this blog post written by Fredrik Haglund:
http://blog.fredrikhaglund.se/blog/2008/01/25/about-searching-inside-uploaded-files/

#20391
May 28, 2008 8:59

I've placed three versions of the same file (one PDF, one Word document and one plain text) in the same place. I have searched on the file name, the file contents and the metadata title, and no files are returned.

I have also searched for other files in the file system, such as images and Flash files, using part and all of the file name, and I still get no results.

I have not installed an IFilter for indexing PDFs. Is this a standard Microsoft add-on? As I cannot seem to find any files at all, though, I am not sure that this is the issue.

I have the Microsoft Indexing Service and the EPiServer Indexing Service pointing to the VPP folder, but when I search I still get nothing.

Is there something I have to do to include Indexing Service results in the search results?

 

#20403
May 28, 2008 10:55

I'm experiencing the exact same issue. No files are found whatsoever.

And if I combine both of these lines:

SearchDataSource.PageLink = PageReference.StartPage;
SearchDataSource.SearchLocations = "~/Globals,~/Documents/,~/PageFiles/";

I get this error message when I bind the asp:Repeater:

[NullReferenceException: Object reference not set to an instance of an object.]
EPiServer.Web.WebControls.GenericDataSourceView`1.ExecuteSelect(DataSourceSelectArguments arguments) +367

Entire code (commenting out PageLink returns zero results):

if (!String.IsNullOrEmpty(searchQuery))
{
    SearchDataSource.SearchQuery = searchQuery;
    SearchDataSource.PageLink = PageReference.StartPage;
    SearchDataSource.OnlyWholeWords = false;

    SearchDataSource.SearchFiles = true;
    SearchDataSource.SearchLocations = "~/Globals,~/Documents/,~/PageFiles/";
    //SearchDataSource.MainCatalog = "Web";

    SearchResult.Visible = true;
    SearchDataSource.DataBind();
    SearchResult.DataBind();
}

When I search in the edit-mode File Manager I receive:
VirtualPathProvider 'SiteGlobalFiles', Search Error: %Project_path%\trunk\VPP\Globals\index not a directory
VirtualPathProvider 'SitePageFiles', Search Error: %Project_path%\trunk\VPP\PageFiles\index not a directory
 

 

#20433
Edited, May 29, 2008 8:53
The error "%Project_path%\trunk\VPP\PageFiles\index not a directory" suggests that EPiServerIndexingService5 is not indexing your files, because the index directory should exist in your PageFiles folder.

Is the service running? Have you enabled indexing of the site via the manager? Click on the capabilities section under your site inside the manager.

To be able to search PDF files you need to install an IFilter from Adobe, as that is what lets us index the contents of PDFs. The IFilter is available on Adobe's site.

Jim, are you able to search for files in the file manager in edit mode?

#20443
May 29, 2008 16:36

There are two instances of EPiServer.IndexingService.exe running on my system (when I view the Windows Task Manager). Indexing is enabled (according to the EPiServerManager).

In the Services-dialog I can see that both "EPiServer Indexing Service" and "EPiServerIndexingService5" are started.

Any idea why I get an "Object reference ..." on the search results page? 

(Edit: I'm running EPiServer 5.1.422.256. Stopping "EPiServer Indexing Service" did not help.)

#20447
Edited, May 29, 2008 20:43
Hi,

No real idea why you are getting the object reference exception I'm afraid. Try to reindex your VPP by following these steps:

  1. Stop the indexing in the manager (make sure it removes the connection string from your configuration; use "view configuration" to check)
  2. Stop EPiServerIndexingService5
  3. Back up and delete the index folder in your VPP folder (in my case c:\vpp\5episerver\globals\index)
  4. Start EPiServerIndexingService5
  5. Start indexing in the manager

Start by making sure that you can search for your files in the file manager in edit mode.
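
The folder part of the steps above (step 3) could be scripted roughly like this. It is only a sketch in Python, assuming each VPP folder (Globals, Documents, PageFiles and so on) sits directly under one root and keeps its index in a subfolder named index, as described in this thread. The function name backup_index_folders and the timestamped backup layout are made up for illustration. Stopping and starting EPiServerIndexingService5 (steps 2 and 4) still has to be done separately, for example from the Services dialog.

```python
import shutil
import time
from pathlib import Path


def backup_index_folders(vpp_root, backup_root):
    """Move every <vpp_root>/<folder>/index directory into backup_root.

    Hypothetical helper: run it only while the indexing service is stopped.
    Returns a list of (original_path, backup_path) pairs that were moved.
    """
    vpp_root = Path(vpp_root)
    # One timestamped subfolder per run, so earlier backups are not overwritten.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for index_dir in sorted(vpp_root.glob("*/index")):
        if not index_dir.is_dir():
            continue  # a stray file named "index" is left alone
        target = Path(backup_root) / stamp / index_dir.parent.name / "index"
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(index_dir), str(target))
        moved.append((index_dir, target))
    return moved
```

Moving the folders instead of deleting them keeps the old index around in case the rebuild does not go as planned. If no index folder exists yet, the function simply returns an empty list.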

#20469
Jun 02, 2008 9:57

Hi Nicklas,

I have three versions of the same file in the file system: one PDF, one Microsoft Word document and one text file. They are all in the same folder, and when I try searching from within the file manager I get NO search results back.

I have tried your suggestion to get it to re-index, which seemed to work, as the index folders have been recreated.

I have not tried installing the PDF IFilter yet, but I would have thought the text and Word documents should be returned anyway.

Any suggestions on what to try next would be great.

 Thanks,

 Jim.

 

 

 

#20572
Jun 05, 2008 10:44

Hi,

Just tried it again and it worked.

I think I was suffering from early morning syndrome. Sorry.

 Jim.

#20573
Jun 05, 2008 10:52

Hi,

Great tip, I've got the search working in edit mode. I was unable to complete one of the steps, though: "Backup and delete the index folder in your vpp folder". There was no such folder; the index folders were only created after I restarted the indexing through the manager.

The "Object reference ..." error was due to a typo in one of the SearchLocations: it read "Globals" instead of "Global". I'm not sure if I copied the locations from a code example or just assumed the extra "s" since the other two end in "s".

Thanks for the help!

// Peter
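
A typo like this is easy to miss, since a wrong location only shows up as a NullReferenceException at bind time. As a rough sanity check, the sketch below (Python, standard library only) compares a SearchLocations string with the virtualPath attributes found in a web.config fragment. The function name unknown_search_locations is hypothetical, and reading the virtualPath attribute of the provider <add> elements is an assumption based on the provider configuration used in this thread.

```python
import xml.etree.ElementTree as ET


def _local_name(tag):
    # Drop an XML namespace prefix such as "{urn:something}add" -> "add".
    return tag.rsplit("}", 1)[-1]


def _normalize(path):
    # Compare case-insensitively and ignore a trailing slash.
    return path.strip().rstrip("/").lower()


def unknown_search_locations(web_config_xml, search_locations):
    """Return the SearchLocations entries that match no configured virtualPath."""
    root = ET.fromstring(web_config_xml)
    configured = {
        _normalize(el.get("virtualPath"))
        for el in root.iter()
        if _local_name(el.tag) == "add" and el.get("virtualPath")
    }
    return [
        loc.strip()
        for loc in search_locations.split(",")
        if loc.strip() and _normalize(loc) not in configured
    ]
```

With providers configured for ~/Global/, ~/Documents/ and ~/PageFiles/, passing "~/Globals,~/Documents/,~/PageFiles/" returns ['~/Globals'], which is the entry that caused the exception here.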

#20649
Jun 08, 2008 22:10

Hi.

I am having the same problem. I have reindexed my VPP folders and I can see that the index has been rebuilt. The search works in edit mode, but not when I search from the search page. When I debug, I can see that SearchFiles is set to true.

Here is my Web.Config:
<virtualPath customFileSummary="~/FileSummary.config">
<providers>               
<add showInFileManager="true" virtualName="Global Files" virtualPath="~/Global/" bypassAccessCheck="false" indexingServiceCatalog="Web" physicalPath="C:\VPPNaf\Globals" name="SiteGlobalFiles" type="EPiServer.Web.Hosting.VirtualPathVersioningProvider,EPiServer" />
<add showInFileManager="true" virtualName="Documents" virtualPath="~/Documents/" bypassAccessCheck="false" indexingServiceCatalog="Web" physicalPath="C:\VPPNaf\Documents" name="SiteDocuments" type="EPiServer.Web.Hosting.VirtualPathVersioningProvider,EPiServer" maxVersions="5" />
<add showInFileManager="false" virtualName="Page Files" virtualPath="~/PageFiles/" bypassAccessCheck="false" indexingServiceCatalog="Web" physicalPath="C:\VPPNaf\PageFiles" name="SitePageFiles" type="EPiServer.Web.Hosting.VirtualPathVersioningProvider,EPiServer" />
<add name="PathMappings" type="EPiServer.Web.Hosting.VirtualPathMappedProvider, EPiServer" />
</providers>
 </virtualPath>

Here is my Search.aspx.cs:
SearchDataSource.PageLink = PageReference.RootPage;
SearchDataSource.SearchFiles = true;
SearchDataSource.SearchLocations = "~/Global/,~/Documents/";

Here is my indexingservice configuration:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="episerver.indexingService" allowDefinition="MachineToApplication" allowLocation="false" type="EPiServer.IndexingService.ConfigurationHandler,EPiServer.IndexingService" />
  </configSections>
  <episerver.indexingService>
    <indexes>
      <add connectionString="Data Source=DBsource;Database=DBname;User Id=DBusername;Password=DBPassword;Network Library=DBMSSOCN;" databaseClient="" filePath="C:\VPP\Globals" itemRoot="/Global" />
      <add connectionString="Data Source=DBsource;Database=DBname;User Id=DBusername;Password=DBPassword;Network Library=DBMSSOCN;" databaseClient="" filePath="C:\VPP\Documents" itemRoot="/Documents" />
      <add connectionString="Data Source=DBsource;Database=DBname;User Id=DBusername;Password=DBPassword;Network Library=DBMSSOCN;" databaseClient="" filePath="C:\VPP\PageFiles" itemRoot="/PageFiles" />
    </indexes>
  </episerver.indexingService>
</configuration>

Can anyone help me with this matter?

BR,

Tore
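
One thing that can be checked mechanically with two configs like these is whether the indexing service points at the same physical folders as the VPP providers in web.config (a later reply in this thread recommends exactly that check). In the snippets above the roots differ (C:\VPPNaf\... versus C:\VPP\...), which may only be a side effect of tidying the post, but it is worth ruling out. A minimal Python sketch follows. The function name compare_vpp_paths is hypothetical, and pairing providers with indexes through virtualPath and itemRoot is an assumption based on the attributes shown above.

```python
import ntpath
import xml.etree.ElementTree as ET


def _local_name(tag):
    # Drop an XML namespace prefix such as "{urn:something}add" -> "add".
    return tag.rsplit("}", 1)[-1]


def _norm(path):
    # Windows-style comparison: case-insensitive, no trailing backslash.
    return ntpath.normcase(ntpath.normpath(path))


def compare_vpp_paths(web_config_xml, indexing_config_xml):
    """Return {virtualPath: (physicalPath, filePath)} for providers whose
    indexing-service filePath is missing (None) or points somewhere else."""
    indexed = {}
    for el in ET.fromstring(indexing_config_xml).iter():
        if _local_name(el.tag) == "add" and el.get("itemRoot") and el.get("filePath"):
            indexed[el.get("itemRoot").strip("/").lower()] = el.get("filePath")

    mismatches = {}
    for el in ET.fromstring(web_config_xml).iter():
        if _local_name(el.tag) != "add" or not el.get("physicalPath"):
            continue  # e.g. the PathMappings provider has no physical folder
        virtual = el.get("virtualPath", "")
        physical = el.get("physicalPath")
        file_path = indexed.get(virtual.strip("~/").lower())
        if file_path is None or _norm(file_path) != _norm(physical):
            mismatches[virtual] = (physical, file_path)
    return mismatches
```

An empty result means every provider with a physicalPath has an index entry pointing at the same folder. Anything else lists the pairs to look at by hand.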

#23310
Sep 03, 2008 9:45

Found a solution to my problem. I had EnableVisibleInMenu set to true, and the files returned by SearchDataSource.PerformUnifiedFileSystemSearch() have PageVisibleInMenu set to false. I reported this to EPiServer, and they have logged it in their system as #14072: Searchdatasource does not return any files if EnableVisibleInMenu is set to true.

Read this blog post to see a solution to the problem.

http://labs.episerver.com/en/Blogs/Tore-Gjerdrum/Dates/2008/9/Problems-with-SearchDataSource-and-EnableVisibleInMenu/ 

#23631
Edited, Sep 11, 2008 11:50

Hi again,

New site, new EPiServer version (R2), same problem.

"%%Project_path%%\trunk\VPP\Global\index not a directory".

How do I reindex the folder structure in R2? I can't find any configuration settings in the Installation Manager.

Why does it feel like I'm the only one with these problems? =)

Regards,

Peter

#25416
Oct 24, 2008 9:45

I haven't tried to reindex in R2 myself, but I would guess you can do it the same way, just without the manager part. Just make sure your connection strings are valid.

You'll find your IndexingService configuration here:

C:\Program Files\EPiServer\Shared\Services\Indexing Service\EPiServer.IndexingService.exe.config

#25425
Oct 24, 2008 11:46

I'm running R2 SP1 and cannot get any results for PDF or doc files when searching with the file manager in admin mode. PDF, txt and doc files have been uploaded, and I get results from the txt files but nothing else. The EPiServer Indexing Service is running and there are no errors in the event log. I have stopped the service and deleted all content under the corresponding index folders in each folder section, i.e. /Global, /Documents and /PageFiles. I can see that it builds up new files, but the search for PDF and doc files still does not work. The Adobe IFilter is installed twice. I'm using VirtualPathVersioningProvider for all three folders. One difference in this configuration is the connection string for the site's database, which uses an instance name like this: "Data Source=172.20.6.126\Applications;". It works for the site, though, so I don't believe that is the problem.

Any tips?

#28355
Mar 04, 2009 18:33

Exactly what steps must be taken to get the search to work in EPiServer?
I am using VirtualPathVersioningProvider for the files I intend to search for.

I get the following errors:

C:\EPiServer\VPP\website1\Global\index not a directory

EPiServer.Lucene

   at Lucene.Net.Store.FSDirectory.Init(FileInfo path, Boolean create)
   at Lucene.Net.Store.FSDirectory.GetDirectory(FileInfo file, Boolean create)
   at Lucene.Net.Search.IndexSearcher..ctor(String path)
   at EPiServer.Web.Hosting.Versioning.Store.LuceneQuery.Search(UnifiedSearchQuery userQuery)
   at EPiServer.Web.Hosting.VersioningDirectory.Search(UnifiedSearchQuery query)
   at EPiServer.Web.WebControls.SearchDataSource.PerformUnifiedFileSystemSearch()
   at EPiServer.Web.WebControls.SearchDataSource.PerformFileSearch(TextSearchParameters searchParams)
   at EPiServer.Web.WebControls.SearchDataSource.Select(DataSourceSelectArguments arguments)
   at EPiServer.Web.WebControls.GenericDataSourceView`1.ExecuteSelect(DataSourceSelectArguments arguments)

The index directory does exist on the web server's hard drive.
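
One thing worth ruling out with this error is the state of the index folders themselves. The sketch below (Python, standard library only) reports for each folder under a VPP root whether its index subfolder is missing, not a directory, unreadable, empty or apparently fine. The function name and the status strings are made up for illustration. Note that it checks access for the account running the script, which may well differ from the account the web application runs under.

```python
import os
from pathlib import Path


def describe_index_dirs(vpp_root):
    """Return {folder_name: status} for every folder directly under vpp_root.

    Hypothetical diagnostic; status is one of "missing", "not a directory",
    "not readable", "empty" or "ok".
    """
    report = {}
    folders = sorted(p for p in Path(vpp_root).iterdir() if p.is_dir())
    for folder in folders:
        index_dir = folder / "index"
        if not index_dir.exists():
            status = "missing"
        elif not index_dir.is_dir():
            status = "not a directory"
        elif not os.access(index_dir, os.R_OK):
            status = "not readable"
        elif not any(index_dir.iterdir()):
            status = "empty"
        else:
            status = "ok"
        report[folder.name] = status
    return report
```

If every folder reports "ok" for an administrator account while the site still fails, the permissions of the web application's own account on the VPP folders would be the next thing to look at.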

#35408
Dec 09, 2009 13:11
#35421
Dec 10, 2009 9:04

I have followed this blog post
http://labs.episerver.com/en/Blogs/Mari-Jorgensen/Dates/2009/11/Searching-for-files-in-EPiServer-CMS-5/

exactly as written, but I cannot get the search to work.
However, I have found that EPiServer.Lucene fails in the Init() method of the FSDirectory class, near the end of the method, where Directory.Exists(this.directory.Fullname) is called and the code throws an IOException.

 

#35422
Edited, Dec 10, 2009 9:10

Hi:

You mentioned 'Have you enabled indexing of the site via the manager? Click on the capabilities section under your site inside the manager.'

Are you talking about capabilities in EPiServer admin/edit mode or in the MMC snap-in? I'm not sure where you mean this should be enabled.

The service is running, I have index folders created in the VPP, and I have installed the IFilter. This is all running on a 64-bit machine.

Please advise, as I get no files back, neither docx nor PDF.

Regards

#37545
Mar 10, 2010 14:11

Hi

Got this working recently on a 64-bit machine. You need to download a 64-bit compatible version of the IFilter: http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025

#37547
Mar 10, 2010 15:03

I have been investigating the search as discussed above, but on an x64 machine.
I have installed the x64 IFilter supplied by Adobe. I have run the indexing service in debug mode and it seems to load the files into the index OK, except for .doc files (for .doc I get a message that the file could not be added to the index).
I am also using the versioning provider.
The file manager search in EPiServer is not returning files, not even txt files, although this article indicates it should: http://labs.episerver.com/en/Blogs/Mari-Jorgensen/Dates/2009/11/Searching-for-files-in-EPiServer-CMS-5/

Please advise how to query the Web catalog for verification purposes on a dev machine,
and which VPP provider to use.
Regards

#37565
Mar 11, 2010 11:31

Gareth, was there a specific way you set up your EPiServer site? Are there any environment settings I need to follow? The VPP is getting /index folders, but somewhere in the process the file manager does not know what's going on. Did you install the x64 IFilter on top of the 32-bit one? I only installed the x64 version.

#37569
Mar 11, 2010 13:20

I followed the same instructions here:

Installed the Microsoft Filter Pack.

Installed the 64-bit IFilter.

Check that C:\Program Files (x86)\EPiServer\Shared\Services\Indexing Service\EPiServer.IndexingService.exe.config points to the correct VPP and the correct database (same as web.config).

Delete the index folders in the PageFiles, Global and Documents folders in the VPP.

Stop the EPiServer Indexing Service. Restart IIS. Start the EPiServer Indexing Service.

New index folders should now be created in the VPP. Let me know if that works for you!

#37573
Mar 11, 2010 14:59

Thanks for the reply. Regarding the MS Filter Pack: you added that to the EPiServer database, right, using the script instructions?

I have restarted it multiple times and deleted the indexes.

Do text searches in the EPiServer file manager list files OK for you?

Can you please tell me if you are in fact using VirtualPathVersioningProvider?

Regards

#37574
Mar 11, 2010 15:20

Very interesting. It seems as though the index catalog called Web has not been created on my server. I assume this is supposed to be created on install? Our other office has sent me a screenshot of their indexes and they have it in there by default (which is consistent with EPiServer's docs).

Please advise whether yours was in fact there on install.

Regards

#37575
Mar 11, 2010 15:42

The MS Filter Pack is installed on the web server, not the database server. I just installed it as per the instructions on the MS website. The IFilter is also on the web server, not the database server.

Yes, files show up in the file manager search.

I'm also using the default VirtualPathVersioningProvider.

You can also have a look inside the index with Luke to see if the files are getting indexed.

#37576
Mar 11, 2010 15:53

This makes me wonder if you in fact have the 'Web' index catalog on your server?

#37577
Mar 11, 2010 16:02

What OS are you running, out of interest? I'm on Windows 7.

#37578
Mar 11, 2010 16:11

I have indexingServiceCatalog="Web" set in the config.

We have it on a Server 2008 x64 machine. Actually, looking now, Windows 7 is not listed as supported: http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025

Maybe you should contact Adobe for support?

#37579
Mar 11, 2010 16:35

Very interesting: my default install did not create a 'Web' catalog, hence my issue. It's more of an EPiServer issue in this case, since it can't index a Web catalog that does not exist. I ran the indexing service in DEBUG mode and the IFilter looks fine: it confirms creation of the *.pdf file on upload. Thanks for the Luke info, but that does not run on Win 7 either. :-(

What machine do you develop on, then? Can you show me the settings for your Web index catalog, please? I will create them manually myself. Aaarrgh, the frustration.

#37582
Mar 11, 2010 17:17

Windows 7 and EPiServer indexing do not seem to be compatible; that is my conclusion after a lot of frustration trying to fix it on my dev box!

On my Win 7 box I get the first index working by configuring the EPiServer indexing config etc., but after any further uploads with the file manager the /index folders and files do not update. Everything works on Server 2008, so if you can develop on that, do so, or use another OS for your dev box.

If you have to stay on Win 7, do the VPP backup, delete the /index folders and restart IIS and the EPiServer Indexing Service, as described higher up in this thread.

#38073
Mar 29, 2010 11:55