Scanning with Microsoft SharePoint

June 29, 2009

Enabling the PDF iFilter in SharePoint to Crawl Searchable PDFs

Filed under: sharepoint, posted by scanguru @ 2:33 pm
  • Out of the box, Microsoft SharePoint does not index the full text of PDFs. The steps below enable PDF indexing and also make the Adobe PDF icon appear next to PDF documents in SharePoint.
  • First, install Adobe Reader from http://get.adobe.com/reader/, since it includes the Adobe PDF iFilter.
  • Next, download the Adobe PDF icon, which SharePoint will display next to PDF documents. You can get it from http://www.adobe.com/images/pdficon_small.gif
  • Add the PDF file type to the extensions list for SharePoint search by editing the registry:
    • Start the registry editor by going to Start > Run and typing regedit
    • Open HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\{Random GUID}\Gather\Search\Extensions\ExtensionList
    • Add “pdf” to the list as a new String Value. Find the highest-numbered value in the list (typically 37) and create a new String Value named with the next number (38), with pdf (no quotes) as its data
  • Add the Acrobat PDF icon you downloaded above to the Microsoft SharePoint templates directory. Copy the icon called pdficon_small.gif into the folder “%programfiles%\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\IMAGES”.
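
The "find the highest number and add the next one" rule for the ExtensionList can be sketched in Python. This is only an illustration and not part of the original procedure: the function name next_extension_entry and the sample dictionary are hypothetical, and actually reading or writing the registry on the server is deliberately left out.

```python
from typing import Dict, Optional, Tuple


def next_extension_entry(
    existing: Dict[str, str], ext: str = "pdf"
) -> Optional[Tuple[str, str]]:
    """Return the (name, data) pair to add, or None if ext is already listed.

    `existing` maps value names (numbers stored as strings, e.g. "37")
    to extensions (e.g. "xlsx"), mirroring how the key looks in regedit.
    """
    # Nothing to add if the extension is already registered (case-insensitive).
    if ext.lower() in (data.lower() for data in existing.values()):
        return None
    # Skip non-numeric names such as the "(Default)" entry.
    numbers = [int(name) for name in existing if name.isdigit()]
    next_number = max(numbers, default=0) + 1
    return str(next_number), ext


# Hypothetical snapshot of the key; the real list typically runs up to 37.
sample = {"(Default)": "", "36": "xml", "37": "xlsx"}
print(next_extension_entry(sample))  # ('38', 'pdf')
```

Checking for an existing pdf entry first keeps the step safe to repeat, which helps if you are not sure whether the list was already edited.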

  • Bind the Adobe PDF icon to the PDF file type:
    • Open the file “%programfiles%\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\XML\DOCICON.XML”
      • Locate the <ByExtension> section inside <DocIcons>.
      • Add the mapping below (use plain straight quotes, not curly ones):
        <Mapping Key="pdf" Value="pdficon_small.gif" OpenControl="" />
  • Change the iFilter mapping in the registry:
    • Go to Start > Run and run regedit
    • Open the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\
    • Add (or modify) the .pdf key
    • Add a Multi-String value with value {E8978DA6-047F-4E3D-9C78-CDBE46041603}, or modify the existing value if another GUID is already there.
    • Open the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\
    • Add (or modify) the .pdf key
    • Add a Multi-String value with value {4C904448-74A9-11D0-AF6E-00C04FD8DC02}, or modify the existing value if another GUID is already there.
  • Add the Adobe Reader folder to the Path environment variable:
    • Open the System icon in the Control Panel
    • Open the Advanced tab
    • Click Environment Variables
    • Edit the Path variable
    • Add your Reader folder to the Path list, e.g. C:\Program Files\Adobe\Reader 9.0\Reader
  • Restart the Search service by restarting your server or by running the following commands from a command prompt:
    • Run: net stop osearch
    • Run: net start osearch
    • Run: iisreset
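
If you would rather script the DOCICON.XML edit from the icon step above than do it by hand (before restarting the services), a minimal Python sketch could look like the following. It assumes the file has a <DocIcons> root with a <ByExtension> child that holds <Mapping> elements. The function name add_pdf_mapping and the cut-down sample document are hypothetical. Note that ElementTree drops XML comments when it rewrites a document, so try this on a backup copy first.

```python
import xml.etree.ElementTree as ET


def add_pdf_mapping(docicon_xml: str, icon: str = "pdficon_small.gif") -> str:
    """Return the XML text with a pdf icon mapping under <ByExtension>."""
    root = ET.fromstring(docicon_xml)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        raise ValueError("no <ByExtension> section found")
    # Update an existing pdf mapping instead of adding a duplicate.
    for mapping in by_ext.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            mapping.set("Value", icon)
            break
    else:
        ET.SubElement(
            by_ext, "Mapping", {"Key": "pdf", "Value": icon, "OpenControl": ""}
        )
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily shortened stand-in for the real DOCICON.XML.
sample = """<DocIcons>
  <ByProgID />
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif" />
  </ByExtension>
  <Default><Mapping Value="icgen.gif" /></Default>
</DocIcons>"""

print(add_pdf_mapping(sample))
```

The function works on text rather than on the file itself, so you can inspect the result before writing it back to the TEMPLATE\XML folder.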
