Enabling the PDF iFilter in SharePoint to Crawl Searchable PDFs

  • Out of the box, Microsoft SharePoint does not index the full text of PDFs.  The steps below enable PDF indexing and also make the Adobe PDF icon appear next to PDF documents in SharePoint.
  • First, install Adobe Reader, which includes the Adobe IFilter, from http://get.adobe.com/reader/
  • Download the Adobe PDF icon.  This is the picture SharePoint will display next to PDF documents.  You can download it from http://www.adobe.com/images/pdficon_small.gif
  • Add the PDF file type to the Extensions List for SharePoint search by editing the registry
    • Start the registry editor by going to Run and typing regedit
    • Open HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\{Random GUID}\Gather\Search\Extensions\ExtensionList
    • Add “pdf” to the list as a new String Value. Find the highest-numbered value in the list, typically 37, and create a new String Value named with the next number (38) whose data is “pdf”
  • Add the Acrobat PDF icon you downloaded above to the Microsoft SharePoint templates directory. Copy the icon called pdficon_small.gif into the folder “%programfiles%\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\IMAGES”.
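
The "find the highest number and add one" rule for the ExtensionList can be sketched in a few lines of Python. This is only an illustration: the function name next_extension_entry and the sample entries are made up for the example, and reading or writing the real key (for instance with Python's winreg module on the server) is deliberately left out.

```python
def next_extension_entry(existing, extension="pdf"):
    """Return (name, data) for the new String Value, or None if already listed.

    `existing` maps value names (numeric strings such as "37") to the
    extension stored in each value.
    """
    wanted = extension.lower().lstrip(".")
    if any(data.lower() == wanted for data in existing.values()):
        return None  # the extension is already in the list; nothing to add
    numbers = [int(name) for name in existing if name.isdigit()]
    next_number = max(numbers, default=0) + 1
    return str(next_number), wanted


if __name__ == "__main__":
    # Hypothetical excerpt of an ExtensionList; real lists are longer.
    sample = {"35": "xls", "36": "xml", "37": "xsl"}
    print(next_extension_entry(sample))  # prints ('38', 'pdf')
```

Checking for an existing "pdf" entry first avoids adding a duplicate if the steps are repeated on the same server.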

  • Bind the PDF icon to the PDF file type
    • Open the file “%programfiles%\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\XML\DOCICON.XML”
    • Locate the <ByExtension> section of the file (inside <DocIcons>).
    • Add the mapping below:
      <Mapping Key="pdf" Value="pdficon_small.gif" OpenControl="" />
  • Change the iFilter mapping in the registry
    • Go to Start, Run, and run regedit
    • Open the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\
    • Add (or modify) the .pdf key
    • Add a Multi-String value with value {E8978DA6-047F-4E3D-9C78-CDBE46041603}, or modify the existing value if another GUID is already there.
    • Open the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\
    • Add (or modify) the .pdf key
    • Add a Multi-String value with value {4C904448-74A9-11D0-AF6E-00C04FD8DC02}, or modify the existing value if another GUID is already there.
  • Add the Adobe Reader folder to the Path environment variable
    • Open the System icon in the Control Panel
    • Open the Advanced tab
    • Click Environment Variables
    • Edit the Path variable
    • Add your Reader folder to the Path list, e.g. C:\Program Files\Adobe\Reader 9.0\Reader
  • Restart the Search service by restarting your server or by running the following commands from a command prompt:
    • Run: net stop osearch
    • Run: net start osearch
    • Run: iisreset
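
For the iFilter mapping step, it can help to see what a Multi-String value looks like in a .reg file. The Python sketch below builds .reg text that stores a GUID as a Multi-String (hex(7)) value. It assumes the GUID belongs in the default value of the .pdf key, which is how the second comment below describes it; the function names multi_sz_hex and pdf_filter_reg are invented for this example. Treat it as a starting point and try the output on a non-production server first.

```python
def multi_sz_hex(strings):
    """Encode strings the way a .reg file stores REG_MULTI_SZ: hex(7), UTF-16LE."""
    raw = b"".join(s.encode("utf-16-le") + b"\x00\x00" for s in strings)
    raw += b"\x00\x00"  # extra terminator that marks the end of the list
    return "hex(7):" + ",".join(f"{byte:02x}" for byte in raw)


def pdf_filter_reg(key_path, guid):
    """Build .reg text setting the default value of <key_path>\\.pdf (an assumption)."""
    return "\r\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key_path}\\.pdf]",
        f"@={multi_sz_hex([guid])}",
        "",
    ])


if __name__ == "__main__":
    # Key path and GUID copied from the steps above.
    wss_key = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools"
               r"\Web Server Extensions\12.0\Search\Setup"
               r"\ContentIndexCommon\Filters\Extension")
    print(pdf_filter_reg(wss_key, "{E8978DA6-047F-4E3D-9C78-CDBE46041603}"))
```

Regedit's own exports wrap long hex lines with a trailing backslash; this sketch writes the value on a single line to stay short.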
Posted in sharepoint
3 comments on “Enabling the PDF iFilter in SharePoint to Crawl Searchable PDFs”
  1. JoeShoes says:

    I know this is an old post. I’m trying this, but I have a newer version of SharePoint. In the registry I have 14, not 12, so I get lost when adding the registry values. Any help would be greatly appreciated.

  2. vapcguy says:

    You leave out having to do a Full Crawl, and maybe a reset of the index before content will appear, since SharePoint won’t recrawl something that hasn’t changed. Also, you left out additional locations to modify, where these entries are needed:
    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\Filters\.pdf]
    "Extension"="pdf"
    "FileTypeBucket"=dword:00000001
    "MimeTypes"="application/pdf"
    @=""
    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\Filters\.pdf]
    "Extension"="pdf"
    "FileTypeBucket"=dword:00000001
    "MimeTypes"="application/pdf"
    @=""
    You also leave out that, when they are creating the Multi-String entries, they’ll have to export that registry entry afterwards, remove the simple String default entry, and re-import, so the Default entry is in a Multi-String format. OpenControl isn’t necessary on the icon XML entry.
    Just offering the benefit of my hellish experience combing through multiple websites that had various pieces of the puzzle on this, and none seemed to have all the info needed in one place to get it working.
