While crawling Office documents that contain embedded images in SharePoint Server 2013, we receive the following warning:
“This item was partially parsed. The item has been truncated in the index because it exceeds the maximum size.”
This happens because the SharePoint Search document parser extracts the contents of the Office document when the crawler processes a file, but when the document contains an image, the parser does not know how to process it and reports the warning. Only the image is skipped: the text and metadata of the document are still crawled and searchable.
To solve this problem, the SharePoint farm has to be updated to the July 2014 CU, and a third-party iFilter has to be installed to parse the images. In this case, it is possible to use the following:
Microsoft Office 2010 Filter Packs
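Before installing the filter pack, it can help to confirm that the farm is already at the required patch level. The following is a minimal sketch, run from the SharePoint 2013 Management Shell. The build number 15.0.4631.1001 is my assumption for the July 2014 CU and should be verified against the official build list; the variable names are illustrative only.

```powershell
# Load the SharePoint snap-in if the script is run from a plain PowerShell console.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# ASSUMPTION: 15.0.4631.1001 is taken to be the July 2014 CU build number.
# Verify this value against the official SharePoint 2013 build list before relying on it.
$requiredBuild = [version]"15.0.4631.1001"

# BuildVersion is a System.Version, so it can be compared directly.
$farmBuild = (Get-SPFarm).BuildVersion

if ($farmBuild -ge $requiredBuild) {
    Write-Host "Farm build $farmBuild is at or above $requiredBuild."
} else {
    Write-Host "Farm build $farmBuild is below $requiredBuild. Install the CU first."
}
```

The farm build version reflects the configuration database, so individual servers may still need the update binaries installed and the configuration wizard run.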
To enable the iFilter, we need to use PowerShell commands, like the following:
$ssa = Get-SPEnterpriseSearchServiceApplication
Get-SPEnterpriseSearchFileFormat -SearchApplication $ssa -Identity docx
Set-SPEnterpriseSearchFileFormatState -SearchApplication $ssa -Identity docx -UseIFilter $true -Enable $true
# On each server that hosts the Content Processing component, the Search Host Controller service
# must be restarted to accept the changes. Use the following procedure:
net stop spsearchhostcontroller
net start spsearchhostcontroller
After completing these steps, start a full crawl and check that the documents that previously generated the warning are now reported as crawled successfully.
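The full crawl can also be started from PowerShell instead of Central Administration. This is a sketch under stated assumptions: the content source name "Local SharePoint sites" is the default name and may differ in your farm, and the 60-second polling interval is arbitrary.

```powershell
# Load the SharePoint snap-in if the script is run from a plain PowerShell console.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication

# ASSUMPTION: the content source uses the default name; replace it with your own.
$sourceName = "Local SharePoint sites"
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $sourceName

if ($cs.CrawlStatus -eq "Idle") {
    # Start the full crawl only when no other crawl is running on this source.
    $cs.StartFullCrawl()
    Write-Host "Full crawl requested for '$sourceName'."
} else {
    Write-Host "Content source is busy ($($cs.CrawlStatus)); waiting for it to become idle."
}

# Poll the content source until it returns to Idle.
do {
    Start-Sleep -Seconds 60
    $cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $sourceName
    Write-Host "Crawl status: $($cs.CrawlStatus)"
} while ($cs.CrawlStatus -ne "Idle")
```

If the content source was busy when the script started, no full crawl is requested, so run the script again after it finishes. Once a full crawl has completed, review the crawl log for the affected documents to confirm that the warning is gone.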