Content Crawler

contentCrawler is an integrated analysis, processing and reporting framework that assesses documents in a Document Management System (DMS) for bulk processing. Users can bulk process documents in the DMS with the OCR module, the Compression module, or both. For example, the OCR module converts all image-based documents in the DMS to text-searchable PDFs, and the Compression module then applies compression and downsampling to all PDFs, reducing their file size. The automated end-to-end process can run 24/7 without staff intervention, emailing periodic processing statistics and error reports to the IT Administrator. Staff no longer have to manage OCR or compression as a separate process or workflow.

Key Features

  • Increase organizational productivity
  • Simplify management of image-based documents
  • Reduce non-compliance risks
  • Increase efficiency through automation
  • Leverage investment in DMS and search technology
  • Reduce costs managing OCR and Compression technology

Technical Specifications



Processor

Minimum

  • 1 CPU
  • x86 or x64
  • 1.6 GHz or faster
  • Expected throughput of 5 to 10 seconds per page (processing 1 document at a time)


Recommended

  • 4 dedicated CPUs
  • Multiple CPU cores for faster parallel processing
  • Upgrade licenses are available for OCR processing on 8, 16 or 32 CPU cores
  • Expected throughput of 1 to 3 seconds per page (OCRing 4 documents at a time – 1 document per core)
  • Upgrading to more than 8 cores gives an expected OCR throughput of ½ to 1 second per page, depending on the speed of access to documents
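
As a rough planning aid, the per-page figures above can be turned into an estimated run time for a document backlog. The sketch below is illustrative only and is not part of contentCrawler: the function name `estimate_hours` and the backlog size are hypothetical, and it assumes the quoted seconds-per-page ranges describe overall throughput for the whole system.

```python
# Seconds-per-page ranges quoted in the specifications above.
THROUGHPUT_SECONDS_PER_PAGE = {
    "minimum (1 CPU)": (5.0, 10.0),
    "recommended (4 CPUs)": (1.0, 3.0),
    "more than 8 cores": (0.5, 1.0),
}


def estimate_hours(pages, seconds_per_page):
    """Return (best_case_hours, worst_case_hours) for a backlog of pages.

    Assumes the seconds-per-page range is the overall system throughput.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    low, high = seconds_per_page
    return pages * low / 3600, pages * high / 3600


if __name__ == "__main__":
    backlog_pages = 1_000_000  # hypothetical backlog size
    for label, rng in THROUGHPUT_SECONDS_PER_PAGE.items():
        best, worst = estimate_hours(backlog_pages, rng)
        print(f"{label}: {best:,.0f} to {worst:,.0f} hours")
```

Actual run times will also depend on document size and the speed of access to the DMS, so the result is best read as an order-of-magnitude estimate.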



Memory

Minimum

  • 4GB


Recommended

  • 8GB
  • Additional 1 – 2GB for each additional CPU core over 4
    Additional memory may be required if other application services run on the same system
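
The recommended memory rule above is simple arithmetic, and the following sketch works it through for a given core count. It is a hypothetical helper, not a vendor tool: the name `recommended_memory_gb` is an assumption, as is the reading that the 8GB baseline covers the first 4 cores. It does not account for other application services on the same system.

```python
def recommended_memory_gb(cpu_cores):
    """Return (low, high) recommended RAM in GB for the given core count.

    Assumes 8GB covers up to 4 cores, plus 1 to 2GB per core beyond 4.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    base_gb = 8                          # recommended baseline
    extra_cores = max(0, cpu_cores - 4)  # cores beyond the first 4
    return base_gb + 1 * extra_cores, base_gb + 2 * extra_cores


if __name__ == "__main__":
    for cores in (4, 8, 16, 32):
        low, high = recommended_memory_gb(cores)
        print(f"{cores} cores: {low} to {high} GB")
```

Under these assumptions an 8-core system works out to 12 to 16GB, and a 32-core system to 36 to 64GB.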

Hard Disk


Minimum

  • 100GB – free disk space for program files and typical operation
    Additional disk space may be required to support large documents or if the user wishes to pause processing before saving processed documents


Recommended

  • 100GB – free disk space for program files and typical operation, or
  • 50GB – free disk space for contentCrawler program files on the operating system drive and 50GB free disk space for a document cache for held documents
  • Additional free space may be required if the user chooses to hold documents for review before saving processed documents

Supported Operating Systems


Minimum

  • Microsoft® Windows Server® 2012 R2*
  • Microsoft® Windows Server® 2012*
  • Microsoft® Windows Server® 2008 R2 with SP1*
  • Microsoft® Windows Server® 2008 with SP2*


Recommended

  • Microsoft® Windows Server® 2012*
  • Microsoft® Windows Server® 2008 R2 with SP1*

(*) Not supported on Server Core Role.

Additional Requirements


Minimum

  • Microsoft .NET Framework 4.5 or 4.5.1. If your operating system is Server 2012 or higher, you may already have the required version of Microsoft .NET Framework.
    – If you do not have .NET 4.5 or 4.5.1 installed, download the full Microsoft .NET Framework 4.5 or 4.5.1 installer from Microsoft.


Recommended

  • Latest updates for Microsoft .NET Framework, such as Microsoft .NET Framework 4.5.1
