Deepak Nagpal

Sample HCI Plugin (OCR)

Discussion created by Deepak Nagpal on Feb 1, 2018
Latest reply on Feb 21, 2018 by Sandeep Rakshe

We are new to HCI development. We developed a sample OCR stage plugin and want to share it with the community.

 

This plugin has a specific use case: it processes scanned image files and converts them to text using the Tesseract (OCR) library. It then finds specific text or fields within the document and attaches them to the HCI document as metadata fields. The matching can be configured to the user's needs by providing a regular expression.
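
To illustrate the regular-expression step, here is a minimal sketch of how named fields could be pulled out of OCR text. It is not the plugin's actual code: the class name OcrFieldExtractor, the method extractFields, the field names and the sample text are all made up for illustration, and the Tesseract call and the HCI SDK are left out entirely.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the plugin or of the HCI SDK.
public class OcrFieldExtractor {

    // For each configured field, take the first regex match in the OCR text.
    // If the pattern has a capture group, group 1 is used as the value;
    // otherwise the whole match is used. Fields with no match are skipped.
    static Map<String, String> extractFields(String ocrText, Map<String, Pattern> fieldPatterns) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> entry : fieldPatterns.entrySet()) {
            Matcher m = entry.getValue().matcher(ocrText);
            if (m.find()) {
                String value = (m.groupCount() >= 1 && m.group(1) != null) ? m.group(1) : m.group();
                fields.put(entry.getKey(), value.trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Assumed sample of text as it might come back from OCR.
        String ocrText = "Invoice No: INV-20418\nDate: 2018-02-01\nTotal: 1,250.00";

        // Assumed field configuration: metadata field name -> regular expression.
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        patterns.put("invoiceNumber", Pattern.compile("Invoice No:\\s*(\\S+)"));
        patterns.put("invoiceDate", Pattern.compile("Date:\\s*(\\d{4}-\\d{2}-\\d{2})"));

        System.out.println(extractFields(ocrText, patterns));
    }
}
```

In a real stage, the returned map would presumably be copied onto the HCI document as metadata; how that is done depends on the HCI plugin SDK and is not shown here.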

 

The entire source code, the JAR, and a setup document can be downloaded from here (GitHub). Hope this helps the team; feel free to share your suggestions.

 

 

Note: For better results, use images scanned at 300 DPI.
