OCR Methodology

 
Greenhorn
Posts: 9
I am trying to figure out how to use OCR from the ground up, with Java somewhere in the back end. Our current system works as follows:
1) Scan multiple documents (using a Canon 5020) and save them as PDF (usually several hundred of the same type of document, each one for a different person).
2) Using a Java Swing GUI, the user opens each PDF document and assigns a type and a few other parameters to it. The system then stores the PDF in an appropriate location and enters the appropriate database information (the document is always stored as a PDF).

I want to skip the user step and automate it with OCR, but I'm not even sure where to start. Should I scan the documents, save them as PDF, and then use some OCR program to read through the PDF, or should they be saved in some other format and converted later? What are some good tools for this?
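
To make the goal concrete, here is a rough sketch of what the automated "assign a type" step could look like. It assumes some OCR engine has already turned a scanned page into plain text; the OCR call itself is not shown. The class name `DocumentTypeGuesser`, the document types, and the keywords are all hypothetical placeholders, not part of the existing system.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: guesses a document type from text that an OCR
// engine has already extracted from a scanned page.
class DocumentTypeGuesser {

    // Hypothetical keyword -> type table. Real forms would need their own
    // entries; insertion order decides which keyword wins on a tie.
    private static final Map<String, String> KEYWORDS = new LinkedHashMap<>();
    static {
        KEYWORDS.put("application for enrollment", "ENROLLMENT");
        KEYWORDS.put("consent form", "CONSENT");
        KEYWORDS.put("invoice", "INVOICE");
    }

    // Returns the first matching type, or "UNKNOWN" if no keyword is found.
    static String guessType(String ocrText) {
        if (ocrText == null) {
            return "UNKNOWN";
        }
        // OCR output tends to have irregular spacing and line breaks,
        // so lower-case it and collapse whitespace runs before matching.
        String normalized = ocrText.toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
        for (Map.Entry<String, String> entry : KEYWORDS.entrySet()) {
            if (normalized.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        System.out.println(guessType("CONSENT   FORM\nName: Jane Doe"));
    }
}
```

Anything that comes back as `UNKNOWN` could still be routed to the existing Swing GUI for a person to classify by hand.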

Thanks for any help!
 