Optical Character Recognition (OCR)
Optical Character Recognition, or OCR, is the technology that reads and interprets the text in a document. In effect, OCR works much like a human reader, recognizing characters such as letters, numbers, and symbols. As a result, OCR can greatly reduce, and potentially eliminate, manual indexing.
The data captured by OCR tells the document management software what type of document it is handling. For instance, suppose that you are scanning a set of patient files into your system. As the files are scanned, the OCR technology recognizes text such as patient name, date of birth, and social security number, all terms that are generally associated with a patient file. Based on the data identified, your system can then manage, store, and route the information appropriately.
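To make the idea concrete, here is a minimal sketch of how recognized text might be mapped to a document type. It assumes the OCR step has already produced plain text. The keyword lists, the "invoice" type, the `min_hits` threshold, and the function name are all hypothetical and not taken from any particular product.

```python
# Hypothetical keyword sets; a real system would configure these per document type.
DOCUMENT_KEYWORDS = {
    "patient_file": ["patient name", "date of birth", "social security number"],
    "invoice": ["invoice number", "amount due", "bill to"],
}


def classify_document(ocr_text, min_hits=2):
    """Return the document type with the most matching keywords, or None."""
    # Normalize case and whitespace so line breaks in the OCR output do not matter.
    text = " ".join(ocr_text.lower().split())
    best_type, best_hits = None, 0
    for doc_type, keywords in DOCUMENT_KEYWORDS.items():
        hits = sum(1 for keyword in keywords if keyword in text)
        if hits > best_hits:
            best_type, best_hits = doc_type, hits
    # Require a minimum number of matches before trusting the result.
    return best_type if best_hits >= min_hits else None


sample = """Patient Name: Jane Doe
Date of Birth: 01/02/1980
Social Security Number: 000-00-0000"""

print(classify_document(sample))
```

Once the type is known, the system can apply whatever storage or routing rule it has configured for that type.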
There are two Optical Character Recognition techniques: Zonal OCR and Full-Page OCR. Zonal OCR uses a set of coordinates to focus on a particular field within a document, whereas Full-Page OCR examines and extracts data from the entire document.
To distinguish the two, call to mind the set of patient files. If all of the information needed to tell your system that it is processing patient files is located in the upper right-hand corner of each document, then Zonal OCR is the more efficient method, because it can extract data from a specific region. However, if the fields that identify your documents as patient files appear in different places on each document, then Full-Page OCR is the more logical approach, because it searches each document completely.
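The difference can be sketched in a few lines. The example below is an illustration under stated assumptions, not how any specific OCR engine works: it simulates OCR output as a list of words with pixel coordinates, and it models Zonal OCR by keeping only the words that fall inside a rectangle. A real zonal pass would typically restrict recognition to that region of the image rather than filter afterward. The `Word` and `Zone` types, the page size, and the coordinates are all hypothetical.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    text: str
    x: int  # left edge, in pixels
    y: int  # top edge, in pixels
    w: int  # width, in pixels
    h: int  # height, in pixels


class Zone(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def full_page_text(words: List[Word]) -> str:
    """Full-Page OCR: keep every word found on the page."""
    return " ".join(word.text for word in words)


def zonal_text(words: List[Word], zone: Zone) -> str:
    """Zonal OCR: keep only words whose boxes lie entirely inside the zone."""
    inside = [
        word for word in words
        if word.x >= zone.left and word.y >= zone.top
        and word.x + word.w <= zone.right and word.y + word.h <= zone.bottom
    ]
    return " ".join(word.text for word in inside)


# Simulated OCR output for a hypothetical 850 x 1100 pixel page.
page = [
    Word("Patient", 600, 40, 70, 20),
    Word("File", 680, 40, 40, 20),
    Word("Notes:", 50, 300, 60, 20),
    Word("follow-up", 120, 300, 90, 20),
]

# The upper right-hand corner of the page, expressed as coordinates.
upper_right = Zone(left=425, top=0, right=850, bottom=275)

print(zonal_text(page, upper_right))  # Patient File
print(full_page_text(page))           # Patient File Notes: follow-up
```

The zonal result contains only the identifying label, while the full-page result contains everything, which is exactly the trade-off between focus and coverage.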
Like most technologies, both OCR techniques have pros and cons. While Zonal OCR's attention to a particular region may result in high accuracy rates, difficulty arises if a document type changes format, because the requisite information may then sit outside the region being OCR'd. With Full-Page OCR, the advantage and the disadvantage are one and the same: ALL of the information on the page is OCR'd, which means that EVERYTHING on the page is OCR'd, relevant or not. That load of information does not always lend itself to precision. The volume of information being OCR'd also affects the performance of your system. Thus, if you are scanning a large quantity of documents, Full-Page OCR may not be the better choice, because each page takes longer to OCR.
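One possible way to balance these trade-offs, offered here only as a sketch and not as something either technique prescribes, is to try the zone first and run the slower full-page pass only when the zone comes back empty. The two OCR passes are represented by hypothetical callables that return text, and the social security number pattern is an assumed example field.

```python
import re
from typing import Callable, Optional, Tuple

# Assumed format for the example field: digits grouped as 3-2-4.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn(text: str) -> Optional[str]:
    """Return the first SSN-shaped string in the text, or None."""
    match = SSN_PATTERN.search(text)
    return match.group(0) if match else None


def extract_ssn(
    run_zonal: Callable[[], str],
    run_full_page: Callable[[], str],
) -> Tuple[Optional[str], str]:
    """Try the zonal pass first; fall back to full-page only if it finds nothing."""
    ssn = find_ssn(run_zonal())
    if ssn is not None:
        return ssn, "zonal"
    # The zone was empty or the layout changed, so pay for the full-page pass.
    return find_ssn(run_full_page()), "full_page"


# Old layout: the field is inside the zone, so the full-page pass never runs.
print(extract_ssn(lambda: "SSN: 000-00-0000", lambda: "unused"))

# New layout: the zone no longer contains the field.
print(extract_ssn(lambda: "Clinic logo", lambda: "Clinic logo ... SSN: 000-00-0000"))
```

Because the passes are supplied as callables, the full-page pass costs nothing on documents where the zone still holds the field.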
Regardless of type, OCR technology can handle high volumes of information while reducing data entry errors. Hence, OCR is a cost-reducing tool that streamlines otherwise time-consuming practices.





