How the Document Scanning Process Works in 10 Simple Steps
by Jeff Osgar, Solutions Specialist // Document Management on Apr 10, 2018 8:03:00 AM
If you've ever looked around your office and thought that it would be great to get rid of all of the paper, you can do that.
Better yet, you already have what you need to get started – a copy machine.
The scanning functionality of your copier can convert paper documents into digital format, and capture software can make them easy to find again. Once your documents are digital, you can skip the filing cabinet and save them to a document management system or a simple file share, email them, or launch a business process such as accounts payable.
A quick word about terminology: there's an entire category of software and hardware products dedicated to the process of converting paper to digital documents. Capture, document imaging, forms processing, and even document management are all used to describe this IT industry segment.
I'd also like to point out that the concept of capturing a document is simple, but under the hood the technology is extremely complex. Capture becomes complicated once you need to scan large volumes of paper quickly, extract data from forms, or convert pages into full-text searchable documents rather than just capturing the image.
Capturing documents has a proven ROI. Research from industry association AIIM reveals that most capture installations report ROI in fewer than 12 months, often in less than half a year.
As a reminder, here’s a quick list of benefits:
- Reduce filing/storage costs
- Reduce distribution costs
- Protect/control information
- Improve access to information
- Comply with regulatory requirements
- Improve customer service
Ready to get started? Here are the steps you'll take to scan your documents.
Step 1: Two Types of Capture
Depending on your business objectives, you'll capture the image of the document and/or extract the relevant data from the document. Two types of capture software accomplish these tasks: document and data capture. These are two different technology processes.
Document capture converts a paper document into a digital replica of that document, most often as a PDF, JPEG, or TIFF file. The same software can also import and convert electronic files (Word documents, spreadsheets).
Data capture (sometimes called forms processing) extracts data from a business form. Credit card applications are an easy example. When you send one in, only the data on the form is captured and placed into a database; an image of the form usually isn't kept unless there's a need to keep an image of your signature. An invoice is another example. Data capture software can be “trained” to look at certain areas of a form (the upper right for a customer reference number, for example) and match the invoice data to that customer, which starts a workflow for payment.
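To make the idea of "looking at certain areas of a form" concrete, here is a minimal sketch in Python. It assumes the recognition step has already produced a list of words with page positions; the `Word` and `Zone` types, the field names, and the sample invoice values are all hypothetical and not taken from any particular capture product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    text: str
    x: float  # left edge as a fraction of page width (0.0 to 1.0)
    y: float  # top edge as a fraction of page height (0.0 to 1.0)

@dataclass(frozen=True)
class Zone:
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, word: Word) -> bool:
        return (self.left <= word.x <= self.right
                and self.top <= word.y <= self.bottom)

def extract_fields(words, zones):
    """Return {zone name: text found inside that zone}, in reading order."""
    fields = {}
    for zone in zones:
        hits = sorted((w for w in words if zone.contains(w)),
                      key=lambda w: (w.y, w.x))
        fields[zone.name] = " ".join(w.text for w in hits)
    return fields

# Hypothetical recognition output for one invoice page.
page_words = [
    Word("INVOICE", 0.05, 0.04),
    Word("Ref:", 0.70, 0.05),
    Word("CUST-1042", 0.78, 0.05),
    Word("Total", 0.60, 0.90),
    Word("118.40", 0.80, 0.90),
]

# Hypothetical template: the customer reference sits in the upper right.
invoice_template = [
    Zone("customer_ref", left=0.75, top=0.00, right=1.00, bottom=0.10),
    Zone("total", left=0.75, top=0.85, right=1.00, bottom=0.95),
]

fields = extract_fields(page_words, invoice_template)
print(fields)
```

In this sketch the template is just a list of rectangles, so "training" the software for a new form layout would amount to defining a new set of zones.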
Step 2: Prepping the Documents
Once you decide what information you need to capture, you need to prepare the paper documents for scanning. As everyone knows, paper jams are no fun. Remove staples and paper clips; remove sticky notes (or tape them down so they can be scanned too); repair torn pages; and sort the pages into batches (not always necessary, depending on how much you have to scan and the software you have).
Don't skip this step. If a paper clip or staple gets inside your copier or scanner, you might be shut down for a while as you try to fix it or have to call in a service tech.
Step 3: Conversion – Capture
Conversion is the step where software turns the analog document into a digital format (or ingests a born-digital document so that it can be placed into a workflow).
Documents can be captured via:
- Fax – Image quality is usually lower here, which could lower recognition accuracy.
- Camera phone – You have a scanner in your pocket with your phone camera and easily downloadable software.
- Copier/Multifunction Printer – From desktop to large-volume, nearly all have scanning functionality now.
- Scanner – Various speeds and models are available depending on daily scanning volume and business need, from desktop speeds of 10 pages per minute to 120 pages per minute and higher.
- Checks and microfilm – There are specialized scanners for both types of documents.
Once captured, the documents will go through some or all of the following steps: document imaging, forms processing, image cleanup, quality control, and recognition.
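The scanner speeds listed above translate directly into how long a backlog takes to feed. The short Python sketch below does that arithmetic; the 6,000-page backlog is a made-up figure, and the estimate covers feed time only, not document prep, jams, or quality control.

```python
import math

def scan_minutes(pages: int, pages_per_minute: int) -> int:
    """Minutes of pure feed time, rounded up to a whole minute."""
    if pages_per_minute <= 0:
        raise ValueError("pages_per_minute must be positive")
    return math.ceil(pages / pages_per_minute)

backlog_pages = 6000  # hypothetical backlog size

for ppm in (10, 120):
    minutes = scan_minutes(backlog_pages, ppm)
    print(f"{ppm} ppm: {minutes} minutes ({minutes / 60:.1f} hours)")
```

At 10 pages per minute the hypothetical backlog needs 600 minutes of feed time; at 120 pages per minute it needs 50.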
Step 4: Document Imaging
If you need to keep a digital representation of the document, it can be saved as one of a number of formats: TIFF (Tagged Image File Format), JPEG, PDF, PDF/A, or GIF (Graphics Interchange Format).
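If scanned files arrive from several devices, it can help to confirm what format each file really is rather than trusting its extension. The sketch below checks the well-known leading bytes of PDF, JPEG, TIFF, and GIF files; the function name `sniff_format` is just an illustrative choice. PDF/A files start with the same signature as ordinary PDFs, so this check alone cannot tell the two apart.

```python
# Leading byte signatures for the formats mentioned above.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),  # little-endian byte order
    (b"MM\x00*", "TIFF"),  # big-endian byte order
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
]

def sniff_format(header: bytes) -> str:
    """Guess the image format from the first few bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"

print(sniff_format(b"%PDF-1.7\n"))
print(sniff_format(b"II*\x00\x08\x00\x00\x00"))
print(sniff_format(b"plain text"))
```

To use it on a real file, read the first eight bytes in binary mode and pass them in, for example `sniff_format(open(path, "rb").read(8))`.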
Step 5: Forms Processing
For forms, the data and/or the entire form can be captured, depending on what your business needs. Data captured from a form can be moved into the correct database.
Step 6: Image Cleanup
When documents are old or in poor condition, the scanned images can be improved with software or hardware image cleanup functionality. Common features include:
- Deskew – straightens images scanned in crooked
- Despeckle – removes stray dots from the document
- Rotate – turns documents fed in incorrectly to the right orientation
- Blank and double-page detection – blank pages can be deleted and a double-feed alert allows you to rescan the document
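Two of the cleanup features above are simple enough to sketch in a few lines of Python. This is a toy illustration, not how any particular scanner implements them: the page is assumed to be a black-and-white image stored as a list of rows, where 1 is a black pixel and 0 is white, and the 1% ink threshold for a blank page is an arbitrary assumption.

```python
def despeckle(image):
    """Clear black pixels that have no black neighbor in the 8 surrounding cells."""
    if not image:
        return []
    rows, cols = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            has_neighbor = any(
                image[nr][nc] == 1
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
                if (nr, nc) != (r, c)
            )
            if not has_neighbor:
                cleaned[r][c] = 0
    return cleaned

def is_blank(image, max_ink_ratio=0.01):
    """Treat a page as blank when almost none of its pixels are black."""
    total = sum(len(row) for row in image)
    if total == 0:
        return True
    ink = sum(sum(row) for row in image)
    return ink / total <= max_ink_ratio

page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # isolated speck
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],  # part of a real stroke
    [0, 0, 0, 0, 0, 0],
]

cleaned = despeckle(page)
for row in cleaned:
    print(row)
print("blank:", is_blank(cleaned))
```

Running despeckle before the blank-page check matters in this sketch: a page containing only a stray speck would otherwise count as having content.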
Step 7: Quality Control
No software or data entry is perfect. In key-from-image operations, data can be validated by a second operator or via automated processes like database lookups. Poor-quality images are flagged and scanned again.
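Both validation approaches can be sketched briefly. In the Python example below, two operators key the same record and any field where they disagree is flagged, then the customer ID is checked against a list of known customers. The field names, customer IDs, and function names are hypothetical; a real system would query a database rather than an in-memory set.

```python
def double_key_check(first_pass: dict, second_pass: dict) -> list:
    """Return the names of fields where the two operators' entries disagree."""
    field_names = sorted(set(first_pass) | set(second_pass))
    return [name for name in field_names
            if first_pass.get(name) != second_pass.get(name)]

def lookup_check(record: dict, known_customers: set) -> bool:
    """Return True when the keyed customer ID exists in the reference data."""
    return record.get("customer_id") in known_customers

# Hypothetical reference data and keyed records.
known_customers = {"C-1001", "C-1002"}
operator_a = {"customer_id": "C-1001", "amount": "250.00"}
operator_b = {"customer_id": "C-1001", "amount": "205.00"}

mismatches = double_key_check(operator_a, operator_b)
print("fields to re-key:", mismatches)
print("customer found:", lookup_check(operator_a, known_customers))
```

Only the mismatched field needs a third look, which keeps the cost of double keying down.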
Step 8: Recognition
Recognition software is the heart of capture. It reads the content of the scanned image so that the data can be used, for example to index the document.
- OCR (optical character recognition) – Recognizes machine-printed characters.
- Zonal – Used where only specific fields on a form are required.
- Full-text – Free form document conversion allowing search on all words in the document.
- ICR (intelligent character recognition) – For hand-printed characters.
- OMR (optical mark recognition) – Recognizes check boxes, filled-in bubbles, etc.
- Barcodes – Read and extract information from a pre-printed barcode.
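Of the recognition types above, OMR is the easiest to illustrate without special libraries. The Python sketch below decides whether a check box is marked by measuring how much of its area is black. It assumes each box has already been cropped out as a small grid of 0 (white) and 1 (black) pixels; the box names and the 50% threshold are assumptions chosen for the example.

```python
def fill_ratio(region) -> float:
    """Fraction of pixels in the region that are black."""
    total = sum(len(row) for row in region)
    if total == 0:
        return 0.0
    return sum(sum(row) for row in region) / total

def read_marks(regions: dict, threshold: float = 0.5) -> dict:
    """Return {box name: True if the box looks filled in}."""
    return {name: fill_ratio(pixels) >= threshold
            for name, pixels in regions.items()}

# Hypothetical cropped check boxes from a scanned form.
checkboxes = {
    "newsletter": [[1, 1, 1], [1, 1, 0], [1, 1, 1]],  # 8 of 9 pixels black
    "paperless": [[0, 0, 0], [0, 1, 0], [0, 0, 0]],   # 1 of 9 pixels black
}

marks = read_marks(checkboxes)
print(marks)
```

A box that falls close to the threshold is a natural candidate to route to a person for review instead of deciding automatically.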
Step 9: Index
If you don't index your documents, you'll never find them again! The index can be full text or key fields, though a combination of both is best. There are a number of ways to index:
- Key from index fields (document type, date, customer name, etc.) – A data entry person manually indexes documents.
- Auto-indexing with barcodes – By storing form information on a barcode before scanning a batch of documents, certain index values can be automatically populated.
- Zone OCR – Index values are read automatically from predefined areas of the page.
- Ingest from other applications – Email, word processing, etc.; metadata from the document (subject line, sender, etc.) become the index fields.
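To show why combining key fields with full text works well, here is a minimal in-memory sketch in Python. The `DocumentIndex` class, the document IDs, and the sample field values are hypothetical; a document management system would store this in a database or search engine.

```python
import re
from collections import defaultdict

class DocumentIndex:
    """Toy index that supports both key-field and full-text lookups."""

    def __init__(self):
        self.key_fields = {}               # document ID -> {field: value}
        self.full_text = defaultdict(set)  # word -> set of document IDs

    def add(self, doc_id, fields, text):
        self.key_fields[doc_id] = dict(fields)
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.full_text[word].add(doc_id)

    def find_by_field(self, field, value):
        return sorted(doc_id for doc_id, fields in self.key_fields.items()
                      if fields.get(field) == value)

    def find_by_word(self, word):
        return sorted(self.full_text.get(word.lower(), set()))

index = DocumentIndex()
index.add("doc-1", {"type": "invoice", "customer": "Acme"},
          "Invoice for 40 toner cartridges")
index.add("doc-2", {"type": "contract", "customer": "Acme"},
          "Service contract covering toner supply")

print(index.find_by_field("type", "invoice"))
print(index.find_by_word("toner"))
```

Key fields answer precise questions such as "all invoices for this customer", while the full-text side finds documents by words nobody thought to index in advance.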
Step 10: Use Your Documents!
Now that your documents are digital, they'll be easily accessible to anyone in your organization (if they have the right permissions, of course). Use them to automatically launch workflows, make customer service faster, or simply eliminate the need for filing cabinets. You can also dispose of the paper documents according to your business records management policy.
If you're ready to remove paper from your business process, it's time for a document management consultation.