Quality Assurance is critical in any SharePoint Scanning Project
Ah, QA. Is it really necessary? During a recent session with Microsoft, we dove deep into the scanning QA process and some best practices for ensuring a high-quality end product. This is a summary of the best practices we shared during our meeting, and 10 critical steps we have found to be of the utmost importance in any SharePoint scanning operation:
- Have someone else check the work. Scanning can be a mundane, tiring task, and any knowledge worker processing more than 10 documents per day should have a downstream individual check both their images and data. A scanning best practice has always been to create a task-specific scanning "assembly line" where workers are responsible for specific tasks: Capture, Indexing, and QA.
- Use automation whenever possible. Ah, the human race…fantastic, but prone to errors. The technology now exists to automate data collection and data entry through a wide variety of means: Advanced Data Extraction (ADE), database lookups, and many others. Using them can reduce data entry errors and make the process much more efficient.
- Use validation and exception processing. Most true capture applications can automatically examine data and make sure it meets set criteria. The most basic check is a required field being left blank, but most of our customers use advanced validation that checks the pattern of entered data, runs database validation to ensure the data matches a corporate record, and uses custom scripting to enforce business rule sets.
- Always use expected count validation. Paper is paper is paper. Counting the pages, documents, and folders being scanned, and entering them into a validation interface before scanning, can ensure everything is scanned and accounted for. This enables a physical-to-digital validation and prevents poorly stacked and prepped paper records from being left out of the process.
- Use scanning hardware that can help. Most scanning hardware today has double-feed detection to make sure pages are not stuck together as they go through the feeder. These range from ultrasonic sensors that send a pulse through the paper to technologies that check the length of the image. Canon has some great technology in this arena that can save you the pain of finding out, 3 years down the line, that you don't have the most important page.
- Use QA sampling to save time. Make sure you utilize auto-viewing and sampling to speed up the process. Most true capture applications will allow you to run a process that samples every nth image or page. This can be a huge benefit in organizations scanning high volumes.
- Check the repository. QA during capture is all well and good, but there also needs to be a check of the end resting place. Why? To make sure the documents and data are being placed in the right location, with the right rules being applied and the right data fields populated.
- Use reporting to get a warm fuzzy. I always highly recommend using a dual-stream output. Place your images and metadata in your repository, and then your data and scanning statistics into a reporting DB. This can help you track your scanning operations and make sure all your documents are being processed. Some advanced customers bounce this data off a line-of-business system to find exceptions or missing documents.
- QA the images AND the data. These two go hand in hand. Having an interface with a dual view, where you can see both the batch/folder/document/image structure and a spreadsheet view of the data, is paramount to making sure all facets of your end product are in order. Some key features: interactive image cleanup tools, blank page flagging, and a thumbnail view.
- Involve the document owners. I have seen, time and time again, organizations that like to have a 3rd party or non-process owners scan, index, and QA documents. Unless the scanning operators are thoroughly trained in document types, classification, and data, this can be a recipe for disaster. The fix? Have a document expert QA the documents downstream in the capture workflow to catch any errors.
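The validation-and-exception bullet above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's actual API; the field names and regex patterns are hypothetical examples of the kind of rules a capture application would enforce:

```python
import re

# Hypothetical field rules for illustration: each field maps to a
# required flag and a regex pattern the entered value must match.
FIELD_RULES = {
    "invoice_number": {"required": True, "pattern": r"^INV-\d{6}$"},
    "scan_date":      {"required": True, "pattern": r"^\d{4}-\d{2}-\d{2}$"},
    "department":     {"required": False, "pattern": r"^[A-Za-z ]+$"},
}

def validate_record(record):
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    for field, rule in FIELD_RULES.items():
        value = record.get(field, "")
        if rule["required"] and not value:
            errors.append(f"{field}: required field is blank")
        elif value and not re.match(rule["pattern"], value):
            errors.append(f"{field}: value {value!r} does not match expected pattern")
    return errors

# A record with a malformed invoice number would be routed to exception handling.
print(validate_record({"invoice_number": "INV-12", "scan_date": "2024-01-15"}))
```

In a real capture workflow, records that return errors would be pushed into an exception queue for an operator to correct rather than simply printed.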
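The expected-count bullet boils down to a reconciliation check: counts entered before scanning are compared against what the scanner actually produced. A minimal sketch, assuming the counts arrive as simple dictionaries:

```python
def check_expected_counts(expected, actual):
    """Compare pre-scan physical counts against scanner output.

    Both arguments are dicts like {"folders": 2, "documents": 15, "pages": 120}.
    Returns a dict of mismatched units mapped to (expected, actual) pairs;
    an empty dict means the batch reconciles.
    """
    return {
        unit: (expected[unit], actual.get(unit, 0))
        for unit in expected
        if expected[unit] != actual.get(unit, 0)
    }

expected = {"folders": 2, "documents": 15, "pages": 120}
actual = {"folders": 2, "documents": 15, "pages": 118}
print(check_expected_counts(expected, actual))  # {'pages': (120, 118)}
```

A two-page shortfall like this is exactly the signal that a misfeed or a skipped sheet needs to be rescanned before the batch is released.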
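The "every nth image" sampling described in the QA-sampling bullet is a simple slice. The file names below are hypothetical placeholders for a batch of scanned pages:

```python
def sample_every_nth(items, n, offset=0):
    """Return every nth item, starting at offset, for QA review."""
    return items[offset::n]

# A batch of 1,000 hypothetical page images.
pages = [f"page_{i:04d}.tif" for i in range(1, 1001)]

# Pull every 50th page into the QA queue: 20 pages instead of 1,000.
qa_queue = sample_every_nth(pages, 50)
print(len(qa_queue))  # 20
```

The sampling rate is the trade-off knob: high-volume operations might review every 100th page, while a legally sensitive project might sample every 5th.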
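The dual-stream reporting bullet can be sketched with the reporting half of the pipeline. This assumes the images and metadata have already gone to the repository (not shown); the table schema and document IDs here are invented for illustration, using SQLite as a stand-in for the reporting DB:

```python
import sqlite3

# Stand-in reporting database; a production setup would point at a real server.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE scan_log (
    doc_id TEXT PRIMARY KEY, batch TEXT, pages INTEGER, status TEXT)""")

# Per-document statistics written alongside the repository output.
docs = [
    ("DOC-001", "BATCH-7", 12, "released"),
    ("DOC-002", "BATCH-7", 0,  "error"),     # zero pages: likely a misfeed
    ("DOC-003", "BATCH-7", 8,  "released"),
]
conn.executemany("INSERT INTO scan_log VALUES (?, ?, ?, ?)", docs)

# Exception report: anything that errored or produced no pages.
rows = conn.execute(
    "SELECT doc_id FROM scan_log WHERE status != 'released' OR pages = 0"
).fetchall()
print(rows)  # [('DOC-002',)]
```

Bouncing this table against a line-of-business system, as the bullet suggests, is then just a join on the document ID to flag records the business expects but the scan log never saw.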