Optical Character Recognition (OCR)

Downloadable page: Lesson on OCR

The Standards

WCAG 2.0 Guidelines:

Success Criterion 1.4.5 (Images of Text): "If the technologies being used can achieve the visual presentation, text is used to convey information rather than images of text except for the following: (Level AA)
- Customizable: The image of text can be visually customized to the user's requirements;
- Essential: A particular presentation of text is essential to the information being conveyed." (W3C)

What do the Standards Mean?

OCR is the process of converting scanned images of printed text into text that a computer or other electronic device can recognize.

What is scanning?

Most offices, personal or professional, now use all-in-one printers, which can print, scan, and fax documents. Scanning converts a printed document into an electronic file. The scanned file, however, may or may not be accessible. An inaccessible scan is an image whose contents cannot be recognized as text. What appears on the screen after the scan is visually “readable,” but assistive technology such as a screen reader “sees” only an image with no text.

What is OCR?

The process of converting the scanned image into text that the computer or assistive software can read is called OCR.

A scanned image can be saved as a PDF (a file format compatible with Adobe Reader). PDFs are commonly used to provide and disseminate documents in a learning environment. Saving a scanned image as a PDF, however, does not necessarily make it accessible.

There are several ways to check if your scanned PDF is an accessible file.

- Using the mouse, click in the document and try to select some text. If the whole page is selected as one big block, it is probably an image, and the document is inaccessible.
- Using your keyboard, run the find command (Control + F on a PC, Command + F on a Mac) and search for a word that appears on the page. If no text can be found within the document, it is inaccessible.
- Within Adobe Reader (free software), open the “View” menu and choose “Read Out Loud”, then “Activate Read Out Loud” (or press Control + Shift + Y). Then choose “Read This Page Only” (Control + Shift + V). If the reader starts to read the text, the document is accessible. If not, it may announce a blank page.
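
The find-command check above can also be approximated in a script. The sketch below is a minimal illustration, not a definitive test. It assumes the text of each page has already been extracted by some PDF library (that step is not shown), and the function name `pages_without_text` and the `min_chars` threshold are hypothetical choices made for this example. A page with almost no extractable text is probably a scanned image.

```python
def pages_without_text(page_texts, min_chars=1):
    """Return 1-based page numbers whose extracted text is (nearly) empty.

    page_texts: list of strings, one per page, as produced by a PDF
    library's text extraction (assumed; not part of this lesson).
    min_chars: hypothetical threshold; pages with fewer non-whitespace
    characters than this are treated as image-only.
    """
    flagged = []
    for number, text in enumerate(page_texts, start=1):
        # Count only visible characters so stray spaces or newlines
        # do not make an image-only page look like it has text.
        visible = "".join(text.split())
        if len(visible) < min_chars:
            flagged.append(number)
    return flagged


# Example: the second page has no text layer, so it likely needs OCR.
sample = ["Chapter 1\nIntroduction", "  \n", "Summary of results"]
print(pages_without_text(sample))
```

One way to obtain the per-page text, assuming the third-party pypdf package is installed, is `[page.extract_text() or "" for page in PdfReader(path).pages]`. A flagged page means OCR is needed. An unflagged page only shows that some text layer exists, not that the text is accurate or that the document is fully accessible.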

Scanned, inaccessible PDF images can be converted to accessible PDFs by running OCR on them. Many off-the-shelf software packages are available to complete this process. Operating systems also increasingly provide built-in OCR tools, and the all-in-one printer may come with software that performs OCR.
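
As one possible illustration of running OCR from a script, the sketch below calls OCRmyPDF, an open-source command-line tool, from Python. This tool is not named in the lesson and is only one of many options. It must be installed separately, and the file names and the helper names `build_ocr_command` and `add_text_layer` are hypothetical.

```python
import shutil
import subprocess


def build_ocr_command(source_pdf, output_pdf, language="eng"):
    """Build the OCRmyPDF command line for one scanned PDF."""
    # -l selects the recognition language; --skip-text leaves pages
    # that already contain text untouched instead of failing on them.
    return ["ocrmypdf", "-l", language, "--skip-text", source_pdf, output_pdf]


def add_text_layer(source_pdf, output_pdf, language="eng"):
    """Run OCR if the tool is installed; return True on success."""
    if shutil.which("ocrmypdf") is None:
        # Assumption: the tool is on PATH. If it is not, report failure
        # rather than raising, so callers can fall back to other software.
        return False
    result = subprocess.run(build_ocr_command(source_pdf, output_pdf, language))
    return result.returncode == 0


# Hypothetical file names, shown for illustration only.
print(build_ocr_command("scan.pdf", "scan_ocr.pdf"))
```

After conversion, the checks listed earlier (selecting text, the find command, and Read Out Loud) can be repeated on the output file to confirm that the text is now recognized.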

Resources

Information on this page is from the following resources about Optical Character Recognition (OCR):

PDF Accessibility. Retrieved December 30, 2015.

WebAIM is a non-profit organization associated with the Center for Persons with Disabilities at Utah State University. It is one of the leading resources for information on accessibility.

PDF Techniques. Retrieved December 30, 2015.

This website, w3.org, belongs to the World Wide Web Consortium (W3C), a community that develops web standards. The guides on this website include extensive recommendations, examples, related resources, and other helpful tools.

United States Access Board. Retrieved January 13, 2016.

This page contains the Section 508 standards as of January 2016. It does not contain the refreshed standards proposed in February 2015.

Quick Reference Guide to Section 508 Requirements and Standards. Section508.gov. Retrieved January 23, 2016.

This page contains the Section 508 standards as of January 2016. It does not contain the refreshed standards proposed in February 2015.

Estimated time: 5 minutes