ARTIFICIAL INTELLIGENCE
Teaching a Robot to Read
By: COLLEEN McCRETTON | AUGUST 18, 2021
Over the last several years, one of the FCAT AI teams - code-named “RoboReader” - has been working on document processing: extracting needed information from unstructured text and transforming it into structured data that the business can use. In the course of this work, we have noticed parallels between the way we are “teaching” the system and the way we read as humans.

When reading for work, most of us skim or scan the contents looking for words, phrases, or formatting that provide clues that something might be important to us. Information Foraging Theory,1 a concept that emerged in 1993 and compares the behavior of humans looking for information to that of animals looking for food, offers two reasons for this: (a) we want to maximize our reward (in the form of information or food) relative to our effort, and (b) as a result, we have developed learned behaviors that help us find what we are looking for quickly when reading for informational purposes.2 When we skim, our goal is to get the general gist of the information we seek, often focusing on indexes or tables of contents, titles, subtitles and headings, bulleted lists, bold or underlined words, tables, charts, and pictures. We also scan to find specific information, e.g., looking for specific words or phrases, ordering, or formatting on a page.3

In our project work, we found evidence of our business users employing these methods. In one use case, users always flipped to the last few pages of a document for the information they needed. In another, the important information was always in a bulleted list, and in yet another it was always in a table.

We used these observed behaviors when training our AI models. When teaching the system to process tabular data, we used image processing techniques to “visually” scan for lines indicative of a table. When teaching it to look for requests, which usually arrive as bulleted lists, we interpreted the formatting metadata that indicates such lists. We taught the system to recognize key:value pairs based on location and formatting cues, and to find monetary amounts, dates, addresses, and ID numbers the same way. Within paragraphs, we used leading and trailing language markers and letter case to teach it to identify names of people and companies and other specific relevant terms.
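To make the idea concrete, the sketch below shows one simplified way such formatting and location cues could be expressed in code. This is an illustrative Python example, not the RoboReader implementation; the regular expressions and the extract_cues helper are assumptions chosen for demonstration, and a production system would rely on trained models and far more robust patterns.

```python
import re

# Illustrative only: naive pattern-based stand-ins for the kinds of
# formatting cues described above (monetary amounts, dates, key:value
# pairs, and capitalized name candidates). These patterns are assumptions
# for demonstration, not the RoboReader team's actual rules.
MONEY_RE = re.compile(r"\$\s?\d[\d,]*(?:\.\d{2})?")         # e.g. $1,250.00
DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")         # e.g. 8/18/2021
KEY_VALUE_RE = re.compile(r"^\s*([A-Z][\w /]+?):\s*(.+)$")   # e.g. "Account: 12345"
NAME_RE = re.compile(r"\b(?:[A-Z][a-z]+\s+){1,3}[A-Z][a-z]+\b")  # adjacent capitalized words


def extract_cues(text: str) -> dict:
    """Scan text line by line, much as a human scans a page for formatting cues."""
    results = {"amounts": [], "dates": [], "pairs": {}, "name_candidates": []}
    for line in text.splitlines():
        results["amounts"].extend(MONEY_RE.findall(line))
        results["dates"].extend(DATE_RE.findall(line))
        kv = KEY_VALUE_RE.match(line)
        if kv:
            # A line shaped like "Label: value" is treated as a key:value pair.
            results["pairs"][kv.group(1).strip()] = kv.group(2).strip()
        # Runs of capitalized words are only candidates; a real system would
        # use surrounding language markers to confirm they are names.
        results["name_candidates"].extend(NAME_RE.findall(line))
    return results


if __name__ == "__main__":
    sample = (
        "Requested By: Jane Smith\n"
        "Amount: $1,250.00 due 8/18/2021\n"
        "Please transfer the funds to Acme Holdings."
    )
    print(extract_cues(sample))
```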

Yes, it is possible to teach a robot to read. It starts with understanding how we humans learn to read and transferring those same skills and techniques to our robot assistants. All of which perfectly illustrates the fact that, for humans and robots to be successful in their work, reading is fundamental.

Colleen McCretton is Director, User Experience Design, in FCAT
