A visually impaired client was recently referred to me by a funding agency so that I could help him select the best assistive technology for his needs. One of his requests was for a portable device that would let him do optical character recognition (OCR). He wanted something he could carry with him for reading on the go: something with which he could snap a picture of a page and have it read to him.
Prior to seeing me, he had been to a Mac dealer, who had recommended for this purpose a 16 GB iPad with Wi-Fi plus 3G and an App Store app called Prizmo.
I have some experience with OCR applications, having worked with them for more than 20 years. Only recently have we been able to use portable devices to provide OCR. The first device of this kind was the KNFB Reader, which married a digital camera to a PDA. This later evolved into software which could be installed on selected Nokia cell phones.
Because of my experience with these types of devices, I noticed one immediate flaw in the recommendation. To get effective OCR from any camera-based device, you need a camera of at least five megapixels. The iPad has a 720p high-definition camera, which is equivalent to only 0.92 megapixels (1280 × 720 pixels). No matter how good the Prizmo app was, there was no way it would be effective on an iPad.
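For anyone who wants to check the megapixel arithmetic, here is a minimal Python sketch. It assumes the standard 720p frame size of 1280 by 720 pixels; the five megapixel threshold is my own rule of thumb from above, not a formal specification, and the function name is just for illustration.

```python
def megapixels(width_px: int, height_px: int) -> float:
    """Return the pixel count of a frame, expressed in megapixels."""
    return width_px * height_px / 1_000_000


# Assumption: a 720p frame is 1280 x 720 pixels.
hd_720p = megapixels(1280, 720)

# Rule-of-thumb minimum for camera-based OCR, as stated in the text.
MIN_OCR_MEGAPIXELS = 5

print(f"720p frame: {hd_720p:.2f} MP")
print(f"Share of the 5 MP minimum: {hd_720p / MIN_OCR_MEGAPIXELS:.0%}")
```

In other words, the iPad camera delivers less than a fifth of the pixels I would want for OCR.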
I am an iPhone junkie. I love my iPhone and I have an iPad at home that I’m never able to use because I can’t pry it out of the hands of my children. The iPhone has a five megapixel digital camera which should be high enough resolution for OCR applications.
I went online and looked for reviews of the Prizmo app and found them to be quite favorable. It was, according to some, the best OCR application available for the iPhone. I was dubious, however, about OCR on the iPhone for two reasons. First, another supplier had, in the last year, come out with an OCR app for the iPhone for people with low vision. I had seen it demonstrated on a technology tour and had been less than impressed with the OCR results. Second, at a recent trade show I had met an executive from the company that makes the KNFB mobile reader and had asked him whether they planned to release the software as an iPhone app now that the iPhone had a five megapixel camera. He said, quite simply, “No,” because the quality of the phone’s camera wasn’t sufficient.
But this new recommendation got me wondering whether the Prizmo application might work better than past iPhone OCR applications. The application was only $10 and its speech output only an additional $3, so I decided to foot the bill and give it a try. The KNFB Reader, in comparison, is $995 US, or about 77 times the price.
The following is a comparison of the OCR results on three different types of documents. For each document, the first result is the output from the KNFB Reader on a Nokia N82 cellular phone and the second is the output from the Prizmo app on an iPhone 4. All documents were scanned under normal office fluorescent lighting on a desk with a plain light grey background.
Laser Printed Text
The first document is one many will be familiar with: the first page of Moby Dick. The original is a plain, single-column, laser-printed document with black text on a white background in a 10 point Arial font. It doesn’t get much simpler than this.
Text Results (formatting has not been altered other than the font.)
Moby Dick from the KNFB reader:
Call me Ishmael. Some years ago-never mind how
long precisely -having little or no money in my purse, and nothing particular
to interest me on shore, I thought I would sail about a little and see the
watery part of the world. It is a way I have of driving off the spleen, and
regulating the circulation. Whenever I find myself growing grim about the
mouth: whenever it is a damp, drizzly November in my soul; whenever I find
myself involuntarily pausing before coffin warehouses, and bringing up the
rear of every funeral I meet; and especially whenever my hypos get such an
upper hand of me, that it requires a strong moral principle to prevent me
from deliberately stepping into the street, and methodically knocking
people’s hats off-then, I account it high time to get to sea as soon as I can.
This is my substitute for pistol and ball. With a philosophical flourish
Cato throws himself upon his sword; I quietly take to the ship. There is
nothing surprising in this. If they but knew it. almost all men in their
degree, some time or other, cherish very nearly the same feelings towards the
ocean with me. There now is your insular city of the Manhattoes, belted round
by wharves as Indian isles by coral reefs-commerce surrounds it with her surf.
Right and left, the streets take you waterward. Its extreme down-town is
the battery, where that noble mole is washed by waves, and cooled by
breezes, which a few hours previous were out of sight of land. Look at the
crowds of water-gazers there. Circumambulate the city of a dreamy Sabbath
afternoon. Go from Corlears Hook to Coenties Slip, and from thence, by
Whitehall northward. What do you see?-Posted like silent sentinels all
around the town, stand thousands upon thousands of mortal men fixed in ocean
reveries. Some leaning against the spiles; some seated upon the pier-heads;
some looking over the bulwarks glasses!
Conclusion: Perfect. Not a single mistake. The formatting was even identical to that of the scanned original.
Moby Dick from the Prizmo app:
Call me Ishmael. Some years ago-never mind how long precisely -having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen, and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requtres a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off-then, I account it high time to get to sea as soon as I can.
r~ This is my substitute for pistol and ball With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, chedsh very nearly the same feelings towards the ocean with me There now is your insular city of the Manhattoee, belted round by wharves as Indian isles by coral reefs-commerce surrounds it with her surf Right and left, the streets take you waterward. Its extreme down-town is the battery, where that noble mole is washed by waves, and cooled by breezes, which a few hours previous were out of sight of land. Look at the crowds of water-gazers there. Circumambulate the city of a dreamy Sabbath afternoon. Go from Corlears Hook to Coenties Slip, and from thence, by Whitehall northward. What do you see?-Posted like silent sentinels all around the town, stand thousands upon thousands of mortal men fixed in ocean reveries. Some leaning against the spites; some seated upon the pier-heads; some looking over the bulwarks glassesl
Conclusion: Not bad! There were four recognition errors: “requires” was recognized as “requtres”, an extra “r~” was added for no good reason, “cherish” became “chedsh”, and the last exclamation point became a number 1. All in all, the text was readable. The formatting was lost, however.
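I counted these errors by eye, but the same comparison can be done mechanically when you have a trusted copy of the text. The following is only a sketch of that idea using Python's standard difflib module; it is not the method I used for this review, and the function name word_diffs is my own invention for illustration.

```python
import difflib


def word_diffs(reference: str, ocr_output: str) -> list[tuple[str, str]]:
    """Return (expected, got) pairs wherever the OCR words differ from the reference."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    matcher = difflib.SequenceMatcher(a=ref_words, b=ocr_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string on either side means a pure insertion or deletion.
            diffs.append((" ".join(ref_words[i1:i2]), " ".join(ocr_words[j1:j2])))
    return diffs


# Short fragments taken from the Moby Dick scans above.
reference = "that it requires a strong moral principle to prevent me"
ocr_output = "that it requtres a strong moral principle to prevent me"

for expected, got in word_diffs(reference, ocr_output):
    print(f"expected {expected!r}, got {got!r}")
```

Comparing word by word rather than character by character keeps the report close to what a listener would actually notice when the text is read aloud.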
The second document is a paperback novel, A Wicked Snow by Gregg Olsen. The scan was of two facing pages at a time, the start of chapter 12. In typical paperback fashion, the paper is thin and you can see some faint text from the pages behind. Its quality is OK, but not as clear as something printed on a laser printer, and the text size is smaller. The biggest problem with this type of document is that it is very difficult to get perfectly flat, which can cause both focus and shadow issues for a camera.
When I scanned this document, I turned the book sideways, as the cameras on both the Nokia N82 and the iPhone capture a frame that is taller than it is wide. You can either turn the camera or turn the book. This way you get the largest possible shooting area and can get as close as possible to the document, which should give you more accurate OCR results.
What I discovered, however, was that the Prizmo app has no auto-rotate feature, so I couldn’t shoot the document this way. The KNFB Reader not only has an auto-rotate feature, but it also has a field of view report which tells you how much of a document is visible. If you’re a blind user, this is critical to help you frame the document in the camera. Prizmo does not have this. The KNFB Reader also uses the phone’s built-in level to tell you whether you’re holding the camera level. The iPhone has this feature, but Prizmo doesn’t make use of it. So with Prizmo you really are shooting in the dark if you can’t use the viewfinder. The Prizmo app has some features to help you adjust the contrast of a document, but again, unless you can see the screen they are of no use. The KNFB Reader does this automatically. Essentially, the KNFB Reader is designed from the ground up as a scanning product for the blind. Prizmo is not.
In the interest of a fair comparison of the OCR results, I rescanned with both devices, with the document and camera in their normal orientation. Here are the results:
A Wicked Snow from the KNFB Reader:
Patty Masour knew the scanner codes better than
anyone in Rock Point, Oregon. She was the part-time
dispatcher at the Spruce County SherifTs Department,
a job she shared with her sister. Sandy. The code sput-
tering over the scanner next to her davenport meant
trouble, big trouble. Multiple homicide in the woods of the
county. She turned ofTher TV and told her husband she
felt uneasy about what she had half heard crackle, and
she dialed her sister.
“County SherifT. Merry Christmas and hello,” a
woman’s voice answered. Her voice was flat. her words
sounded as though they were read from a card. not
words from the heart.
“Yes, Patty? Oh dear,” she said when recognition
came. “Have your heard? They’re hauling bodies out of
the Logan family’s tree farm. I haven’t had time to call
you; things have been off the meter over here for the
past two and half hours!”
Logan?” Patty’s heart sank. She knew the family.
Everyone in Rock Point did. The Logan place had been
A WICKED SNOW
their destination just ten days before when she and her
children went to get their tree, a perfect, pyramidal-
shaped Noble fir.
“CIaire and her two little boys are missing. It’s real
bad out there. I mean realbadl Place is burning to the
ground and the girl…”
“Yeah, she’s the only survivor we know about”
Patty’s knees weakened, and she slid into the soft
folds of her velveteen davenport. “Michelle goes to
school with Hannah,” she said. Michelle was her daugh-
ter, thirteen. Patty hung on every word while her sister
went on about the investigation under way. She remem-
bered how Claire’s daughter had rung up the sale for
(lie Christmas tree in the little kiosk set up outside of
the wreath shed. She way a pretty girl, big brown eyes
with thick, ash-blond hair, held in a ponytail. Michelle
and Hannah had been in the same second- and fourth-
grade classes. They were best friends back then. By sev-
enth grade, though, they’d stopped seeing each other
outside of the classroom. Michelle told her mother that
Hannah was no longer much fun to be around. Patty
thought it might have had to do with what was going on
at home with the girl’s mother.
“They’re taking Hannah to the hospital for an exam,
then back here,” Sandy went on. “You got any clothes
that might fit her?”
UYcs,w Patty answered. “Hannah and Michelle are about
the same size.”
“Well, she hadn’t barely a stitch on when they found
her. She was wearing her nightgown and socks. Soaking
wet, too. The poor thing was out in the snow when they
Patty mumbled something about Christmas being
ruined, hung up, and spun around for her car keys. She
hurried to her daughter’s bedroom in search of some-
Conclusion: Not bad! There were five OCR errors, but for the most part the text was readable. Most of the errors came from letters being joined together, such as the word “Sheriff’s” becoming “SherifTs”.
A Wicked Snow from the Prizmo app:
J Chapter Twelve pmtv ~ knew the scanner codes ~ than amine in Rock Point, Oregon. She was the part-time d~her at the Spruce County Sheriff’s Department, a .~b ~ shared with her sister, Sand)’. The code sputtenng ~ the scanner next to her davenport meant troutae, big troutde. ~~/n atew0o oft~ wtmat She turned off her “IV and told her husband she felt uneasy abOut what she had half heard crackle, and dialed her sister.
“CounV~ Sheriff. Merry Christmas and hello,~ a womans voice answered. Her voice was flat, her words sounded as though they were read from a card, not words from the hea~ Patty? Oh dear.” she said when recogniOon came. “Have your heard? They’re hauling bodies out of the Loga~ family’s tree farm. I haven’t had time to yrm; things have been off the meter over here for the pa~ two and half hours!”
1~n?* Party’s heart sank. She knew the famlJy.
Everyone in Rock Point did. The Logan place had been ., ~. ~ ~-~:A~:~:~ ~’~ m ~st n d~ ~ wh~m sl~e and her im~ ou~ ~ 1 ~se-am ~d ~ IP~me~ m bmmmg to th~ l~t~’~ ~,ee~ i~r~ ~ she did into the f~alds ~ he~ ~ ~pocL ~MicF~elle goes to school ~ ~” ~ ~ .’~,~d~ehe was her daughbered lm~ ~’s daugh~er had rung up the sale for ~be ~ wee i~ the ~ ~ se~ up oumide c~ the ~r~ -~be~L ~e ~,~a~ pr~uy g~l, big brown eyes dak:~ ~ bah-. hc’kl m a ponytail. Michene and Hannah had heen m the ~me second- and fourthgr~ dass~ Th¢~ ~’~re best friends baf._k then. By sev.
enth grade. ~ th~-’d ~ seeing each Other outside of the ~ Miche~ told her mother that Hannah ~s no km~ much fun to be around. Patty tSo~ght i~ migh~ h~-e had to do with what was going on a~ home with tl~ girl’s mo~, ~’re taking Hannah tothe hospital [~ran exam.
then back here." Sandy ~nt oo. "You go~ any dothes ttuu might fit her?."
~'~x" Pa~ ~ ~ mxl Miche~ ~m~ ~ mine si~e "
"Well" she ~'t bart4v a stiw.h on ~m they found ~oo. The ~her~ nightgown and socks. Soaking round her.~ ~ out in the mow when they ~o--nerda-- v u spun, around for. her car key~ She.
ug~u~r s bedroom m search of som
Conclusion: Simply awful. The text was unreadable, and multiple attempts didn’t return any improved results. Even the optimize feature built into the software, which can really only be used by someone sighted, didn’t improve the output.
There was also a significant difference in the amount of time it took to yield the results. The KNFB Reader scanned this document and was reading within 17 seconds. The Prizmo app took 54 seconds before it yielded a result, and you have to manually tell it to start reading. In all, Prizmo took more than three times as long to produce an unreadable result. Very disappointing.
The third document I scanned was a bill from our security alarm company. Customers frequently ask for solutions to reading their bills. Bills have multiple columns and tables, with some places where you want the reader to read across columns and some places where you don’t. Most bills these days also have different colors and patterns in the background, which can further confuse OCR applications.
The KNFB Reader has a bills mode. By default it starts reading a document in documents mode, where it reads information in columns. Its bills mode is designed to turn off column recognition and to read things across a page rather than in columns. Using a combination of the documents mode and the bills mode, you can generally get a good idea of the information on the page, but there is no one mode that will do it perfectly.
The Prizmo app also has a bills mode, but I discovered that when you use this mode, after capturing the image it overlays a light blue bar on the captured picture. You are supposed to drag this bar left and right with your finger to select the dividing line between item descriptions and the dollar amounts on the bill. The problem is that this would only work with the very simplest of bills, and it can only be done by someone with very good eyesight. Fortunately for me, I have very good eyesight, so I carried on. The result is then output in a table format for reading.
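To make the difference between the two reading orders concrete, here is a small Python sketch. It is purely an illustration of reading by column versus reading across the page; I have no knowledge of how the KNFB Reader implements either mode. The Block class, the pixel positions, the column split and the dollar amounts are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    x: int  # left edge of the text block, in pixels (hypothetical)
    y: int  # top edge of the text block, in pixels (hypothetical)


def column_order(blocks: list[Block], column_split_x: int) -> list[str]:
    """Documents-style order: the whole left column, then the whole right column."""
    left = sorted((b for b in blocks if b.x < column_split_x), key=lambda b: b.y)
    right = sorted((b for b in blocks if b.x >= column_split_x), key=lambda b: b.y)
    return [b.text for b in left + right]


def row_order(blocks: list[Block]) -> list[str]:
    """Bills-style order: top to bottom, and left to right within each line."""
    return [b.text for b in sorted(blocks, key=lambda b: (b.y, b.x))]


# Invented two-line bill: descriptions on the left, amounts on the right.
blocks = [
    Block("Monitoring", 10, 100), Block("$25.00", 400, 100),
    Block("Discount", 10, 130), Block("-$5.00", 400, 130),
]

print(column_order(blocks, column_split_x=200))
print(row_order(blocks))
```

Read by column, the descriptions come out first and the amounts afterwards, so a listener has to work out which amount goes with which item. Read across the page, each amount follows its description. The sketch assumes blocks on the same line share exactly the same y value, which a real scan would not guarantee.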
Security Bill from KNFB Reader in Bills Mode
KEEP THIS PORTION
BILL TO: (E0281060)
SERVICE ADDRESS: (E0323534)
THE AROGA MARKETING GROUP
150 5055 JOYCE STREET
10621 100 AVENUE
W.O. Number Call Number Ticket
Are you moving
Don’t forget to let us know in advance. Contact
us during regular business hours:
t • BASIC ALARM MONITORING
9-EXTV.AR – LIFETIME SERVICE PROTECTION BASIC INSTRUSION
6 . SPECIAL DISCOUNT
For your convenience, make your payments
directly from your bank account or on your
without additional fees I
See form on reverse side.
Visit our new website at: www protectron corn
Please remit payment to:
Reliance Protectron Inc.
Invoicing due date
If you have already mailed your payment,
pleas disregard this notice
Conclusion: There were a few OCR errors, but the critical information, including the phone numbers and, most importantly, the bill’s subtotals and totals, was recognized properly.
Security Bill from the Prizmo App
|KEEP THIS PORgON||”0″|
|BILL TO: (E0281080) SER||”0″|
|ATTN:-JOE WONG THE||”67||110″|
|150 5055 JOYCE STREET||”16″|
|VAN~ BE; T~||”83″|
|Inv~ Dale Cm~omef W~O-||”11||710.11″|
|Are you moving?||”0″|
|Donl forget to let us knc~v in advance. ~||”0″|
|,.~ during ula business I’xx~:||”0″|
|For ~ convenie¢,,~, mak, e your paymenls||”0″|
|without ~ fees !||”0″|
|,See form on reverse side.||”0″|
|Tel~i imi~ -~||”110″|
|~ nm~ paymm~:||”0″|
Conclusion: Utterly useless. Not only was it largely unreadable, but none of the numbers presented made any sense. There is no way anyone would be able to read this bill, or likely any other complex bill, with this product.
The iPhone is a marvelous device. As someone who remembers men flying to the moon when NASA had a total of 64k of memory in their computers, the idea of packing around a multi-purpose computer with gigabytes of storage and such a staggering array of capabilities makes me fairly giddy. As an OCR device for the blind, however, it falls far short of the mark when compared to a KNFB Reader. Granted, the KNFB Reader is substantially more expensive to purchase up front, but if you’re really looking for quality OCR results, you are not going to get them from the current technology offered on the iPhone. Either the camera quality or the software quality, or some combination of the two, makes OCR on the iPhone impractical at best. Perhaps the iPhone 5 will finally have the combination of a camera and a flash that can convince KNFB Reading Technologies to release a KNFB Reader app. Until then, Nokia appears to have the ideal platform for the KNFB Reader.