A visually impaired client was recently referred to me by a funding agency for help in selecting the best assistive technology for his needs. One of his requests was for a portable device that would allow him to do optical character recognition (OCR). He wanted something he could carry with him for reading on the go, something with which he could snap a picture of a page and have it read to him.
Before seeing me he had been to a Mac dealer, who had recommended for this purpose a 16 GB iPad Wi-Fi + 3G and an App Store app called Prizmo.
I have some experience with OCR applications, having worked with them for more than 20 years. Only recently have we been able to use portable devices to provide OCR. The first device of this kind was the KNFB Reader, which married a digital camera to a PDA. This later evolved into software which could be installed on selected Nokia cell phones.
Because of my experience with these types of devices, I noticed one immediate flaw in the recommendation. To get effective OCR from any camera-based device you need a camera of at least five megapixels. The iPad has a 720p high-definition camera, which is equivalent to only about 0.92 megapixels. No matter how good the Prizmo app was, there was no way it would be effective on an iPad.
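The megapixel figure is simple arithmetic, and a rough sketch makes the gap concrete. The snippet below assumes a 1280 × 720 frame for 720p, 2592 × 1944 as a typical five megapixel frame, and a letter-size page (8.5 inches wide) filling the short side of the frame. Those dimensions and the helper names are assumptions chosen for illustration, not published specifications for either device.

```python
# Rough sketch: compare sensor resolutions for photographing a page.
# Frame sizes and the 8.5-inch page width are illustrative assumptions.

def megapixels(width_px: int, height_px: int) -> float:
    """Total pixel count expressed in millions of pixels."""
    return width_px * height_px / 1_000_000


def pixels_per_inch(short_side_px: int, page_width_in: float = 8.5) -> float:
    """Pixels available per inch when the page width fills the frame's short side."""
    return short_side_px / page_width_in


FRAMES = {
    "720p video camera": (1280, 720),
    "typical 5 MP still camera": (2592, 1944),
}

for name, (w, h) in FRAMES.items():
    print(f"{name}: {megapixels(w, h):.2f} MP, "
          f"about {pixels_per_inch(min(w, h)):.0f} pixels per inch across the page")
```

Under these assumptions the 720p frame leaves roughly 85 pixels per inch across a letter-size page, against roughly 229 for the five megapixel frame.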
I am an iPhone junkie. I love my iPhone and I have an iPad at home that I’m never able to use because I can’t pry it out of the hands of my children. The iPhone has a five megapixel digital camera which should be high enough resolution for OCR applications.
I went online and looked for reviews of the Prizmo app and found them to be quite favorable. It was, according to some, the best OCR application available for the iPhone. I was dubious, however, about OCR on the iPhone for two reasons. First, another supplier had, in the last year, come out with an OCR app for the iPhone for people with low vision. I had seen it demonstrated on a technology tour and had been less than impressed with the OCR results. Second, at a recent trade show I had met an executive from the company that makes the KNFB mobile reader and had asked him whether they planned to release their software as an iPhone app now that the iPhone had a five megapixel camera. He said, quite simply, “No,” because the quality of the phone’s camera wasn’t sufficient.
But this new recommendation got me wondering whether the Prizmo application might work better than past iPhone OCR applications. The application was only $10 and its speech output was only an additional $3, so I decided to foot the bill and give it a try. The KNFB Reader, in comparison, is $995 US, or about 77 times the price.
The following is a comparison of the OCR results on three different types of documents. For each document, the first result is the output from the KNFB Reader on a Nokia N82 cellular phone and the second is the output from the Prizmo app on an iPhone 4. All documents were scanned under normal office fluorescent lighting on a desk with a plain light grey background.
Laser Printed Text
The first document is one many will be familiar with. It is the first page of Moby Dick. The original is a plain, single-column, laser-printed document with black text on a white background in a 10 point Arial font. It doesn’t get much simpler than this.
Text Results (formatting has not been altered other than the font.)
Moby Dick from the KNFB reader:
Call me Ishmael. Some years ago-never mind how
long precisely -having little or no money in my purse, and nothing particular
to interest me on shore, I thought I would sail about a little and see the
watery part of the world. It is a way I have of driving off the spleen, and
regulating the circulation. Whenever I find myself growing grim about the
mouth: whenever it is a damp, drizzly November in my soul; whenever I find
myself involuntarily pausing before coffin warehouses, and bringing up the
rear of every funeral I meet; and especially whenever my hypos get such an
upper hand of me, that it requires a strong moral principle to prevent me
from deliberately stepping into the street, and methodically knocking
people’s hats off-then, I account it high time to get to sea as soon as I can.
This is my substitute for pistol and ball. With a philosophical flourish
Cato throws himself upon his sword; I quietly take to the ship. There is
nothing surprising in this. If they but knew it. almost all men in their
degree, some time or other, cherish very nearly the same feelings towards the
ocean with me. There now is your insular city of the Manhattoes, belted round
by wharves as Indian isles by coral reefs-commerce surrounds it with her surf.
Right and left, the streets take you waterward. Its extreme down-town is
the battery, where that noble mole is washed by waves, and cooled by
breezes, which a few hours previous were out of sight of land. Look at the
crowds of water-gazers there. Circumambulate the city of a dreamy Sabbath
afternoon. Go from Corlears Hook to Coenties Slip, and from thence, by
Whitehall northward. What do you see?-Posted like silent sentinels all
around the town, stand thousands upon thousands of mortal men fixed in ocean
reveries. Some leaning against the spiles; some seated upon the pier-heads;
some looking over the bulwarks glasses!
Conclusion: Perfect. Not a single mistake. The formatting was even identical to the original document which was scanned.
Moby Dick from the Prizmo app:
Call me Ishmael. Some years ago-never mind how long precisely -having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen, and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requtres a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people’s hats off-then, I account it high time to get to sea as soon as I can.
r~ This is my substitute for pistol and ball With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, chedsh very nearly the same feelings towards the ocean with me There now is your insular city of the Manhattoee, belted round by wharves as Indian isles by coral reefs-commerce surrounds it with her surf Right and left, the streets take you waterward. Its extreme down-town is the battery, where that noble mole is washed by waves, and cooled by breezes, which a few hours previous were out of sight of land. Look at the crowds of water-gazers there. Circumambulate the city of a dreamy Sabbath afternoon. Go from Corlears Hook to Coenties Slip, and from thence, by Whitehall northward. What do you see?-Posted like silent sentinels all around the town, stand thousands upon thousands of mortal men fixed in ocean reveries. Some leaning against the spites; some seated upon the pier-heads; some looking over the bulwarks glassesl
Conclusion: Not bad! Four recognition errors: “requires” was recognized as “requtres”, a stray “r~” was added for no good reason, “cherish” became “chedsh”, and the last exclamation point became a number 1. All in all the text was readable. Formatting was lost, however.
The second document is a paperback novel, A Wicked Snow by Gregg Olsen. The scan was of two facing pages at a time, the start of chapter 12. In typical paperback fashion the paper is thin and you can see some faint text from the pages behind. The print quality is OK, but not as clear as something printed on a laser printer, and the text size is smaller. The biggest problem with this type of document is that it is very difficult to get perfectly flat, which can cause both focus and shadow issues for a camera.
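A word-level comparison against the known source text is one way to cross-check a tally like this. The sketch below uses Python’s standard difflib module; the function name and the short excerpt are chosen for illustration, and it is not how the counts in this article were produced.

```python
# Minimal sketch: list word-level differences between a reference text and OCR output.
# Uses only the standard library; the sample strings are a short excerpt from the scan.
from difflib import SequenceMatcher


def word_differences(reference: str, ocr_output: str) -> list[tuple[str, str]]:
    """Return (expected, recognized) pairs for every span where the words differ."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    matcher = SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((" ".join(ref_words[i1:i2]), " ".join(ocr_words[j1:j2])))
    return differences


reference = "that it requires a strong moral principle to prevent me"
ocr_output = "that it requtres a strong moral principle to prevent me"

for expected, recognized in word_differences(reference, ocr_output):
    print(f"expected {expected!r}, got {recognized!r}")
```

Because the comparison is word by word, it ignores line breaks, so it says nothing about lost formatting.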
When I scanned this document, I turned the book sideways as the cameras on both the Nokia N82 and the iPhone shoot longer up and down than they do sideways. You can either turn the camera or turn the book. This way you get the largest possible shooting area and get as close as possible to the document which should help get you more accurate OCR results.
However, what I discovered was that the Prizmo app has no auto-rotate feature, so I couldn’t shoot the document this way. The KNFB Reader not only has an auto-rotate feature, but it also has a field of view report that tells you how much of a document is visible. If you’re a blind user this is critical to help you frame the document in the camera. Prizmo does not have this. The KNFB Reader also uses the phone’s built-in level to tell you whether you’re holding the camera level. The iPhone has this feature, but Prizmo doesn’t make use of it. So with Prizmo you’re really shooting in the dark if you can’t use the viewfinder. The Prizmo app has some features to help you adjust the contrast of a document, but again, unless you can see the screen they are of no use. The KNFB Reader does this automatically. Essentially, the KNFB Reader is designed from the ground up as a scanning product for the blind. Prizmo is not.
In the interest of a fair comparison of the OCR results, I rescanned with both devices, with the document and camera in their normal orientation. Here are the results:
A Wicked Snow from the KNFB Reader:
Patty Masour knew the scanner codes better than
anyone in Rock Point, Oregon. She was the part-time
dispatcher at the Spruce County SherifTs Department,
a job she shared with her sister. Sandy. The code sput-
tering over the scanner next to her davenport meant
trouble, big trouble. Multiple homicide in the woods of the
county. She turned ofTher TV and told her husband she
felt uneasy about what she had half heard crackle, and
she dialed her sister.
“County SherifT. Merry Christmas and hello,” a
woman’s voice answered. Her voice was flat. her words
sounded as though they were read from a card. not
words from the heart.
“Yes, Patty? Oh dear,” she said when recognition
came. “Have your heard? They’re hauling bodies out of
the Logan family’s tree farm. I haven’t had time to call
you; things have been off the meter over here for the
past two and half hours!”
Logan?” Patty’s heart sank. She knew the family.
Everyone in Rock Point did. The Logan place had been
A WICKED SNOW
their destination just ten days before when she and her
children went to get their tree, a perfect, pyramidal-
shaped Noble fir.
“CIaire and her two little boys are missing. It’s real
bad out there. I mean realbadl Place is burning to the
ground and the girl…”
“Yeah, she’s the only survivor we know about”
Patty’s knees weakened, and she slid into the soft
folds of her velveteen davenport. “Michelle goes to
school with Hannah,” she said. Michelle was her daugh-
ter, thirteen. Patty hung on every word while her sister
went on about the investigation under way. She remem-
bered how Claire’s daughter had rung up the sale for
(lie Christmas tree in the little kiosk set up outside of
the wreath shed. She way a pretty girl, big brown eyes
with thick, ash-blond hair, held in a ponytail. Michelle
and Hannah had been in the same second- and fourth-
grade classes. They were best friends back then. By sev-
enth grade, though, they’d stopped seeing each other
outside of the classroom. Michelle told her mother that
Hannah was no longer much fun to be around. Patty
thought it might have had to do with what was going on
at home with the girl’s mother.
“They’re taking Hannah to the hospital for an exam,
then back here,” Sandy went on. “You got any clothes
that might fit her?”
UYcs,w Patty answered. “Hannah and Michelle are about
the same size.”
“Well, she hadn’t barely a stitch on when they found
her. She was wearing her nightgown and socks. Soaking
wet, too. The poor thing was out in the snow when they
Patty mumbled something about Christmas being
ruined, hung up, and spun around for her car keys. She
hurried to her daughter’s bedroom in search of some-
Conclusion: Not bad! There were five OCR errors, but for the most part the text was readable. Most of the errors came from letters being joined together, such as the word Sheriff’s becoming SherifTs.
A Wicked Snow from the Prizmo app:
J Chapter Twelve pmtv ~ knew the scanner codes ~ than amine in Rock Point, Oregon. She was the part-time d~her at the Spruce County Sheriff’s Department, a .~b ~ shared with her sister, Sand)’. The code sputtenng ~ the scanner next to her davenport meant troutae, big troutde. ~~/n atew0o oft~ wtmat She turned off her “IV and told her husband she felt uneasy abOut what she had half heard crackle, and dialed her sister.
“CounV~ Sheriff. Merry Christmas and hello,~ a womans voice answered. Her voice was flat, her words sounded as though they were read from a card, not words from the hea~ Patty? Oh dear.” she said when recogniOon came. “Have your heard? They’re hauling bodies out of the Loga~ family’s tree farm. I haven’t had time to yrm; things have been off the meter over here for the pa~ two and half hours!”
1~n?* Party’s heart sank. She knew the famlJy.
Everyone in Rock Point did. The Logan place had been ., ~. ~ ~-~:A~:~:~ ~’~ m ~st n d~ ~ wh~m sl~e and her im~ ou~ ~ 1 ~se-am ~d ~ IP~me~ m bmmmg to th~ l~t~’~ ~,ee~ i~r~ ~ she did into the f~alds ~ he~ ~ ~pocL ~MicF~elle goes to school ~ ~” ~ ~ .’~,~d~ehe was her daughbered lm~ ~’s daugh~er had rung up the sale for ~be ~ wee i~ the ~ ~ se~ up oumide c~ the ~r~ -~be~L ~e ~,~a~ pr~uy g~l, big brown eyes dak:~ ~ bah-. hc’kl m a ponytail. Michene and Hannah had heen m the ~me second- and fourthgr~ dass~ Th¢~ ~’~re best friends baf._k then. By sev.
enth grade. ~ th~-‘d ~ seeing each Other outside of the ~ Miche~ told her mother that Hannah ~s no km~ much fun to be around. Patty tSo~ght i~ migh~ h~-e had to do with what was going on a~ home with tl~ girl’s mo~, ~’re taking Hannah tothe hospital [~ran exam.
then back here.” Sandy ~nt oo. “You go~ any dothes ttuu might fit her?.”
~’~x” Pa~ ~ ~ mxl Miche~ ~m~ ~ mine si~e ”
“Well” she ~’t bart4v a stiw.h on ~m they found ~oo. The ~her~ nightgown and socks. Soaking round her.~ ~ out in the mow when they ~o–nerda– v u spun, around for. her car key~ She.
ug~u~r s bedroom m search of som
Conclusion: Simply awful. The text was unreadable, and multiple attempts didn’t return any improved results. Even the optimize feature built into the software, which can really only be used by someone sighted, didn’t improve the output.
There was also a significant difference in the time it took to get results. The KNFB Reader scanned this document and was reading within 17 seconds. The Prizmo app took 54 seconds before it yielded a result, and you have to manually tell it to start reading. In all, it took more than three times as long to get an unreadable result from Prizmo. Very disappointing.
The third document I scanned was a security alarm bill from our security company. Customers frequently ask for solutions to reading their bills. There are multiple columns, tables and some places you want the reader to read across columns, and some places you don’t. Most bills these days also have different colors and patterns in the background which can further confuse OCR applications.
The KNFB Reader has a bills mode. By default it starts reading a document in documents mode, where it reads information in columns. Its bills mode is designed to turn off column recognition and read across the page rather than in columns. Using a combination of the documents mode and the bills mode you can generally get a good idea of the information on the page, but there is no one mode that will do it perfectly.
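The difference between the two modes comes down to reading order. The sketch below is not the KNFB Reader’s algorithm, which I have no details on; it only illustrates the idea using a hypothetical Word type and an invented two-column bill fragment with made-up amounts.

```python
# Illustrative sketch only: how reading order changes between a column-first
# ("documents") pass and a row-first ("bills") pass. The Word type and the
# sample layout are hypothetical; real OCR engines work from pixel coordinates.
from typing import NamedTuple


class Word(NamedTuple):
    column: int  # 0 = left column, 1 = right column
    row: int     # 0 = top line
    text: str


def read_by_columns(words: list[Word]) -> list[str]:
    """Read each column top to bottom before moving to the next column."""
    return [w.text for w in sorted(words, key=lambda w: (w.column, w.row))]


def read_across_rows(words: list[Word]) -> list[str]:
    """Read each line left to right, ignoring column boundaries."""
    return [w.text for w in sorted(words, key=lambda w: (w.row, w.column))]


# Invented sample data, not taken from the scanned bill.
bill = [
    Word(0, 0, "Monitoring"), Word(1, 0, "$25.00"),
    Word(0, 1, "Discount"),   Word(1, 1, "-$5.00"),
]

print(read_by_columns(bill))   # descriptions first, then amounts
print(read_across_rows(bill))  # each description followed by its amount
```

Read by columns, the listener hears every description before any amount; read across rows, each amount follows the item it belongs to, which is what you want for a bill.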
The Prizmo app also has a bills mode, but I discovered when you use this mode, after capturing the image, it brings up a light blue bar on screen which it overlays on the picture that’s been captured. You are supposed to drag this bar left and right with your finger to select the dividing line between item descriptions and the dollar amounts on the bill. The problem with this is that it would only work with the very simplest of bills, and it can only be done by someone with very good eyesight. Fortunately for me I have very good eyesight so I carried on. The result is then output in a table format for reading.
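To show why a single divider bar is so limiting, here is a minimal sketch of splitting one line of a bill at a chosen horizontal position. The function name, the x positions, and the amount are all invented for illustration; this is not Prizmo’s code.

```python
# Illustrative sketch only: splitting one line of a bill at a user-chosen
# divider position. Function and field names are hypothetical, and the
# x positions and amount are invented for the example.

def split_line(words: list[tuple[float, str]], divider_x: float) -> tuple[str, str]:
    """Words left of the divider form the description; the rest form the amount."""
    ordered = sorted(words)  # left to right by x position
    description = " ".join(text for x, text in ordered if x < divider_x)
    amount = " ".join(text for x, text in ordered if x >= divider_x)
    return description, amount


line = [(10.0, "BASIC"), (60.0, "ALARM"), (110.0, "MONITORING"), (400.0, "25.00")]

print(split_line(line, divider_x=300.0))  # divider placed between text and amount
print(split_line(line, divider_x=100.0))  # divider misplaced: text leaks into the amount
```

One vertical divider can separate only two columns, and a misplaced divider pushes description text into the amount field, so the result depends entirely on being able to see where to put the bar.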
Security Bill from KNFB Reader in Bills Mode
KEEP THIS PORTION
BILL TO: (E0281060)
SERVICE ADDRESS: (E0323534)
THE AROGA MARKETING GROUP
150 5055 JOYCE STREET
10621 100 AVENUE
W.O. Number Call Number Ticket
Are you moving
Don’t forget to let us know in advance. Contact
us during regular business hours:
t • BASIC ALARM MONITORING
9-EXTV.AR – LIFETIME SERVICE PROTECTION BASIC INSTRUSION
6 . SPECIAL DISCOUNT
For your convenience, make your payments
directly from your bank account or on your
without additional fees I
See form on reverse side.
Visit our new website at: www protectron corn
Please remit payment to:
Reliance Protectron Inc.
Invoicing due date
If you have already mailed your payment,
pleas disregard this notice
Conclusion: There were a few OCR errors, but the critical information, phone numbers, and most importantly the bill’s subtotals and totals were recognized properly.
Security Bill from the Prizmo App
|KEEP THIS PORgON||“0”|
|BILL TO: (E0281080) SER||“0”|
|ATTN:-JOE WONG THE||“67||110″|
|150 5055 JOYCE STREET||“16”|
|VAN~ BE; T~||“83”|
|Inv~ Dale Cm~omef W~O-||“11||710.11″|
|Are you moving?||“0”|
|Donl forget to let us knc~v in advance. ~||“0”|
|,.~ during ula business I’xx~:||“0”|
|For ~ convenie¢,,~, mak, e your paymenls||“0”|
|without ~ fees !||“0”|
|,See form on reverse side.||“0”|
|Tel~i imi~ -~||“110”|
|~ nm~ paymm~:||“0”|
Conclusion: Utterly useless. Not only was it largely unreadable but none of the numbers presented made any sense. There is no way anyone would be able to read this bill, or likely any other complex bill with this product.
The iPhone is a marvelous device. As someone who remembers men flying to the moon when NASA had a total of 64k of memory in their computers, the idea of packing around a multi-purpose computer with gigabytes of storage and such a staggering array of capabilities makes me fairly giddy. As an OCR device for the blind, however, it falls far short of the mark when compared to a KNFB Reader. Granted, the KNFB Reader is substantially more expensive to purchase up front, but if you’re really looking for quality OCR results, you are not going to get them from the current technology offered on the iPhone. Either the camera quality or the software quality, or some combination of the two, makes OCR on the iPhone impractical at best. Perhaps the iPhone 5 will finally have the combination of a camera and a flash that can convince KNFB Reading Technologies to release a KNFB Reader app. Until then, Nokia appears to have the ideal platform for the KNFB Reader.
I am surprised that you would have chosen the Prizmo app. I realize it was recommended by an ill informed salesperson at a Mac dealer. To compare to the KNFB reader, a better choice would have been the iAquared app Zoom reader.
Having continuously tested the OCR functions of many devices over the years, I have found that scanning on the iPhone fares even better if the scan is first performed and processed with the Scanner Pro app. I realize this is a two-step process, but with the KNFB reader you are limited to the Nokia phone with Talks, the entire proposition costing as much as a MacBook Pro.
I have also had excellent results using the Scanner Pro app on the iPhone and then processing the result with the inexpensive Pdfscanner app on my Mac.
The KNFB reader software remains an example of accessibility software outrageously overpriced well past its time.
Hopefully the advent of the 8 megapixel camera and better placement of the flash on the iPhone 5 will markedly improve the versatility and accessibility of the iPhone.
Can you give more detail on how you would use this Scanner Pro app, Robert? Do you first store it as an image and then OCR the image file?
I honestly can’t blame Robert and others for being excited about the iPhone. It does a lot of things well. Unfortunately, OCR is just not one of them. Search the VIPhone Google group and read the step-by-steps from iPhone users who are happy with Prizmo. You will see that with a good deal of time and patience it can be made to work. Unfortunately, most people, blind or sighted, simply would not have the stamina to make this process happen as it currently stands. Good for the brave few who have gotten it up and running, but for most of us it’s just not practical yet. Regarding ZoomReader, I hunted diligently for it on the App Store. After googling to find out that Zoom Reader is actually spelled ZoomReader and iaquared is actually spelled AI Squared (sorry, Robert), I still wasn’t able to turn it up on the App Store. So I went to the AI Squared website and found a link to it here: http://itunes.apple.com/us/app/zoomreader/id414117816?mt=8&ls=1
Look at the ratings. I can’t imagine what this company did to deserve this. They have a solid reputation in the AT industry and it’s only a fledgeling product.
No one would like this to fly more than I would. It’s tough to recommend that someone buy a three- or four-year-old handset and nls legacy software at a huge premium to get this job done. The thing is, the old crap works and the new stuff doesn’t, not for this. Should we recommend new technologies that don’t work simply because they are newer, cheaper, and supported? Or should we advise people to spend thousands on legacy technologies that aren’t supported, or will soon cease to be supported, on the basis that they do what the brochure says they do?
It’s the same on the Mac. There’s no fast, simple, reliable OCR solution except for the Eyepal camera, which costs $1400 here, the price of a 13 inch MacBook Pro. Should we not recommend the Eyepal, and force people to use Windows scanning packages in a virtual machine, just because the Eyepal is overpriced? Or should we pay for the convenience of a Mac product that is fast, reliable, and made simple to use for blind users?
It’s sad to see the current state of reading machines. I don’t think there’s been so much upheaval in the field since the demise of the Optacon.
I stick with older technology since it works. I recommend it, as well. It’s true that the newer stuff is not working as well as the older stuff. If anything newer was actually decent enough, I would recommend it. I think the biggest problem is that when visiting technology stores, the recommendations from sales staff will always be from sighted people. They will not understand our deeply dedicated needs.
As an insider and developer of OCR solutions for many years, I see a very clear explanation for this. The problem is that the whole OCR market for the blind is very small. Companies that pioneered the technology, like ABIsee and KNFB, invested a lot in research and development. Although the devices seem to be overpriced, companies that pioneered a technology and keep investing heavily in maintaining it cannot afford to sell at the price of newer devices that, as was pointed out in this discussion, do not work.
On the other side, once the original OCR devices were introduced 5-6 years ago and became known in the industry, a lot of people decided to copy them. There were dozens of attempts either to buy a Chinese camera for $20 and sell it in a package with commercial OCR for $500, or to quickly write an iPhone application and sell it for $10. As the development cost of such a product is minimal, and it requires almost no maintenance and support, they can afford to sell it for such a small price. Unfortunately, the result is that such devices end up being useless.
For Windows there is a new low-cost reading machine called SnapVision, available for only $199, which includes the OCR application, high quality text to speech voices, and a 5.0 MP document camera. Check out the web site at http://www.topocr.com
A great followup article:
If you are having difficulty getting the phone aligned with the document, or you’re getting blurry images and crappy OCR from hand shake, then do try the Giraffe Reader, a new folding stand for blind and partially sighted iPhone users for OCR. Order one from us at http://www.giraffe-reader.com