10-22-2007, 06:11 PM   #27
johnfowles
Senior Member
 
Join Date: Jun 2003
Location: New Jersey U.S.A. ex UK and Canada
Posts: 4,846
Re: "Lightfoot" Album Liner Notes

Having resurrected this thread, two replies in particular caught my eye,
the first very recent:
Quote:
Originally Posted by louisemnnpls
But, there is something to be said for the newsprint, now yellow with age, the old ticket stubs, even the smell of the scrapbooks, of those of us who have them.
the other from earlier this year

Quote:
Originally Posted by charlene
It seems that my scanner also has OCR functions..
who knew?
lolol
These reminded me that I have for a while wanted to present a little magnum opus: a guide showing what can be achieved with OCR (Optical Character Recognition), using an old yellowing cutting as an example.
I use the simply splendid HP Director program that is part of the comprehensive scanning suite supplied with HP "All-In-One" Printer/Scanner/Copiers, so pay attention Jesse Joe (he now has an HP PSC machine that hopefully came with the same or similar software to what I have here) and others who would like to use OCR. For Char, with her (I believe) Canon equivalent, things will be somewhat different but the same principles will apply.
I still have a yellowing cutting from the May 4 1968 issue of the old Montreal Star about the series of GL concerts that week. I did in fact attend my very first GL thrillorama in a proper theatre,

as opposed to the splendiferous New Penelope Coffee House where I had previously come under his spell one year before, on May 10th (click for setlist)
The HP Director program is most intelligent. Before you scan an image, it has already found during installation what image and text handling/displaying programs are on your computer, and if you select graphics there is a drop-down list


and for text


I produced the primary image scan


OK, at that size it is itself quite readable

In order to extract the text by OCR, it showed the relevant text editing programs. I selected Wordpad (as in many ways I prefer using that to the larger Word, primarily because it opens far quicker).


Trying to OCR a complex document in one fell swoop can get confusing



So with multiple-column pages such as this, I find it best to scan then OCR one column at a time and gradually copy and paste into a master Wordpad *.rtf (Rich Text Format) file. Thus I used the grab handles to limit the area to be scanned, progressively, starting with the top of the first column
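For anyone who would rather script the same column-at-a-time idea than drag grab handles, here is a rough Python sketch. It is not what HP Director does internally; the function names, the even-width column assumption and the `gutter` parameter are all made up for illustration, and the OCR step itself is left out (you would feed each box to whatever OCR tool you have). It only shows how to work out one scan area per column and then merge the per-column text into one master text.

```python
import re


def column_boxes(page_width, page_height, n_columns, gutter=0):
    """Return one (left, top, right, bottom) box per column, left to right.

    Assumes equal-width columns separated by a fixed gutter (hypothetical;
    a real newspaper page may need hand-tuned boxes).
    """
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    usable = page_width - gutter * (n_columns - 1)
    col_width = usable // n_columns
    boxes = []
    for i in range(n_columns):
        left = i * (col_width + gutter)
        boxes.append((left, 0, left + col_width, page_height))
    return boxes


def merge_columns(column_texts):
    """Join per-column OCR text in reading order into one master text."""
    master = "\n".join(t.strip() for t in column_texts if t.strip())
    # Rejoin words split by a hyphen at a line end: "con-\ncert" -> "concert".
    # Caveat: this also joins genuinely hyphenated words that happen to
    # break at a line end, so the result still needs a read-through.
    return re.sub(r"(\w)-\n(\w)", r"\1\2", master)


if __name__ == "__main__":
    for box in column_boxes(1000, 1500, 3, gutter=20):
        print(box)
    print(merge_columns(["Gordon Light-\nfoot sang", "at the con-", "cert hall\n"]))
```

As with the manual method, the merged text still wants correcting by eye afterwards; the script only saves the copying and pasting.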



it scans direct to wordpad



then processes it to convert the scanned image to text



following which Wordpad opens to show the text ready for correcting


Unretouched image of uncorrected text (click to view it full sized)

Pretty darned good I reckon!
Oh yes, there was on the same page this theatre advert


It does not need OCR to read that, just a whiff of nostalgia for the days long gone when the best seats at a Gord concert were a mere C$5.50 each (OK, at the exchange rate then, over US$6.00)

(of course one's salary was commensurately smaller too!!)
I completed the scanning and corrections, and here I intended showing the full text, but that would grossly exceed the apparent text limit of 1000 characters, so see my later message:-


Finally, "rubbies" (which is as printed in the article):
I googled and could only find that they are some sort of girls' clothing. Does anybody have any idea how that word fits into the text??
And note that, like the Montreal Star (in its day the superior Montreal daily), the Toronto Telegram has gone to that great printing press in the sky

Last edited by johnfowles; 10-22-2007 at 07:13 PM.