10-22-2007, 06:11 PM
|
#27
|
Senior Member
Join Date: Jun 2003
Location: New Jersey U.S.A. ex UK and Canada
Posts: 4,846
|
Re: "Lightfoot" Album Liner Notes
Having resurrected this thread, two replies in particular caught my eye,
the first very recent:
Quote:
Originally Posted by louisemnnpls
But, there is something to be said for the newsprint, now yellow with age, the old ticket stubs, even the smell of the scrapbooks, of those of us who have them.
|
the other from earlier this year
Quote:
Originally Posted by charlene
It seems that my scanner alo has OCR functions..
|
Quote:
Originally Posted by charlene
who knew?
lolol
|
These reminded me that I have for a while wanted to present a little magnum opus guide showing what can be achieved with OCR (Optical Character Recognition), using as an example an old yellowing cutting.
I use the simply splendid HP Director program that is part of the comprehensive scanning suite supplied with HP "All-In-One" Printer/Scanner/Copiers, so pay attention Jesse Joe (he now has an HP PSC machine that hopefully came with the same or similar software to what I have here) and others who would like to use OCR. For Char, with her (I believe) Canon equivalent, things will be somewhat different but the same principles will apply.
I still have a yellowing cutting from the May 4 1968 issue of the old Montreal Star about the series of GL concerts that week. I did in fact attend my very first GL thrillorama in a proper theatre,
as opposed to the splendiferous New Penelope Coffee House where I had previously come under his spell one year before, on May 10th (click for setlist)
The HP Director program is most intelligent: before you scan an image, it has already found during installation which image and text handling/displaying programs are on your computer. If you select graphics there is a drop-down list
and for text
I produced the primary image scan

OK, at that size it is quite readable in itself.
In order to extract the text by OCR it showed the relevant text editing programs. I selected Wordpad (as in many ways I prefer using that to the larger Word, primarily because it opens far quicker).

Trying to OCR a complex document in one fell swoop can get confusing.

So with multiple-column pages such as this I find it best to scan then OCR one column at a time and gradually copy and paste into a master Wordpad *.rtf (Rich Text Format) file. Thus I used the grab handles to limit the area to be scanned, working progressively and starting with the top of the first column.

It scans direct to Wordpad,

then processes it to convert the scanned image to text,

following which Wordpad opens to show the text ready for correcting.

Unretouched image of uncorrected text (click to view it full sized)
Pretty darned good I reckon!
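For anyone who would rather script the copy-and-paste stage than do it by hand, here is a minimal sketch in Python of the same column-at-a-time idea. It is not anything HP Director does itself; it assumes you have already saved the OCR text of each column as a separate string, in reading order. The function name merge_columns and the sample text in the usage lines are made up purely for illustration.

```python
import re


def merge_columns(columns):
    """Join per-column OCR text (given in reading order) into one master text."""
    paragraphs = []
    for column_text in columns:
        # Rejoin words the newspaper split with a hyphen at the end of a line,
        # e.g. "Light-\nfoot" becomes "Lightfoot". Caveat: a genuinely hyphenated
        # word that happens to break at a line end will lose its hyphen too.
        text = re.sub(r"(\w)-\n(\w)", r"\1\2", column_text.strip())
        # A blank line marks a paragraph break; single newlines are only the
        # narrow column's line breaks, so they are folded into spaces.
        for block in re.split(r"\n\s*\n", text):
            line = " ".join(block.split())
            if line:
                paragraphs.append(line)
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    # Invented sample text standing in for two OCR'd columns.
    sample = ["First col-\numn, line one\nline two", "Second column"]
    print(merge_columns(sample))
```

Note that a word split between the bottom of one column and the top of the next is not rejoined by this sketch, so the result still needs the same read-through and correction as the Wordpad version.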
Oh yes, there was on the same page this theatre advert

It does not need OCR to read that, just a whiff of nostalgia for the days long gone when the best seats at a Gord concert were a mere C$5.50 each (OK at the exchange rate then over US$6.00)
(of course one's salary was commensurately smaller too!!)
I completed the scanning and corrections, and here I intended showing the full text, but that would grossly exceed the apparent text limit of 1000 characters, so see my later message:-
Finally, "rubbies" (which is as printed in the article):
I googled and could only find that they are some sort of girls' clothing. Does anybody have any idea how that word fits into the text?
And note that, like the Montreal Star (in its day the superior Montreal daily), the Toronto Telegram has gone to that great printing press in the sky.
Last edited by johnfowles; 10-22-2007 at 07:13 PM.