OVGTSL 2007 – Part 4 – RDA
The final part of the conference focused on RDA. I think Dr. Tillett is the third member of the JSC I’ve heard speak on RDA. Every time I hear one of them, I’m very encouraged that things are moving in...
JCDL 2007
The ACM and IEEE put on their Joint Conference on Digital Libraries this week in Vancouver, B.C. While I was not able to stay for the full conference, which looked to have a great program, I was...
Did you mean: fluoride?
My dentist told me two noteworthy things yesterday: I need to floss more, and she misses the card catalog. I’ll leave aside my dental hygiene, it being a bit out of the scope of this blog, to focus on...
Digital to Print to Digital, or, Running in Circles
Rule: Don’t add unnecessary, value-subtracting steps. If a process already has these steps in it, take them out. Application: I’ve come to be responsible for an ongoing newspaper digitization project....
Fun with Acrobat
In my last post, I noted the need to convert some PDFs from a format suitable for a printer to a format suitable for online reading. The PDFs of the Muncie Times that I receive are laid out as spreads...
Measuring the Value of a Book
How do you measure the value of a book? One might ask several questions when determining what a book is worth: How meaningful is the content? Is it enjoyable to read? Can you learn from it? Does it...
FRBR: the Definitive Guide
Jonathan Rochkind recently posted his paraphrase of FRBR over at Bibliographic Wilderness. It is clear, concise, and accurate. From now on, I will consider it the definitive guide to FRBR Group 1...
OCR with OCRopus and Tesseract
While OCRing a batch of images through OmniPage the other day, I was silently cursing my computer. I had about 1,500 pages, and OmniPage was crashing after every second or third image. I’ve used...
Find Files by Size
Find all TIFFs smaller than 90 MB in a directory: $ find /dir/to/search -name '*.tif' -size -90M -exec ls -lh {} \; Get just the size and path and write to a file: $ find /dir/to/search -name '*.tif' -size...
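The first command above can be wrapped in a small shell function. This is only a sketch: the function name `find_smaller_than` and its three arguments are my own invention for illustration, not something from the original post. It shows why the name pattern should be quoted: otherwise the shell may expand `*.tif` against the current directory before `find` ever sees it.

```shell
# Hypothetical helper (name and arguments are assumptions for illustration).
# Usage: find_smaller_than DIR PATTERN LIMIT
#   DIR     directory to search
#   PATTERN file name pattern, e.g. '*.tif'
#   LIMIT   size limit in find(1) syntax, e.g. 90M or 1000c
find_smaller_than() {
    dir=$1
    pattern=$2
    limit=$3
    # Quoting "$pattern" passes the glob to find unexpanded.
    # The leading "-" on the size means "smaller than".
    find "$dir" -type f -name "$pattern" -size "-$limit"
}

# Example call (commented out so the snippet has no side effects):
# find_smaller_than /dir/to/search '*.tif' 90M
```

Each matching path is printed on its own line, so the output can be redirected to a file or piped to another command.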
Convert hOCR to PDF
As I mentioned recently, the OCRopus OCR software outputs an hOCR file. What is hOCR? hOCR is an open standard for representing OCR results in an HTML document (not to be confused with HOCR). It is...