Case Study: Rare & Old Books are Digitized using OCR Technology

The Client

Our client is a reputable library service provider based in Chicago that has offered reading services to the public for more than a century. As a result, the library holds a large repository of old books, some of which are rare.

The Requirement

The client wanted the books on its library shelves digitized economically. The text in the hardcover volumes had to be converted to digital form. Influenced by the reputation of MAPSystems, the client wanted to leverage our expertise in OCR conversion services to get the digitization done accurately and at a competitive cost. Our team had to scan, index and then convert the books’ pages to PDF format. The content also had to be searchable so that potential readers could filter and display relevant portions of the books easily. The images and illustrations had to be searchable as well. The client also counted on our previous experience in executing similar projects for leading publishing houses and other institutions.

The Challenges

The books in the library ran to more than 50,000 pages in total. Some of them dated back to the 19th and early 20th centuries. The fragile condition of the old books was a cause for concern, as extra care was needed to handle them. Some of the journals started crumbling at the edges when touched. To ensure that the hard copies were not damaged, our team had to carry out the document scanning for OCR manually.

The majority of the books carried illustrations and images, which had to be optimized for convenient viewing over the web. The main headings of each chapter had to be indexed. Because text conversion had to be done in HTML format, style sheets had to be applied in the correct context to ensure optimum readability on the web. The client wanted the OCR PDF conversion to be completely accurate.

The Process

We assigned experienced professionals to the job under the supervision of a Project Manager. The book pages were scanned manually and then subjected to optical character recognition. The main headings were indexed, and the images and illustrations were optimized for the web. Contemporary software packages were used for PDF and image compression; they also generated PDF files that were both web-optimized and searchable. The compressed PDFs allowed the digitized books to be stored in less disk space, which also shortened download times for users who want to save the PDF books on their own systems to read later. Metadata was created along with the index so that readers can search, filter and display all the content of the books, including images and illustrations.
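As an illustration only, the Python sketch below shows one way a metadata index of this kind could support searching and filtering. The case study does not name the software or data model used on the project, so the `BookRecord` fields, the `search` and `filter_by` helpers, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BookRecord:
    """Metadata for one digitized volume (all field names are illustrative)."""
    title: str
    author: str
    year: int
    topics: list[str] = field(default_factory=list)
    headings: list[str] = field(default_factory=list)       # indexed chapter headings
    illustrations: list[str] = field(default_factory=list)  # captions describing images


def search(records, text):
    """Return records whose title, author, headings or captions contain `text`."""
    needle = text.lower()
    hits = []
    for rec in records:
        haystack = [rec.title, rec.author, *rec.headings, *rec.illustrations]
        if any(needle in item.lower() for item in haystack):
            hits.append(rec)
    return hits


def filter_by(records, topic=None, year_from=None, year_to=None):
    """Return records matching every criterion that is not None."""
    results = []
    for rec in records:
        if topic is not None and topic.lower() not in (t.lower() for t in rec.topics):
            continue
        if year_from is not None and rec.year < year_from:
            continue
        if year_to is not None and rec.year > year_to:
            continue
        results.append(rec)
    return results


# Invented sample records, not taken from the client's collection.
catalog = [
    BookRecord("A Survey of Prairie Birds", "J. Example", 1887,
               topics=["natural history"],
               headings=["Waterfowl", "Songbirds"],
               illustrations=["Engraving of a heron"]),
    BookRecord("Notes on Bridge Building", "A. Sample", 1912,
               topics=["engineering"],
               headings=["Timber Trusses", "Steel Spans"],
               illustrations=["Diagram of a truss bridge"]),
]

if __name__ == "__main__":
    for rec in search(catalog, "heron"):
        print("search hit:", rec.title)
    for rec in filter_by(catalog, topic="engineering", year_from=1900):
        print("filter hit:", rec.title)
```

Storing captions for images and illustrations alongside the chapter headings is what lets a plain text search reach them too; a production system would more likely rely on a full-text search engine than on a linear scan like this one.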

The Result

MAPSystems efficiently completed the conversion of the hardcover books and journals in the client’s library to searchable, web-optimized PDF versions. Many rare volumes, whose hard copies are in an extremely fragile state, have been preserved in optimized format. Each digitized book can be searched by title, author’s name, year of publication, etc. Readers can also filter the content of the books by specific topics and display the results.

The client was very happy with our team’s efficiency and the accuracy with which the digitization and conversions were carried out. The client assured us that all such future projects would be assigned to us alone, and that it would offer us wide publicity through social platforms and word-of-mouth promotion.

Contact Us

MAPSystems is your one-stop destination for digitization of hardcover titles, web optimization and other pre-press services. We assure you of the lowest turnaround time in the industry, 99.99% accuracy, and cost-competitive rates. Our content conversion solutions are exhaustive. We can create dynamic digital content from any source. Poorly formatted data is efficiently converted to well-structured PDF versions, along with XML, HTML and SGML mark-up formats. Pictures and illustrations can be repurposed in usable formats.

Irrespective of the complexity of your project or the type of your industry, we can reliably cater to your digitization and conversion requirements. Connect with us to know more.
