Million Book Project
From Freepedia
The Million Book Project, led by Carnegie Mellon University's School of Computer Science and University Libraries, aims to digitize a million books by 2007. Working with government and research partners in India and China, the project scans the books, applies optical character recognition (OCR) to enable full-text searching, and provides free-to-read access to the books on the web. A pilot, the Thousand Book Project, was carried out first to test the concept.
As of summer 2005, over 400,000 books have been scanned. Most of the books are in the public domain, but permission has been acquired to include over 53,000 copyrighted books. The books will be mirrored at sites in India and China, at Carnegie Mellon, at the Internet Archive, and possibly at other locations. Not all of the books scanned to date are available online yet, and no single site holds copies of all the books that are available online.
The project also involves significant research, including machine translation and OCR for Indian languages and scripts. Other research areas include image processing, large-scale database management, and strategies for acquiring copyright permission at an affordable cost.
External links
- http://www.library.cmu.edu/Libraries/MBP_FAQ.html (Frequently Asked Questions)
- http://tera-3.ul.cs.cmu.edu/ (the Universal Library)
- http://www.archive.org/details/millionbooks (the archived pilot)



