
UCLA selects IBM to digitize the Hearst Newsreel Collection

One of the most important historical resources of the twentieth century


Yorktown Heights, NY, USA - 20 Oct 2003: In June, the University of California, Los Angeles engaged IBM to digitize the Hearst Metrotone Newsreel Collection, one of the most important historical resources of the twentieth century. The collection consists of approximately 850 hours of newsreel footage covering major events from 1914 to 1971, including the formation of the League of Nations in 1919, Charles Lindbergh's solo crossing of the Atlantic in 1927 and the first flights into outer space. The collection was bequeathed to the UCLA archive in 1981 and is documented primarily on aging paper records, including 675,000 typed index cards, 7,700 synopsis sheets and 190,000 disposition sheets. Access to the collection has been restricted in an effort to preserve the deteriorating paper catalogue. The cards provide detailed descriptions of the events documented on each newsreel and are irreplaceable historical documents.

For ten years, UCLA searched for an affordable way to digitize the paper documentation of the Hearst Newsreel Collection and create a searchable online database accessible to the general public. In August 2002, the UCLA Film and Television Archive and IBM began investigating the possibility of digitizing the paper records. Recently, with the help of IBM Research scientists in Haifa, Israel, the Archive began working with newly created software designed to make the project a reality.

When the project was initially brought to IBM's attention, Dr. Jeffrey Schick, director of Content Management Worldwide for IBM, was fascinated by the complexity of the Hearst documentation and recognized its research, education and historical value. He felt confident that IBM had the resources needed to tackle this project. IBM Haifa scientists worked for over half a year to develop software capable of performing highly accurate optical character recognition and to place the scanned material into discrete database fields automatically. The software uses innovative scanning technology that can be applied to complex and varied records, so the index cards can be scanned and image files saved using IBM Content Manager. Users will be able to search the database by subject, description or date.

Dr. Ehud Karnin, manager of Signal Processing and Image Technologies at the IBM Haifa Lab, noted that "IBM is very enthusiastic about the project and happy to collaborate on an effort with such historical significance. This research also opens the door to opportunities for digitizing many different archives, offering scholars and educators the chance to study information that would otherwise be inaccessible." For IBM itself, this project opens new doors for content management of large archives and the application of scientific innovations to this area.

Optical character recognition had previously been dismissed because the software necessary to complete the project at an acceptable quality level did not exist. When asked why IBM was able to meet the challenge head on where others could not, Dr. Karnin pointed to the Haifa lab’s advanced technology, which was adapted to the project’s tough problems. These technologies involve binarization, cleaning of the text, separation of individual characters, and identification of the characters. He further noted IBM’s Content Manager storage system, which is specifically designed for media storage.
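For readers unfamiliar with the terms, the first three steps named above (binarization, cleaning, and character separation) can be illustrated with a minimal sketch. This is not IBM's software, whose internals the release does not describe. It assumes a grayscale image given as rows of 0-255 values, a fixed threshold, and characters separated by blank columns. All function names are hypothetical, and the final identification step is left out.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels; in grayscale input, low values are dark ink


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (ink) or 0 (background) using a fixed threshold (an assumption)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_specks(binary: Image) -> Image:
    """Clear ink pixels that have no ink among their 8 neighbours (isolated noise)."""
    h, w = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


def split_characters(binary: Image) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, separated by blank columns."""
    w = len(binary[0])
    has_ink = [any(row[x] for row in binary) for x in range(w)]
    spans: List[Tuple[int, int]] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x  # a run of inked columns begins
        elif not ink and start is not None:
            spans.append((start, x))  # the run ended at a blank column
            start = None
    if start is not None:
        spans.append((start, w))  # the run reaches the right edge
    return spans
```

Calling `split_characters(remove_specks(binarize(gray)))` on a scanned line yields one column range per candidate character, each of which a recognizer could then classify. Real typed index cards would likely need more than this, for example an adaptive threshold and handling of touching characters.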

Following the creation of the online database, the second phase of the project will link digitized newsreels to the database. UCLA will format the newsreels (over 27 million feet of film) into high quality video masters. IBM's DB2 Content Manager Software will be used to link the film footage to the online database. Once this has been accomplished, the astonishing historical record that is the Hearst Metrotone Newsreel Collection will be easily accessible to the public via the Internet.

The first software prototype was recently launched for testing. It is now being refined for the project, which is scheduled for completion later this year. The dedicated team of Haifa scientists includes Dr. Yaakov Navor, Dr. Eugene Walach, Asaf Tzadok, Avichai Giat, and Dr. Ehud Karnin. In recognition of the project's significance, IBM is contributing the servers for the database as well as the necessary software.
