
IBM to Help CERN Build Massive Data Grid to Understand Origins of the Universe

IBM's Innovative Storage Virtualization Technology to Handle Up to One Petabyte of Data



GENEVA, Switzerland & ARMONK, N.Y. - 02 Apr 2003: IBM and the European Organization for Nuclear Research (CERN) today announced that IBM is joining the CERN openlab for DataGrid applications to collaborate in creating a massive data-management system built on Grid computing.

IBM's innovative storage virtualization and file management technology will play a pivotal role in this collaboration, which aims to create a data file system far larger than exists today to help scientists at the renowned particle physics research center understand some of the most fundamental questions about the nature of matter and the universe.

Conceived in IBM Research as Storage Tank®, the new technology is designed to provide scalable, high-performance and highly available management of huge amounts of data using a single file namespace regardless of where or on what operating system the data reside. IBM and CERN will work together to extend Storage Tank's capabilities so it can manage and provide access from any location worldwide to the unprecedented torrent of data -- billions of gigabytes a year -- that CERN's Large Hadron Collider (LHC) is expected to produce when it goes online in 2007. The LHC is the next-generation particle accelerator. It will recreate -- on a tiny scale -- conditions that existed shortly after the Big Bang, enabling researchers to answer outstanding questions about what the universe is made of and the laws that govern its behaviour.

The very same CERN community that invented the World Wide Web in 1990 is now developing a new application for the Internet -- Grid computing -- that will push technology limits with its data processing requirements for the LHC. CERN openlab is a collaboration between CERN and leading industrial partners, which aims to create and implement data-intensive Grid-computing technologies that will aid the LHC scientists. Because the same issues facing CERN are becoming increasingly important to the IT industry, the CERN openlab and its innovative partners -- which include Enterasys Networks, HP and Intel -- are eager to explore new computing and data management solutions far beyond today's Internet-based computing.

By 2005, the CERN openlab collaboration with IBM is expected to be able to handle up to a petabyte (a million gigabytes) of data, which is equivalent to the information stored in 20 million four-drawer filing cabinets full of paper, or 500 million floppy disks, or 1.5 million CD-ROMs.
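As a rough check on these equivalences, the short sketch below works backwards from the quoted counts to the storage each item would have to hold. It assumes a decimal petabyte (a million gigabytes of 10**9 bytes each), which is an assumption of this sketch and not something the announcement specifies; the variable and function names are illustrative only.

```python
# Assumption: decimal units, so one petabyte = 10**15 bytes.
PETABYTE_BYTES = 10**15

# Counts quoted in the announcement for one petabyte of data.
claimed_equivalents = {
    "four-drawer filing cabinets": 20_000_000,
    "floppy disks": 500_000_000,
    "CD-ROMs": 1_500_000,
}


def implied_megabytes_per_unit(count):
    """Return the capacity (in MB of 10**6 bytes) each unit must hold
    for `count` units to add up to one petabyte."""
    return PETABYTE_BYTES / count / 10**6


implied = {
    name: implied_megabytes_per_unit(count)
    for name, count in claimed_equivalents.items()
}

for name, megabytes in implied.items():
    print(f"{name}: about {megabytes:.0f} MB each")
```

Under that assumption, the quoted counts imply roughly 2 MB per floppy disk, about 667 MB per CD-ROM, and 50 MB of paper records per filing cabinet, so the figures read as order-of-magnitude comparisons.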

"CERN has a long-standing collaborative relationship with IBM, and we are delighted that IBM is joining the CERN openlab for DataGrid applications," said Wolfgang von Ruden, Information Technology Division Leader and Head of the CERN openlab. "Together with IBM, we aim to achieve a one petabyte storage solution and integrate it with the Grid that CERN is building to handle the extreme data challenges of the LHC project."

"CERN's scientists and colleagues want to be able to get to their data wherever it may be -- local or remote and regardless of which operating system on which it may reside," said Jai Menon, IBM Fellow at IBM's Almaden Research Center (San Jose, Calif.) and co-director of IBM's Storage Systems Institute joint program between IBM Research and the company's product division. "This is the perfect environment for us to enhance Storage Tank to meet the demanding requirements of large-scale Grid computing systems."

As part of the agreement, several leading storage management experts from IBM's Almaden and Haifa (Israel) Research Labs will work with the CERN openlab team. In addition, through its Shared University Research (SUR) program, IBM will supply CERN with the system's initial 20 terabytes of high-performance disk storage, a cluster of six eServer xSeries systems running Linux, and on-site engineering support and services by IBM Switzerland. The SUR award is valued at $1.5 million for the first year.

Storage Tank employs policy-based storage management to help lower its "total cost of ownership." Clustering and specialized protocols that detect network failures enable very high reliability and availability.

In this initiative, IBM is following a collaboration strategy initiated in 2001 with the European Union-sponsored European Data Grid project, which is also led by CERN.
