Halls of Power: Inside the Large Hadron Collider's Incredible Data Center

The Large Hadron Collider is justifiably famous for its incredible discoveries and frontier technology, but the computer grid that stores, processes and shares all its collected data is no less impressive.

posted on 06/20/2013
Yohani Kamarudin
Scribol Staff

CERN computer center corridor
Image: Andrew Strickland/© 2010 CERN
A corridor of CERN’s computing center

Whenever there’s mention of CERN, or the European Organization for Nuclear Research, we’re used to hearing about mind-blowing discoveries. Experiments conducted in its Large Hadron Collider (LHC) use state-of-the-art technology seen nowhere else on Earth. And the knowledge gained from these experiments changes the way we think about our universe and the forces it contains. All of that knowledge, however, rests on data – which makes storing and processing it a crucial task, and one that requires some complex technology of its own.

CERN computer center fisheye
Image: Andrew Strickland/© 2010 CERN
Technology can be beautiful as well as smart.

The amount of raw data being generated at CERN is astonishing. The LHC alone produces around 15 petabytes (15,360 terabytes) of information a year, the product of up to 600 million particle collisions taking place every second. As you might expect, no single computing center – not even CERN’s – could process that much information on its own. Hence, since 2002, the data has been distributed across a worldwide grid of computer centers and made available to more than 8,000 physicists. This grid is known as the Worldwide LHC Computing Grid (WLCG).
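
To put those figures in perspective, here’s a quick back-of-the-envelope calculation in Python. The inputs are the numbers quoted above; the per-collision average is just an illustration, not an official CERN figure:

```python
# Rough numbers based on the figures quoted above; the "bytes per
# collision" average is an illustration, not an official CERN figure.

annual_data_pb = 15                       # ~15 petabytes recorded per year
annual_data_tb = annual_data_pb * 1024    # binary units: 15 PB = 15,360 TB

collisions_per_sec = 600_000_000
collisions_per_year = collisions_per_sec * 365 * 24 * 3600

# Average storage per collision if every single one were kept:
bytes_per_collision = annual_data_pb * 1024**5 / collisions_per_year

print(f"{annual_data_tb:,} TB/year, {collisions_per_year:.2e} collisions/year")
print(f"=> only ~{bytes_per_collision:.1f} bytes per collision on average")
```

Less than a byte per collision, on average – a hint at why the experiments can only ever keep a small fraction of what actually happens in the detectors.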

Blue lit corridor at CERN computer center
Image: Roger Claus/© 2010 CERN
A blue-lit corridor of CERN’s computing center

When CERN was formed in 1954, there was no centralized computer system in place. Methods of recording data were wildly varied, and there was no technology capable of sharing information easily – the World Wide Web, famously invented at CERN, was still decades away. In those early years, records were rarely digital; indeed, they existed mostly as typewritten reports on paper.

Black and white corridor at CERN computing
Image: Maximillien Brice and Anna Pantelia /© 2012 CERN
To facilitate cooling, the massive racks of processors are arranged with their cool sides facing one another in “cold aisles” and their hot sides likewise facing one another in “hot aisles.”

This rather chaotic state of affairs continued until the mid-‘70s. Then in 1976, CERN introduced CERNET. Using this system, data could be transferred between different mainframes and smaller computers over 2 Mbit/s serial lines, employing packet switching in a gateway-node network. Confused? Never mind. Anyway, CERNET operated for a decade, and by the end it had its own type of LAN bridge, which connected early CERN Ethernets. None of this allowed for long-distance communication, however. A better system was needed.

Purple lights in the CERN computer center
Image: Roger Claus/© 2010 CERN
Some racks in the data center only have a handful of machines inside them, as CERN doesn’t have the cooling capacity for more.

In the early ‘80s, a satellite channel was used to connect two local area networks for the first time. In 1983, Ethernet was introduced. And from 1984 to 1985, TCP/IP protocols were established at CERN. Their gradual adoption paved the way for huge changes in how CERN – and eventually the entire world – would communicate.

Looking down a corridor at the CERN computer center
Image: Maximilien Brice/© 2012 CERN
Another corridor of impressive computing power

CERN opened its first external Internet connections in 1989. That same year, Tim Berners-Lee – building on ENQUIRE, a hypertext system he had prototyped at CERN back in 1980 – proposed a project for sharing information between researchers using hypertext, and fellow computer scientist Robert Cailliau joined him in 1990. In 1991, the project put the first website online. The result was the World Wide Web, which CERN declared free to all users on April 30, 1993.

Green and blue lights
Image: Roger Claus/© 2010 CERN
Most of the machines are powered through an uninterruptible power supply (UPS), which can switch between the French and Swiss power grids if necessary.

Although the web is a monumental contribution to the world of computing, it’s not the only one CERN has been involved with. These days, the organization is also a pioneer in the field of grid computing. As of 2012, the WLCG encompassed more than 170 computer centers across 36 different countries. The system is divided into four tiers – 0, 1, 2 and 3 – with each offering a particular set of services.

Close up in CERN computer center
Photo: Guido Giudetti/© 2010 CERN
A copy of all the data generated by the LHC is preserved for the collider’s entire lifespan.

The first level, tier 0, is CERN’s own Data Centre, which is responsible for storing all the raw data produced by the LHC. The Data Centre also begins the analysis process by reconstructing the raw collision data into a more useful form. The data then travels, at speeds of 10 gigabits a second, over a dedicated fiber-optic network to 11 tier 1 centers located across the globe. This network is known as the LHC Optical Private Network (LHCOPN).
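
How fast is 10 gigabits a second in practice? A rough sketch, using the article’s annual figure and decimal units to keep the sums simple:

```python
# How long does a day's worth of LHC data take to cross a 10 Gb/s link?
# Decimal units (1 TB = 1,000 GB, 1 GB = 8 Gb) keep the sums simple.

link_gbps = 10                         # LHCOPN link speed quoted above
tb_per_year = 15_000                   # ~15 PB/year, in decimal terabytes

gb_per_day = tb_per_year * 1_000 / 365        # ~41,000 GB each day
hours_needed = gb_per_day * 8 / link_gbps / 3600

print(f"~{gb_per_day:,.0f} GB/day takes ~{hours_needed:.1f} h at {link_gbps} Gb/s")
```

Around nine hours of continuous transfer per day of data – and that’s over a single link; in practice the load is spread across connections to all 11 tier 1 centers.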

Corridor in CERN computer center
Photo: Guido Giudetti/© 2010 CERN
There are 14,900 processors with a staggering 70,000 cores at CERN’s computing center.

At the tier 1 centers, the information from the Data Centre is reprocessed and stored on tape, with replicas of the raw data also kept at this stage. Tape is used for long-term storage because it lasts longer than disk; it’s also more economical, costing less per byte and consuming less power. Researchers can, however, access the latest data on disk servers.

An automated magnetic tape vault
Image: Claudia Marcelloni and Maximilien Brice/© 2008 CERN
The magnetic storage tapes are handled by robotic arms.

Next, the tier 1 centers output the processed information to some 140 tier 2 centers. These generally consist of universities and scientific institutions that connect to the tier 1 sites via standard networks. Here, the data is again stored, processed and analyzed. This is the last level at which there is formal engagement with the WLCG. After passing through the tier 2 centers, the data is made available to educational faculties and individuals via tier 3.
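The full journey through the tiers can be summed up in a toy trace. The tier roles and counts below come from the article; the code itself is purely illustrative and has nothing to do with real WLCG software:

```python
# A toy trace of the WLCG tier hierarchy described above. Tier roles
# and counts come from the article; the code is purely illustrative.

TIERS = [
    (0, "CERN Data Centre", "store all raw data, first-pass reconstruction"),
    (1, "11 regional centres", "reprocess and keep tape replicas"),
    (2, "~140 institutes", "store, process and analyse"),
    (3, "end users", "access for faculties and individuals"),
]

def trace(dataset):
    """List the hops a dataset makes on its way down the grid."""
    return [f"{dataset} -> tier {n} ({who}): {role}" for n, who, role in TIERS]

for hop in trace("collision events"):
    print(hop)
```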

CERN computer center
Image: Szilard Pataki/© 2010 CERN
Cold air is channeled into the data center using giant blue pipes that stretch from the roof to the floor.

As the information gathered from the LHC’s experiments grows, so too does the grid. In June 2013, a new tier 0 facility was opened in Budapest, Hungary. Installed at the Wigner Research Centre for Physics, the new site is linked to CERN’s Data Centre by connections capable of transmitting 100 gigabits a second. Remarkably, the Wigner Centre operates as though it were part of the Data Centre at CERN itself – even though it’s located some 600 miles away. The extra data center will boost CERN’s processing capacity by around 70 percent.
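
The jump from the 10 Gb/s LHCOPN links to the Wigner Centre’s 100 Gb/s connections is easy to quantify. A small hypothetical helper (decimal units; real-world throughput would be somewhat lower):

```python
# Idealised transfer time for one petabyte at the two link speeds
# mentioned in the article (decimal units; real throughput is lower).

def hours_to_move(petabytes, link_gbps):
    """Hours to push `petabytes` of data over a `link_gbps` link."""
    bits = petabytes * 1e15 * 8
    return bits / (link_gbps * 1e9) / 3600

print(f"1 PB at  10 Gb/s: ~{hours_to_move(1, 10):.0f} hours")
print(f"1 PB at 100 Gb/s: ~{hours_to_move(1, 100):.0f} hours")
```

Roughly nine days shrinks to under one – a tenfold speed-up, exactly as the raw link speeds suggest.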

A dark section of the CERN computer center
Image: Guido Giudetti/© 2010 CERN
This vault looks like something out of Star Wars.

Through its tiered grid system, CERN allows more than 8,000 physicists around the world to access its collected data practically in real time. This system is a technological feat almost as impressive as the LHC itself. With CERN continuing to break new ground in the field of computing and push the boundaries of science, who knows what world-changing computer technology the organization will come up with next?

