ALCF’s Holohan plays a critical role in maintaining and optimizing Argonne’s high performance computing systems, including the new Aurora supercomputer.
Carissa Holohan has loved computers ever since she was a little kid. She spent her childhood tinkering with broken electronics and earned a ham radio license by age 12. These interests inspired her to study telecommunications and network technologies at Virginia Tech, where she had an opportunity to work on the network connecting the university’s high performance computing (HPC) cluster system.
After graduating with a degree in computer engineering in 2003, Holohan worked as a networking and telecommunications engineer at Virginia Tech. Then in April 2010, she took a job at the U.S. Department of Energy’s (DOE) Argonne National Laboratory.
“When an opportunity presented itself with the Argonne Leadership Computing Facility, I was very interested. It was a chance to move to a bigger and faster system that was larger scale than what I could do at an individual university,” she said.
Now, Holohan is the principal HPC network architect for the Argonne Leadership Computing Facility (ALCF), a DOE Office of Science user facility. She spends her days maintaining the networks that connect the lab’s major HPC systems, which she defines as systems that allow for extremely rapid communication, low latencies and interdependent calculations between individual computers within a supercomputing system.
“My day-to-day can vary wildly depending on how well behaved the computers are,” Holohan said. As the person responsible for the network architecture of the supercomputing center, she must ensure that the ALCF’s HPC systems are effectively communicating internally, with one another and with external supercomputing centers. “Currently, my work is mostly focused on monitoring, maintaining and optimizing performance of the Aurora project.”
Aurora is Argonne’s newest HPC system, an exascale machine capable of more than a quintillion calculations per second and far more powerful than previous-generation supercomputers. “Our scientists are just now discovering how much they can achieve with this level of computation,” Holohan noted. “The scale is mind-boggling when you consider how far we’ve come from what our supercomputers used to be. An individual compute node on this machine is as powerful as a whole row of a machine from 10 years ago.”
Currently, Holohan and her team are figuring out how to stabilize and scale up operations of the Aurora system. This includes making sure data travels efficiently into and across the system’s 10,624 nodes. In a data center that spans 40,000 square feet, distance itself can be a problem.
“Our data center is so big, the speed of light is too slow,” she explained. “We have to measure some latencies in nanoseconds. In fiber optic cable, light moves about two-thirds as fast as it does in air. To get from one end of the room to the other, that means it takes about 500 nanoseconds for data to cross the floor: that’s an eternity for a supercomputer as fast as Aurora. Part of my job is to optimize that so Aurora isn’t waiting around for data.”
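Holohan’s figure can be checked with a back-of-the-envelope calculation. The short Python sketch below is illustrative only: it treats the speed of light in air as roughly equal to its speed in a vacuum, takes the two-thirds factor from her description, and assumes a hypothetical 100-meter cable run, a length that does not come from Argonne. The function names are likewise made up for this example.

```python
# Rough propagation-delay estimate for light in optical fiber.
# Assumptions (not from the article): the speed of light in air is treated
# as equal to its speed in a vacuum, and the 100 m cable run is hypothetical.

C_VACUUM_M_PER_S = 299_792_458      # speed of light in a vacuum, meters/second
FIBER_VELOCITY_FACTOR = 2 / 3       # "about two-thirds as fast", per the quote


def fiber_latency_ns(length_m, velocity_factor=FIBER_VELOCITY_FACTOR):
    """Return the one-way propagation delay in nanoseconds for a fiber run."""
    speed_in_fiber = C_VACUUM_M_PER_S * velocity_factor
    return length_m / speed_in_fiber * 1e9


def fiber_length_for_latency_m(latency_ns, velocity_factor=FIBER_VELOCITY_FACTOR):
    """Return the fiber length in meters that produces a given one-way delay."""
    speed_in_fiber = C_VACUUM_M_PER_S * velocity_factor
    return latency_ns * 1e-9 * speed_in_fiber


if __name__ == "__main__":
    # Delay across a hypothetical 100 m run of fiber.
    print(f"{fiber_latency_ns(100):.1f} ns")
    # Length of fiber implied by the 500 ns figure quoted above.
    print(f"{fiber_length_for_latency_m(500):.1f} m")
```

Under these assumptions, light covers about one meter of fiber every five nanoseconds, so a delay of 500 nanoseconds corresponds to roughly 100 meters of cable. That is a plausible length once a run is routed across a room of this size rather than strung in a straight line.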
System optimization is a team effort, and many other components are put to the test by the ALCF. “We joke that if a vendor or a manufacturer of computer hardware truly wants to stress test their equipment, they should send it to the ALCF because our users will push it to the limit,” Holohan said. “On the Mira supercomputing system, it got to the point where we could tell who was running on the system by the noises that the power supplies in the computer would make because some of our users had written code that was so incredibly optimized that the power supplies would whine and squeal.”
After 14 years at Argonne, Holohan continues to be inspired not only by the innovations in networking technologies, but also by the broader impacts of her work. “I care about supporting open science. It’s really rewarding to see the impact that we have at Argonne and how many researchers from different scientific fields are taking advantage of the work that we do,” she said.
Holohan also expressed appreciation for her colleagues at the ALCF. “I work with some of the absolute best people I could ever hope to work with. My team lead is one of my best friends.” Outside of work, the ALCF team often goes out to the movies or plays games. Holohan is also a fan of Japanese animation, or anime. She takes pride in her massive manga collection and enjoys attending anime conventions. “I also have a number of friends who work in the voice acting industry,” she said. “In fact, I’ve written a number of ghost stories for some of their horror podcasts.”