By Simone Ulmer, CSCS.
The Square Kilometre Array Observatory (SKAO), currently under construction in South Africa and Australia, will generate 600 petabytes of astronomical data annually once operational. Both sites will have a high-performance computing (HPC) data centre to correlate and collect the data. In order to be accessible to the consortium’s researchers worldwide, this enormous amount of data will be distributed to different computing centres around the globe to be stored. CSCS will be one of these centres to receive some of the data, which it will make available for analysis. Pablo Fernandez, Service and Business Manager at CSCS, and computational scientist Victor Holanda Rusu are significantly involved in the project. In this interview, they offer insight into the challenges of building such a platform and connecting it to other HPC sites around the world.
Why and how is CSCS part of the SKAO/SKA Switzerland Consortium (SKACH)?
Pablo: When Switzerland joined the consortium at the beginning of the year, the door was opened to become a contributing member. Switzerland has already been involved in the SKA project over the past five years, gaining observer status in the SKA Organisation in 2016. There was a strong push for doing that from the research organizations in Switzerland to the government. CSCS formally joined the discussions in 2020, and then started drafting the proposal for funding that started in 2022. Our contribution is mostly on the platform side and on the infrastructure side. Besides the HPC know-how CSCS has and our ability to contribute to scientific codes and libraries, we have a lot of knowledge on how to develop and operate these sorts of platforms because we are members of the Worldwide LHC Computing Grid (WLCG), the computing grid of the Large Hadron Collider, which is very good at moving data. SKA is going to use some of the components that are coming from there. We are also part of the Cherenkov Telescope Array (CTA) collaboration. In Switzerland, the government has decided to put SKA and CTA together so that we can also find synergies between these two.
What are SKACH’s expectations of CSCS in the project?
Pablo: Our task is to contribute to the construction of the Swiss SKA Regional Centre (SRC). The Swiss SRC is this data centre — and I quote, “data centre” — because this is not just about hardware that is sitting in some machine room. It is also the software that lies on top that allows one centre to connect to other data centres, to do the data movement, to do the authentication, the simulations, the analyses, etc. The expectations are mostly on the infrastructure, but also on the platform, the software development, helping them run in different places, and getting access to more resources than just what CSCS can provide, such as in PRACE, EuroHPC and other calls.
Victor: SKACH expects us to contribute to the development of the Swiss SKA Regional Centre by providing the expertise that is unique to CSCS. As the Swiss National Supercomputing Centre, our expertise goes beyond the technical aspects of how to develop and operate such research infrastructure; it also includes handling and navigating international research collaborations that require research infrastructure to be developed and operated. Furthermore, we can leverage our working relationships and experience dealing with the different areas of science to help define the HPC requirements and resolve needs of the Swiss SKA community.
How free are CSCS and the other Swiss institutions that are part of the collaboration in the development of this platform?
Pablo: One of the key partnerships for CSCS is with EPFL and all the other partner institutions in Switzerland in defining what they need for the platform. Another key partnership is the international collaboration itself. It is an international team that is working on defining how the platform should look, because we cannot just do whatever we want together with EPFL. We must be able to connect to all the other sites to move data around. In a way we are free to develop some solutions that are interesting to Switzerland only, but we will also be working under the framework of the so-called SRCnet. It is understood that each of these SRCs will be unique in a certain way, so we will have a lot of freedom, but this heterogeneity makes the definition even more difficult. The details are still unclear since we don’t know yet how the user and resource management will work.
Victor: We are free to choose solutions that fit the expectations, but at the same time, we need to interact with the international partners in order to define the minimal set of features that all the SRC sites have to abide by to be part of the SRC network. On top of the minimal requirements, there is some freedom in the platform definitions.
Is there a unique contribution to SKA from CSCS?
Pablo: We have the specific task of giving the Swiss scientists an edge when it comes to running mostly simulations, but also analysis, at scale. We participate in the Platform for Advanced Scientific Computing (PASC), developing libraries and codes together with the scientists in order to make them scalable for SKA data processing. The people in the PASC projects “Next-Generation Radio Interferometry” and “SPH-EXA2” are doing amazing work over there. The goal there, though, is not to target just Switzerland. We want to give Swiss scientists the possibility to scale up and run their simulations anywhere in the world, so we help them get all these libraries prepared. The two unique ways CSCS supports SKA are that we have our flexible supercomputing infrastructure ‘Alps’, and we have the know-how on writing scientific libraries. On ‘Alps’ we can create so-called vClusters, one of which will be an instantiation of a cluster for SKACH inside of Alps.
Victor: Besides the software development, we are also helping shape the Swiss SRC, because we got the mandate from the State Secretariat for Education, Research and Innovation (SERI) to find synergies between CTA and SKA. EPFL and the scientists are leading the project needs because it’s a science project, but we are helping them to define their priorities in the sense of the compute, storage, and functionalities. We also take part in the discussions between Switzerland and the SKA international consortium, and we are also helping them test different algorithms, techniques, and technologies. We are experimenting with things that haven’t been done before or have only been done in different contexts. For example, one of the things we are currently experimenting with is to attach a different storage backend to the data distribution system currently being tested by the SKA consortium.
When SKAO is fully functional, will CSCS simply provide computing power and storage?
Victor: We actually don’t know. My feeling is that this consortium will evolve, because we still have seven or eight years to go until we have all the telescopes fully functioning. Until then, there will be new science theories and ideas, new technologies on the floor, and new software to be developed. We will support the Swiss government and our scientists in engaging with the consortium and, to the best of our abilities, be ready for their future needs. Maybe we will increase our engagement beyond the current level, be it with our research infrastructure, in the development of new software, or maybe by trying to figure out new mechanisms to keep the entire SRCnet secure.
What are the big challenges?
Pablo: The data movement part is quite impressive. We have a huge infrastructure, so we’re talking about huge amounts of data. The network endpoint connection is also a challenge.
Victor: The challenge is not only to transfer this data, but also to make it all available to scientists and provide them with an environment where they can do science. There will be different data science products being generated at the different SRCs. Some of those depend on the raw telescope data, so we also need to support, to a certain extent, the different pipelines that run to the Science Data Processing centres that are connected to the telescopes. At the same time, the entire workflow must be secure to guarantee confidentiality, integrity, and availability of the data and the science products generated. Furthermore, we eventually also have to optimize this entire workflow using different criteria, such as energy efficiency, time to solution, and so forth. There are a lot of challenges.
What is each of your roles in SKACH?
Victor: I’m the project manager for the collaboration inside CSCS. And on SKACH, I’m the chair of the Computing Platform and Infrastructure program. So, I have two hats.
Pablo: I am a member of the board. The SKACH collaboration has one member per institution, and I am representing CSCS inside the collaboration in Switzerland.
How does this look concretely in your daily work?
Victor: For me, it is a lot of meetings, a lot of traveling, a lot of reports and coordination, internal and external. As chair of the computing platform, I have to attend both the SKACH and SKA meetings.
Pablo: I meet with the board once per month, either physically or virtually. Victor is certainly more involved in the day-to-day. I spend time with Victor every week following up on the different things that are happening, and there are some things that he needs help with on the management side. I’m in touch mainly with the professors in the collaboration. At the moment, I am preparing for the next funding cycle, which runs from 2025 to 2028.
What makes SKA exciting and interesting for you personally?
Victor: So many things! It’s challenging from a technical point of view. You have to look to the future. But you also deal with people, and this is the most exciting part. One has to talk to people from different cultures and with different educational backgrounds. I have to attend meetings with a diverse set of scientists and engineers, all at different stages in their careers. This is a very good experience for me. I’m learning a lot. I love the project! Furthermore, the science aspect is very important to me, as I earned a PhD in chemistry and have lived a bit in the science universe. This project reminds me why I studied chemistry in the first place and why I love to work at CSCS. Astronomy is considered by some to be the area that has the biggest impact in creating the initial spark of interest that leads students to become scientists. The idea that you are contributing to or enabling new scientific discoveries can potentially increase people’s interest in science, and to be a part of a large project that has the potential to impact all of mankind is thrilling. It’s easy to wake up in the morning and go to work.
Pablo: For me, I think it’s really from the heart. I really follow and like astronomy. I have a telescope, and looking at the universe is, for me personally, deeply interesting. Creating this astronomy center for Switzerland — I’m super proud that I’m part of the driving wheel that is making this possible. This is pretty exciting!