Feb 11, 2019 from Activeeon
How can we monitor Earth and its evolution? How can we record and forecast changes to better protect our planet? These are the challenges addressed by the CNES (the French space agency), which relies on data and images captured by satellites orbiting Earth, analyzes them, and makes them available to scientists and the public for free at https://peps.cnes.fr/rocket. It is a matter of public interest, backed by a cloud architecture built to cope with the volume of data processed and the challenges ahead. Let’s review the project, what happens behind the scenes, and why it matters.
Satellites that continuously monitor Earth and its evolution… and give you free access to their data: this is the main objective of Copernicus, a European Union program dedicated to Earth observation and monitoring. This “service of general European interest, comprehensive and freely accessible” was launched in 2011 and relies on 6 families of satellites, called Sentinels. Some of them monitor the Earth’s crust; others monitor the terrestrial and marine biosphere, or even the topography of the land.
“Every 4 days, satellites cover the sky. They record a massive set of information, geolocated and timed precisely to the second,” explained Jean-Pierre Gleyzes, Assistant Director of Digital Infrastructures, IT Scientific and Application Systems at CNES. “We are following the evolution of our planet as a whole and very precisely. The level of detail is amazing: we can see boats that are cleaning out, buildings destroyed by natural disasters, the temperature and composition of the oceans… It is really impressive.”
With all the information recorded and sent back by the Copernicus satellites, research teams around the world can use this data for free, without any restriction, to study Earth. The goal is to address current challenges such as urbanization, securing the food supply, sea-level rise and, of course, climate change.
It is therefore an ambitious, high-level program that calls for cutting-edge expertise in every area: Copernicus is coordinated by the European Commission, while the data and satellite images are produced by the European Space Agency (ESA).
What remains is distributing the information to everyone, for free: public institutions and research centers, but also private actors such as startups and labs. This is the purpose of the PEPS platform (Sentinel Products Exploitation Platform), run by the CNES, the French National Centre for Space Studies.
Today PEPS allows search, visualization, selection and download of images. Processing capabilities required for different use cases will soon be available on the platform.
“Our roadmap includes multiple objectives: centralize the satellite data to make it available to as many users as possible, promote application development, and enable SMEs/SMIs and partners to offer and run data processing,” specified Jean-Pierre Gleyzes.
There are various objectives and numerous challenges to tackle. The first is the size of each satellite image, which results in a massive and fast-growing volume of data. The program, which accumulated one petabyte over 20 years, has just generated 7 petabytes in 2 years: a 70-fold increase in the yearly rate, from 0.05 to 3.5 petabytes per year. On top of this already considerable dataset come 13 more terabytes… every day. This heavy data needs to be hosted and secured, while being made accessible quickly and safely.
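The factor of 70 compares yearly rates rather than total volumes. A minimal sketch of that arithmetic is given here, using only the figures quoted above; the variable names are illustrative, and the conversion of 1,000 terabytes per petabyte is an assumption (decimal units).

```python
# Figures quoted in the article. Names are illustrative, not from any CNES tool.
ARCHIVE_PB, ARCHIVE_YEARS = 1.0, 20   # historical archive: 1 PB over 20 years
RECENT_PB, RECENT_YEARS = 7.0, 2      # recent production: 7 PB over 2 years
DAILY_TB = 13.0                       # current intake: 13 TB per day
TB_PER_PB = 1000.0                    # assumption: decimal units


def yearly_rate(volume_pb: float, years: float) -> float:
    """Average volume produced per year, in petabytes."""
    return volume_pb / years


old_rate = yearly_rate(ARCHIVE_PB, ARCHIVE_YEARS)   # PB per year, historical
new_rate = yearly_rate(RECENT_PB, RECENT_YEARS)     # PB per year, recent
growth_factor = new_rate / old_rate                 # ratio of the two rates

# The daily intake expressed as a yearly volume, for comparison.
daily_intake_pb_per_year = DAILY_TB * 365 / TB_PER_PB

print(f"historical rate: {old_rate:.2f} PB/year")
print(f"recent rate:     {new_rate:.2f} PB/year")
print(f"growth factor:   {growth_factor:.0f}x")
print(f"daily intake:    {daily_intake_pb_per_year:.2f} PB/year")
```

Under these assumptions, the daily intake alone amounts to several petabytes per year, which is the same order of magnitude as the recent yearly rate.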
The second challenge: the massive amount of processing required. The raw satellite data have to be aggregated and reprocessed to be usable.
The CNES has launched a PoC (Proof of Concept) with Activeeon and Microsoft. The goal: deploy a very large computing capacity supported by services on premises and in the cloud. Within its cutting-edge datacenter, the CNES owns 8,000 processors for all of its missions. Hundreds are allocated to the PEPS program, but the needs are sometimes higher, as explains Erwann Poupart, a member of Architecture and IT Systems: “The flow of data is so large… We are on a specific use case, with some very large volumes and a requirement to run a processing chain on all of that data. We benefit from the internal infrastructure, but we were looking to launch the same processing chains in the cloud, in a hybrid form, in order to benefit from additional capacity.”
“It is a very interesting use case, and one that raises a lot of hope,” added Jean-Pierre Gleyzes. “The quantity of data that we are processing is growing to a point where we will never have the required infrastructure internally.”
Storage capacity and data processing are then distributed as effectively as possible between the CNES premises and the Microsoft Azure cloud. The ProActive solution, developed by Activeeon, embeds a scheduler, an orchestrator and a resource manager to seamlessly allocate resources across on-premises and cloud infrastructures. On the Azure side, it provides access to large storage capacity and virtually unlimited processing power, along with algorithms and the large set of APIs required. This breakthrough is already having an impact on the productivity of CNES engineers, as Erwann Poupart confirms: “Now we can run the same processing chains locally and in the cloud. The main advantage is that the results and calculations are identical on-premises and in the Azure cloud. It is very important for us, because we can now substantially increase our computing and data processing capacity.”
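To make the hybrid idea concrete, here is a minimal sketch of one possible placement policy: fill the fixed on-premises pool first, then overflow to an elastic cloud pool. It is not ProActive's API or the actual CNES configuration; the `Pool` class, the `dispatch` function, the pool names and the task names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Pool:
    """A hypothetical pool of compute slots (not a ProActive object)."""
    name: str
    capacity: Optional[int]  # None models an elastic pool with no fixed limit
    assigned: List[str] = field(default_factory=list)

    def has_room(self) -> bool:
        return self.capacity is None or len(self.assigned) < self.capacity


def dispatch(tasks: List[str], on_prem: Pool, cloud: Pool) -> Dict[str, List[str]]:
    """Place each task on premises while slots remain, otherwise in the cloud."""
    for task in tasks:
        target = on_prem if on_prem.has_room() else cloud
        target.assigned.append(task)
    return {on_prem.name: list(on_prem.assigned), cloud.name: list(cloud.assigned)}


# Illustrative run: 3 on-premises slots, 5 image tiles to process.
on_prem = Pool("on_prem", capacity=3)
cloud = Pool("azure", capacity=None)
placement = dispatch([f"tile-{i}" for i in range(5)], on_prem, cloud)
print(placement)
```

A real scheduler would also weigh data locality, cost and task priority; this sketch only shows the overflow logic that lets the same processing chain run in either location.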
A breakthrough that delights Francois Tournesac of Activeeon, the company that carried out the development of this project: “What is very interesting here is the breadth of the offering and the openness of Microsoft Azure. Thanks to its unique capacity, the CNES can now test its processing chains on a wide variety of servers, which enables it to optimize its development. The potential is now limitless.”
“This pilot project has been positive for us,” added Jean-Pierre Gleyzes. “From a technical point of view, it enabled us to host the entire dataset and provided processing power, reproducibility and transparency of results. The elasticity and scalability of the cloud also convinced us on the deployment side.”
With those two key points settled (hosting and data processing), the CNES can now dedicate the second half of the project to offering third-party users new ways to work with the data.
“This storage and computing capacity is almost unlimited, with great flexibility and tight security, and it is transparent for the user… This is magic. And it makes us want to continue,” said a pleased Jean-Pierre Gleyzes. “Let’s now move to the next step of the project: encouraging institutions and public or private organizations to develop applications that use this data and extract value from it.”
Copied and translated from Microsoft blog
Apr 25, 2019 from Nicolas Narbais
Auto ML theories are gaining popularity and multiple solutions are gaining traction in the open source community. However, there is still a large gap between theory and practice. Let's identify some of the challenges....