ENCCS/HiDALGO Workshop on High-performance Data Analytics
April 27 @ 09:00 - 17:00 CEST
Free
General description and learning outcomes
We invite you to participate in our high-performance data analytics course, where we will introduce tools and methods for handling big data. The tools will be demonstrated on two use cases but can be applied to other data as well.
HiDALGO (https://hidalgo-project.eu/) – HPC and Big Data Technologies for Global Systems – is a European project funded by the Horizon 2020 Framework Programme of the European Union. The project is carried out by 13 institutions from seven countries.
This training event will start with an introductory talk that presents high-performance data analytics (HPDA) from the HiDALGO perspective. It will cover the main concepts, list the tools that have been used, and report benchmarks run by the consortium as a source of information about the tools' scalability. The introduction will also show how these tools are being applied in HiDALGO to solve different problems.
The following part of the training will focus on HPC and HPDA technologies, applied to use cases such as Urban Air Pollution (UAP). The UAP application is a software framework for modeling air pollution emitted by vehicular traffic and its dispersion at very high resolution. It combines geometry inputs (OpenStreetMap), coupled weather data (ECMWF), traffic simulation (SUMO), computational fluid dynamics (CFD) tools running on HPC infrastructures (OpenFOAM), and evaluation with HPDA methods.
This HPC/HPDA/UAP part of the training will introduce the UAP concept, its workflows and implementation, the application of the CFD module in an HPC environment, deployment to and execution on HPC systems, and HPDA for evaluation and model order reduction. Participants will learn these techniques from a general perspective, namely HPC workflow modeling (TOSCA in YAML rendering), the basics of OpenFOAM for computing air pollutant dispersion on HPC, and the applied HPDA methods for fast evaluation and model reduction (proper orthogonal decomposition, POD, computed with the singular value decomposition, SVD).
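For readers who have not met POD before, the idea can be illustrated in a few lines of Python. This is only a minimal sketch under stated assumptions, not the HiDALGO implementation: it assumes NumPy is available, uses a small synthetic field instead of OpenFOAM output, and the function names `pod_basis` and `reconstruct` as well as the 99% energy threshold are hypothetical choices made for this illustration.

```python
import numpy as np


def pod_basis(snapshots, energy=0.99):
    """Compute a truncated POD basis from a snapshot matrix.

    snapshots: array of shape (n_points, n_snapshots), one column per time step.
    energy: fraction of the summed squared singular values to retain
            (an illustrative default, not a recommended value).
    """
    # POD is applied to fluctuations around the time-averaged field.
    mean = snapshots.mean(axis=1, keepdims=True)
    fluctuations = snapshots - mean
    # Thin SVD: the columns of u are the POD modes, s holds the singular values.
    u, s, _ = np.linalg.svd(fluctuations, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # Smallest number of modes whose cumulative energy reaches the threshold.
    rank = min(int(np.searchsorted(cumulative, energy)) + 1, len(s))
    return mean, u[:, :rank], s[:rank]


def reconstruct(snapshot, mean, modes):
    """Project one snapshot (column vector) onto the modes and map it back."""
    coefficients = modes.T @ (snapshot - mean)
    return mean + modes @ coefficients


# Synthetic stand-in for simulation output: two spatial patterns
# with time-dependent amplitudes on top of a constant background.
x = np.linspace(0.0, 1.0, 200)
t = np.linspace(0.0, 1.0, 50)
field = (
    3.0
    + np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * t))
    + 0.5 * np.outer(np.sin(4 * np.pi * x), np.sin(2 * np.pi * t))
)

mean, modes, singular_values = pod_basis(field)
approximation = reconstruct(field[:, [0]], mean, modes)
print("retained modes:", modes.shape[1])
print("max reconstruction error:", np.abs(approximation - field[:, [0]]).max())
```

The reduced model stores only the mean field and the retained modes, so each snapshot is represented by a handful of coefficients rather than one value per grid point.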
The last part will introduce the data available at ECMWF and Copernicus and the APIs for retrieving it, followed by practical sessions on data exploration and manipulation. After this web seminar, participants will be able to independently discover weather, climate, and environmental data produced and hosted by ECMWF, and to retrieve and process these data using Python libraries.
The hands-on part will be carried out using the PSNC (https://www.psnc.pl/) training cluster.
Who the workshop is for
The workshop is aimed at researchers, practitioners, and developers who are interested in implementing HPC workflows and HPDA, and at environmental scientists who would like to apply microscale models and exploit the strength of HPC easily.
The part provided by ECMWF is aimed in particular at researchers who would like to use weather, climate, or environmental data in their work.
- Basic knowledge of big data technologies is recommended but not mandatory.
- Participants need basic knowledge of the Linux CLI for the developer parts of the UAP training. For the application side, basic environmental knowledge related to air pollution is helpful but not mandatory.
- For the weather-related part, participants are expected to have basic Python knowledge and be comfortable using Jupyter Notebooks. They will have the option to either use the mybinder.org platform or work locally. The GitHub repository with the notebooks and a list of libraries will be provided before the workshop.
Zoltán Horváth, Ákos Kovács, László Környei, Mátyás Constans, Széchenyi István University, Győr
Milana Vuckovic, European Centre for Medium-Range Weather Forecasts
Tentative schedule (CEST)
- Introduction to HiDALGO and HPDA – 30 minutes
- The global challenge. Requirements and workflows for the Urban Air Pollution application.
- Workflow modelling (TOSCA), orchestration (Croupier), data store (CKAN), and running from a web interface. Hands-on session.
- Running UAP in the Linux CLI with OpenFOAM using HPC. Hands-on session.
- HPDA and visualization of the computational results. Hands-on session.
Total: 4 hours.
- Introduction to ECMWF and Copernicus data
- Introduction to APIs for retrieving the data
- Processing and visualising meteorological data
- Examples of using weather data in HiDALGO applications
Total: 2 hours.
You can register at https://events.prace-ri.eu/event/1193/registrations/868/.