Task Area 2: (Meta)data Repositories and Catalogues

Summary

The goal of Task Area 2 (TA2) is to implement interlinked data sets and catalogs that fully comply with the FAIR principles, at all participating neutron and synchrotron radiation sources as well as at universities and other research institutions. Data repositories must be carefully curated, and their structure continuously developed, in order to sustainably ensure and optimize the findability, reuse and extension of the data for the national and international community of scientists working with X-rays and neutrons. Besides raw data, the repositories will include all processing and analysis steps from raw data to the final result, together with a detailed description of the sample. This will raise the transparency, and thus the quality, trustworthiness and reusability, of the data sets shown in publications to a new level.

Challenges and goals of TA2

Initially, TA2 will be primarily concerned with the collection of raw (meta)data assets and the integration of processed and published data. This requires standardization of data and metadata (with TA1) and linkage to software assets (with TA3). Well-defined interfaces and common standards for the vocabulary used will need to be created in order to link the inventories that relate to the different phases of the analysis process, through to publication. It should become possible to enrich the raw data after an analysis with further metadata as well as with the evaluated data sets, newly derived data, complementary data from experiments, simulations and theoretical models, details on the properties and quality of the sample, and links to further information. In addition to the classical metadata search for finding measurement data, the implementation of (AI) search algorithms that find measurement data based on an experimental or simulated data set is also envisaged.
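
As an illustration of searching by data rather than by metadata, the following minimal sketch ranks catalog entries by their similarity to a query curve. It is not the method planned in TA2: plain cosine similarity stands in for the (AI) search algorithms mentioned above, and the dataset identifiers, the example curves and the function names are all hypothetical. It also assumes that every curve is sampled on the same grid.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two curves sampled on the same grid."""
    if len(a) != len(b):
        raise ValueError("curves must be sampled on the same grid")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero curve matches nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, catalog, top_k=3):
    """Return up to top_k (dataset_id, score) pairs, best match first."""
    scored = [
        (cosine_similarity(query, curve), dataset_id)
        for dataset_id, curve in catalog.items()
    ]
    scored.sort(reverse=True)
    return [(dataset_id, score) for score, dataset_id in scored[:top_k]]


# Hypothetical catalog: dataset identifier -> measured intensity curve.
catalog = {
    "dataset-A": [0.0, 1.0, 4.0, 1.0, 0.0],
    "dataset-B": [4.0, 1.0, 0.0, 0.0, 0.0],
    "dataset-C": [0.0, 3.0, 3.0, 3.0, 0.0],
}

# An experimental or simulated curve used as the search query.
query = [0.0, 1.1, 3.9, 0.9, 0.0]

for dataset_id, score in rank_by_similarity(query, catalog):
    print(f"{dataset_id}: {score:.3f}")
```

A real implementation would need to handle differing grids, noise and normalization, and would likely use learned representations rather than raw curves.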

In particular, the goals of TA2 are:

  • Implementation and further development of data sets and catalogs
  • (Meta)data standardization and sample identification (sample PID)
  • Enabling additional (meta)data to be added to the inventory/catalogue
  • Implementation of innovative algorithms for efficient data search in real time
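
To make the second and third goals more concrete, the sketch below shows one possible way a catalog record could carry a sample PID, accept additional metadata after the measurement, and link derived data back to the raw data. All class, field and identifier names are hypothetical; no DAPHNE4NFDI schema is implied.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogRecord:
    dataset_id: str
    sample_pid: str  # persistent identifier of the measured sample
    technique: str
    metadata: dict = field(default_factory=dict)
    derived_from: list = field(default_factory=list)  # parent dataset ids


class Catalog:
    def __init__(self):
        self._records = {}

    def add(self, record):
        # A derived data set may only reference parents that are registered.
        for parent in record.derived_from:
            if parent not in self._records:
                raise KeyError(f"unknown parent dataset: {parent}")
        self._records[record.dataset_id] = record

    def get(self, dataset_id):
        return self._records[dataset_id]

    def enrich(self, dataset_id, extra):
        """Attach additional metadata to an existing record."""
        self._records[dataset_id].metadata.update(extra)

    def by_sample(self, sample_pid):
        """All dataset ids that refer to the same sample PID."""
        return [r.dataset_id for r in self._records.values()
                if r.sample_pid == sample_pid]

    def provenance(self, dataset_id):
        """All ancestors of a data set, nearest first."""
        ancestors = []
        queue = list(self._records[dataset_id].derived_from)
        while queue:
            parent = queue.pop(0)
            if parent not in ancestors:
                ancestors.append(parent)
                queue.extend(self._records[parent].derived_from)
        return ancestors


# Hypothetical identifiers throughout.
catalog = Catalog()
pid = "sample-pid-0001"
catalog.add(CatalogRecord("raw-001", pid, "XAS"))
catalog.add(CatalogRecord("reduced-001", pid, "XAS", derived_from=["raw-001"]))
catalog.add(CatalogRecord("fit-001", pid, "XAS", derived_from=["reduced-001"]))

# Enrich the raw data with information that became known after analysis.
catalog.enrich("raw-001", {"sample_quality": "single crystal"})

print(catalog.provenance("fit-001"))
print(catalog.by_sample(pid))
```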

DAPHNE4NFDI will make a significant contribution to establishing standards for data acquisition and storage, as well as to building well-maintained data repositories. While TA1 is mainly concerned with the automated acquisition of (meta)data, TA2 is dedicated to the "Findable", "Accessible" and "Reusable" aspects of the FAIR principles.

Experience and expertise

TA2 draws on the expertise of participating groups at X-ray and neutron research centers, universities and other research institutions. DAPHNE4NFDI will benefit from the involvement of some partners in the EU-funded PaNOSC and ExPaNDS projects. Several institutions in DAPHNE4NFDI already operate data catalogs to record experiments performed and measurement data generated. Work is already underway to link such data catalogs into a union catalog by defining and implementing a common API for searching the data. These catalogs also have interfaces to connect to transnational infrastructures such as OpenAIRE and EUDAT.
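
The idea of a union catalog behind a common search API can be sketched as follows. This is only an illustration under stated assumptions: the `search` interface, the record fields and the facility names are hypothetical, and the sketch does not reproduce the API being defined in the projects named above. The in-memory catalogs stand in for remote facility services.

```python
from __future__ import annotations

from typing import Protocol


class SearchableCatalog(Protocol):
    """Hypothetical common interface that every facility catalog implements."""

    name: str

    def search(self, technique: str) -> list[dict]: ...


class InMemoryCatalog:
    """Stand-in for a facility catalog; a real one would query a service."""

    def __init__(self, name, records):
        self.name = name
        self._records = records

    def search(self, technique):
        return [r for r in self._records if r["technique"] == technique]


class UnionCatalog:
    """Forwards one query to all member catalogs and merges the hits."""

    def __init__(self, catalogs):
        self._catalogs = list(catalogs)

    def search(self, technique):
        hits = []
        for catalog in self._catalogs:
            for record in catalog.search(technique):
                # Tag each hit with the catalog it came from.
                hits.append({**record, "source": catalog.name})
        return hits


# Hypothetical facilities and records.
facility_a = InMemoryCatalog("facility-a", [
    {"dataset_id": "a-1", "technique": "tomography"},
    {"dataset_id": "a-2", "technique": "diffraction"},
])
facility_b = InMemoryCatalog("facility-b", [
    {"dataset_id": "b-1", "technique": "tomography"},
])

union = UnionCatalog([facility_a, facility_b])
for hit in union.search("tomography"):
    print(hit)
```

Because the union catalog depends only on the shared interface, further catalogs can be added without changing the search code.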

Focus areas

Data set building will initially focus on X-ray absorption spectroscopy, tomography, diffraction (wide and small angle), and spectroscopy (quasi-elastic neutron scattering and XPCS), and will also include time-dependent measurements.

Participation

The expertise of the groups participating in DAPHNE4NFDI covers large parts of research with X-rays and neutrons. Accordingly, all relevant techniques will be addressed by the DAPHNE4NFDI consortium and taken into account in the planning and realization of (meta)data formats, repositories, catalogs, and software projects. An important pilot project for the development of a reference database is planned in the field of X-ray absorption spectroscopy, including X-ray emission and further photon-in/photon-out techniques.