Driving Data Innovation: Progress and future plans for DATAMITE pilots
The DATAMITE project is making steady progress across its six pilots, each of which addresses a distinct data management challenge and demonstrates the framework's potential in data sharing, governance and quality assessment. With milestones set through 2025, these pilots will strengthen the DATAMITE framework's ability to address diverse sectoral challenges and provide sustainable, scalable solutions for data sharing and management.
Here is an overview of the progress and upcoming milestones for each of these pilots:
Pilot 1: Data exchange between companies within the same corporate group
The infrastructure for data exchange is in place, with an on-premises DATAMITE ecosystem using PostgreSQL, MongoDB, and Cassandra. Test scenarios have been established, and a beta test API, developed using Django, facilitates interactions for OBREMO and FACSA use cases. Sensitive data management is in line with DATAMITE's legal advice, with problematic and controversial data being removed. Finally, cloud deployment testing is underway in collaboration with ITI.
Future steps
- Deploy the DATAMITE framework, at least on the on-premises infrastructure.
- Test communication and connection with the API, adding further integration if there is interest.
- Collect and analyse initial KPIs.
- Produce multimedia materials to showcase test results.
- Draft the pilot report.
Pilot 2: Corporate Multi-Site Data Exchange
The DATAMITE framework has been deployed in a development infrastructure. Test scenarios and synthetic datasets were prepared in month 23 (November 2024). The team is refining the data structures and running preliminary evaluations of the framework's performance.
Future steps
- Fully deploy the framework in the pilot development infrastructure.
- Finalise and execute use case validation.
- Set up production pilot infrastructure in Azure.
- Evaluate the DATAMITE framework on different operating systems, such as Red Hat Enterprise Linux flavours, to compare with traditional distributions such as Ubuntu or Debian.
- Submit a publication describing the pilot to the AIAI 2025 conference.
Pilot 3: Offering Data to Service Providers with DataSpaces
Component deployment and testing began in month 17 (May 2024) and is scheduled to be completed by month 25 (January 2025). The current priorities include deploying and testing the relevant DATAMITE components, finalising the test scenarios, and contributing to the course design. Additionally, the deployment will soon be moved from a Google Cloud VM to a permanent solution.
Future steps
- Carry out the first iteration of scenario testing between November 2024 and the end of January 2025, completing component deployment and testing and gathering feedback to refine the second iteration of scenario testing, which starts in March 2025.
- Migrate the deployment to an on-premises infrastructure.
- Deploy the Data Sharing components and publish anonymised sensitive and composite data products.
- Draft the initial pilot report.
Pilot 4: Leverage Electricity Distribution Open Data
The internal infrastructure, which will be based on a Microsoft Azure environment, is ready for deployment, and deployment of the Data Quality, Data Governance and Data Support Tools has already started. Testing will be divided into three levels and is focused on improving data collection and sharing processes.
Future steps
- Complete and validate pilot test scenarios.
- Deploy and test components from the Data Support Tools, Data Quality and Data Governance modules until the end of January 2025; these components will be the focus of the first two levels of the test scenarios.
- Draft the initial pilot report.
Pilot 5: Connecting eDWIN to Data Markets
The team has completed the integration with Pontus-X for the agri-food sector. Meteorological data has been integrated with AIM, and normalisation pipelines for pest and production datasets (agri-food domain) are operational.
Future steps
- Complete the first automated data workflow.
- Integrate eDWIN with DATAMITE data quality and data governance tools.
- Conduct test scenarios and complete demonstrators by the end of February 2025.
Pilot 6: Connecting MISTRAL to the EU AI-on-Demand Platform
Resource allocation on the cloud is underway, along with the deployment of the development environment to test the DATAMITE framework. The team is also currently defining data quality checks and analysing data connectivity for external catalogues such as AI-on-Demand and the Mistral Open Data Catalogue.
Future steps
- Integrate the DATAMITE framework with the Mistral platform.
- Implement a Mistral connector to retrieve (meta)data in the DATAMITE framework.
- Disseminate the data catalogue, with data sharing to a local open data catalogue on the one hand and to the external AI-on-Demand platform on the other.
- Set up user-defined custom rules.
- Enable data quality checks with the Data Quality Module.
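To make the idea of user-defined rules feeding data quality checks more concrete, here is a minimal sketch in Python. It does not use the DATAMITE Data Quality Module or any Mistral API: the `Rule` class, the `run_checks` function, the rule names and the `temperature_c` field are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical record type: one dataset row as a plain dictionary.
Record = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A user-defined rule: a name plus a predicate applied to each record."""
    name: str
    check: Callable[[Record], bool]


def run_checks(records: list[Record], rules: list[Rule]) -> dict[str, float]:
    """Return, for each rule, the fraction of records that satisfy it."""
    if not records:
        return {}  # nothing to score on an empty dataset
    return {
        rule.name: sum(rule.check(r) for r in records) / len(records)
        for rule in rules
    }


# Two illustrative rules (assumptions, not taken from the project).
rules = [
    Rule("temperature_present", lambda r: r.get("temperature_c") is not None),
    Rule(
        "temperature_plausible",
        lambda r: isinstance(r.get("temperature_c"), (int, float))
        and -90 <= r["temperature_c"] <= 60,
    ),
]

# Made-up sample records.
records = [
    {"station": "A", "temperature_c": 12.5},
    {"station": "B", "temperature_c": None},
    {"station": "C", "temperature_c": 140.0},
    {"station": "D", "temperature_c": -3.0},
]

scores = run_checks(records, rules)
print(scores)
```

Keeping each rule as a named predicate means new checks can be added without changing the scoring function, which is one plausible way to support custom rules alongside built-in ones.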
If you want to always stay updated about our project, follow us on LinkedIn, Twitter and Bluesky!
DATAMITE takes the stage at the first Gaia-X Hub Spain Data Space Summit
The first Gaia-X Hub Spain Data Space Summit ("Primera cumbre de espacios datos", as it is called in Spanish) took place in Madrid on 3 and 4 December. The most relevant international event on Data Spaces in Spanish brought together prominent leaders of national initiatives, government representatives and European project managers in the field of Data Spaces, Technology and Artificial Intelligence, with a special focus on use cases and ongoing projects, to share their knowledge and experiences in a dynamic and collaborative environment.
DATAMITE, as a European project focused on the monetisation of data and data spaces, coordinated by a Spanish entity (ITI), did not miss the event. Jordi Arjona Aroca, from ITI and technical coordinator of DATAMITE, presented the key aspects of the project and how the DATAMITE framework will revolutionise the data monetisation landscape in a session entitled 'Technology for data spaces' on 4 December.
DATAMITE plenary meeting in Aachen
From 26 to 28 November, the DATAMITE consortium met in Aachen for its second - and last - plenary meeting of 2024. As at the previous plenary meeting in Poznan, the agenda was divided into a first day of CodeCamp with the developers involved in the project, followed by two days with the whole consortium, where each work package had its moment in the spotlight with presentations and/or workshops.
The first day was spent at the CodeCamp, as a warm-up for the plenary sessions that would take place over the next two days. This edition was less about coding and more about practical activities and shaping the future of the DATAMITE framework. All DATAMITE design and coding partners were divided into parallel sessions, according to their needs, in order to get the most out of CodeCamp.
On 27 and 28 November, the partners who were not involved in the CodeCamp joined in. The DATAMITE consortium exchanged ideas, discussed the results of the review meeting and set the roadmap for the final year of the project. In contrast to previous plenary meetings, and although the agenda was still structured by work package, the Aachen meeting was much more practical and focused on encouraging the participation of all partners in all work packages through more interactive workshops and sessions.
Among the main topics discussed were the project's six pilots and the steps to be taken before the end of 2025. Exciting things to come next year!
In addition to three intense days of work, the DATAMITE consortium enjoyed a guided tour of the Aachen Christmas Market and a visit to the Demonstration Factory, which is a central component of the Smart Logistics Cluster on the RWTH Aachen campus, thanks to the colleagues from FIR RWTH Aachen University who hosted this plenary meeting.