Challenge 24 - Knowledge Graph Generation for Enhanced Chatbot and Scientific Literature Synthesis
Stream 2 - Machine Learning for Earth Sciences applications
Goal
In this challenge, participants will build a tool to enhance the existing ECMWF chatbot*. The goal is a system that can interpret scientific texts about weather and use that information to generate interactive graphs, helping users understand and explore weather-related concepts and phenomena.
- Using transformer models to develop a search engine for datasets, charts, and documentation
- ECMWF's experimental AI-based assistant
Mentors and skills
- Mentors: Ana Prieto Nemesio, Florian Pinault, Baudouin Raoult (all ECMWF)
- Skills required:
- Python
- Machine Learning
- Data Science and NLP
- Knowledge of Graph Theory and Knowledge Representation
- Possible extra: Confluence plugins and macros (Java, Velocity)
Challenge description
Knowledge graphs structure knowledge as entities and the relationships between them, making it easier to understand and use. Adding knowledge graphs to chatbots and search engines can improve the user experience. Additionally, a tool capable of generating interactive content graphs could simplify the process of synthesizing scientific literature, making it easier to explore and analyze connections between ideas.
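As a rough illustration of what "structuring knowledge" can mean in practice, a knowledge graph can be sketched as a list of (subject, relation, object) triples. The entities, relation names, and helper functions below are hypothetical and chosen only for illustration; they are not taken from any challenge dataset or existing ECMWF code.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object), for illustration only.
TRIPLES = [
    ("ERA5", "is_a", "reanalysis dataset"),
    ("ERA5", "produced_by", "ECMWF"),
    ("ERA5", "provides", "2m temperature"),
    ("IFS", "is_a", "forecast model"),
    ("IFS", "developed_by", "ECMWF"),
]


def build_index(triples):
    """Index triples by subject so facts about an entity are one lookup away."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return index


def related_to(triples, entity):
    """Return every entity that shares a triple with `entity`, in either direction."""
    found = set()
    for subject, _, obj in triples:
        if subject == entity:
            found.add(obj)
        elif obj == entity:
            found.add(subject)
    return found


index = build_index(TRIPLES)
print(index["ERA5"])                           # all facts stored about ERA5
print(sorted(related_to(TRIPLES, "ECMWF")))    # entities linked to ECMWF
```

A chatbot or search engine could use a lookup like `related_to` to suggest connected topics next to an answer, which is one way the structure could improve exploration.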
Challenge Tasks
- Data Acquisition and Preprocessing - Parsing and Entity Extraction:
Participants will collect relevant datasets, including scientific literature, weather data, domain-specific ontologies, etc.
They will then preprocess the data to extract the key entities, relationships, and metadata needed for knowledge graph generation. For this, they will leverage existing data parsing and entity extraction functionality developed for the company's internal chatbot.
- Knowledge Graph Generation:
Participants will design and implement algorithms and techniques to generate knowledge graphs from the acquired data, aiming to capture meaningful relationships between entities, concepts, and events.
They will refine the graphs by clarifying entity meanings, resolving entity ambiguity, refining relationships, and enriching entity attributes.
- Interactive Visualization and Exploration:
Participants will develop interfaces or tools to interactively visualize and explore the generated knowledge graphs. The visualization should facilitate intuitive navigation, querying, and exploration of the underlying knowledge graph.
- Integration with Chatbot/Search Engine:
Participants will integrate the developed knowledge graph system with the existing ECMWF chatbot. This integration will allow users to access insights and information from the knowledge graphs directly through the chatbot or search engine interface.
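The tasks above can be sketched end to end in a few lines of Python. This is a minimal sketch under stated assumptions, not the existing chatbot's implementation: a hand-written alias table stands in for a real entity extraction model, sentence-level co-occurrence stands in for relationship extraction, and the example sentences, alias table, and function names are all hypothetical.

```python
import itertools
import json
import re
from collections import Counter

# Hypothetical alias table: surface form -> canonical entity name.
# Mapping several surface forms to one name is a crude form of entity resolution.
ALIASES = {
    "north atlantic oscillation": "NAO",
    "nao": "NAO",
    "sea surface temperature": "SST",
    "sst": "SST",
    "precipitation": "precipitation",
    "rainfall": "precipitation",
}

# Made-up example sentences standing in for parsed scientific text.
SENTENCES = [
    "The North Atlantic Oscillation influences winter precipitation over Europe.",
    "Sea surface temperature anomalies can modulate the NAO.",
    "A positive NAO phase often brings more rainfall to northern Europe.",
]


def extract_entities(sentence):
    """Return the canonical entities mentioned in one sentence."""
    text = sentence.lower()
    found = set()
    for alias, canonical in ALIASES.items():
        # Word boundaries keep short aliases from matching inside longer words.
        if re.search(r"\b" + re.escape(alias) + r"\b", text):
            found.add(canonical)
    return found


def build_graph(sentences):
    """Count how often each pair of entities appears in the same sentence."""
    edges = Counter()
    for sentence in sentences:
        entities = sorted(extract_entities(sentence))
        for a, b in itertools.combinations(entities, 2):
            edges[(a, b)] += 1
    return edges


def neighbours(edges, entity):
    """Entities directly linked to `entity`, e.g. as extra context for a chatbot answer."""
    linked = set()
    for a, b in edges:
        if a == entity:
            linked.add(b)
        elif b == entity:
            linked.add(a)
    return linked


def to_node_link(edges):
    """Convert the edge counts to a plain nodes/links dictionary for a visualisation front end."""
    nodes = sorted({name for pair in edges for name in pair})
    return {
        "nodes": [{"id": name} for name in nodes],
        "links": [
            {"source": a, "target": b, "weight": weight}
            for (a, b), weight in sorted(edges.items())
        ],
    }


edges = build_graph(SENTENCES)
graph = to_node_link(edges)
print(json.dumps(graph, indent=2))
print(sorted(neighbours(edges, "NAO")))
```

The nodes/links dictionary is a generic shape that an interactive front end could consume, and `neighbours` hints at how a chatbot might pull related entities into a response. A real solution would likely replace the alias table with a trained extraction model and typed relations rather than co-occurrence counts.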
Dataset
Participants will be provided with datasets containing scientific literature, weather data, domain-specific ontologies, and any other relevant sources, including a preliminary entity dataset extracted during the chatbot's development.