Semantic search for GEO datasets #15

@MomirMilutinovic

Description

Users should be able to query the GEO datasets related to a list of PubMed IDs in natural language, to quickly surface the most relevant data.

Create an endpoint /relevant-datasets that:

  • Takes as input a list of PubMed IDs and a query string
  • Returns a list of GEO datasets related to the PubMed IDs sorted by their relevance to the query
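
One possible shape for the request and response is sketched below. This is only an illustration: the JSON field names `pubmed_ids`, `query`, `geo_id`, and `score`, as well as the helper names, are assumptions and are not specified by this issue.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RelevantDatasetsRequest:
    pubmed_ids: list[str]  # hypothetical field name
    query: str             # hypothetical field name


@dataclass
class DatasetHit:
    geo_id: str   # hypothetical field: identifier of a GEO dataset
    score: float  # relevance to the query, higher means more relevant


def parse_request(body: str) -> RelevantDatasetsRequest:
    """Validate a JSON request body for /relevant-datasets."""
    data = json.loads(body)
    if not isinstance(data, dict):
        raise ValueError("request body must be a JSON object")
    ids = data.get("pubmed_ids")
    query = data.get("query")
    if not isinstance(ids, list) or not ids:
        raise ValueError("pubmed_ids must be a non-empty list")
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    # Normalize IDs to strings so numeric and string inputs behave the same.
    return RelevantDatasetsRequest([str(i) for i in ids], query.strip())


def render_response(hits: list[DatasetHit]) -> str:
    """Serialize hits as a JSON array, most relevant first."""
    ordered = sorted(hits, key=lambda h: h.score, reverse=True)
    return json.dumps([asdict(h) for h in ordered])
```

Keeping parsing and serialization separate from the ranking logic would let the endpoint be wired into whichever web framework the project already uses.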

Notes:

  • The relevance of a dataset to the user's query should be measured using cosine similarity between the embeddings of the query and dataset metadata
  • Use the PubTrends sentence-transformer API to fetch embeddings
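
The ranking step described in these notes could look roughly like the sketch below. The `embed` callable is a stand-in for the PubTrends sentence-transformer API, whose actual interface is not described in this issue; its signature (a list of texts in, a list of vectors out) and the names `rank_datasets` and `metadata` are assumptions.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_datasets(
    query: str,
    metadata: dict[str, str],
    embed: Callable[[list[str]], list[Vector]],
) -> list[tuple[str, float]]:
    """Return (dataset_id, score) pairs sorted by descending similarity.

    `metadata` maps a dataset identifier to its metadata text.
    `embed` is a hypothetical wrapper around the embedding service.
    """
    ids = list(metadata)
    # Embed the query and all metadata texts in a single batch call.
    vectors = embed([query] + [metadata[i] for i in ids])
    query_vec, dataset_vecs = vectors[0], vectors[1:]
    scored = [
        (i, cosine_similarity(query_vec, v)) for i, v in zip(ids, dataset_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Batching the query together with the metadata texts keeps the number of calls to the embedding service at one per request. If dataset metadata rarely changes, caching its embeddings would be a natural follow-up.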
