Contributor

@nakib103 nakib103 commented Aug 5, 2025

https://embl.atlassian.net/browse/ENSVAR-6812

Pipeline update:
Update the ProteinFunction pipeline to load AlphaMissense and ESM1b scores and predictions instead of MetaLR and MutationAssessor. This applies to dbNSFP versions >= 5.0.

API update:
The API can now store and retrieve scores and predictions from these two predictors.

  1. The prediction matrix now uses 3 bits for the prediction, because AlphaMissense from dbNSFP can have 5 different predictions, which cannot fit into 2 bits (2 bits encode at most 4 values).
  2. ESM1b scores, which can range from -24.538 to 6.937, are converted from a [-50, 50] range to a [0, 1] range by adding 50 and dividing by 100 before storing. The reverse operation converts the stored value back to the [-50, 50] range. To keep the decimal digit accurate, we round the score to the first decimal place.
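
The two points above can be illustrated with a minimal Python sketch. This is not the Perl code in this PR: the function names (`bits_needed`, `esm1b_to_unit`, `unit_to_esm1b`) and constants are hypothetical, and rounding on retrieval with Python's built-in `round` is an assumption about where and how the rounding happens.

```python
# Illustrative sketch only; all names here are hypothetical, not the API in this PR.

ESM1B_OFFSET = 50.0   # shifts [-50, 50] to [0, 100]
ESM1B_SPAN = 100.0    # scales [0, 100] to [0, 1]


def bits_needed(n_categories: int) -> int:
    """Smallest number of bits that can encode n_categories distinct values."""
    if n_categories < 1:
        raise ValueError("n_categories must be >= 1")
    return max(1, (n_categories - 1).bit_length())


def esm1b_to_unit(score: float) -> float:
    """Map an ESM1b score from the [-50, 50] range to the [0, 1] range."""
    if not -ESM1B_OFFSET <= score <= ESM1B_OFFSET:
        raise ValueError(f"score {score} is outside [-50, 50]")
    return (score + ESM1B_OFFSET) / ESM1B_SPAN


def unit_to_esm1b(stored: float) -> float:
    """Reverse the mapping and round to one decimal place.

    Assumption: rounding is applied on retrieval, using Python's round().
    """
    return round(stored * ESM1B_SPAN - ESM1B_OFFSET, 1)


if __name__ == "__main__":
    print(bits_needed(4), bits_needed(5))        # 4 values fit in 2 bits, 5 need 3
    for raw in (-24.538, 6.937):
        stored = esm1b_to_unit(raw)
        print(raw, stored, unit_to_esm1b(stored))
```

The fixed [-50, 50] window is wider than the observed score range, so the mapping is a plain shift and scale with no need to know the exact minimum and maximum in a given dbNSFP release.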

Unit test update:
Because the prediction now occupies 3 bits instead of 2, we needed to update the protein function matrix in the test database - modules/t/test-genome-DBs/homo_sapiens/variation/protein_function_predictions.txt
The same has not been done for the cache, partly because generating those files is complex and partly because only one test is affected. The test data for that test case has been changed so that it passes.
In addition, test data has been added for dbNSFP 5.0 and 5.2 and for the new predictors.

Test
pipeline - http://guihive.ebi.ac.uk:8080/versions/96/?driver=mysql&username=ensadmin&host=mysql-ens-var-prod-4&port=4694&dbname=snhossain_ehive_pf_dbsnfp_test_116_38_homo_sapiens&passwd=xxxxx
(the pipeline was not fully run because of codon unavailability)

The following database has been updated with the above pipeline -
@v4 snhossain_homo_sapiens_variation_116_38

Related PRs -
Ensembl/ensembl-webcode#1100
Ensembl/public-plugins#910
Ensembl/ensembl-glossary#11

@nakib103 nakib103 marked this pull request as draft August 5, 2025 10:34
@nakib103 nakib103 requested a review from jamie-m-a August 18, 2025 13:24
@nakib103 nakib103 marked this pull request as ready for review August 18, 2025 13:24