Add UMI reads processing capability #145
maxulysse
left a comment
Looks amazing.
We just need the test data to do some CI.
Can you update the CHANGELOG as well?
Made a couple of suggestions; if you accept them, you can batch commit them.
Co-Authored-By: Maxime Garcia <maxime.garcia@scilifelab.se>
lescai
left a comment
Agreed with the suggestions: reviewed them and made them more explicit.
Hi, any updates on adding UMI support to variant calling? Is it working? Otherwise I will build a new pipeline.
Hi @chelauk, I'm not sure what's holding up the pull request at this stage. I tested everything at the March hackathon using the test data here.
Hi @chelauk @nibscles
@chelauk @nibscles
nf-core/sarek pull request
Many thanks for contributing to nf-core/sarek!
Please fill in the appropriate checklist below (delete whatever is not relevant).
These are the most common things requested on pull requests (PRs).
PR checklist
This pull request introduces code to process reads containing UMIs. Unique Molecular Indices (UMIs) are particularly important for somatic workflows that aim to detect variants at very low allele fractions (MRD, liquid biopsy). The chosen workflow adopts the fgbio tools, which provide a robust method for identifying UMI groups and create a consensus read within each group. See blog and ref.
The approach ensures downstream compatibility with the rest of the workflow: the UMI process outputs an unmapped BAM (uBAM), which can then be fed into MappingReads and, further downstream, into HaplotypeCaller and, more importantly, Mutect2 or Strelka.
Tests are a work in progress: datasets have been identified for 2 different UMI types (QIAseq and Illumina TSO), but the tests cannot be completed on a laptop.
As indicated above, the reads will be uploaded to the nf-core/sarek branch of the nf-core/test-datasets repo.
The code has passed lints (nf-core lint .).
Documentation in docs has been updated.
CHANGELOG.md has not been updated yet.
README.md has not been updated yet (not sure if this is relevant).
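For readers unfamiliar with UMI consensus calling, the idea described in the checklist above (group reads that come from the same molecule, then collapse each group into one consensus read) can be sketched in a few lines of Python. This is a toy illustration only, not the code in this PR and not fgbio's algorithm: the `Read` type, the exact-match grouping on position and UMI, the per-base majority vote, and the `min_reads` parameter are all assumptions made for the example. fgbio itself uses a more elaborate approach (for example, it takes base qualities into account).

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical, simplified read record used only for this sketch."""
    umi: str       # UMI sequence attached to the read
    position: int  # mapping start position of the read
    sequence: str  # base calls


def group_by_umi(reads):
    """Group reads that share both mapping position and an identical UMI."""
    groups = defaultdict(list)
    for read in reads:
        groups[(read.position, read.umi)].append(read)
    return dict(groups)


def consensus(sequences):
    """Per-base majority vote; ties become 'N'. Assumes equal-length sequences."""
    result = []
    for column in zip(*sequences):
        ranked = Counter(column).most_common()
        top_base, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append("N" if tied else top_base)
    return "".join(result)


def call_consensus_reads(reads, min_reads=1):
    """Return one consensus sequence per UMI group with at least min_reads reads."""
    out = {}
    for key, members in group_by_umi(reads).items():
        if len(members) >= min_reads:
            out[key] = consensus([m.sequence for m in members])
    return out


reads = [
    Read("ACGT", 100, "TTGACCA"),
    Read("ACGT", 100, "TTGACCA"),
    Read("ACGT", 100, "TTGTCCA"),  # differs at index 3: outvoted as a likely error
    Read("GGCA", 100, "TTGTCCA"),  # different UMI, so a different molecule
]
consensus_reads = call_consensus_reads(reads)
for (position, umi), sequence in sorted(consensus_reads.items()):
    print(position, umi, sequence)
```

In this sketch the single mismatching read in the `ACGT` group is outvoted by the other two, while the `GGCA` read keeps its base because it is the only member of its group. That is the property that makes consensus reads useful for low allele-fraction variant calling: errors that appear in only some copies of a molecule are suppressed, while differences between molecules are preserved.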