docs/usage.md
18 additions & 0 deletions
@@ -37,6 +37,9 @@
- [--cf_window](#--cf_window)
- [--no_gvcf](#--no_gvcf)
- [--no_strelka_bp](#--no_strelka_bp)
+- [--umi](#--umi)
+- [--read_structure1](#--read_structure1)
+- [--read_structure2](#--read_structure2)
- [--pon](#--pon)
- [--pon_index](#--pon_index)
- [Annotation](#annotation)
@@ -478,6 +481,21 @@ Path to CADD SNVs index.
Enable genesplicer within VEP.
+### --umi
+
+If provided, UMI steps will be run to extract the UMIs, annotate the reads with them, and create consensus reads. This part of the pipeline uses *fgbio* to convert the FASTQ files into an unmapped BAM (uBAM), in which reads are tagged with the UMIs extracted from the FASTQ sequences. For the tagging to work, the UMI sequence must be contained in the read sequence itself, not in the FASTQ read name.
+Following this step, the uBAM is aligned and the reads are grouped by mapping position and UMI tag.
+Finally, reads in the same group are collapsed to create a consensus read. To create the consensus, we have chosen to use the *adjacency method* ([ref](https://cgatoxford.wordpress.com/2015/08/14/unique-molecular-identifiers-the-problem-the-solution-and-the-proof/)).
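To make the grouping idea concrete, here is a minimal Python sketch of a greedy adjacency-style clustering of UMIs observed at one mapping position. It is an illustration only, following the description in the linked post: the most abundant UMI and every UMI within one mismatch of it form a group, and the process repeats on what is left. The function names (`hamming`, `adjacency_groups`) and the example counts are hypothetical, and this is not the code that fgbio runs inside the pipeline.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))


def adjacency_groups(umi_counts, max_dist=1):
    """Greedy adjacency-style grouping (illustrative, not fgbio's code).

    Repeatedly pick the most abundant unassigned UMI as a seed and absorb
    every unassigned UMI within `max_dist` mismatches of that seed.
    """
    remaining = dict(umi_counts)
    groups = []
    while remaining:
        # Most abundant first; ties broken alphabetically for determinism.
        seed = min(remaining, key=lambda u: (-remaining[u], u))
        members = [u for u in remaining if hamming(seed, u) <= max_dist]
        for u in members:
            del remaining[u]
        # Report each group with its most abundant UMI first.
        groups.append(sorted(members, key=lambda u: (-umi_counts[u], u)))
    return groups


# Hypothetical UMI counts for reads sharing one mapping position.
counts = Counter({"ACGT": 10, "ACGA": 2, "TCGT": 1, "GGGG": 5, "GGGA": 1})
groups = adjacency_groups(counts)
print(groups)
```

Each resulting group would then be collapsed into a single consensus read; low-count UMIs one mismatch away from an abundant UMI are treated as sequencing errors of that UMI rather than as separate molecules.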
+For the tagging to be performed correctly, a read structure must be specified, as described below.
+
+### --read_structure1
+
+When processing UMIs, a read structure should always be provided for each of the FASTQ files, to allow correct annotation of the BAM file. If the read does not contain any UMI, the structure will be `+T` (i.e. only template, of any length). The read structure follows a format adopted by several tools and is described [here](https://github.com/fulcrumgenomics/fgbio/wiki/Read-Structures).
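As an illustration of how a read structure is interpreted, the following Python sketch parses a structure string and splits a read into its UMI and template parts. It assumes the `<length><operator>` notation from the linked fgbio page, handling the operators `T` (template), `M` (molecular barcode, i.e. the UMI), `B` (sample barcode) and `S` (skip), with `+` allowed as the length of the last segment only. The helper names `parse_read_structure` and `split_read` and the example sequences are hypothetical; the pipeline itself relies on fgbio for this step.

```python
import re

# One segment: a positive length (or "+" for "all remaining bases")
# followed by a single operator letter.
_SEGMENT = re.compile(r"(\d+|\+)([TBMS])")


def parse_read_structure(structure):
    """Parse e.g. '8M+T' into [(8, 'M'), (None, 'T')]; None means '+'."""
    segments = []
    pos = 0
    for match in _SEGMENT.finditer(structure):
        if match.start() != pos:
            raise ValueError(f"unexpected text in read structure: {structure!r}")
        length = None if match.group(1) == "+" else int(match.group(1))
        segments.append((length, match.group(2)))
        pos = match.end()
    if not segments or pos != len(structure):
        raise ValueError(f"invalid read structure: {structure!r}")
    if any(length is None for length, _ in segments[:-1]):
        raise ValueError("'+' is only allowed on the last segment")
    return segments


def split_read(bases, structure):
    """Return (umi, template) extracted from `bases` according to `structure`."""
    umi, template = [], []
    offset = 0
    for length, kind in parse_read_structure(structure):
        end = len(bases) if length is None else offset + length
        chunk = bases[offset:end]
        if kind == "M":
            umi.append(chunk)
        elif kind == "T":
            template.append(chunk)
        # 'B' (sample barcode) and 'S' (skip) bases are ignored in this sketch.
        offset = end
    return "".join(umi), "".join(template)


# A read with an 8-base UMI followed by template, and a read with no UMI.
with_umi = split_read("ACGTACGTTTGGCC", "8M+T")
no_umi = split_read("TTGGCC", "+T")
print(with_umi, no_umi)
```

Under these assumptions, a read whose first 8 bases are the UMI would be described as `8M+T`, while a read without any UMI uses `+T`.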
+
+### --read_structure2
+
+When processing UMIs, a read structure should always be provided for each of the FASTQ files, to allow correct annotation of the BAM file. If the read does not contain any UMI, the structure will be `+T` (i.e. only template, of any length). The read structure follows a format adopted by several tools and is described [here](https://github.com/fulcrumgenomics/fgbio/wiki/Read-Structures).
+
## Reference genomes
The pipeline config files come bundled with paths to the Illumina iGenomes reference index files.
--ignore_soft_clipped_bases [bool] Do not analyze soft clipped bases in the reads for GATK Mutect2
--no_gvcf [bool] No g.vcf output from GATK HaplotypeCaller
--no_strelka_bp [bool] Will not use Manta candidateSmallIndels for Strelka (not recommended by Best Practices)
+--umi [bool] If provided, UMI steps will be run to extract and annotate the reads with UMIs and create consensus reads
+--read_structure1 [string] When processing UMIs, a read structure should always be provided for each of the fastq files. If the read does not contain any UMI, the structure will be +T (i.e. only template of any length).
+--read_structure2 [string] When processing UMIs, a read structure should always be provided for each of the fastq files. If the read does not contain any UMI, the structure will be +T (i.e. only template of any length).