Conversation
|
I think it's best to start small, and just do germline snps/indels for now.
I do think it's better to have that optional, people can have different usage downstream.
Yes, in my opinion, as long as we produce a vcf.gz, we should have it tabix indexed. |
Thanks for the feedback, @maxulysse. Much appreciated. I'll make the concatenation optional somehow :-) Concerning your idea about the text-file - the vcf-file produced by
I'd say that makes the text-file redundant, right? |
|
…vcf-files from deepvariant.
… for vcf-files from freebayes.
… for vcf-files from tiddit.
…th the other variant-calling-parameters.
|
@FriederikeHanssen @maxulysse : Can I get you guys to do a preliminary review of this PR? If this PR looks okay, then I'll update the corresponding modules in github.com/nf-core/modules. I've tested this PR with the following cmd: and it gives me a concatenated germline-vcf-file which was made by this (N.B. The cnvkit doesn't produce a vcf-file, so no variants from cnvkit in the concatenated vcf-file.) In fact, two concatenated vcf-files were produced, since the input-samplesheet contains to bam-files: The vcf-files are sorted and have corresponding tbi-files. Warning: This PR contains some real clumsy code: |
lassefolkersen
left a comment
There was a problem hiding this comment.
I think the overall looks good. I had one comment on sorting --- is that something that has been discussed? I mean post-concatenate sorting?
Also -- I think interpretation wise it does make sense to talk about 'an intersection' of results with long variants, cnvs, e.g. as when including Manta etc. Few long variant callers will agree on exactly where the break ends are of a variant, so it's not really going to be an intersection more like an union with slightly different border repeats of the same CNV.
🙈 oh no |
|
Ok, so I introduced a local module for adding the With the CLI-options In the attached concatenated vcf-files, there are no variant from manta or tiddit. What do you guys think about this solution? I'm still passing the index-files from the variant-caller-modules all the way back to |
…nd patient-id correspond to id in bam-files. See nf-core#872
|
I'm still passing the index-files from the variant-caller-modules all the way back to sarek.nf; that is actually not necessary with the usage of the local module. Should I get rid of the code passing the index-files from the variant-caller-modules back to sarek.nf? The fastest and easiest solution would be just to get rid of the (new) code which is passing the index-files back to |
@maxulysse asked me to get rid of the redundant code, and so I did. I now - finally - have all CI-tests passing. Let's merge this thing! |
|
uh nice 🥳 🚀 |
Issue #738.
Adding option
--concatenate_vcfsfor concatenating the vcf-files from all the germline variant-callers, except cnvkit which doesn't produce a vcf-file.I've set it up so that Sarek puts the concatenated
.vcf.gz-file here:The output-vcf-files are sorted and have index-files. Added an INFO-field named
SOURCEin the output-vcf-fils giving the name of the (source) vcf-file from whence the variant came. (The filename also indicate which variant-caller was used.)I updated
nextflow_schema.json- perhapsdocs/usage.mdanddocs/output.mdshould also be updated?PR checklist
nf-core lint).nextflow run . -profile test,docker --outdir <OUTDIR>).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).