5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -17,7 +17,12 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- [#41](https://github.com/nf-core/sarek/pull/41) - Update `qualimap` from `2.2.2b` to `2.2.2c`
- [#41](https://github.com/nf-core/sarek/pull/41) - Update `tiddit` from `2.7.1` to `2.8.0`
- [#41](https://github.com/nf-core/sarek/pull/41) - Update `vcfanno` from `0.3.1` to `0.3.2`
- [#46](https://github.com/nf-core/sarek/pull/46) - Add location to abstracts

### `Removed`

- [#45](https://github.com/nf-core/sarek/pull/45) - Include Workflow figure in `README.md`
- [#46](https://github.com/nf-core/sarek/pull/46) - Remove mention of old `build.nf` script which was included in `main.nf`

### `Fixed`

2 changes: 1 addition & 1 deletion docs/abstracts/2016-09-KICR.md
@@ -1,4 +1,4 @@
# The XVth KICancer Retreat 2016
# The XVth KICancer Retreat - Djurö, Sweden, 2016/09

## Cancer Analysis Workflow Of Tumor/Normal Pairs At The National Genomics Infrastructure Of SciLifeLab

2 changes: 1 addition & 1 deletion docs/abstracts/2017-05-ESHG.md
@@ -1,4 +1,4 @@
# European Human Genetics Conference 2017
# European Human Genetics Conference - Copenhagen, Denmark, 2017/05

## CAW - Cancer Analysis Workflow to process normal/tumor WGS data

2 changes: 1 addition & 1 deletion docs/abstracts/2018-05-PMC.md
@@ -1,4 +1,4 @@
# Keystone Symposia - Precision Medicine in Cancer
# Keystone Symposia - Precision Medicine in Cancer - Stockholm, Sweden, 2018/05

## Sarek, a workflow for WGS analysis of germline and somatic mutations

2 changes: 1 addition & 1 deletion docs/abstracts/2018-06-EACR25.md
@@ -1,4 +1,4 @@
# 25th Biennial Congress Of The European Association For Cancer Research 2018
# 25th Biennial Congress Of The European Association For Cancer Research - Amsterdam, Netherlands, 2018/06-07

## Somatic and germline calls from tumour/normal whole genome data: bioinformatics workflow for reproducible research

2 changes: 1 addition & 1 deletion docs/abstracts/2018-06-NPMI.md
@@ -1,4 +1,4 @@
# The Nordic Precision Medicine Initiative - Meeting No 5
# The Nordic Precision Medicine Initiative - Meeting No 5 - Reykjavík, Iceland, 2018/06

## Sarek, a portable workflow for WGS analysis of germline and somatic mutations

2 changes: 1 addition & 1 deletion docs/abstracts/2018-07-JOBIM.md
@@ -1,4 +1,4 @@
# Journées Ouvertes en Biologie, Informatique et Mathématiques 2018
# Journées Ouvertes en Biologie, Informatique et Mathématiques - Marseille, France, 2018/07

## Sarek, a portable workflow for WGS analysis of germline and somatic mutations

16 changes: 8 additions & 8 deletions docs/reference.md
@@ -12,8 +12,8 @@ Settings in `igenomes.config` can be tailored to your needs.

To speed up some preprocessing and variant calling processes, the reference is chopped into smaller pieces.
The intervals are chromosomes cut at their centromeres (so each chromosome arm is processed separately), plus additional unassigned contigs.
We are ignoring the hs37d5 contig that contains concatenated decoy sequences.
Parts of preprocessing and variant calling are done by this intervals, and the different resulting files are then merged.
We are ignoring the `hs37d5` contig that contains concatenated decoy sequences.
Parts of preprocessing and variant calling are done by these intervals, and the different resulting files are then merged.
This can parallelize processes, and push down wall clock time significantly.

The calling intervals can be defined using a `.list` or a `.bed` file.
@@ -37,10 +37,10 @@ Second, the jobs with largest processing time are started first, which reduces w
If no runtime is given, a time of 1000 nucleotides per second is assumed.
Actual figures vary from 2 nucleotides/second to 30000 nucleotides/second.
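
The longest-first ordering and the 1000 nucleotides/second fallback described above can be sketched in a few lines of Python. This is only an illustration, not the workflow's actual code: the tuple layout `(chrom, start, end, runtime)` and the function names `estimated_seconds` and `longest_first` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: (chromosome, start, end, optional runtime in seconds).
Interval = Tuple[str, int, int, Optional[float]]

# Fallback speed stated in the text, used when an interval has no runtime estimate.
DEFAULT_NUCLEOTIDES_PER_SECOND = 1000


def estimated_seconds(interval: Interval) -> float:
    """Return the given runtime, or derive one from the interval length."""
    _chrom, start, end, runtime = interval
    if runtime is not None:
        return runtime
    return (end - start) / DEFAULT_NUCLEOTIDES_PER_SECOND


def longest_first(intervals: List[Interval]) -> List[Interval]:
    """Order intervals so the most expensive ones are started first."""
    return sorted(intervals, key=estimated_seconds, reverse=True)
```

Since actual speeds range from 2 to 30000 nucleotides/second, an explicit runtime estimate, where available, should give a much better ordering than the length-based fallback.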

## build.nf
### Working with whole exome (WES) or panel data

The [`build.nf`](#buildnf) script is used to build reference needed for smallGRCh37.

```bash
nextflow run build.nf
```
The `--targetBED` parameter does _not_ imply that the workflow is running alignment or variant calling only for the supplied targets.
Instead, we are aligning to the whole genome, and selecting variants only at the very end by intersecting with the provided target file.
For WES, adding every exon as an interval can generate more than 200K processes or jobs, many more forks, and a similar number of directories in the Nextflow work directory.
Furthermore, primers and/or baits are not 100% specific (certainly not for MHC, KIR, etc.), so there are quite likely going to be reads mapping to multiple locations.
If you are certain that the target is unique for your genome (all the reads will certainly map to only one location), and aligning to the whole genome is overkill, it is better to change the reference itself.
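
To make the "select variants at the very end" idea concrete, here is a minimal Python sketch of intersecting variant positions with a target file. It is not how the workflow itself performs the intersection; the function names `build_index` and `on_target` are hypothetical, and the code assumes the usual conventions that BED intervals are 0-based and half-open while VCF positions are 1-based.

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(targets):
    """Group (chrom, start, end) BED intervals by chromosome and merge overlaps."""
    by_chrom = defaultdict(list)
    for chrom, start, end in targets:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = []
        for start, end in intervals:
            if out and start <= out[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                out[-1] = (out[-1][0], max(out[-1][1], end))
            else:
                out.append((start, end))
        merged[chrom] = out
    return merged


def on_target(index, chrom, pos):
    """Return True if a 1-based variant position falls inside a target interval."""
    intervals = index.get(chrom, [])
    zero_based = pos - 1
    # Find the last interval whose start is <= the position.
    i = bisect_right(intervals, (zero_based, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= zero_based < intervals[i][1]
```

Filtering at this stage keeps the number of calling intervals independent of the number of targets, which is the point made above about avoiding one process per exon.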