Skip to content

add ktimportkrona module #8859

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

add ktimportkrona module #8859

wants to merge 2 commits into from

Conversation

paulwolk
Copy link
Contributor

@paulwolk paulwolk commented Aug 7, 2025

PR checklist

Closes #7977

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

Module information:

Collect multiple krona reports in one file - kind of like multiqc, just for krona / classification files

https://manpages.ubuntu.com/manpages/jammy/man1/ktImportKrona.1.html
https://anaconda.org/bioconda/krona
https://github.com/marbl/Krona

@paulwolk paulwolk added new module Adding a new module Ready for Review labels Aug 7, 2025
Copy link
Contributor

@famosab famosab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good already!

@@ -0,0 +1,44 @@
process KRONA_KTIMPORTKRONA {
label 'process_low'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does your tool allow multi-threading? If yes you need to add the task.cpus somewhere, if no I would suggest to change this to process_single :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't. I'll do that, thx!

'biocontainers/krona:2.8.1--pl5321hdfd78af_1' }"

input:
path html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get a meta map here?

Suggested change
path html
tuple val(meta), path(html)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the multiqc way here, since it does similar things - combining results of many (krona) processes into one. What's the standard here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we hand a meta map with inputa and outputs (see any other modules). Not so sure why its different to multiqc. I am happy to accept as is, maybe we need to discuss this in the slack then :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that it's standard normally, but maybe the reason it's absent in multiqc is hidden in this sentence?

"Where applicable all sample-specific information e.g. 'id', 'single_end', 'read_group' MUST be provided as an input via a Groovy Map called 'meta'. This information may not be required in some instances, for example indexing reference genome files."

There are no sample-specific informations since there are multiple samples, so meta map isn't provided.

path html

output:
path "*.html" , emit: html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get a meta here? :)


stub:
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ? "-o ${task.ext.prefix}.html" : 'krona.krona.html'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def prefix = task.ext.prefix ? "-o ${task.ext.prefix}.html" : 'krona.krona.html'
def output = task.ext.prefix ? "${task.ext.prefix}.html" : 'krona.krona.html'

def args = task.ext.args ?: ''
def prefix = task.ext.prefix ? "-o ${task.ext.prefix}.html" : 'krona.krona.html'
"""
touch ${prefix}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
touch ${prefix}
touch ${output}

Comment on lines 16 to 36
setup {
run("KRAKENTOOLS_KREPORT2KRONA") {
script "../../../krakentools/kreport2krona/main.nf"
process {
"""
input[0] = Channel.of([
[id: 'testreport'],
file(params.modules_testdata_base_path + 'genomics/sarscov2/metagenome/test_1.kraken2.report.txt', checkIfExists: true)
])
"""
}
}
run("KRONA_KTIMPORTTEXT") {
script "../../ktimporttext/main.nf"
process {
"""
input[0] = KRAKENTOOLS_KREPORT2KRONA.out.txt
"""
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can move the setup block out of the test block and reuse it along all tests :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh good info! Thank you!

Comment on lines 49 to 51
{ assert snapshot(process.out.versions).match() },
{ assert file(process.out.html[0]).text.contains('s__Severe_acute_respiratory_syndrome-related_coronavirus') },
{ assert file(process.out.html.get(0)).getText().contains("testreport") }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we want all the assertions to be reflected in the snapshot :) so we need to find a way to check for these things in the snapshot. I will link another modules to explain what I mean!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it works to assert the whole output? Or is there a timestamp in the html file? Then we need to filter that out :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a timestamp, the html outputs are just not consistent. Sometimes the colors are swapped and weird stuff like that :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok then just go for my suggestion below :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for your suggestions, applied the changes!

Comment on lines 98 to 101
{ assert snapshot(process.out.versions).match() },
{ assert file(process.out.html[0]).text.contains('s__Severe_acute_respiratory_syndrome-related_coronavirus') },
{ assert file(process.out.html.get(0)).getText().contains("testreport1") },
{ assert file(process.out.html.get(0)).getText().contains("testreport2") }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{ assert snapshot(process.out.versions).match() },
{ assert file(process.out.html[0]).text.contains('s__Severe_acute_respiratory_syndrome-related_coronavirus') },
{ assert file(process.out.html.get(0)).getText().contains("testreport1") },
{ assert file(process.out.html.get(0)).getText().contains("testreport2") }
{ assert snapshot(
process.out.versions,
path(process.out.html.get(0).get(1)).getText().contains("s__Severe_acute_respiratory_syndrome-related_coronavirus"),
path(process.out.html.get(0).get(1)).getText().contains("testreport1"),
path(process.out.html.get(0).get(1)).getText().contains("testreport2")
).match() },

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

new module: krona/ktimportkrona
2 participants