
Limit concurrent fauna connections to 2 #202

Merged: 2 commits merged from limit-concurrent-fauna into master on Jan 13, 2025
Conversation

joverlee521 (Contributor) commented on Jan 10, 2025

Description of proposed changes

The latest upload workflow¹ seemed to overwhelm the RethinkDB server, so this limits concurrent fauna connections to see whether it prevents the OOM error.

Use of Snakemake's `workflow.global_resources` is inspired by ncov's use of it to limit concurrent deploy jobs.² This approach is used instead of limiting the overall number of Snakemake jobs with `-j` because other rules in the workflow can run without connections to fauna. (A sketch of the mechanism follows the footnotes below.)

¹ https://github.com/nextstrain/seasonal-flu/actions/runs/12715154181/job/35446799658
² https://github.com/nextstrain/ncov/blob/20f5fc3c7032f4575a99745cee3238ecbeebb6e0/workflow/snakemake_rules/export_for_nextstrain.smk#L340-L362
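
For reference, the mechanism works roughly like this (a minimal sketch; the resource name matches the logs below, but the exact placement in this repo's Snakefile is an assumption):

```python
# Sketch only: set a default cap on the shared "concurrent_fauna" resource
# when the caller does not override it on the command line with
# `--resources concurrent_fauna=N`. Snakemake exposes the parsed global
# resource limits as a dict on the workflow object inside the Snakefile.
workflow.global_resources.setdefault("concurrent_fauna", 2)
```

Any rule that declares `resources: concurrent_fauna=1` then counts against this cap, so at most two fauna-connected jobs run at once, while rules that do not claim the resource are scheduled freely.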

joverlee521 (Contributor, Author) commented:
The upload workflow completed successfully 🎉

In the Snakemake logs, you can see the resource limit for `concurrent_fauna` at the start of the workflow:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 4
Rules claiming more threads will be scaled down.
Provided resources: concurrent_fauna=2

and the resource used for each job (e.g. `download_titers`):

localrule download_titers:
    output: data/h3n2/niid_egg_hi_titers.tsv
    log: logs/download_titers_h3n2_niid_egg_hi.txt
    jobid: 182
    benchmark: benchmarks/download_titers_h3n2_niid_egg_hi.txt
    reason: Missing output files: data/h3n2/niid_egg_hi_titers.tsv
    wildcards: lineage=h3n2, center=niid, passage=egg, assay=hi
    resources: tmpdir=/tmp, concurrent_fauna=1
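
For illustration, a rule along these lines would produce that job listing. This is a hypothetical reconstruction from the log above, not the actual rule body in this repo:

```python
rule download_titers:
    output:
        titers = "data/{lineage}/{center}_{passage}_{assay}_titers.tsv"
    log:
        "logs/download_titers_{lineage}_{center}_{passage}_{assay}.txt"
    benchmark:
        "benchmarks/download_titers_{lineage}_{center}_{passage}_{assay}.txt"
    resources:
        # Each job claims one unit of the global cap of 2, so at most two
        # fauna downloads run concurrently.
        concurrent_fauna = 1
    # Placeholder command: the real rule downloads titers from fauna.
    shell:
        "touch {output.titers} > {log} 2>&1"
```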

Side note: the resources were only visible for the download_titers jobs in this run because the download_sequences log was masked by the Snakemake message directive. I've removed the message directive in 123716a.
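
For anyone unfamiliar with the directive: `message:` replaces Snakemake's default per-job block (the jobid/wildcards/resources listing shown above) with a single custom line, which is why the resource claim was hidden. A hypothetical sketch, with an assumed output path:

```python
rule download_sequences:
    output:
        # Hypothetical output path for illustration.
        "data/{lineage}/raw_sequences.fasta"
    # With this directive present, Snakemake prints only the message below
    # instead of the default job metadata, hiding `resources:
    # concurrent_fauna=1` from the log. Removing it restores the full listing.
    message:
        "Downloading {wildcards.lineage} sequences from fauna"
    resources:
        concurrent_fauna = 1
    shell:
        "touch {output}"  # placeholder command
```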


This workflow completed in 2h7m. Considering the previous upload workflow failed at 2h3m, limiting concurrent fauna connections does not noticeably slow down the workflow. Last week's upload workflow completed in 1h50m, but this workflow's runtime has been increasing as we add more data.

joverlee521 merged commit 73b8294 into master on Jan 13, 2025. 3 checks passed.
joverlee521 deleted the limit-concurrent-fauna branch on January 13, 2025 at 18:39.