
Commit

Merge pull request #35 from CDCgov/dev
docs: updated readme
slsevilla authored May 16, 2024
2 parents c9c1566 + 479f545 commit bfb1855
Showing 1 changed file with 2 additions and 21 deletions.
23 changes: 2 additions & 21 deletions README.md
@@ -7,10 +7,6 @@

## This project is a successor to the [C-WAP pipeline](https://github.com/CFSAN-Biostatistics/C-WAP) and is intended to process SARS-CoV-2 wastewater samples to determine relative variant abundance.

⚠️Warning⚠️

This pipeline has been validated for short reads. The results generated by this pipeline are not CLIA certified and should not be considered diagnostic.

## Introduction
**CDCgov/aquascope** is a bioinformatics best-practice pipeline for early detection of SARS-CoV-2 variants of concern in wastewater, sequenced through shotgun metagenomic sequencing.

@@ -31,27 +27,12 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool

2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/), [`Podman`](https://podman.io/), [`Shifter`](https://nersc.gitlab.io/development/shifter/how-to-use/) or [`Charliecloud`](https://hpc.github.io/charliecloud/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_

3. Prepare the `assets/samplesheet.csv`

- Use the `assets/test_highcoverage_samplesheet.csv` as an example

- Create custom sample sheets using the `fastq_dir_to_samplesheet.py` script
```
Usage:
fastq_dir_to_samplesheet.py \
/absolute/path/to/fastq/dir \
-st <forward/reverse/unstranded> \
samplesheetName.csv
```

i. FASTQ files must be paired end, following `_R1`, `_R2` naming convention.

ii. Strandedness must be known or "unstranded". NOTE: DNASeq is by default `unstranded`, while RNASeq is usually `stranded`
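
Before building a sample sheet, it can help to confirm that every sample has both mates. The sketch below is only an illustration of the `_R1`/`_R2` convention described above: it is not part of `fastq_dir_to_samplesheet.py`, the function name `find_fastq_pairs` is hypothetical, and the accepted extensions (`.fastq`, `.fq`, optionally `.gz`) are an assumption.

```python
import re
from pathlib import Path

# Assumed file name shape: <sample>_R1<anything>.fastq[.gz] or .fq[.gz]
# (only the _R1/_R2 tag comes from the README; the rest is an assumption).
READ_TAG = re.compile(
    r"^(?P<sample>.+?)_(?P<read>R[12])(?P<rest>.*)\.(fastq|fq)(\.gz)?$"
)


def find_fastq_pairs(names):
    """Group FASTQ file names by sample and split them into paired and unpaired.

    Returns (paired, unpaired), where paired maps sample -> (R1 name, R2 name)
    and unpaired is a sorted list of samples missing one mate.
    """
    reads = {}
    for name in names:
        match = READ_TAG.match(Path(name).name)
        if match is None:
            continue  # not a FASTQ file following the _R1/_R2 convention
        reads.setdefault(match["sample"], {})[match["read"]] = name

    paired = {s: (r["R1"], r["R2"]) for s, r in reads.items() if len(r) == 2}
    unpaired = sorted(s for s, r in reads.items() if len(r) != 2)
    return paired, unpaired


names = [
    "sampleA_R1_001.fastq.gz",
    "sampleA_R2_001.fastq.gz",
    "sampleB_R1.fastq.gz",
    "notes.txt",
]
paired, unpaired = find_fastq_pairs(names)
print(paired)
print(unpaired)
```

In this example `sampleA` is reported as a complete pair, `sampleB` as unpaired, and `notes.txt` is ignored. To check a real directory, the names could be collected with `Path(fastq_dir).iterdir()`.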
3. Prepare the `assets/samplesheet.csv`. Refer to [prepare-files](https://cdcgov.github.io/aquascope/).

4. Prepare the configuration files
A. `nextflow.config` is prepared with default parameters, update as needed
B. `test.config` is prepared with default parameters, update as needed
C. `cdc-dev.config` is prepared for **CDC-Users** and it has the **Rosalind** cluster configurations.

5. Run the pipeline profile
```
