Error in DGEList(counts = data, genes = rownames(data)) : non-numeric values found in counts #282

Closed
mdozmorov opened this issue Sep 9, 2023 · 3 comments
Assignees
Labels
bug Something isn't working

Comments

@mdozmorov

Description of the bug

This issue is identical to #218, which was closed but not resolved. After debugging, I found that the data object supplied to DGEList(counts=data,genes=rownames(data)) has 0 rows and 0 columns:

dataDGE<-DGEList(counts=data,genes=rownames(data))

The preceding code filters out rows and columns that have only zeros, but no code checks the data dimensions afterward.

This occurred when I analyzed two samples to see whether the pipeline runs. I'm new to Nextflow and don't know why my samples produce such results, or how the output of edgeR_miRBase.r, or the lack of it, will affect the downstream steps.
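One way to check whether the inputs themselves are empty is to sum the mapped-read column of each idxstats file passed to edgeR_miRBase.r. The following is a minimal sketch, not part of the pipeline: it assumes the standard samtools idxstats layout (reference name, length, mapped reads, unmapped reads, tab-separated), and the function name `idxstats_summary` and the demo file are hypothetical.

```shell
#!/usr/bin/env bash
# Summarise a samtools idxstats file. Columns (tab-separated) are:
# reference name, reference length, mapped reads, unmapped reads.
# Prints "<references with at least one mapped read> <total mapped reads>".
# The trailing "*" line (reads with no reference) is skipped.
idxstats_summary() {
  awk -F'\t' '$1 != "*" { total += $3; if ($3 > 0) nonzero++ }
              END { printf "%d %d\n", nonzero, total }' "$1"
}

# Hypothetical demo input standing in for a file such as
# sample01_mature.sorted.idxstats (values are made up).
demo=$(mktemp)
printf 'hsa-let-7a-5p\t22\t10\t0\n' >  "$demo"
printf 'hsa-miR-21-5p\t22\t5\t0\n'  >> "$demo"
printf 'hsa-miR-1-3p\t22\t0\t0\n'   >> "$demo"
printf '*\t0\t0\t7\n'               >> "$demo"

idxstats_summary "$demo"   # prints: 2 15
rm -f "$demo"
```

Run against the four files named in the failing command, a result of `0 0` for every file would be consistent with the count table ending up with 0 rows and 0 columns once all-zero rows and columns are filtered out.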

Command used and terminal output

DIRIN=/Users/bluedot/data/WorkData/2023-08.miRNA-seq
INPUT=${DIRIN}/samplesheet_full.csv
DIROUT=${DIRIN}/OUT_test
GENOME=/Users/bluedot/data/ExtData/UCSC/hg38/hg38.fa
BWAINDEX=/Users/bluedot/data/ExtData/UCSC/hg38/hg38.fa
CHROMSIZE=/Users/bluedot/data/ExtData/UCSC/genometable.hg38.txt
MIRBASE_GFF=/Users/bluedot/data/ExtData/UCSC/hg38/hsa.gff3
MIRBASE_MATURE=/Users/bluedot/data/ExtData/UCSC/hg38/mature.fa
MIRBASE_HAIRPIN=/Users/bluedot/data/ExtData/UCSC/hg38/hairpin.fa

nextflow run nf-core/smrnaseq --input ${INPUT} \
  --outdir ${DIROUT} \
  -profile 'singularity' \
  --genome GRCh38 \
  --mirtrace_species 'hsa' \
  --protocol 'illumina' \
  --mirna_gtf ${MIRBASE_GFF} \
  --mature ${MIRBASE_MATURE} \
  --hairpin ${MIRBASE_HAIRPIN} \
  -resume
====

ERROR ~ Error executing process > 'NFCORE_SMRNASEQ:SMRNASEQ:MIRNA_QUANT:EDGER_QC'

Caused by:
  Process `NFCORE_SMRNASEQ:SMRNASEQ:MIRNA_QUANT:EDGER_QC` terminated with an error exit status (1)

Command executed:

  edgeR_miRBase.r sample01_mature.sorted.idxstats sample09_mature.sorted.idxstats sample01_mature_hairpin.sorted.idxstats sample09_mature_hairpin.sorted.idxstats

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_SMRNASEQ:SMRNASEQ:MIRNA_QUANT:EDGER_QC":
      r-base: $(echo $(R --version 2>&1) | sed 's/^.*R version //; s/ .*$//')
      limma: $(Rscript -e "library(limma); cat(as.character(packageVersion('limma')))")
      edgeR: $(Rscript -e "library(edgeR); cat(as.character(packageVersion('edgeR')))")
      data.table: $(Rscript -e "library(data.table); cat(as.character(packageVersion('data.table')))")
      gplots: $(Rscript -e "library(gplots); cat(as.character(packageVersion('gplots')))")
      methods: $(Rscript -e "library(methods); cat(as.character(packageVersion('methods')))")
      statmod: $(Rscript -e "library(statmod); cat(as.character(packageVersion('statmod')))")
  $hairpin
  [1] "sample01_mature_hairpin.sorted.idxstats"
  [2] "sample09_mature_hairpin.sorted.idxstats"

Command error:
  INFO:    Environment variable SINGULARITYENV_NXF_DEBUG is set, but APPTAINERENV_NXF_DEBUG is preferred

  Attaching package: ‘gplots’

  The following object is masked from ‘package:stats’:

  $mature
  [1] "sample01_mature.sorted.idxstats"
  [2] "sample09_mature.sorted.idxstats"

  $hairpin
  [1] "sample01_mature_hairpin.sorted.idxstats"
  [2] "sample09_mature_hairpin.sorted.idxstats"

  Error in DGEList(counts = data, genes = rownames(data)) :
    non-numeric values found in counts
  Execution halted

Relevant files

No response

System information

Nextflow version: 23.04.1.5866
Hardware: HPC
Executor: local
Container engine: singularity
OS: CentOS Linux 7
Version of nf-core/smrnaseq: 2.2.2

@mdozmorov mdozmorov added the bug Something isn't working label Sep 9, 2023
@mdozmorov
Author

The following helped me to complete the pipeline.

nextflow run nf-core/smrnaseq --input ${INPUT} \
  --outdir ${DIROUT} \
  -profile 'singularity' \
  --genome GRCh38 \
  --mirtrace_species 'hsa' \
  --protocol 'illumina' \
  --skip_mirdeep \
  -resume \
  -r fix-mirtop-gff

@christopher-mohr
Contributor

Hi @mdozmorov, can you confirm that this issue no longer occurs in 2.3.0?

@apeltzer apeltzer added this to smrnaseq Aug 8, 2024
@apeltzer apeltzer added this to the 2.4.0 milestone Aug 20, 2024
@atrigila atrigila self-assigned this Aug 23, 2024
@atrigila
Contributor

I tried to reproduce this error in the latest dev version, using two samples as in the original post, but I couldn't. The pipeline completed successfully.

Code:

DIRIN=/workspace/
INPUT=${DIRIN}/smrnaseq/assets/samplesheet.csv
DIROUT=${DIRIN}/test_issue_282
# MIRBASE_GFF: This file is downloaded automatically if using iGenomes
# MIRBASE_MATURE: This file is downloaded automatically from mirbase if not provided
# MIRBASE_HAIRPIN: This file is downloaded automatically from mirbase if not provided

nextflow run nf-core/smrnaseq \
  --input ${INPUT} \
  --outdir ${DIROUT} \
  -profile illumina,singularity \
  --genome GRCh38 \
  --mirtrace_species 'hsa' \
  -r dev

Samplesheet:

sample,fastq_1
Clone1_N1,s3://ngi-igenomes/test-data/smrnaseq/C1-N1-R1_S4_L001_R1_001.fastq.gz
Clone9_N1,s3://ngi-igenomes/test-data/smrnaseq/C9-N1-R1_S7_L001_R1_001.fastq.gz

@github-project-automation github-project-automation bot moved this from On Hold to Done in smrnaseq Aug 26, 2024