This a repository to analyze the data from the Cyano Interlab study.
This project contains the following directories:
data/
: contains raw data filesfigures/
: contains the figures for the manuscriptR/
: contains all functions for data loading and processing, and figure generationreports/
: contains reports to load raw data, process it and export it and to create the figures for the manuscript_targets
: directory created by the targets package required to run he workflowrenv
: directory created by the renv package required to manage project dependencies
Participants submitted data in standard spreadsheet templates from
experimental runs and reference measurements. These are stored
indata/experiments/
and data//reference/
respectively. In addition,
two extra datasets are needed to run the analysis. data/dilutions
contains two spreadsheets with information about the dilutions factors
applied by each participant in the micro-titer plates.
data/location_labels
contains a table to match the location name in
the raw files to the names that will be displayed in the figures.
The analysis of this project is done using a targets pipeline. This pipeline includes reading raw files, wrangling data, normalization of plate-reader measurements, and creating the figures and statistical analyses presented in the manuscript.
The code to run the pipeline is found in the _targets.R
file and can
be executed by running targets::tar_make()
. This pipeline makes use of
custom functions stored in the R/
directory. Below, we describe every
step of the process:
Raw data is loaded with the functions read_dilutions()
,
read_raw_experiment_data()
, and read_raw_reference_data()
. The first
function will create a regular data frame. Tast two, will create a
list-column tibble containing a row per lab, experimental run and
dataset, stored in the raw_data
column. These tibbles can be obtained
by providing the corresponding path and file name pattern to each
function or by using targets::tar_load()
after running the pipeline:
targets::tar_load(data.experiments.raw)
data.experiments.raw
## # A tibble: 170 × 5
## location sheet raw_data exper…¹ exper…²
## <chr> <chr> <list> <chr> <chr>
## 1 Amsterdam Fluorescence Plate Reader <tibble [28 × 11]> 202111… 202111…
## 2 Amsterdam OD Plate Reader <tibble [28 × 11]> 202111… 202111…
## 3 Amsterdam OD 730 nm Spectrophotometer <tibble [8 × 10]> 202111… 202111…
## 4 Amsterdam Chlorophyll 665 nm Spectrophoto <tibble [8 × 6]> 202111… 202111…
## 5 Amsterdam Spectrum Spectrophotometer <tibble [351 × 21]> 202111… 202111…
## 6 Amsterdam Fluorescence Plate Reader <tibble [28 × 11]> 202111… 202111…
## 7 Amsterdam OD Plate Reader <tibble [28 × 11]> 202111… 202111…
## 8 Amsterdam OD 730 nm Spectrophotometer <tibble [8 × 10]> 202111… 202111…
## 9 Amsterdam Chlorophyll 665 nm Spectrophoto <tibble [8 × 6]> 202111… 202111…
## 10 Amsterdam Spectrum Spectrophotometer <tibble [351 × 21]> 202111… 202111…
## # … with 160 more rows, and abbreviated variable names ¹experiment_date,
## # ²experiment_id
Two functions (process_experiments_data()
and
process_reference_data()
) are used to process the tibbles generated in
the previous step. These functions will wrangle the raw data into the
right format and process the plate-reader (background-signal correction
and reference-strain normalization of relative fluorescence units). The
output is a list-column tibble, containing a dataset per row. These
tibbles can be obtained by providing the corresponding datasets to each
function or by using targets::tar_load()
after running the pipeline:
targets::tar_load(data.experiments.processed)
data.experiments.processed
## # A tibble: 9 × 2
## data_id data
## <chr> <list>
## 1 pr.fl.raw <tibble [6,664 × 10]>
## 2 pr.od.raw <tibble [6,727 × 10]>
## 3 pr.fl <tibble [8,232 × 11]>
## 4 pr.od <tibble [8,295 × 11]>
## 5 pr.bc <tibble [1,904 × 16]>
## 6 pr.norm <tibble [1,904 × 17]>
## 7 sp.od <tibble [1,904 × 8]>
## 8 sp.full.spectrum <tibble [279,180 × 8]>
## 9 chl <tibble [360 × 9]>
The table shows the content of each dataset:
Dataset | Description |
---|---|
pr.fl.raw |
Long-format plate-reader fluorescence |
pr.od.raw |
Long-format plate-reader OD |
pr.fl |
Long-format, dilution corrected, plate-reader fluorescence |
pr.od |
Long-format, dilution corrected, plate-reader OD |
pr.bc |
Long-format, dilution corrected, background-signal corrected, plate-reader fluorescence |
pr.norm |
Long-format, dilution corrected, background-signal corrected, reference-strain normalized plate-reader RFU |
sp.od |
Long-format spectrophotometer OD |
sp.full.spectrum |
Long-format spectrophotometer full spectrum |
chl |
Long-format spectrophotometer chlorophyll content |
Processed datasets can be accessed with:
targets::tar_load(data.experiments.processed)
# example to extract spectrophotometer OD
data.experiments.processed |>
dplyr::filter(data_id == "sp.od") |>
tidyr::unnest(data) |>
dplyr::select(-data_id)
## # A tibble: 1,904 × 8
## location strain induction bio_replicate experiment_d…¹ exper…² OD_730 time_h
## <chr> <chr> <chr> <dbl> <chr> <chr> <dbl> <dbl>
## 1 Amsterdam EVC - 1 20211118 202111… 0.532 0
## 2 Amsterdam EVC - 1 20211118 202111… 0.501 2
## 3 Amsterdam EVC - 1 20211118 202111… 0.574 4
## 4 Amsterdam EVC - 1 20211118 202111… 0.584 5
## 5 Amsterdam EVC - 1 20211118 202111… 0.627 6
## 6 Amsterdam EVC - 1 20211118 202111… 0.658 7
## 7 Amsterdam EVC - 1 20211118 202111… 1.89 24
## 8 Amsterdam EVC + 1 20211118 202111… 0.532 0
## 9 Amsterdam EVC + 1 20211118 202111… 0.512 2
## 10 Amsterdam EVC + 1 20211118 202111… 0.542 4
## # … with 1,894 more rows, and abbreviated variable names ¹experiment_date,
## # ²experiment_id
The code to produce each figure in the manuscript can be found in the
R/
directory. Each figure is created with a custom function stored in
a separate file. The targets
pipeline includes these functions and the
output is stored in the figures/
directory (.tiff files are not
include in this repository due to file size).
We provide renv
files to facilitate dependency management. We include the .Rprofile
file so when cloning the repository and opening the project, renv
should be automatically downloaded and installed and project
dependencies can be restore with renv::restore()
. See this
resource
for futher details.