Skip to content

Latest commit

 

History

History
224 lines (206 loc) · 6.74 KB

readme.org

File metadata and controls

224 lines (206 loc) · 6.74 KB

Simulation configuration

There are a sequence of configurations: Charmander, Charmeleon and Charizard. These all use the same model but are of increasing size and use broader distributions over the simulation parameters.

Charmander

This is intended as a toy dataset. It has a 800-100-100 training-validation-testing split. The parameters are nearly constant through time, for example, the $R_0$ values are shown in Figure fig:charmander-r0s. The configuration for this simulation is simulation-charmander.json.

../out/sim-charmander/plots/r0_trajectories.png

Charmander contemporaneous

This is very similar to the Charmander configuration but instead of serial sampling, there is a single contemporaneous sample at the present.

Charmeleon

This is intended as a small dataset. It has a 1600-200-200 training-validation-testing split. The parameters vary significantly through time, for example, the $R_0$ values are shown in Figure fig:charmeleon-r0s. The configuration for this simulation is simulation-charmeleon.json.

../out/sim-charmeleon/plots/r0_trajectories.png

Charizard

This is intended as a plausible dataset for use in training a useful neural network. It has a 8000-1000-1000 training-validation-testing split (although there are 11000 simulations attempted to adjust for failures). The parameters vary significantly through time, for example, the $R_0$ values are shown in Figure fig:charizard-r0s. The configuration for this simulation is simulation-charizard.json.

../out/sim-charizard/plots/r0_trajectories.png

Notes

  • You can validate a simulation configuration against the schema using one of the many free online tools:
  • simulation-hyperparameters.contemporaneous_sample should be true for a contemporaneous sample and false for serial sampling (the default value). If there is serial sampling, the simulation-hyperparameters.sampling_prop_bounds is used as the sampling proportion given removal. If there is a contemporaneous sample, this is the probability of an extant lineage being included in the sample.
  • simulation-hyperparameters.report-temporal-data can be set to true in order to capture temporal data from the simulation in the resulting database (false by default). If this parameter is set to true, simulation-hyperparameters.num-temp-measurements (an integer) must also be specified. This is the number of randomly selected time points between the start of the epidemic and the present at which data is reported.
  • simulation-hyperparameters.limited_time_sampling can be set to true in order to implement sampling over limited time only (false by default). If this parameter is set to true, the sampling proportion is zero until a random change point, uniformly distributed over the duration of the epidemic. If false, the sampling proportion is nonzero throughout and changes in sync with the other parameters of the epidemic.

Examples

  • debugging.json is a simple example of a configuration.
  • debugging-limited-time-sampling.json is an example demonstrating the use of the limited_time_sampling flag.
  • debugging-measurement-times.json is an example demonstrating the use of the report_temporal_data flag.
  • simulation-charmander.json small simulation
  • simulation-charmeleon.json medium simulation
  • simulation-charizard.json large simulation
  • simulation-bulbasaur.json small simulation with limited-time sampling
  • simulation-ivysaur.json medium simulation with limited-time sampling
  • simulation-venusaur.json large simulation with limited-time sampling

Schema

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "simulation-name": {
      "type": "string"
    },
    "output-hdf5": {
      "type": "string"
    },
    "seed": {
      "type": "integer"
    },
    "remaster-xml": {
      "type": "string"
    },
    "num-simulations": {
      "type": "integer"
    },
    "num-workers": {
      "type": "integer"
    },
    "simulation-hyperparameters": {
      "type": "object",
      "properties": {
        "duration-range": {
          "type": "array",
          "items": [
            {
              "type": "integer"
            },
            {
              "type": "integer"
            }
          ]
        },
        "num-changes": {
          "type": "array",
          "items": [
            {
              "type": "integer"
            },
            {
              "type": "integer"
            }
          ]
        },
        "shrinkage-factor": {
          "type": "number"
        },
        "r0_bounds": {
          "type": "array",
          "items": [
            {
              "type": "number"
            },
            {
              "type": "number"
            }
          ]
        },
        "net_rem_rate_bounds": {
          "type": "array",
          "items": [
            {
              "type": "number"
            },
            {
              "type": "number"
            }
          ]
        },
        "sampling_prop_bounds": {
          "type": "array",
          "items": [
            {
              "type": "number"
            },
            {
              "type": "number"
            }
          ]
        },
        "contemporaneous_sample": {
          "type": "boolean"
        },
        "report-temporal-data": {
            "type": "boolean"
        },
        "num-temp-measurements": {
            "type": "integer"
        },
        "limited_time_sampling": {
            "type": "boolean"
        }
      },
      "required": [
        "duration-range",
        "num-changes",
        "shrinkage-factor",
        "r0_bounds",
        "net_rem_rate_bounds",
        "sampling_prop_bounds",
      ]
    }
  },
  "required": [
    "simulation-name",
    "output-hdf5",
    "seed",
    "remaster-xml",
    "num-simulations",
    "simulation-hyperparameters"
  ]
}