Simulation Tutorial

Here is an example run through the simulation process for a simple case, from beginning to end.

Setup

First, we set up the example by loading packages and defining a random number generator for reproducibility. Monty relies heavily on components of GeoStats.jl for geometric primitives/operations and spatial covariation models.

using Monty, Random, Distributions, Meshes
using GeoStatsFunctions, GeoStatsProcesses

import CairoMakie as Mke

const rng = Xoshiro(3)

Deployment Geometry

Now we define the geometry for our treatment and control areas. They can be anything, but shouldn't overlap. Here I use a simple square, 100 m by 100 m, with a portion of the upper right corner alloted to control samples.

# polygon for a treatment area
treatment = PolyArea([(0, 0), (0, 100), (80, 100), (80,50), (100,50), (100,0)])
control = Box(Point(80,50), Point(100,100))
Example block output

Sample Plan

Next, we need to define a sample plan.

  • In this case, we'll collect samples at three different times for the imaginary deployment.
  • We will also collect the samples at the same locations each time: a paired sample plan.
  • Even though they are revisited for each round of sampling, the sample locations will be randomized.
  • Arbitrarily, we'll get 20 samples in the treatment zone and 5 samples in the control zone.

Generating this kind of plan is easy.

plan = pairedsampleplan(rng, treatment, 20, control, 5, [-0.03, 0.05, 1.0])
SamplePlan{Float64)
├─ total samples = 75
├─ sampling rounds = 3
├─ sampling times = [-0.03, 0.05, 1.0]
└─ bounding box
  ├─ min = Point(2.7541792822504574, 9.211427208511214)
  └─ max = Point(99.99794890034651, 99.91219704388106)

The readout indicates that we have 75 total samples over three rounds of sampling: (20 x 3) + (5 x 3). The sample locations are shown below.

Example block output

We can convert the plan into a geo-referenced table using the GeoTable constructor, which is reexported by Monty. This is useful for inspection and for exporting the information in a variety of formats (see GeoIO.jl).

GeoTable(plan)
75×5 GeoTable over 75 PointSet{2,Float64}
location round time control geometry
Categorical Categorical Continuous Categorical Point2
[NoUnits] [NoUnits] [NoUnits] [NoUnits]
1 1 -0.03 false (33.9735, 95.9233)
1 2 0.05 false (33.9735, 95.9233)
1 3 1.0 false (33.9735, 95.9233)
2 1 -0.03 false (99.9979, 9.21143)
2 2 0.05 false (99.9979, 9.21143)
2 3 1.0 false (99.9979, 9.21143)
3 1 -0.03 false (69.8426, 49.6285)
3 2 0.05 false (69.8426, 49.6285)
3 3 1.0 false (69.8426, 49.6285)
4 1 -0.03 false (79.654, 50.3309)

Realized Core Locations

Now we execute the sample plan. This means that the target locations are visited, with some error in their locations, and the individual cores for each sample are assigned locations. This is always done using executeplan! or executeplan.

In this case we use 6 cores for each sample, arranged in a "hub and spoke" pattern, meaning a circle with a central point. The core stencil has a radius of 2 meters. We also define some spatial error (jitter) for the sample locations and the individual cores.

samp = executeplan(
    plan,
    stencil=HubSpokeStencil(6, 2.0),
    samplerjitter=Jitter(2.0, seed=12),
    corejitter=Jitter(0.1, seed=11)
)
CoreSet{Float64}
├─ total samples = 75
├─ cores per sample = 6
├─ total cores = 450
└─ bounding box
  ├─ min = Point(0.06900390018592663, 9.039310915323108)
  └─ max = Point(103.48809101193869, 103.57134753245931)

Executed plans are stored in a CoreSet, which is just a matrix of the locations. The readout indicates we now have 450 cores, 6 for each of our 75 samples. Here are the cores, this time with the different colors representing the different sampling rounds.

Example block output

The realized core locations are not associated with the rest of the metadata in the sample plan, but we can make a GeoTable by calling the constructor with both groups of information. This time the table includes the "core" column, which indicates the core number for each sample location.

GeoTable(samp, plan)
450×6 GeoTable over 450 PointSet{2,Float64}
location round time control core geometry
Categorical Categorical Continuous Categorical Categorical Point2
[NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits]
1 1 -0.03 false 1 (34.7186, 98.1325)
1 1 -0.03 false 2 (36.5343, 97.8874)
1 1 -0.03 false 3 (35.0816, 99.8953)
1 1 -0.03 false 4 (32.8894, 99.2637)
1 1 -0.03 false 5 (33.1171, 96.7801)
1 1 -0.03 false 6 (35.3641, 95.987)
1 2 0.05 false 1 (31.05, 101.339)
1 2 0.05 false 2 (33.0346, 101.275)
1 2 0.05 false 3 (31.6193, 103.193)
1 2 0.05 false 4 (29.5978, 102.566)

Simulation

We have all the information about sample locations and times that we need, in addition to whether samples are controls.

To simulate, we define a container that makes it easy to keep track of simulated quantities and set their values. This is the Simulation type. Because it's just a container, we only need the number of samples and cores per sample to allocate one.

Below a simulation is started with Ca, Mg, and Zr as the target analytes/elements. By passing samp to the constructor, the simulation is created with the correct size automatically.

sim = Simulation((:Ca, :Mg, :Zr), samp)
Simulation{Float64} with 3 analytes: Ca, Mg, Zr
├─ total samples = 75
├─ cores per sample = 6
├─ total cores = 450
└─ simulated parameters μ ± σ (rsd) ∈ [max, min]
  ├─ core cross-sectional area (a)
  │    0.00125
  ├─ applied feedstock (Q)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock fraction (γ)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ core depth (d)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock bulk density (ρf)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock concentrations (cf)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ soil bulk density (ρs)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ background soil concentrations (cs)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock leached fractions (𝓁)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  └─ feedstock mass loss fraction (ℒ)
       NaN ± NaN (NaN %) ∈ [NaN, NaN]

The readout for a Simulation is long, but it's an exhaustive summary of the information inside the container and it indicates if fields have not been defined yet. Undefined fields have NaN values in their summaries. Initially, everything except the core cross-sectional area is undefined. This area is usually not important so it's assigned a default value, but it can also be declared.

Now we have to fill in the parameters of a simulation, defining each of them for every core in the deployment.

Application Rate

First, applied feedstock. I'll assume an application of, on average, 3 kg/m$^2$. I ignore moisture for now. Feedstock is generally applied by spreaders hitched to tractors, which are driven over fields in linear passes. As such, we might assume the variability in application rate over our treatment area has some spatial structure. Maybe it has a vaguely striped and streaky pattern.

This can be represented using an appropriate variogram, which defines correlations between points based on their distances from each other. A lot can be said about variograms, but here I will just demonstrate how to use one and use it to fill in the application rate for the simulation. Instead of a Variogram I define a Covariance which is more convenient and what the Monty functions expect.

# average application rate
Q = 3.0
# this "ball" defines the range on each axis
ball = MetricBall((5.0, 1000.0))
# spatial structure
C = CubicCovariance(
    ball,
    nugget=(0.1Q)^2, # 10 % nugget effect
    sill=(0.25Q)^2,  # 25 % overall standard deviation
)
CubicCovariance (anisotropic)
├─ sill: 0.5625
├─ nugget: 0.09000000000000002
└─ ranges: (5.0, 1000.0)
└─ distance: Mahalanobis

What does this structure look like? It's helpful to realize the spatial structure on a dense grid, to get a sense for it. One realization is shown below. The overall structure is streaky with some noise over very short distances from the nugget effect.

Example block output

From the covariance type defined above, Monty will assemble a multivariate Gaussian distribution with the proper covariance matrix. This distribution can then be used to fill in the application rate for the simulation once or, for repeated simulations, as many times as desired.

The constructor needs our CoreSet, which we named samp, because samp contains all the realized core locations. There is also a spreading! function that draws from the distribution and puts the values into the correct places, ignoring control points automatically.

X = Monty.MvNormal(Q, C, samp)
spreading!(rng, sim, plan, X)
sim
Simulation{Float64} with 3 analytes: Ca, Mg, Zr
├─ total samples = 75
├─ cores per sample = 6
├─ total cores = 450
└─ simulated parameters μ ± σ (rsd) ∈ [max, min]
  ├─ core cross-sectional area (a)
  │    0.00125
  ├─ applied feedstock (Q)
  │    2.43 ± 1.38 (56.8 %) ∈ [0, 4.86]
  ├─ feedstock fraction (γ)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ core depth (d)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock bulk density (ρf)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock concentrations (cf)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ soil bulk density (ρs)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ background soil concentrations (cs)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock leached fractions (𝓁)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  └─ feedstock mass loss fraction (ℒ)
       NaN ± NaN (NaN %) ∈ [NaN, NaN]

Notice that the readout for the simulation has been updated. It shows a non-NaN summary for Q, which is our symbol for the application rate. The summary shows the mean value, the standard deviation, and the extrema. Note that the mean value is not 3, as expected. This is because of the control samples. Of our 25 sample locations, 20 are treatment and 5 are control. The expected average is 2.4, close to what we see above.

Pre-Spreading Soil Concentrations

It's reasonable to expect the baseline soil, before any feedstock is spread, to have some spatial structure as well. We expect calcium, magnesium, and zirconium concentrations in the soil to be similar to each other over short distances. Further, we might expect calcium and magnesium to be cross-correlated, meaning we tend to find high calcium wherever we find high magnesium, and vice versa.

This can also be represented in Monty. For two cross-correlated fields like Ca and Mg, Monty has a GaussianCosimulator type that handles spatial autocorrelation in two analytes and cross-correlation between them.

# mean values for Ca and Mg in soil
μ = (Ca=0.002, Mg=0.001)
c = SphericalCovariance(
    nugget=1e-4^2,
    sill=3e-4^2,
    range=20.0
)
ρ = 0.7
gc = GaussianCosimulator(samp, μ, c, ρ)
GaussianCosimulator on analytes (:Ca, :Mg)
├─ 450 locations
└─ cross correlation coefficient = 0.7

This type also has a convenience function associated with it, to put the realization into the simulation.

soilconcentration!(rng, sim, gc)

Here is what the baseline soil concentrations actually look like:

Example block output

Finally, we assume no spatial structure for Zr, just to keep things moving.

μ = 10e-6 # 10 ppm on average
X = Normal(μ, μ/5)
soilconcentration!(rng, sim, :Zr, X)

The simulation readout is getting filled in.

Simulation{Float64} with 3 analytes: Ca, Mg, Zr
├─ total samples = 75
├─ cores per sample = 6
├─ total cores = 450
└─ simulated parameters μ ± σ (rsd) ∈ [max, min]
  ├─ core cross-sectional area (a)
  │    0.00125
  ├─ applied feedstock (Q)
  │    2.43 ± 1.38 (56.8 %) ∈ [0, 4.86]
  ├─ feedstock fraction (γ)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ core depth (d)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock bulk density (ρf)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock concentrations (cf)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ soil bulk density (ρs)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ background soil concentrations (cs)
  │    Ca: 0.00199 ± 0.000303 (15.2 %) ∈ [0.000973, 0.00284]
  │    Mg: 0.000962 ± 0.000322 (33.5 %) ∈ [0.000123, 0.00195]
  │    Zr: 9.92e-06 ± 2.02e-06 (20.3 %) ∈ [2.79e-06, 1.58e-05]
  ├─ feedstock leached fractions (𝓁)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  └─ feedstock mass loss fraction (ℒ)
       NaN ± NaN (NaN %) ∈ [NaN, NaN]

Sample Depth & Mixing

The sample depth and the mixing profile of feedstock in soil interact. In general, deeper samples contain more soil per unit feedstock because most of the feedstock is near the surface. But it's also possible for shallow samples to collect only part of the original applied feedstock because it has been mixed more deeply than the sample depth.

Monty handles sample depth and mixing profile jointly. The sample depth can be defined by a non-negative probability distribution and the mixing profile by a distribution over depth for each core. For example, imagine the feedstock mixing profile looks like a wedge, tapering off to zero at some depth which varies naturally from core to core. The sample depth also varies independently at each location.

triangularmixing!(
    rng,
    sim,
    depth=truncated(Normal(0.125, 0.025), 0.0, Inf),
    upper=Exponential(0.05)
)

This operation sets the depth (d) and the feedstock fraction (γ) for all cores in the simulation. Here is the updated read-out.

sim
Simulation{Float64} with 3 analytes: Ca, Mg, Zr
├─ total samples = 75
├─ cores per sample = 6
├─ total cores = 450
└─ simulated parameters μ ± σ (rsd) ∈ [max, min]
  ├─ core cross-sectional area (a)
  │    0.00125
  ├─ applied feedstock (Q)
  │    2.43 ± 1.38 (56.8 %) ∈ [0, 4.86]
  ├─ feedstock fraction (γ)
  │    0.985 ± 0.0633 (6.4 %) ∈ [0.357, 1]
  ├─ core depth (d)
  │    0.126 ± 0.0255 (20.3 %) ∈ [0.00797, 0.202]
  ├─ feedstock bulk density (ρf)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock concentrations (cf)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ soil bulk density (ρs)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ background soil concentrations (cs)
  │    Ca: 0.00199 ± 0.000303 (15.2 %) ∈ [0.000973, 0.00284]
  │    Mg: 0.000962 ± 0.000322 (33.5 %) ∈ [0.000123, 0.00195]
  │    Zr: 9.92e-06 ± 2.02e-06 (20.3 %) ∈ [2.79e-06, 1.58e-05]
  ├─ feedstock leached fractions (𝓁)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  └─ feedstock mass loss fraction (ℒ)
       NaN ± NaN (NaN %) ∈ [NaN, NaN]

Leaching & Mass Loss

Just a couple more quantities worth attention before generating a synthetic dataset.

The amount of an element left in feedstock as time proceeds is defined by a "leaching model" in Monty. These models simply define functions that are

  • equal to one for time < 0
  • equal to zero for time = 0
  • monotonically increasing after time = 0

For example, here is a leaching model that represents exponential loss of an element over time, with a seasonal fluctuation on top. The value of the line represents the fraction of feedstock present/lost. Before time zero, this is always equal to one because no feedstock has been spread yet.

w = SeasonalLeaching(λ=2.0, C=0.9, power=1, phase=π / 2)
t = LinRange(-1.0, 3.0, 1000)
fig = Mke.Figure()
ax = Mke.Axis(
    fig[1, 1],
    title="Seasonal Leaching Model",
    xlabel="Time [years]",
    ylabel="Feedstock Loss Fraction [-]"
)
Mke.lines!(ax, t, w.(t))
Example block output

We can apply these models to mobile cations in the simulation. Immobile elements must also be assigned leaching models (no leaching).

leaching!(sim, :Ca, ExponentialLeaching(λ=0.8), plan)
leaching!(sim, :Mg, ExponentialLeaching(λ=1.2), plan)
leaching!(sim, :Zr, NoLeaching(), plan)

Additionally, we can define the mass loss fraction for all the feedstock, not just elements that are leaching out of it. For example, the material may have lost 30% of a mobile element, but 50 % of its total mass because it's composed of other elements like silicon and oxygen that are also liberated.

The total mass loss can be any function of the individual elemental loss fractions that produces a fractional value between zero and one. It should make physical sense, but Monty will not check that for you.

To keep it simple, I'll just use an average of the Ca and Mg loss fractions.

massloss!(sim, plan, x -> (x[:Ca] / 2 + x[:Mg] / 2))
Simulation{Float64} with 3 analytes: Ca, Mg, Zr
├─ total samples = 75
├─ cores per sample = 6
├─ total cores = 450
└─ simulated parameters μ ± σ (rsd) ∈ [max, min]
  ├─ core cross-sectional area (a)
  │    0.00125
  ├─ applied feedstock (Q)
  │    2.43 ± 1.38 (56.8 %) ∈ [0, 4.86]
  ├─ feedstock fraction (γ)
  │    0.985 ± 0.0633 (6.4 %) ∈ [0.357, 1]
  ├─ core depth (d)
  │    0.126 ± 0.0255 (20.3 %) ∈ [0.00797, 0.202]
  ├─ feedstock bulk density (ρf)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ feedstock concentrations (cf)
  │    Ca: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Mg: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  │    Zr: NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ soil bulk density (ρs)
  │    NaN ± NaN (NaN %) ∈ [NaN, NaN]
  ├─ background soil concentrations (cs)
  │    Ca: 0.00199 ± 0.000303 (15.2 %) ∈ [0.000973, 0.00284]
  │    Mg: 0.000962 ± 0.000322 (33.5 %) ∈ [0.000123, 0.00195]
  │    Zr: 9.92e-06 ± 2.02e-06 (20.3 %) ∈ [2.79e-06, 1.58e-05]
  ├─ feedstock leached fractions (𝓁)
  │    Ca: 0.53 ± 0.393 (74.1 %) ∈ [0.0392, 1]
  │    Mg: 0.586 ± 0.393 (67.1 %) ∈ [0.0582, 1]
  │    Zr: 0.333 ± 0.472 (141.6 %) ∈ [0, 1]
  └─ feedstock mass loss fraction (ℒ)
       0.224 ± 0.284 (126.5 %) ∈ [0, 0.625]

Other Physical Quantities

The rest of the parameters are uncomplicated. I'll assign them independently with appropriate distributions.

# feedstock bulk density
sim.ρf .= 2e3
# soil bulk density
rand!(rng, Normal(1e3, 100), sim.ρs)
# feedstock concentrations with 2 % variability
feedstockconcentration!(rng, sim, (Ca=0.07, Mg=0.06, Zr=0.001), 0.02)

Now the simulation has complete information. Here's the readout.

Simulation{Float64} with 3 analytes: Ca, Mg, Zr
├─ total samples = 75
├─ cores per sample = 6
├─ total cores = 450
└─ simulated parameters μ ± σ (rsd) ∈ [max, min]
  ├─ core cross-sectional area (a)
  │    0.00125
  ├─ applied feedstock (Q)
  │    2.43 ± 1.38 (56.8 %) ∈ [0, 4.86]
  ├─ feedstock fraction (γ)
  │    0.985 ± 0.0633 (6.4 %) ∈ [0.357, 1]
  ├─ core depth (d)
  │    0.126 ± 0.0255 (20.3 %) ∈ [0.00797, 0.202]
  ├─ feedstock bulk density (ρf)
  │    2e+03 ± 0 (0.0 %) ∈ [2e+03, 2e+03]
  ├─ feedstock concentrations (cf)
  │    Ca: 0.07 ± 0.00138 (2.0 %) ∈ [0.0662, 0.0738]
  │    Mg: 0.0601 ± 0.00124 (2.1 %) ∈ [0.056, 0.0635]
  │    Zr: 0.001 ± 1.97e-05 (2.0 %) ∈ [0.000945, 0.00106]
  ├─ soil bulk density (ρs)
  │    1e+03 ± 103 (10.3 %) ∈ [702, 1.41e+03]
  ├─ background soil concentrations (cs)
  │    Ca: 0.00199 ± 0.000303 (15.2 %) ∈ [0.000973, 0.00284]
  │    Mg: 0.000962 ± 0.000322 (33.5 %) ∈ [0.000123, 0.00195]
  │    Zr: 9.92e-06 ± 2.02e-06 (20.3 %) ∈ [2.79e-06, 1.58e-05]
  ├─ feedstock leached fractions (𝓁)
  │    Ca: 0.53 ± 0.393 (74.1 %) ∈ [0.0392, 1]
  │    Mg: 0.586 ± 0.393 (67.1 %) ∈ [0.0582, 1]
  │    Zr: 0.333 ± 0.472 (141.6 %) ∈ [0, 1]
  └─ feedstock mass loss fraction (ℒ)
       0.224 ± 0.284 (126.5 %) ∈ [0, 0.625]

Generating Measurements

Once all the information is in place, the rest is very easy. There are three simple steps

  1. apply the mixing model to each core, using the values defined in the simulation
  2. composite the cores for each sample (by summing them)
  3. "measure" the composite samples by applying analytical noise to the concentrations and the sample masses

The simulation already has memory allocated for these operations, so they happen internally.

core!(sim)
sim.cores[1,1]
Sample{Float64} (3 analytes, 1 core)
mass = 0.12623262252137166
├─ Ca 0.0014039004547531796
├─ Mg 0.00035040916963150615
└─ Zr 9.042575656169531e-6
composite!(sim)
sim.composites[1]
Sample{Float64} (3 analytes, 6 cores)
mass = 0.893787259936671
├─ Ca 0.0014619011912613343
├─ Mg 0.0004096586109330959
└─ Zr 1.076613413571555e-5

Here I use measurement noise of 4 %, 5 %, and 7 %, representing error through the whole process of splitting, preparing, and analyzing the samples. I also use 0.5 % error for the mass.

measure!(rng, sim, (Ca=0.04, Mg=0.05, Zr=0.07), 0.005)
sim.measurements[1]
Measurement{Float64} (3 analytes)
mass = 0.8918037513521073
├─ Ca = 0.0015464584487153632
├─ Mg = 0.0004071975912714168
└─ Zr = 1.1853471656283414e-5

Alternatively, all of this can be executed at once with the analyze! function, which calls core!, composite!, and measure! for you.

analyze!(rng, sim, (Ca=0.04, Mg=0.05, Zr=0.07), 0.005)
sim.measurements[1]
Measurement{Float64} (3 analytes)
mass = 0.8869860838146877
├─ Ca = 0.0014720303824193723
├─ Mg = 0.00039804103987457783
└─ Zr = 1.1389874760469995e-5

The final, simulated measurements are all in the sim.measurements vector. They can be arranged into a GeoTable, and then exported in whatever format is convenient (again, see GeoIO.jl).

table = GeoTable(sim, samp, plan)
75×9 GeoTable over 75 PointSet{2,Float64}
location round time control Ca Mg Zr mass geometry
Categorical Categorical Continuous Categorical Continuous Continuous Continuous Continuous Point2
[NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits] [NoUnits]
1 1 -0.03 false 0.00147203 0.000398041 1.13899e-5 0.886986 (34.6175, 97.991)
1 2 0.05 false 0.00283526 0.00165832 3.66167e-5 0.930776 (31.0702, 101.309)
1 3 1.0 false 0.00231703 0.000991857 3.19985e-5 0.921573 (32.2801, 95.1252)
2 1 -0.03 false 0.00198404 0.00126061 8.40728e-6 1.01564 (99.8688, 12.8281)
2 2 0.05 false 0.00325977 0.00214506 3.07553e-5 0.959037 (98.9039, 10.8534)
2 3 1.0 false 0.00291195 0.00169793 3.61131e-5 0.967251 (101.44, 11.8447)
3 1 -0.03 false 0.00172326 0.000648939 1.15438e-5 0.948495 (72.1649, 52.1983)
3 2 0.05 false 0.0036395 0.00232167 3.67911e-5 0.935963 (68.0087, 48.5853)
3 3 1.0 false 0.00292917 0.00144091 3.76838e-5 0.917773 (69.4531, 48.5659)
4 1 -0.03 false 0.00179006 0.00064769 9.55835e-6 0.871219 (80.8013, 49.4805)

The geometry column above represents an average of the realized core locations. To use only the planned locations, omit samp from the constructor with GeoTable(sim, plan).

Analysis

With a synthetic dataset in hand, we can do whatever we want with it. In many cases, much of the process above will be repeated to generate a large number of datasets using the same variability and physical properties. After defining a sample plan, we can realize lots of core sets and then generate lots of hypothetical samples and measurements. There are a few more tricks to run repeated simulations efficiently, with zero memory overhead, using the in-place methods highlighted above. An example is also shown on the Efficient Simulation page.

Here, however, I'll just do a basic visualization of the individual dataset generated above. First, it can be converted to a DataFrame.

using DataFrames

df = table |> DataFrame
df[1:5,:]
5×9 DataFrame
RowlocationroundtimecontrolCaMgZrmassgeometry
UInt16UInt16Float64BoolFloat64Float64Float64Float64Point…
111-0.03false0.001472030.0003980411.13899e-50.886986(34.6175, 97.991)
2120.05false0.002835260.001658323.66167e-50.930776(31.0702, 101.309)
3131.0false0.002317030.0009918573.19985e-50.921573(32.2801, 95.1252)
421-0.03false0.001984040.001260618.40728e-61.01564(99.8688, 12.8281)
5220.05false0.003259770.002145063.07553e-50.959037(98.9039, 10.8534)

Here is a summary of the concentrations in each round, converted to units of ppm by mass.

Example block output

The blue dots show the 20 treatment points for each round and the orange ones show the 5 control points. The horizontal axis is the sampling round. This is easier to visualize than the actual time of each sampling round, because the first and second rounds occur almost at the same time (before and after spreading). A few cursory observations:

  • For each element, there is no clear difference in the concentrations for control points over the different sampling rounds. All we see is the natural variation of the baseline soil, averaged into our composite core for each sample and then measured with analytical error.
  • Concentrations for Ca and Mg go up after spreading, then down again. This is the expected pattern for spreading then weathering and leaching.
  • Concentrations for Zr, however, go up and stay up. This is because we used a NoLeaching model for Zr. We assumed that it was perfectly immobile.
  • Although there are relatively few control points, the spread in the points seems bigger for the treatment points. This is a consequence of the variability in spreading/application of feedstock, which we designed into the simulation above.
  • This is an example of nice, well-behaved data. It would be considered high-quality MRV data for a real deployment. In general, real data will be noisier and messier.