Mixing Tutorial

The central element of Monty is a mixing model describing how the concentration of an individual soil sample is determined by a mixture of soil and (possibly weathered) feedstock. This component of the package is deterministic. For a single core, physical parameters for the feedstock and the soil are passed in and the concentrations of specified elements are computed, in addition to the core's mass.

The mixing model assumes that baseline soil is vertically homogenous and that the weathering/dissolution of feedstock is also vertically homogenous. It does not, however, assume that the mixing of feedstock into the soil is vertically homogenous. It allows for different mixing profiles to be prescribed.

Parameters

The inputs for the mixing model are tabulated below. These represent the physical and chemical conditions in a single soil core.

Parameter	SI Unit	Description
`γ`	kg/kg	fraction of applied feedstock present in the sampling depth
`d`	m	sample depth
`a`	m$^2$	cross-sectional area
`Q`	kg/m$^2$	feedstock application rate (dry material)
`ρf`	kg/m$^3$	feedstock bulk density
`cf`	kg/kg	elemental concentration(s) in feedstock
`ρs`	kg/m$^3$	soil bulk density
`cs`	kg/kg	elemental concentration(s) in soil
`𝓁`	kg/kg	loss fraction(s) of elemental concentrations in feedstock due to weathering
`ℒ`	kg/kg	loss fraction of total feedstock mass due to weathering

Examples

First, load the relevant packages.

using Unitful
using Monty
using Distributions
using BenchmarkTools

To pull a hypothetical soil sample, use the mixing function. For this example, we use unitful numbers, which automatically check that the dimensions of the inputs and outputs are sensible.

mixing(
    1.0u"kg/kg",
    0.1u"m",
    1.25e-3u"m^2",
    2.2u"kg/m^2",
    2e3u"kg/m^3",
    5e4u"ppm",
    900.0u"kg/m^3",
    1e3u"ppm",
    0.0u"kg/kg",
    0.0u"kg/kg",
)

Sample{Unitful.Quantity{Float64, NoDims, Unitful.FreeUnits{(ppm,), NoDims, nothing}}} (1 analyte, 1 core)
mass = 0.11401250000000002 kg
└─ analyte 2181.8879508825785 ppm

Calling the mixing model creates a Sample, which is printed out above with information about its type and its contents. The numeric type of the sample is complex because we used units. The value for the mass is straightforward. Because we didn't specify the name of the analyte, the default name is simply analyte and its concentration is printed with the same units as the inputs. The hypothetical soil core has a concentration of about 2182 ppm for this arbitrary analyte.

If we take out the units, it's a little simpler. Without units, the concentrations are assumed to be mass fractions (kg/kg) and the Sample type is simply Float64.

mixing(1.0, 0.1, 1.25e-3, 2.2, 2e3, 0.05, 900.0, 0.001, 0.0, 0.0)

Sample{Float64} (1 analyte, 1 core)
mass = 0.11401250000000002
└─ analyte 0.002181887950882579

Most of the time, we're interested in multiple elements. The mixing model runs multiple elements automatically if the concentrations are specified by named tuples. This is a fast and simple way for the mixing model to keep track of different analyte names and it forces us to be specific. For example, to use two elements, just specify their concentrations in both feedstock and soil, as well as their loss fractions (𝓁) from feedstock.

cf = (Ca=0.05, Mg=0.06)
cs = (Ca=0.001, Mg=0.0005)
𝓁 = (Ca=0.0, Mg=0.0)
mixing(1.0, 0.1, 1.25e-3, 2.2, 2e3, cf, 900.0, cs, 𝓁, 0.0)

Sample{Float64} (2 analytes, 1 core)
mass = 0.11401250000000002
├─ Ca 0.002181887950882579
└─ Mg 0.0019351496546431313

Warning

The model is very strict about using consistent numeric types in the named tuples. You can't use (Ca=0.0, Mg=0), for example, because it mixes an integer with a float.

Instead of supplying the feedstock mixing fraction γ directly, we can supply the entire vertical mixing profile Γ, which must be a univariate, non-negative probability distribution. Specifically, it must be a UnivariateDistribution for which minimum(Γ) >= 0.

Γ = truncated(Normal(0.0, 0.05), 0.0, Inf) # a half normal with σ = 5 cm
cf = (Ca=0.05, Mg=0.06)
cs = (Ca=0.001, Mg=0.0005)
𝓁 = (Ca=0.0, Mg=0.0)
mixing(Γ, 0.1, 1.25e-3, 2.2, 2e3, cf, 900.0, cs, 𝓁, 0.0)

Sample{Float64} (2 analytes, 1 core)
mass = 0.11394368085085678
├─ Ca 0.0021287930886515556
└─ Mg 0.0018706773219340318

Finally, the mixing model is fast. Even though it creates Sample instances, these are mainly just wrappers for named tuples. Here is a benchmark.

cf = (Ca=0.05, Mg=0.06)
cs = (Ca=0.001, Mg=0.0005)
𝓁 = (Ca=0.0, Mg=0.0)
@benchmark mixing(1.0, 0.1, 1.25e-3, 2.2, 2e3, $cf, 900.0, $cs, $𝓁, 0.0)

BenchmarkTools.Trial: 10000 samples with 994 evaluations.
 Range (min … max):  28.382 ns … 72.217 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     28.474 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   30.156 ns ±  4.637 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▃▅▆  ▅                                        ▁            ▁
  █████▇█▄▄▁▁▄▁▃▁▁▁▁▁▄▆▆▅▆▅▅▅▄▄▄▃▃▃▁▁▁▃▃▁▄▄▃▆▆▆█▇██████▇▇▆▇▇▆ █
  28.4 ns      Histogram: log(frequency) by time      50.5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

The model is simple, so it ought to be fast, but it's good confirmation that it allocates no memory and takes only tens of nanoseconds to run.

For one last example, we can also use multiple analytes with unitful numbers. No problem.

mixing(
    1.0u"kg/kg",
    0.1u"m",
    1.25e-3u"m^2",
    2.2u"kg/m^2",
    2e3u"kg/m^3",
    (Ca=5e4u"ppm", Mg=6e4u"ppm"),
    900.0u"kg/m^3",
    (Ca=1e3u"ppm", Mg=1200.0u"ppm"),
    (Ca=0.0u"kg/kg", Mg=0.0u"kg/kg"),
    0.0u"kg/kg",
)

Sample{Unitful.Quantity{Float64, NoDims, Unitful.FreeUnits{(ppm,), NoDims, nothing}}} (2 analytes, 1 core)
mass = 0.11401250000000002 kg
├─ Ca 2181.8879508825785 ppm
└─ Mg 2618.265541059094 ppm

Combining samples

The physical combination of cores and samples (compositing) is easy. Just add them together with +, use sum, or any other version of addition.

a = mixing(1.0, 0.11, 1.25e-3, 2.2, 2e3, 0.05, 900.0, 0.001, 0.0, 0.0)

Sample{Float64} (1 analyte, 1 core)
mass = 0.1252625
└─ analyte 0.0020757409440175633

b = mixing(1.0, 0.09, 1.25e-3, 2.2, 2e3, 0.05, 900.0, 0.001, 0.0, 0.0)

Sample{Float64} (1 analyte, 1 core)
mass = 0.10276249999999999
└─ analyte 0.0023112760004865593

c = a + b

Sample{Float64} (1 analyte, 2 cores)
mass = 0.22802499999999998
└─ analyte 0.0021818879508825792

Adding the two individual cores gives you a Sample with 2 cores and you can verify that the concentration is a mass-weighted average of the two individual concentrations. There's no limit to the number of cores you can combine.

d = mixing(1.0, 0.09, 1.25e-3, 2.2, 2e3, 0.04, 1100.0, 0.001, 0.0, 0.0)
c + d

Sample{Float64} (1 analyte, 3 cores)
mass = 0.35301249999999995
└─ analyte 0.002067242661378847

Given a vector or other collection of samples, they can be combined by summing them.

v = [a, b, c, d]
sum(v)

Sample{Float64} (1 analyte, 5 cores)
mass = 0.5810375
└─ analyte 0.002112234580384227

These operations are also fast because they are just doing arithmetic and forming named tuples underneath.

@benchmark sum($v)

BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  15.439 ns … 47.293 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     15.500 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.935 ns ±  1.091 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▄▆▃▂▁▁▂▁  ▁       ▁      ▂▂▂                        ▂▄▄▄▃▂ ▂
  ██████████▇██▇▇▇▇▆███▇▆▅▆▆████▆▅▃▃▄▆▅▅▅▃▃▃▄▁▁▄▁▁▄▁▁▃▆██████ █
  15.4 ns      Histogram: log(frequency) by time      17.5 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

A Simple Simulation

The mixing function itself can be used for simulations, instead of an entire hypothetical deployment as in the tutorial and efficient tutorial. For example, we can use the Turing.jl modeling tools to do a simple Monte Carlo simulation, looking at how parameters in the mixing model relate to each other.

Here I define a model where each parameter is assigned a probability distribution to sample from. To make it as simple as possible, I just use uniform distributions for each parameter, drawing randomly from a prescribed range.

using Turing

import CairoMakie as Mke

@model function model()

    γ ~ Uniform()
    d ~ Uniform(0.05, 0.15)
    a = 1.25e-3
    Q ~ Uniform(1.0, 4.0)
    ρf = 2e3
    cf ~ Uniform(4e4, 6e4) # defined in ppm for convenience
    ρs ~ Uniform(750, 1250)
    cs ~ Uniform(1e3, 4e3) # defined in ppm for convenience
    𝓁 ~ Uniform()
    ℒ ~ Uniform()

    core = mixing(γ, d, a, Q, ρf, cf / 1e6, ρs, cs / 1e6, 𝓁, ℒ)
    # capture the core's mixed concentration
    c ~ Dirac(1e6 * core[:analyte])

end

model (generic function with 2 methods)

This model can be sampled like any model. In this case, we use the prior sampler to do the Monte Carlo for us. Below, we create a chain of 64,000 samples and then print out the summary.

chain = sample(model(), Prior(), 64_000)

Chains MCMC chain (64000×10×1 Array{Float64, 3}):

Iterations        = 1:1:64000
Number of chains  = 1
Samples per chain = 64000
Wall duration     = 4.73 seconds
Compute duration  = 4.73 seconds
parameters        = γ, d, Q, cf, ρs, cs, 𝓁, ℒ, c
internals         = lp

Summary Statistics
  parameters         mean         std      mcse     ess_bulk     ess_tail      ⋯
      Symbol      Float64     Float64   Float64      Float64      Float64   Fl ⋯

           γ       0.4985      0.2889    0.0011   63686.1246   63541.0337    1 ⋯
           d       0.0998      0.0288    0.0001   65020.2686   64090.3643    1 ⋯
           Q       2.5044      0.8644    0.0034   64755.9197   62972.8509    1 ⋯
          cf   49978.9053   5770.3369   23.0076   63438.4074   64300.0842    1 ⋯
          ρs    1000.0273    144.6380    0.5747   63355.4404   63378.4111    1 ⋯
          cs    2495.9676    869.1354    3.4603   63048.7783   63236.8160    1 ⋯
           𝓁       0.5003      0.2887    0.0011   63994.2501   63037.0723    1 ⋯
           ℒ       0.4996      0.2881    0.0011   64055.3322   64072.8631    1 ⋯
           c    2825.5848    942.4266    3.7284   63982.2719   63752.4979    1 ⋯
                                                               2 columns omitted

Quantiles
  parameters         2.5%        25.0%        50.0%        75.0%        97.5%
      Symbol      Float64      Float64      Float64      Float64      Float64

           γ       0.0246       0.2476       0.4986       0.7475       0.9750
           d       0.0525       0.0749       0.0999       0.1246       0.1474
           Q       1.0750       1.7559       2.5076       3.2532       3.9227
          cf   40513.4018   44950.1809   49997.0097   54977.9588   59485.3921
          ρs     762.4901     874.6101     999.9261    1126.4738    1237.3112
          cs    1073.7050    1745.1261    2492.6461    3248.1599    3922.5793
           𝓁       0.0250       0.2505       0.5019       0.7496       0.9758
           ℒ       0.0262       0.2498       0.5008       0.7480       0.9748
           c    1231.3287    2055.1200    2820.8366    3576.5021    4547.1344

We can do anything we want with these samples. Here are a few plots showing the relationships between each parameter and the core concentration after mixing.

fig = Mke.Figure(size=(700,1100))
for (i,x) in enumerate([:γ, :d, :Q, :cf, :ρs, :cs, :𝓁, :ℒ])
    ax = Mke.Axis(
        fig[(i - 1) ÷ 2 + 1,
        (i - 1) % 2 + 1],
        xlabel=string(x),
        ylabel="core concentration (ppm)"
    )
    Mke.hexbin!(ax, chain[x] |> vec, chain[:c] |> vec, bins=30)
end
fig

There's one very clear feature of the plots above: the baseline soil concentration (cs) absolutely dominates the response in the core concentration. There is an almost 1:1 correspondence between baseline soil concentration and mixed soil concentration. To see the influence of other parameters, we could hold baseline soil constant and vary everything else around it.