Mixing Tutorial
The central element of Monty
is a mixing model describing how the concentration of an individual soil sample is determined by a mixture of soil and (possibly weathered) feedstock. This component of the package is deterministic. For a single core, physical parameters for the feedstock and the soil are passed in and the concentrations of specified elements are computed, in addition to the core's mass.
The mixing model assumes that baseline soil is vertically homogenous and that the weathering/dissolution of feedstock is also vertically homogenous. It does not, however, assume that the mixing of feedstock into the soil is vertically homogenous. It allows for different mixing profiles to be prescribed.
Parameters
The inputs for the mixing model are tabulated below. These represent the physical and chemical conditions in a single soil core.
Parameter | SI Unit | Description |
---|---|---|
γ | kg/kg | fraction of applied feedstock present in the sampling depth |
d | m | sample depth |
a | m$^2$ | cross-sectional area |
Q | kg/m$^2$ | feedstock application rate (dry material) |
ρf | kg/m$^3$ | feedstock bulk density |
cf | kg/kg | elemental concentration(s) in feedstock |
ρs | kg/m$^3$ | soil bulk density |
cs | kg/kg | elemental concentration(s) in soil |
𝓁 | kg/kg | loss fraction(s) of elemental concentrations in feedstock due to weathering |
ℒ | kg/kg | loss fraction of total feedstock mass due to weathering |
Examples
First, load the relevant packages.
using Unitful
using Monty
using Distributions
using BenchmarkTools
To pull a hypothetical soil sample, use the mixing
function. For this example, we use unitful numbers, which automatically check that the dimensions of the inputs and outputs are sensible.
mixing(
1.0u"kg/kg",
0.1u"m",
1.25e-3u"m^2",
2.2u"kg/m^2",
2e3u"kg/m^3",
5e4u"ppm",
900.0u"kg/m^3",
1e3u"ppm",
0.0u"kg/kg",
0.0u"kg/kg",
)
Sample{Unitful.Quantity{Float64, NoDims, Unitful.FreeUnits{(ppm,), NoDims, nothing}}} (1 analyte, 1 core)
mass = 0.11401250000000002 kg
└─ analyte 2181.8879508825785 ppm
Calling the mixing model creates a Sample
, which is printed out above with information about its type and its contents. The numeric type of the sample is complex because we used units. The value for the mass
is straightforward. Because we didn't specify the name of the analyte, the default name is simply analyte
and its concentration is printed with the same units as the inputs. The hypothetical soil core has a concentration of about 2182 ppm for this arbitrary analyte.
If we take out the units, it's a little simpler. Without units, the concentrations are assumed to be mass fractions (kg/kg) and the Sample
type is simply Float64
.
mixing(1.0, 0.1, 1.25e-3, 2.2, 2e3, 0.05, 900.0, 0.001, 0.0, 0.0)
Sample{Float64} (1 analyte, 1 core)
mass = 0.11401250000000002
└─ analyte 0.002181887950882579
Most of the time, we're interested in multiple elements. The mixing model runs multiple elements automatically if the concentrations are specified by named tuples. This is a fast and simple way for the mixing model to keep track of different analyte names and it forces us to be specific. For example, to use two elements, just specify their concentrations in both feedstock and soil, as well as their loss fractions (𝓁
) from feedstock.
cf = (Ca=0.05, Mg=0.06)
cs = (Ca=0.001, Mg=0.0005)
𝓁 = (Ca=0.0, Mg=0.0)
mixing(1.0, 0.1, 1.25e-3, 2.2, 2e3, cf, 900.0, cs, 𝓁, 0.0)
Sample{Float64} (2 analytes, 1 core)
mass = 0.11401250000000002
├─ Ca 0.002181887950882579
└─ Mg 0.0019351496546431313
The model is very strict about using consistent numeric types in the named tuples. You can't use (Ca=0.0, Mg=0)
, for example, because it mixes an integer with a float.
Instead of supplying the feedstock mixing fraction γ
directly, we can supply the entire vertical mixing profile Γ
, which must be a univariate, non-negative probability distribution. Specifically, it must be a UnivariateDistribution
for which minimum(Γ) >= 0
.
Γ = truncated(Normal(0.0, 0.05), 0.0, Inf) # a half normal with σ = 5 cm
cf = (Ca=0.05, Mg=0.06)
cs = (Ca=0.001, Mg=0.0005)
𝓁 = (Ca=0.0, Mg=0.0)
mixing(Γ, 0.1, 1.25e-3, 2.2, 2e3, cf, 900.0, cs, 𝓁, 0.0)
Sample{Float64} (2 analytes, 1 core)
mass = 0.11394368085085678
├─ Ca 0.0021287930886515556
└─ Mg 0.0018706773219340318
Finally, the mixing model is fast. Even though it creates Sample
instances, these are mainly just wrappers for named tuples. Here is a benchmark.
cf = (Ca=0.05, Mg=0.06)
cs = (Ca=0.001, Mg=0.0005)
𝓁 = (Ca=0.0, Mg=0.0)
@benchmark mixing(1.0, 0.1, 1.25e-3, 2.2, 2e3, $cf, 900.0, $cs, $𝓁, 0.0)
BenchmarkTools.Trial: 10000 samples with 994 evaluations.
Range (min … max): 28.382 ns … 72.217 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 28.474 ns ┊ GC (median): 0.00%
Time (mean ± σ): 30.156 ns ± 4.637 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▃▅▆ ▅ ▁ ▁
█████▇█▄▄▁▁▄▁▃▁▁▁▁▁▄▆▆▅▆▅▅▅▄▄▄▃▃▃▁▁▁▃▃▁▄▄▃▆▆▆█▇██████▇▇▆▇▇▆ █
28.4 ns Histogram: log(frequency) by time 50.5 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
The model is simple, so it ought to be fast, but it's good confirmation that it allocates no memory and takes only tens of nanoseconds to run.
For one last example, we can also use multiple analytes with unitful numbers. No problem.
mixing(
1.0u"kg/kg",
0.1u"m",
1.25e-3u"m^2",
2.2u"kg/m^2",
2e3u"kg/m^3",
(Ca=5e4u"ppm", Mg=6e4u"ppm"),
900.0u"kg/m^3",
(Ca=1e3u"ppm", Mg=1200.0u"ppm"),
(Ca=0.0u"kg/kg", Mg=0.0u"kg/kg"),
0.0u"kg/kg",
)
Sample{Unitful.Quantity{Float64, NoDims, Unitful.FreeUnits{(ppm,), NoDims, nothing}}} (2 analytes, 1 core)
mass = 0.11401250000000002 kg
├─ Ca 2181.8879508825785 ppm
└─ Mg 2618.265541059094 ppm
Combining samples
The physical combination of cores and samples (compositing) is easy. Just add them together with +
, use sum
, or any other version of addition.
a = mixing(1.0, 0.11, 1.25e-3, 2.2, 2e3, 0.05, 900.0, 0.001, 0.0, 0.0)
Sample{Float64} (1 analyte, 1 core)
mass = 0.1252625
└─ analyte 0.0020757409440175633
b = mixing(1.0, 0.09, 1.25e-3, 2.2, 2e3, 0.05, 900.0, 0.001, 0.0, 0.0)
Sample{Float64} (1 analyte, 1 core)
mass = 0.10276249999999999
└─ analyte 0.0023112760004865593
c = a + b
Sample{Float64} (1 analyte, 2 cores)
mass = 0.22802499999999998
└─ analyte 0.0021818879508825792
Adding the two individual cores gives you a Sample
with 2 cores and you can verify that the concentration is a mass-weighted average of the two individual concentrations. There's no limit to the number of cores you can combine.
d = mixing(1.0, 0.09, 1.25e-3, 2.2, 2e3, 0.04, 1100.0, 0.001, 0.0, 0.0)
c + d
Sample{Float64} (1 analyte, 3 cores)
mass = 0.35301249999999995
└─ analyte 0.002067242661378847
Given a vector or other collection of samples, they can be combined by summing them.
v = [a, b, c, d]
sum(v)
Sample{Float64} (1 analyte, 5 cores)
mass = 0.5810375
└─ analyte 0.002112234580384227
These operations are also fast because they are just doing arithmetic and forming named tuples underneath.
@benchmark sum($v)
BenchmarkTools.Trial: 10000 samples with 998 evaluations.
Range (min … max): 15.439 ns … 47.293 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 15.500 ns ┊ GC (median): 0.00%
Time (mean ± σ): 15.935 ns ± 1.091 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▄▆▃▂▁▁▂▁ ▁ ▁ ▂▂▂ ▂▄▄▄▃▂ ▂
██████████▇██▇▇▇▇▆███▇▆▅▆▆████▆▅▃▃▄▆▅▅▅▃▃▃▄▁▁▄▁▁▄▁▁▃▆██████ █
15.4 ns Histogram: log(frequency) by time 17.5 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
A Simple Simulation
The mixing function itself can be used for simulations, instead of an entire hypothetical deployment as in the tutorial and efficient tutorial. For example, we can use the Turing.jl modeling tools to do a simple Monte Carlo simulation, looking at how parameters in the mixing model relate to each other.
Here I define a model where each parameter is assigned a probability distribution to sample from. To make it as simple as possible, I just use uniform distributions for each parameter, drawing randomly from a prescribed range.
using Turing
import CairoMakie as Mke
@model function model()
γ ~ Uniform()
d ~ Uniform(0.05, 0.15)
a = 1.25e-3
Q ~ Uniform(1.0, 4.0)
ρf = 2e3
cf ~ Uniform(4e4, 6e4) # defined in ppm for convenience
ρs ~ Uniform(750, 1250)
cs ~ Uniform(1e3, 4e3) # defined in ppm for convenience
𝓁 ~ Uniform()
ℒ ~ Uniform()
core = mixing(γ, d, a, Q, ρf, cf / 1e6, ρs, cs / 1e6, 𝓁, ℒ)
# capture the core's mixed concentration
c ~ Dirac(1e6 * core[:analyte])
end
model (generic function with 2 methods)
This model can be sampled like any model. In this case, we use the prior sampler to do the Monte Carlo for us. Below, we create a chain of 64,000 samples and then print out the summary.
chain = sample(model(), Prior(), 64_000)
Chains MCMC chain (64000×10×1 Array{Float64, 3}):
Iterations = 1:1:64000
Number of chains = 1
Samples per chain = 64000
Wall duration = 4.73 seconds
Compute duration = 4.73 seconds
parameters = γ, d, Q, cf, ρs, cs, 𝓁, ℒ, c
internals = lp
Summary Statistics
parameters mean std mcse ess_bulk ess_tail ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Fl ⋯
γ 0.4985 0.2889 0.0011 63686.1246 63541.0337 1 ⋯
d 0.0998 0.0288 0.0001 65020.2686 64090.3643 1 ⋯
Q 2.5044 0.8644 0.0034 64755.9197 62972.8509 1 ⋯
cf 49978.9053 5770.3369 23.0076 63438.4074 64300.0842 1 ⋯
ρs 1000.0273 144.6380 0.5747 63355.4404 63378.4111 1 ⋯
cs 2495.9676 869.1354 3.4603 63048.7783 63236.8160 1 ⋯
𝓁 0.5003 0.2887 0.0011 63994.2501 63037.0723 1 ⋯
ℒ 0.4996 0.2881 0.0011 64055.3322 64072.8631 1 ⋯
c 2825.5848 942.4266 3.7284 63982.2719 63752.4979 1 ⋯
2 columns omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
γ 0.0246 0.2476 0.4986 0.7475 0.9750
d 0.0525 0.0749 0.0999 0.1246 0.1474
Q 1.0750 1.7559 2.5076 3.2532 3.9227
cf 40513.4018 44950.1809 49997.0097 54977.9588 59485.3921
ρs 762.4901 874.6101 999.9261 1126.4738 1237.3112
cs 1073.7050 1745.1261 2492.6461 3248.1599 3922.5793
𝓁 0.0250 0.2505 0.5019 0.7496 0.9758
ℒ 0.0262 0.2498 0.5008 0.7480 0.9748
c 1231.3287 2055.1200 2820.8366 3576.5021 4547.1344
We can do anything we want with these samples. Here are a few plots showing the relationships between each parameter and the core concentration after mixing.
fig = Mke.Figure(size=(700,1100))
for (i,x) in enumerate([:γ, :d, :Q, :cf, :ρs, :cs, :𝓁, :ℒ])
ax = Mke.Axis(
fig[(i - 1) ÷ 2 + 1,
(i - 1) % 2 + 1],
xlabel=string(x),
ylabel="core concentration (ppm)"
)
Mke.hexbin!(ax, chain[x] |> vec, chain[:c] |> vec, bins=30)
end
fig
There's one very clear feature of the plots above: the baseline soil concentration (cs
) absolutely dominates the response in the core concentration. There is an almost 1:1 correspondence between baseline soil concentration and mixed soil concentration. To see the influence of other parameters, we could hold baseline soil constant and vary everything else around it.