API walkthrough
The function `derivative_estimate` transforms a stochastic program containing discrete randomness into a new program whose average is the derivative of the original program's average.
StochasticAD.derivative_estimate — Function

`derivative_estimate(X, p::Real; backend=StochasticAD.PrunedFIs)`

Computes an unbiased estimate of $\frac{\mathrm{d}\mathbb{E}[X(p)]}{\mathrm{d}p}$, the derivative of the expectation of the random function `X(p)` with respect to its input `p`. The `backend` keyword argument selects the algorithm used by the third component of the stochastic triple; see Technical details for more information.
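As a quick sketch of basic usage (the Bernoulli objective, parameter value, and sample count are illustrative assumptions, with `Distributions.jl` supplying the discrete randomness):

```julia
using StochasticAD, Distributions, Statistics

# X(p): a stochastic program with discrete randomness.
# E[X(p)] = 5p, so the true derivative is 5 for every p.
X(p) = 5 * rand(Bernoulli(p))

# Each call returns a single-sample unbiased estimate;
# average many of them to reduce variance.
samples = [derivative_estimate(X, 0.3) for _ in 1:10_000]
mean(samples)  # should be close to 5
```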
While `derivative_estimate` is self-contained, we can also use the functions below to work with stochastic triples directly.
StochasticAD.stochastic_triple — Function

`stochastic_triple(X, p::Real; kwargs...)`
`stochastic_triple(p::Real; kwargs...)`

Return the result of propagating the stochastic triple `p + ε` through the random function `X(p)`. When `X` is not provided, the identity function is used, i.e. the triple `p + ε` is returned.
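For instance (the Binomial program and parameter value below are assumptions chosen for illustration):

```julia
using StochasticAD, Distributions

X(p) = rand(Binomial(10, p))

# Propagate 0.5 + ε through X; the result is a stochastic triple.
st = stochastic_triple(X, 0.5)

# With no function argument, the identity triple 0.5 + ε is returned:
stochastic_triple(0.5)
```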
StochasticAD.derivative_contribution — Function

`derivative_contribution(st::StochasticTriple)`

Return the derivative estimate obtained by combining the dual (second) and perturbation (third) components of `st`.
StochasticAD.value — Function

`value(st::StochasticTriple)`

Return the primal value of `st`.
StochasticAD.delta — Function

`delta(st::StochasticTriple)`

Return the almost-sure derivative of `st`, i.e. the rate of infinitesimal change.
StochasticAD.perturbations — Function

`perturbations(st::StochasticTriple)`

Return the finite perturbation(s) of `st`, in a format dependent on the backend used for storing perturbations.
Note that `derivative_estimate` is simply the composition of `stochastic_triple` and `derivative_contribution`.
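The accessors and the composition above can be sketched as follows (the Geometric program and parameter value are illustrative assumptions):

```julia
using StochasticAD, Distributions

X(p) = rand(Geometric(p))
st = stochastic_triple(X, 0.2)

value(st)                    # primal sample of X(0.2)
delta(st)                    # almost-sure (infinitesimal-rate) component
perturbations(st)            # finite perturbations; format depends on backend
derivative_contribution(st)  # combine components into a derivative estimate

# The two-step pipeline above is equivalent in distribution to the single call:
derivative_estimate(X, 0.2)
```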
Smoothing
What happens if we run `derivative_contribution` after each step, instead of only at the end? This is smoothing, which collapses the second and third components of the stochastic triple into a single dual component. Smoothing no longer has a guarantee of unbiasedness, but is surprisingly accurate in a number of situations.
[Smoothing functionality coming soon.]
Optimization
We also provide utilities to make it easier to get started with forming and training a model via stochastic gradient descent:
StochasticAD.StochasticModel — Type

`StochasticModel(p, X)`

Combine stochastic program `X` with parameter `p` into a trainable model using Functors. The task is formulated as a minimization problem, i.e. find the $p$ that minimizes $\mathbb{E}[X(p)]$.
StochasticAD.stochastic_gradient — Function

`stochastic_gradient(m::StochasticModel)`

Compute the gradient with respect to the trainable parameter `p` of `StochasticModel(p, X)`.
These are used in the tutorial on stochastic optimization.
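A minimal sketch of that workflow, assuming Optimisers.jl drives the descent loop; the Poisson-plus-quadratic objective, initial parameter, learning rate, and step count are all illustrative assumptions:

```julia
using StochasticAD, Distributions, Optimisers

# Toy objective: E[X(p)] = p + (p - 1)^2, minimized at p = 0.5.
X(p) = rand(Poisson(p)) + (p - 1)^2

m = StochasticModel(0.8, X)                        # trainable parameter p = 0.8
state = Optimisers.setup(Optimisers.Descent(0.01), m)
for _ in 1:1000
    # Each stochastic gradient is an unbiased single-sample estimate.
    state, m = Optimisers.update(state, m, stochastic_gradient(m))
end
# The trained parameter should now be near 0.5.
```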