On Reproducibility: Provenance and Strategy Tiers
Source:vignettes/reproducibility.Rmd
reproducibility.Rmd
library(ledgr)
library(dplyr)
data("ledgr_demo_bars", package = "ledgr")
bars <- ledgr_demo_bars |>
filter(
instrument_id %in% c("DEMO_01", "DEMO_02"),
between(
ts_utc,
ledgr_utc("2019-01-01"),
ledgr_utc("2019-06-30")
)
)ledgr treats a backtest result as an experiment artifact. The question is not only “what was the return?” The question is:
which sealed data, which strategy, which parameters, which features,
which opening state, and which execution assumptions produced this run?
This article explains the reproducibility model behind that question.
The Experiment Model
A ledgr run is produced from explicit inputs. The experiment fixes:
- a sealed snapshot;
- a strategy function;
- registered feature definitions;
- an opening state;
- a universe and execution options.
ledgr_run() then supplies strategy parameters and an
immutable run_id. The run derives fills, ledger events,
equity, trades, metrics, and comparison tables from those inputs. The
sealed snapshot is the evidence base. The strategy declares desired
holdings at each pulse. The feature list declares which derived values
are available to the strategy. The opening state declares cash and any
starting positions.
That shape matters because each part has a different reproducibility role. Market data are sealed. Parameters are stored and hashed. Strategy source is captured when possible. Results are derived from ledger events rather than remembered from an in-memory session.
snapshot <- ledgr_snapshot_from_df(bars, snapshot_id = "research_snapshot")
features <- list(ledgr_ind_returns(5))
strategy <- function(ctx, params) {
targets <- ctx$flat()
for (id in ctx$universe) {
ret_5 <- ctx$feature(id, "return_5")
if (is.finite(ret_5) && ret_5 > params$min_return) {
targets[id] <- params$qty
}
}
targets
}
exp <- ledgr_experiment(
snapshot = snapshot,
strategy = strategy,
features = features,
opening = ledgr_opening(cash = 10000)
)
bt <- ledgr_run(
exp,
params = list(min_return = 0, qty = 10),
run_id = "qty_10"
)The Provenance Model
For completed runs, ledgr stores run provenance alongside the result tables. The provenance record includes the captured strategy source where available, source hash, parameter JSON and parameter hash, dependency-version metadata, ledgr version, R version, and reproducibility tier.
Those fields do not make every run perfectly replayable. They make the claim inspectable. A run should be explainable later: what source text ledgr captured, which parameters were supplied, whether the source hash still verifies, and what reproducibility tier ledgr assigned before execution.
ledgr_run_info(snapshot, "qty_10")Extract Stored Strategy Source
ledgr_extract_strategy() inspects stored strategy
provenance for a run. The default is intentionally read-only:
stored <- ledgr_extract_strategy(snapshot, "qty_10", trust = FALSE)
stored
writeLines(stored$strategy_source_text)trust = FALSE returns source text and metadata without
parsing, evaluating, or executing the stored source. In this mode, the
source text is just data.
Use trust = TRUE only when you explicitly trust the
experiment store and intentionally want ledgr to parse and evaluate the
stored text into a function object.
trusted <- ledgr_extract_strategy(snapshot, "qty_10", trust = TRUE)
trusted$strategy_functionHash verification proves stored-text identity, not code safety. A verified hash means the stored text matches the stored hash. It does not mean the source is safe to evaluate, economically sensible, or independent from external state.
Legacy/pre-provenance runs and strategy types without capturable
source may report strategy_source_text = NA. Those runs can
still be inspected through ledgr_run_info() and result
tables, but the strategy function cannot be recovered from provenance
alone.
Stored Source Is Not Full Reproducibility
Stored source is a strong audit artifact, but it is only one part of reproducibility. A strategy may call external packages. It may close over data objects. It may rely on package versions, system libraries, or runtime state outside ledgr’s database.
That is why ledgr classifies strategies before execution.
Reproducibility Tiers
Tier 1 is self-contained under ledgr’s static preflight rules. The strategy can be understood from stored source and explicit parameters, using base/recommended R references and ledgr’s exported public namespace.
tier_1_strategy <- function(ctx, params) {
targets <- ctx$flat()
for (id in ctx$universe) {
targets[id] <- params$qty
}
targets
}
ledgr_strategy_preflight(tier_1_strategy)Tier 2 is inspectable but needs environment management outside ledgr. Examples include package-qualified calls outside the active R distribution and resolved non-function objects captured from the strategy environment.
tier_2_strategy <- function(ctx, params) {
jsonlite::toJSON(params, auto_unbox = TRUE)
ctx$flat()
}
ledgr_strategy_preflight(tier_2_strategy)The jsonlite::toJSON() call is written this way on
purpose. Namespace qualification tells ledgr which package supplies the
function. That makes the dependency visible in the preflight result and
keeps the strategy inspectable. The run can proceed, but ledgr cannot
preserve the installed jsonlite version or its system
requirements by itself.
Tier 2 is allowed for ordinary runs and future sweep mode. It is not fully reproducible by ledgr alone. Users own package installation, package version parity, system libraries, and any other runtime environment needed by their strategy.
Common environment-management approaches in R projects include
renv, Docker, rix (https://github.com/ropensci/rix), and {uvr}
(https://github.com/nbafrank/uvr). ledgr does not require
those tools and this article does not teach them. The point is simpler:
if a strategy is Tier 2, ledgr can preserve the run evidence, but the
user must preserve the surrounding environment.
Tier 3 is external state ledgr cannot recover. The most common example is an unqualified helper function from the interactive session.
my_helper <- function(ctx) ctx$flat()
tier_3_strategy <- function(ctx, params) {
my_helper(ctx)
}
ledgr_strategy_preflight(tier_3_strategy)Tier 3 strategies fail before execution by default. There is no
force = TRUE override in v0.1.7.8.
Why params Is The Boundary
Use params for strategy variation. Parameters are
canonicalized, hashed, and stored with the run. Hidden globals are
not.
strategy <- function(ctx, params) {
targets <- ctx$flat()
for (id in ctx$universe) {
if (ctx$close(id) > params$threshold) {
targets[id] <- params$qty
}
}
targets
}The same rule matters for future sweep workers. Sweep mode can send explicit parameter combinations to workers. It cannot reliably send an arbitrary interactive session.
Hidden Mutable State
Static analysis is not proof of semantic reproducibility. Patterns
such as <<-, mutable captured environments, dynamic
dispatch, and dynamically constructed calls can make a strategy
order-dependent or worker-dependent even when some symbols resolve.
counter <- 0
bad_strategy <- function(ctx, params) {
counter <<- counter + 1
ctx$flat()
}Avoid this pattern. Store intentional strategy variation in
params, and let ledgr record decisions and state changes
through the run artifacts.
What To Remember
Reproducibility in ledgr is a chain:
sealed snapshot -> experiment inputs -> preflight tier -> run provenance
-> ledger events -> derived results -> stored source inspection
Tier 1 is the cleanest path. Tier 2 is allowed but requires user-managed environment parity. Tier 3 fails by default because ledgr cannot recover what the strategy depended on.