The experiment store is the DuckDB file behind a sealed snapshot. It keeps market data, run artifacts, provenance, labels, tags, archive state, and compact telemetry together.
snapshot handle -> run experiments -> list / inspect / compare / reopen
run_id is immutable. Labels, tags, and archive state are
mutable metadata.
The examples use dplyr and tibble for data
preparation and compact display. They are suggested packages used by the
vignettes, not part of the experiment store contract.
Create A Durable Snapshot
Market data and derived data have different lifecycle rules in ledgr. A sealed snapshot freezes the real market-data input and its hash. If you need more instruments, more dates, corrected bars, or tick-derived bars, create a new snapshot. Indicators, runs, labels, tags, comparisons, and telemetry are derived from sealed market data and can be added later without mutating the snapshot.
This vignette uses tempfile() so it can run without
writing into your project directory. For real research, use a stable
path such as "research.duckdb" and a snapshot ID you will
recognize later.
library(ledgr)
library(dplyr)
library(tibble)

db_path <- tempfile("ledgr_store_", fileext = ".duckdb")
bars <- ledgr_demo_bars |>
  filter(
    instrument_id %in% c("DEMO_01", "DEMO_02"),
    between(
      ts_utc,
      ledgr::ledgr_utc("2019-01-01"),
      ledgr::ledgr_utc("2019-06-30")
    )
  )
snapshot <- ledgr_snapshot_from_df(
  bars,
  db_path = db_path,
  snapshot_id = "store_demo_snapshot"
)

After snapshot creation, store operations take snapshot,
not db_path. In a new R session, recover the handle with
ledgr_snapshot_load(db_path, snapshot_id).
If your market data starts in CSV, seal the CSV into the same kind of
durable store. The CSV must contain at least instrument_id,
ts_utc, open, high,
low, and close.
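Before sealing, it can help to confirm that a CSV carries the required columns. The following is a minimal base R sketch; `missing_bar_columns()` and `required_cols` are hypothetical names introduced here for illustration and are not part of ledgr.

```r
# Hypothetical pre-flight helper; not part of the ledgr API.
required_cols <- c("instrument_id", "ts_utc", "open", "high", "low", "close")

missing_bar_columns <- function(csv_path, required = required_cols) {
  # Read the header plus a single data row so large files stay cheap to check.
  header <- names(utils::read.csv(csv_path, nrows = 1, check.names = FALSE))
  setdiff(required, header)
}

# Demo on a throwaway file that lacks the close column.
demo_csv <- tempfile(fileext = ".csv")
utils::write.csv(
  data.frame(
    instrument_id = "DEMO_01",
    ts_utc = "2019-01-02T00:00:00Z",
    open = 100, high = 101, low = 99
  ),
  demo_csv,
  row.names = FALSE
)
missing_cols <- missing_bar_columns(demo_csv)
print(missing_cols)
unlink(demo_csv)
```

An empty result means every required column is present. Extra columns are ignored by this check; how ledgr itself treats them is not covered here.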
snapshot <- ledgr_snapshot_from_csv(
  "data/daily_bars.csv",
  db_path = "research.duckdb",
  snapshot_id = "eod_2019_h1"
)

In any later session, recover the handle without re-sealing the data:

snapshot <- ledgr_snapshot_load("research.duckdb", snapshot_id = "eod_2019_h1")

Yahoo imports follow the same lifecycle, but the adapter downloads bars before sealing the snapshot:
snapshot <- ledgr_snapshot_from_yahoo(
  symbols = c("SPY", "QQQ"),
  from = "2019-01-01",
  to = "2019-06-30",
  db_path = "research.duckdb",
  snapshot_id = "yahoo_2019_h1"
)

The returned handle is already sealed. Use
ledgr_snapshot_info(snapshot) to inspect
status, snapshot_hash, bar_count,
instrument_count, start_date,
end_date, and raw meta_json. The dates are ISO
UTC values. meta_json is envelope metadata; snapshot
identity comes from normalized bars and instruments, not from human
descriptions.
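The idea that identity comes from normalized content rather than from descriptions can be illustrated with a toy fingerprint. This sketch is an assumption-laden illustration only: `toy_fingerprint()` is a hypothetical helper, it uses MD5 over a sorted CSV dump, and it is not ledgr's hashing algorithm.

```r
# Toy fingerprint: NOT ledgr's hashing algorithm, only the general idea.
toy_fingerprint <- function(bars) {
  # Normalize: fixed column order and a canonical row order.
  cols <- c("instrument_id", "ts_utc", "open", "high", "low", "close")
  bars <- bars[order(bars$instrument_id, bars$ts_utc), cols]
  path <- tempfile(fileext = ".csv")
  on.exit(unlink(path))
  utils::write.csv(bars, path, row.names = FALSE)
  unname(tools::md5sum(path))
}

bars_a <- data.frame(
  instrument_id = c("DEMO_01", "DEMO_02"),
  ts_utc = c("2019-01-02", "2019-01-02"),
  open = c(100, 50), high = c(101, 51), low = c(99, 49), close = c(100.5, 50.5),
  stringsAsFactors = FALSE
)
bars_shuffled <- bars_a[2:1, ]       # same bars, different row order
bars_corrected <- bars_a
bars_corrected$close[1] <- 100.75    # one corrected bar

# A human description is never passed to toy_fingerprint(), so it cannot
# influence the result.
same_after_shuffle <- identical(toy_fingerprint(bars_a), toy_fingerprint(bars_shuffled))
same_after_correction <- identical(toy_fingerprint(bars_a), toy_fingerprint(bars_corrected))
```

Here `same_after_shuffle` is `TRUE` and `same_after_correction` is `FALSE`, which mirrors why corrected bars call for a new snapshot while a changed description does not.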
Run Variants
features <- list(ledgr_ind_sma(20))
trend_strategy <- function(ctx, params) {
  targets <- ctx$flat()
  for (id in ctx$universe) {
    sma <- ctx$feature(id, "sma_20")
    if (is.finite(sma) && ctx$close(id) > sma) {
      targets[id] <- params$qty
    }
  }
  targets
}
exp <- ledgr_experiment(
  snapshot = snapshot,
  strategy = trend_strategy,
  features = features,
  opening = ledgr_opening(cash = 10000)
)
bt_small <- exp |>
  ledgr_run(params = list(qty = 5), run_id = "trend_qty_5")
bt_large <- exp |>
  ledgr_run(params = list(qty = 15), run_id = "trend_qty_15")

Discover Runs
ledgr_run_list() is the store discovery view.
ledgr_run_list(snapshot)
#> # ledgr run list
#> # A tibble: 2 x 8
#> run_id label tags status final_equity total_return execution_mode reproducibility_level
#> <chr> <chr> <lgl> <chr> <dbl> <chr> <chr> <chr>
#> 1 trend~ NA NA DONE 10042. +0.4% audit_log tier_1
#> 2 trend~ NA NA DONE 10125. +1.3% audit_log tier_1
#>
#> # i Full identity and telemetry columns remain available on this tibble.
#> # i Inspect one run with ledgr_run_info(snapshot, run_id).

Use labels and tags for mutable human-facing organization.

snapshot <- snapshot |>
  ledgr_run_label("trend_qty_5", "Baseline quantity") |>
  ledgr_run_tag("trend_qty_5", c("baseline", "trend")) |>
  ledgr_run_tag("trend_qty_15", c("trend", "larger-size"))
ledgr_run_list(snapshot)
#> # ledgr run list
#> # A tibble: 2 x 8
#> run_id label tags status final_equity total_return execution_mode reproducibility_level
#> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr>
#> 1 trend~ Base~ base~ DONE 10042. +0.4% audit_log tier_1
#> 2 trend~ NA larg~ DONE 10125. +1.3% audit_log tier_1
#>
#> # i Full identity and telemetry columns remain available on this tibble.
#> # i Inspect one run with ledgr_run_info(snapshot, run_id).

Tags and labels do not alter snapshot hashes, strategy hashes, parameter hashes, config hashes, or result artifacts.

The returned objects are still tibbles. When you need a custom view, drop the ledgr print formatting with as_tibble() and select the columns you want.

ledgr_run_list(snapshot) |>
  as_tibble() |>
  select(run_id, label, tags, status, final_equity, execution_mode)
#> # A tibble: 2 x 6
#> run_id label tags status final_equity execution_mode
#> <chr> <chr> <chr> <chr> <dbl> <chr>
#> 1 trend_qty_5 Baseline quantity baseline, trend DONE 10042. audit_log
#> 2 trend_qty_15 NA larger-size, trend DONE 10125. audit_log

Inspect And Compare
info <- ledgr_run_info(snapshot, "trend_qty_5")
info
#> ledgr Run Info
#> ==============
#>
#> Run ID: trend_qty_5
#> Label: Baseline quantity
#> Status: DONE
#> Archived: FALSE
#> Tags: baseline, trend
#> Snapshot: store_demo_snapshot
#> Snapshot Hash: 6eeff5ca520c516a61e0228c5ac06d22548c9d74e4e98d1e9f71fccdd2b8a87e
#> Config Hash: d55a06be68e0372c76883149cd34ad773581e6d2a4da56712338aab26d8ac7f1
#> Strategy Hash: c413dd07662e72e003890ed30da11b77113c505d17f99e99dbe701e7485e5236
#> Params Hash: f1bc254d9d195c0cff7056644ba06c2ba5968db959e689837a76853dd47990ae
#> Reproducibility: tier_1
#> Execution Mode: audit_log
#> Elapsed Sec: 1.064
#> Persist Features:TRUE
#> Cache Hits: 0
#> Cache Misses: 2

ledgr_run_info() is the detailed metadata view. It
includes execution mode, compact telemetry, status, identity hashes, and
reproducibility tier.
Useful fields include:
| Field | Meaning |
|---|---|
| run_id, status, label, tags, archived | mutable and immutable run organization fields |
| snapshot_id, snapshot_hash, data_hash | sealed data identity |
| strategy_source_hash, strategy_params_hash, config_hash | strategy, parameter, and run-configuration identity |
| reproducibility_level | strategy preflight tier recorded with the run |
| execution_mode, elapsed_sec, pulse_count | execution telemetry |
| persist_features, feature_cache_hits, feature_cache_misses | compact feature-engine telemetry |
| error_msg | failure diagnostic for non-completed runs |
comparison <- ledgr_compare_runs(snapshot, run_ids = c("trend_qty_5", "trend_qty_15"))
comparison
#> # ledgr comparison
#> # A tibble: 2 x 9
#> run_id label final_equity total_return sharpe_ratio max_drawdown n_trades win_rate
#> <chr> <chr> <dbl> <chr> <dbl> <chr> <int> <chr>
#> 1 trend_qty_5 Base~ 10042. +0.4% 0.838 -0.5% 12 25.0%
#> 2 trend_qty_15 NA 10125. +1.3% 0.851 -1.5% 12 25.0%
#> # i 1 more variable: reproducibility_level <chr>
#>
#> # i Full identity and telemetry columns remain available on this tibble.
#> # i Inspect one run with ledgr_run_info(snapshot, run_id).

Comparison is read-only and does not rerun strategies.
n_trades counts closed, realised trade observations, not
every fill. A run can have fills but no closed trades yet, in which case
win rate is not defined.
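Why win rate is undefined without closed trades can be sketched in a few lines of base R. This assumes win rate means the share of closed trades with positive realised P&L; `toy_win_rate()` is a hypothetical helper, and ledgr's own metric code may differ.

```r
# Hypothetical helper; ledgr's own metric code may differ.
toy_win_rate <- function(closed_trade_pnl) {
  # No closed trades means the ratio has no denominator.
  if (length(closed_trade_pnl) == 0L) {
    return(NA_real_)
  }
  mean(closed_trade_pnl > 0)
}

open_only <- toy_win_rate(numeric(0))              # fills exist, nothing closed yet
four_closed <- toy_win_rate(c(12.5, -3, -4, -1))   # one winner out of four
```

With four closed trades and one winner the sketch gives 0.25, while the empty case returns `NA` instead of a misleading zero.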
The printed comparison formats some columns for reading, and the ledgr print method applies the same formatting to the selection printed below. Programmatic code gets raw numeric columns from the tibble:

comparison |>
  select(run_id, final_equity, total_return, sharpe_ratio, max_drawdown, n_trades)
#> # ledgr comparison
#> # A tibble: 2 x 6
#> run_id final_equity total_return sharpe_ratio max_drawdown n_trades
#> <chr> <dbl> <chr> <dbl> <chr> <int>
#> 1 trend_qty_5 10042. +0.4% 0.838 -0.5% 12
#> 2 trend_qty_15 10125. +1.3% 0.851 -1.5% 12
#>
#> # i Full identity and telemetry columns remain available on this tibble.
#> # i Inspect one run with ledgr_run_info(snapshot, run_id).

After selecting a run, reopen it and inspect the underlying result tables rather than parsing the printed comparison:

best_run_id <- comparison |>
  arrange(desc(total_return)) |>
  pull(run_id) |>
  first()
best_bt <- ledgr_run_open(snapshot, best_run_id)
tail(ledgr_results(best_bt, what = "equity"), 3)
#> # A tibble: 3 x 6
#> ts_utc equity cash positions_value running_max drawdown
#> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2019-06-26 10125. 10125. 0 10201. -0.00743
#> 2 2019-06-27 10125. 10125. 0 10201. -0.00743
#> 3 2019-06-28 10125. 10125. 0 10201. -0.00743
close(best_bt)

Inspect Stored Strategy Source
Completed runs keep strategy provenance in the experiment store. This
is one of the most useful audit artifacts: you can inspect the source
text that produced a run without reopening the backtest handle and
without rerunning the strategy. The full trust and tier model lives in
vignette("reproducibility", package = "ledgr"); this
section shows the store workflow.
Use trust = FALSE for safe inspection. It returns stored
source text, parameters, hashes, dependency metadata, and warnings
without parsing, evaluating, or executing the source.
stored_strategy <- ledgr_extract_strategy(snapshot, "trend_qty_5", trust = FALSE)
stored_strategy
#> ledgr Extracted Strategy
#> ========================
#>
#> Run ID: trend_qty_5
#> Reproducibility: tier_1
#> Source Hash: c413dd07662e72e003890ed30da11b77113c505d17f99e99dbe701e7485e5236
#> Params Hash: f1bc254d9d195c0cff7056644ba06c2ba5968db959e689837a76853dd47990ae
#> Hash Verified: TRUE
#> Trust: FALSE
#> Source Available:TRUE

The source text is just data in this mode.
writeLines(stored_strategy$strategy_source_text)
#> function (ctx, params)
#> {
#>     targets <- ctx$flat()
#>     for (id in ctx$universe) {
#>         sma <- ctx$feature(id, "sma_20")
#>         if (is.finite(sma) && ctx$close(id) > sma) {
#>             targets[id] <- params$qty
#>         }
#>     }
#>     targets
#> }

Hash verification proves stored-text identity, not code safety. Use
trust = TRUE only when you already trust the experiment
store and intentionally want ledgr to parse and evaluate the stored text
into a function object. Legacy/pre-provenance runs remain inspectable
through ledgr_run_info() and stored result tables, but
their strategy function cannot be recovered from provenance alone.
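The difference between inspecting source text and trusting it comes down to ordinary R mechanics. The sketch below uses only base R with a made-up one-line function as the stored text; it shows the general parse-and-evaluate step, not how ledgr implements trust = TRUE internally.

```r
# Stored source is plain text until someone chooses to evaluate it.
source_text <- "function(x, params) x * params$qty"

# Inspection only: no parsing, nothing runs.
is_plain_text <- is.character(source_text)
n_chars <- nchar(source_text)

# The trust step: parse and evaluate the text into a function object.
# Whatever the text contains is executed by eval(), so this is the point
# where the content of the store has to be trusted.
recovered_fn <- eval(parse(text = source_text), envir = new.env(parent = baseenv()))
result <- recovered_fn(2, list(qty = 5))
```

A matching hash only says the text is the same text that was stored. It says nothing about what that text does once `eval()` runs it.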
When a run ID is missing, store lookup helpers fail with class
ledgr_run_not_found:

ledgr_run_info(snapshot, "missing_run")

Trusted recovery can be used to rerun a stored strategy only after you have decided that evaluating the stored source is acceptable:
recovered <- ledgr_extract_strategy(snapshot, "trend_qty_5", trust = TRUE)
rerun_exp <- ledgr_experiment(
  snapshot = snapshot,
  strategy = recovered$strategy_function,
  features = features,
  opening = ledgr_opening(cash = 10000)
)
ledgr_run(
  rerun_exp,
  params = recovered$strategy_params,
  run_id = "trend_qty_5_rerun"
)

Reopen A Completed Run In A Later Session
ledgr_run_open() reconstructs a completed run handle
from stored artifacts. It does not recompute the strategy. This is
useful when you want full result tables or plots after restarting R.
reopened <- ledgr_run_open(snapshot, "trend_qty_5")
summary(reopened)
#> ledgr Backtest Summary
#> ======================
#>
#> Performance Metrics:
#> Total Return: 0.42%
#> Annualized Return: 0.82%
#> Max Drawdown: -0.50%
#>
#> Risk Metrics:
#> Volatility (annual): 0.98%
#> Sharpe Ratio: 0.838
#>
#> Trade Statistics:
#> Total Trades: 12
#> Win Rate: 25.00%
#> Avg Trade: $3.48
#>
#> Exposure:
#> Time in Market: 66.67%
tail(ledgr_results(reopened, what = "equity"), 3)
#> # A tibble: 3 x 6
#> ts_utc equity cash positions_value running_max drawdown
#> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2019-06-26 10042. 10042. 0 10067. -0.00251
#> 2 2019-06-27 10042. 10042. 0 10067. -0.00251
#> 3 2019-06-28 10042. 10042. 0 10067. -0.00251
close(reopened)

Only completed runs can be reopened. Failed or incomplete runs remain
inspectable through ledgr_run_info().
Store-level helpers such as ledgr_run_info(),
ledgr_run_list(), and ledgr_compare_runs() use
the snapshot handle and remain available after a completed run handle is
closed. Result-table helpers such as ledgr_results() need a
live or reopened backtest handle.
Archive Without Deleting
snapshot <- snapshot |>
  ledgr_run_archive("trend_qty_15", reason = "larger position kept for reference")
ledgr_run_list(snapshot)
#> # ledgr run list
#> # A tibble: 1 x 8
#> run_id label tags status final_equity total_return execution_mode reproducibility_level
#> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr>
#> 1 trend~ Base~ base~ DONE 10042. +0.4% audit_log tier_1
#>
#> # i Full identity and telemetry columns remain available on this tibble.
#> # i Inspect one run with ledgr_run_info(snapshot, run_id).

Archiving hides a run from default listings without deleting artifacts.
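The "hidden, not deleted" semantics can be pictured with a plain data frame. This is a toy model under stated assumptions: `toy_runs`, `toy_list_runs()`, and its `show_archived` argument are hypothetical names for illustration and do not describe ledgr's storage layout or the arguments of ledgr_run_list().

```r
# Toy model of "hidden, not deleted"; not ledgr's storage layout or API.
toy_runs <- data.frame(
  run_id = c("trend_qty_5", "trend_qty_15"),
  archived = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)

toy_list_runs <- function(runs, show_archived = FALSE) {
  # The default view filters on the flag; the underlying rows are untouched.
  if (show_archived) runs else runs[!runs$archived, , drop = FALSE]
}

default_view <- toy_list_runs(toy_runs)
full_view <- toy_list_runs(toy_runs, show_archived = TRUE)
```

The default view has one row while the underlying table still has two, which is the property that keeps an archived run available for later inspection.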
Bridge A Low-Level CSV Import
This is advanced import material. The high-level CSV helper above is the normal path. The lower-level path is useful when you want to create the snapshot row, import one or more CSV files, inspect the sealed metadata, and then load the sealed artifact in a separate step. A future article, “Data Input And Snapshot Creation”, may move this bridge out of the store workflow.
The order is important:
- create the snapshot envelope;
- import bars into that CREATED snapshot;
- seal it to validate bars and write the snapshot hash;
- load it with verify = TRUE;
- pass the loaded snapshot to ledgr_experiment() and ledgr_run().
csv_db_path <- tempfile("ledgr_csv_bridge_", fileext = ".duckdb")
csv_bars_path <- tempfile("ledgr_csv_bars_", fileext = ".csv")
csv_bars <- bars |>
  filter(ts_utc <= ledgr::ledgr_utc("2019-01-15")) |>
  mutate(ts_utc = format(ts_utc, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"))
utils::write.csv(csv_bars, csv_bars_path, row.names = FALSE)
csv_con <- ledgr_db_init(csv_db_path)
csv_snapshot_id <- ledgr_snapshot_create(
  csv_con,
  snapshot_id = "csv_bridge_snapshot",
  meta = list(description = "low-level CSV bridge demo")
)
ledgr_snapshot_import_bars_csv(
  csv_con,
  csv_snapshot_id,
  bars_csv_path = csv_bars_path,
  instruments_csv_path = NULL,
  auto_generate_instruments = TRUE
)
csv_hash <- ledgr_snapshot_seal(csv_con, csv_snapshot_id)
csv_info <- ledgr_snapshot_info(csv_con, csv_snapshot_id)
DBI::dbDisconnect(csv_con, shutdown = TRUE)
csv_hash
#> [1] "e80b4f4f7df20364e904804ebddd0626da16fc7fc0b5bde8f452e93fa3e733ff"
csv_info |>
  select(
    snapshot_id,
    status,
    snapshot_hash,
    bar_count,
    instrument_count,
    start_date,
    end_date,
    meta_json
  )
#> # A tibble: 1 x 8
#> snapshot_id status snapshot_hash bar_count instrument_count start_date end_date
#> <chr> <chr> <chr> <int> <int> <chr> <chr>
#> 1 csv_bridge_snapshot SEALED e80b4f4f7df20~ 22 2 2019-01-0~ 2019-01~
#> # i 1 more variable: meta_json <chr>

bar_count and instrument_count are live
counts from the sealed snapshot tables. The raw meta_json
is envelope metadata on the snapshot row; seal-time metadata inside it
uses n_bars and n_instruments. Snapshot
identity does not come from that metadata. snapshot_hash
identifies the normalized bars and instruments only, so adding a human
description to meta_json does not change the artifact
hash.
Load the sealed snapshot before constructing the experiment. This is the same handle you would use in a later R session.
csv_snapshot <- ledgr_snapshot_load(
  csv_db_path,
  snapshot_id = csv_snapshot_id,
  verify = TRUE
)
csv_strategy <- function(ctx, params) {
  targets <- ctx$flat()
  targets["DEMO_01"] <- params$qty
  targets
}
csv_exp <- ledgr_experiment(
  snapshot = csv_snapshot,
  strategy = csv_strategy,
  opening = ledgr_opening(cash = 10000)
)
csv_bt <- ledgr_run(csv_exp, params = list(qty = 1), run_id = "csv_bridge_run")
tail(ledgr_results(csv_bt, what = "equity"), 3)
#> # A tibble: 3 x 6
#> ts_utc equity cash positions_value running_max drawdown
#> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2019-01-11 9997. 9909. 88.4 10000 -0.000311
#> 2 2019-01-14 9996. 9909. 87.6 10000 -0.000388
#> 3 2019-01-15 9996. 9909. 87.0 10000 -0.000445

Current Feature Persistence Boundary
Run metadata records whether feature persistence was enabled, and pulse inspection lets you view registered feature values at one decision time. Public feature inspection in v0.1.7.9 is intentionally scoped to feature contracts, warmup feasibility, and pulse-time feature views:
- ledgr_feature_contracts(features) shows declared feature requirements;
- ledgr_feature_contract_check(snapshot, features) checks whether those requirements are achievable in a sealed snapshot;
- ledgr_pulse_snapshot() plus ledgr_pulse_features() or ledgr_pulse_wide() inspects one pulse.
A full persisted feature-series retrieval API is deferred to the v0.1.8 precompute and sweep-result design.
ledgr_run() and ledgr_run_open() return
live handles for durable run artifacts. The artifacts are already
durable when a run completes, and ordinary result inspection opens and
closes read connections per operation. Use close(bt) as
explicit resource cleanup in long sessions, tests, explicit-open
workflows, and lazy result cursors. Close snapshot handles when the
workflow is finished.
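Inside functions and tests, base R's on.exit() is one common way to make cleanup unconditional. The sketch below demonstrates the pattern on an ordinary file connection; whether to wrap close(bt) or ledgr_snapshot_close() the same way is a design choice left to the reader, and `read_first_line()` is a hypothetical helper.

```r
# Generic cleanup pattern shown on a plain file connection, not a ledgr handle.
read_first_line <- function(path) {
  con <- file(path, open = "r")
  # Registered right after opening, so the connection is released
  # even if a later step in the function fails.
  on.exit(close(con), add = TRUE)
  readLines(con, n = 1)
}

demo_path <- tempfile(fileext = ".txt")
writeLines(c("first", "second"), demo_path)
first_line <- read_first_line(demo_path)
unlink(demo_path)
```

For a linear script like this vignette, explicit calls at the end are enough: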
close(bt_small)
close(bt_large)
close(csv_bt)
ledgr_snapshot_close(csv_snapshot)
ledgr_snapshot_close(snapshot)
unlink(csv_bars_path)

What’s Next?
For fills, trades, equity rows, and metric definitions, read
vignette("metrics-and-accounting", package = "ledgr"). For
strategy authoring, read
vignette("strategy-development", package = "ledgr").