Custom Indicators And External Features
Source: vignettes/custom-indicators.Rmd
Custom indicators are ledgr’s extension point for derived market data. They are useful when the built-in indicators and TTR-backed indicators do not express the feature you want.
They are also the highest-risk feature boundary. A custom indicator can keep a strategy pulse-safe, or it can hide future information in an ordinary-looking feature value. This article explains the authoring contract.
The Indicator Object
ledgr_indicator() creates a feature definition. The
important fields are:
| Field | Meaning |
|---|---|
| id | Stable feature ID used in ctx$feature(instrument_id, feature_id). |
| fn | Scalar function for one bounded historical window. |
| series_fn | Optional vectorized function for one instrument’s full bar series. |
| requires_bars | Minimum lookback requirement for the indicator definition. |
| stable_after | First row where the output is considered usable. |
| params | Named deterministic parameter list included in the fingerprint. |
| source | Source label: "ledgr", "TTR", or "custom". |
Use params for intentional configuration. Do not close
over mutable session objects when the value should be part of the
feature definition.
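The difference shows up with two plain functions. The sketch below uses only base R and calls no ledgr functions; the names lookback, fn_closure, and fn_params are hypothetical and exist only for illustration.

```r
# Hypothetical example: two ways to configure a three-bar mean of closes.
lookback <- 3

# Closes over a session variable: the result depends on state that is
# not part of the feature definition.
fn_closure <- function(window, params) {
  mean(utils::tail(window$close, lookback))
}

# Reads its configuration from params: the definition is self-contained.
fn_params <- function(window, params) {
  mean(utils::tail(window$close, params$n))
}

window <- data.frame(close = c(10, 11, 12, 13, 14))

before <- fn_closure(window, list())  # uses lookback = 3, gives 13
lookback <- 2                         # unrelated change in the session
after <- fn_closure(window, list())   # same call, now gives 13.5

stable <- fn_params(window, list(n = 3))  # 13, whatever the session holds
```

The closure version returns two different values for the same call, and nothing in its arguments records why. The params version keeps the lookback inside the definition.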
Scalar Indicators
The scalar path is the simplest contract:
range_3 <- ledgr_indicator(
  id = "range_3",
  fn = function(window, params) {
    mean(window$high - window$low)
  },
  requires_bars = 3,
  stable_after = 3,
  params = list()
)

The engine calls fn(window, params) on a bounded
historical window ending at the current bar. Before
stable_after, ledgr returns NA_real_ for that
feature. After warmup, the scalar result must be one finite numeric
value.
This path is easy to reason about because the function receives only historical rows up to the current decision point. It is the right first implementation for most custom features.
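To make the calling pattern concrete, here is a rough base R stand-in for the scalar path. It is not ledgr's implementation: simulate_scalar_path is a hypothetical helper, and it assumes each window holds exactly requires_bars rows ending at the current bar.

```r
# Hypothetical stand-in for the engine's scalar path; ledgr's real
# implementation may differ. Assumes each window holds exactly
# `requires_bars` rows ending at the current bar.
simulate_scalar_path <- function(bars, fn, params, requires_bars, stable_after) {
  out <- rep(NA_real_, nrow(bars))
  for (i in seq_len(nrow(bars))) {
    if (i < stable_after) next          # warmup rows stay NA_real_
    rows <- (i - requires_bars + 1):i   # history only, never rows after i
    out[i] <- fn(bars[rows, , drop = FALSE], params)
  }
  out
}

bars <- data.frame(
  high = c(11, 12, 13, 14, 15),
  low  = c(10, 10, 10, 10, 10)
)

# Same body as the range_3 scalar function above.
range_fn <- function(window, params) mean(window$high - window$low)

range_values <- simulate_scalar_path(
  bars, range_fn,
  params = list(), requires_bars = 3, stable_after = 3
)
range_values
#> [1] NA NA  2  3  4
```

Because the loop only ever slices rows up to i, the function body cannot reach a later bar even by accident.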
Vectorized Indicators
series_fn is the fast path for indicators that are
naturally computed over a whole series. When both fn and
series_fn are supplied, the engine uses
series_fn for full-series feature computation and keeps
fn as the scalar definition for the same feature. They
should be equivalent after warmup.
sma_3_custom <- ledgr_indicator(
  id = "sma_3_custom",
  fn = function(window, params) {
    mean(utils::tail(window$close, params$n))
  },
  series_fn = function(bars, params) {
    stats::filter(
      bars$close,
      rep(1 / params$n, params$n),
      sides = 1
    ) |>
      as.numeric()
  },
  requires_bars = 3,
  stable_after = 3,
  params = list(n = 3)
)

The series_fn(bars, params) contract is strict:
- bars contains one instrument’s bars in ascending timestamp order;
- the return value must be an atomic numeric vector;
- the return length must equal nrow(bars);
- output position i belongs to input row i;
- warmup rows before stable_after are normalized to NA_real_;
- post-warmup NA, NaN, and infinite values are errors.
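The shape and value rules in this list can be sketched as a small base R validator. check_series_output is a hypothetical helper written for this article, not ledgr's internal check, and the error messages are invented.

```r
# Hypothetical validator; not ledgr's internal check.
check_series_output <- function(values, bars, stable_after) {
  if (!is.atomic(values) || !is.numeric(values)) {
    stop("series_fn must return an atomic numeric vector")
  }
  if (length(values) != nrow(bars)) {
    stop("series_fn must return one value per input row")
  }
  values <- as.numeric(values)
  is_warmup <- seq_along(values) < stable_after
  values[is_warmup] <- NA_real_          # normalize warmup rows
  if (!all(is.finite(values[!is_warmup]))) {
    stop("post-warmup values must be finite")
  }
  values
}

bars <- data.frame(close = c(10, 11, 12, 13, 14))
raw <- as.numeric(stats::filter(bars$close, rep(1 / 3, 3), sides = 1))

checked <- check_series_output(raw, bars, stable_after = 3)

# A vector of the wrong length is rejected before any value is inspected.
wrong_length <- tryCatch(
  check_series_output(c(1, 2), bars, stable_after = 3),
  error = function(e) conditionMessage(e)
)
```

A check like this only looks at the returned vector. It has no way to see which input rows the function read to produce each value.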
The example uses sides = 1, so row i is
computed from row i and earlier rows only. That is the
causal alignment a vectorized feature must preserve.
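The effect of the sides argument is easy to see on a short series with a jump on the last bar. This is a base R illustration only; the closes vector is made-up data.

```r
closes <- c(10, 10, 10, 10, 40)  # the jump happens on the last bar
weights <- rep(1 / 3, 3)

causal   <- as.numeric(stats::filter(closes, weights, sides = 1))
centered <- as.numeric(stats::filter(closes, weights, sides = 2))

# Row 4 of the causal output averages rows 2 to 4, so row 5 cannot affect it.
causal[4]
#> [1] 10

# Row 4 of the centered output averages rows 3 to 5, so it already reflects
# a jump that only becomes known one bar later.
centered[4]
#> [1] 20
```

Both vectors have the same length and the same type, so nothing about their shape separates the safe version from the leaking one.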
Output validation proves shape and value validity. It does not prove
causal correctness. Because series_fn receives the full bar
series, a badly written vectorized function can still use future rows
internally while returning a correctly shaped vector.
That is why scalar fn is often the safer first version.
Add series_fn when the feature logic is stable and the
alignment is obvious.
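One way to review alignment before adding series_fn is a prefix check: recompute the feature on bars[1:k, ] and compare its last value with row k of the full-series output. The helper below is a hypothetical base R sketch, not a ledgr function. Passing it on one data set does not prove the function is causal, but a failure is a clear sign of lookahead.

```r
# Hypothetical helper: returns the rows where the full-series value differs
# from the value computed with only rows 1..k available.
find_lookahead_rows <- function(series_fn, bars, params, stable_after) {
  full <- series_fn(bars, params)
  suspect <- integer(0)
  for (k in seq_len(nrow(bars))) {
    if (k < stable_after) next
    prefix_value <- series_fn(bars[seq_len(k), , drop = FALSE], params)[k]
    if (!isTRUE(all.equal(prefix_value, full[k]))) suspect <- c(suspect, k)
  }
  suspect
}

sma_causal <- function(bars, params) {
  as.numeric(stats::filter(bars$close, rep(1 / params$n, params$n), sides = 1))
}

# Deliberately wrong: a centered window reads one bar into the future.
sma_centered <- function(bars, params) {
  as.numeric(stats::filter(bars$close, rep(1 / params$n, params$n), sides = 2))
}

bars <- data.frame(close = c(10, 12, 11, 15, 14, 18))
params <- list(n = 3)

causal_rows <- find_lookahead_rows(sma_causal, bars, params, stable_after = 3)
centered_rows <- find_lookahead_rows(sma_centered, bars, params, stable_after = 3)

causal_rows    # no rows flagged
centered_rows  # rows 3, 4, and 5 change once later bars are visible
```

The scalar fn never needs this check because it is only handed history. That asymmetry is the reason to start with fn.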
Warmup And Stability
requires_bars and stable_after are related
but not identical.
requires_bars says how much history the indicator
definition needs. stable_after says when the output is
usable in the feature series. It must be greater than or equal to
requires_bars.
For a three-bar moving average, both are usually 3. For
indicators with a longer settling period, stable_after can
be larger. ledgr treats rows before stable_after as warmup
and exposes them as NA_real_.
Warmup NA is expected. Post-warmup NA,
NaN, or infinite values mean the feature did not satisfy
its contract.
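A three-bar moving average shows how the two kinds of NA differ. The sketch uses base R only; post_warmup_ok is a hypothetical helper, not part of ledgr.

```r
closes <- c(10, 11, 12, 13, 14)
sma_3 <- as.numeric(stats::filter(closes, rep(1 / 3, 3), sides = 1))

# The first finite value appears at row 3, which matches stable_after = 3.
first_usable <- which(is.finite(sma_3))[1]

# Hypothetical check: every row at or after stable_after must be finite.
post_warmup_ok <- function(values, stable_after) {
  all(is.finite(values[seq_along(values) >= stable_after]))
}

ok_at_3 <- post_warmup_ok(sma_3, stable_after = 3)  # TRUE: only warmup rows are NA
ok_at_2 <- post_warmup_ok(sma_3, stable_after = 2)  # FALSE: row 2 is still NA
ok_at_5 <- post_warmup_ok(sma_3, stable_after = 5)  # TRUE: a longer warmup is allowed
```

Setting stable_after below the point where the computation first produces a value turns an expected warmup NA into a contract violation.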
Fingerprints
Indicator definitions are fingerprinted so runs can later verify that the registered feature definition still matches the one recorded with the run.
The fingerprint includes the feature ID, scalar function, vectorized
function when present, requires_bars,
stable_after, and deterministic params.
Fingerprints are an identity check, not a semantic proof. They help
ledgr answer “is this the same feature definition?” They do not prove
that a custom series_fn avoided lookahead or that an
external data source was historically available at the simulated
decision time.
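The idea of an identity check can be illustrated with a toy hash over the fields listed above. This is a conceptual sketch only: this article does not describe ledgr's actual fingerprint algorithm, and toy_fingerprint is a hypothetical function built on base R and tools::md5sum().

```r
# Conceptual sketch only; ledgr's real fingerprint may be computed differently.
toy_fingerprint <- function(id, fn, requires_bars, stable_after, params) {
  definition <- c(
    id,
    deparse(body(fn)),
    as.character(requires_bars),
    as.character(stable_after),
    deparse(params[order(names(params))])  # order-independent params
  )
  path <- tempfile()
  on.exit(unlink(path))
  writeLines(definition, path)
  unname(tools::md5sum(path))
}

sma_fn <- function(window, params) mean(utils::tail(window$close, params$n))

fp_first  <- toy_fingerprint("sma_custom", sma_fn, 3, 3, list(n = 3))
fp_again  <- toy_fingerprint("sma_custom", sma_fn, 3, 3, list(n = 3))
fp_other  <- toy_fingerprint("sma_custom", sma_fn, 3, 3, list(n = 5))

identical(fp_first, fp_again)  # same definition, same fingerprint
identical(fp_first, fp_other)  # different params, different fingerprint
```

A hash like this would be just as stable for a function that reads future rows, which is why a matching fingerprint says nothing about leakage.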
Deterministic Parameters And Unsafe Calls
params must be a named list of deterministic values. Use
strings for dates and timestamps when they are part of the feature
definition. Do not pass open connections, environments, external
pointers, or live session objects.
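A rough pre-flight check for a params list might look like the following. Both helpers are hypothetical and written in base R; ledgr's own validation may be stricter or may apply different rules.

```r
# Hypothetical helpers; ledgr's own validation rules may differ.
is_plain_value <- function(x) {
  if (inherits(x, "connection")) return(FALSE)
  if (is.list(x)) return(all(vapply(x, is_plain_value, logical(1))))
  is.atomic(x)  # FALSE for environments, functions, and external pointers
}

is_plain_params <- function(params) {
  if (!is.list(params)) return(FALSE)
  if (length(params) == 0) return(TRUE)
  keys <- names(params)
  !is.null(keys) && all(nzchar(keys)) && is_plain_value(params)
}

good_params <- list(n = 3, as_of = "2019-01-31")  # date kept as a string
env_params  <- list(n = 3, cache = new.env())     # live session object
con_params  <- list(source = stdin())             # open connection
bare_params <- list(3)                            # missing name

results <- c(
  good = is_plain_params(good_params),
  env  = is_plain_params(env_params),
  con  = is_plain_params(con_params),
  bare = is_plain_params(bare_params)
)
results
```

Only the first list passes. The others fail because of an environment, a connection, and a missing name.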
Indicator functions are checked for common unsafe patterns. Examples include:
- global assignment with <<-;
- wall-clock calls such as Sys.time() and Sys.Date();
- randomness such as runif(), rnorm(), and sample();
- dynamic lookup and execution helpers such as get(), eval(), and assign();
- environment reads such as Sys.getenv().
These checks are guardrails. They do not replace careful review of the feature logic.
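To see why such checks are guardrails and not proofs, consider a naive scanner built on base R's all.names(). It is a hypothetical sketch, not ledgr's checker: it only matches symbol names in the function body and is easy to evade.

```r
# Hypothetical scanner: flags a fixed list of names found in the body.
unsafe_names <- c(
  "<<-", "Sys.time", "Sys.Date", "runif", "rnorm", "sample",
  "get", "eval", "assign", "Sys.getenv"
)

find_unsafe_calls <- function(fn) {
  intersect(unsafe_names, all.names(body(fn)))
}

clean_fn <- function(window, params) mean(window$close)

# Never called here; it only needs to be parsed for the scan.
risky_fn <- function(window, params) {
  counter <<- Sys.time()
  mean(window$close) + runif(1)
}

clean_hits <- find_unsafe_calls(clean_fn)
risky_hits <- find_unsafe_calls(risky_fn)

clean_hits
#> character(0)
risky_hits
#> [1] "<<-"      "Sys.time" "runif"
```

A scan like this says nothing about whether the arithmetic inside a clean-looking function is causally aligned, so the feature logic still needs a careful review.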
Adapter Helpers
ledgr_adapter_r() wraps a function that operates on the
close series of the bounded window. It is useful for simple package or
base R functions:
median_close <- ledgr_adapter_r(
  stats::median,
  id = "median_close_5",
  requires_bars = 5
)

The adapter stores the adapted function identity and arguments in
indicator parameters. It still creates an ordinary
ledgr_indicator.
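For intuition, the computation the adapter wraps can be written by hand as a scalar function. This sketch assumes the adapted function receives window$close as its first argument; the name median_close_by_hand is hypothetical, and the block does not call ledgr.

```r
# Assumption: the adapted function is applied to window$close.
median_close_by_hand <- function(window, params) {
  stats::median(window$close)
}

# A bounded window of five bars, matching requires_bars = 5.
window <- data.frame(close = c(10, 14, 11, 13, 12))

median_value <- median_close_by_hand(window, list())
median_value
#> [1] 12
```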
ledgr_adapter_csv() adapts a CSV of precomputed
values:
csv_indicator <- ledgr_adapter_csv(
  "features/my_values.csv",
  value_col = "my_value",
  id = "my_value"
)

The CSV must identify timestamp, instrument, and value columns. This is useful for external feature pipelines, but it moves availability discipline outside ledgr. The CSV values must already respect the simulated decision times. ledgr can hash and look up the values; it cannot know whether the upstream pipeline used future information.
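One way to keep that discipline visible is for the upstream pipeline to record when each value became available and to audit the file before handing it to the adapter. The layout below is hypothetical: the column names ts_utc, instrument_id, and available_at_utc are assumptions for this sketch, not a format that ledgr requires, and the values are made up.

```r
# Hypothetical feature file with an extra availability column.
feature_rows <- data.frame(
  ts_utc = c("2019-01-02T00:00:00Z", "2019-01-03T00:00:00Z", "2019-01-04T00:00:00Z"),
  instrument_id = "DEMO_01",
  my_value = c(0.12, 0.15, 0.11),
  available_at_utc = c("2019-01-01T22:00:00Z", "2019-01-03T06:00:00Z", "2019-01-03T22:00:00Z"),
  stringsAsFactors = FALSE
)

csv_path <- tempfile(fileext = ".csv")
write.csv(feature_rows, csv_path, row.names = FALSE)
loaded <- read.csv(csv_path, stringsAsFactors = FALSE)

parse_utc <- function(x) {
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

# A row is suspect when its value was published after the bar it is attached to.
late <- parse_utc(loaded$available_at_utc) > parse_utc(loaded$ts_utc)
late_rows <- loaded[late, c("ts_utc", "instrument_id", "my_value")]
late_rows
```

Here the second row is flagged: its value was published six hours after the bar it claims to describe.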
Register And Read
Custom indicators are registered with the experiment just like built-in indicators:
# filter() and between() are dplyr verbs; ledgr_demo_bars ships with ledgr.
library(ledgr)
library(dplyr)

bars <- ledgr_demo_bars |>
  filter(
    instrument_id %in% c("DEMO_01", "DEMO_02"),
    between(
      ts_utc,
      ledgr_utc("2019-01-01"),
      ledgr_utc("2019-02-28")
    )
  )
snapshot <- ledgr_snapshot_from_df(
  bars,
  snapshot_id = paste0("custom-indicators-", Sys.getpid())
)
features <- list(range_3)
strategy <- function(ctx, params) {
  targets <- ctx$flat()
  for (id in ctx$universe) {
    value <- ctx$feature(id, "range_3")
    # Warmup rows are NA_real_, so is.finite() keeps the position flat.
    if (is.finite(value) && value < params$max_range) {
      targets[id] <- params$qty
    }
  }
  targets
}
exp <- ledgr_experiment(
  snapshot = snapshot,
  strategy = strategy,
  features = features,
  opening = ledgr_opening(cash = 10000)
)
ledgr_feature_id(features)
#> [1] "range_3"

Inside the strategy, ctx$feature(id, "range_3") reads
the exact feature ID from the pulse context for one instrument. Unknown
feature IDs fail loudly. Warmup for a known feature is represented by
NA_real_.
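The strategy's handling of warmup and unknown IDs can be exercised offline with a stand-in context. mock_ctx below is a hypothetical test double, not ledgr's pulse context. It assumes ctx$flat() returns a named numeric vector of zero targets and imitates the "fail loudly" behavior with a plain stop().

```r
# Hypothetical mock of the pulse context, for offline testing only.
mock_ctx <- function(feature_values) {
  universe <- names(feature_values)
  list(
    universe = universe,
    flat = function() stats::setNames(rep(0, length(universe)), universe),
    feature = function(instrument_id, feature_id) {
      if (!identical(feature_id, "range_3")) {
        stop("unknown feature ID: ", feature_id)
      }
      feature_values[[instrument_id]]
    }
  )
}

# Same strategy body as in the example above.
strategy <- function(ctx, params) {
  targets <- ctx$flat()
  for (id in ctx$universe) {
    value <- ctx$feature(id, "range_3")
    if (is.finite(value) && value < params$max_range) {
      targets[id] <- params$qty
    }
  }
  targets
}

# DEMO_01 is still in warmup; DEMO_02 has a usable value below the limit.
ctx <- mock_ctx(c(DEMO_01 = NA_real_, DEMO_02 = 1.5))
targets <- strategy(ctx, list(max_range = 2, qty = 10))

# Reading a feature ID that does not exist raises an error.
unknown <- tryCatch(ctx$feature("DEMO_01", "range_5"), error = function(e) "error")
```

The warmup instrument stays flat because is.finite(NA_real_) is FALSE, so the strategy needs no separate warmup branch.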
What To Remember
Custom indicators let external feature logic enter ledgr’s deterministic pulse engine. Keep the boundary explicit:
- prefer scalar fn until the logic is stable;
- add series_fn only when full-series alignment is clear;
- treat series_fn and CSV adapters as review points for leakage;
- keep all intentional configuration in deterministic params;
- register every feature before ledgr_run();
- use ledgr_feature_id() to confirm the exact ID a strategy should read.