Harrell’s concordance statistic (the $C$-index)
is widely used to quantify discriminative performance in right-censored
survival analysis and is used throughout randomForestSRC to
report out-of-bag (OOB) and test error rates. The package recently
introduced two related enhancements for survival and competing risks
forests. First, concordance is now computed using the
inverse-probability-of-censoring weighted (IPCW) estimator of Uno
et al. (2011) [1], which targets a
censoring-free population concordance; legacy unweighted Harrell
concordance for survival forests can be recovered by setting
use.uno = FALSE. Because IPCW weights may become inflated
in the heavily censored tail, a weight-stabilization (gating) rule is
applied that selects a truncation threshold $\tau$
using an effective sample size criterion. Second, concordance is
computed using the tree-based counting strategy described by Therneau
(2024) [2], with the required rank-count
updates and prefix queries implemented using a binary indexed tree [3], replacing the naive
$O(n^2)$ pairwise calculation with a significantly faster $O(n \log n)$
algorithm. This vignette describes these methodological and computational
changes and presents simulation benchmarks illustrating the
finite-sample behavior of unweighted and IPCW concordance under
different levels of censoring in both survival and competing risk
settings.
Let $\{(T_i, \delta_i, \mathbf{X}_i)\}_{i=1}^n$
denote observed survival data, where $T_i = \min(T_i^o, C_i^o)$
with $T_i^o$ the event time, $C_i^o$ the censoring time, and
$\delta_i = I(T_i^o \le C_i^o)$ indicating an observed event. Let
$\eta_i = \eta(\mathbf{X}_i)$
be a model-based risk score such that larger values correspond to higher
event risk; for example, in random survival forests (RSF), $\eta_i$
is the OOB mortality estimate returned by rfsrc().
A central goal of performance assessment is to quantify how well $\eta$
ranks subjects by their latent failure times. For continuous event
times, the concordance functional is
$$C = P\left(\eta_i > \eta_j \mid T_i^o < T_j^o\right),$$
the probability that, for a randomly
selected pair $(i, j)$,
the subject who fails earlier is assigned higher predicted risk. For a
clear exposition, we present definitions under the simplifying
assumption that there are no ties. In applications, ties in observed
follow-up times and in predicted risk scores are common;
randomForestSRC handles these using a deterministic
time-ordering and partial-credit scheme described in Section
@ref(sec:interface).
Throughout, we focus on the concordance index $C$,
where larger is better. The exported get.cindex() function
and the internal performance summaries in randomForestSRC
return the error rate $1 - C$;
this is a monotone reparameterization and follows the package’s
convention that smaller error is better.
Harrell’s estimator replaces unobserved event times by observed follow-up times and restricts comparisons to pairs for which the earlier time corresponds to an observed event. The classical estimator can be written as
$$\hat{C}_H = \frac{\sum_{i \ne j} \delta_i \, I(T_i < T_j) \, I(\eta_i > \eta_j)}{\sum_{i \ne j} \delta_i \, I(T_i < T_j)}.$$
The denominator counts the set of pairs $(i, j)$ in which subject $i$ experiences an observed event before subject $j$’s observed follow-up time; the numerator counts the subset of those pairs concordant with the predicted risk ordering.
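As a rough illustration of the comparable-pair counting described above, the following base-R sketch evaluates the unweighted estimator by direct enumeration. The helper name `harrell.c.naive` and the toy data are hypothetical and are not part of randomForestSRC; the sketch assumes no tied follow-up times and gives half credit to tied risk scores.

```r
## Naive O(n^2) Harrell concordance (illustrative sketch, not the package code).
## Assumes no tied follow-up times; tied risk scores receive half credit.
harrell.c.naive <- function(time, status, risk) {
  num <- 0
  den <- 0
  for (i in which(status == 1)) {        # earlier subject must have an observed event
    later <- time > time[i]              # subjects still under follow-up after T_i
    den <- den + sum(later)
    num <- num + sum(later * ((risk[i] > risk) + 0.5 * (risk[i] == risk)))
  }
  num / den
}

## toy data: five subjects, three observed events
time   <- c(2, 4, 6, 8, 10)
status <- c(1, 0, 1, 1, 0)
risk   <- c(0.9, 0.3, 0.4, 0.6, 0.1)
harrell.c.naive(time, status, risk)      # 6 of 7 comparable pairs are concordant
```

Only the three observed events open comparisons, so the censored subjects at times 4 and 10 contribute solely as the later member of a pair.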
When censoring is low, the set of comparable pairs approximates the set of all pairs with $T_i^o < T_j^o$, and $\hat{C}_H$ is often a useful discrimination summary. Under heavier censoring, however, the comparable-pair restriction induces a selection that depends on the censoring mechanism. Even under independent censoring, $\hat{C}_H$ converges to a censoring-dependent limit (a conditional concordance over comparable pairs) and can differ systematically from the target functional $C$ for a fixed risk score.
Uno et al. (2011) [1] proposed an IPCW modification that reweights observed events to represent the full cohort at the corresponding follow-up time. Let $G(t) = P(C^o > t)$ denote the censoring survival function and let $\hat{G}$ be the Kaplan–Meier estimator fit to the reverse-censoring data $\{(T_i, 1 - \delta_i)\}_{i=1}^n$. The IPCW weight for subject $i$ is
$$\hat{w}_i = \frac{1}{\hat{G}(T_i-)^2},$$
where $\hat{G}(T_i-)$ denotes the left limit of the Kaplan–Meier estimate at $T_i$. The squared term arises because, for a pair to be comparable at an event time $T_i$, both subjects must be uncensored beyond $T_i$, giving probability $G(T_i)^2$ under independent censoring. Uno’s estimator replaces $\delta_i$ by the weighted event contribution $\delta_i \hat{w}_i$:
$$\hat{C}_U = \frac{\sum_{i \ne j} \delta_i \hat{w}_i \, I(T_i < T_j) \, I(\eta_i > \eta_j)}{\sum_{i \ne j} \delta_i \hat{w}_i \, I(T_i < T_j)}.$$
Compared to Harrell’s estimator, the IPCW formulation increases the contribution of events occurring at times where the probability of remaining uncensored is small, i.e., in the tail where $\hat{G}(T_i-)$ is small.
Under independent censoring and consistent estimation of $G$, the IPCW estimator targets the population concordance functional $C$ for a fixed risk score and is consistent for that target [1]. In finite samples, IPCW typically reduces the bias of unweighted concordance that arises when heavy censoring disproportionately removes later events from the comparable-pair set.
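The weighting step can be sketched in base R as follows. This is a minimal illustration under the assumption of no tied follow-up times, not the package implementation: `censor.km.left` and `uno.c.naive` are hypothetical helper names, the reverse Kaplan–Meier estimate is computed by hand, and no stabilization is applied.

```r
## Left limit of the reverse Kaplan-Meier estimate at each subject's own time.
## Illustrative sketch only; assumes no tied follow-up times.
censor.km.left <- function(time, status) {
  n <- length(time)
  o <- order(time)
  cens <- 1 - status[o]                  # censoring plays the role of the event
  at.risk <- n - seq_len(n) + 1          # number still under observation
  G <- cumprod(1 - cens / at.risk)       # censoring survival at each ordered time
  G.left <- c(1, G[-n])                  # value just before each ordered time
  out <- numeric(n)
  out[o] <- G.left
  out
}

## Unstabilized IPCW concordance by direct enumeration
uno.c.naive <- function(time, status, risk) {
  w <- 1 / censor.km.left(time, status)^2
  num <- 0
  den <- 0
  for (i in which(status == 1)) {
    later <- time > time[i]
    den <- den + w[i] * sum(later)
    num <- num + w[i] * sum(later * ((risk[i] > risk) + 0.5 * (risk[i] == risk)))
  }
  num / den
}

time   <- c(2, 4, 6, 8, 10)
status <- c(1, 0, 1, 1, 0)
risk   <- c(0.9, 0.3, 0.4, 0.6, 0.1)
censor.km.left(time, status)             # 1, 1, 0.75, 0.75, 0.75
uno.c.naive(time, status, risk)          # 17/21, versus 6/7 without weights
```

In this toy example the censoring at time 4 lowers the censoring survival estimate to 0.75, so the two later events are up-weighted by a factor of 16/9 and the single discordant late pair counts for more than it does in the unweighted estimator.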
IPCW weights can be highly variable when $\hat{G}(t-)$
approaches 0. Because $\hat{w}_i$
scales as $1/\hat{G}(T_i-)^2$,
even a small number of events can generate extremely large weights if
these occur deep in the tail and can lead to unstable finite-sample
behavior for $\hat{C}_U$.
For this reason, randomForestSRC applies a
default stabilization (or gating) rule.
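In finite samples, a small number of late events can yield extremely
large Uno weights when $\hat{G}(T_i-)$
is small, which can destabilize $\hat{C}_U$.
To avoid this, randomForestSRC introduces a gate $\tau$
and uses the stabilized weight vector
$$\hat{w}_i^{(\tau)} = \begin{cases} \hat{G}(T_i-)^{-2}, & \hat{G}(T_i-) \ge \tau, \\ \varepsilon, & \hat{G}(T_i-) < \tau, \end{cases}$$
where $\varepsilon$
is a very small constant (defaulting to machine precision). The
stabilized IPCW concordance replaces $\hat{w}_i$
in Section 3 by $\hat{w}_i^{(\tau)}$:
$$\hat{C}_U^{(\tau)} = \frac{\sum_{i \ne j} \delta_i \hat{w}_i^{(\tau)} \, I(T_i < T_j) \, I(\eta_i > \eta_j)}{\sum_{i \ne j} \delta_i \hat{w}_i^{(\tau)} \, I(T_i < T_j)}.$$
Because $\hat{G}(T_i-) \ge \tau$
implies $\hat{w}_i^{(\tau)} \le 1/\tau^2$,
the gate caps the magnitude of the non-negligible event weights by $1/\tau^2$
and prevents a small number of tail events from dominating the
criterion. Events with $\hat{G}(T_i-) < \tau$
receive negligible weight, so their event contribution is effectively
removed. The small positive value $\varepsilon$
is used (instead of a literal zero) so that these observations remain
in the calculation rather than being excluded entirely.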
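The gate $\tau$
is selected on the training data using only the distribution of
event-time weights. Let $\mathcal{E} = \{i : \delta_i = 1\}$
denote the set of observed events and $n_E = |\mathcal{E}|$
its size. For a candidate threshold $\tau$,
define the retained event-time weights
$$\mathcal{W}(\tau) = \left\{\hat{w}_i : i \in \mathcal{E},\ \hat{G}(T_i-) \ge \tau\right\}.$$
Define the effective sample size (ESS)
functional [4]
$$\mathrm{ESS}(\tau) = \frac{\left(\sum_{w \in \mathcal{W}(\tau)} w\right)^2}{\sum_{w \in \mathcal{W}(\tau)} w^2}.$$
The default ESS target, written $\mathrm{ESS}^{*}$ here, is determined by the
fixed constants ess_frac and ess_min.
The automatic gating rule chooses the
smallest $\tau$
such that
$$\mathrm{ESS}(\tau) \ge \mathrm{ESS}^{*}.$$
Because the retained set $\mathcal{W}(\tau)$
is monotone (nested) in $\tau$,
this procedure is equivalent to dropping the largest event weights (the
smallest values of $\hat{G}(T_i-)$)
until the ESS constraint is satisfied; $\tau$
is the corresponding cutoff on $\hat{G}(T_i-)$.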
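The gate search can be sketched as follows. This is a hedged illustration rather than the package routine: `ess`, `select.gate`, and the argument `ess.target` are hypothetical names, the ESS target is passed in directly instead of being derived from ess_frac and ess_min, the ESS is assumed to be the usual ratio of the squared sum of weights to the sum of squared weights, and `.Machine$double.eps` stands in for the small constant assigned to gated events.

```r
## Effective sample size of a weight vector: (sum w)^2 / sum(w^2)
ess <- function(w) sum(w)^2 / sum(w^2)

## Smallest gate, searched over the observed event values of G(T_i-), such that
## the retained event weights 1 / G^2 reach the ESS target (illustrative sketch).
select.gate <- function(G.event, ess.target) {
  cand <- sort(unique(G.event))
  for (tau in cand) {
    keep <- G.event >= tau
    if (ess(1 / G.event[keep]^2) >= ess.target) return(tau)
  }
  max(cand)                              # fallback: keep only the smallest weights
}

G.event <- c(1.0, 0.9, 0.8, 0.5, 0.1)    # hypothetical G(T_i-) at five events
tau <- select.gate(G.event, ess.target = 2.5)
tau                                      # 0.5: only the event with G = 0.1 is gated

## stabilized weights: gated events receive a negligible positive weight
w.stab <- ifelse(G.event >= tau, 1 / G.event^2, .Machine$double.eps)
```

With all five events retained, the single weight of 100 dominates and the ESS is close to 1; dropping that one event raises the ESS to about 2.9, which is why the search stops at a gate of 0.5.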
randomForestSRC
For survival and competing risks forests, the package now adopts a
default concordance-based performance calculation using Uno’s IPCW
estimator. This affects all concordance-derived performance summaries,
including OOB and test error rates, as well as all downstream
calculations based on those values. The legacy unweighted Harrell
concordance can be recovered by setting use.uno = FALSE
when fitting a survival forest:
fit <- rfsrc(Surv(time, status) ~ ., data = dat, use.uno = FALSE)
For standalone computation of concordance for a given outcome and
risk score, the exported get.cindex() provides a direct
interface to the native concordance routines:
err <- get.cindex(time, censoring, predicted, weight = NULL, fast = TRUE)
C <- 1 - err
The arguments correspond to the observed outcomes (time, the observed follow-up times, and censoring, the event indicator), the risk score (predicted), and
(optionally) precomputed IPCW weights (weight).
For large samples,
fast = TRUE yields major speedups by using the Fenwick-tree
implementation (if not specified, by default the fast version is used
when the sample size $n$
exceeds 500 or the number of events exceeds 250). When
weight = NULL, get.cindex() computes
unweighted Harrell concordance; when weight is supplied, it
computes Uno IPCW concordance using the provided weights (and therefore
reflects any stabilization encoded in those weights).
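To illustrate how a Fenwick (binary indexed) tree supports the rank-count updates and prefix queries, here is a small didactic sketch for the unweighted case. It is not the native routine behind `fast = TRUE`: the helper names `lowbit`, `bit.add`, `bit.sum`, and `fast.harrell.c` are hypothetical, ties in time and risk are assumed absent, and because R copies vectors on modification the sketch shows the logic rather than the speed.

```r
## Lowest set bit of a positive integer, computed arithmetically
lowbit <- function(k) {
  b <- 1
  while (k %% (2 * b) == 0) b <- 2 * b
  b
}

## Point update: record one subject at risk-rank k
bit.add <- function(tree, k) {
  n <- length(tree)
  while (k <= n) {
    tree[k] <- tree[k] + 1
    k <- k + lowbit(k)
  }
  tree
}

## Prefix query: number of recorded subjects with risk-rank <= k
bit.sum <- function(tree, k) {
  s <- 0
  while (k > 0) {
    s <- s + tree[k]
    k <- k - lowbit(k)
  }
  s
}

## Sweep from the latest to the earliest follow-up time (no ties assumed)
fast.harrell.c <- function(time, status, risk) {
  n <- length(time)
  r <- rank(risk)                        # risk ranks 1..n
  tree <- numeric(n)
  num <- 0
  den <- 0
  seen <- 0
  for (i in order(time, decreasing = TRUE)) {
    if (status[i] == 1) {
      den <- den + seen                       # every later subject is comparable
      num <- num + bit.sum(tree, r[i] - 1)    # later subjects with lower risk
    }
    tree <- bit.add(tree, r[i])
    seen <- seen + 1
  }
  num / den
}

time   <- c(2, 4, 6, 8, 10)
status <- c(1, 0, 1, 1, 0)
risk   <- c(0.9, 0.3, 0.4, 0.6, 0.1)
fast.harrell.c(time, status, risk)       # 6/7, matching direct pair enumeration
```

Each subject triggers one update and at most one query, each touching on the order of $\log n$ tree nodes, which is the source of the speedup over enumerating all pairs.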
So far our definitions have assumed continuous time and strictly
ordered risk scores. The randomForestSRC concordance
implementation handles time ties and prediction ties using the following
set of conventions.
Pairs with equal observed times are handled by imposing an ordering on $(T_i, \delta_i)$ at each unique time: events are treated as occurring before censorings recorded at the same time. Thus, if $T_i = T_j$ with $\delta_i = 1$ and $\delta_j = 0$, the pair is treated as comparable with subject $i$ the earlier event.
If $\eta_i = \eta_j$ for a comparable pair, the pair contributes a half-credit to the numerator.
get.cindex()
The exported function get.cindex() is a direct interface
to the native concordance routines used by randomForestSRC.
Advanced users may use this interface to reproduce (or customize) the
concordance calculations performed internally for OOB and test
performance when calling rfsrc() and
predict.rfsrc().
In the Uno IPCW formulation, concordance weights depend on the
censoring survival function $G$.
When evaluating on an external test set, randomForestSRC
estimates $G$
on the training data and then evaluates $\hat{G}$
at test follow-up times to construct test weights, using the same
stabilization (gating) threshold selected on the training sample. This
avoids reusing test outcomes to estimate the censoring model.
The code below illustrates a train/test workflow on the
pbc data using the get.cindex() function. The
example makes use of helpers get.uno.weights.train() and
get.uno.weights.test() which are currently internal
(non-exported) functions that mirror the package’s default construction
of stabilized Uno IPCW weights. The training call returns both the
weight vector and a fitted censoring model (including the automatically
selected gate $\tau$),
which can then be reused to evaluate $\hat{G}$
and construct weights on an external test set using the test call.
For survival outcomes, predicted is a vector of risk scores.
For competing risks outcomes (discussed next), predicted is
a matrix with one column per event type, and get.cindex()
returns a vector of cause-specific concordance errors. The same
train/test weight construction applies, since the censoring survival
model does not depend on event type.
library(randomForestSRC)
## PBC data: remove NAs, split train/test
data(pbc, package = "randomForestSRC")
pbc.clean <- na.omit(pbc)
## right-censoring indicator (1=event, 0=censor)
pbc.clean$status <- as.integer(pbc.clean$status > 0)
set.seed(1)
trn <- sample(1:nrow(pbc.clean), nrow(pbc.clean)/2, replace = FALSE)
pbc.trn <- pbc.clean[trn, ]
pbc.tst <- pbc.clean[-trn, ]
## Fit forest on training data (Uno weighting is default)
fit <- rfsrc(Surv(days, status) ~ ., data = pbc.trn)
## outcomes and risk scores
time.trn <- fit$yvar[,1]
status.trn <- fit$yvar[,2]
risk.trn <- fit$predicted.oob
## Uno weights estimated from the training censoring distribution
uno.trn <- randomForestSRC:::get.uno.weights.train(time.trn, status.trn)
w.trn <- uno.trn$weight
C.harrell.trn <- 1 - get.cindex(time.trn, status.trn, risk.trn)
C.uno.trn <- 1 - get.cindex(time.trn, status.trn, risk.trn,
                            weight = w.trn)
## ----- test concordance (external risk) -----
pred <- predict(fit, newdata = pbc.tst)
time.tst <- pred$yvar[,1]
status.tst <- pred$yvar[,2]
risk.tst <- pred$predicted
## Construct test weights from the training censoring model and gate
w.tst <- randomForestSRC:::get.uno.weights.test(time.tst, uno.trn$fit)
C.harrell.tst <- 1 - get.cindex(time.tst, status.tst, risk.tst)
C.uno.tst <- 1 - get.cindex(time.tst, status.tst, risk.tst,
                            weight = w.tst)
c(C.harrell.trn = C.harrell.trn,
  C.harrell.tst = C.harrell.tst,
  C.uno.trn = C.uno.trn,
  C.uno.tst = C.uno.tst)
C.harrell.trn C.harrell.tst     C.uno.trn     C.uno.tst
    0.8038976     0.8277734     0.7953605     0.7817458
The above code shows a train/test example using
get.cindex(). The censoring survival function $G$
is estimated on the training sample, yielding Uno
inverse-probability-of-censoring weights
together with an automatically chosen tail gate (truncation threshold) $\tau$
for stability. The fitted censoring model and the same $\tau$
are then reused to construct test-set weights, enabling computation of
both Harrell’s $\hat{C}_H$
(unweighted) and Uno’s IPCW $\hat{C}_U^{(\tau)}$
(gated) on training (using out-of-bag risk scores) and on an independent
test set (using predicted risk scores).
We briefly describe how the above ideas extend to competing risks.
Let $K$
denote the number of event types and write
$T_i = \min(T_i^o, C_i^o)$ and $\delta_i = \epsilon_i \, I(T_i^o \le C_i^o)$,
where $\epsilon_i \in \{1, \ldots, K\}$
is the latent event type and $\delta_i = 0$
indicates censoring. For each event type $k$,
let $\eta_{ik} = \eta_k(\mathbf{X}_i)$
denote a cause-specific risk score, with larger values indicating higher
predicted risk for event $k$.
In randomForestSRC, $\eta_{ik}$
is derived from an event-specific ensemble yielding
cause-$k$
mortality risk scores [5], but the
concordance definition below applies to any monotone risk score.
For a fixed cause $k$, a pair of subjects $(i, j)$ can be informative in two distinct ways. Ignoring ties for notational clarity, define the following (directed) comparability indicators:
$$A^{(1)}_{ij,k} = I\left(\epsilon_i = k,\ T_i^o < T_j^o\right), \qquad A^{(2)}_{ij,k} = I\left(\epsilon_i = k,\ \epsilon_j \ne k,\ T_j^o \le T_i^o\right).$$
Type-(1) pairs compare a type-$k$ failure to a subject who is still event-free at $T_i^o$. Type-(2) pairs compare a type-$k$ failure to a subject who experiences a competing event before (or at) $T_i^o$; such pairs are informative because the competing event precludes a type-$k$ failure occurring earlier than $T_i^o$.
Define $\psi(a,b) = I(a>b) + \tfrac{1}{2} I(a=b)$ to be the concordance score for a pair of risk scores. The (population) cause-specific concordance for event $k$ is then
$$C_k = \frac{E\left[\left(A^{(1)}_{ij,k} + A^{(2)}_{ij,k}\right) \psi(\eta_{ik}, \eta_{jk})\right]}{E\left[A^{(1)}_{ij,k} + A^{(2)}_{ij,k}\right]}.$$
In applications, ties in the risk scores are handled directly by $\psi$. Ties in the observed follow-up times are handled by adopting a deterministic time-ordering convention (event before censoring at the same recorded time) together with partial-credit rules for event-event time ties similar to Section @ref(sec:interface).
Previously, concordance for competing risks in
randomForestSRC was computed by reducing the competing
risks problem to a right-censored problem. For event $k$,
only subjects with $\delta_i \in \{0, k\}$
were retained (censoring or event $k$),
subjects experiencing competing events were discarded, and Harrell’s
estimator was applied to this subset. This produces a conditional
discrimination measure that ignores comparisons involving subjects who
experience competing events. For backward compatibility, this
calculation remains supported: the package reverts to the legacy version
if the user sets use.uno = FALSE when
fitting a competing risks forest.
The current default is an IPCW estimator that retains both comparison types and corrects for censoring. Let $G(t) = P(C^o > t)$ denote the censoring survival function and let $\hat{G}$ be the reverse Kaplan–Meier estimate computed from the data $\{(T_i, I(\delta_i = 0))\}_{i=1}^n$, treating any observed event (of any type) as a failure in the censoring model. As in the survival setting, we evaluate left limits $\hat{G}(T_i-)$ and apply the same ESS-based gating rule of Section @ref(sec:gating) to select $\tau$ and construct stabilized weights $\hat{w}_i^{(\tau)}$.
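For notational convenience, define $\hat{v}_i = \sqrt{\hat{w}_i^{(\tau)}}$ so that, in the ideal IPCW form, $\hat{v}_i = 1/\hat{G}(T_i-)$ and $\hat{w}_i^{(\tau)} = \hat{v}_i^2$ (up to the gating convention in Section @ref(sec:gating)).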
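For a fixed event type $k$, introduce the index sets
$$\mathcal{E}_k = \{i : \delta_i = k\}, \qquad \mathcal{R}_i = \{j : T_j > T_i\}, \qquad \mathcal{Q}_{ik} = \{j : T_j \le T_i,\ \delta_j \notin \{0, k\}\}$$
(with minor tie adjustments as described earlier). The cause-specific IPCW concordance estimator can be written as
$$\hat{C}_k = \frac{N_k^{(1)} + N_k^{(2)}}{D_k^{(1)} + D_k^{(2)}},$$
where
$$N_k^{(1)} = \sum_{i \in \mathcal{E}_k} \hat{v}_i^2 \sum_{j \in \mathcal{R}_i} \psi(\eta_{ik}, \eta_{jk}), \qquad D_k^{(1)} = \sum_{i \in \mathcal{E}_k} \hat{v}_i^2 \, |\mathcal{R}_i|,$$
$$N_k^{(2)} = \sum_{i \in \mathcal{E}_k} \sum_{j \in \mathcal{Q}_{ik}} \hat{v}_i \hat{v}_j \, \psi(\eta_{ik}, \eta_{jk}), \qquad D_k^{(2)} = \sum_{i \in \mathcal{E}_k} \sum_{j \in \mathcal{Q}_{ik}} \hat{v}_i \hat{v}_j.$$
The type-(1) terms compare a type-$k$ event to subjects still event-free at $T_i$ and use the squared IPCW weights $\hat{v}_i^2$, mirroring the right-censored IPCW concordance. The type-(2) terms compare a type-$k$ event to subjects with earlier competing events and use product weights $\hat{v}_i \hat{v}_j$, corresponding to $1/\{\hat{G}(T_i-)\hat{G}(T_j-)\}$.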
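The two comparison types and their weights can be made concrete with a direct-enumeration sketch. This is an illustration under stated assumptions rather than the package routine: `cr.uno.c.naive` is a hypothetical helper, `event` codes censoring as 0 and observed event types as positive integers, `v` is a per-subject factor assumed to equal one over the left limit of the censoring survival estimate (here supplied by hand, with no gating), and follow-up times are assumed untied.

```r
## Cause-specific IPCW concordance by direct enumeration (illustrative sketch).
## event: 0 = censored, 1, 2, ... = observed event type
## v:     per-subject IPCW factor, assumed to be 1 / G(T_i-)
cr.uno.c.naive <- function(time, event, risk, v, cause) {
  psi <- function(a, b) (a > b) + 0.5 * (a == b)   # half credit for tied risk
  num <- 0
  den <- 0
  for (i in which(event == cause)) {
    s <- psi(risk[i], risk)
    type1 <- time > time[i]                                 # still event-free at T_i
    type2 <- time <= time[i] & event != 0 & event != cause  # earlier competing event
    wt <- v[i]^2 * type1 + v[i] * v * type2                 # squared vs product weights
    num <- num + sum(wt * s)
    den <- den + sum(wt)
  }
  num / den
}

## toy data: two cause-1 events, two cause-2 events, one censored subject
time  <- c(1, 2, 3, 4, 5)
event <- c(2, 1, 0, 1, 2)
risk  <- c(0.7, 0.8, 0.5, 0.6, 0.1)      # cause-1 risk scores
v     <- c(1, 1, 1, 1.5, 1.5)            # 1 / G(T_i-) after one censoring at time 3

cr.uno.c.naive(time, event, risk, v, cause = 1)             # 25/31
cr.uno.c.naive(time, event, risk, rep(1, 5), cause = 1)     # 5/6 without weights
```

The only discordant comparison is a type-(2) pair: the cause-1 event at time 4 has a lower risk score than the subject with a competing event at time 1, and that pair enters with the product weight rather than the squared weight.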
Cite this vignette as
H. Ishwaran, M. Lu, and
U. B. Kogalur. 2026. “Uno IPCW Weighting, Stabilization, and Fast
Computation for Improved Concordance.” http://randomforestsrc.org/articles/uno.html.
@misc{HemantVIMP,
author = "Hemant Ishwaran and Min Lu and Udaya B. Kogalur",
title = {{randomForestSRC}: Uno IPCW Weighting, Stabilization, and Fast Computation for Improved Concordance},
year = {2026},
url = {http://randomforestsrc.org/articles/uno.html},
howpublished = "\url{http://randomforestsrc.org/articles/uno.html}",
note = "[accessed date]"
}