`find.interaction.rfsrc.Rd`

Find pairwise interactions between variables.

```
find.interaction(object, xvar.names, cause, m.target,
importance = c("permute", "random", "anti",
"permute.ensemble", "random.ensemble", "anti.ensemble"),
method = c("maxsubtree", "vimp"), sorted = TRUE, nvar, nrep = 1,
na.action = c("na.omit", "na.impute", "na.random"),
seed = NULL, do.trace = FALSE, verbose = TRUE, ...)
```

- object
An object of class

`(rfsrc, grow)`

or`(rfsrc, forest)`

.- xvar.names
Character vector of names of target x-variables. Default is to use all variables.

- cause
For competing risk families, integer value between 1 and

`J`

indicating the event of interest, where`J`

is the number of event types. The default is to use the first event type.- m.target
Character value for multivariate families specifying the target outcome to be used. If left unspecified, the algorithm will choose a default target.

- importance
Type of variable importance (VIMP). See

`rfsrc`

for details.- method
Method of analysis: maximal subtree or VIMP. See details below.

- sorted
Should variables be sorted by VIMP? Does not apply for competing risks.

- nvar
Number of variables to be used.

- nrep
Number of Monte Carlo replicates when method="vimp".

- na.action
Action to be taken if the data contains

`NA`

values. Applies only when method="vimp".- seed
Seed for random number generator. Must be a negative integer.

- do.trace
Number of seconds between updates to the user on approximate time to completion.

- verbose
Set to

`TRUE`

for verbose output.- ...
Further arguments passed to or from other methods.

Using a previously grown forest, identify pairwise interactions for all pairs of variables from a specified list. There are two distinct approaches specified by the option method.

method="maxsubtree"

This invokes a maximal subtree analysis. In this case, a matrix is returned where entries [i][i] are the normalized minimal depth of variable [i] relative to the root node (normalized wrt the size of the tree) and entries [i][j] indicate the normalized minimal depth of a variable [j] wrt the maximal subtree for variable [i] (normalized wrt the size of [i]'s maximal subtree). Smaller [i][i] entries indicate predictive variables. Small [i][j] entries having small [i][i] entries are a sign of an interaction between variable i and j (note: the user should scan rows, not columns, for small entries). See Ishwaran et al. (2010, 2011) for more details.

method="vimp"

This invokes a joint-VIMP approach. Two variables are paired and their paired VIMP calculated (refered to as 'Paired' importance). The VIMP for each separate variable is also calculated. The sum of these two values is refered to as 'Additive' importance. A large positive or negative difference between 'Paired' and 'Additive' indicates an association worth pursuing if the univariate VIMP for each of the paired-variables is reasonably large. See Ishwaran (2007) for more details.

Computations might be slow depending upon the size of the data and the forest. In such cases, consider setting nvar to a smaller number. If method="maxsubtree", consider using a smaller number of trees in the original grow call.

If nrep is greater than 1, the analysis is repeated
`nrep`

times and results averaged over the replications
(applies only when method="vimp").

Invisibly, the interaction table (a list for competing risk data) or the maximal subtree matrix.

Ishwaran H. (2007). Variable importance in binary regression
trees and forests, *Electronic J. Statist.*, 1:519-537.

Ishwaran H., Kogalur U.B., Gorodeski E.Z, Minn A.J. and
Lauer M.S. (2010). High-dimensional variable selection for survival
data. *J. Amer. Statist. Assoc.*, 105:205-217.

Ishwaran H., Kogalur U.B., Chen X. and Minn A.J. (2011). Random
survival forests for high-dimensional data. *Statist. Anal. Data
Mining*, 4:115-132.

```
# \donttest{
## ------------------------------------------------------------
## find interactions, survival setting
## ------------------------------------------------------------
data(pbc, package = "randomForestSRC")
pbc.obj <- rfsrc(Surv(days,status) ~ ., pbc, importance = TRUE)
find.interaction(pbc.obj, method = "vimp", nvar = 8)
## ------------------------------------------------------------
## find interactions, competing risks
## ------------------------------------------------------------
data(wihs, package = "randomForestSRC")
wihs.obj <- rfsrc(Surv(time, status) ~ ., wihs, nsplit = 3, ntree = 100,
importance = TRUE)
find.interaction(wihs.obj)
find.interaction(wihs.obj, method = "vimp")
## ------------------------------------------------------------
## find interactions, regression setting
## ------------------------------------------------------------
airq.obj <- rfsrc(Ozone ~ ., data = airquality, importance = TRUE)
find.interaction(airq.obj, method = "vimp", nrep = 3)
find.interaction(airq.obj)
## ------------------------------------------------------------
## find interactions, classification setting
## ------------------------------------------------------------
iris.obj <- rfsrc(Species ~., data = iris, importance = TRUE)
find.interaction(iris.obj, method = "vimp", nrep = 3)
find.interaction(iris.obj)
## ------------------------------------------------------------
## interactions for multivariate mixed forests
## ------------------------------------------------------------
mtcars2 <- mtcars
mtcars2$cyl <- factor(mtcars2$cyl)
mtcars2$carb <- factor(mtcars2$carb, ordered = TRUE)
mv.obj <- rfsrc(cbind(carb, mpg, cyl) ~., data = mtcars2, importance = TRUE)
find.interaction(mv.obj, method = "vimp", outcome.target = "carb")
find.interaction(mv.obj, method = "vimp", outcome.target = "mpg")
find.interaction(mv.obj, method = "vimp", outcome.target = "cyl")
# }
```