Using the survival tree from the "rpart" package in R to predict new observations

I am trying to use the "rpart" package in R to build a survival tree, and I hope to use this tree to then make predictions for other observations.

I know there have been many SO questions about rpart and prediction; however, I could not find any that address a problem that (I think) is specific to using rpart with a "Surv" object.

My particular problem is with the interpretation of the results of the predict function. An example is useful:

library(rpart)
library(survival)  # provides Surv() and survfit()

# Make Data:
set.seed(4)
dat = data.frame(X1 = sample(x = c(1,2,3,4,5), size = 1000, replace=TRUE))
dat$t = rexp(1000, rate=dat$X1)
dat$t = dat$t / max(dat$t)
dat$e = rbinom(n = 1000, size = 1, prob = 1-dat$t )

# Survival Fit:
sfit = survfit(Surv(t, event = e) ~ 1, data=dat)
plot(sfit)

# Tree Fit:
tfit = rpart(formula = Surv(t, event = e) ~ X1 , data = dat, control=rpart.control(minsplit=30, cp=0.01))
plot(tfit); text(tfit)

# Survival Fit, Broken by Node in Tree:
dat$node = as.factor(tfit$where)
plot( survfit(Surv(dat$t, event = dat$e)~dat$node) )

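So far so good. My understanding is that rpart fits exponential survival curves to subsets of the data. Based on that, I believe that when I call predict(tfit), I get, for each observation, a number corresponding to the parameter of the exponential curve for that observation. So, for example, if predict(tfit)[1] is .46, then for the first observation in my original dataset the curve is given by P(s) = exp(−λt), where λ=.46.

This seems like exactly what I want: for each observation (or any new observation), I can get the predicted probability that the observation will be alive/dead at a given time point. (EDIT: I realize this is probably a misconception; these curves do not give the probability of being alive/dead, but the probability of surviving an interval. This does not change the problem described below, though.)

However, when I try to use the exponential formula...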

# Predict:
# an attempt to use the rates extracted from the tree to
# capture the survival curve formula in each tree node.
rates = unique(predict(tfit))
for (rate in rates) {
  grid= seq(0,1,length.out = 100)
  lines(x= grid, y= exp(-rate*(grid)), col=2)
}


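What I have done here is split the dataset the same way the survival tree did, then used survfit to plot a non-parametric curve for each partition. Those are the black lines. I have also drawn (in red) the lines that result from plugging what I thought was the "rate" parameter into what I thought was the exponential survival formula.

I understand that the non-parametric and parametric fits should not necessarily be identical, but this seems like more than that: it looks as if I need to rescale my X variable or something.

Basically, I do not seem to understand the formula that rpart/survival uses under the hood. Can anyone help me get from (1) the rpart model to (2) a survival equation for any arbitrary observation?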


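The survival data are rescaled internally so that the predicted rate in the root node is always fixed at 1.000. The rates reported by predict() are therefore always relative to the survival in the root node, i.e., higher or lower by a certain factor. See Section 8.4 of vignette("longintro", package = "rpart") for details. In any case, the Kaplan-Meier curves you plotted correspond to what is also used in the rpart vignette.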
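As a rough sketch of what that rescaling implies (my own reading, not code taken from the vignette): if a node has relative rate r and the root node has cumulative hazard Λ0(t), the node's curve on the original time scale is approximately exp(−r · Λ0(t)) rather than exp(−r · t). The snippet below estimates Λ0(t) from the root-node Kaplan-Meier fit and overlays the resulting curves in red. The names `base_cumhaz` and `rel_rates` are made up for this illustration, and since rpart reports shrunken rate estimates the agreement should only be approximate.

```r
library(survival)
library(rpart)

# Same simulated data as in the question.
set.seed(4)
dat <- data.frame(X1 = sample(x = c(1, 2, 3, 4, 5), size = 1000, replace = TRUE))
dat$t <- rexp(1000, rate = dat$X1)
dat$t <- dat$t / max(dat$t)
dat$e <- rbinom(n = 1000, size = 1, prob = 1 - dat$t)

tfit <- rpart(Surv(t, event = e) ~ X1, data = dat,
              control = rpart.control(minsplit = 30, cp = 0.01))
dat$node <- factor(tfit$where)

# Root-node Kaplan-Meier fit, used here as the baseline survival curve.
km_root <- survfit(Surv(t, event = e) ~ 1, data = dat)

# Baseline cumulative hazard as a step function of time (assumption:
# -log(KM) is a good enough estimate; pmax() guards against log(0)).
base_cumhaz <- stepfun(km_root$time,
                       c(0, -log(pmax(km_root$surv, .Machine$double.eps))))

# Relative rates reported by rpart, one per terminal node.
rel_rates <- sort(unique(predict(tfit)))

# Black: Kaplan-Meier curve per terminal node.
plot(survfit(Surv(t, event = e) ~ node, data = dat))

# Red: root-node curve raised to the power of each relative rate.
grid <- seq(0, 1, length.out = 100)
for (r in rel_rates) {
  lines(x = grid, y = exp(-r * base_cumhaz(grid)), col = 2)
}
```

With this rescaling a relative rate above 1 gives a curve that drops faster than the root-node curve, and a rate below 1 gives one that drops more slowly.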

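If you want to obtain the Kaplan-Meier curves in the tree directly, along with the predicted median survival times, you can coerce the rpart tree to a constparty object, as provided by the partykit package: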

library("partykit")
(tfit2 <- as.party(tfit))
## Model formula:
## Surv(t, event = e) ~ X1
## 
## Fitted party:
## [1] root
## |   [2] X1 < 2.5
## |   |   [3] X1 < 1.5: 0.192 (n = 213)
## |   |   [4] X1 >= 1.5: 0.082 (n = 213)
## |   [5] X1 >= 2.5: 0.037 (n = 574)
## 
## Number of inner nodes:    2
## Number of terminal nodes: 3
##
plot(tfit2)


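The print output shows the median survival time in each terminal node, and the plot shows the corresponding Kaplan-Meier curve. Both can also be obtained with predict(), setting the type argument to "response" and "prob", respectively.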

predict(tfit2, type = "response")[1]
##          5 
## 0.03671885 
predict(tfit2, type = "prob")[[1]]
## Call: survfit(formula = y ~ 1, weights = w, subset = w > 0)
## 
##  records    n.max  n.start   events   median  0.95LCL  0.95UCL 
## 574.0000 574.0000 574.0000 542.0000   0.0367   0.0323   0.0408 
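Since the question is about new observations, here is a minimal sketch of the same calls with a newdata argument. The data frame `newdat` and the time point 0.05 are made up for illustration; the data and tree are refit exactly as in the question so that the snippet runs on its own.

```r
library(survival)
library(rpart)
library(partykit)

# Same simulated data and tree as in the question.
set.seed(4)
dat <- data.frame(X1 = sample(x = c(1, 2, 3, 4, 5), size = 1000, replace = TRUE))
dat$t <- rexp(1000, rate = dat$X1)
dat$t <- dat$t / max(dat$t)
dat$e <- rbinom(n = 1000, size = 1, prob = 1 - dat$t)

tfit <- rpart(Surv(t, event = e) ~ X1, data = dat,
              control = rpart.control(minsplit = 30, cp = 0.01))
tfit2 <- as.party(tfit)

# Hypothetical new observations (not part of the original data).
newdat <- data.frame(X1 = c(1, 3, 5))

# Terminal node each new observation falls into.
node_new <- predict(tfit2, newdata = newdat, type = "node")

# Median survival time of that node.
med_new <- predict(tfit2, newdata = newdat, type = "response")

# Kaplan-Meier fit of that node, one survfit object per new observation.
km_new <- predict(tfit2, newdata = newdat, type = "prob")

# Estimated probability of surviving past t = 0.05 for each new observation;
# extend = TRUE keeps the result defined if 0.05 lies beyond a node's last time.
surv_at <- sapply(km_new, function(sf) summary(sf, times = 0.05, extend = TRUE)$surv)
surv_at
```

Every new observation simply inherits the Kaplan-Meier curve of the terminal node it is routed to, so observations in the same node get identical predictions.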

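As an alternative to rpart survival trees, you might also consider the non-parametric survival trees based on conditional inference in ctree() (using logrank scores) or fully parametric survival trees built with the general mob() infrastructure, both from partykit.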
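A minimal sketch of the ctree() route on the same simulated data, using default control settings (an assumption; tuning may well be needed in practice). The data frame `newdat` is again hypothetical.

```r
library(survival)
library(partykit)

# Same simulated data as in the question.
set.seed(4)
dat <- data.frame(X1 = sample(x = c(1, 2, 3, 4, 5), size = 1000, replace = TRUE))
dat$t <- rexp(1000, rate = dat$X1)
dat$t <- dat$t / max(dat$t)
dat$e <- rbinom(n = 1000, size = 1, prob = 1 - dat$t)

# Conditional inference survival tree with default settings.
cfit <- ctree(Surv(t, event = e) ~ X1, data = dat)
print(cfit)

# Predicted median survival times for hypothetical new observations.
newdat <- data.frame(X1 = c(1, 5))
med_ctree <- predict(cfit, newdata = newdat, type = "response")
med_ctree
```

The resulting object is a constparty as well, so plot() and predict(..., type = "prob") work the same way as for the converted rpart tree.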
