PPC for Cox model · paul-buerkner/brms#966

5 comments · 0 reactions · 0 assignees · Language: R · 1,402 stars · 220 forks

Labels: feature, good first issue, post-processing

Description

A posterior predictive check (PPC) for the Cox model would be nice, but it would probably have to be conducted differently than for all other models, since the spline estimate of the baseline hazard makes it hard to sample from the response distribution. Section 4.4 ("Standardised survival probabilities") of the preprint by Brilleman et al. (2020) describes a PPC that is feasible in this situation. As I understand it (and probably only under right-censoring), this PPC compares the Kaplan-Meier estimate of the survival curve (i.e. the estimated CCDF) for the observed data to posterior CCDFs, one per posterior draw. Each posterior CCDF is the average of the individual posterior CCDFs for that draw, and each individual posterior CCDF is obtained by plugging in the predictor values of one individual from the observed dataset.

With the default settings, there would be 4000 posterior CCDFs, so, as usual, only a random subset of them would be used for the overlay plot. The resulting plot should be similar to bayesplot::ppc_ecdf_overlay(), but it should take the right-censoring of the observed data into account and apply the "CCDF = 1 - CDF" transformation (although this transformation is not strictly necessary and is just a convention from traditional survival analysis).
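A minimal base-R sketch of the comparison described above, to make the averaging step concrete. Everything in it is an assumption for illustration: the data are simulated, the "posterior draws" (beta_draws, rate_draws) are random numbers rather than draws from a brms fit, and cum_base_haz() is a hypothetical stand-in with a constant baseline hazard where a real Cox fit would evaluate its spline-based cumulative baseline hazard. The helper names km_estimate() and std_surv() are made up as well.

```r
# Sketch only: simulated data and simulated "posterior draws"; no brms fit is used.
set.seed(1)

# Hypothetical right-censored data with one covariate x
n <- 200
x <- rnorm(n)
t_event <- rexp(n, rate = 0.1 * exp(0.5 * x))
t_cens <- runif(n, 0, 30)
time <- pmin(t_event, t_cens)
status <- as.integer(t_event <= t_cens)  # 1 = event, 0 = right-censored

# Kaplan-Meier estimate of S(t), returned as a right-continuous step function
km_estimate <- function(time, status) {
  ev_times <- sort(unique(time[status == 1]))
  factors <- vapply(ev_times, function(tt) {
    d <- sum(time == tt & status == 1)  # events at tt
    r <- sum(time >= tt)                # at risk just before tt
    1 - d / r
  }, numeric(1))
  stepfun(ev_times, c(1, cumprod(factors)))
}
km <- km_estimate(time, status)

# Hypothetical posterior draws (stand-ins for draws from a fitted model)
n_draws <- 50
beta_draws <- rnorm(n_draws, mean = 0.5, sd = 0.1)
rate_draws <- exp(rnorm(n_draws, mean = log(0.1), sd = 0.1))

# Stand-in cumulative baseline hazard H0(t); assumed form, not the brms spline
cum_base_haz <- function(t, rate) rate * t

t_grid <- seq(0, 30, length.out = 61)

# Standardised survival for one draw: compute each individual's CCDF
# S_i(t) = exp(-H0(t) * exp(beta * x_i)), then average over individuals
std_surv <- function(beta, rate) {
  H0 <- cum_base_haz(t_grid, rate)
  S_ind <- exp(-outer(exp(beta * x), H0))  # n x length(t_grid)
  colMeans(S_ind)
}
S_post <- t(mapply(std_surv, beta_draws, rate_draws))  # n_draws x length(t_grid)

# Overlay: posterior CCDFs in grey, Kaplan-Meier estimate in black
matplot(t_grid, t(S_post), type = "l", lty = 1, col = "grey70",
        ylim = c(0, 1), xlab = "time", ylab = "S(t)")
lines(t_grid, km(t_grid), type = "s", lwd = 2)
```

In a real implementation, only a random subset of the rows of S_post would be drawn, and the Kaplan-Meier curve would typically come from an established routine rather than a hand-written one.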

In principle, the posterior CCDFs could be computed in the generated quantities block, but for better generalizability (e.g. later also CCDF prediction for new data), it would probably be desirable to compute them in R.

The same style of PPC could also be used for the non-Cox time-to-event models already available in brms (e.g. right-censored log-normal or right-censored Weibull), but with an easier implementation, because sampling from the (uncensored) response distribution is easy for these distributions. (Sampling simplifies the implementation in two ways: no extra code is needed to compute the posterior CCDFs exactly, and the individual posterior CCDFs and their averaging are not needed either.) However, this is a different topic, so I'll open a new issue for that.
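For contrast, a sketch of the sampling-based variant for a right-censored log-normal model, again in base R and under the same caveats: the data are simulated, and alpha_draws, beta_draws and sigma_draws are made-up stand-ins for posterior draws (with a fitted brms model, the replicated responses would presumably come from posterior_predict() instead). For each draw, one uncensored replicated dataset is sampled and its empirical CCDF is computed directly, so no individual CCDFs and no averaging are involved.

```r
# Sketch only: simulated data and simulated "posterior draws"; no brms fit is used.
set.seed(2)

# Hypothetical right-censored log-normal data with one covariate x
n <- 200
x <- rnorm(n)
t_event <- rlnorm(n, meanlog = 1 + 0.5 * x, sdlog = 0.8)
t_cens <- runif(n, 0, 20)
time <- pmin(t_event, t_cens)
status <- as.integer(t_event <= t_cens)  # 1 = event, 0 = right-censored

# Kaplan-Meier estimate of S(t) for the observed (censored) data
km_estimate <- function(time, status) {
  ev_times <- sort(unique(time[status == 1]))
  factors <- vapply(ev_times, function(tt) {
    1 - sum(time == tt & status == 1) / sum(time >= tt)
  }, numeric(1))
  stepfun(ev_times, c(1, cumprod(factors)))
}
km <- km_estimate(time, status)

# Hypothetical posterior draws of intercept, slope and residual SD
n_draws <- 50
alpha_draws <- rnorm(n_draws, mean = 1, sd = 0.05)
beta_draws <- rnorm(n_draws, mean = 0.5, sd = 0.05)
sigma_draws <- exp(rnorm(n_draws, mean = log(0.8), sd = 0.05))

t_grid <- seq(0, 20, length.out = 81)

# One replicated, uncensored dataset per draw; CCDF = 1 - ECDF on the grid
S_rep <- t(vapply(seq_len(n_draws), function(s) {
  y_rep <- rlnorm(n, meanlog = alpha_draws[s] + beta_draws[s] * x,
                  sdlog = sigma_draws[s])
  1 - ecdf(y_rep)(t_grid)
}, numeric(length(t_grid))))  # n_draws x length(t_grid)

# Overlay: replicated CCDFs in grey, Kaplan-Meier estimate in black
matplot(t_grid, t(S_rep), type = "l", lty = 1, col = "grey70",
        ylim = c(0, 1), xlab = "time", ylab = "S(t)")
lines(t_grid, km(t_grid), type = "s", lwd = 2)
```

Note that the replicated data are uncensored while the observed data are censored, which is why the observed side uses the Kaplan-Meier estimate rather than a plain empirical CCDF.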

Contributor guide

Tech stack: R
Domain: data
Issue type: feature
Difficulty: 5
Estimated time: over 1 week
Activity status: stale
Clarity: mostly clear
Prerequisites: Knowledge of survival analysis; familiarity with the brms package; Bayesian inference concepts
Beginner friendliness: 15
Research direction: Review the PPC approach for standardised survival probabilities described in Section 4.4 of Brilleman et al. (2020). Examine existing PPC implementations in brms for other time-to-event models to understand the code structure. Consider implementing it either in R or in the generated quantities block in Stan. Focus on the Cox model's spline estimate of the baseline hazard. The issue suggests starting with right-censored data and comparing the Kaplan-Meier estimate with the posterior CCDFs.