生存時間解析の基本
パッケージのインストール
install.packages("survival") library(survival)
survivalのデータセットheartを使用します
head(heart) start, stop, event: Entry and exit time and status for this interval of time age: age-48 years year: year of acceptance (in years after 1 Nov 1967) surgery: prior bypass surgery 1=yes#バイパス手術あり transplant: received transplant 1=yes#移植を受けている id: patient id
バイパス手術あり&なしの中央値
pfit <- survfit(Surv(stop-start,event) ~ surgery, heart) Call: survfit(formula = Surv(stop - start, event) ~ surgery, data = heart) n events median 0.95LCL 0.95UCL surgery=0 143 66 102 64 280 surgery=1 29 9 897 322 NA
バイパス手術あり&なしで生存曲線を描く、赤(あり)vs黒(なし)
plot(pfit, col = 1:2)
バイパス手術あり&なしの有意差検定 log-rank test
rho = 0 (デフォルト)this is the log-rank or Mantel-Haenszel test
rho = 1 it is equivalent to the Peto & Peto modification of the Gehan-Wilcoxon test.
diff_surg <- survdiff(Surv(stop-start,event)~surgery , heart, rho = 0) Call: survdiff(formula = Surv(stop - start, event) ~ surgery, data = heart, rho = 0) N Observed Expected (O-E)^2/E (O-E)^2/V surgery=0 143 66 57.8 1.15 5.17 surgery=1 29 9 17.2 3.87 5.17 Chisq= 5.2 on 1 degrees of freedom, p= 0.02
p値のみを算出
pchisq(diff_surg <- survdiff(Surv(stop-start,event)~surgery , heart, rho = 0)$chisq, 1, lower.tail=FALSE ) [1] 0.02294185
surgeryを1にすれば、全体の曲線と95%信頼区間をプロットする
pfit2 <- survfit(Surv(stop-start,event) ~ 1, heart) plot(pfit2, col = c(1,4,4))
年齢で比較 48歳以上と48歳未満で検定
diff_age1 <- survdiff(Surv(stop-start,event)~ age<0 , heart, rho = 0) Call: survdiff(formula = Surv(stop - start, event) ~ age < 0, data = heart, rho = 0) N Observed Expected (O-E)^2/E (O-E)^2/V age < 0=FALSE 81 39 32.5 1.283 2.29 age < 0=TRUE 91 36 42.5 0.983 2.29 Chisq= 2.3 on 1 degrees of freedom, p= 0.1
年齢で比較 50歳以上と40以下
sub <- subset(heart, age>=2 | age <= -8) diff_age2 <- survdiff(Surv(stop-start,event)~ age>=2 , sub, rho = 0) Call: survdiff(formula = Surv(stop - start, event) ~ age >= 2, data = sub, rho = 0) N Observed Expected (O-E)^2/E (O-E)^2/V age >= 2=FALSE 33 10 18.2 3.67 6.82 age >= 2=TRUE 57 31 22.8 2.92 6.82 Chisq= 6.8 on 1 degrees of freedom, p= 0.009
カテゴリー変数がある場合の層別 カテゴリー変数名をAとします。Aには1,2,3,4が含まれており、1と3を比較します
x<-1; y<-3 cat <- subset(heart, A==x | A==y, data=heart ) diff_c <- survdiff(Surv(stop-start,event)~"カテゴリ-変数名" , cat)
COX比例ハザードモデル
cox_age_sur <- coxph(Surv(stop-start,event) ~ age + surgery, data=heart) summary(cox_age_sur) Call: coxph(formula = Surv(stop - start, event) ~ age + surgery, data = heart) n= 172, number of events= 75 coef exp(coef) se(coef) z Pr(>|z|) age 0.02941 1.02985 0.01384 2.125 0.0336 * surgery -0.82104 0.43997 0.35872 -2.289 0.0221 * --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 exp(coef) exp(-coef) lower .95 upper .95 age 1.03 0.971 1.0023 1.0582 surgery 0.44 2.273 0.2178 0.8887 Concordance= 0.603 (se = 0.039 ) Rsquare= 0.062 (max possible= 0.975 ) Likelihood ratio test= 10.96 on 2 df, p=0.004 Wald test = 9.77 on 2 df, p=0.008 Score (logrank) test = 10.17 on 2 df, p=0.006
生存率の予測
fit_surv <- survfit(cox_age_sur) plot(fit_surv,col = c(2,3,3)) haz50 <- summary(fit_surv) Call: survfit(formula = cox_age_sur) time n.risk n.event survival std.err lower 95% CI upper 95% CI 0.5 172 1 0.995 0.00537 0.9842 1.000 1.0 171 2 0.984 0.00926 0.9659 1.000 2.0 166 3 0.967 0.01314 0.9421 0.994 3.0 160 4 0.945 0.01703 0.9124 0.979 5.0 150 1 0.939 0.01792 0.9050 0.975 6.0 147 2 0.928 0.01962 0.8900 0.967 8.0 144 1 0.922 0.02042 0.8826 0.963 9.0 141 1 0.916 0.02121 0.8750 0.958 10.0 140 1 0.910 0.02197 0.8675 0.954