Is it possible to pass samples of unequal size to function boot in R?
I'm writing a tutorial on bootstrapping in R. I settled on the function boot in the boot package. I got the book "An Introduction to the Bootstrap" by Efron/Tibshirani (1993) and want to replicate a few of their examples.
Quite often in those examples, they compute statistics based on different samples. For instance, one example has a sample of 16 mice. 7 of those mice received a treatment that was meant to prolong survival time after a test surgery. The remaining 9 mice did not receive the treatment. For each mouse, the number of days it survived was collected (values are given below).
Now, I want to use the bootstrapping approach to find out if the difference of means is significant or not. However, if I understand the help page of boot correctly, I can't just pass two different samples with unequal sample sizes to the function. My workaround is as follows:
# load package boot
library(boot)

# read in survival time in days for each mouse
treatment <- c(94, 197, 16, 38, 99, 141, 23)
control <- c(52, 104, 146, 10, 51, 30, 40, 27, 46)

# call boot twice(!)
b1 <- boot(data = treatment, statistic = function(x, i) {mean(x[i])}, R = 10000)
b2 <- boot(data = control, statistic = function(x, i) {mean(x[i])}, R = 10000)

# compute difference of means manually
mean_diff <- b1$t - b2$t
In my opinion, this solution is a bit of a hack. The statistic I'm interested in is now saved in a vector mean_diff, but I don't get all the great functionality of the boot package anymore. I can't call boot.ci on mean_diff, etc.
So my question is whether my hack is the only way to do a bootstrap with the boot package in R for statistics that compare two different samples. Or is there another way?
I thought about passing in one data.frame with 16 rows and an additional column "group":
df <- data.frame(survival = c(treatment, control),
                 group = c(rep(1, length(treatment)), rep(2, length(control))))
head(df)

  survival group
1       94     1
2      197     1
3       16     1
4       38     1
5       99     1
6      141     1
However, now I would have to tell boot that it always has to sample 7 observations from the first 7 rows and 9 observations from the last 9 rows, and treat these as separate samples. I would not know how to do that.
I've never really figured out what the big advantage of boot is, since it is so easy to code bootstrap procedures manually. You could try, for example, the following using replicate:
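myboot1 <- function() {
  # resample each group with replacement, keeping its original size
  booty <- tapply(df$survival, df$group, FUN = function(x) sample(x, length(x), TRUE))
  # one bootstrap mean per group
  sapply(booty, mean)
}
out1 <- replicate(1000, myboot1())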
From this you can get a bunch of useful statistics quite easily:
rowMeans(out1)                                   # group means
diff(rowMeans(out1))                             # difference (group 2 minus group 1)
mean(out1[1, ] - out1[2, ])                      # another way of getting the difference (group 1 minus group 2)
apply(out1, 1, quantile, c(0.025, 0.975))        # per-group CIs
quantile(out1[1, ] - out1[2, ], c(0.025, 0.975)) # CI for the difference
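If you would rather stay within the boot package, one option that seems to fit is the strata argument of boot, which resamples within each stratum separately, so every bootstrap replicate keeps 7 treated and 9 control mice. The following is only a minimal sketch under that assumption; the helper name mean_diff_stat is made up for illustration, and the seed is arbitrary.

```r
library(boot)  # recommended package that ships with standard R installations

treatment <- c(94, 197, 16, 38, 99, 141, 23)
control   <- c(52, 104, 146, 10, 51, 30, 40, 27, 46)
df <- data.frame(
  survival = c(treatment, control),
  group    = c(rep(1, length(treatment)), rep(2, length(control)))
)

# Hypothetical helper: difference of group means on the resampled rows.
# boot passes the data and a vector of resampled row indices.
mean_diff_stat <- function(d, i) {
  resampled <- d[i, ]
  mean(resampled$survival[resampled$group == 1]) -
    mean(resampled$survival[resampled$group == 2])
}

set.seed(1)  # arbitrary seed, only for reproducibility
# strata = df$group makes boot resample within each group separately
b <- boot(data = df, statistic = mean_diff_stat, R = 10000, strata = df$group)

b$t0                       # observed difference of means
boot.ci(b, type = "perc")  # percentile interval for the difference
```

Because the result is an ordinary boot object, boot.ci and the other helpers of the package should work on it directly, which is what the two-call workaround in the question loses.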