R data.table subsetting within a group and splitting a data table into two
I have the following data.table.

    ts,id
    1,a
    2,a
    3,a
    4,a
    5,a
    6,a
    7,a
    1,b
    2,b
    3,b
    4,b
I want to subset this data.table into two. The criterion is to have approximately the first half of each group (in this case, grouped by the column "id") in one data.table and the remaining rows in another data.table. The expected result is two data.tables, as follows:
    ts,id
    1,a
    2,a
    3,a
    4,a
    1,b
    2,b

and

    ts,id
    5,a
    6,a
    7,a
    3,b
    4,b
I tried the following:

    z1 = x[, .SD[.I < .N/2, ], by = id]
    z1
and got the following:

    id ts
    a  1
    a  2
    a  3
Somehow, .I within .SD isn't working the way I think it should. Any help is appreciated. Thanks in advance.
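.I gives the row locations with respect to the whole data.table, so it can't be used as a within-group position inside .SD. Within a group, seq_len(.N) gives the position you want.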
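To make the difference concrete, here is a small sketch that rebuilds the question's sample data and puts the two counters side by side. The object name `pos` and the column names `global` and `within` are made up for illustration.

```r
library(data.table)

# Sample data from the question: 7 rows for id "a", 4 rows for id "b"
dt <- data.table(ts = c(1:7, 1:4), id = rep(c("a", "b"), c(7, 4)))

# .I holds row numbers in the whole table (it keeps counting across groups),
# while seq_len(.N) restarts at 1 inside every group
pos <- dt[, .(global = .I, within = seq_len(.N)), by = id]
print(pos)
```

For group "b", `global` continues from where group "a" stopped, which is why comparing .I against .N/2 drops every row of the second group.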
Something like the following should work:

    # flag the second half of each group
    dt[, subset := seq_len(.N) > .N/2, by = 'id']
    subset1 <- dt[(subset)][, subset := NULL]
    subset2 <- dt[!(subset)][, subset := NULL]
    subset1
    #    ts id
    # 1:  4  a
    # 2:  5  a
    # 3:  6  a
    # 4:  7  a
    # 5:  3  b
    # 6:  4  b
    subset2
    #    ts id
    # 1:  1  a
    # 2:  2  a
    # 3:  3  a
    # 4:  1  b
    # 5:  2  b
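Note that the split above puts the middle row of an odd-sized group into the second half, whereas the question's expected output keeps it in the first table. A possible variant, assuming "approximately the first half" means rounding up, is sketched here. The names `first_half`, `part1` and `part2` are hypothetical.

```r
library(data.table)

# Sample data from the question
dt <- data.table(ts = c(1:7, 1:4), id = rep(c("a", "b"), c(7, 4)))

# Flag the first ceiling(N/2) rows of each group
dt[, first_half := seq_len(.N) <= ceiling(.N / 2), by = id]

# Split on the flag, keeping only the original columns
part1 <- dt[(first_half), .(ts, id)]
part2 <- dt[!(first_half), .(ts, id)]
```

With the sample data, `part1` should hold ts 1 to 4 for "a" and 1 to 2 for "b", and `part2` the remaining rows.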
For more than 2 subsets, use cut to create a factor with the appropriate number of levels. Something like:
    dt[, subset := cut(seq_len(.N), 3, labels = FALSE), by = 'id']
    # copy a subset for each level to the global environment,
    # not memory efficient!
    list2env(setattr(split(dt, dt[['subset']]), 'names', paste0('s', 1:3)), .GlobalEnv)
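If you would rather not write into the global environment, one alternative is to keep the pieces in a named list. This is only a sketch: it assumes a data.table version whose split method accepts a `by` argument, and the names `part` and `parts` are made up for illustration.

```r
library(data.table)

# Sample data from the question
dt <- data.table(ts = c(1:7, 1:4), id = rep(c("a", "b"), c(7, 4)))

# Assign each row to one of 3 chunks within its group
dt[, part := cut(seq_len(.N), 3, labels = FALSE), by = id]

# One data.table per chunk, held in a list named after the chunk number
parts <- split(dt, by = "part")
names(parts)
```

Each chunk can then be accessed as `parts[["1"]]`, `parts[["2"]]` and so on, without creating extra global variables.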