performance - R: Optimise a while loop nested in a for loop to introduce missing values in a dataframe
I have a dataframe (data) of 70 rows x 4 columns that contains 10% NAs, with no more than 1 NA per row. From this dataset I want to produce 10 dataframes with 60% NAs, but I don't want any entirely empty (all-NA) rows. I wrote a while loop nested in a for loop. The code works but takes a long time to run. Since I need to run this loop on many datasets, I'd like to know if there is an easy way to speed it up.
My dataframe is built like this:
library(missForest)
data <- iris[1:70, 1:4]
for (i in 1:28) {
  data[i, ] <- prodNA(data[i, ], noNA = 0.25)
}
And here is the loop:
missing.data <- list()
for (j in 1:10) {
  missing.data[[j]] <- prodNA(data, noNA = 0.6)
  # regenerate until no row is entirely NA (all 4 cells missing)
  while (sum(rowSums(is.na(missing.data[[j]])) == 4) != 0) {
    missing.data[[j]] <- prodNA(data, noNA = 0.6)
  }
}
Edit: the loop becomes slow when noNA > 0.55, but unfortunately I need to introduce 60% NAs. Also, since the NAs introduced in the loop are placed at random, they may "replace" NAs already present in the original dataframe (data).
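The while loop above is rejection sampling: at noNA = 0.6 an all-NA row becomes likely, so almost every draw is thrown away and regenerated. One alternative (a sketch, not part of missForest; `add_nas` is a hypothetical helper) is to place the NA positions directly, capping each row at ncol - 1 missing cells so an all-NA row can never occur and no rejection is needed:

```r
set.seed(1)
data <- iris[1:70, 1:4]

# Introduce prop * nrow * ncol NAs, leaving at least one
# observed value in every row (no rejection loop needed).
add_nas <- function(df, prop) {
  n <- nrow(df)
  p <- ncol(df)
  target <- round(prop * n * p)        # number of cells to blank out
  stopifnot(target <= n * (p - 1))     # feasible: <= p-1 NAs per row
  cells <- sample(n * p)               # random order over all cells
  na_per_row <- integer(n)
  taken <- 0L
  out <- df
  for (k in cells) {
    if (taken == target) break
    r <- ((k - 1) %% n) + 1            # row index (column-major)
    c <- ((k - 1) %/% n) + 1           # column index
    if (na_per_row[r] < p - 1) {       # keep one value per row
      out[r, c] <- NA
      na_per_row[r] <- na_per_row[r] + 1L
      taken <- taken + 1L
    }
  }
  out
}

missing.data <- lapply(1:10, function(j) add_nas(data, 0.6))
```

The trade-off is that the per-cell missingness is no longer exactly uniform (cells in rows that hit the cap are skipped), but every dataframe is produced in a single pass instead of many rejected draws.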
I am not sure if this is what you are looking for:
library(missForest)
data1 <- iris[1:70, 1:4]
for (i in 1:28) {
  data1[i, ] <- prodNA(data1[i, ], noNA = 0.10)
}
table(is.na(data1))

n <- 10
data2 <- do.call("rbind", replicate(n, data1, simplify = FALSE))
table(is.na(data2))

data3 <- prodNA(data2, noNA = 0.55)

> table(is.na(data3))

FALSE  TRUE
 1133  1667