R - Splitting a data frame based on specific conditions
Consider a data frame having 1200 records and 30 variables. I want to divide the data frame into 6 samples, each of size 200. So far I have tried the following R code:
    createsample <- function(df) {
        totalsample <- ceiling(nrow(df) / 200)
        samplesize = 200
        for (i in 1:totalsample) {
            ## user should define the file name, start and end row
            file <- 'demo.csv'
            start <- (i - 1) * samplesize
            end <- (i * samplesize)
            function1(file, start, end)
            ## call the function again when control reaches here
        }
    }

    createsample(rawdata) ## function call
The above code results in an "unbound" error, because I can't access the first record with an index value of 0; in R the first record is accessed with an index value of 1.
My expectation is: in the first iteration of the loop I want to access records 1-200, in the next iteration records 201-400, and so on, for a total of 6 repetitions, because the loop executes 6 times in total. When reading the data frame, I want the start and end values to change dynamically in each iteration. For example, in the first iteration start <- 1 and end <- 200; in the second iteration start <- 201 and end <- 400; and so on. Thanks in advance.
As I don't know the second function mentioned in the OP's post, we can skip that part and instead split the dataset into a list of data.frames that each have n rows (i.e. 200; the last list element will have the remaining rows if the nrow of the dataset is not a multiple of n).
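    createsample <- function(df, n, sample = FALSE) {
        seqn <- seq_len(nrow(df))
        # group index: rows 1..n get 1, rows n+1..2n get 2, and so on
        g1 <- (seqn - 1) %/% n + 1
        # first and last row number of each consecutive group
        start <- unname(tapply(seqn, g1, head, 1))
        end <- unname(tapply(seqn, g1, tail, 1))
        # optionally shuffle the group labels so rows are assigned at random
        if (sample) {
            g1 <- sample(g1)
        }
        list(splitdat = lapply(split(seqn, g1), function(i) df[i, ]),
             start = start, end = end)
    }

    createsample(yourdat, 200)
    createsample(yourdat, 200, TRUE)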
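The grouping index `(seqn - 1) %/% n + 1` does most of the work in the function above. As a rough illustration only, here is how it behaves on a toy case of 7 rows with n = 3 (both numbers are made up for the example, not taken from the question):

```r
# Toy values chosen for illustration: 7 rows, groups of 3
n    <- 3
seqn <- seq_len(7)

# Integer division maps rows 1-3 to group 1, rows 4-6 to group 2, row 7 to group 3
g1 <- (seqn - 1) %/% n + 1
print(g1)

# split() then collects the row numbers belonging to each group
groups <- split(seqn, g1)
print(groups)
```

The last group simply holds whatever rows are left over, which is why the final list element can be shorter than n.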
Note: I added an option to randomly sample the observations in the function.
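If the loop from the question is preferred, the start index can be shifted by one so that it is 1-based. This is only a minimal sketch: `rawdata` is built here as a hypothetical stand-in for the 1200 x 30 data frame, the unknown `function1` is left out, and `min()` is an added guard for the case where the row count is not a multiple of the sample size.

```r
# Hypothetical stand-in for the 1200 x 30 data frame from the question
rawdata <- as.data.frame(matrix(seq_len(1200 * 30), nrow = 1200, ncol = 30))

samplesize  <- 200
totalsample <- ceiling(nrow(rawdata) / samplesize)

chunks <- vector("list", totalsample)
for (i in seq_len(totalsample)) {
  start <- (i - 1) * samplesize + 1            # 1, 201, 401, ...
  end   <- min(i * samplesize, nrow(rawdata))  # 200, 400, ..., capped at nrow
  chunks[[i]] <- rawdata[start:end, , drop = FALSE]
}

print(length(chunks))
print(sapply(chunks, nrow))
```

Each element of `chunks` can then be passed to whatever processing step is needed in place of `function1`.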