KNIME - How can I get all the data from separated large files in R Revolution Enterprise?
I'm using RevoR Enterprise to handle importing large data files. The example given in the documentation imports 10 files (1,000,000 rows each) into a single dataset using an rxImport loop:
setwd("c:/users/fsociety/bigdatasamples")
data.directory <- "c:/users/fsociety/bigdatasamples"
data.file <- file.path(data.directory, "mortDefault")
mortXdfFileName <- "mortDefault.xdf"
append <- "none"
for (i in 2000:2009) {
  importFile <- paste(data.file, i, ".csv", sep = "")
  mortxdf <- rxImport(importFile, mortXdfFileName, append = append,
                      overwrite = TRUE, maxRowsByCols = NULL)
  append <- "rows"
}
mortxdfData <- RxXdfData(mortXdfFileName)
knime.out <- rxXdfToDataFrame(mortxdfData)
The issue is that I only get 500,000 rows in the dataset. I thought this was due to the maxRowsByCols argument, whose default is 1e+06, but changing it to a higher value, and then to NULL, still truncates the data file.
Since you are importing to an xdf, maxRowsByCols doesn't matter. Also, on the last line you read the data back into a data.frame, which sort of defeats the purpose of using an xdf in the first place.
This code works for me on the data at http://packages.revolutionanalytics.com/datasets/mortdefault.zip, which I assume you are using.
The 500K rows is due to the rowsPerRead argument, but that just determines the block size. All of the data is read in, in 500K-row increments, and the block size can be changed to match your needs.
setwd("c:/users/fsociety/bigdatasamples")
data.directory <- "c:/users/fsociety/bigdatasamples"
data.file <- file.path(data.directory, "mortDefault")
mortXdfFileName <- "mortDefault.xdf"
append <- "none"
overwrite <- TRUE
for (i in 2000:2009) {
  importFile <- paste(data.file, i, ".csv", sep = "")
  rxImport(importFile, mortXdfFileName, append = append, overwrite = TRUE)
  append <- "rows"
  overwrite <- FALSE
}
mortXdfData <- RxXdfData(mortXdfFileName)
rxGetInfo(mortXdfData, getBlockSizes = TRUE)
# File name: c:\users\dnorton\onedrive\r\marchmadness2016\mortDefault.xdf
# Number of observations: 1e+07
# Number of variables: 6
# Number of blocks: 20
# Rows per block (first 10): 5e+05 5e+05 5e+05 5e+05 5e+05 5e+05 5e+05 5e+05 5e+05 5e+05
# Compression type: zlib
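If you want a different block size in the resulting xdf, rowsPerRead can be passed to rxImport directly. A minimal sketch, assuming the same file layout as above (the 1e6 value is just an illustration, not a recommendation):

```r
# Sketch: control the xdf block size explicitly via rowsPerRead.
rxImport(importFile, mortXdfFileName,
         append = append, overwrite = overwrite,
         rowsPerRead = 1e6)  # 1,000,000 rows per block instead of the 500,000 seen above
```

Larger blocks mean fewer reads per pass but more memory per block, so pick a value that fits the machine doing the processing.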