Set up the design matrix X as a big.matrix object based on external
massive data file stored on disk that cannot be fullly loaded into memory.
The data file must be a well-formated ASCII-file, and contains only one
single type. Current version only supports double type. Other
restrictions about the data file are described in
biglasso-package. This function reads the massive data, and
creates a big.matrix object. By default, the resulting
big.matrix is file-backed, and can be shared across processors or
nodes of a cluster.
Arguments
- filename
The name of the data file. For example, "dat.txt".
- dir
The directory used to store the binary and descriptor files associated with the
big.matrix. The default is current working directory.- sep
The field separator character. For example, "," for comma-delimited files (the default); "\t" for tab-delimited files.
- backingfile
The binary file associated with the file-backed
big.matrix. By default, its name is the same asfilenamewith the extension replaced by ".bin".- descriptorfile
The descriptor file used for the description of the file-backed
big.matrix. By default, its name is the same asfilenamewith the extension replaced by ".desc".- type
The data type. Only "double" is supported for now.
- ...
Additional arguments that can be passed into function
bigmemory::read.big.matrix().
Value
A big.matrix object corresponding to a file-backed
bigmemory::big.matrix(). It's ready to be used as the design matrix X in
biglasso() and cv.biglasso().
Details
For a data set, this function needs to be called only one time to set up the
big.matrix object with two backing files (.bin, .desc) created in
current working directory. Once set up, the data can be "loaded" into any
(new) R session by calling attach.big.matrix(discriptorfile).
This function is a simple wrapper of bigmemory::read.big.matrix(). See
bigmemory for more
details.