Set up the design matrix X as a big.matrix object from an external data file stored on disk that is too large to be fully loaded into memory. The data file must be a well-formatted ASCII file and contain only a single data type. The current version supports only the double type. Other restrictions on the data file are described in biglasso-package. This function reads the data and creates a big.matrix object. By default, the resulting big.matrix is file-backed and can be shared across processors or the nodes of a cluster.
Arguments
- filename
The name of the data file. For example, "dat.txt".
- dir
The directory used to store the binary and descriptor files associated with the big.matrix. The default is the current working directory.
- sep
The field separator character. For example, "," for comma-delimited files (the default); "\t" for tab-delimited files.
- backingfile
The binary file associated with the file-backed big.matrix. By default, its name is the same as filename with the extension replaced by ".bin".
- descriptorfile
The descriptor file that describes the file-backed big.matrix. By default, its name is the same as filename with the extension replaced by ".desc".
- type
The data type. Only "double" is supported for now.
- ...
Additional arguments passed to bigmemory::read.big.matrix().
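The arguments above can be illustrated with a minimal sketch. It assumes the biglasso package is installed; the scratch directory `demo_dir` and the toy file contents are hypothetical and chosen only for illustration. The call spells out the documented defaults explicitly rather than relying on them.

```r
# Sketch only: assumes biglasso (and its dependency bigmemory) is installed.
if (requireNamespace("biglasso", quietly = TRUE)) {
  library(biglasso)

  # Hypothetical scratch directory, so the backing files land somewhere disposable.
  demo_dir <- tempfile("setupX_demo")
  dir.create(demo_dir)
  old_wd <- setwd(demo_dir)

  # Write a 20 x 10 numeric matrix as a comma-delimited ASCII file without headers.
  set.seed(1)
  write.table(matrix(rnorm(200), nrow = 20), file = "dat.txt",
              sep = ",", row.names = FALSE, col.names = FALSE)

  # Read the file into a file-backed big.matrix, naming every argument.
  X <- setupX("dat.txt", dir = getwd(), sep = ",",
              backingfile = "dat.bin", descriptorfile = "dat.desc",
              type = "double")

  print(dim(X))                                 # dimensions of the big.matrix
  print(file.exists(c("dat.bin", "dat.desc")))  # backing files written to dir

  # Restore the previous working directory.
  setwd(old_wd)
}
```

If the data file had a header row, an option such as `header = TRUE` could presumably be forwarded through `...` to bigmemory::read.big.matrix(); that is not shown here.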
Value
A big.matrix object corresponding to a file-backed bigmemory::big.matrix(). It is ready to be used as the design matrix X in biglasso() and cv.biglasso().
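As a hedged illustration of how the returned object might be used, the sketch below passes it to biglasso() as the design matrix. It assumes biglasso is installed; the simulated data, the response `y`, and the scratch directory are hypothetical. cv.biglasso() takes the same X and is not run here.

```r
# Sketch only: assumes the biglasso package is installed.
if (requireNamespace("biglasso", quietly = TRUE)) {
  library(biglasso)

  # Hypothetical scratch directory for the data file and backing files.
  demo_dir <- tempfile("setupX_fit")
  dir.create(demo_dir)
  old_wd <- setwd(demo_dir)

  # Simulate a small design matrix and a response that depends on two columns.
  set.seed(1)
  n <- 100
  p <- 10
  x <- matrix(rnorm(n * p), nrow = n)
  y <- as.numeric(x[, 1] - 2 * x[, 2] + rnorm(n))
  write.table(x, file = "dat.txt", sep = ",",
              row.names = FALSE, col.names = FALSE)

  # Create the file-backed design matrix with the default arguments.
  X <- setupX("dat.txt")

  # Fit a lasso-penalized linear model directly on the big.matrix.
  fit <- biglasso(X, y, family = "gaussian")
  print(class(fit))

  # Restore the previous working directory.
  setwd(old_wd)
}
```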
Details
For a given data set, this function needs to be called only once to set up the big.matrix object, which creates two backing files (.bin, .desc) in dir (the current working directory by default). Once set up, the data can be "loaded" into any (new) R session by calling attach.big.matrix(descriptorfile).
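The set-up-once, attach-later workflow might look like the following sketch. It assumes biglasso and bigmemory are installed; the scratch directory and toy data are hypothetical, and the attach step is run in the same session here only to keep the example self-contained. In practice it would be run in a fresh R session, pointing at the existing descriptor file.

```r
# Sketch only: assumes biglasso and bigmemory are installed.
if (requireNamespace("biglasso", quietly = TRUE)) {
  library(biglasso)

  # Hypothetical scratch directory for the data file and backing files.
  demo_dir <- tempfile("setupX_attach")
  dir.create(demo_dir)
  old_wd <- setwd(demo_dir)

  set.seed(1)
  write.table(matrix(rnorm(50), nrow = 10), file = "dat.txt",
              sep = ",", row.names = FALSE, col.names = FALSE)

  # One-time setup: creates dat.bin and dat.desc in the working directory.
  X <- setupX("dat.txt")
  setwd(old_wd)

  # Later (normally in a new session): attach via the descriptor file path,
  # without re-reading the original ASCII file.
  X2 <- bigmemory::attach.big.matrix(file.path(demo_dir, "dat.desc"))

  # Both objects refer to the same file-backed data.
  print(all.equal(X[, ], X2[, ]))
}
```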
This function is a simple wrapper around bigmemory::read.big.matrix(). See bigmemory for more details.