Sets up the design matrix X as a big.matrix object from an external data file stored on disk that is too large to be fully loaded into memory. The data file must be a well-formatted ASCII file containing values of a single type; the current version supports only the double type. Other restrictions on the data file are described in biglasso-package. This function reads the data and creates a big.matrix object. By default, the resulting big.matrix is file-backed and can be shared across processors or nodes of a cluster.

setupX(
  filename,
  dir = getwd(),
  sep = ",",
  backingfile = paste0(unlist(strsplit(filename, split = "\\."))[1], ".bin"),
  descriptorfile = paste0(unlist(strsplit(filename, split = "\\."))[1], ".desc"),
  type = "double",
  ...
)

Arguments

filename

The name of the data file. For example, "dat.txt".

dir

The directory used to store the binary and descriptor files associated with the big.matrix. The default is the current working directory.

sep

The field separator character. For example, "," for comma-delimited files (the default); "\t" for tab-delimited files.

backingfile

The binary file associated with the file-backed big.matrix. By default, its name is the same as filename with the extension replaced by ".bin".

descriptorfile

The descriptor file used for the description of the file-backed big.matrix. By default, its name is the same as filename with the extension replaced by ".desc".

type

The data type. Only "double" is supported for now.

...

Additional arguments passed to bigmemory::read.big.matrix().
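
As a small illustration of how the default backingfile and descriptorfile names are derived, the sketch below reuses the default expressions from the usage section in base R. The helper name default_backing_names is hypothetical and is not part of biglasso.

```r
# Hypothetical helper (not part of biglasso): applies the same expressions
# that the usage section gives as defaults for backingfile and descriptorfile.
default_backing_names <- function(filename) {
  # Keep only the text before the first "." in the file name.
  stem <- unlist(strsplit(filename, split = "\\."))[1]
  c(backingfile = paste0(stem, ".bin"),
    descriptorfile = paste0(stem, ".desc"))
}

default_backing_names("dat.txt")      # "dat.bin", "dat.desc"
default_backing_names("my.data.csv")  # "my.bin",  "my.desc"
```

Because only the part before the first dot is kept, a file name containing several dots gets a shortened default stem; passing backingfile and descriptorfile explicitly avoids any ambiguity.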

Value

A big.matrix object corresponding to a file-backed bigmemory::big.matrix(). It's ready to be used as the design matrix X in biglasso() and cv.biglasso().

Details

For a given data set, this function needs to be called only once to set up the big.matrix object; it creates two backing files (.bin, .desc) in dir, which is the current working directory by default. Once set up, the data can be "loaded" into any (new) R session by calling attach.big.matrix(descriptorfile).
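
The workflow described here can be sketched as follows. This is a minimal example under stated assumptions: the toy data, the temporary directory, and the file names dat.csv, dat.bin and dat.desc are all hypothetical, and the file is written without a header or row names so that it contains only numeric values.

```r
library(biglasso)

# Hypothetical toy data: 5 rows, 3 numeric columns, no header or row names.
work_dir <- tempfile("setupX_demo_")
dir.create(work_dir)
csv_path <- file.path(work_dir, "dat.csv")
toy <- matrix(as.numeric(1:15), nrow = 5, ncol = 3)
write.table(toy, csv_path, sep = ",", row.names = FALSE, col.names = FALSE)

# One-time setup: read the text file and create the two backing files
# in work_dir. The names are given explicitly rather than relying on
# the defaults derived from filename.
X <- setupX(csv_path, dir = work_dir, sep = ",",
            backingfile = "dat.bin", descriptorfile = "dat.desc")

# Later (for example in a new R session), re-attach the matrix from the
# descriptor file instead of reading the text file again.
X2 <- bigmemory::attach.big.matrix(file.path(work_dir, "dat.desc"))
dim(X2)
```

Naming the backing files explicitly keeps them independent of the default naming rule, which is applied to whatever string is passed as filename.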

This function is a simple wrapper around bigmemory::read.big.matrix(). See bigmemory for more details.

See also

Author

Yaohui Zeng and Patrick Breheny

Examples

## see the example in "biglasso-package"