
A function to perform checks on passed objects before model fitting.

Usage

plmm_checks(
  design,
  K = NULL,
  eta = NULL,
  penalty = "lasso",
  init = NULL,
  gamma,
  alpha = 1,
  dfmax = NULL,
  trace = FALSE,
  save_rds = NULL,
  return_fit = TRUE,
  ...
)

Arguments

design

The design object, as created by create_design()

K

Similarity matrix used to rotate the data. This should be either (1) a known matrix that reflects the covariance of y, (2) an estimate (the default is \(\frac{1}{p}(XX^T)\)), or (3) a list with components s and U, as returned by a previous plmm() model fit on the same data.
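
As a rough illustration of option (2), the default estimate can be reproduced in base R. The toy matrix X and the column standardization step are assumptions made for this example; it is a sketch of the formula, not the package's internal code.

```r
# Toy design matrix: n = 5 observations, p = 3 features (made-up data)
set.seed(1)
X <- matrix(rnorm(15), nrow = 5, ncol = 3)

# Assumption: the estimate is computed from column-standardized X
X_std <- scale(X)

# Similarity estimate K = (1/p) * X X^T, an n x n matrix
p <- ncol(X_std)
K <- tcrossprod(X_std) / p

dim(K)          # n rows by n columns
isSymmetric(K)  # X X^T is symmetric by construction
```

Because K is n x n, its size depends on the number of observations rather than the number of features.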

eta

Optional argument to input a specific eta term rather than estimate it from the data. If K is a known covariance matrix that is full rank, this should be 1.

penalty

The penalty to be applied to the model. Either "lasso" (the default), "MCP", or "SCAD".

init

Initial values for coefficients. Default is 0 for all columns of X.

gamma

The tuning parameter of the MCP/SCAD penalty (see details). Default is 3 for MCP and 3.7 for SCAD.

alpha

Tuning parameter for the Mnet estimator which controls the relative contributions from the MCP/SCAD penalty and the ridge, or L2 penalty. alpha = 1 is equivalent to MCP/SCAD penalty, while alpha = 0 would be equivalent to ridge regression. However, alpha = 0 is not supported; alpha may be arbitrarily small, but not exactly 0.
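
The role of alpha can be sketched with the MCP case. The helper names below (mcp_penalty, mnet_penalty) are hypothetical, and the split of lambda into alpha * lambda for MCP and (1 - alpha) * lambda for the ridge term is an assumption about the parameterization, not taken from the package source.

```r
# Standard MCP penalty for a coefficient b
mcp_penalty <- function(b, lambda, gamma = 3) {
  b <- abs(b)
  ifelse(b <= gamma * lambda,
         lambda * b - b^2 / (2 * gamma),
         gamma * lambda^2 / 2)
}

# Mnet-style penalty: MCP part plus ridge (L2) part.
# Assumed split: lambda1 = alpha * lambda, lambda2 = (1 - alpha) * lambda
mnet_penalty <- function(b, lambda, alpha = 1, gamma = 3) {
  # alpha = 0 (pure ridge) is rejected, mirroring the restriction described above
  stopifnot(alpha > 0, alpha <= 1)
  mcp_penalty(b, alpha * lambda, gamma) + (1 - alpha) * lambda * b^2 / 2
}

# With alpha = 1 the ridge term vanishes and only MCP remains
mnet_penalty(0.5, lambda = 1, alpha = 1)
mcp_penalty(0.5, lambda = 1)
```

Lowering alpha toward 0 shifts weight from the sparsity-inducing MCP term to the ridge term.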

dfmax

Maximum number of non-zero coefficients that may enter the model. Default is NULL (no maximum).

trace

If set to TRUE, inform the user of progress by announcing the beginning of each step of the modeling process. Default is FALSE.

save_rds

Optional: if a filepath and name is specified (e.g., save_rds = "~/dir/my_results.rds"), then the model results are saved to the provided location. Defaults to NULL, which does not save the result.
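
A saved result can presumably be read back in a later session with base R. The sketch below assumes the file is an ordinary RDS file; the result object is a hypothetical stand-in, and a temporary file takes the place of a path such as "~/dir/my_results.rds".

```r
# Stand-in for a fitted model object (hypothetical contents)
result <- list(penalty = "lasso", note = "toy result")

# Write to a temporary .rds file instead of a real results directory
path <- tempfile(fileext = ".rds")
saveRDS(result, path)

# Read the object back and compare it with the original
restored <- readRDS(path)
identical(restored, result)
```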

return_fit

Optional: a logical value indicating whether the fitted model should be returned as a plmm object in the current (assumed interactive) session. Defaults to TRUE.

...

Additional arguments to get_data()

Value

A list which includes 17 items:

  • std_X: The standardized design matrix. If design matrix is filebacked, the descriptor for the filebacked data is returned using bigmemory::describe().

  • std_X_details: Metadata for std_X.

  • std_X_n: Number of rows in std_X.

  • std_X_p: Number of columns in std_X.

  • y: Original outcome vector.

  • y_name: Variable name of y.

  • centered_y: The centered outcome vector.

  • K: The relationship matrix (as passed by plmm(), may be NULL)

  • eta: Estimated proportion of the variance in the outcome attributable to population/correlation structure (as passed by plmm(), may be NULL)

  • fbm_flag: Logical, is std_X filebacked?

  • plink_flag: Logical, does std_X originate from PLINK files?

  • penalty: A character string indicating the penalty type.

  • gamma: Tuning parameter for the SCAD or MCP penalties.

  • init: Initialized values for beta coefficients.

  • dfmax: Maximum number of non-zero coefficients that may enter the model.

  • n: Number of rows in the original design matrix prior to standardization procedures.

  • p: Number of columns in the original design matrix prior to standardization procedures.
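
To make a few of these items concrete, the sketch below builds toy versions of centered_y and std_X in base R. The data are made up, and the scaling convention (dividing by n rather than n - 1) is an assumption; the package's own standardization may differ in detail.

```r
# Toy data: n = 5 observations, p = 4 features (made-up values)
set.seed(42)
X <- matrix(rnorm(20), nrow = 5, ncol = 4)
y <- rnorm(5)

# centered_y: the outcome minus its mean
centered_y <- y - mean(y)

# std_X: center each column, then scale it to unit root-mean-square
# (assumption: n, not n - 1, in the denominator)
X_centered <- sweep(X, 2, colMeans(X))
col_scale <- sqrt(colMeans(X_centered^2))
std_X <- sweep(X_centered, 2, col_scale, "/")

# A partial, illustrative version of the returned list
checks <- list(
  std_X = std_X,
  std_X_n = nrow(std_X),
  std_X_p = ncol(std_X),
  y = y,
  centered_y = centered_y,
  n = nrow(X),
  p = ncol(X)
)
str(checks, max.level = 1)
```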