Predict method for plmm class
Arguments

- object: An object of class `plmm`.
- newX: Matrix of values at which predictions are to be made (not used for `type = "coefficients"`, `"vars"`, or `"nvars"`). This can be either a file-backed `big.matrix` or a matrix object. Note: columns of this argument must be named!
- type: A character argument indicating what type of prediction should be returned. Options are `"lp"`, `"coefficients"`, `"vars"`, `"nvars"`, and `"blup"`. See details.
- X: Optional: if `type = "blup"` and the model was fit in-memory, the design matrix used to fit the model represented in `object` must be supplied. When supplied, this design matrix will be standardized using the center/scale values in `object$std_X_details`, so please do not standardize this matrix before supplying it here. Note: if the model was fit file-backed, then the filepath to the `.bk` file with this standardized design matrix is returned as `std_X` in the fit supplied to `object`.
- lambda: A numeric vector of regularization parameter `lambda` values at which predictions are requested.
- idx: Vector of indices of the regularization parameter `lambda` at which predictions are requested. By default, all indices are returned.
- ...: Additional optional arguments
Details
The options for `type` are as follows:

- `"lp"` (linear predictor): uses the product of `newX` and the beta coefficients of `object` to predict new values of the outcome. This does not incorporate the correlation structure of the data.
- `"blup"` (default; acronym for Best Linear Unbiased Predictor): adds to the `"lp"` a value that represents the estimated random effect. This addition is a way of incorporating the estimated correlation structure of the data into the prediction of the outcome.
- `"coefficients"`: returns the estimated beta coefficients.
- `"vars"`: returns the indices of variables (e.g., SNPs) with nonzero coefficients at each value of `lambda`. Excludes the intercept.
- `"nvars"`: returns the number of variables (e.g., SNPs) with nonzero coefficients at each value of `lambda`. Excludes the intercept.
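For intuition about how `"blup"` extends `"lp"`, here is a minimal base-R sketch of the underlying idea. This is not the plmmr implementation; the covariance matrix `Sigma`, the coefficients `beta`, and the simulated data are all made up for illustration. The BLUP is the linear predictor plus the conditional mean of the test-set random effects given the training residuals.

```r
# Conceptual sketch (base R only; hypothetical quantities throughout).
set.seed(1)
n_train <- 50; n_test <- 10; p <- 5
X_train <- matrix(rnorm(n_train * p), n_train, p)
X_test  <- matrix(rnorm(n_test * p), n_test, p)
beta    <- rnorm(p)  # stand-in for estimated coefficients

# Toy joint covariance over (train, test): low-rank structure plus noise
n <- n_train + n_test
K     <- tcrossprod(matrix(rnorm(n * 3), n, 3)) / 3
Sigma <- K + diag(n)
Sigma_train      <- Sigma[1:n_train, 1:n_train]          # train-train block
Sigma_test_train <- Sigma[n_train + 1:n_test, 1:n_train] # test-train block

y_train <- drop(X_train %*% beta) + rnorm(n_train)

# "lp": linear predictor only, ignoring the correlation structure
lp <- drop(X_test %*% beta)

# "blup": lp plus the estimated random effect, which borrows information
# from the training residuals via the covariance structure
ranef <- drop(Sigma_test_train %*% solve(Sigma_train, y_train - X_train %*% beta))
blup  <- lp + ranef
```

The random-effect term shrinks toward zero when test and train observations are nearly uncorrelated, in which case `"blup"` and `"lp"` predictions coincide.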
Examples
set.seed(123)
train_idx <- sample(1:nrow(admix$X), 100)
# Note: ^ shuffling is important here! Keeps test and train groups comparable.
train <- list(X = admix$X[train_idx,], y = admix$y[train_idx])
train_design <- create_design(X = train$X, y = train$y)
test <- list(X = admix$X[-train_idx,], y = admix$y[-train_idx])
fit <- plmm(design = train_design)
# make predictions for all lambda values
pred1 <- predict(object = fit, newX = test$X, type = "lp")
pred2 <- predict(object = fit, newX = test$X, type = "blup", X = train$X)
# look at mean squared prediction error
mspe <- apply(pred1, 2, function(yhat) {crossprod(test$y - yhat)/length(yhat)})
min(mspe)
#> [1] 2.813242
mspe_blup <- apply(pred2, 2, function(yhat) {crossprod(test$y - yhat)/length(yhat)})
min(mspe_blup) # BLUP is better
#> [1] 2.126016
# compare the MSPE of our model to a null model, for reference
# null model = intercept only -> y_hat is always mean(y)
crossprod(mean(test$y) - test$y)/length(test$y)
#> [,1]
#> [1,] 6.381748