Inferential summaries for ncvreg and ncvsurv objects based on local marginal false discovery rates.
An ncvreg or ncvsurv object.
The regularization parameter value at which inference should be reported.
Alternatively, lambda may be specified by index; which=10 means: report inference for the 10th value of lambda along the regularization path. If both lambda and which are specified, lambda takes precedence.
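The equivalence between the two ways of choosing a model can be checked with a short sketch. It assumes the ncvreg package is installed, that the fitted object stores its path in fit$lambda, and that the summary object exposes the reported value as an element named lambda; the element name is an assumption, since this page lists the elements without their names.

```r
library(ncvreg)
data(Prostate)
fit <- ncvreg(Prostate$X, Prostate$y)

lam10 <- fit$lambda[10]                 # 10th value on the regularization path
s_idx <- summary(fit, which = 10)       # model specified by index
s_lam <- summary(fit, lambda = lam10)   # same model specified by value

# Both summaries should report the same lambda
all.equal(s_idx$lambda, s_lam$lambda)
```

Specifying by index is convenient when looping over the path; specifying by value is convenient when lambda was chosen elsewhere, for example by cross-validation.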
By default, summary will provide an inferential summary for each variable that has been selected (i.e., each variable with a nonzero coefficient). Specifying number=5, for example, means that the summary table will include the 5 features with the lowest mfdr values, regardless of whether they were selected. To see all features, specify number=Inf.
Alternatively, specifying for example cutoff=0.3 will report inference for all features with mfdr under 30%. If both number and cutoff are specified, the intersection of the two sets of features is reported.
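The intersection rule can be illustrated in base R without fitting a model. The sketch below is not ncvreg's internal code: choose_features is a hypothetical helper, and the mfdr values are copied from the number=Inf example output on this page. The value for age is printed there as "< 1e-04", so 1e-04 is used as a stand-in.

```r
# mfdr values from the number=Inf example output (Heart data, lambda=0.08);
# 1e-04 for age is a stand-in for the printed "< 1e-04"
mfdr <- c(age = 1e-04, famhist = 0.00029409, ldl = 0.00751773,
          tobacco = 0.00938405, typea = 0.13986436, adiposity = 0.70188341,
          sbp = 0.71453183, alcohol = 0.89965322, obesity = 0.91712823)

# Hypothetical helper mimicking the documented rule for number and cutoff
choose_features <- function(mfdr, number = Inf, cutoff = Inf) {
  by_number <- names(sort(mfdr))[seq_len(min(number, length(mfdr)))]
  by_cutoff <- names(mfdr)[mfdr < cutoff]
  intersect(by_number, by_cutoff)   # features satisfying both criteria
}

chosen <- choose_features(mfdr, number = 5, cutoff = 0.1)
chosen                  # age, famhist, ldl, tobacco
mean(mfdr[chosen])      # average mfdr among the chosen features
```

With number=5 and cutoff=0.1, typea is among the five lowest mfdr values but exceeds the cutoff, so only four features remain, which agrees with the corresponding example output. The helper does not model the default behavior (selected variables only) that applies when neither argument is given.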
Should the results be sorted by mfdr? (default: TRUE)
For linear regression models, users can supply an estimate of the residual standard deviation. The default is to use the square root of RSS / DF, where degrees of freedom are approximated using the number of nonzero coefficients.
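As a rough illustration of an estimate of this form (not ncvreg's internal code), the sketch below computes it for an ordinary least-squares fit to the built-in mtcars data, taking the square root of RSS / DF so that the result is on the standard deviation scale. Spending one additional degree of freedom on the intercept is an assumption made here; the exact convention used by the package is not stated on this page.

```r
# Illustrative only: ordinary least squares stands in for a penalized fit
ols <- lm(mpg ~ wt + hp, data = mtcars)

rss <- sum(residuals(ols)^2)            # residual sum of squares
n_nonzero <- sum(coef(ols)[-1] != 0)    # nonzero slopes (intercept excluded)
df <- nrow(mtcars) - n_nonzero - 1      # assumption: one df for the intercept

sigma_hat <- sqrt(rss / df)             # square root gives the SD scale
sigma_hat
```

For an unpenalized fit this matches the residual standard error reported by summary(ols). An externally obtained estimate of this kind is what a user would supply in place of the default.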
Further arguments; in particular, if you have set returnX=FALSE, you will need to supply X and y in order to calculate local mFDRs.
A summary.ncvreg object.
Number of digits past the decimal point to print out. Can be a vector specifying different display digits for each of the five non-integer printed values.
An object with S3 class summary.ncvreg. The class has its own print method and contains the following list elements:
The penalty used by ncvreg or ncvsurv.
Either "linear", "logistic", or "Cox".
Number of instances.
Number of regression coefficients (not including the intercept).
The lambda value at which inference is being reported.
The number of nonzero coefficients (again, not including the intercept) at that value of lambda.
A table containing estimates, normalized test statistics (z), and an estimate of the local mfdr for each coefficient. The mfdr may be loosely interpreted, in an empirical Bayes sense, as the probability that the given feature is null.
If there are any unpenalized coefficients, a separate inferential summary is given for them. Currently, this is based on lm/glm/coxph using the penalized coefficients to provide an offset. This is useful and more or less accurate, but not ideal; we hope to improve the inferential methods for unpenalized variables in the future.
# Linear regression --------------------------------------------------
library(ncvreg)
data(Prostate)
fit <- ncvreg(Prostate$X, Prostate$y)
summary(fit, lambda=0.08)
#> MCP-penalized linear regression with n=97, p=8
#> At lambda=0.0800:
#> -------------------------------------------------
#> Nonzero coefficients : 5
#> Expected nonzero coefficients: 1.90
#> Average mfdr (5 features) : 0.381
#>
#>           Estimate      z     mfdr Selected
#> lcavol   5.263e-01  8.618  < 1e-04        *
#> svi      6.724e-01  3.867 0.016288        *
#> lweight  6.410e-01  3.815 0.019837        *
#> lbph     1.322e-02  1.295 0.926736        *
#> age     -6.203e-05 -1.095 0.941368        *
# Logistic regression ------------------------------------------------
data(Heart)
fit <- ncvreg(Heart$X, Heart$y, family="binomial")
summary(fit, lambda=0.05)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0500:
#> -------------------------------------------------
#> Nonzero coefficients : 5
#> Expected nonzero coefficients: 0.18
#> Average mfdr (5 features) : 0.036
#>
#>         Estimate     z      mfdr Selected
#> age     0.047479 6.645   < 1e-04        *
#> famhist 0.611522 4.332 0.0010983        *
#> tobacco 0.037272 3.465 0.0308975        *
#> ldl     0.075045 3.385 0.0403379        *
#> typea   0.009799 3.057 0.1077203        *
# Cox regression -----------------------------------------------------
data(Lung)
fit <- ncvsurv(Lung$X, Lung$y)
summary(fit, lambda=0.1)
#> MCP-penalized Cox regression with n=137, p=8
#> At lambda=0.1000:
#> -------------------------------------------------
#> Nonzero coefficients : 5
#> Expected nonzero coefficients: 2.93
#> Average mfdr (5 features) : 0.586
#>
#>          Estimate      z    mfdr Selected
#> karno    -0.03196 -6.352 < 1e-04        *
#> squamous -0.64314 -3.120 0.24761        *
#> adeno     0.28017  2.051 0.83922        *
#> large    -0.20731 -1.799 0.89449        *
#> trt       0.03738  1.332 0.94627        *
# Options ------------------------------------------------------------
fit <- ncvreg(Heart$X, Heart$y, family="binomial")
summary(fit, lambda=0.08, number=3)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Features satisfying criteria : 3
#> Average mfdr among chosen features : 0.0026
#>
#>         Estimate     z       mfdr Selected
#> age     0.041921 7.771    < 1e-04        *
#> famhist 0.277822 4.627 0.00029409        *
#> ldl     0.009273 3.847 0.00751773        *
summary(fit, lambda=0.08, number=Inf)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Features satisfying criteria : 9
#> Average mfdr among chosen features : 0.377
#>
#>           Estimate      z       mfdr Selected
#> age       0.041921 7.7712    < 1e-04        *
#> famhist   0.277822 4.6274 0.00029409        *
#> ldl       0.009273 3.8473 0.00751773        *
#> tobacco   0.005783 3.7881 0.00938405        *
#> typea     0.000000 2.9383 0.13986436
#> adiposity 0.000000 1.8121 0.70188341
#> sbp       0.000000 1.7780 0.71453183
#> alcohol   0.000000 0.7807 0.89965322
#> obesity   0.000000 0.4339 0.91712823
summary(fit, lambda=0.08, cutoff=0.5)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Features satisfying criteria : 5
#> Average mfdr among chosen features : 0.0314
#>
#>         Estimate     z       mfdr Selected
#> age     0.041921 7.771    < 1e-04        *
#> famhist 0.277822 4.627 0.00029409        *
#> ldl     0.009273 3.847 0.00751773        *
#> tobacco 0.005783 3.788 0.00938405        *
#> typea   0.000000 2.938 0.13986436
summary(fit, lambda=0.08, number=3, cutoff=0.5)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Features satisfying criteria : 3
#> Average mfdr among chosen features : 0.0026
#>
#>         Estimate     z       mfdr Selected
#> age     0.041921 7.771    < 1e-04        *
#> famhist 0.277822 4.627 0.00029409        *
#> ldl     0.009273 3.847 0.00751773        *
summary(fit, lambda=0.08, number=5, cutoff=0.1)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Features satisfying criteria : 4
#> Average mfdr among chosen features : 0.0043
#>
#>         Estimate     z       mfdr Selected
#> age     0.041921 7.771    < 1e-04        *
#> famhist 0.277822 4.627 0.00029409        *
#> ldl     0.009273 3.847 0.00751773        *
#> tobacco 0.005783 3.788 0.00938405        *
summary(fit, lambda=0.08, number=Inf, sort=FALSE)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Features satisfying criteria : 9
#> Average mfdr among chosen features : 0.377
#>
#>           Estimate      z       mfdr Selected
#> sbp       0.000000 1.7780 0.71453183
#> tobacco   0.005783 3.7881 0.00938405        *
#> ldl       0.009273 3.8473 0.00751773        *
#> adiposity 0.000000 1.8121 0.70188341
#> famhist   0.277822 4.6274 0.00029409        *
#> typea     0.000000 2.9383 0.13986436
#> obesity   0.000000 0.4339 0.91712823
#> alcohol   0.000000 0.7807 0.89965322
#> age       0.041921 7.7712    < 1e-04        *
summary(fit, lambda=0.08, number=3, cutoff=0.5, sort=FALSE)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Features satisfying criteria : 3
#> Average mfdr among chosen features : 0.0026
#>
#>         Estimate     z       mfdr Selected
#> ldl     0.009273 3.847 0.00751773        *
#> famhist 0.277822 4.627 0.00029409        *
#> age     0.041921 7.771    < 1e-04        *
# If X and y are not returned with the fit, they must be supplied
fit <- ncvreg(Heart$X, Heart$y, family="binomial", returnX=FALSE)
summary(fit, X=Heart$X, y=Heart$y, lambda=0.08)
#> MCP-penalized logistic regression with n=462, p=9
#> At lambda=0.0800:
#> -------------------------------------------------
#> Nonzero coefficients : 4
#> Expected nonzero coefficients: 0.02
#> Average mfdr (4 features) : 0.004
#>
#>         Estimate     z       mfdr Selected
#> age     0.041921 7.771    < 1e-04        *
#> famhist 0.277822 4.627 0.00029409        *
#> ldl     0.009273 3.847 0.00751773        *
#> tobacco 0.005783 3.788 0.00938405        *