Skip to contents

Constructs projection based test statistics that can be used to control FDR along with intervals for a penalized regression model.

Usage

intervals(
  fit,
  lambda,
  sigma,
  level = 0.95,
  posterior = TRUE,
  relaxed = FALSE,
  adjust_projection = FALSE,
  X = NULL
)

Arguments

fit

An optional fit of class ncvreg or cv.ncvreg. If supplied, X should only be supplied if fit does not contain it.

lambda

The penalty at which the tests and intervals are to be constructed. If left unspecified, will be selected using cross validation.

sigma

Standard deviation estimate used to compute test statistic and intervals. If left unspecified (default) it will be estimated using the recommendation from Reid et al. (2016)

level

the confidence level required.

posterior

whether the intervals returned should be posterior intervals (default) or debiased intervals (if FALSE). Posterior intervals are constructed from distributions where the coefficient estimates are the is the posterior mode. Debiased intervals are constructed around the estimates.

relaxed

whether the relaxed lasso based statistic / intervals should be used. Default is FALSE in which case PIPE based intervals are constructed (recommended). This affects the estimate.

adjust_projection

whether a Local Quadratic Approximation should be used in the projection for determining variance of the PIPE based test statistic. Default is FALSE as more research has been done without this adjustment, however without this adjustment, statistics and intervals may be over conservative in the presence of correlation. This affects the SE.

X

The original design matrix supplied to fit. Required if fit does not contain X.

Value

An data.frame containing the following columns:

variable

colnames(X)

coef

The original estimates at the specified parameters (lambda, gamma, alpha)

estimate

The debiased estimates.

SE

The standard errors.

t

The PIPE / Relaxed Lasso / LQA test statistics

lower

Interval lower bounds

upper

Intervals upper bounds

p.value

The unadjusted p-value

p.adjust

The Benhamini and Hochberg corrected p-value

penalty

The penalty used.

lambda

The lambda value the test statistics and intervals were constructed at.

gamma

The gamma value the test statistics and intervals were constructed at (for MCP/SCAD).

alpha

The alpha value the test statistics and intervals were constructed at.

level

The confidence level set for interval construction.

sigma

The standard deviation used for constructing the test statistis and intervals.

Details

The function constructs test statistics and intervals based off an approximate projection onto the column space of the active features. The test statistic can be used to control FDR and the intervals generally have good coverage. However, both tend to be conservative with the introduction of correlation.

The intervals produced can either be biased (like the point estimates) or debiased by setting the parameter posterior accordingly. The resulting behavior is quite different. See references for more details.

References

Harris L and Breheny P. (2025) A new perspective on high dimensional confidence intervals. arXiv preprint, arXiv:2508.03504. https://arxiv.org/abs/2508.03504

Harris L and Breheny P. (2025) Alternative Likelihood Approximations for High-Dimensional Intervals for Lasso. arXiv preprint, arXiv:2509.14971. https://arxiv.org/abs/2509.14971

Dai B. (2019) Projection-based inference and model selection for penalized regression. PhD dissertation, University of Iowa, Iowa City, IA. doi:10.17077/etd.005250

Author

Logan Harris, Patrick Breheny, and Biyue Dai

Examples

# Linear regression (SCAD-Net penalty, PIPE intervals, pass ncvreg object)
fit <- ncvreg(Prostate$X, Prostate$y, penalty = "SCAD", alpha = 0.9)
intervals(fit) |> head()
#>         variable        coef    estimate         SE         t       lower
#> lcavol    lcavol  0.56705170  0.56823231 0.08536950  6.656151  0.39993486
#> lweight  lweight  0.61396251  0.61523624 0.19734546  3.117560  0.22673738
#> age          age -0.02075306 -0.02079495 0.01091667 -1.904881 -0.04210383
#> lbph        lbph  0.09698225  0.09717897 0.05726386  1.697039 -0.01541831
#> svi          svi  0.74938522  0.75104331 0.23685484  3.170901  0.28524770
#> lcp          lcp -0.10234574 -0.10256268 0.08884987 -1.154337 -0.27616791
#>                upper      p.value     p.adjust penalty    lambda gamma alpha
#> lcavol  0.7342398444 2.810907e-11 2.248726e-10    SCAD 0.0201901   3.7   0.9
#> lweight 1.0003747644 1.823550e-03 4.862801e-03    SCAD 0.0201901   3.7   0.9
#> age     0.0008049362 5.679554e-02 1.135911e-01    SCAD 0.0201901   3.7   0.9
#> lbph    0.2089392044 8.968937e-02 1.435030e-01    SCAD 0.0201901   3.7   0.9
#> svi     1.2132756360 1.519668e-03 4.862801e-03    SCAD 0.0201901   3.7   0.9
#> lcp     0.0714558667 2.483621e-01 2.838424e-01    SCAD 0.0201901   3.7   0.9
#>         level    sigma
#> lcavol   0.95 0.692084
#> lweight  0.95 0.692084
#> age      0.95 0.692084
#> lbph     0.95 0.692084
#> svi      0.95 0.692084
#> lcp      0.95 0.692084

# Logistic regression (lasso penalty, LQA intervals, pass cv.ncvreg object) 
data(Heart)
cv_fit <- cv.ncvreg(Heart$X, Heart$y, family="binomial", penalty = "lasso")
intervals(cv_fit, adjust_projection = TRUE) |> head()
#>            variable        coef    estimate         SE         t        lower
#> sbp             sbp 0.004591841 0.006691921 0.00542863 1.2327090 -0.004042523
#> tobacco     tobacco 0.072039912 0.081846321 0.02488849 3.2885212  0.023402587
#> ldl             ldl 0.153354310 0.175073685 0.05424203 3.2276386  0.047449143
#> adiposity adiposity 0.000000000 0.004350699 0.01881224 0.2312696 -0.029110113
#> famhist     famhist 0.828918109 0.917515358 0.22125877 4.1467977  0.395299160
#> typea         typea 0.031161890 0.036060712 0.01169760 3.0827432  0.008391284
#>                upper      p.value     p.adjust penalty      lambda gamma alpha
#> sbp       0.01537976 2.176844e-01 0.3244134078   lasso 0.008236943     3     1
#> tobacco   0.12081978 1.007152e-03 0.0028083724   lasso 0.008236943     3     1
#> ldl       0.25967645 1.248165e-03 0.0028083724   lasso 0.008236943     3     1
#> adiposity 0.03657256 8.171053e-01 0.8810252658   lasso 0.008236943     3     1
#> famhist   1.26256560 3.371576e-05 0.0001517209   lasso 0.008236943     3     1
#> typea     0.05409455 2.051020e-03 0.0036918367   lasso 0.008236943     3     1
#>           level sigma
#> sbp        0.95    NA
#> tobacco    0.95    NA
#> ldl        0.95    NA
#> adiposity  0.95    NA
#> famhist    0.95    NA
#> typea      0.95    NA