
Timothy M. Christensen Publications

Discussion Paper
Abstract

We introduce computationally simple, data-driven procedures for estimation and inference on a structural function h0 and its derivatives in nonparametric models using instrumental variables. Our first procedure is a bootstrap-based, data-driven choice of sieve dimension for sieve nonparametric instrumental variables (NPIV) estimators. When implemented with this data-driven choice, sieve NPIV estimators of h0 and its derivatives are adaptive: they converge at the best possible (i.e., minimax) sup-norm rate, without having to know the smoothness of h0, degree of endogeneity of the regressors, or instrument strength. Our second procedure is a data-driven approach for constructing honest and adaptive uniform confidence bands (UCBs) for h0 and its derivatives. Our data-driven UCBs guarantee coverage for h0 and its derivatives uniformly over a generic class of data-generating processes (honesty) and contract at, or within a logarithmic factor of, the minimax sup-norm rate (adaptivity). As such, our data-driven UCBs deliver asymptotic efficiency gains relative to UCBs constructed via the usual approach of undersmoothing. In addition, both our procedures apply to nonparametric regression as a special case. We use our procedures to estimate and perform inference on a nonparametric gravity equation for the intensive margin of firm exports and find evidence against common parameterizations of the distribution of unobserved firm productivity.

Abstract

In complicated/nonlinear parametric models, it is generally hard to know whether the model parameters are point identified. We provide computationally attractive procedures to construct confidence sets (CSs) for identified sets of full parameters and of subvectors in models defined through a likelihood or a vector of moment equalities or inequalities. These CSs are based on level sets of optimal sample criterion functions (such as likelihood or optimally-weighted or continuously-updated GMM criterions). The level sets are constructed using cutoffs that are computed via Monte Carlo (MC) simulations directly from the quasi-posterior distributions of the criterions. We establish new Bernstein-von Mises (or Bayesian Wilks) type theorems for the quasi-posterior distributions of the quasi-likelihood ratio (QLR) and profile QLR in partially-identified regular models and some non-regular models. These results imply that our MC CSs have exact asymptotic frequentist coverage for identified sets of full parameters and of subvectors in partially-identified regular models, and have valid but potentially conservative coverage in models with reduced-form parameters on the boundary. Our MC CSs for identified sets of subvectors are shown to have exact asymptotic coverage in models with singularities. We also provide results on uniform validity of our CSs over classes of DGPs that include point and partially identified models. We demonstrate good finite-sample coverage properties of our procedures in two simulation experiments. Finally, our procedures are applied to two non-trivial empirical examples: an airline entry game and a model of trade flows.
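The level-set construction described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the toy model (a normal mean equal to theta1 + theta2, so only the sum is identified), the flat prior, and the grid-based sampler that stands in for a general Monte Carlo sampler of the quasi-posterior. The cutoff is taken as the alpha-quantile of the criterion evaluated at the quasi-posterior draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: y_i ~ N(theta1 + theta2, 1); only the sum is identified.
n = 200
y = rng.normal(loc=1.0, scale=1.0, size=n)

# Discretized parameter space [0, 1]^2 with a flat prior (an assumption).
grid = np.linspace(0.0, 1.0, 101)
t1, t2 = np.meshgrid(grid, grid, indexing="ij")
theta = np.column_stack([t1.ravel(), t2.ravel()])

def log_lik(theta, y):
    """Gaussian log-likelihood, up to an additive constant, at each row of theta."""
    mu = theta.sum(axis=1)
    return -0.5 * len(y) * (y.mean() - mu) ** 2

L = log_lik(theta, y)

# Quasi-posterior on the grid: proportional to exp(L) under the flat prior.
post = np.exp(L - L.max())
post /= post.sum()

# Draw B parameter values from the quasi-posterior and evaluate the criterion.
B = 5000
draws = rng.choice(len(theta), size=B, p=post)
alpha = 0.05
cutoff = np.quantile(L[draws], alpha)  # alpha-quantile of the criterion draws

# Confidence set for the identified set: a level set of the sample criterion.
in_cs = L >= cutoff
cs_points = theta[in_cs]
```

In this sketch `cs_points` collects the grid points whose criterion value clears the simulated cutoff; in the toy model they form a band around the line theta1 + theta2 = ybar rather than a neighborhood of a single point.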

Abstract

In complicated/nonlinear parametric models, it is hard to determine whether a parameter of interest is formally point identified. We provide computationally attractive procedures to construct confidence sets (CSs) for identified sets of parameters in econometric models defined through a likelihood or a vector of moments. The CSs for the identified set or for a function of the identified set (such as a subvector) are based on inverting an optimal sample criterion (such as likelihood or continuously updated GMM), where the cutoff values are computed directly from Markov Chain Monte Carlo (MCMC) simulations of a quasi-posterior distribution of the criterion. We establish new Bernstein-von Mises type theorems for the posterior distributions of the quasi-likelihood ratio (QLR) and profile QLR statistics in partially identified models, allowing for singularities. These results imply that the MCMC criterion-based CSs have correct frequentist coverage for the identified set as the sample size increases, and that they coincide with Bayesian credible sets based on inverting a LR statistic for point-identified likelihood models. We also show that our MCMC optimal criterion-based CSs are uniformly valid over a class of data generating processes that includes both partially and point-identified models. We demonstrate good finite sample coverage properties of our proposed methods in four non-trivial simulation experiments: missing data, entry game with correlated payoff shocks, Euler equation, and finite mixture models.

Abstract

We show that spline and wavelet series regression estimators for weakly dependent regressors attain the optimal uniform (i.e., sup-norm) convergence rate (n/log n)^{-p/(2p+d)} of Stone (1982), where d is the number of regressors and p is the smoothness of the regression function. The optimal rate is achieved even for heavy-tailed martingale difference errors with finite (2 + (d/p))th absolute moment for d/p < 2. We also establish the asymptotic normality of t statistics for possibly nonlinear, irregular functionals of the conditional mean function under weak conditions. The results are proved by deriving a new exponential inequality for sums of weakly dependent random matrices, which is of independent interest.
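To make the rate and the moment condition concrete, the short sketch below evaluates them for given n, p, and d. The function names are hypothetical and chosen for illustration only; the formulas are the ones stated in the abstract.

```python
import math

def stone_sup_norm_rate(n, p, d):
    """Sup-norm rate (n / log n)^(-p / (2p + d)) for smoothness p and dimension d."""
    return (n / math.log(n)) ** (-p / (2 * p + d))

def required_moment(p, d):
    """Order 2 + d/p of the finite absolute error moment (stated for d/p < 2)."""
    if d / p >= 2:
        raise ValueError("the stated result covers d/p < 2")
    return 2 + d / p

# Example: a twice-smooth regression function of one regressor (p = 2, d = 1).
for n in (10**3, 10**4, 10**5):
    print(n, stone_sup_norm_rate(n, p=2, d=1))
```

With p = 2 and d = 1 the exponent is 2/5 and the errors need a finite absolute moment of order 2.5.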

Abstract

We study the problem of nonparametric regression when the regressor is endogenous, which is known in econometrics as nonparametric instrumental variables (NPIV) regression and in statistics as a difficult ill-posed inverse problem with an unknown operator. We first establish a general upper bound on the sup-norm (uniform) convergence rate of a sieve estimator, allowing for endogenous regressors and weakly dependent data. This result leads to the optimal sup-norm convergence rates for spline and wavelet least squares regression estimators under weakly dependent data and heavy-tailed error terms. This upper bound also yields the sup-norm convergence rates for sieve NPIV estimators under i.i.d. data: the rates coincide with the known optimal L2-norm rates for severely ill-posed problems, and are a power of log(n) slower than the optimal L2-norm rates for mildly ill-posed problems. We then establish the minimax risk lower bound in sup-norm loss, which coincides with our upper bounds on sup-norm rates for the spline and wavelet sieve NPIV estimators. This sup-norm rate optimality provides another justification for the wide application of sieve NPIV estimators. Useful results on weakly-dependent random matrices are also provided.
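For orientation, the NPIV model and the sieve estimator referred to above can be written out as follows. The notation is an assumption introduced here and does not appear in the abstract: W is the instrument, T is the unknown conditional expectation operator, psi^J and b^K are basis functions of dimensions J and K, and a superscript minus denotes a generalized inverse.

```latex
\begin{align*}
  % NPIV model: X is endogenous, W is the instrument
  Y_i &= h_0(X_i) + u_i, \qquad \mathbb{E}[u_i \mid W_i] = 0, \\
  % Equivalent inverse problem with unknown operator T
  \mathbb{E}[Y_i \mid W_i = w] &= (T h_0)(w), \qquad
  (T h)(w) = \mathbb{E}[h(X_i) \mid W_i = w], \\
  % Sieve NPIV (series 2SLS) estimator
  \hat h(x) &= \psi^J(x)^\top \hat c, \qquad
  \hat c = \bigl[\Psi^\top B (B^\top B)^{-} B^\top \Psi\bigr]^{-}
           \Psi^\top B (B^\top B)^{-} B^\top Y,
\end{align*}
```

Here \Psi is the n x J matrix with rows \psi^J(X_i)^\top and B is the n x K matrix with rows b^K(W_i)^\top. The problem is ill-posed because recovering h_0 requires inverting T, which must itself be estimated.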

Abstract

This paper makes several contributions to the literature on the important yet difficult problem of estimating functions nonparametrically using instrumental variables. First, we derive the minimax optimal sup-norm convergence rates for nonparametric instrumental variables (NPIV) estimation of the structural function h0 and its derivatives. Second, we show that a computationally simple sieve NPIV estimator can attain the optimal sup-norm rates for h0 and its derivatives when h0 is approximated via a spline or wavelet sieve. Our optimal sup-norm rates surprisingly coincide with the optimal L2-norm rates for severely ill-posed problems, and are at most a [log(n)]^ε (with ε < 1/2) factor slower than the optimal L2-norm rates for mildly ill-posed problems. Third, we introduce a novel data-driven procedure for choosing the sieve dimension optimally. Our data-driven procedure is sup-norm rate-adaptive: the resulting estimators of h0 and its derivatives converge at their optimal sup-norm rates even though the smoothness of h0 and the degree of ill-posedness of the NPIV model are unknown. Finally, we present two non-trivial applications of the sup-norm rates to inference on nonlinear functionals of h0 under low-level conditions. The first is to derive the asymptotic normality of sieve t-statistics for exact consumer surplus and deadweight loss functionals in nonparametric demand estimation when prices, and possibly incomes, are endogenous. The second is to establish the validity of a sieve score bootstrap for constructing asymptotically exact uniform confidence bands for collections of nonlinear functionals of h0. Both applications provide new and useful tools for empirical research on nonparametric models with endogeneity.

Abstract

This paper makes several important contributions to the literature on nonparametric instrumental variables (NPIV) estimation and inference on a structural function h0 and its functionals. First, we derive sup-norm convergence rates for computationally simple sieve NPIV (series 2SLS) estimators of h0 and its derivatives. Second, we derive a lower bound that describes the best possible (minimax) sup-norm rates of estimating h0 and its derivatives, and show that the sieve NPIV estimator can attain the minimax rates when h0 is approximated via a spline or wavelet sieve. Our optimal sup-norm rates surprisingly coincide with the optimal root-mean-squared rates for severely ill-posed problems, and are only a logarithmic factor slower than the optimal root-mean-squared rates for mildly ill-posed problems. Third, we use our sup-norm rates to establish uniform Gaussian process strong approximations and the validity of score bootstrap uniform confidence bands (UCBs) for collections of nonlinear functionals of h0 under primitive conditions, allowing for mildly and severely ill-posed problems. Fourth, as applications, we obtain the first asymptotic pointwise and uniform inference results for plug-in sieve t-statistics of exact consumer surplus (CS) and deadweight loss (DL) welfare functionals under low-level conditions when demand is estimated via sieve NPIV. Our real-data application of UCBs for exact CS and DL functionals of gasoline demand reveals interesting patterns, and the approach is applicable to other markets.
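A sieve NPIV (series 2SLS) estimator of the kind mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: a polynomial basis stands in for the spline or wavelet sieves of the abstract, the dimensions J and K are fixed by hand, and the simulated design and the names `poly_basis`, `sieve_npiv`, and `h_hat` are hypothetical.

```python
import numpy as np

def poly_basis(x, dim):
    """Power-series basis 1, x, ..., x^(dim-1); a stand-in for splines or wavelets."""
    return np.vander(np.asarray(x, dtype=float), N=dim, increasing=True)

def sieve_npiv(y, x, w, J, K):
    """Series 2SLS: regress y on a J-term basis in x, instrumented by a K-term basis in w."""
    Psi = poly_basis(x, J)  # n x J basis for the endogenous regressor
    B = poly_basis(w, K)    # n x K basis for the instrument (K >= J)
    # First stage: project each column of Psi onto the column space of B.
    Psi_hat = B @ np.linalg.lstsq(B, Psi, rcond=None)[0]
    # Second stage: least squares of y on the projected basis.
    return np.linalg.lstsq(Psi_hat, y, rcond=None)[0]

def h_hat(x_grid, c_hat):
    """Evaluate the estimated structural function on a grid."""
    return poly_basis(x_grid, len(c_hat)) @ c_hat

# Simulated design (an assumption): x is endogenous because u depends on v.
rng = np.random.default_rng(0)
n = 2000
w = rng.uniform(-1.0, 1.0, n)
v = rng.normal(0.0, 0.3, n)
x = 0.8 * w + v
u = 0.5 * v + rng.normal(0.0, 0.1, n)
y = np.sin(2.0 * x) + u

c_hat = sieve_npiv(y, x, w, J=4, K=5)
x_grid = np.linspace(-1.0, 1.0, 5)
print(h_hat(x_grid, c_hat))
```

The two least-squares steps reproduce the usual 2SLS algebra: the second-stage coefficients equal (Psi' P_B Psi)^- Psi' P_B y, where P_B is the projection onto the instrument basis.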