Skip to main content

Xiaohong Chen Publications

Publish Date
Abstract

This paper studies nonparametric estimation of conditional moment models in which the generalized residual functions can be nonsmooth in the unknown functions of endogenous variables. This is a nonparametric nonlinear instrumental variables (IV) problem. We propose a class of penalized sieve minimum distance (PSMD) estimators which are minimizers of a penalized empirical minimum distance criterion over a collection of sieve spaces that are dense in the infinite dimensional function parameter space. Some of the PSMD procedures use slowly growing finite dimensional sieves with flexible penalties or without any penalty; some use large dimensional sieves with lower semicompact and/or convex penalties. We establish their consistency and the convergence rates in Banach space norms (such as a sup-norm or a root mean squared norm), allowing for possibly non-compact infinite dimensional parameter spaces. For both mildly and severely ill-posed nonlinear inverse problems, our convergence rates in Hilbert space norms (such as a root mean squared norm) achieve the known minimax optimal rate for the nonparametric mean IV regression. We illustrate the theory with a nonparametric additive quantile IV regression. We present a simulation study and an empirical application of estimating nonparametric quantile IV Engel curves.

Abstract

This paper studies nonparametric estimation of conditional moment restrictions in which the generalized residual functions can be nonsmooth in the unknown functions of endogenous variables. This is a nonparametric nonlinear instrumental variables (IV) problem. We propose a class of penalized sieve minimum distance (PSMD) estimators, which are minimizers of a penalized empirical minimum distance criterion over a collection of sieve spaces that are dense in the infinite dimensional function parameter space. Some of the PSMD procedures use slowly growing finite dimensional sieves with flexible penalties or without any penalty; others use large dimensional sieves with lower semicompact and/or convex penalties. We establish their consistency and the convergence rates in Banach space norms (such as a sup-norm or a root mean squared norm), allowing for possibly non-compact infinite dimensional parameter spaces. For both mildly and severely ill-posed nonlinear inverse problems, our convergence rates in Hilbert space norms (such as a root mean squared norm) achieve the known minimax optimal rate for the nonparametric mean IV regression. We illustrate the theory with a nonparametric additive quantile IV regression. We present a simulation study and an empirical application of estimating nonparametric quantile IV Engel curves.

Abstract

This paper studies nonparametric estimation of conditional moment models in which the residual functions could be nonsmooth with respect to the unknown functions of endogenous variables. It is a problem of nonparametric nonlinear instrumental variables (IV) estimation, and a difficult nonlinear ill-posed inverse problem with an unknown operator. We first propose a penalized sieve minimum distance (SMD) estimator of the unknown functions that are identified via the conditional moment models. We then establish its consistency and convergence rate (in strong metric), allowing for possibly non-compact function parameter spaces, possibly non-compact finite or infinite dimensional sieves with flexible lower semicompact or convex penalty, or finite dimensional linear sieves without penalty. Under relatively low-level sufficient conditions, and for both mildly and severely ill-posed problems, we show that the convergence rates for the nonlinear ill-posed inverse problems coincide with the known minimax optimal rates for the nonparametric mean IV regression. We illustrate the theory by two important applications: root-n asymptotic normality of the plug-in penalized SMD estimator of a weighted average derivative of a nonparametric nonlinear IV regression, and the convergence rate of a nonparametric additive quantile IV regression. We also present a simulation study and an empirical estimation of a system of nonparametric quantile IV Engel curves.

Abstract

We study semiparametric efficiency bounds and efficient estimation of parameters defined through general nonlinear, possibly non-smooth and over-identified moment restrictions, where the sampling information consists of a primary sample and an auxiliary sample. The variables of interest in the moment conditions are not directly observable in the primary data set, but the primary data set contains proxy variables which are correlated with the variables of interest. The auxiliary data set contains information about the conditional distribution of the variables of interest given the proxy variables. Identification is achieved by the assumption that this conditional distribution is the same in both the primary and auxiliary data sets. We provide semiparametric efficiency bounds for both the “verify-out-of-sample” case, where the two samples are independent, and the “verify-in-sample” case, where the auxiliary sample is a subset of the primary sample; and the bounds are derived when the propensity score is unknown, or known, or belongs to a correctly specified parametric family. These efficiency variance bounds indicate that the propensity score is ancillary for the “verify-in-sample” case, but is not ancillary for the “verify-out-of-sample” case. We show that sieve conditional expectation projection based GMM estimators achieve the semiparametric efficiency bounds for all the above mentioned cases, and establish their asymptotic efficiency under mild regularity conditions. Although inverse probability weighting based GMM estimators are also shown to be semiparametrically efficient, they need stronger regularity conditions and clever combinations of nonparametric and parametric estimates of the propensity score to achieve the efficiency bounds for various cases. Our results contribute to the literature on non-classical measurement error models, missing data and treatment effects.

Abstract

For semi/nonparametric conditional moment models containing unknown parametric components (theta) and unknown functions of endogenous variables (h), Newey and Powell (2003) and Ai and Chen (2003) propose sieve minimum distance (SMD) estimation of (theta, h) and derive the large sample properties. This paper greatly extends their results by establishing the followings: (1) The penalized SMD (PSMD) estimator (hat{theta}, hat{h}) can simultaneously achieve root-n asymptotic normality of theta hat and nonparametric optimal convergence rate of hat{h}, allowing for models with possibly nonsmooth residuals and/or noncompact infinite dimensional parameter spaces. (2) A simple weighted bootstrap procedure can consistently estimate the limiting distribution of the PSMD hat{theta}. (3) The semiparametric efficiency bound results of Ai and Chen (2003) remain valid for conditional models with nonsmooth residuals, and the optimally weighted PSMD estimator achieves the bounds. (4) The profiled optimally weighted PSMD criterion is asymptotically Chi-square distributed, which implies an alternative consistent estimation of confidence region of the efficient PSMD estimator of theta. All the theoretical results are stated in terms of any consistent nonparametric estimator of conditional mean functions. We illustrate our general theories using a partially linear quantile instrumental variables regression, a Monte Carlo study, and an empirical estimation of the shape-invariant quantile Engel curves with endogenous total expenditure.

Abstract

This paper considers semiparametric efficient estimation of conditional moment models with possibly nonsmooth residuals in unknown parametric components (theta) and unknown functions (h) of endogenous variables. We show that: (1) the penalized sieve minimum distance (PSMD) estimator (theta\hat,h\hat) can simultaneously achieve root-n asymptotic normality of theta\hat and nonparametric optimal convergence rate of h\hat, allowing for noncompact function parameter spaces; (2) a simple weighted bootstrap procedure consistently estimates the limiting distribution of the PSMD theta\hat; (3) the semiparametric efficiency bound formula of Ai and Chen (2003) remains valid for conditional models with nonsmooth residuals, and the optimally weighted PSMD estimator achieves the bound; (4) the centered, profiled optimally weighted PSMD criterion is asymptotically chi-square distributed. We illustrate our theories using a partially linear quantile instrumental variables (IV) regression, a Monte Carlo study, and an empirical estimation of the shape-invariant quantile IV Engel curves.

Abstract

In this paper, we clarify the relations between the existing sets of regularity conditions for convergence rates of nonparametric indirect regression (NPIR) and nonparametric instrumental variables (NPIV) regression models. We establish minimax risk lower bounds in mean integrated squared error loss for the NPIR and the NPIV models under two basic regularity conditions that allow for both mildly ill-posed and severely ill-posed cases. We show that both a simple projection estimator for the NPIR model, and a sieve minimum distance estimator for the NPIV model, can achieve the minimax risk lower bounds, and are rate-optimal uniformly over a large class of structure functions, allowing for mildly ill-posed and severely ill-posed cases.

Econometric Theory
Abstract

In this paper, we clarify the relations between the existing sets of regularity conditions for convergence rates of nonparametric indirect regression (NPIR) and nonparametric instrumental variables (NPIV) regression models. We establish minimax risk lower bounds in mean integrated squared error loss for the NPIR and the NPIV models under two basic regularity conditions that allow for both mildly ill-posed and severely ill-posed cases. We show that both a simple projection estimator for the NPIR model, and a sieve minimum distance estimator for the NPIV model, can achieve the minimax risk lower bounds, and are rate-optimal uniformly over a large class of structure functions, allowing for mildly ill-posed and severely ill-posed cases.

Keywords: Nonparametric instrumental regression, Nonparametric indirect regression, Statistical ill-posed inverse problems, Minimax risk lower bound, Optimal rate

JEL Classifications: C14, C30

Abstract

This paper considers identification and inference of a general latent nonlinear model using two samples, where a covariate contains arbitrary measurement errors in both samples, and neither sample contains an accurate measurement of the corresponding true variable. The primary sample consists of some dependent variables, some error-free covariates and an error-ridden covariate, where the measurement error has unknown distribution and could be arbitrarily correlated with the latent true values. The auxiliary sample consists of another noisy measurement of the mismeasured covariate and some error-free covariates. We first show that a general latent nonlinear model is nonparametrically identified using the two samples when both could have nonclassical errors, with no requirement of instrumental variables nor independence between the two samples. When the two samples are independent and the latent nonlinear model is parameterized, we propose sieve quasi maximum likelihood estimation (MLE) for the parameter of interest, and establish its root-n consistency and asymptotic normality under possible misspecification, and its semiparametric efficiency under correct specification. We also provide a sieve likelihood ratio model selection test to compare two possibly misspecified parametric latent models. A small Monte Carlo simulation and an empirical example are presented.