This note shows that the mixed normal asymptotic limit of the trend IV estimator with a fixed number of deterministic instruments (fTIV) holds in both singular (multicointegrated) and nonsingular cointegration systems, thereby relaxing the exogeneity condition in (Phillips and Kheifets, 2024, Theorem 1(ii)). The mixed normality of the limiting distribution of fTIV allows for asymptotically pivotal F tests about the cointegration parameters and for simple efficiency comparisons of the estimators for different numbers K of instruments, as well as comparisons with the trend IV estimator when K → ∞ with the sample size.
The method of sieves has been widely used in estimating semiparametric and nonparametric models. In this paper, we first provide a general theory on the asymptotic normality of plug-in sieve M estimators of possibly irregular functionals of semi/nonparametric time series models. Next, we establish a surprising result that the asymptotic variances of plug-in sieve M estimators of irregular (i.e., slower than root-T estimable) functionals do not depend on temporal dependence. Nevertheless, ignoring the temporal dependence in small samples may not lead to accurate inference. We then propose an easy-to-compute and more accurate inference procedure based on a “pre-asymptotic” sieve variance estimator that captures temporal dependence. We construct a “pre-asymptotic” Wald statistic using an orthonormal series long run variance (OS-LRV) estimator. For sieve M estimators of both regular (i.e., root-T estimable) and irregular functionals, a scaled “pre-asymptotic” Wald statistic is asymptotically F distributed when the number of series terms in the OS-LRV estimator is held fixed. Simulations indicate that our scaled “pre-asymptotic” Wald test with F critical values has more accurate size in finite samples than the usual Wald test with chi-square critical values.
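As a rough illustration of the OS-LRV construction mentioned in this abstract, the sketch below computes an orthonormal series long run variance estimate for a scalar series and applies a fixed-K scaling to a Wald statistic. The Fourier basis, the function names, and the scaling factor (K - q + 1)/(Kq) with an F(q, K - q + 1) reference distribution are assumptions drawn from the general fixed-K OS-LRV literature, not details stated in the abstract; the sieve M estimation step itself is not shown.

```python
import math

def fourier_basis(k, x):
    """k-th basis function on [0, 1] (assumed choice): sqrt(2)*cos / sqrt(2)*sin pairs."""
    j = (k + 1) // 2  # frequency index: k = 1, 2 -> j = 1; k = 3, 4 -> j = 2; ...
    if k % 2 == 1:
        return math.sqrt(2.0) * math.cos(2.0 * math.pi * j * x)
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * j * x)

def os_lrv(u, K):
    """Orthonormal series LRV estimate: average of K squared basis projections of u."""
    T = len(u)
    total = 0.0
    for k in range(1, K + 1):
        # Projection of the series on the k-th basis function, scaled by T^{-1/2}.
        proj = sum(fourier_basis(k, (t + 1) / T) * u[t] for t in range(T)) / math.sqrt(T)
        total += proj * proj
    return total / K

def scaled_wald(wald, q, K):
    """Rescale a Wald statistic with q restrictions for comparison with F(q, K - q + 1).

    The factor (K - q + 1) / (K * q) is an assumption taken from fixed-K theory.
    """
    return (K - q + 1) / (K * q) * wald
```

Because these basis functions sum to zero over t = 1, ..., T, the estimate is unaffected by the sample mean of the series, so no separate demeaning step is needed in this sketch.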
A new approach to robust testing in cointegrated systems is proposed using nonparametric HAC estimators without truncation. While such HAC estimates are inconsistent, they still produce asymptotically pivotal tests and, as in conventional regression settings, can improve testing and inference. The present contribution makes use of steep origin kernels which are obtained by exponentiating traditional quadratic kernels. Simulations indicate that tests based on these methods have improved size properties relative to conventional tests and better power properties than other tests that use Bartlett or other traditional kernels with no truncation.
Employing power kernels suggested in earlier work by the authors (2003), this paper shows how to refine methods of robust inference on the mean in a time series that rely on families of untruncated kernel estimates of the long-run parameters. The new methods improve the size properties of heteroskedastic and autocorrelation robust (HAR) tests in comparison with conventional methods that employ consistent HAC estimates, and they raise test power in comparison with other tests that are based on untruncated kernel estimates. Large power parameter (ρ) asymptotic expansions of the nonstandard limit theory are developed in terms of the usual limiting chi-squared distribution, and corresponding large sample size and large ρ asymptotic expansions of the finite sample distribution of Wald tests are developed to justify the new approach. Exact finite sample distributions are given using operational techniques. The paper further shows that the optimal ρ that minimizes a weighted sum of type I and II errors has an expansion rate of at most O(T^{1/2}) and can even be O(1) for certain loss functions, and is therefore slower than the O(T^{2/3}) rate which minimizes the asymptotic mean squared error of the corresponding long run variance estimator. A new plug-in procedure for implementing the optimal ρ is suggested. Simulations show that the new plug-in procedure works well in finite samples.
JEL Classification: C13; C14; C22; C51
Keywords: Asymptotic expansion, consistent HAC estimation, data-determined kernel estimation, exact distribution, HAR inference, large rho asymptotics, long run variance, loss function, power parameter, sharp origin kernel
A new class of kernel estimates is proposed for long run variance (LRV) and heteroskedastic autocorrelation consistent (HAC) estimation. The kernels are called steep origin kernels and are related to a class of sharp origin kernels explored by the authors (2003) in other work. They are constructed by exponentiating a mother kernel (a conventional lag kernel that is smooth at the origin) and they can be used without truncation or bandwidth parameters. When the exponent is passed to infinity with the sample size, these kernels produce consistent LRV/HAC estimates. The new estimates are shown to have limit normal distributions, and formulae for the asymptotic bias and variance are derived. With steep origin kernel estimation, bandwidth selection is replaced by exponent selection and data-based selection is possible. Rules for exponent selection based on minimum mean squared error (MSE) criteria are developed. Optimal rates for steep origin kernels that are based on exponentiating quadratic kernels are shown to be faster than those based on exponentiating the Bartlett kernel, which produces the sharp origin kernel. It is further shown that, unlike conventional kernel estimation where an optimal choice of kernel is possible in terms of MSE criteria (Priestley, 1962; Andrews, 1991), steep origin kernels are asymptotically MSE equivalent, so that choice of mother kernel does not matter asymptotically. The approach is extended to spectral estimation at frequencies ω ≠ 0. Some simulation evidence is reported detailing the finite sample performance of steep kernel methods in LRV/HAC estimation and robust regression testing in comparison with sharp kernel and conventional (truncated) kernel methods.
Keywords: Exponentiated kernel, Lag kernel, Long run variance, Optimal exponent, Spectral window, Spectrum
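A heuristic way to see why exponent selection can stand in for bandwidth selection is the following small-argument approximation. It is only a sketch: the curvature constant g is a symbol introduced here for illustration (for example, g = 6 for the Parzen kernel), and the approximations hold for lags h that are small relative to T.

```latex
k(x) = 1 - g x^{2} + o(x^{2}) \quad (x \to 0,\; g > 0)
\;\Longrightarrow\;
k^{\rho}(h/T) = \exp\{\rho \log k(h/T)\} \approx \exp\!\left(-\rho g \frac{h^{2}}{T^{2}}\right),
\qquad
\left(1 - \frac{|h|}{T}\right)^{\rho} \approx \exp\!\left(-\rho \frac{|h|}{T}\right).
```

Under this approximation the exponentiated quadratic kernel downweights lags beyond roughly T/(ρg)^{1/2}, while the exponentiated Bartlett kernel downweights lags beyond roughly T/ρ, so a growing exponent plays the role of a shrinking bandwidth-to-sample-size ratio.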
A new family of kernels is suggested for use in heteroskedasticity and autocorrelation consistent (HAC) and long run variance (LRV) estimation and robust regression testing. The kernels are constructed by taking powers of the Bartlett kernel and are intended to be used with no truncation (or bandwidth) parameter. As the power parameter (ρ) increases, the kernels become very sharp at the origin and increasingly downweight values away from the origin, thereby achieving effects similar to a bandwidth parameter. Sharp origin kernels can be used in regression testing in much the same way as conventional kernels with no truncation, as suggested in the work of Kiefer and Vogelsang (2002a, 2002b). A unified representation of HAC limit theory for untruncated kernels is provided using a new proof based on Mercer’s theorem that allows for kernels which may or may not be differentiable at the origin. This new representation helps to explain earlier findings like the dominance of the Bartlett kernel over quadratic kernels in test power and yields new findings about the asymptotic properties of tests with sharp origin kernels. Analysis and simulations indicate that sharp origin kernels lead to tests with improved size properties relative to conventional tests and better power properties than other tests using Bartlett and other conventional kernels without truncation.
If ρ is passed to infinity with the sample size (T), the new kernels provide consistent HAC and LRV estimates as well as continued robust regression testing. Optimal choice of ρ based on minimizing the asymptotic mean squared error of estimation is considered, leading to a rate of convergence of the kernel estimate of T^{1/3}, analogous to that of a conventional truncated Bartlett kernel estimate with an optimal choice of bandwidth. A data-based version of the consistent sharp origin kernel is obtained which is easily implementable in practical work.
Within this new framework, untruncated kernel estimation can be regarded as a form of conventional kernel estimation in which the usual bandwidth parameter is replaced by a power parameter that serves to control the degree of downweighting. Simulations show that in regression testing with the sharp origin kernel, the power properties are better than those with simple untruncated kernels (where ρ = 1) and at least as good as those with truncated kernels. Size is generally more accurate with sharp origin kernels than truncated kernels. In practice a simple fixed choice of the exponent parameter around ρ = 16 for the sharp origin kernel produces favorable results for both size and power in regression testing with sample sizes that are typical in econometric applications.
JEL Classification: C13; C14; C22; C51
Keywords: Consistent HAC estimation, Data determined kernel estimation, Long run variance, Mercer’s theorem, Power parameter, Sharp origin kernel
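The construction described in this abstract can be sketched as follows for a scalar series: the Bartlett kernel is raised to the power ρ and applied to all T - 1 sample autocovariances, with no truncation lag. This is a minimal sketch under stated assumptions, not the authors' implementation; the function names and the 1/T normalization of the autocovariances are choices made here for illustration.

```python
def sharp_origin_kernel(x, rho):
    """Bartlett kernel raised to the power rho: (1 - |x|)^rho on [-1, 1], zero elsewhere."""
    ax = abs(x)
    return (1.0 - ax) ** rho if ax <= 1.0 else 0.0

def autocovariance(u, h):
    """Sample autocovariance at lag h >= 0, demeaned and normalized by T (assumed convention)."""
    T = len(u)
    mean = sum(u) / T
    return sum((u[t] - mean) * (u[t + h] - mean) for t in range(T - h)) / T

def sharp_origin_lrv(u, rho):
    """Untruncated LRV estimate: every lag 1..T-1 enters, weighted by k(h/T)^rho."""
    T = len(u)
    lrv = autocovariance(u, 0)
    for h in range(1, T):
        lrv += 2.0 * sharp_origin_kernel(h / T, rho) * autocovariance(u, h)
    return lrv
```

For example, `sharp_origin_lrv(u, 16)` uses the fixed exponent ρ = 16 mentioned in the abstract, and `sharp_origin_lrv(u, 1)` gives the untruncated Bartlett case. The double loop costs O(T²) operations, which is acceptable for a sketch at the sample sizes typical in econometric applications.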
The local Whittle (or Gaussian semiparametric) estimator of long range dependence, proposed by Künsch (1987) and analyzed by Robinson (1995a), has a relatively slow rate of convergence and a finite sample bias that can be large. In this paper, we generalize the local Whittle estimator to circumvent these problems. Instead of approximating the short-run component of the spectrum, φ(λ), by a constant in a shrinking neighborhood of frequency zero, we approximate its logarithm by a polynomial. This leads to a “local polynomial Whittle” (LPW) estimator. We specify a data-dependent adaptive procedure that adjusts the degree of the polynomial to the smoothness of φ(λ) at zero and selects the bandwidth. The resulting “adaptive LPW” estimator is shown to achieve the optimal rate of convergence, which depends on the smoothness of φ(λ) at zero, up to a logarithmic factor.
Keywords: Adaptive estimator, Asymptotic bias, Asymptotic normality, Bias reduction, Local polynomial, Long memory, Minimax rate, Optimal bandwidth, Whittle likelihood
The local Whittle (or Gaussian semiparametric) estimator of long range dependence, proposed by Künsch (1987) and analyzed by Robinson (1995a), has a relatively slow rate of convergence and a finite sample bias that can be large. In this paper, we generalize the local Whittle estimator to circumvent those problems. Instead of approximating the short-run component of the spectrum, φ(λ), by a constant in a shrinking neighborhood of frequency zero, we approximate its logarithm by a polynomial. This leads to a “local polynomial Whittle” (LPW) estimator.
Following the work of Robinson (1995a), we establish the asymptotic bias, variance, mean-squared error (MSE), and normality of the LPW estimator. We determine the asymptotically MSE-optimal bandwidth, and specify a plug-in selection method for its practical implementation. When φ(λ) is smooth enough near the origin, we find that the bias of the LPW estimator goes to zero at a faster rate than that of the local Whittle estimator, and its variance is only inflated by a multiplicative constant. In consequence, the rate of convergence of the LPW estimator is faster than that of the local Whittle estimator, given an appropriate choice of the bandwidth m.
We show that the LPW estimator attains the optimal rate of convergence for a class of spectra containing those for which φ(λ) is smooth of order s > 1 near zero. When φ(λ) is infinitely smooth near zero, the rate of convergence of the LPW estimator based on a polynomial of high degree is arbitrarily close to n^{-1/2}.
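For orientation, the baseline local Whittle estimator that the LPW estimator generalizes can be sketched as below: the memory parameter d is chosen to minimize a concentrated Whittle objective over the first m Fourier frequencies, with the short-run component treated as a constant. This is a minimal sketch only; the grid search, the grid bounds, and the function names are assumptions made here, and the polynomial term in the logarithm of φ(λ) that defines the LPW estimator is not implemented.

```python
import cmath
import math

def periodogram(x, m):
    """Pairs (lambda_j, I(lambda_j)) at Fourier frequencies lambda_j = 2*pi*j/n, j = 1..m."""
    n = len(x)
    out = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        w = sum(x[t] * cmath.exp(1j * lam * t) for t in range(n))
        out.append((lam, abs(w) ** 2 / (2.0 * math.pi * n)))
    return out

def local_whittle_objective(d, pgram):
    """Concentrated objective: log(mean(lam^(2d) * I)) - 2d * mean(log lam)."""
    m = len(pgram)
    g_hat = sum(lam ** (2.0 * d) * i_val for lam, i_val in pgram) / m
    return math.log(g_hat) - 2.0 * d * sum(math.log(lam) for lam, _ in pgram) / m

def minimise_on_grid(pgram, d_min=-0.49, d_max=0.99, step=0.01):
    """Grid search over d (assumed bounds); a numerical optimizer could be used instead."""
    n_points = int(round((d_max - d_min) / step)) + 1
    grid = [d_min + step * i for i in range(n_points)]
    return min(grid, key=lambda d: local_whittle_objective(d, pgram))

def local_whittle(x, m):
    """Local Whittle estimate of d from the first m periodogram ordinates of x."""
    return minimise_on_grid(periodogram(x, m))
```

The bandwidth m is left as a user input here; the abstracts above describe plug-in and adaptive rules for choosing it, which this sketch does not attempt to reproduce.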