1. The requirement in Lemma 2.3 is that P(X ∈ S) = 0 for any proper linear subspace S ⊂ R^k. It is easy to see that this condition is necessary for uniqueness of the minimizer of g.
2. Theorem 2.1 is not the first result describing the strong consistency of LAD. Amemiya (1979) proves it for independent and identically distributed samples when U has infinite mean. The present proof is similar to that of Gross and Steiger (1979) in the context of time series and allows the independence assumption to be relaxed; in particular, i.i.d. samples are not required.
3. The assumptions of Theorem 2 do not imply Bassett-Koenker. Even though stationarity and ergodicity of (xi,yi) imply that the design sample covariance matrix (Z’Z)/n is positive definite for n large, the non-independence of the yi requires new central limit theory. The idea of the proof is similar to, but simpler than, that of Amemiya (1979), which seems to have some mistakes. Ruppert and Carroll (1977) have taken a related approach for the location problem.
4. LAD is consistent for infinite variance regressions where least squares certainly is not. Suppose U,V are i.i.d. and integrable and write Z = U + V, X = U. Then X has a linear regression on Z with slope a = 1/2 because
$$ \begin{aligned}
E(X \mid Z) &= E(U \mid U + V) \\
&= E(V \mid U + V) \\
&= E(U + V \mid U + V)/2 \\
&= Z/2.
\end{aligned} $$
However, if U and V are taken to be symmetric stable random variables of index α < 2, the least squares estimator ĉn = Σ xi zi / Σ zi², based on an i.i.d. sample (zi,xi), converges in distribution to S/(S + T), where S and T are independent positive stable random variables of index α/2 [see Kanter and Steiger (1974)], so least squares is not consistent. A result in Kanter and Steiger (1977) shows that if α > 1, the LAD estimator ân → 1/2 in probability. Perhaps the most convenient source for material on stable laws and their domains of attraction is Feller (1971).
5. Jaeckel (1972) does not point out that choosing the scores as in (4.6) yields the LAD estimator, nor do Bassett and Koenker (1978) acknowledge that asymptotic normality of LAD can follow from Jaeckel (1972) for unconstrained regressions. Furthermore, although Hogg (1979) discusses R-estimation for regressions using the sign-scores, he does not mention any connection with LAD regression. Hence it is reasonable to suppose that Lemma 4.1 is new. It was mentioned by M. Osborne in a seminar in Canberra in 1980.
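
The contrast described in Note 4 can be explored numerically. The following is only a rough sketch under stated assumptions, not a procedure taken from the references: the symmetric stable variates are drawn with the Chambers-Mallows-Stuck construction, the regression of X on Z is fitted through the origin, and the LAD slope is computed as a weighted median of the ratios xi/zi with weights |zi| (which minimizes Σ|xi − a zi|). The function names `symmetric_stable`, `ls_slope`, `lad_slope` and `simulate`, the index α = 1.5, and the seed are all illustrative choices.

```python
import numpy as np


def symmetric_stable(alpha, size, rng):
    """Symmetric alpha-stable variates (alpha != 1), drawn with the
    Chambers-Mallows-Stuck construction (an assumed sampler)."""
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * theta) / np.cos(theta) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * theta) / w) ** ((1.0 - alpha) / alpha))


def ls_slope(z, x):
    """Least squares slope through the origin: sum(x*z) / sum(z^2)."""
    return np.sum(x * z) / np.sum(z * z)


def lad_slope(z, x):
    """LAD slope through the origin.

    sum |x_i - a z_i| = sum |z_i| * |x_i/z_i - a|, so a minimizer is a
    weighted median of the ratios x_i/z_i with weights |z_i|.
    Assumes no z_i is exactly zero.
    """
    ratio = x / z
    order = np.argsort(ratio)
    cum = np.cumsum(np.abs(z)[order])
    # first sorted ratio whose cumulative weight reaches half the total
    k = np.searchsorted(cum, 0.5 * cum[-1])
    return ratio[order][k]


def simulate(n, alpha=1.5, seed=0):
    """One i.i.d. sample (z_i, x_i) with Z = U + V, X = U; returns (LS, LAD)."""
    rng = np.random.default_rng(seed)
    u = symmetric_stable(alpha, n, rng)
    v = symmetric_stable(alpha, n, rng)
    z, x = u + v, u
    return ls_slope(z, x), lad_slope(z, x)


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        ls, lad = simulate(n, seed=n)
        print(f"n={n:>7}  least squares={ls:.4f}  LAD={lad:.4f}")
```

Rerunning with different seeds should show the LAD slope settling near 1/2 as n grows while the least squares slope continues to vary from sample to sample, in line with the limit S/(S + T) quoted in Note 4.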