ISSN:
1572-9265
Keywords:
Least absolute deviations; robust regression; smoothing and regression splines; thin plate splines; lowess; cross validation; nonparametric estimation
Source:
Springer Online Journal Archives 1860-2000
Topics:
Computer Science, Mathematics
Notes:
Abstract: The computation of $L_1$ smoothing splines on large data sets is often desirable, but computationally infeasible. A locally weighted, LAD smoothing spline based smoother is suggested, and preliminary results are discussed. Specifically, one can seek smoothing splines in the spaces $W_m(D)$, with $[0,1]^n \subseteq D$. We assume data of the form $y_i = f(t_i) + \varepsilon_i$, $i = 1, \ldots, N$, with $\{t_i\}_{i=1}^N \subset D$, where the $\varepsilon_i$ are errors with $E(\varepsilon_i) = 0$ and $f$ is assumed to be in $W_m$. An LAD smoothing spline is the solution, $s_\lambda$, of the following optimization problem $$\min\limits_{g \in W_m} \frac{1}{N} \sum\limits_{i=1}^N \left| y_i - g(t_i) \right| + \lambda J_m(g), \tag{1}$$ where $J_m(g)$ is the seminorm consisting of the standard sum of the squared $L_2$ norms of the $m$th partial derivatives of $g$. Such an LAD smoothing spline, $s_\lambda$, would be expected to give robust smoothed estimates of $f$ in situations where the $\varepsilon_i$ are from a distribution with heavy tails. For fixed $\lambda > 0$, the solution to such a problem is known to be a thin plate spline on $W_m$, and hence $s_\lambda$ is assumed to be of the form $$s_\lambda = \sum\nolimits_{\nu=1}^M d_\nu \phi_\nu + \sum\nolimits_{i=1}^N c_i \zeta_i,$$ where $$\zeta_i(t) = R_1(t_i, t), \qquad R(s,t) = R_0(s,t) + R_1(s,t)$$ is the reproducing kernel for $W_m(D)$, $R_1(t_i, t) = \operatorname{proj}_{W_m^0} R(t_i, t)$, and the functions $\{\phi_\nu\}_{\nu=1}^M$ span $\operatorname{Kern}(\operatorname{proj}_{W_m^0}) = \operatorname{Kern}(J_m)$. Optimality conditions defining $s_\lambda$ as the solution to (1) yield an algorithm for its computation. However, this computation becomes unwieldy when $N \simeq O(10^3)$. A possible remedy is to solve "local" problems of the form of (1), on neighborhoods of "size" $b$, and to blend these locally optimal LAD splines together, producing a globally smooth estimator. The two smoothing parameters (the global value of $\lambda$ and the local neighborhood size $b$) should preferably be chosen in a data-driven way, by cross validation.
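Illustration (not part of the original record): a minimal Python sketch of why an absolute-deviation fit term as in (1) is robust to gross outliers. It is a discrete one-dimensional analogue only, under several assumptions: the fitted values at equally spaced points replace the function g, a squared second-difference penalty stands in for J_m with m = 2, and the minimizer is approximated by iteratively reweighted least squares rather than by the optimality-condition algorithm the abstract refers to. All function names (`penalty_matrix`, `solve`, `weighted_fit`, `lad_fit`) and the values of `lam`, `n_iter`, and `eps` are hypothetical choices made for this sketch. The least-squares fit used for comparison shares the same `lam` but a different fit term, so the two are not tuned to be equivalent.

```python
import math

def penalty_matrix(n):
    """K = D^T D, where D takes second differences of an n-vector."""
    K = [[0.0] * n for _ in range(n)]
    stencil = (1.0, -2.0, 1.0)
    for r in range(n - 2):
        idx = (r, r + 1, r + 2)
        for a, ca in zip(idx, stencil):
            for b, cb in zip(idx, stencil):
                K[a][b] += ca * cb
    return K

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def weighted_fit(y, lam, w):
    """Minimize (1/N) sum_i w_i (y_i - g_i)^2 / 2 + lam * g^T K g.

    Setting the gradient to zero gives (W + 2 N lam K) g = W y.
    """
    n = len(y)
    K = penalty_matrix(n)
    A = [[2.0 * n * lam * K[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        A[i][i] += w[i]
    return solve(A, [w[i] * y[i] for i in range(n)])

def lad_fit(y, lam, n_iter=30, eps=1e-8):
    """Approximate the minimizer of (1/N) sum_i |y_i - g_i| + lam * g^T K g.

    Each pass majorizes |r| by r^2 / (2 |r_old|) + |r_old| / 2, which is a
    weighted least-squares problem with weights 1 / |r_old| (capped by eps).
    """
    g = weighted_fit(y, lam, [1.0] * len(y))  # least-squares starting point
    for _ in range(n_iter):
        w = [1.0 / max(abs(yi - gi), eps) for yi, gi in zip(y, g)]
        g = weighted_fit(y, lam, w)
    return g

# Synthetic data: a smooth signal with two gross outliers (assumed test case).
n = 40
t = [i / (n - 1) for i in range(n)]
f = [math.sin(2.0 * math.pi * ti) for ti in t]
y = f[:]
y[10] += 8.0
y[28] -= 8.0

lad = lad_fit(y, lam=0.01)
ls = weighted_fit(y, 0.01, [1.0] * n)  # least-squares analogue
lad_err = max(abs(a - b) for a, b in zip(lad, f))
ls_err = max(abs(a - b) for a, b in zip(ls, f))
print("max error, LAD fit:", lad_err)
print("max error, least-squares fit:", ls_err)
```

In this sketch each outlier contributes at most 1/N to the gradient of the fit term, whatever its size, which is why the LAD fit stays close to the underlying signal while the least-squares fit is pulled toward the outliers. The local, blended scheme proposed in the abstract is not implemented here.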
Type of Medium:
Electronic Resource
URL:
http://dx.doi.org/10.1007/BF02143928