This function calculates either the Rousseeuw-Croux Sn or Qn scale estimator.
Usage
rousseeuwCroux(x, estimator = c("Sn", "Qn"), drop.na = FALSE)
Details
The Sn and Qn estimators, proposed by Rousseeuw and Croux (1993), are robust measures of
scale (or dispersion) that are designed to have a high breakdown point, which is a desirable
property for estimators in the presence of outliers or heavy-tailed distributions. Specifically,
both estimators have a breakdown point of 50%. The Sn estimator is a scaled median of the
per-observation medians of pairwise absolute differences, while the Qn estimator is a scaled
order statistic (roughly the first quartile) of all pairwise absolute differences.
Both estimators are consistent estimators of the population scale parameter
under normality. The function relies on the robustbase package to compute Sn and Qn,
but applies the refined finite-sample bias-correction factors of Akinshin (2022).
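To illustrate how the two estimators are defined in Rousseeuw and Croux (1993), the following is a minimal sketch of naive O(n^2) versions in base R. The names naive_sn and naive_qn are hypothetical and are not part of this package. The sketch uses only the asymptotic consistency constants (1.1926 for Sn, and 1 / (sqrt(2) * qnorm(5/8)), about 2.2191, for Qn) and omits any finite-sample bias correction, so its results will generally differ slightly from those of rousseeuwCroux(). It also assumes x contains no missing values.

```r
# Hypothetical helpers for illustration only; not the package's implementation.
# No finite-sample bias correction is applied.

naive_sn <- function(x) {
  x <- as.numeric(x)
  n <- length(x)
  # Inner step: for each x[i], the high median of |x[i] - x[j]| over all j
  inner <- vapply(x, function(xi) {
    sort(abs(xi - x))[floor(n / 2) + 1]
  }, numeric(1))
  # Outer step: low median of the n inner medians, scaled for consistency
  # with the standard deviation under normality
  1.1926 * sort(inner)[floor((n + 1) / 2)]
}

naive_qn <- function(x) {
  x <- as.numeric(x)
  n <- length(x)
  h <- floor(n / 2) + 1
  k <- choose(h, 2)
  # All pairwise absolute differences |x[i] - x[j]| with i < j
  d <- as.vector(stats::dist(x))
  # k-th order statistic, scaled for consistency under normality
  sort(d)[k] / (sqrt(2) * stats::qnorm(5 / 8))
}
```

Both helpers build all pairwise differences explicitly, which is fine for small samples. Rousseeuw and Croux (1993) also describe O(n log n) algorithms, which are preferable for large inputs.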
References
Rousseeuw, P.J. and Croux, C. (1993). Alternatives to the median absolute deviation. Journal of the American Statistical Association, 88(424):1273-1283.
Akinshin, A. (2022). Finite-sample Rousseeuw-Croux scale estimators. arXiv:2209.12268v1.
Examples
# Example 1:
x <- seq(1, 100)
tibble::tibble(
sd = stats::sd(x),
mad = stats::mad(x),
Sn = rousseeuwCroux(x, estimator = "Sn"),
Qn = rousseeuwCroux(x, estimator = "Qn")
)
#> # A tibble: 1 × 4
#>      sd   mad    Sn    Qn
#>   <dbl> <dbl> <dbl> <dbl>
#> 1  29.0  37.1  29.8  30.0
# Example 2:
x <- c(seq(1, 99), 1e3) # An outlier at 1000
tibble::tibble(
sd = stats::sd(x),
mad = stats::mad(x),
Sn = rousseeuwCroux(x, estimator = "Sn"),
Qn = rousseeuwCroux(x, estimator = "Qn")
)
#> # A tibble: 1 × 4
#>      sd   mad    Sn    Qn
#>   <dbl> <dbl> <dbl> <dbl>
#> 1  99.2  37.1  31.0  30.0