
This function calculates descriptive statistics, either classical or robust, for one or more specified variables or for all variables in a data frame or tibble.

Usage

summaryStats(x, var = NULL, digits = 2, robust = FALSE, drop.na = TRUE)

Arguments

x

A data frame or tibble.

var

A character vector naming the variable(s) for which to calculate the summary statistics. If left as NULL (the default), summary statistics are calculated for all variables in the data frame/tibble. In the iris example below, only the four numeric columns appear in the output.

digits

An integer specifying the number of digits to display after the decimal point in the output. The default is 2.

robust

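A logical value indicating whether to compute robust descriptive statistics. If FALSE (the default), classical descriptive statistics of each variable's univariate distribution are computed (mean, mode, median, IQR, standard deviation, variance, and so on). If TRUE, robust statistics are computed instead (median, MAD, Qn, Sn, medcouple, and so on), as in Example 2.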

drop.na

A logical value indicating whether to remove missing values (NA) from the calculations. If TRUE (the default), missing values will be removed. If FALSE, missing values will be included in the calculations.
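
To make the robust and drop.na arguments more concrete, here is a minimal base R sketch of the ideas behind them. It does not call summaryStats() and is not a description of its internals. The vectors x and y are made-up illustration data, and mean(), sd(), median(), and mad() stand in for the classical and robust statistics listed above.

```r
# Base R only: illustrates the concepts, not the internals of summaryStats().
x <- c(4.9, 5.0, 5.1, 5.2, 5.3, 50)  # hypothetical data; the last value is an outlier

# Classical location and scale are pulled toward the outlier.
classical <- c(location = mean(x), scale = sd(x))

# Robust counterparts (median and MAD) are barely affected by it.
robust <- c(location = median(x), scale = mad(x))

# Missing values: if an NA is kept, base R summaries return NA.
y <- c(x, NA)
mean_keep_na <- mean(y)                # NA propagates
mean_drop_na <- mean(y, na.rm = TRUE)  # NA removed first, equals mean(x)

print(classical)
print(robust)
```

With an outlier present, the classical location and scale are much larger than the robust ones, which is the usual reason for setting robust = TRUE. The second part shows why removing missing values first, as drop.na = TRUE is documented to do, is normally the more useful default.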

Value

A tibble containing the summary statistics, with one row per variable and one column per statistic.

Author

Christian L. Goueguel

Examples

# Load the iris dataset
data(iris)

# Example 1: classical statistics (the default) for all variables
iris |> summaryStats()
#> # A tibble: 4 × 14
#>   variable      mean  mode median   IQR    sd variance    cv   min   max range
#>   <chr>        <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 Petal.Length  3.76   1.4   4.35   3.5  1.77    3.12   47.1   1     6.9   5.9
#> 2 Petal.Width   1.2    0.2   1.3    1.5  0.76    0.581  63.3   0.1   2.5   2.4
#> 3 Sepal.Length  5.84   5     5.8    1.3  0.83    0.686  14.2   4.3   7.9   3.6
#> 4 Sepal.Width   3.06   3     3      0.5  0.44    0.190  14.4   2     4.4   2.4
#> # ℹ 3 more variables: skewness <dbl>, kurtosis <dbl>, count <int>

# Example 2: robust statistics for two selected variables
iris |> summaryStats(
  var = c("Sepal.Length", "Petal.Length"),
  robust = TRUE
)
#> # A tibble: 2 × 14
#>   variable    median   mad    Qn    Sn medcouple   LMC   RMC   rsd biloc biscale
#>   <chr>        <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
#> 1 Petal.Leng…   4.35  1.85  1.08  1.91     -0.4  -0.81  0.25  2.74  3.84    1.92
#> 2 Sepal.Leng…   5.8   1.04  0.87  0.83      0.06 -0.27  0.2   1.54  5.83    0.84
#> # ℹ 3 more variables: bivar <dbl>, rcv <dbl>, count <int>