Stats

Stats — Tools for statistical analysis

Types and Values

  igt_stats_t
struct igt_mean

Includes

#include <igt.h>

Description

Various tools to make sense of data.

igt_stats_t is a container of data samples. igt_stats_push() is used to add new samples and various results (mean, variance, standard deviation, ...) can then be retrieved.

igt_stats_t stats;

igt_stats_init_with_size(&stats, 8);

igt_stats_push(&stats, 2);
igt_stats_push(&stats, 4);
igt_stats_push(&stats, 4);
igt_stats_push(&stats, 4);
igt_stats_push(&stats, 5);
igt_stats_push(&stats, 5);
igt_stats_push(&stats, 7);
igt_stats_push(&stats, 9);

printf("Mean: %f\n", igt_stats_get_mean(&stats));

igt_stats_fini(&stats);

Functions

igt_stats_init ()

void
igt_stats_init (igt_stats_t *stats);

Initializes an igt_stats_t instance. igt_stats_fini() must be called once finished with stats.

Parameters

stats

An igt_stats_t instance

 

igt_stats_init_with_size ()

void
igt_stats_init_with_size (igt_stats_t *stats,
                          unsigned int capacity);

Like igt_stats_init(), but takes an initial capacity to avoid reallocating the underlying array(s) when pushing new values. Useful when the number of data points stats will hold is roughly known in advance.

igt_stats_fini() must be called once finished with stats.

Parameters

stats

An igt_stats_t instance

 

capacity

Number of data samples stats can contain

 

igt_stats_fini ()

void
igt_stats_fini (igt_stats_t *stats);

Frees resources allocated in igt_stats_init() or igt_stats_init_with_size().

Parameters

stats

An igt_stats_t instance

 

igt_stats_is_population ()

bool
igt_stats_is_population (igt_stats_t *stats);

Parameters

stats

An igt_stats_t instance

 

Returns

true if stats represents a population, false if only a sample.

See igt_stats_set_population() for more details.


igt_stats_set_population ()

void
igt_stats_set_population (igt_stats_t *stats,
                          bool full_population);

In statistics, we usually deal with a subset of the full data (which may be a continuous or infinite set). Data analysis is then done on a sample of this population.

This matters because having only a sample of the data leads to biased estimators. We currently use the information given by this function to apply Bessel's correction to the variance.

Note that even if we obtain an unbiased variance by multiplying the sample variance by Bessel's correction, n/(n - 1), the standard deviation derived from that unbiased variance isn't itself unbiased. Statisticians talk about a "corrected" standard deviation.

When true is passed to this function, the data set in stats is considered a full population. Otherwise it is considered a sample of a bigger population.

When newly created, stats defaults to holding sample data.
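The effect of Bessel's correction can be illustrated with a standalone sketch. The helper variance_sketch() below is hypothetical and not part of the IGT API; it only assumes the textbook definitions (divide by n for a population, by n - 1 for a sample), not how IGT computes the value internally.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the IGT API.
 * Returns the variance of n values: the sum of squared deviations from
 * the mean, divided by n for a full population or by n - 1 (Bessel's
 * correction) for a sample.
 */
static double variance_sketch(const double *values, size_t n,
			      bool is_population)
{
	double mean = 0.0, sum_sq = 0.0;
	size_t i;

	/* Not enough data; returning 0 is an arbitrary choice for this sketch. */
	if (n == 0 || (!is_population && n < 2))
		return 0.0;

	for (i = 0; i < n; i++)
		mean += values[i];
	mean /= (double)n;

	for (i = 0; i < n; i++)
		sum_sq += (values[i] - mean) * (values[i] - mean);

	return sum_sq / (double)(is_population ? n : n - 1);
}
```

For the eight values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared deviations sum to 32, so this sketch gives 4 when the data is treated as a population and 32/7 (about 4.57) when it is treated as a sample.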

Parameters

stats

An igt_stats_t instance

 

full_population

true if stats holds a full population, false if it holds sample data

 

igt_stats_push ()

void
igt_stats_push (igt_stats_t *stats,
                uint64_t value);

Adds a new integer value to the stats dataset.

Parameters

stats

An igt_stats_t instance

 

value

An integer value

 

igt_stats_push_float ()

void
igt_stats_push_float (igt_stats_t *stats,
                      double value);

Adds a new value to the stats dataset and converts the igt_stats_t from an integer collection to a floating-point one.

Parameters

stats

An igt_stats_t instance

 

value

A floating-point value

 

igt_stats_push_array ()

void
igt_stats_push_array (igt_stats_t *stats,
                      const uint64_t *values,
                      unsigned int n_values);

Adds an array of values to the stats dataset.

Parameters

stats

An igt_stats_t instance

 

values

A pointer to an array of data points.

[array length=n_values]

n_values

The number of data points to add

 

igt_stats_get_min ()

uint64_t
igt_stats_get_min (igt_stats_t *stats);

Retrieves the minimum value in stats.

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_max ()

uint64_t
igt_stats_get_max (igt_stats_t *stats);

Retrieves the maximum value in stats.

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_range ()

uint64_t
igt_stats_get_range (igt_stats_t *stats);

Retrieves the range of the values in stats. The range is the difference between the highest and the lowest value.

The range can be a deceiving characterization of the values, because the extreme minimum and maximum values may just be anomalies. Prefer the interquartile range (see igt_stats_get_iqr()) or a histogram.

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_quartiles ()

void
igt_stats_get_quartiles (igt_stats_t *stats,
                         double *q1,
                         double *q2,
                         double *q3);

Retrieves the quartiles of the stats dataset.
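Several conventions exist for computing quartiles, and this page does not say which one is used. As a minimal sketch, assuming the common "median of halves" convention, the hypothetical helpers below (not part of the IGT API) compute the three quartiles of an already sorted array.

```c
#include <stddef.h>

/*
 * Median of the sorted range sorted[start..end), end exclusive.
 * The range must not be empty.
 */
static double median_of_range(const double *sorted, size_t start, size_t end)
{
	size_t n = end - start;
	size_t mid = start + n / 2;

	if (n % 2)
		return sorted[mid];
	return (sorted[mid - 1] + sorted[mid]) / 2.0;
}

/*
 * Hypothetical helper, not part of the IGT API.
 * Quartiles of an already sorted array with n >= 2 values. q1 and q3
 * are the medians of the lower and upper halves; when n is odd the
 * middle value is left out of both halves (an assumed convention).
 */
static void quartiles_sketch(const double *sorted, size_t n,
			     double *q1, double *q2, double *q3)
{
	size_t half = n / 2;

	*q2 = median_of_range(sorted, 0, n);
	*q1 = median_of_range(sorted, 0, half);
	*q3 = median_of_range(sorted, n - half, n);
}
```

With the sorted values 2, 4, 4, 4, 5, 5, 7, 9 this sketch yields q1 = 4, q2 = 4.5 and q3 = 6, so the interquartile range q3 - q1 is 2 under this convention.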

Parameters

stats

An igt_stats_t instance

 

q1

Lower quartile (25th percentile).

[out]

q2

Median (50th percentile).

[out]

q3

Upper quartile (75th percentile).

[out]

igt_stats_get_iqr ()

double
igt_stats_get_iqr (igt_stats_t *stats);

Retrieves the interquartile range (IQR) of the stats dataset.

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_iqm ()

double
igt_stats_get_iqm (igt_stats_t *stats);

Retrieves the interquartile mean (IQM) of the stats dataset.

The interquartile mean is a "statistical measure of central tendency". It is a truncated mean that discards the lowest and highest 25% of values, and calculates the mean value of the remaining central values.

It's useful for hiding outliers in measurements (due to a cold cache, etc.).
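The truncation described above can be sketched as follows. The helper iqm_sketch() is hypothetical and not part of the IGT API; it assumes the input is already sorted and, when the number of values is not a multiple of four, gives the two boundary values a fractional weight, which is one common way to define the interquartile mean.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of the IGT API.
 * Interquartile mean of an already sorted array: the lowest and highest
 * quarter of the data is discarded and the rest is averaged. Boundary
 * values that are only partly inside the middle half get a fractional
 * weight.
 */
static double iqm_sketch(const double *sorted, size_t n)
{
	double quarter, edge_weight, sum;
	size_t lo, hi, i;

	/* Degenerate inputs; returning 0 for n == 0 is an arbitrary choice. */
	if (n < 2)
		return n ? sorted[0] : 0.0;

	quarter = (double)n / 4.0;
	lo = (size_t)quarter;          /* index of the lower boundary value */
	hi = n - 1 - lo;               /* index of the upper boundary value */
	edge_weight = 1.0 - (quarter - (double)lo);

	sum = edge_weight * (sorted[lo] + sorted[hi]);
	for (i = lo + 1; i < hi; i++)
		sum += sorted[i];

	/* The middle half holds n / 2 values' worth of weight. */
	return sum / ((double)n / 2.0);
}
```

For the sorted values 2, 4, 4, 4, 5, 5, 7, 9 the sketch averages 4, 4, 5, 5 and returns 4.5. Replacing the 9 with an outlier such as 900 leaves that result unchanged, while the plain mean jumps from 5 to 116.375.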

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_mean ()

double
igt_stats_get_mean (igt_stats_t *stats);

Retrieves the mean of the stats dataset.

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_trimean ()

double
igt_stats_get_trimean (igt_stats_t *stats);

Retrieves the trimean of the stats dataset.

The trimean is the most efficient 3-point L-estimator, even more robust than the median at estimating the average of a sample population.
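For reference, Tukey's trimean is a weighted average of the three quartiles, with the median counted twice. The hypothetical helper below (not part of the IGT API) shows only the formula and takes the quartiles as inputs, so it makes no assumption about how they were obtained.

```c
/*
 * Hypothetical helper, not part of the IGT API.
 * Tukey's trimean: (Q1 + 2 * Q2 + Q3) / 4.
 */
static double trimean_sketch(double q1, double q2, double q3)
{
	return (q1 + 2.0 * q2 + q3) / 4.0;
}
```

For example, quartiles of 4, 4.5 and 6 give a trimean of (4 + 9 + 6) / 4 = 4.75.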

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_median ()

double
igt_stats_get_median (igt_stats_t *stats);

Retrieves the median of the stats dataset.

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_variance ()

double
igt_stats_get_variance (igt_stats_t *stats);

Retrieves the variance of the stats dataset.

Parameters

stats

An igt_stats_t instance

 

igt_stats_get_std_deviation ()

double
igt_stats_get_std_deviation (igt_stats_t *stats);

Retrieves the standard deviation of the stats dataset.

Parameters

stats

An igt_stats_t instance

 

igt_mean_init ()

void
igt_mean_init (struct igt_mean *m);

Initializes or resets m.

Parameters

m

tracking structure

 

igt_mean_add ()

void
igt_mean_add (struct igt_mean *m,
              double v);

Adds a new value v to m.

Parameters

m

tracking structure

 

v

value

 

igt_mean_get ()

double
igt_mean_get (struct igt_mean *m);

Computes the current mean of the samples tracked in m.

Parameters

m

tracking structure

 

igt_mean_get_variance ()

double
igt_mean_get_variance (struct igt_mean *m);

Computes the current variance of the samples tracked in m.

Parameters

m

tracking structure

 

Types and Values

igt_stats_t

typedef struct {
	unsigned int n_values;
	unsigned int is_float : 1;
	union {
		uint64_t *values_u64;
		double *values_f;
	};
} igt_stats_t;

Members

unsigned int n_values;

The number of pushed values

 

unsigned int is_float : 1;

Whether values_f or values_u64 is valid

 

uint64_t *values_u64;

An array containing pushed integer values

 

double *values_f;

An array containing pushed float values
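The is_float flag acts as the tag of the union: it says which of the two array pointers may be dereferenced. As a rough sketch, the hypothetical struct stats_view below mirrors only the documented fields (it is not the real igt_stats_t), and stats_view_get() shows how a reader could fetch a sample as a double regardless of the storage in use.

```c
#include <stdint.h>

/* Hypothetical mirror of the documented fields, for illustration only. */
struct stats_view {
	unsigned int n_values;
	unsigned int is_float : 1;
	union {
		uint64_t *values_u64;
		double *values_f;
	};
};

/*
 * Reads sample i as a double, picking the union member that is_float
 * marks as valid. Assumes i < n_values.
 */
static double stats_view_get(const struct stats_view *s, unsigned int i)
{
	return s->is_float ? s->values_f[i] : (double)s->values_u64[i];
}
```

Reading through the wrong member would reinterpret a pointer to integers as a pointer to doubles (or the reverse), which is why the flag has to be checked before every access.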

 

struct igt_mean

struct igt_mean {
};

Structure to compute running statistical numbers. Needs to be initialized with igt_mean_init(). Read out data using igt_mean_get() and igt_mean_get_variance().
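The fields of struct igt_mean are not shown here, so the following is only a sketch of how running statistics can be kept without storing the samples. It uses Welford's online algorithm, which is an assumption and not necessarily what IGT does; struct running_mean and its functions are hypothetical names, and the variance returned is the population variance (sum of squared deviations divided by the count).

```c
#include <stddef.h>

/* Hypothetical stand-in for the private fields of struct igt_mean. */
struct running_mean {
	double mean;   /* mean of the values seen so far */
	double m2;     /* sum of squared deviations from the current mean */
	size_t count;  /* number of values seen so far */
};

/* Initializes or resets the tracker. */
static void running_mean_init(struct running_mean *m)
{
	m->mean = 0.0;
	m->m2 = 0.0;
	m->count = 0;
}

/* Folds one value into the running mean and squared-deviation sum. */
static void running_mean_add(struct running_mean *m, double v)
{
	double delta = v - m->mean;

	m->count++;
	m->mean += delta / (double)m->count;
	m->m2 += delta * (v - m->mean);
}

/* Current mean; 0 if nothing has been added yet. */
static double running_mean_get(const struct running_mean *m)
{
	return m->mean;
}

/* Current population variance; 0 if nothing has been added yet. */
static double running_mean_get_variance(const struct running_mean *m)
{
	return m->count ? m->m2 / (double)m->count : 0.0;
}
```

Feeding the values 2, 4, 4, 4, 5, 5, 7, 9 one at a time into this sketch ends with a mean of 5 and a variance of 4, the same figures a two-pass computation over the stored values would give.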