dcor.homogeneity.energy_test

energy_test(*args, num_resamples=0, exponent=1, random_state=None, average=None, estimation_stat=EstimationStatistic.V_STATISTIC, n_jobs=1)

Test of homogeneity based on the energy distance.

Compute the test of homogeneity based on the energy distance, for an arbitrary number of random vectors.

The test is a permutation test where the null hypothesis is that all random vectors have the same distribution.
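
The permutation procedure can be illustrated with a minimal, standard-library sketch. It is not dcor's implementation: `permutation_pvalue` and `mean_gap` are hypothetical names, the toy statistic stands in for the energy statistic, and the `>=` comparison is an assumption of this sketch.

```python
import random


def permutation_pvalue(samples, statistic, num_resamples, seed=None):
    """Permutation p-value for the null that all samples share one distribution.

    `statistic` is any callable that takes a list of samples and returns a
    number that grows as the samples look less alike (hypothetical helper).
    """
    rng = random.Random(seed)
    observed = statistic(samples)
    sizes = [len(s) for s in samples]
    pooled = [row for s in samples for row in s]

    extreme = 0
    for _ in range(num_resamples):
        rng.shuffle(pooled)  # randomly relabel the pooled observations
        resampled, start = [], 0
        for size in sizes:  # split back into groups of the original sizes
            resampled.append(pooled[start:start + size])
            start += size
        if statistic(resampled) >= observed:  # assumed comparison convention
            extreme += 1

    # Counting the observed arrangement itself keeps the p-value above zero.
    return observed, (1 + extreme) / (1 + num_resamples)


def mean_gap(samples):
    """Toy statistic: gap between the largest and smallest sample mean."""
    means = [sum(s) / len(s) for s in samples]
    return max(means) - min(means)


x = [0.1, 0.4, 0.2, 0.3]
y = [5.0, 5.2, 4.9]
observed, pvalue = permutation_pvalue([x, y], mean_gap, num_resamples=99, seed=0)
print(observed, pvalue)
```

Under the null hypothesis the group labels are exchangeable, so the observed statistic should look like a typical draw from the reshuffled ones; a small p-value means few reshuffles were as extreme as the observed arrangement.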

Parameters
  • args (T) – Random vectors, one array per sample. The columns correspond to the individual random variables and the rows to individual observations of the random vector.

  • num_resamples (int) – Number of permutation resamples to draw in the permutation test. With the default of 0, no resampling is done and the reported p-value is 1.0.

  • exponent (float) – Exponent of the Euclidean distance, in the range \((0, 2)\).

  • random_state (RandomLike) – Random state used to generate the permutations. Different values can yield different p-values for the same data.

  • average (Callable[[T], T] | None) – A function used to calculate an average of distances. If None, np.mean is used.

  • estimation_stat (EstimationStatisticLike) – If EstimationStatistic.U_STATISTIC, calculate energy distance using Hoeffding’s unbiased U-statistics. Otherwise, use von Mises’s biased V-statistics. If this is provided as a string, it will first be converted to an EstimationStatistic enum instance.

  • n_jobs (int | None) – Number of jobs executed in parallel by Joblib.

Returns

Results of the hypothesis test.

Return type

HypothesisTest[T]

See also

energy_distance

Examples

>>> import numpy as np
>>> import dcor
>>> a = np.array([[1, 2, 3, 4],
...               [5, 6, 7, 8],
...               [9, 10, 11, 12],
...               [13, 14, 15, 16]])
>>> b = np.array([[1, 0, 0, 1],
...               [0, 1, 1, 1],
...               [1, 1, 1, 1]])
>>> c = np.array([[1000, 0, 0, 1000],
...               [0, 1000, 1000, 1000],
...               [1000, 1000, 1000, 1000]])
>>> dcor.homogeneity.energy_test(a, a)
HypothesisTest(pvalue=1.0, statistic=0.0)
>>> dcor.homogeneity.energy_test(a, b) 
HypothesisTest(pvalue=1.0, statistic=35.2766732...)
>>> dcor.homogeneity.energy_test(b, b)
HypothesisTest(pvalue=1.0, statistic=0.0)
>>> dcor.homogeneity.energy_test(a, b, num_resamples=5, random_state=0)
HypothesisTest(pvalue=0.1666666..., statistic=35.2766732...)
>>> dcor.homogeneity.energy_test(a, b, num_resamples=5, random_state=6)
HypothesisTest(pvalue=0.3333333..., statistic=35.2766732...)
>>> dcor.homogeneity.energy_test(a, c, num_resamples=7, random_state=0)
HypothesisTest(pvalue=0.125, statistic=4233.8935035...)
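
The p-values above are consistent with the usual permutation estimate (1 + k) / (1 + num_resamples), where k counts the resampled statistics at least as extreme as the observed one. This formula is inferred from the outputs shown here rather than stated on this page, and `pvalue_from_count` is a hypothetical helper used only for illustration:

```python
from fractions import Fraction


def pvalue_from_count(num_extreme, num_resamples):
    """Assumed permutation estimate: (1 + k) / (1 + num_resamples)."""
    return Fraction(1 + num_extreme, 1 + num_resamples)


print(pvalue_from_count(0, 0))  # 1: no resamples, so the p-value is 1.0
print(pvalue_from_count(0, 5))  # 1/6, the smallest value 5 resamples allow
print(pvalue_from_count(1, 5))  # 1/3
print(pvalue_from_count(0, 7))  # 1/8
```

Under this reading, the smallest attainable p-value is 1 / (1 + num_resamples), so a test at the 0.05 level would need at least 19 resamples.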

A different exponent for the Euclidean distance in the range \((0, 2)\) can be used:

>>> dcor.homogeneity.energy_test(a, b, exponent=1.5)
HypothesisTest(pvalue=1.0, statistic=171.0623923...)
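
To show where the exponent enters, here is a minimal pure-Python sketch of a two-sample energy statistic. It is not dcor's code: `mean_distance` and `energy_statistic` are hypothetical helpers, and both the V-statistic averaging and the n*m / (n + m) scaling are assumptions, chosen because they reproduce the statistics printed above for `a` and `b`.

```python
import math


def mean_distance(xs, ys, exponent=1.0):
    """Mean of |x - y|**exponent over all pairs, including identical ones."""
    total = sum(math.dist(x, y) ** exponent for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def energy_statistic(xs, ys, exponent=1.0):
    """Energy distance scaled by n*m / (n + m) (assumed scaling)."""
    n, m = len(xs), len(ys)
    distance = (
        2 * mean_distance(xs, ys, exponent)
        - mean_distance(xs, xs, exponent)
        - mean_distance(ys, ys, exponent)
    )
    return n * m / (n + m) * distance


a = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16)]
b = [(1, 0, 0, 1), (0, 1, 1, 1), (1, 1, 1, 1)]
print(energy_statistic(a, b))                # approximately 35.2767
print(energy_statistic(a, b, exponent=1.5))  # approximately 171.0624
```

Raising each pairwise distance to a larger exponent gives more weight to widely separated points, which is why the statistic grows from about 35.28 to about 171.06 on the same data.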