energy_test
- energy_test(*args, num_resamples=0, exponent=1, random_state=None, average=None, estimation_stat=EstimationStatistic.V_STATISTIC, n_jobs=1)
Test of homogeneity based on the energy distance.
Compute the test of homogeneity based on the energy distance, for an arbitrary number of random vectors.
The test is a permutation test where the null hypothesis is that all random vectors have the same distribution.
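The permutation scheme described above can be sketched in plain Python. This is an illustration under stated assumptions, not dcor's implementation: the helper names `energy_statistic` and `permutation_pvalue` and the `seed` argument are hypothetical, the statistic is assumed to be the V-statistic energy distance scaled by n*m/(n+m), and the p-value convention (the observed split counted as one resample, ties counted as extreme) is a common choice that may differ in detail from the library's. Because the shuffling differs, p-values from this sketch will not match dcor's for a given `random_state`.

```python
import math
import random


def energy_statistic(x, y, exponent=1.0):
    """Two-sample energy statistic using V-statistic (all-pairs) averages."""
    def mean_dist(u, v):
        # Average of distance**exponent over every pair, self-pairs included.
        total = sum(math.dist(p, q) ** exponent for p in u for q in v)
        return total / (len(u) * len(v))

    n, m = len(x), len(y)
    distance = 2 * mean_dist(x, y) - mean_dist(x, x) - mean_dist(y, y)
    return n * m / (n + m) * distance


def permutation_pvalue(x, y, num_resamples=0, exponent=1.0, seed=None):
    """Hypothetical helper: statistic and permutation p-value for two samples."""
    rng = random.Random(seed)
    observed = energy_statistic(x, y, exponent)
    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(num_resamples):
        rng.shuffle(pooled)
        # Split the shuffled pool back into groups of the original sizes.
        permuted = energy_statistic(pooled[:len(x)], pooled[len(x):], exponent)
        if permuted >= observed:
            extreme += 1
    # Assumed convention: the observed split counts as one of the resamples.
    return observed, (1 + extreme) / (1 + num_resamples)


a = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16)]
b = [(1, 0, 0, 1), (0, 1, 1, 1), (1, 1, 1, 1)]

statistic, pvalue = permutation_pvalue(a, b)
print(statistic, pvalue)  # statistic is about 35.2766732; p-value is 1.0 with no resamples
print(energy_statistic(a, b, exponent=1.5))  # about 171.0623923
```

With `num_resamples=0` the loop never runs, so the p-value is 1.0 regardless of the statistic. A meaningful p-value needs a positive number of resamples, and its smallest attainable value under this convention is 1 / (1 + num_resamples).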
- Parameters
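args (T) – Random vectors, one array per sample. The columns correspond to the individual random variables, while the rows are individual instances of the random vector. Samples may have different numbers of rows, as in the examples below.
num_resamples (int) – Number of permutation resamples to take in the permutation test. With the default of 0, no permutations are drawn and the reported p-value is 1.0.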
exponent (float) – Exponent of the Euclidean distance, in the range \((0, 2)\).
random_state (RandomLike) – Random state to generate the permutations.
average (Callable[[Array], Array] | None) – A function that will be used to calculate an average of distances. This defaults to np.mean.
estimation_stat (EstimationStatisticLike) – If EstimationStatistic.U_STATISTIC, calculate energy distance using Hoeffding’s unbiased U-statistics. Otherwise, use von Mises’s biased V-statistics. If this is provided as a string, it will first be converted to an EstimationStatistic enum instance.
n_jobs (int | None) – Number of jobs executed in parallel by Joblib.
- Returns
Results of the hypothesis test.
- Return type
HypothesisTest[Array]
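The difference between the two choices of estimation_stat can be illustrated on the within-sample term of the energy distance. The sketch below assumes the usual definitions: a V-statistic averages over all n*n ordered pairs of rows, including the zero self-distances, while a U-statistic averages only over the n*(n-1) pairs of distinct rows. The helper `mean_pairwise_distance` and its `unbiased` flag are hypothetical names, not part of dcor.

```python
import math


def mean_pairwise_distance(sample, unbiased=False):
    """Average Euclidean distance between the rows of one sample."""
    n = len(sample)
    # Self-pairs contribute zero, so the sum is the same for both estimators.
    total = sum(math.dist(p, q) for p in sample for q in sample)
    # V-statistic: divide by all n*n ordered pairs, self-pairs included.
    # U-statistic: divide by the n*(n-1) pairs of distinct rows only.
    return total / (n * (n - 1)) if unbiased else total / (n * n)


a = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16)]

print(mean_pairwise_distance(a))                 # 10.0 (V-statistic)
print(mean_pairwise_distance(a, unbiased=True))  # about 13.333 (U-statistic)
```

The V-statistic average is smaller because the zero self-distances pull it down, which is the source of its bias. The gap shrinks as the sample grows, since the two divisors differ by a factor of (n-1)/n.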
See also
energy_distance
Examples
>>> import numpy as np
>>> import dcor
>>> a = np.array([[1, 2, 3, 4],
...               [5, 6, 7, 8],
...               [9, 10, 11, 12],
...               [13, 14, 15, 16]])
>>> b = np.array([[1, 0, 0, 1],
...               [0, 1, 1, 1],
...               [1, 1, 1, 1]])
>>> c = np.array([[1000, 0, 0, 1000],
...               [0, 1000, 1000, 1000],
...               [1000, 1000, 1000, 1000]])
>>> dcor.homogeneity.energy_test(a, a)
HypothesisTest(pvalue=1.0, statistic=0.0)
>>> dcor.homogeneity.energy_test(a, b)
HypothesisTest(pvalue=1.0, statistic=35.2766732...)
>>> dcor.homogeneity.energy_test(b, b)
HypothesisTest(pvalue=1.0, statistic=0.0)
>>> dcor.homogeneity.energy_test(a, b, num_resamples=5, random_state=0)
HypothesisTest(pvalue=0.1666666..., statistic=35.2766732...)
>>> dcor.homogeneity.energy_test(a, b, num_resamples=5, random_state=6)
HypothesisTest(pvalue=0.3333333..., statistic=35.2766732...)
>>> dcor.homogeneity.energy_test(a, c, num_resamples=7, random_state=0)
HypothesisTest(pvalue=0.125, statistic=4233.8935035...)
A different exponent for the Euclidean distance in the range \((0, 2)\) can be used:
>>> dcor.homogeneity.energy_test(a, b, exponent=1.5)
HypothesisTest(pvalue=1.0, statistic=171.0623923...)