dcor.energy_distance

energy_distance(x, y, *, average=None, exponent=1, estimation_stat=EstimationStatistic.V_STATISTIC)

Estimator for energy distance.

Computes the estimator for the energy distance of the random vectors corresponding to \(x\) and \(y\). Both random vectors must have the same number of components.
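
For orientation, the quantity being estimated can be written out explicitly. The block below gives the standard definition of the energy distance and the plug-in form that the default V-statistic option corresponds to. The notation is an assumption made for this sketch and does not come from the library: \(\alpha\) stands for the exponent parameter, \(X'\) and \(Y'\) are independent copies of \(X\) and \(Y\), and \(n\) and \(m\) are the numbers of rows of x and y.

```latex
\[
\mathcal{E}(X, Y) = 2\,\mathbb{E}\|X - Y\|^{\alpha}
  - \mathbb{E}\|X - X'\|^{\alpha}
  - \mathbb{E}\|Y - Y'\|^{\alpha}
\]
\[
\widehat{\mathcal{E}}_{n,m} =
  \frac{2}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \|x_i - y_j\|^{\alpha}
  - \frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n} \|x_i - x_j\|^{\alpha}
  - \frac{1}{m^{2}} \sum_{i=1}^{m} \sum_{j=1}^{m} \|y_i - y_j\|^{\alpha}
\]
```

Each of the three terms is an average of pairwise distances, which is the role played by the average parameter.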

Parameters
  • x (T) – First random vector. The columns correspond to the individual random variables, while the rows are individual instances of the random vector.

  • y (T) – Second random vector. The columns correspond to the individual random variables, while the rows are individual instances of the random vector.

  • exponent (float) – Exponent of the Euclidean distance, in the range \((0, 2)\).

  • average (Callable[[T], T] | None) – A function that will be used to calculate an average of distances. This defaults to the mean.

  • estimation_stat (EstimationStatisticLike, i.e. Union[str, EstimationStatistic]) – If EstimationStatistic.U_STATISTIC, calculate the energy distance using Hoeffding’s unbiased U-statistics. Otherwise, use von Mises’s biased V-statistics, which is the default. If this is provided as a string, it will first be converted to an EstimationStatistic enum instance.
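
The difference between the two estimation statistics can be illustrated with a small pure-Python sketch. It is not the library's implementation: the names mean_pairwise and energy_distance_sketch and the unbiased flag are hypothetical, and the U-statistic branch assumes the textbook convention of leaving the zero-distance i == j pairs out of the within-sample averages.

```python
import math


def mean_pairwise(xs, ys, exponent=1.0, skip_diagonal=False):
    """Average of ||x - y||**exponent over pairs of rows.

    With skip_diagonal=True (only meaningful when xs and ys are the same
    sample, with at least two rows), the i == j pairs are excluded.
    """
    total, count = 0.0, 0
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            if skip_diagonal and i == j:
                continue
            total += math.dist(x, y) ** exponent
            count += 1
    return total / count


def energy_distance_sketch(xs, ys, exponent=1.0, unbiased=False):
    """Hypothetical helper: V-statistic by default, U-statistic if unbiased."""
    cross = mean_pairwise(xs, ys, exponent)
    within_x = mean_pairwise(xs, xs, exponent, skip_diagonal=unbiased)
    within_y = mean_pairwise(ys, ys, exponent, skip_diagonal=unbiased)
    return 2 * cross - within_x - within_y


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 0, 1], [0, 1, 1, 1], [1, 1, 1, 1]]

v_stat = energy_distance_sketch(a, b)                 # biased V-statistic
u_stat = energy_distance_sketch(a, b, unbiased=True)  # assumed U-statistic form
print(v_stat, u_stat)
```

On this data the V-statistic value agrees with the 20.5780594... shown in the examples for the default setting. The U-statistic value is smaller, because dropping the zero diagonal raises both within-sample averages.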

Returns

Value of the estimator of the energy distance.

Return type

T

Examples

>>> import numpy as np
>>> import dcor
>>> a = np.array([[1, 2, 3, 4],
...               [5, 6, 7, 8],
...               [9, 10, 11, 12],
...               [13, 14, 15, 16]])
>>> b = np.array([[1, 0, 0, 1],
...               [0, 1, 1, 1],
...               [1, 1, 1, 1]])
>>> dcor.energy_distance(a, a)
0.0
>>> dcor.energy_distance(a, b)
20.5780594...
>>> dcor.energy_distance(b, b)
0.0

A different exponent for the Euclidean distance in the range \((0, 2)\) can be used:

>>> dcor.energy_distance(a, a, exponent=1.5)
0.0
>>> dcor.energy_distance(a, b, exponent=1.5)
99.7863955...
>>> dcor.energy_distance(b, b, exponent=1.5)
0.0
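
The values above can be cross-checked without the library. The following is a minimal sketch, assuming the default V-statistic with a plain mean over all pairs of rows. The function name energy_distance_v is hypothetical, and only the standard library is used.

```python
import math


def energy_distance_v(xs, ys, exponent=1.0):
    """V-statistic sketch: 2 * mean(d_xy) - mean(d_xx) - mean(d_yy)."""

    def mean_dist(us, vs):
        # Mean of the Euclidean distance raised to `exponent`,
        # taken over every pair of rows (including identical rows).
        total = sum(math.dist(u, v) ** exponent for u in us for v in vs)
        return total / (len(us) * len(vs))

    return 2 * mean_dist(xs, ys) - mean_dist(xs, xs) - mean_dist(ys, ys)


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
b = [[1, 0, 0, 1], [0, 1, 1, 1], [1, 1, 1, 1]]

print(energy_distance_v(a, b))                # compare with 20.5780594...
print(energy_distance_v(a, b, exponent=1.5))  # compare with 99.7863955...
print(energy_distance_v(a, a, exponent=1.5))  # 0.0
```

A sample compared with itself gives exactly zero here because the cross term and the two within-sample terms are the same average.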