rand_binary_similarity
- skfp.distances.rand_binary_similarity(vec_a: list | ndarray | csr_array, vec_b: list | ndarray | csr_array) → float
Rand similarity for vectors of binary values.
Computes the Rand similarity [1] [2] (known as All-Bit [3] or Sokal-Michener) for binary data between two input arrays or sparse matrices, using the formula:
\[sim(x, y) = \frac{a + d}{n}\]
where:
\(a\) - both are 1 (\(|x \cap y|\), common “on” bits)
\(d\) - both are 0 (\(|\neg x \cap \neg y|\), common “off” bits)
\(n\) - length of passed vectors
The calculated similarity falls within the range \([0, 1]\). Passing all-zero vectors to this function results in a similarity of 0.
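As an illustration of the formula only, the counts \(a\), \(d\) and \(n\) can be computed directly with NumPy for dense inputs. This is a minimal sketch, not the library's implementation: the helper name `rand_similarity_sketch` is hypothetical, it handles only dense vectors, and it does not reproduce any special-case handling that `rand_binary_similarity` applies.

```python
import numpy as np


def rand_similarity_sketch(x, y):
    """Hypothetical helper: evaluates (a + d) / n for two dense binary vectors."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    if x.shape != y.shape:
        raise ValueError("vectors must have the same shape")

    a = np.sum(x & y)    # positions where both vectors are 1 (common "on" bits)
    d = np.sum(~x & ~y)  # positions where both vectors are 0 (common "off" bits)
    n = x.size           # vector length
    return float((a + d) / n)


# Same inputs as in the Examples section: a = 1, d = 1, n = 3
sim = rand_similarity_sketch([1, 0, 1], [1, 0, 0])
print(sim)
```

For these inputs the sketch evaluates (1 + 1) / 3, which agrees with the value shown in the Examples section.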
- Parameters:
vec_a ({ndarray, sparse matrix}) – First binary input array or sparse matrix.
vec_b ({ndarray, sparse matrix}) – Second binary input array or sparse matrix.
- Returns:
similarity – Rand similarity between vec_a and vec_b.
- Return type:
float
References
Examples
>>> from skfp.distances import rand_binary_similarity
>>> import numpy as np
>>> vec_a = np.array([1, 0, 1])
>>> vec_b = np.array([1, 0, 0])
>>> sim = rand_binary_similarity(vec_a, vec_b)
>>> sim
0.6666666666666666

>>> from scipy.sparse import csr_array
>>> vec_a = csr_array([[1, 0, 1]])
>>> vec_b = csr_array([[1, 0, 0]])
>>> sim = rand_binary_similarity(vec_a, vec_b)
>>> sim
0.6666666666666666