Metrics
- pycalib.metrics.ECE(y_true, probs, normalize=False, bins=15, ece_full=True)
Calculate the ECE score based on the model's output probabilities and the true labels; a usage sketch follows the parameter list below.
- Parameters:
- y_true : list or ndarray
a list or an ndarray of shape (n_samples,) containing the actual class labels, or an ndarray of shape (n_samples, n_classes) whose largest value in each row is in the column of the correct class
- probs : list or ndarray
probabilities for all the classes, with shape (n_samples, n_classes)
- normalize : bool
in case of 1-vs-K calibration, the probabilities need to be normalized (default = False)
- bins : int
number of bins the probabilities are divided into (default = 15)
- ece_full : bool
whether to use ECE-full or ECE-max (default = True)
- Returns:
- ece : float
expected calibration error
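A minimal usage sketch (the labels, probabilities, and bin count below are made up for illustration; since the exact score depends on the data and binning, only the admissible range is checked):
>>> import numpy as np
>>> from pycalib.metrics import ECE
>>> y = np.array([0, 1, 1, 0])
>>> p = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]])
>>> ece = ECE(y, p, bins=2)
>>> print(0.0 <= ece <= 1.0)
True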
- pycalib.metrics.MCE(y_true, probs, normalize=False, bins=15, mce_full=False)
Calculate the MCE score based on the model's output probabilities and the true labels; a usage sketch follows the parameter list below.
- Parameters:
- y_true : list
a list containing the actual class labels
- probs : list
a list containing probabilities for all the classes, with shape (n_samples, n_classes)
- normalize : bool
in case of 1-vs-K calibration, the probabilities need to be normalized (default = False)
- bins : int
number of bins the probabilities are divided into (default = 15)
- mce_full : bool
whether to use the ECE-full or the ECE-max style of binning when calculating the MCE (default = False)
- Returns:
- mce : float
maximum calibration error
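Similarly, a minimal sketch for MCE (again with illustrative data; only the admissible range of the score is asserted):
>>> import numpy as np
>>> from pycalib.metrics import MCE
>>> y = np.array([0, 1, 1, 0])
>>> p = np.array([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.6, 0.4]])
>>> mce = MCE(y, p, bins=2)
>>> print(0.0 <= mce <= 1.0)
True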
- pycalib.metrics.accuracy(y_true, y_pred)
Classification accuracy score
Accuracy for binary and multiclass classification problems: the proportion of correct predictions, taking the class with the maximum probability in each score vector as the predicted class.
- Parameters:
- y_true : label indicator matrix (n_samples, n_classes)
True labels. # TODO Add option to pass array with shape (n_samples, )
- y_pred : matrix (n_samples, n_classes)
Predicted scores.
- Returns:
- score : float
Proportion of correct predictions as a value between 0 and 1.
Examples
>>> import numpy as np
>>> from pycalib.metrics import accuracy
>>> Y = np.array([[0, 1], [0, 1]])
>>> S = np.array([[0.1, 0.9], [0.6, 0.4]])
>>> accuracy(Y, S)
0.5
>>> Y = np.array([[0, 1], [0, 1]])
>>> S = np.array([[0.1, 0.9], [0, 1]])
>>> accuracy(Y, S)
1.0
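In the first example the second score vector puts its largest probability (0.6) on class 0 while the true class is 1, so only one of the two predictions is correct and the accuracy is 0.5; in the second example both argmax predictions match the true class, giving 1.0.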
- pycalib.metrics.binary_ECE(y_true, probs, power=1, bins=15)
Binary Expected Calibration Error
\[\text{binary-ECE} = \sum_{i=1}^M \frac{|B_{i}|}{N} | \bar{y}(B_{i}) - \bar{p}(B_{i})|\]
- Parameters:
- y_true : indicator vector (n_samples, )
True labels.
- probs : vector (n_samples, )
Predicted probabilities for the positive class.
- Returns:
- score : float
Examples
>>> import numpy as np
>>> from pycalib.metrics import binary_ECE
>>> Y = np.array([0, 1])
>>> P = np.array([0.1, 0.9])
>>> print(round(binary_ECE(Y, P, bins=2), 8))
0.1
>>> Y = np.array([0, 0, 0, 1, 1, 1])
>>> P = np.array([.1, .2, .3, .7, .8, .9])
>>> print(round(binary_ECE(Y, P, bins=2), 8))
0.2
>>> Y = np.array([0, 0, 0, 1, 1, 1])
>>> P = np.array([.4, .4, .4, .6, .6, .6])
>>> print(round(binary_ECE(Y, P, bins=2), 8))
0.4
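The 0.2 in the second example follows directly from the binary-ECE formula with two bins: the probabilities 0.1, 0.2, 0.3 (all with label 0) fall in the lower bin, where \(|\bar{y} - \bar{p}| = |0 - 0.2| = 0.2\), and 0.7, 0.8, 0.9 (all with label 1) fall in the upper bin, where \(|1 - 0.8| = 0.2\); each bin holds half the samples, so binary-ECE \(= \frac{3}{6} \cdot 0.2 + \frac{3}{6} \cdot 0.2 = 0.2\).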
- pycalib.metrics.binary_MCE(y_true, probs, power=1, bins=15)
Binary Maximum Calibration Error
\[\text{binary-MCE} = \max_{i \in \{1, ..., M\}} |\bar{y}(B_{i}) - \bar{p}(B_{i})|\]
- Parameters:
- y_true : indicator vector (n_samples, )
True labels.
- probs : vector (n_samples, )
Predicted probabilities for the positive class.
- Returns:
- score : float
Examples
>>> import numpy as np
>>> from pycalib.metrics import binary_MCE
>>> Y = np.array([0, 1])
>>> P = np.array([0.1, 0.6])
>>> print(round(binary_MCE(Y, P, bins=2), 8))
0.4
>>> Y = np.array([0, 0, 0, 1, 1, 1])
>>> P = np.array([.1, .2, .3, .6, .7, .8])
>>> print(round(binary_MCE(Y, P, bins=2), 8))
0.3
>>> Y = np.array([0, 0, 0, 1, 1, 1])
>>> P = np.array([.1, .2, .3, .3, .2, .1])
>>> print(round(binary_MCE(Y, P, bins=1), 8))
0.3
>>> Y = np.array([0, 0, 0, 1, 1, 1])
>>> P = np.array([.1, .2, .3, .9, .9, .9])
>>> print(round(binary_MCE(Y, P, bins=2), 8))
0.2
>>> Y = np.array([0, 0, 0, 1, 1, 1])
>>> P = np.array([.1, .1, .1, .6, .6, .6])
>>> print(round(binary_MCE(Y, P, bins=2), 8))
0.4
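The 0.3 in the second example can be read off the formula in the same way: with two bins, the lower bin (probabilities 0.1, 0.2, 0.3, all labelled 0) has a gap of \(|0 - 0.2| = 0.2\) and the upper bin (0.6, 0.7, 0.8, all labelled 1) has a gap of \(|1 - 0.7| = 0.3\); the maximum over the bins is 0.3.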
- pycalib.metrics.brier_score(y_true, y_pred)
Brier score
Computes the Brier score between the true labels and the estimated probabilities. This corresponds to the Mean Squared Error between the estimations and the true labels.
- Parameters:
- y_true : label indicator matrix (n_samples, n_classes)
True labels. # TODO Add option to pass array with shape (n_samples, )
- y_pred : matrix (n_samples, n_classes)
Predicted scores.
- Returns:
- score : float
Positive value between 0 and 1.
Examples
>>> import numpy as np
>>> from pycalib.metrics import brier_score
>>> Y = np.array([[0, 1], [0, 1]])
>>> S = np.array([[0.1, 0.9], [0.6, 0.4]])
>>> brier_score(Y, S)
0.185
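The 0.185 in the example is the mean squared error over all four entries of the matrices: \(\frac{1}{4}\left[(0.1 - 0)^2 + (0.9 - 1)^2 + (0.6 - 0)^2 + (0.4 - 1)^2\right] = \frac{0.74}{4} = 0.185\).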
- pycalib.metrics.classwise_ECE(y_true, probs, power=1, bins=15)
Classwise Expected Calibration Error
\[\text{class-$j$-ECE} = \sum_{i=1}^M \frac{|B_{i,j}|}{N} |\bar{y}_j(B_{i,j}) - \bar{p}_j(B_{i,j})|,\]
\[\text{classwise-ECE} = \frac{1}{K}\sum_{j=1}^K \text{class-$j$-ECE}\]
- Parameters:
- y_true : label indicator matrix (n_samples, n_classes)
True labels. # TODO Add option to pass array with shape (n_samples, )
- probs : matrix (n_samples, n_classes)
Predicted probabilities.
- Returns:
- score : float
Examples
>>> import numpy as np
>>> from pycalib.metrics import classwise_ECE
>>> Y = np.array([[1, 0], [0, 1]]).T
>>> P = np.array([[0.9, 0.1], [0.1, 0.9]]).T
>>> print(round(classwise_ECE(Y, P, bins=2), 8))
0.1
>>> Y = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]).T
>>> P = np.array([[.9, .8, .7, .3, .2, .1], [.1, .2, .3, .7, .8, .9]]).T
>>> print(round(classwise_ECE(Y, P, bins=2), 8))
0.2
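In the second example each class contributes a class-\(j\)-ECE of 0.2: for class 1, the probabilities 0.1, 0.2, 0.3 (true labels 0) give a lower-bin gap of 0.2 and 0.7, 0.8, 0.9 (true labels 1) give an upper-bin gap of 0.2, so class-1-ECE \(= \frac{3}{6} \cdot 0.2 + \frac{3}{6} \cdot 0.2 = 0.2\); class 0 is symmetric, and averaging over the two classes leaves 0.2.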
- pycalib.metrics.classwise_MCE(y_true, probs, bins=15)
Classwise Maximum Calibration Error
\[\text{class-$j$-MCE} = \max_{i \in \{1, ..., M\}} |\bar{y}_j(B_{i,j}) - \bar{p}_j(B_{i,j})|,\]
\[\text{classwise-MCE} = \max_{j \in \{1, ..., K\}} \text{class-$j$-MCE}\]
- Parameters:
- y_true : label indicator matrix (n_samples, n_classes)
True labels. # TODO Add option to pass array with shape (n_samples, )
- probs : matrix (n_samples, n_classes)
Predicted probabilities.
- Returns:
- score : float
Examples
>>> import numpy as np
>>> from pycalib.metrics import classwise_MCE
>>> Y = np.array([[1, 0], [0, 1]]).T
>>> P = np.array([[0.8, 0.1], [0.2, 0.9]]).T
>>> print(round(classwise_MCE(Y, P, bins=2), 8))
0.2
>>> Y = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]).T
>>> P = np.array([[.8, .7, .6, .1, .1, .1], [.2, .3, .4, .9, .9, .9]]).T
>>> print(round(classwise_MCE(Y, P, bins=2), 8))
0.3
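The 0.3 in the second example comes from the upper bin of class 0, which holds the probabilities 0.8, 0.7, 0.6 with true labels 1 and therefore has a gap of \(|1 - 0.7| = 0.3\); the lower bin of class 1 reaches the same gap, and no bin exceeds it, so the classwise-MCE is 0.3.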
- pycalib.metrics.conf_ECE(y_true, probs, bins=15)
Confidence Expected Calibration Error
Calculate the ECE score based on the model's maximum output probabilities and the true labels
\[\text{confidence-ECE} = \sum_{i=1}^M \frac{|B_{i}|}{N} | \text{accuracy}(B_{i}) - \bar{p}(B_{i})|\]
in which \(p\) are the maximum predicted probabilities.
- Parameters:
- y_true : list or ndarray
a list or an ndarray of shape (n_samples,) containing the actual class labels, or an ndarray of shape (n_samples, n_classes) whose largest value in each row is in the column of the correct class
- probs : list or ndarray
probabilities for all the classes, with shape (n_samples, n_classes)
- bins : int
number of bins the probabilities are divided into (default = 15)
- Returns:
- ece : float
expected calibration error
Examples
>>> import numpy as np
>>> from pycalib.metrics import conf_ECE
>>> Y = np.array([[1, 0], [0, 1]]).T
>>> P = np.array([[0.9, 0.1], [0.1, 0.9]]).T
>>> print(round(conf_ECE(Y, P, bins=2), 8))
0.1
>>> Y = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]).T
>>> P = np.array([[.9, .8, .7, .3, .2, .1], [.1, .2, .3, .7, .8, .9]]).T
>>> print(round(conf_ECE(Y, P, bins=2), 8))
0.2
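To see where the 0.2 of the second example comes from: the maximum probabilities are 0.9, 0.8, 0.7, 0.7, 0.8, 0.9 and every corresponding prediction is correct, so with two bins all six samples land in the upper bin, where the accuracy is 1 and the mean confidence is 0.8; the single weighted gap \(|1 - 0.8| = 0.2\) is the confidence-ECE.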
- pycalib.metrics.conf_MCE(y_true, probs, bins=15)
Calculate the MCE score based on the model's maximum output probabilities and the true labels; a usage sketch follows the parameter list below.
- Parameters:
- y_true : list or ndarray
a list or an ndarray of shape (n_samples,) containing the actual class labels, or an ndarray of shape (n_samples, n_classes) whose largest value in each row is in the column of the correct class
- probs : list or ndarray
probabilities for all the classes, with shape (n_samples, n_classes)
- bins : int
number of bins the probabilities are divided into (default = 15)
- Returns:
- mce : float
maximum calibration error
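A minimal usage sketch (the labels, probabilities, and bin count are illustrative; no particular score is asserted, since the value depends on the data and binning):
>>> import numpy as np
>>> from pycalib.metrics import conf_MCE
>>> Y = np.array([0, 1, 0, 1])
>>> P = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]])
>>> mce = conf_MCE(Y, P, bins=2)  # largest gap between accuracy and mean confidence over the bins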
- pycalib.metrics.cross_entropy(y_true, y_pred)
Cross-entropy score
Computes the cross-entropy (a.k.a. log-loss) for binary and multiclass classification scores.
- Parameters:
- y_true : label indicator matrix (n_samples, n_classes)
True labels. # TODO Add option to pass array with shape (n_samples, )
- y_pred : matrix (n_samples, n_classes)
Predicted scores.
- Returns:
- score : float
Examples
>>> import numpy as np
>>> from pycalib.metrics import cross_entropy
>>> Y = np.array([[0, 1], [0, 1]])
>>> S = np.array([[0.1, 0.9], [0.6, 0.4]])
>>> cross_entropy(Y, S)
0.5108256237659906
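The value in the example is the average negative log-likelihood of the true classes: \(-\frac{1}{2}\left[\ln(0.9) + \ln(0.4)\right] \approx 0.5108\).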