Let's say we start with
    import numpy as np
    from sklearn import metrics
Now we set the true y and the predicted scores:

    y = np.array([0, 0, 1, 1])
    scores = np.array([0.1, 0.4, 0.35, 0.8])
(Note that y has been shifted down by 1 relative to your problem. This is immaterial: exactly the same results (fpr, tpr, thresholds, etc.) are obtained whether the labels are 1, 2 or 0, 1, but some sklearn.metrics functions are awkward to use unless the labels are 0, 1.)
Let's compute the AUC:

    >>> metrics.roc_auc_score(y, scores)
    0.75
As in your example:
    >>> fpr, tpr, thresholds = metrics.roc_curve(y, scores)
    >>> fpr, tpr
    (array([ 0. ,  0.5,  0.5,  1. ]), array([ 0.5,  0.5,  1. ,  1. ]))
This gives the following graph:
    # plot here is matplotlib's pyplot.plot, e.g. from matplotlib.pyplot import plot
    plot([0, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 1], [0.5, 1], [1, 1]);

By construction, the ROC for a finite-length y will consist of rectangles:
At a sufficiently high threshold, everything will be classified as negative.
As the threshold decreases continuously, at discrete points some negative classifications will change to positive.
Thus, for a finite y, the ROC will always be a sequence of connected horizontal and vertical lines leading from (0, 0) to (1, 1).
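To make the staircase shape concrete, here is a minimal sketch that sweeps the threshold by hand and records the (FPR, TPR) point at each step. It is plain Python for illustration only, not how sklearn computes the curve; `roc_points` is a hypothetical helper name, and it assumes 0/1 labels and no score shared by a positive and a negative sample.

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) pairs obtained by sweeping the threshold from high to low."""
    n_pos = sum(1 for label in y_true if label == 1)
    n_neg = len(y_true) - n_pos
    # Threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    # Lower the threshold to each distinct score in turn, highest first.
    for thresh in sorted(set(y_score), reverse=True):
        tp = sum(1 for label, s in zip(y_true, y_score) if s >= thresh and label == 1)
        fp = sum(1 for label, s in zip(y_true, y_score) if s >= thresh and label == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


points = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(points)
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Each consecutive pair of points differs in exactly one coordinate, which is why the curve is made of horizontal and vertical segments. The leading (0, 0) point is added explicitly here; the roc_curve output shown earlier starts at the point after it.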
AUC is the sum of these rectangles. Here, as shown above, the AUC is 0.75, since the rectangles have areas of 0.5 * 0.5 + 0.5 * 1 = 0.75.
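The arithmetic above can be checked with a short sketch. The lists are the fpr and tpr values from the roc_curve output shown earlier, copied into plain Python lists; `rectangle_areas` is a hypothetical helper introduced only for this illustration.

```python
fpr = [0.0, 0.5, 0.5, 1.0]
tpr = [0.5, 0.5, 1.0, 1.0]


def rectangle_areas(fpr, tpr):
    """Width times left-hand height for every consecutive pair of ROC points."""
    return [(x1 - x0) * y0 for x0, x1, y0 in zip(fpr[:-1], fpr[1:], tpr[:-1])]


areas = rectangle_areas(fpr, tpr)
print(areas)       # [0.25, 0.0, 0.5]
print(sum(areas))  # 0.75
```

The middle entry is zero because the vertical segment at FPR = 0.5 has no width, so only the two horizontal segments contribute.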
Some people instead calculate AUC by linear interpolation. Say the length of y is much larger than the number of points actually calculated for FPR and TPR. In that case, linear interpolation approximates what might lie between the calculated points, under the assumption that, if y were large enough, the points in between would fall on a straight line. sklearn.metrics does not make this assumption, so to obtain results consistent with sklearn.metrics you must use rectangular rather than trapezoidal summation.
Let's write our own function to calculate AUC directly from fpr and tpr :
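    def auc_from_fpr_tpr(fpr, tpr, trapezoid=False):
        # Keep only the last point of each run of equal FPR values,
        # which drops the vertical segments (fpr and tpr are numpy arrays).
        inds = [i for (i, (s, e)) in enumerate(zip(fpr[:-1], fpr[1:])) if s != e] + [len(fpr) - 1]
        fpr, tpr = fpr[inds], tpr[inds]
        area = 0
        # list() so the pairs can be sliced under Python 3, where zip is lazy.
        ft = list(zip(fpr, tpr))
        for p0, p1 in zip(ft[:-1], ft[1:]):
            # Width times either the mean height (trapezoid) or the left height (rectangle).
            area += (p1[0] - p0[0]) * ((p1[1] + p0[1]) / 2 if trapezoid else p0[1])
        return area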
This function takes the FPR and TPR arrays, plus an optional parameter stating whether trapezoidal summation should be used. Running it, we get:
    >>> auc_from_fpr_tpr(fpr, tpr), auc_from_fpr_tpr(fpr, tpr, True)
    (0.75, 0.875)
We get the same result as sklearn.metrics for the rectangular summation, and a different, higher result for the trapezoidal summation.
So now we just need to see what happens to the FPR/TPR points if we cut the curve off at an FPR of 0.1. We can do this with the bisect module:
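    import bisect

    def get_fpr_tpr_for_thresh(fpr, tpr, thresh):
        # Index of the first FPR value that is >= thresh.
        p = bisect.bisect_left(fpr, thresh)
        # Copy so the caller's array is not modified, then clip that point to thresh.
        fpr = fpr.copy()
        fpr[p] = thresh
        # Drop every point past the cutoff.
        return fpr[:p + 1], tpr[:p + 1]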
How does it work? It simply finds the insertion point of thresh in fpr. Given the properties of FPR (it is sorted and starts at 0), the insertion point should lie on a horizontal segment. Thus, all rectangles before it are unaffected, all rectangles after it are dropped, and the one containing it is possibly shortened.
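As a quick illustration of the lookup, here is how `bisect.bisect_left` behaves on the FPR values from the example, copied into a plain list. The cutoff values 0.5 and 0.7 are extra cases chosen only for this sketch.

```python
import bisect

fpr = [0.0, 0.5, 0.5, 1.0]

# bisect_left returns the index of the first element that is >= the cutoff.
positions = [bisect.bisect_left(fpr, cutoff) for cutoff in (0.1, 0.5, 0.7)]
print(positions)  # [1, 1, 3]
```

A cutoff of 0.1 lands at index 1, so the point at index 1 is the one that gets clipped to the cutoff and everything after it is discarded.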
Let's apply it:

    >>> fpr_thresh, tpr_thresh = get_fpr_tpr_for_thresh(fpr, tpr, 0.1)
    >>> fpr_thresh, tpr_thresh
    (array([ 0. ,  0.1]), array([ 0.5,  0.5]))
Finally, we just need to calculate the AUC from the updated versions:
    >>> auc_from_fpr_tpr(fpr_thresh, tpr_thresh), auc_from_fpr_tpr(fpr_thresh, tpr_thresh, True)
    (0.050000000000000003, 0.050000000000000003)
In this case, the rectangular and trapezoidal sums give the same result. Note that in general they will not. For consistency with sklearn.metrics, you should use the former (rectangular summation).