First of all, you should remove the “roc” and “auc” tags, since the Precision-Recall curve is something different (a small code sketch of both sets of axes follows the two lists below):
ROC curves:
- x-axis: False Positive Rate FPR = FP / (FP + TN) = FP / N
- y-axis: True Positive Rate TPR = Recall = TP / (TP + FN) = TP / P
Precision-Recall curves:
- x-axis: Recall = TP / (TP + FN) = TP / P = TPR
- y-axis: Precision = TP / (TP + FP) = TP / PP (PP = predicted positives)
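To make these definitions concrete, here is a minimal Python sketch; the counts tp, fp, tn and fn are made-up numbers, not taken from any real dataset:

```python
def tpr(tp, fn):
    """True Positive Rate = Recall = Sensitivity = TP / (TP + FN)."""
    return tp / (tp + fn)

def fpr(fp, tn):
    """False Positive Rate = FP / (FP + TN)."""
    return fp / (fp + tn)

def precision(tp, fp):
    """Precision = TP / (TP + FP), i.e. TP over all predicted positives."""
    return tp / (tp + fp)

# Hypothetical confusion-matrix counts (invented for illustration).
tp, fn, fp, tn = 80, 20, 10, 90
print(tpr(tp, fn))        # ROC y-axis / PR x-axis (recall): 0.8
print(fpr(fp, tn))        # ROC x-axis: 0.1
print(precision(tp, fp))  # PR y-axis: ~0.889
```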
Consider cancer detection as an example of a binary classification problem: your prediction is a probability, namely the probability that cancer is present.
In general, an instance is classified as A if P(A) > 0.5 (your threshold). For this threshold value you get one Recall/Precision pair, computed from the True Positives, True Negatives, False Positives and False Negatives.
Now, when you change the threshold of 0.5, you get a different result (a different pair). You could already classify a patient as "cancer" for P(A) > 0.3. This decreases precision and increases recall: you would tell some people they have cancer even though they do not, in order to make sure that patients who do have cancer receive the necessary treatment. This is the intuitive trade-off between TPR and FPR, between precision and recall, and between sensitivity and specificity.
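Here is a minimal sketch of that trade-off on a tiny set of invented predicted probabilities and labels (not real data): lowering the threshold from 0.5 to 0.3 raises recall and lowers precision.

```python
import numpy as np

# Invented predicted probabilities P(cancer) and true labels (1 = cancer).
proba  = np.array([0.9, 0.8, 0.6, 0.4, 0.35, 0.2, 0.1])
y_true = np.array([1,   1,   0,   1,   0,    0,   0])

for threshold in (0.5, 0.3):
    y_pred = (proba > threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    print(threshold,
          "recall =", tp / (tp + fn),
          "precision =", tp / (tp + fp))
# threshold 0.5 -> recall 0.67, precision 0.67
# threshold 0.3 -> recall 1.00, precision 0.60
```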
I'll add these terms as well, since you will see them most often in biostatistics (a quick numeric check follows the list):
- Sensitivity = TP / P = Recall = TPR
- Specificity = TN / N = (1 - FPR)
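A quick numeric check of these identities, using the same made-up counts as above:

```python
# Hypothetical confusion-matrix counts (invented for illustration).
tp, fn, fp, tn = 80, 20, 10, 90

sensitivity = tp / (tp + fn)   # = Recall = TPR = 0.8
specificity = tn / (tn + fp)   # = TN / N = 0.9
fpr = fp / (fp + tn)           # = 0.1
print(sensitivity, specificity, 1 - fpr)  # specificity == 1 - FPR
```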
ROC and Precision-Recall curves visualize all of these possible threshold values for your classifier.
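If you use scikit-learn, roc_curve and precision_recall_curve compute the points for all thresholds at once; here is a sketch on the same invented toy data:

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve, roc_auc_score

# Same invented toy data as above.
y_true = np.array([1, 1, 0, 1, 0, 0, 0])
proba  = np.array([0.9, 0.8, 0.6, 0.4, 0.35, 0.2, 0.1])

fpr, tpr, roc_thresholds = roc_curve(y_true, proba)
precision, recall, pr_thresholds = precision_recall_curve(y_true, proba)

print("ROC points (FPR, TPR):", list(zip(fpr, tpr)))
print("PR points (recall, precision):", list(zip(recall, precision)))
print("AUC:", roc_auc_score(y_true, proba))
```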
You need these metrics when accuracy is not a suitable measure of quality. Classifying all patients as "cancer-free" would give you maximum accuracy on an imbalanced dataset, but your ROC and Precision-Recall curves would collapse to degenerate 0/1 values and expose the problem (recall/TPR would be 0).
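A minimal sketch of that degenerate case, on an invented imbalanced sample of 100 patients:

```python
import numpy as np

# Invented imbalanced data: 95 healthy patients, 5 with cancer.
y_true = np.array([0] * 95 + [1] * 5)
y_pred = np.zeros_like(y_true)        # classify everyone as "cancer-free"

accuracy = np.mean(y_pred == y_true)  # 0.95, looks great
tp = np.sum((y_pred == 1) & (y_true == 1))
fn = np.sum((y_pred == 0) & (y_true == 1))
recall = tp / (tp + fn)               # 0.0, which exposes the problem
print(accuracy, recall)
```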