I would like to give a practical answer:
```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             roc_auc_score)

# Synthetic imbalanced dataset: ~90% negatives, ~10% positives.
X, y = make_classification(n_classes=2, class_sep=1.5, weights=[0.9, 0.1],
                           n_features=20, n_samples=1000, random_state=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=42)

clf = LogisticRegression(class_weight="balanced")
clf.fit(X_train, y_train)

# Classify as positive whenever the predicted probability exceeds THRESHOLD,
# instead of the default 0.5 used by clf.predict().
THRESHOLD = 0.25
preds = np.where(clf.predict_proba(X_test)[:, 1] > THRESHOLD, 1, 0)

pd.DataFrame(data=[accuracy_score(y_test, preds), recall_score(y_test, preds),
                   precision_score(y_test, preds), roc_auc_score(y_test, preds)],
             index=["accuracy", "recall", "precision", "roc_auc_score"])
```
By changing THRESHOLD from the default 0.5 to 0.25, you will find that recall increases while precision decreases: a lower threshold predicts more samples as positive, so fewer true positives are missed but more negatives are flagged. Conversely, if you remove the class_weight="balanced" argument, accuracy increases but recall falls. See the accepted answer as well.
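Rather than guessing a single threshold, you can inspect the whole precision/recall trade-off at once with scikit-learn's precision_recall_curve. This is a sketch of my own on the same synthetic data (not part of the answer above):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

# Same synthetic imbalanced dataset as above.
X, y = make_classification(n_classes=2, class_sep=1.5, weights=[0.9, 0.1],
                           n_features=20, n_samples=1000, random_state=10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33,
                                                    random_state=42)
clf = LogisticRegression(class_weight="balanced").fit(X_train, y_train)

# precision_recall_curve evaluates every candidate threshold in one call.
probs = clf.predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, probs)

# Recall is non-increasing as the threshold rises: lowering the threshold
# only adds predicted positives, so recall can never drop.
assert np.all(np.diff(recall) <= 0)
```

From these arrays you can pick the threshold that meets, say, a minimum recall requirement, instead of hard-coding 0.25.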