What is considered an acceptable or good F1 score for machine learning models?


This depends on the type of model you are working with, how you intend to use the results, and your stakeholders' comfort level. An F1 score closer to 1 indicates better model performance, where both precision and recall are high. An F1 score closer to 0 indicates poor model performance, or that the model was unable to predict at least one of the classes. The latter usually occurs when the dataset is too imbalanced for the model to learn how to predict one or more of the classes. It's important to remember that nearly all models are at least a little wrong: we can't always predict how humans will act because we do not have 100% comprehensive data. All models will generate some false negatives and some false positives, but they can still be incredibly useful.
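To make this concrete, here is a small illustrative sketch (the counts below are made up, not from a real model) showing how F1 is computed from precision and recall, and how the score collapses toward 0 for a class the model rarely predicts correctly:

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall.

    tp/fp/fn are true-positive, false-positive, and false-negative
    counts for a single class. Returns 0.0 when the model never
    predicts the class correctly.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# A class the model handles well: precision and recall both 0.9 -> F1 = 0.9.
print(round(f1_score(tp=90, fp=10, fn=10), 2))

# A rare class in an imbalanced dataset that the model mostly misses:
# recall is tiny, so F1 falls close to 0 even though some predictions land.
print(round(f1_score(tp=2, fp=5, fn=40), 2))
```

This is why a heavily imbalanced dataset can drive the F1 score for the minority class toward 0 even when overall accuracy looks fine.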

Generally speaking, for a voluntary attrition model, an F1 score in the 0.25 to 0.70 range is considered acceptable for the Terminating label. For the Non-Terminating label, we aim for 0.80 to 0.95. Results outside of those ranges might be acceptable but should be validated.

You can find the F1 score for your model in the Results Summary, in the Classification (or Regression) Report section (Data > Augmentations > Runs > Run Label > Results Summary).

[Image: Classification Report in the Results Summary from a Voluntary Attrition Model]


