This article summarizes the information found in One AI's Modeling Reports.
Jump to the video - One AI Results Summary Report
Introduction
The Results Summary report provides a comprehensive overview of the configuration and performance of machine learning models in One AI. It covers details such as estimator configuration, feature selection information, upsampling and probability calibration information, prediction details, feature analysis, model performance, and any relevant messages or warnings. This report is used to make informed decisions regarding manual overrides and advanced configuration, as well as model deployment. It aims to offer transparency and insights into the inner workings of the machine learning process, enabling users to optimize their models effectively.
If you have used other machine learning and predictive analytics tools for human resources, this type of report may be unfamiliar, because many offerings are based on a proprietary model developed by that particular vendor. Those vendors would consider sharing this sort of information as giving up their secret sauce, since the results summary provides a large amount of detail about the model itself, including exactly how it ran and how it performed. However, One Model believes in full transparency regarding machine learning and considers this report necessary for you to be successful. The results summary, paired with the Exploratory Data Analysis (EDA) report, gives you all the information you need to understand the model you created, so you can decide whether you trust the results and want to leverage them to make strategic decisions, or whether you need to go back and edit your model first.
Finding the Results Summary
To access the Results Summary report:
- Select One AI from the top navigation bar.
- Scroll to the desired ML model and click on the "Runs" button. Any run in a pending, ignored, deployed, or deployed and persisted status should contain a Results Summary report.
- Click on the label of the specific run iteration you want to view. You must view the EDA report before the full Results Summary will display, because you will not be able to fully interpret the Results Summary without the context the EDA report provides.
- Click View Results Summary in the bottom right of the EDA report, or click the Results Summary tab at the top of the window.
Results Summary: Estimation Details
The Estimation Details section of the Results Summary report provides a comprehensive overview of how the machine learning model was configured. This section offers transparency into the model's setup, empowering users to understand the underlying algorithms and settings used in the model.
The left-hand column details the classifier (or regressor) configuration. The selected classifier appears at the top, followed by details about its parameters and hyperparameters.
The center column starts by displaying feature selection information for both the filter and wrapper steps in that process. In general, models in One AI are presented with many features, sometimes hundreds. This section shows how the non-important features were removed, so that the model is only presented with powerful features that benefit its performance and fit.
Next, in the center column, Upsampling and Probability Calibration details are shown. You can easily see which methods, if any, were selected and performed.
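If these steps are new to you, the sketch below illustrates the general ideas of filter and wrapper feature selection, upsampling, and probability calibration using scikit-learn and imbalanced-learn. It is not One AI's implementation; the estimator, selector, and parameter choices are assumptions made purely for illustration.

```python
# Illustrative sketch of filter/wrapper feature selection, upsampling, and
# probability calibration. NOT One AI's implementation; all choices are assumptions.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, RFE, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
from imblearn.over_sampling import SMOTE

X, y = make_classification(n_samples=500, n_features=40, weights=[0.9, 0.1], random_state=0)

# Filter step: score each feature independently and keep the top 20.
X_filtered = SelectKBest(f_classif, k=20).fit_transform(X, y)

# Wrapper step: recursively drop the weakest features using the model itself.
wrapper = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)
X_selected = wrapper.fit_transform(X_filtered, y)

# Upsampling: synthesize minority-class rows to balance the training data.
X_balanced, y_balanced = SMOTE(random_state=0).fit_resample(X_selected, y)

# Probability calibration: wrap the estimator so its predicted probabilities
# better reflect observed frequencies.
calibrated = CalibratedClassifierCV(LogisticRegression(max_iter=1000), method="sigmoid", cv=5)
calibrated.fit(X_balanced, y_balanced)
```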
Anything and everything in the first two columns can be manually overridden by the model builder by clicking edit on the model from the augmentation screen and scrolling to the One AI configuration. If you do not have the know-how or do not wish to perform manual configurations, that’s totally fine! One AI will try an intelligent subset of the available options and select the combination that results in the best model performance and fit.
For classification models, the right-hand column provides a quick overview of the prediction details for this model, specifically what metric or column is being predicted (Binary Classification Target), the number of predictions in each class (No Termination/Termination), and which class is the positive label. The binary classification target and positive label can both be configured in the One AI query builder while you create or edit the model. For regression models, this column simply provides the regression target column.
Results Summary: Feature Analysis
The Feature Analysis section of the Results Summary report offers valuable insights into the importance and impact of features on the machine learning model's performance. It includes various tabs providing different perspectives on feature analysis, tailored to meet the diverse needs of users.
First and most important, the Feature Importances/Coefficients tab ranks and numerically scores features based on their significance in influencing model outcomes. By understanding which features have the most significant impact, users can prioritize their focus on key variables for further analysis or intervention. Additionally, users can increase or decrease the number of features the model can bring in depending on the feature importance value spread.
Please note that the format of these values will vary slightly depending on which estimator the model used. For example, for a classification model using a logistic regression classifier or a regression model using a Lasso regressor, feature coefficients, which are the weights by which the features are multiplied in the model, will be displayed in the place of feature importances in the Coefficients tab.
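As a hedged sketch of why the format differs, the snippet below shows how importances and coefficients are typically exposed by scikit-learn-style estimators; the data, feature names, and estimator choices are illustrative assumptions only.

```python
# Sketch: feature importances (tree-based models) vs. coefficients (linear models).
# Data and feature names are placeholders, not One AI output.
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=5, random_state=0)
features = [f"feature_{i}" for i in range(X.shape[1])]

# Tree-based estimators expose feature importances (all non-negative).
forest = RandomForestClassifier(random_state=0).fit(X, y)
print(pd.Series(forest.feature_importances_, index=features).sort_values(ascending=False))

# Linear estimators expose coefficients: the signed weights each feature is
# multiplied by, so both magnitude and direction matter.
logit = LogisticRegression(max_iter=1000).fit(X, y)
print(pd.Series(logit.coef_[0], index=features).sort_values(key=abs, ascending=False))
```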
SHAP Beeswarm is a visualization that offers a nuanced understanding of feature impact by providing numeric importance values for every feature across individual predictions at a glance. The beeswarm plot illustrates how each feature contributes to model predictions, helping users identify patterns and insights at a granular level. The horizontal axis indicates how predictive that feature is for that individual in a positive (where positive means the positive class or label) or negative direction. In addition, the beeswarm chart color codes each dot, where blue is lower and red is higher, based on how the feature value for that individual compares to the average for the entire population.
In the image below, there are several red dots for the hire cohort terminations feature. Red means that there is a high number of terminations in that employee's hire cohort. Because these red dots sit on the left side of the graph, a high number of terminations in an employee's hire cohort is predictive of hire failure, since hire failure is the negative class in this specific example.
SHAP Average displays a bar chart that shows the average of the absolute value of the SHAP values for each feature. It is an indicator of how important the feature was to this set of predictions but not whether the feature made a positive classification more likely.
SHAP must be enabled in the Global Settings of the One AI configuration for the SHAP Beeswarm and SHAP Average tabs to be created and available to view.
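For context, the open-source shap package produces these same two views; the sketch below shows the general mechanics with a placeholder model and data, and is not how One AI computes the plots for you.

```python
# Illustrative use of the shap package to produce a beeswarm plot and a
# mean-|SHAP| bar chart; model and data are placeholders.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=400, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = shap.Explainer(model, X)   # picks an appropriate explainer for the model
shap_values = explainer(X)             # one SHAP value per feature per row

shap.plots.beeswarm(shap_values)       # per-row impact, colored by feature value
shap.plots.bar(shap_values)            # average of |SHAP| per feature
```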
VIF Scores presents Variance Inflation Factor scores for each selected feature on the train, holdout, and predict sets, indicating the degree of correlation between features. The lowest possible value is 1, while a value of 8 or more indicates a high degree of correlation with another column. By identifying highly correlated features, users can mitigate multicollinearity issues and optimize model performance.
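As a hedged illustration, VIF scores of this kind can be computed with statsmodels; the feature names and values below are placeholders rather than One AI output.

```python
# Sketch: computing VIF scores with statsmodels. A VIF near 1 indicates little
# correlation with the other features; roughly 8+ suggests strong correlation.
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = pd.DataFrame({
    "tenure_months": [3, 12, 24, 36, 48, 60],
    "age": [22, 25, 30, 35, 41, 50],
    "salary": [40, 48, 55, 63, 70, 82],
})

vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif)
```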
Results Summary: Classification/Regression Report
The Classification/Regression Report section of the Results Summary report provides a comprehensive evaluation of the machine learning model's performance, specifically tailored for classification or regression tasks depending upon which type of problem is selected. This section offers a range of metrics and visualizations to assess the model's accuracy, precision, recall, and other performance indicators.
Classification Report
The first tab, the Classification Report, displays a table with the model’s F1, Precision, and Recall scores, providing insights into the model's overall performance in classifying data points into different categories. It is generated against the hold-out training set and helps users understand how well the model correctly identifies positive and negative instances. This indicates whether the model should be deployed, shared, and trusted for making strategic decisions, or whether more configuration is needed.
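The sketch below shows how these same scores are typically computed on a hold-out set with scikit-learn; the data, split, and class names are placeholders for illustration only.

```python
# Sketch: F1, Precision, and Recall on a hold-out set. Data and labels are placeholders.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test),
                            target_names=["No Termination", "Termination"]))
```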
Class Balance is generated against the prediction dataset and provides a visual representation of the distribution of predicted labels, offering insight into the model's ability to correctly classify instances across different categories. It helps users identify potential imbalances in the dataset.
The Class Prediction Error displays a stacked bar chart that compares the model's predictions with the actual labels, enabling users to visualize discrepancies and errors in classification. It is generated against the hold-out training set and helps identify patterns of misclassification and areas for improvement in model training.
The Confusion Matrix offers a tabular representation of the model's performance, showcasing the number of true positives, true negatives, false positives, and false negatives. It is generated against the hold-out training set and provides a detailed breakdown of classification outcomes, facilitating deeper analysis of model errors and performance across different classes.
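A minimal sketch of the four cells in that table, using placeholder hold-out labels and predictions:

```python
# Sketch: unpacking the four confusion-matrix cells. Labels and predictions are placeholders.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0]   # actual classes on the hold-out set
y_pred = [0, 0, 1, 1, 1, 0, 1, 0, 1, 0]   # model predictions
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")
```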
The Precision Recall Curve visualizes the trade-off between precision and recall for different threshold settings, helping users understand the model's performance across a range of decision boundaries. It is generated against the hold-out training set and aids in selecting an appropriate threshold for model deployment based on specific use case requirements.
The Receiver Operating Characteristic (ROC) is a graphical representation of the model's true positive rate versus false positive rate, providing insights into its ability to discriminate between different classes. The Area Under the Curve (AUC) metric quantifies the overall performance of the model, with higher values indicating better discrimination ability. It is generated against the hold-out training set.
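Both curves are derived from the model's hold-out probabilities; the sketch below shows the typical computation with scikit-learn, using placeholder labels and scores.

```python
# Sketch: precision-recall and ROC curves plus AUC from hold-out probabilities.
# Labels and scores below are placeholders.
from sklearn.metrics import precision_recall_curve, roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.1, 0.35, 0.4, 0.8, 0.3, 0.65, 0.2, 0.9]   # predicted P(positive class)

precision, recall, pr_thresholds = precision_recall_curve(y_true, y_prob)
fpr, tpr, roc_thresholds = roc_curve(y_true, y_prob)
print("AUC:", roc_auc_score(y_true, y_prob))   # closer to 1.0 = better discrimination
```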
The Discrimination Threshold graph is generated against the hold-out training set and plots the Precision, Recall, F1, and queue-rate scores on the y-axis against the discrimination threshold on the x-axis. The discrimination threshold is the probability at which the positive class is chosen over the negative class, and the respective curves demonstrate how the score values change as the threshold increases or decreases.
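A conceptual sketch of such a threshold sweep is shown below; the labels and probabilities are placeholders, and queue rate here simply means the share of records flagged as positive at a given threshold.

```python
# Conceptual sketch: how precision, recall, F1, and queue rate move with the threshold.
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.1, 0.35, 0.4, 0.8, 0.3, 0.65, 0.2, 0.9])

for threshold in np.linspace(0.1, 0.9, 5):
    y_pred = (y_prob >= threshold).astype(int)
    queue_rate = y_pred.mean()   # share of records flagged as positive
    print(f"threshold={threshold:.1f}  "
          f"precision={precision_score(y_true, y_pred, zero_division=0):.2f}  "
          f"recall={recall_score(y_true, y_pred):.2f}  "
          f"f1={f1_score(y_true, y_pred):.2f}  "
          f"queue_rate={queue_rate:.2f}")
```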
The Calibration Curve is generated against the hold-out training set and plots the true frequency of the positive label (y-axis) against the predicted probability for each bin (x-axis), breaking the predicted probabilities into 25 bins. For example, if the predicted probability were 0.2, a well-calibrated estimator would have a true frequency of positive labels around 0.2. The closer the calibration curve is to the dotted line, the better calibrated its predicted probabilities.
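The same idea can be reproduced with scikit-learn's calibration_curve, as in the hedged sketch below; the probabilities are synthetic placeholders constructed to be well calibrated.

```python
# Sketch: binning predicted probabilities and comparing each bin to the observed
# positive rate, as a calibration curve does. Data are synthetic placeholders.
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
y_prob = rng.uniform(0, 1, 1000)            # placeholder predicted probabilities
y_true = rng.uniform(0, 1, 1000) < y_prob   # well calibrated by construction

frac_positive, mean_predicted = calibration_curve(y_true, y_prob, n_bins=25)
# For a well-calibrated model, frac_positive tracks mean_predicted in every bin.
```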
Regression Report
The Regression Report provides crucial information about the performance of a regression model. The explained variance score indicates the proportion of variance in the dependent variable that is explained by the independent variables in the model, offering insights into how well the model fits the data. Meanwhile, the mean squared error quantifies the average squared difference between the actual and predicted values, serving as a measure of the model's accuracy in predicting the target variable. Together, these metrics help users assess the model's predictive capability and identify areas for improvement.
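For reference, both metrics are standard and can be computed as in the sketch below; the actual and predicted values shown are placeholders.

```python
# Sketch: the two headline regression metrics. Values are placeholders.
from sklearn.metrics import explained_variance_score, mean_squared_error

y_true = [3.0, 5.5, 2.1, 7.8, 4.4]
y_pred = [2.8, 5.9, 2.5, 7.1, 4.6]

print("explained variance:", explained_variance_score(y_true, y_pred))  # 1.0 is perfect
print("mean squared error:", mean_squared_error(y_true, y_pred))        # lower is better
```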
The Residuals Plot is generated against the train and hold-out (test) dataset and allows users to examine the distribution of residuals (the differences between observed and predicted values) against the predicted values. It serves as a diagnostic tool for evaluating the performance and assumptions of a regression model, guiding users in refining the model and enhancing its predictive accuracy.
The Prediction Error Plot is generated against the train and hold-out (test) dataset and visually displays the discrepancies between the actual and predicted values of the target variable. This plot shows the predicted value on the y-axis vs. the actual value of the record on the x-axis. In the best case, the line of best fit would be along the 45 degree line, so this plot is useful for visualizing the difference between the ideal scenario and the model's predictions.
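The sketch below mirrors what these two plots show, using placeholder values: residuals scattered against predictions, and predicted versus actual values with the ideal 45-degree line.

```python
# Sketch: a residuals plot and a predicted-vs-actual plot with the ideal 45-degree line.
# All values are placeholders.
import matplotlib.pyplot as plt
import numpy as np

y_actual = np.array([3.0, 5.5, 2.1, 7.8, 4.4, 6.2])
y_predicted = np.array([2.8, 5.9, 2.5, 7.1, 4.6, 5.8])
residuals = y_actual - y_predicted

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(y_predicted, residuals)
ax1.axhline(0, linestyle="--")            # residuals should scatter around zero
ax1.set(xlabel="Predicted value", ylabel="Residual", title="Residuals plot")

ax2.scatter(y_actual, y_predicted)
ax2.plot([2, 8], [2, 8], linestyle="--")  # the ideal 45-degree line
ax2.set(xlabel="Actual value", ylabel="Predicted value", title="Prediction error plot")
plt.show()
```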
The Prediction Distribution Plot is generated against the prediction dataset and illustrates the distribution of predicted values for the target variable across the dataset. A normal distribution is fit and overlaid to give context for outliers and insights into the spread and central tendency of the model's predictions. By examining this plot, users can assess the variability of predictions and identify any potential biases or patterns in the model's output.
The Prediction Comparison Plot is generated against the prediction and hold-out (test) dataset and juxtaposes the actual target variable values against the predicted values generated by the model. It offers users a visual representation of how closely the model's predictions align with the actual observed values across the dataset.
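A hedged sketch of these last two views follows: a histogram of predictions with a fitted normal curve overlaid, and actual versus predicted values plotted side by side. The data are synthetic placeholders and the layout is only one reasonable way to render these comparisons.

```python
# Sketch: prediction distribution with a fitted normal overlay, and an
# actual-vs-predicted comparison. Data are synthetic placeholders.
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

predictions = np.random.default_rng(0).normal(loc=50, scale=10, size=500)
actuals = predictions + np.random.default_rng(1).normal(scale=5, size=500)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Prediction distribution: histogram with a fitted normal curve for context.
mu, sigma = stats.norm.fit(predictions)
ax1.hist(predictions, bins=30, density=True)
xs = np.linspace(predictions.min(), predictions.max(), 200)
ax1.plot(xs, stats.norm.pdf(xs, mu, sigma))
ax1.set(title="Prediction distribution")

# Prediction comparison: actual vs. predicted values side by side.
ax2.hist([actuals, predictions], bins=30, label=["actual", "predicted"])
ax2.legend()
ax2.set(title="Prediction comparison")
plt.show()
```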
Results Summary: Messages and Warnings
The last section of the Results Summary report is the Messages and Warnings, which aims to further extend transparency into what happens when a machine learning augmentation is run. It displays information about the models created during the model selection process as well as setting overrides applied during the configuration. Specifically:
- Which algorithms were tested and scored for model selection?
- For each algorithm, was it run due to default settings, a heuristic, or a user override?
- Were there any setting combinations that were rejected due to the data fed to the model not meeting certain criteria?
Deploying a Model
Armed with the knowledge from the EDA report and the Results Summary, you can now choose to ignore, deploy, or deploy and persist the model in the bottom right, depending on which action you wish to take.
Ignore - puts the model into an ignored status which prevents it from being deployed or deployed and persisted in the future.
Deploy - moves the data from the augmentation tab to the front end of One Model. Deploying loads the results of that individual model run into the data model, feeding any metrics, dimensions, and columns created for reporting on this data. Storyboards created from machine learning data are generally configured to display the most recent deployed run and should automatically refresh after your site has been processed.
Deploy and Persist - everything described above for Deploy happens, but the predictive model is also “frozen”. After persisting a model, a new model will not be trained on subsequent runs. Instead, the existing model (algorithm, settings, features, etc.) will be used. The only thing that may change is the data it's predicting on.
Several files are available for download if you wish to perform further analysis outside of One Model; click the download button and make a selection.
Summary
The Results Summary report in One AI offers a comprehensive overview of machine learning model configurations, feature analysis, and performance metrics. The radical transparency within this report equips users with the knowledge needed to understand, interpret, and optimize models effectively and make informed decisions to drive impactful outcomes.
Video - One AI Results Summary Report
Watch this video to learn more about the Results Summary Report for machine learning in One AI, which along with your EDA Report, provides essential information about your model.
You will learn:
- What information is in the Results Summary report, including estimation details, feature analysis, performance, and messages & warnings
- How to navigate to the Results Summary report in One Model
- How to use the Results Summary report to inform model refinement, fine tuning, and configuration overrides
- How to deploy your model for use in Explore and Storyboards from the Results Summary report
Run time: >21 minutes