Description: This guide helps you create the metrics and storyboard used to interpret a New Hire Success / Failure classification model in One Model. Building these assets typically requires coordination with your customer success team and data engineer to add the necessary code to your processing script so the model output tables and dimensions exist for metric and storyboard creation.
In most cases, if you are deploying another classification model using the same general output shape, you can reuse existing tables and dimensions and may not need additional data engineering work.
Module Type: Functional
Level: Intermediate-Advanced
Audience: Model & storyboard creators
Prerequisites: Access to and experience creating metrics & storyboards in One Model. "One AI Recipes", "Model Deployment", & "New Hire Failure Risk Model" modules
Installation Instructions
Complete these recommended steps, in this order, to get started building new hire failure risk metrics and storyboards:
Deploy a New Hire Success / Failure classification model with SHAP enabled in the global settings.
Submit a ticket to have your data engineer add the required code to your processing script.
Allow your One Model site to reprocess.
Meet with your CS advisor to review the storyboard template structure and decide what you want to build and publish.
Note: Only create by-name and employee-level results if you have internal approval to do so.
Using the metric guide, create the necessary metrics.
Using the storyboard guide, work with the CS team to create the sections of interest.
Ensure storyboard viewers have the appropriate data access to review.
The One Model team is here if you get stuck!
New Hire Success / Failure Metric Guide
Use the Metrics for New Hire Success / Failure Risk Models guide as the source of truth for metric definitions and calculations and to build metrics to visualize your model results. You are welcome to modify metrics for your organizational needs. Ensure you grant metric and storyboard permissions only to appropriate data access roles.
The standard New Hire Failure template usually requires metrics in a few categories:
Model scope & volume: who is scored, what population is included (typically recent external hires), and how many predictions exist for the latest run.
Model performance: how well the model identifies likely Failure vs. Success outcomes.
Driver summaries: SHAP-based summaries that support “what’s influencing risk” views at the overall level and, optionally, for a selected individual.
Risk distributions: risk bucket metrics (Low / Medium / High) and breakouts by key dimensions to show where risk concentrates (for example, pay grade, tenure groupings, managerial status, generation).
Note: Many metrics are filtered to a specific augmentation. In your build, make sure the augmentation filter matches your New Hire Success / Failure augmentation name.
Additionally, many metrics are filtered by the label value. Model creators set these labels in the One AI Recipe Screen during the "Give your prediction target meaningful labels" step. In this guide, the label for new hire failures is referred to as "Failure", and the label for new hire successes is called "Success". You can customize these names to fit your organization's needs, but we recommend keeping label names consistent across models of the same recipe type to avoid the need to create new metrics for each model.
New Hire Failure Risk Storyboard Template Descriptions
This guide describes each section of the standard New Hire Failure storyboard and the value it provides. Your CS Advisor can show you this storyboard live upon request.
1. Model Information & Performance
What it is
A quick snapshot of what the model predicts, who it covers, and how well it performs.
What you will typically see
A plain-language “About the model” statement describing the prediction frame. Model context such as method, population size, how many new hires were predicted as failures, prediction rate, features selected, and deployment/run details. A performance view showing how well the model distinguishes Failure vs. Success using F1, precision, and recall.
Why it matters
This sets expectations before anyone interprets drivers or lists. It’s the fastest check for “is this model reliable enough for exploration and discussion?”
Important interpretation reminder
Performance differs by label. Interpret downstream views (drivers, distributions, lists) with that label context in mind.
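To make the label-level performance idea concrete, the sketch below shows how precision, recall, and F1 can be computed for a single label (such as "Failure") from actual and predicted outcomes. This is an illustrative calculation only; One Model computes these values for you, and the function name and inputs here are assumptions, not part of the product.

```python
def label_metrics(actual, predicted, label):
    """Precision, recall, and F1 for one label (e.g. "Failure").

    actual / predicted: equal-length lists of label strings.
    Illustrative only -- One Model reports these values in the
    performance view; this just shows what they mean.
    """
    tp = sum(1 for a, p in zip(actual, predicted) if a == label and p == label)
    fp = sum(1 for a, p in zip(actual, predicted) if a != label and p == label)
    fn = sum(1 for a, p in zip(actual, predicted) if a == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1
```

Running the same function with label="Success" will generally give different numbers, which is exactly why the guide says to interpret downstream views with the label context in mind.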
2. Drivers and Directionality
What it is
A transparent view of what the model relies on most, and whether those factors generally push predictions toward new hire success or toward new hire failure.
What you will typically see
Ranked driver summaries for Success and for Failure, plus a directionality view showing which features tend to increase failure risk vs. reduce it in aggregate. This is powered by SHAP so viewers can see both impact and direction, and numeric/date values are often translated into “Higher Than Average” vs. “Lower Than Average” for readability.
Why it matters
Stakeholders get clear answers to “what signals are associated with higher failure risk?” Analysts can sanity-check whether those signals match expectations and investigate surprising drivers.
Important interpretation reminder
Drivers explain model behavior based on historical patterns. They do not prove causation or prescribe interventions.
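As a rough illustration of how per-person SHAP values roll up into the ranked driver summaries described above, the sketch below averages SHAP values per feature and ranks features by absolute impact. The sign convention (positive pushes toward Failure) and the data shape are assumptions for the example; the actual aggregation in your storyboard is handled by One Model.

```python
def driver_summary(shap_rows):
    """Aggregate per-person SHAP values into mean impact per feature.

    shap_rows: list of dicts, one per scored person, mapping
    feature name -> SHAP value. Sign convention assumed here:
    positive values push toward the Failure label.
    Returns features ranked by absolute mean impact.
    """
    totals, counts = {}, {}
    for row in shap_rows:
        for feat, val in row.items():
            totals[feat] = totals.get(feat, 0.0) + val
            counts[feat] = counts.get(feat, 0) + 1
    means = {f: totals[f] / counts[f] for f in totals}
    return sorted(means.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

The sign of each mean gives the directionality view (increases vs. reduces failure risk), while the magnitude gives the ranking.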
3. Where Does Risk Sit
What it is
A distribution view that shows how new hire failure risk is spread across the population and across key segments.
What you will typically see
New hires grouped into Low / Medium / High risk buckets, plus risk distributions by selected dimensions (commonly things like pay grade, tenure groupings, managerial status, generation, or other relevant slices).
Why it matters
It turns individual risk into an organizational story: where risk concentrates, which groups are overrepresented in higher risk, and where follow-up cuts might be useful.
Important interpretation reminder
Buckets are a communication tool. Thresholds should be set intentionally based on your organization’s risk tolerance and use case.
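The bucketing logic itself is simple, as this sketch shows: a predicted Failure probability is mapped to Low / Medium / High by two thresholds. The threshold values here are placeholders, not recommendations; as noted above, you should set them deliberately for your organization.

```python
def risk_bucket(prob, low_max=0.33, high_min=0.66):
    """Map a predicted Failure probability to a risk bucket.

    low_max / high_min are illustrative defaults -- set these
    thresholds intentionally based on your organization's risk
    tolerance and use case.
    """
    if prob >= high_min:
        return "High"
    if prob > low_max:
        return "Medium"
    return "Low"
```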
4. Forecasts
What it is
A forward-looking view of trends that help frame the new hire failure story and support planning conversations.
What you will typically see
A trend over time extended into future periods. Depending on what your organization publishes, this may include separation/attrition trends, volumes of new hire separations/failures, and/or adjacent operational context trends that help stakeholders understand impact.
Why it matters
Forecasts help leaders anticipate directionality (up/down/stable) and decide where deeper analysis is needed—especially when paired with segmentation views from the risk distribution section.
Important interpretation reminder
Forecasts are projections from historical patterns and are sensitive to data coverage, seasonality, and organizational change.
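For intuition about what "a trend over time extended into future periods" means mechanically, here is a minimal least-squares trend projection. One Model's forecasting is more sophisticated than this; the sketch only illustrates the general idea that future points are extrapolated from historical patterns, which is why coverage, seasonality, and organizational change matter.

```python
def linear_forecast(history, periods_ahead):
    """Fit a least-squares line to a numeric history series and
    project it forward. Illustrative only -- real forecasting
    handles seasonality and uncertainty, which this does not.
    """
    n = len(history)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den if den else 0.0
    intercept = mean_y - slope * mean_x
    return [intercept + slope * (n + k) for k in range(periods_ahead)]
```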
5. By-name List(s)
What it is
A practical list view that lets approved audiences review who falls into higher risk groups and add business context.
What you will typically see
A by-name list (often focused on high risk) with employee identifiers and relevant context fields that make review actionable (for example, job family, employee type, pay grade, manager flag, and other review-friendly context). It is typically designed to be filterable so leaders can narrow to their org or subgroup.
Why it matters
This supports structured talent conversations: "who is at risk," "where are the clusters," and "what should we validate further?" without losing the model context.
6. Employee-level Analysis
What it is
A drill-down explainer for one employee at a time, intended for careful review with the right stakeholders—not self-service decisioning.
What you will typically see
Nothing appears until the viewer selects a single person from a Person (Predictions) filter. Once selected, the storyboard shows the employee’s predicted risk of new hire failure and explanation views: which features pushed the prediction toward Failure vs. Success, and how the employee’s feature values compare to the model population average.
Why it matters
This is the “why did the model score this person this way?” section. It makes risk explainable and supports thoughtful discussion when individual review is appropriate.
Important interpretation reminder
Person-level explanations describe model reasoning for a single prediction; they should be used as inputs to discussion, not as automated decisions or guarantees.