The following are recommendations for leveraging survey data in predictive models in One AI. There is no “one size fits all” approach to including this data in models, so treat these as guidelines rather than fixed rules.
-
Dates
-
Determine which dates the survey results are aligned to
-
Consider aligning the headcount you’re making predictions on to these dates. Otherwise, newer hires will be missing survey data, and the results may be irrelevant if too much time has passed since the survey.
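As a rough illustration of the date check described above, the sketch below flags employees hired after the survey date (who cannot have results) and reports how old the survey is relative to the prediction date. It is plain Python written outside One AI; the function name `survey_alignment`, the `max_age_days` threshold, and the sample dates are all assumptions made for the example.

```python
from datetime import date

def survey_alignment(hire_dates, survey_date, prediction_date, max_age_days=180):
    """Summarize how well a headcount snapshot lines up with a survey date.

    hire_dates maps employee ID -> hire date. max_age_days is an arbitrary,
    illustrative cutoff for when survey results are considered stale.
    """
    # Employees hired after the survey date cannot have survey results.
    hired_after = sorted(emp for emp, hired in hire_dates.items() if hired > survey_date)
    # Days elapsed between the survey and the date predictions are made on.
    age_days = (prediction_date - survey_date).days
    return {
        "hired_after_survey": hired_after,
        "survey_age_days": age_days,
        "stale": age_days > max_age_days,
    }

# Made-up sample data.
hire_dates = {
    "e1": date(2021, 3, 1),
    "e2": date(2023, 11, 15),
    "e3": date(2022, 6, 30),
}
report = survey_alignment(
    hire_dates,
    survey_date=date(2023, 9, 30),
    prediction_date=date(2024, 1, 1),
)
print(report)
```

A long list under `hired_after_survey` or a large `survey_age_days` suggests moving the prediction headcount closer to the survey date.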
-
How many years of history do you have?
-
If you only have survey data for the current year, it’s not useful to a 1-year model.
-
Sometimes training the model on 2 years of history performs much better than training on 1 year. In these cases, you may lose more than you gain by leveraging survey data if you only have 1 year of survey history.
-
Consistency year over year
-
Survey questions sometimes change from one year to another. Review the question alignment prior to attempting to include data. Using “factors” instead of individual questions can help mitigate issues here.
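To make the idea of “factors” concrete, here is a minimal sketch that rolls individual question scores up into factor scores, using a separate question-to-factor mapping per year so that renamed or replaced questions still land in the same factor. The question IDs, factor names, and the `FACTOR_MAP` structure are hypothetical and do not reflect any particular survey vendor or One AI configuration.

```python
from statistics import mean

# Hypothetical mapping: question IDs differ by year but roll up to the same factors.
FACTOR_MAP = {
    2022: {"q1": "engagement", "q2": "engagement", "q3": "manager"},
    2023: {"q1": "engagement", "q7": "engagement", "q9": "manager"},
}

def factor_scores(year, answers):
    """Average one respondent's answers into factor-level scores for a given year."""
    grouped = {}
    for question, score in answers.items():
        factor = FACTOR_MAP[year].get(question)
        if factor is not None:  # questions without a factor mapping are skipped
            grouped.setdefault(factor, []).append(score)
    return {factor: mean(scores) for factor, scores in grouped.items()}

# Made-up responses: the 2023 survey replaced q2/q3 and added an unmapped q12.
scores_2022 = factor_scores(2022, {"q1": 4, "q2": 2, "q3": 5})
scores_2023 = factor_scores(2023, {"q1": 3, "q7": 5, "q9": 4, "q12": 1})
print(scores_2022, scores_2023)
```

Both years produce the same two columns (`engagement`, `manager`), which is what keeps the data comparable year over year.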
-
Coverage
-
Was the survey administered to the entire employee population? If not, are you able to filter headcount to get at just the surveyed portion?
-
Is the data secured?
-
If not, you can add individual results
-
If so, you may still be able to add generative attributes aggregated to the team level
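The team-level option can be pictured as follows: individual scores are averaged per team, and each employee then receives their team's average rather than their own result. This is only a sketch of the aggregation idea in plain Python, not of how One AI computes generative attributes; the team names, employee IDs, and function name are made up.

```python
from collections import defaultdict
from statistics import mean

def team_averages(responses):
    """Average a survey factor per team from (team, score) pairs."""
    by_team = defaultdict(list)
    for team, score in responses:
        by_team[team].append(score)
    return {team: mean(scores) for team, scores in by_team.items()}

# Made-up individual results, identified only by team.
responses = [("sales", 4), ("sales", 3), ("sales", 5), ("legal", 2), ("legal", 4)]
team_scores = team_averages(responses)

# Each employee gets the team-level value; no individual result is exposed.
employee_team = {"e1": "sales", "e2": "legal", "e3": "finance"}
employee_feature = {emp: team_scores.get(team) for emp, team in employee_team.items()}
print(employee_feature)
```

Employees on teams with no responses end up with `None`, which is where NULL filling becomes relevant.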
-
Generative Attributes
-
Here’s an example config of a generative attribute for a survey factor. Be sure to set the time selection to include the desired records
-
-
NULL Filling
-
Mean NULL filling on survey columns is recommended.
-
If you also want the model to capture the impact of the NULLs themselves, you can create a second, binary generative attribute. The generative attribute creation screen has an "Is Binary" option for this.
-
At times there are a fair number of employees without results regardless of the filters you apply to the headcount. In these cases NULL filling is especially important.
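For readers who want to see what these two steps amount to, the sketch below mean-fills a survey column and builds a companion binary flag marking which values were originally NULL. It is a plain Python illustration of the concept, not One AI's implementation; the function name and sample values are assumptions.

```python
from statistics import mean

def mean_fill_with_flag(values):
    """Replace None with the mean of observed values and return a was-NULL flag.

    Raises statistics.StatisticsError if every value is None, since there is
    no observed mean to fill with.
    """
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    filled = [fill if v is None else v for v in values]
    # 1 where the original value was missing, 0 otherwise.
    was_null = [1 if v is None else 0 for v in values]
    return filled, was_null

# Made-up survey factor column with two missing results.
filled, was_null = mean_fill_with_flag([4.0, None, 2.0, None, 3.0])
print(filled)
print(was_null)
```

The filled column keeps employees without results in the model at a neutral value, while the flag lets the model learn whether missing a result is itself predictive.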
-
Here’s an example of how to configure mean NULL filling:
-
-
Results
-
Including survey results does not always result in significant model performance gains. Ideally, run a model without survey results before adding them so you can get a sense of how they impact the model.
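The before-and-after comparison can be as simple as computing the same metric for both runs and looking at the difference. The sketch below does this with F1 on made-up labels and predictions; One AI reports its own performance metrics, so the metric choice, the function, and the data here are purely illustrative assumptions.

```python
def f1_score(actual, predicted):
    """F1 for binary labels, where 1 is the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up outcomes and predictions from two hypothetical model runs.
actual = [1, 0, 1, 1, 0, 0]
baseline_pred = [1, 0, 0, 1, 1, 0]  # model without survey results
survey_pred = [1, 0, 1, 1, 1, 0]    # same model with survey results added

baseline_f1 = f1_score(actual, baseline_pred)
survey_f1 = f1_score(actual, survey_pred)
print(f"baseline F1={baseline_f1:.3f}, with survey F1={survey_f1:.3f}, "
      f"lift={survey_f1 - baseline_f1:+.3f}")
```

A small or negative lift is a sign that the survey columns may not justify the constraints they place on dates, history, and coverage.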