Welcome to the 2020.03.11 product release. This article provides an overview of the product innovations and improvements delivered on 11 March 2020. The article is structured as follows:
User Experience
One AI Innovations
Bugs, Performance & Platform Improvements
- Data Ingestion Bugs Fixed and Minor Improvements
- Improvements to Data Pipeline Processing
- General Bugs Fixed and Minor Improvements
User Experience
New User Experience (UX) & Storyboards
- We are rounding the final turn and getting close to closing the beta period for the new UX. In the latest release we smashed more bugs and added a number of new features, delivering on most of the key product functionality. We already have a number of customers who have fully migrated to the new UX and rolled it out to their users. Please continue to send through your feedback.
One AI Innovations
Multi-label Coefficient & LIME Explanations
- We’ve expanded the amount of data that One AI returns via the explanation file when results are deployed into the core One Model data model. Previously, we returned LIME weights and mapped coefficient values for the primary class exclusively. With this update we’ve added LIME weights and mapped coefficient values for all classes.
- This change will allow users to explore contributing factors on an employee basis for all possible labels, not just the predicted label.
Composite Dataset ID
- Composite Dataset ID allows users to easily build training datasets sampled over several time periods. This functionality has always been available but was previously difficult to set up.
- This feature is currently only available via YML configurations within an augmentation. Over the next few releases we will move it into the GUI configuration as well.
- Two new related YML keys have been added (shown below). Adding the Composite_dataset_id key to your YML and passing a value of True will let One AI know you want to use a composite dataset ID. By default, One AI will try to use the dataset_id (e.g. employee_id) in conjunction with the sample date (e.g. TimePeriod). If you don’t want to use the dataset_id and sample date, you can pass in a list of column names to use instead (e.g. [TimePeriod, Employee_id, Org_Unit]).
- Composite_dataset_id: True
- Composite_dataset_id_keys: None
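- For illustration, here is a minimal sketch of the two alternative ways these keys might appear in an augmentation YML. The surrounding structure will depend on your configuration, and the column names are the placeholders from the example above:

    # Option 1 - default behaviour: composite ID from dataset_id + sample date
    Composite_dataset_id: True
    Composite_dataset_id_keys: None

    # Option 2 - composite ID from a custom list of columns
    Composite_dataset_id: True
    Composite_dataset_id_keys: [TimePeriod, Employee_id, Org_Unit]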
Improved Handling of Collinearity
- We’ve made some adjustments to the default behavior of One AI to reduce the number of collinear features that can be selected during a default run.
- We will continue to push updates to improve our performance in this area. Because of the leveled nature of our data and the mixed-type representation of single factors, this problem can be difficult to detect. We plan to utilize some of the table relationships and other metadata about variables to better address this problem in upcoming releases.
Prescriptive Error Handling
- We continue to roll out new error messages that attempt to recommend solutions to errors.
Improved Commute Time Data
- We’ve added additional data points in our commute time augmentation to support deeper analysis.
- Commute time data augmentations now accept a morning arrival time and an afternoon departure time. If employees need to be in the office at a specific time, we will calculate commute and traffic to and from the office based on the defined times.
Linear Regression for Scatter Plots on the EDA report
- The EDA scatter plots for regression problems have been updated to show a faceted scatter plot of the continuous label and the continuous variable, with a linear regression fitted to the distribution.
Bugs, Performance & Platform Improvements
Data Ingestion Bugs Fixed and Minor Improvements
- Once your data sources are re-processed into One Model, we run caching processes to optimize the end-user experience and performance. In this release we have added a message to the Data Loads page to indicate when the Cache Warming process is complete. (ref 4171)
- Fixed an issue where the password for an SFTP data source would not be retained if other configuration options were updated. (ref 4096)
- Improved the stability of the Data Loads page for loads with a large number of files by showing a File Status Overview and restricting the total number of files shown in the UI to 100. (ref 3690)
- Added support for ignoring invalid rows in a Data Source, which allows files with corrupted records (e.g. too many delimiters in a record) to still be loaded successfully into One Model. Details of the errored rows, including the reason for each error, are then made available when looking into the file on the Data Loads page. (ref 4120)
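- As a hypothetical illustration, in a comma-delimited file with three columns, a record containing an extra delimiter can now be skipped and reported instead of blocking the file:

    employee_id,first_name,department
    1001,Jane,Sales
    1002,John,Smith,Engineering   <- four fields in a three-column file; skipped and reported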
- Fixed an issue where the Greenhouse connector would not pull data if fewer than 500 records were returned for either Incremental or Destructive loads. (ref 3967)
Improvements to Data Pipeline Processing
- Added support for "Select Table.*" in Processing Scripts. This allows users to select all columns from one table in a single statement within a transformation that includes multiple tables, as sketched below. (ref 2833)
- Fixed an issue where the "With Compression" feature of Processing Scripts would not work correctly when not combined with Date Slices. "With Compression" is an optimization feature that allows users to reduce the number of effective-dated records that may exist across time in any given table. It works by combining records that have identical attributes into a single record that is 'active' for the total period of time the original identical records were valid, as illustrated below. (ref 3846)
- Made a number of additional improvements to the overall Data Orchestration process to make it more robust when under heavy load. (ref 3945)
- Data Destinations will now be put on hold while Processing Scripts are running. This will prevent the Processing Script from causing errors in the Data Destination by changing something the Destination is using. (ref 3999)
- Added validation for the 'Start' and 'Length' parameters of the Substring function in Processing Scripts to ensure they are both numeric and that the 'Length' parameter is a positive value (see the examples below). (ref 4050)
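- For instance (hypothetical column name):

    Substring(first_name, 1, 3)    -- valid: Start and Length are numeric, Length is positive
    Substring(first_name, 1, -3)   -- rejected: Length must be a positive value
    Substring(first_name, 'a', 3)  -- rejected: Start must be numeric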
- Added validation for Processing Scripts to capture UPPER and LOWER functions being used on a Boolean field; these functions are only meant to be used on strings (see the examples below). (ref 4176)
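- For example (hypothetical field names):

    UPPER(last_name)   -- valid: string field
    LOWER(is_active)   -- flagged: Boolean field, not a string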
General Bugs Fixed and Minor Improvements
- Fixed an issue where Contextual Security could not be applied on a Slowly Changing Dimension. (ref 3534)
- With the new UX & Storyboards starting to be adopted by customers in production, we are now tracking Storyboards in Usage Statistics, just like Dashboards. (ref 4255)