Welcome to the 2019.07.10 product release. This article provides an overview of the product innovations and improvements to be delivered on 10 July 2019. The article is structured as follows:
New Features & Innovations
- Improved User Interface for Configuring One AI
- New Data Science Features in One AI
- Improved Workflow for Data Destinations
Bugs, Performance & Platform Improvements
- Correction to Time Function based Metrics
- Corrections to Data Pipeline Processing
- Performance & Stability
New Features & Innovations
Improved User Interface for Configuring One AI
As part of our vision to deliver data science capabilities for customers at scale, we have been working on a new user interface to configure the One AI application. Almost everything that was previously contained within a YAML script is now available through the new UI, which improves usability, learnability and consistency, reduces scripting errors, and makes the One AI capability accessible to a broader set of users. For power users, the YAML configuration is still available as an override if needed.
Sample Screenshots of the new user interface for configuring One AI
New Data Science Features in One AI
- Binned columns can be used in persisted models
- bin_strategy has been changed to bin_type
- New Per Column Interventions:
  - Null Filling
  - Binning
  - Scaling
  - Force select single variables (single values for categorical, not bins)
  - Only use defined columns
  - Prevent columns from being dropped
- New Dimensionality Reduction (Beta):
  - PCA
  - kPCA
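To make the per-column interventions above more concrete, here is a minimal, generic sketch of what null filling, scaling and binning do to a single column. It is an illustration only and does not show One AI's configuration or internals: the function names, the median fill, the min-max scaling and the equal-width binning are all assumptions chosen for the example, and the `tenure_years` column is hypothetical.

```python
from statistics import median


def fill_nulls(values, fill=None):
    """Replace None with `fill` (default: median of the non-null values)."""
    present = [v for v in values if v is not None]
    if fill is None:
        fill = median(present)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (0-based index)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical column with missing values (not taken from any real data set).
tenure_years = [1, None, 3, 10, None, 6]

filled = fill_nulls(tenure_years)            # null filling
scaled = min_max_scale(filled)               # scaling
binned = equal_width_bins(filled, n_bins=3)  # binning
```

Applying an intervention per column, rather than globally, lets each column get the treatment that suits it. For example, a skewed numeric column might be binned while another is only scaled.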
Improved Workflow for Data Destinations
As part of this release we have introduced a number of enhancements to the Data Destinations capability of One Model. As a reminder, you can pin data to a ‘Data Destination’ for staging into One AI or for exporting to other third-party applications.
Previously, a user would pin a table to a dashboard and then link that table from the dashboard to a Data Destination. Any later update to the dashboard table also changed the Data Destination, so users often created dedicated dashboards for pinning data to Data Destinations rather than using general-purpose dashboards, where changes could have an unintended impact on integrations.
This release delivers an improved and simplified workflow for delivering data to Data Destinations.
- A user can now pin a table directly to a Data Destination from Explore without first adding it to a Dashboard.
- Any existing tables in Data Destinations that were pinned from a Dashboard have been copied to the Data Destination and are no longer dynamically linked to the Dashboard.
- Users can still pin tables from a Dashboard to a Data Destination, but the table is copied, not dynamically linked.
- Users can update the query in a Data Destination by selecting it in the Data Destinations folder and then making changes in Explore.
Sample Screenshot of the new option for creating a Data Destination from Explore.
Sample Screenshot of the new option for modifying an Explore query from within a Data Destination.
Sample Screenshot of the new option to copy a query to a Data Destination from a Dashboard.
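The difference between the old dynamic link and the new copy behaviour can be sketched in a few lines of Python. This is a conceptual illustration only, not One Model's implementation: the dictionary structure and the names `dashboard`, `linked_destination` and `copied_destination` are hypothetical.

```python
import copy

# Hypothetical dashboard holding one pinned table definition.
dashboard = {
    "tables": {
        "headcount": {"metrics": ["Headcount"], "dimensions": ["Department"]}
    }
}

# Old behaviour (dynamic link): the destination refers to the same object
# as the dashboard table, so dashboard edits flow through to it.
linked_destination = {"query": dashboard["tables"]["headcount"]}

# New behaviour (copy): the destination gets its own independent copy.
copied_destination = {"query": copy.deepcopy(dashboard["tables"]["headcount"])}

# Someone later edits the dashboard table for an unrelated reason.
dashboard["tables"]["headcount"]["dimensions"].append("Location")

linked_dims = linked_destination["query"]["dimensions"]  # changed with the dashboard
copied_dims = copied_destination["query"]["dimensions"]  # unchanged
```

With a copy, a dashboard edit can no longer silently alter what an integration receives. Changes to the destination's query have to be made deliberately, by opening it from the Data Destinations folder.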
Bugs, Performance & Platform Improvements
Correction to Time Function based Metrics
- When a query included a metric based on a time function or a time filter, the application of a filter (inclusion or exclusion of a time selection) was ignored. For example, with a metric such as “Total Terminations (Year To Date)”, if you filtered on the time dimension for current year quarters (2018 Q1...2018 Q4) rather than selecting a time function, the Year To Date metric did not filter on the selected time periods.
- The application has been changed so that time function metrics (e.g. YTD) now respond correctly to time filters.
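As a rough illustration of the correction above, the sketch below computes a Year To Date metric and then applies a time filter to it. It shows one plausible reading of the fix, not confirmed product behaviour: the assumption here is that the filter narrows which periods are returned while each returned value remains a running total from the start of the year. The counts, function names and selected quarters are all hypothetical.

```python
from itertools import accumulate

# Hypothetical quarterly termination counts for one year.
terminations = {"2018 Q1": 12, "2018 Q2": 9, "2018 Q3": 15, "2018 Q4": 7}


def year_to_date(counts):
    """Running total from the first period of the year up to each period."""
    periods = list(counts)
    return dict(zip(periods, accumulate(counts[p] for p in periods)))


def apply_time_filter(metric_by_period, selected):
    """Keep only the periods included in the time selection."""
    return {p: v for p, v in metric_by_period.items() if p in selected}


ytd = year_to_date(terminations)
selected = {"2018 Q2", "2018 Q3"}

ignoring_filter = ytd                                # filter has no effect
respecting_filter = apply_time_filter(ytd, selected) # filter is honoured
```

In this sketch `ignoring_filter` still contains all four quarters, while `respecting_filter` contains only the two selected quarters.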
Corrections to Data Pipeline Processing
- Enhanced validation checks to ensure a model isn’t referencing model tables in further transformation tables within a data pipeline script.
- Enhanced validation of types when using the DATEADD function in a data pipeline script.
- Fixed an issue where, in some cases, the user was prevented from creating a new version of a data pipeline script.
Performance & Stability
- Improved cache handling to provide better environmental stability
- Proactive error handling in data pipeline scripting
- One AI
  - One AI Persisted models:
    - Fixed bug causing categorical columns with similar names to have persistence problems
    - Fixed bug where 'Other' values from OHE columns persisted incorrectly
  - Fixed bug where Catboost periodically errored
  - Fixed One AI crash caused by using SMOTENC on all continuous data sets
  - Fixed bug causing gridsearch to select the lowest k-feature run incorrectly
- Improved error handling and messaging for using Data Destinations with One AI
That's all for this release. More innovation is coming soon.