What is a data snapshot?
A snapshot is a view of the dataset as at a single point in time: the moment of extraction. It does not contain historical records or show changes over time. A vendor that provides snapshot files can supply the data only as it stands at the time of extraction and cannot provide historical data.
How do we handle snapshot data?
If snapshot files are sent to One Model, they are commonly set up as a full load, so each new file completely replaces the data from the previous file. When the data is queried in One Model, we can only see the current state of the data based on the most recent file sent from the vendor, regardless of the time period selected in the Explore query.
As an alternative to a full load, One Model can store each snapshot file in order to build history. Using this method, the snapshots can be merged together, and changes over time can be identified and effective dated based on the date the snapshot files are received. In this way, changes and trends over time become visible. However, there are several risks and limitations to consider before taking this approach.
Risks and limitations
This approach comes with some risks and limitations that should be considered. These stem mainly from the fact that the vendor is never able to re-send historical data. If the customer is sending the data directly, rather than through a vendor, the same risks and limitations apply whenever the data is sent as snapshots.
- The vendor is never able to re-send historical data because they only ever have access to the current point in time.
- History only begins as of the first snapshot received by One Model.
- If the dataset is fully refreshed at any time with a new full load, the prior history will no longer be available.
- If a file is missed, the gap cannot be rectified, since the vendor is unable to access that day's snapshot after the fact. For example, this may occur if there is an extraction or transmission failure from the vendor.
- Historical records from earlier snapshots cannot be updated or deleted. For example, it is not possible to fix an event date that was entered incorrectly or to remove a duplicate record.
- If the file or data format changes, the historical snapshots cannot be updated with the change. For example, when a new column is added to the dataset, we can only receive data for that column from the date it was added. Older history cannot be updated with the column, since the vendor is not able to extract history.
- The files sent must include a date stamp in the file name. This ensures a unique file name for each transmission and provides the date of each snapshot to build logic from.
- Since snapshot data would not all be flowing into One Model from a historically managed HRIS source, an employee data removal request (e.g. under GDPR) might require the customer to manually remove the required data from the snapshots and resend them. Alternatively, One Model could delete all snapshots, in which case snapshot history would begin again from the latest file loaded, since this should no longer include the employee data that has been removed from the HRIS source. In some cases, One Model may be able to delete specific records or employees; this can be discussed with One Model and evaluated on a case-by-case basis.
- One Model should not be used as the only storage point for snapshot files or data. The customer should also set up a process to store a copy of all snapshot files.
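The date-stamp requirement and the missed-file risk above can be checked mechanically before files are sent. The following is a minimal sketch, assuming daily snapshots and a hypothetical file naming pattern of `<name>_YYYYMMDD.csv`. The pattern, the function names, and the sample file names are illustrative assumptions, not One Model conventions.

```python
import re
from datetime import date, datetime, timedelta

# Hypothetical naming convention: any prefix, then _YYYYMMDD, then .csv
STAMP = re.compile(r"_(\d{8})\.csv$")


def snapshot_date(filename: str) -> date:
    """Extract the snapshot date from a date-stamped file name."""
    match = STAMP.search(filename)
    if match is None:
        raise ValueError(f"no date stamp in file name: {filename}")
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def missing_dates(filenames: list[str]) -> list[date]:
    """Return the days between the first and last snapshot that have no file."""
    received = {snapshot_date(name) for name in filenames}
    if not received:
        return []
    day, last = min(received), max(received)
    gaps = []
    while day <= last:
        if day not in received:
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


# Example: the 2 Jan 2022 file was never transmitted.
files = ["org_units_20220101.csv", "org_units_20220103.csv"]
print(missing_dates(files))  # [datetime.date(2022, 1, 2)]
```

A check like this can only report a gap. As noted above, the missing day's snapshot cannot be recovered after the fact, which is one reason for the customer to keep their own copy of every file.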
Examples
Assume we have a dataset that tells us the Org Unit that employees are assigned to. Consider first a full history file (not a snapshot file) that we might receive. For the employee, we would receive their full, effective-dated history in one file, across many records. Because this history is present in each file, One Model will load the file as a full load. We can see in this one file that the employee is assigned to Org A on 1 Jan 2022 and then moved to Org B on 3 Jan 2022, and we can report these changes in the One Model application through time.
If snapshot files are sent, we only receive one record for the employee each day, showing their org unit on that day only. For the above employee, the 1 Jan 2022 file would contain one record showing Org A.
The following day, the 2 Jan 2022 file would again contain one record showing Org A.
Then the following day, the 3 Jan 2022 file would contain one record showing Org B.
Our standard approach is to load each file as a full load. Each day, we load the latest file, and that data completely replaces the data we were previously holding. With this approach, regardless of the time period queried in the One Model application, this employee will always report to Org B. Once the Org B change occurs on 3 Jan, we can no longer see the history of the employee reporting to Org A.
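The effect of the full-load approach can be illustrated with a short sketch. This is not One Model's implementation; the employee ID `E001`, the in-memory dictionaries, and the function name are hypothetical stand-ins for the daily files and the loaded data.

```python
from datetime import date

# Hypothetical daily snapshot files: each maps employee ID -> org unit.
daily_files = {
    date(2022, 1, 1): {"E001": "Org A"},
    date(2022, 1, 2): {"E001": "Org A"},
    date(2022, 1, 3): {"E001": "Org B"},
}

current = {}
for snapshot_day in sorted(daily_files):
    # Full load: the new file replaces everything held before.
    current = dict(daily_files[snapshot_day])


def org_unit_as_of(employee_id: str, query_date: date) -> str:
    """With a full load there is no history, so query_date cannot change the answer."""
    return current[employee_id]


print(org_unit_as_of("E001", date(2022, 1, 1)))  # Org B
```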
With a snapshot approach, One Model can store the data from every snapshot file received to build history. Logic can be applied to assign the employee to Org A on 1 Jan 2022 and then run comparisons against the snapshot files to re-assign them to Org B from 3 Jan 2022 onwards. In this way we can see the historical changes in One Model.
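The comparison logic described here can be sketched as follows. This is a minimal illustration under stated assumptions, not One Model's actual processing: the employee ID `E001`, the row layout, and the function names are hypothetical, and `effective_to` is treated as exclusive. The sketch also does not handle employees who disappear from later snapshots.

```python
from datetime import date

# Hypothetical stored snapshots: snapshot date -> {employee ID: org unit}.
snapshots = {
    date(2022, 1, 1): {"E001": "Org A"},
    date(2022, 1, 2): {"E001": "Org A"},
    date(2022, 1, 3): {"E001": "Org B"},
}


def build_history(snapshots):
    """Compare snapshots in date order and emit one effective-dated row per change."""
    history = []    # rows with employee_id, org_unit, effective_from, effective_to
    open_rows = {}  # employee ID -> index of that employee's currently open row
    for day in sorted(snapshots):
        for employee_id, org_unit in snapshots[day].items():
            idx = open_rows.get(employee_id)
            if idx is not None and history[idx]["org_unit"] == org_unit:
                continue  # unchanged since the previous snapshot
            if idx is not None:
                history[idx]["effective_to"] = day  # close the previous row
            history.append({
                "employee_id": employee_id,
                "org_unit": org_unit,
                "effective_from": day,  # date the change was first seen
                "effective_to": None,   # None means the row is still current
            })
            open_rows[employee_id] = len(history) - 1
    return history


def org_unit_as_of(history, employee_id, query_date):
    """Return the org unit in effect on query_date, or None if no history covers it."""
    for row in history:
        if row["employee_id"] != employee_id:
            continue
        ended = row["effective_to"]
        if row["effective_from"] <= query_date and (ended is None or query_date < ended):
            return row["org_unit"]
    return None


history = build_history(snapshots)
print(org_unit_as_of(history, "E001", date(2022, 1, 2)))  # Org A
print(org_unit_as_of(history, "E001", date(2022, 1, 3)))  # Org B
```

Note that a query for any date before 1 Jan 2022 returns `None` in this sketch, which mirrors the limitation that history only begins with the first snapshot received. The effective dates are also only as precise as the snapshot dates: a change is dated to the first file in which it appears.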