Point in Time Data

 

Overview

Unlike panel-based datasets, where changes in the panel population can restate history, Hatched Analytics’ data feeds are built from sequential IDs, so historical data does not normally restate. In the rare cases where history is updated, the changes are captured in the Point-in-Time (PIT) archive, so you can reproduce exactly what was visible on any prior date.
 

Structure & Naming

In Snowflake and S3, data is organised into three main schemas:
  • LIVE – The latest files for each data feed. These are overwritten on each delivery.
  • PIT – Point-in-time archives. Each delivery is appended as a new snapshot with a date suffix, preserving a full version history. The date suffix indicates the latest data date contained in that snapshot.
  • SUPPLEMENTAL – Reference metadata (mappings, descriptions); not PIT-tracked.
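
As an illustration of working with date-suffixed PIT snapshots, the sketch below parses snapshot names and orders them by the date they carry. The exact naming pattern is not specified here, so the `<FEED>_<YYYYMMDD>` form (for example `TS_RESULTS_20250801`) and the helper names `parse_snapshot` and `snapshots_for` are assumptions for illustration only; adjust the pattern to match the names in your delivery.

```python
import re
from datetime import date, datetime

# Assumed naming pattern: <FEED>_<YYYYMMDD>, e.g. "TS_RESULTS_20250801".
# The real suffix format may differ; change the regex to match your delivery.
SNAPSHOT_RE = re.compile(r"^(?P<feed>[A-Z_]+)_(?P<stamp>\d{8})$")


def parse_snapshot(name: str) -> tuple[str, date]:
    """Split a PIT snapshot name into (feed, latest data date)."""
    match = SNAPSHOT_RE.match(name)
    if match is None:
        raise ValueError(f"not a dated PIT snapshot name: {name!r}")
    stamp = datetime.strptime(match["stamp"], "%Y%m%d").date()
    return match["feed"], stamp


def snapshots_for(feed: str, names: list[str]) -> list[tuple[date, str]]:
    """Return (data date, name) pairs for one feed, oldest first."""
    found = []
    for name in names:
        try:
            parsed_feed, stamp = parse_snapshot(name)
        except ValueError:
            continue  # skip anything that is not a dated snapshot
        if parsed_feed == feed:
            found.append((stamp, name))
    return sorted(found)
```

Because the suffix is the latest data date in the snapshot rather than a delivery timestamp, it is best treated as a way to order versions, with RELEASEDATE used to decide what was actually known on a given day.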
 
 

Using PIT for Backtests

When running backtests, use PIT files so that your results incorporate only data that was available at the time, preventing look-ahead bias. PIT mirrors the LIVE schemas (e.g., TS_INDEX, TS_RESULTS, MODEL_DETAILS) and includes fields such as RELEASEDATE, so you can enforce strict “what-was-known-when” logic.
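
A minimal sketch of the “what-was-known-when” filter: for a given backtest date, keep only rows released on or before that date. RELEASEDATE is the field named above; the other column names (TICKER, PERIOD, VALUE), the toy rows, and the helper `visible_as_of` are hypothetical and stand in for whatever your PIT table contains.

```python
from datetime import date


def visible_as_of(rows, as_of):
    """Keep only rows whose RELEASEDATE is on or before the as-of date."""
    return [row for row in rows if row["RELEASEDATE"] <= as_of]


# Toy rows: RELEASEDATE comes from the feed; TICKER, PERIOD and VALUE
# are hypothetical column names used only for this example.
rows = [
    {"TICKER": "AAA", "PERIOD": date(2025, 3, 31), "VALUE": 10.0,
     "RELEASEDATE": date(2025, 4, 15)},
    {"TICKER": "AAA", "PERIOD": date(2025, 6, 30), "VALUE": 12.0,
     "RELEASEDATE": date(2025, 7, 15)},
]

# On 1 May 2025 only the first row had been released.
known = visible_as_of(rows, as_of=date(2025, 5, 1))
```

Applying the filter at every rebalance date, rather than once over the full history, is what keeps later releases out of earlier decisions.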

Restates

Restates are uncommon and clearly tracked. We provide a Python script (available on request from connect@hatchedanalytics.com) that diffs PIT snapshots for any ticker/file type and identifies changes between deliveries. As of August 2025, we have delivered 33,500 data points with only 220 changes to the underlying data, a restatement rate of approximately 0.66%.
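
To show the idea behind a snapshot diff and the rate arithmetic, here is a rough sketch. It is not the script referred to above: the keying by (ticker, period), the toy values, and the helpers `diff_snapshots` and `restatement_rate` are assumptions made for illustration.

```python
def diff_snapshots(old, new):
    """Return {key: (old_value, new_value)} for keys in both snapshots whose value changed."""
    shared = old.keys() & new.keys()
    return {key: (old[key], new[key]) for key in shared if old[key] != new[key]}


def restatement_rate(changed, delivered):
    """Restated data points as a percentage of all delivered data points."""
    return 100.0 * changed / delivered


# Hypothetical snapshots keyed by (ticker, period); values are toy numbers.
older = {("AAA", "2025-03-31"): 10.0, ("AAA", "2025-06-30"): 12.0}
newer = {
    ("AAA", "2025-03-31"): 10.5,   # restated
    ("AAA", "2025-06-30"): 12.0,   # unchanged
    ("AAA", "2025-09-30"): 13.0,   # new data point, not a restatement
}

changes = diff_snapshots(older, newer)

# Figures quoted above: 220 changes out of 33,500 delivered data points.
rate = restatement_rate(220, 33_500)
```

Only keys present in both snapshots are compared, so newly delivered data points are not counted as restatements.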