Data Carve-Outs — From Narrative to Structure
A field experience at a Utility Firm showing how structure turns one-off data extraction into a repeatable, defensible system.
🧭 Context
Multiple phased carve-outs were delivered across the business. Each phase built on the last:
• Phase 1 → carve-out of business dataset 01
• Phase 2 → carve-out of business dataset 02
• Phase 3 → carve-out of business dataset 03
⚠️ The Problem
Each phase subtly redefined what “in scope” meant:
• Rules drifted
• Edge cases accumulated
• Reconciliation became harder
We weren’t repeating a process — we were redefining the problem each time.
🔍 Diagnosis
The carve-out existed as a narrative, not a structure.
- No fixed grain (what is a record)
- No stable axes (time, product, ownership)
- No reusable rule set
🛠️ Intervention
To make the concept land with an audience—largely of management accountants—I used a simple library analogy: splitting books by university faculty.
“Splitting books in a library”
- Define shelves → dimensions (time, product, entity)
- Define a book → grain
- Define allocation → rules
Not a platform. Just a consistent way of deciding what belongs.
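The analogy can be made concrete in a few lines. A minimal sketch of "shelves, book, allocation" as a fixed definition; the class, field names, entity names, and rules here are illustrative assumptions, not the firm's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CarveOutSpec:
    dimensions: tuple   # the shelves: axes the data is organised along
    grain: tuple        # the book: keys that identify exactly one record
    rules: dict         # the allocation: which entity's records go where

    def allocate(self, record: dict) -> str:
        """Apply the fixed rule set; unmatched records stay behind."""
        return self.rules.get(record["entity"], "retained")

# Defined once, then reused unchanged for every phase.
spec = CarveOutSpec(
    dimensions=("time", "product", "entity"),
    grain=("account_id", "period"),
    rules={"EntityA": "carve-out", "EntityB": "retained"},
)

record = {"account_id": "A-100", "period": "2023-01", "entity": "EntityA"}
print(spec.allocate(record))  # carve-out
```

Because the spec is frozen, a new phase cannot quietly redefine "in scope": it either reuses the spec or explicitly replaces it.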
🧱 Pushback
The steer from solution architecture was that this had been looked at and discounted on cost grounds — “we don’t have the budget to host a star schema, we just need to do the carve-out.”
To me, what was really being said was: we can’t afford to do it the less expensive way, because we are already doing it the more expensive way.
Translation:
• Structure was seen as infrastructure
• Not as a reusable definition
💡 Core Insight
Without structure, a carve-out is a story.
With structure, it becomes a system.
🔗 The Shape of the Solution
- Date / period → validity window
- Who the data belongs to → ownership
- What caused the change → event
- What gets carved out → outcome

Time + Owner + Event → Outcome
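That shape can be read as a record layout. A hedged sketch, with field names and values invented for illustration:

```python
from datetime import date
from typing import NamedTuple

class CarveOutFact(NamedTuple):
    valid_from: date   # date / period: start of the validity window
    valid_to: date     # date / period: end of the validity window
    owner: str         # who the data belongs to
    event: str         # what caused the change
    outcome: str       # what gets carved out

fact = CarveOutFact(
    valid_from=date(2023, 1, 1),
    valid_to=date(2023, 6, 30),
    owner="EntityB",
    event="sale of business unit",
    outcome="dataset 02",
)
print(fact.owner, "+", fact.event, "->", fact.outcome)
```

Every carved-out record carries its own when, who, and why, so reconciliation is a lookup rather than an argument.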
📉 Outcome Without Structure
- Increasing reconciliation effort
- Ambiguity across phases
- Manual correction and rework
✅ Better Way
Define once, reuse many times.
🧠 Diagnostic Conclusion
The answer was actually quite straightforward — we just weren’t modelling it that way.
The carve-out needed a temporal backbone.
- When did this data exist?
- Who did it belong to at that point in time?
- Why did it change — what event caused it?
Ownership at a point in time is only half the story. Without the causation event, you can’t explain it or repeat it.
Once you introduce time, ownership, and event, the problem stops being interpretive and starts being mechanical.
Same data. Same rules. Different date → predictable outcome.
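That mechanical behaviour is easy to sketch. Assuming a hypothetical ownership timeline (dates, entity names, and events are invented for illustration):

```python
from datetime import date

# One row per ownership window: (valid_from, valid_to, owner, causing event)
TIMELINE = [
    (date(2022, 1, 1), date(2022, 12, 31), "EntityA", "initial load"),
    (date(2023, 1, 1), date(2023, 6, 30), "EntityB", "sale of business unit"),
    (date(2023, 7, 1), date(9999, 12, 31), "EntityC", "restructure"),
]

def owner_at(as_of):
    """Same data, same rules: only the date varies, so the answer is predictable."""
    for valid_from, valid_to, owner, event in TIMELINE:
        if valid_from <= as_of <= valid_to:
            return owner, event
    return None

print(owner_at(date(2022, 6, 1)))  # ('EntityA', 'initial load')
print(owner_at(date(2023, 3, 1)))  # ('EntityB', 'sale of business unit')
```

No interpretation is left to the analyst: change the date and the outcome changes deterministically, with the causing event attached.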
Delivery. Data. Done Properly.
Define the shape once. Execute it many times.
Analytical Geometry for Data
Not guiding rockets — just making sure the data knows where it is and why it matters.
In utility data, a record isn’t just a row — it’s a positioned event.
Customer, asset, time, tariff — that’s a coordinate system.
Get the position right, and everything else follows. Get it wrong, and reconciliation becomes archaeology.
| Geometry | Data Equivalent | Meaning |
|---|---|---|
| Coordinates | Business keys + time | Defines position of a fact |
| Axes | Dimensions | Reference system (star schema) |
| Vectors | Data movement | Change over time |
| Transformation | ETL / ELT | Move without losing meaning |
| Intersection | Join | Where business concepts meet |
A meter reading is not just a value.
It is:
(Customer, Meter, Time, Value)

Lose one axis → lose meaning.
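Losing an axis can be demonstrated directly. A small sketch (the identifiers and values are invented):

```python
from collections import namedtuple

Reading = namedtuple("Reading", ["customer", "meter", "time", "value"])

# Two genuinely different facts: same customer, same moment, different meters.
readings = [
    Reading("C1", "M1", "2024-01-01T00:00", 120.5),
    Reading("C1", "M2", "2024-01-01T00:00", 120.5),
]

full = {(r.customer, r.meter, r.time) for r in readings}
print(len(full))  # 2: the full coordinate keeps the facts distinct

collapsed = {(r.customer, r.time) for r in readings}
print(len(collapsed))  # 1: drop the meter axis and two facts become one
```

With the full coordinate the two readings are distinct; drop one axis and they collapse into a single ambiguous point, which is exactly where reconciliation turns into archaeology.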
Visual Overlay
PlantUML source
@startuml
title Analytical Geometry as Data Model
entity "Customer (Axis X)" as C
entity "Meter (Axis Y)" as M
entity "Time (Axis Z)" as T
entity "Reading (Fact Point)" as R {
value
}
C ||--o{ R
M ||--o{ R
T ||--o{ R
entity "Tariff" as TF
R }o--|| TF
note right of R
A fact is a coordinate:
(Customer, Meter, Time, Value)
end note
@enduml