📚

Data Carve-Outs — From Narrative to Structure

Field experience from a utility firm, showing how structure turns one-off data extraction into a repeatable, defensible system.

🧭 Context

Multiple phased carve-outs were delivered across the business. Each phase built on the last:
• Phase 1 → initial carve-out (business dataset 01)
• Phase 2 → second carve-out (business dataset 02)
• Phase 3 → third carve-out (business dataset 03)

⚠️ The Problem

Each phase subtly redefined what “in scope” meant:
• Rules drifted
• Edge cases accumulated
• Reconciliation became harder

We weren’t repeating a process — we were redefining the problem each time.

🔍 Diagnosis

The carve-out existed as a narrative, not a structure.

  • No fixed grain (what is a record)
  • No stable axes (time, product, ownership)
  • No reusable rule set

🛠️ Intervention

To make the concept land with an audience made up largely of management accountants, I used a simple library analogy: splitting books by university faculty.

“Splitting books in a library”

  • Define shelves → dimensions (time, product, entity)
  • Define a book → grain
  • Define allocation → rules

Not a platform. Just a consistent way of deciding what belongs.
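
As a minimal sketch of the analogy, the three definitions can be written down as plain data plus one function, with no platform involved. Every name here (`CarveOutSpec`, `in_scope`, the field names, the entity names and the sample rule) is a hypothetical illustration, not the definition used on the engagement.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

Record = Mapping[str, object]
Rule = Callable[[Record], bool]


@dataclass(frozen=True)
class CarveOutSpec:
    """A reusable definition of what belongs in a carve-out."""
    grain: tuple[str, ...]        # the "book": fields that identify one record
    dimensions: tuple[str, ...]   # the "shelves": axes every record must sit on
    rules: tuple[Rule, ...]       # the "allocation": tests a record must pass

    def in_scope(self, record: Record) -> bool:
        # A record with a missing grain or dimension field cannot be placed.
        required = self.grain + self.dimensions
        if any(record.get(field) is None for field in required):
            return False
        return all(rule(record) for rule in self.rules)


# Hypothetical spec: one record per contract per period, allocated by entity.
spec = CarveOutSpec(
    grain=("contract_id", "period"),
    dimensions=("period", "product", "entity"),
    rules=(lambda r: r["entity"] == "DivestedCo",),
)

records = [
    {"contract_id": "C1", "period": "2023-01", "product": "gas", "entity": "DivestedCo"},
    {"contract_id": "C2", "period": "2023-01", "product": "gas", "entity": "RetainedCo"},
    {"contract_id": "C3", "period": "2023-01", "product": None, "entity": "DivestedCo"},
]

carved = [r["contract_id"] for r in records if spec.in_scope(r)]
```

Here only `C1` is carved out: `C2` fails the allocation rule, and `C3` has no shelf to sit on because its product is missing. The same `spec` object could be reused unchanged for the next phase's dataset.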

🧱 Pushback

The steer from solution architecture was that this had been looked at and discounted on cost grounds — “we don’t have the budget to host a star schema, we just need to do the carve-out.”

To me, what was really being said was: we can’t afford to do it the less expensive way, because we are already doing it the more expensive way.

Translation:
• Structure was seen as infrastructure
• Not as a reusable definition

💡 Core Insight

Without structure, a carve-out is a story.
With structure, it becomes a system.

🔗 The Shape of the Solution

• ⏱️ Time → date / period, validity window
• 👤 Owner → who the data belongs to
• Event → what caused the change
• 📦 Outcome → what gets carved out

Time + Owner + Event → Outcome

📉 Outcome

Without that structure, each phase brought:

  • Increasing reconciliation effort
  • Ambiguity across phases
  • Manual correction and rework

✅ Better Way

Define once, reuse many times.

🧠 Diagnostic Conclusion

The answer was actually quite straightforward — we just weren’t modelling it that way.

The carve-out needed a temporal backbone.

  • When did this data exist?
  • Who did it belong to at that point in time?
  • Why did it change — what event caused it?

Ownership at a point in time is only half the story. Without the causation event, you can’t explain it or repeat it.

Once you introduce time, ownership, and event, the problem stops being interpretive and starts being mechanical.

Same data. Same rules. Different date → predictable outcome.
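
A rough sketch of what "mechanical" could look like: ownership stored as validity windows, each tagged with the event that opened it, and a carve-out that is a lookup for a given date. The class, function names and sample history below are assumptions made for illustration; the half-open window convention (`valid_from` inclusive, `valid_to` exclusive) is one common choice, not something the original project prescribed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class OwnershipInterval:
    asset_id: str
    owner: str
    valid_from: date            # inclusive
    valid_to: Optional[date]    # exclusive; None means still current
    event: str                  # what caused this ownership to start


def owner_as_of(history, asset_id, as_of):
    """Return the interval that covers `as_of` for the asset, or None."""
    for row in history:
        if row.asset_id != asset_id:
            continue
        ends_after = row.valid_to is None or as_of < row.valid_to
        if row.valid_from <= as_of and ends_after:
            return row
    return None


def carve_out(history, asset_ids, owner, as_of):
    """Assets owned by `owner` on `as_of`, each with the event that explains why."""
    result = {}
    for asset_id in asset_ids:
        row = owner_as_of(history, asset_id, as_of)
        if row is not None and row.owner == owner:
            result[asset_id] = row.event
    return result


# Hypothetical history: asset A1 changes hands on 1 April 2023.
history = [
    OwnershipInterval("A1", "RetainedCo", date(2020, 1, 1), date(2023, 4, 1), "initial load"),
    OwnershipInterval("A1", "DivestedCo", date(2023, 4, 1), None, "phase 1 transfer"),
    OwnershipInterval("A2", "RetainedCo", date(2020, 1, 1), None, "initial load"),
]

before = carve_out(history, ["A1", "A2"], "DivestedCo", date(2023, 3, 31))
after = carve_out(history, ["A1", "A2"], "DivestedCo", date(2023, 4, 1))
```

With this sample history, `before` is empty and `after` contains `A1` together with the event that moved it. Nothing in the rules changed between the two calls, only the date.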

Delivery. Data. Done Properly.

Define the shape once. Execute it many times.

📐

Analytical Geometry for Data

Not guiding rockets — just making sure the data knows where it is and why it matters.

In utility data, a record isn’t just a row — it’s a positioned event.

Customer, asset, time, tariff — that’s a coordinate system.
Get the position right, and everything else follows. Get it wrong, and reconciliation becomes archaeology.

Geometry       | Data Equivalent      | Meaning
Coordinates    | Business keys + time | Defines position of a fact
Axes           | Dimensions           | Reference system (star schema)
Vectors        | Data movement        | Change over time
Transformation | ETL / ELT            | Move without losing meaning
Intersection   | Join                 | Where business concepts meet

Practical read:
A meter reading is not just a value.
It is:
(Customer, Meter, Time, Value)

Lose one axis → lose meaning.
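
To make "lose one axis, lose meaning" concrete, here is a small sketch that keys the same readings on fewer and fewer axes and reports which positions stop being unique. The sample readings and the helper `ambiguous_positions` are invented for illustration.

```python
from collections import Counter

# Hypothetical readings positioned as (customer, meter, time, value).
readings = [
    ("CUST-1", "MTR-9", "2024-01-01", 100),
    ("CUST-1", "MTR-9", "2024-02-01", 130),
    ("CUST-2", "MTR-9", "2024-03-01", 150),  # same meter, new customer
]

CUSTOMER, METER, TIME = 0, 1, 2


def ambiguous_positions(rows, axes):
    """Return positions shared by more than one row when keyed on `axes` only."""
    counts = Counter(tuple(row[i] for i in axes) for row in rows)
    return sorted(pos for pos, n in counts.items() if n > 1)


full = ambiguous_positions(readings, (CUSTOMER, METER, TIME))
no_time = ambiguous_positions(readings, (CUSTOMER, METER))
meter_only = ambiguous_positions(readings, (METER,))
```

With all three axes every reading has its own position, so `full` is empty. Drop time and the two `CUST-1` readings collapse onto one position; drop customer as well and all three readings become indistinguishable by position.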

Visual Overlay

[Diagram: Analytical Geometry as Data Model. Customer (Axis X), Meter (Axis Y) and Time (Axis Z) each relate one-to-many to Reading (Fact Point, value); Reading relates many-to-one to Tariff (business context).]

Interpretation: a fact is a coordinate: (Customer, Meter, Time, Value).

Why this matters: dimensions define position. ETL must preserve meaning, not just move rows about.

PlantUML source
@startuml
title Analytical Geometry as Data Model

entity "Customer (Axis X)" as C
entity "Meter (Axis Y)" as M
entity "Time (Axis Z)" as T

entity "Reading (Fact Point)" as R {
  value
}

C ||--o{ R
M ||--o{ R
T ||--o{ R

entity "Tariff" as TF
R }o--|| TF

note right of R
A fact is a coordinate:
(Customer, Meter, Time, Value)
end note

@enduml
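
The relationships in the diagram can also be sketched in a few lines of Python: dimensions as keyed lookups, facts carrying one key per axis, a check for facts that cannot be placed, and a join to Tariff for those that can. All keys, attributes and helper names (`unplaced`, `priced`, `unit_rate`) are made-up sample values, not part of the original model.

```python
# Dimension tables keyed by business key: the axes of the diagram.
# All keys and attributes below are made-up sample data.
customers = {"CUST-1": {"name": "Customer One"}}
meters = {"MTR-9": {"fuel": "gas"}}
periods = {"2024-01": {"year": 2024}}
tariffs = {"STD": {"unit_rate": 0.25}}

DIMENSIONS = {"customer": customers, "meter": meters, "time": periods, "tariff": tariffs}

# Fact rows carry one key per axis, plus the measured value.
readings = [
    {"customer": "CUST-1", "meter": "MTR-9", "time": "2024-01", "tariff": "STD", "value": 100},
    {"customer": "CUST-7", "meter": "MTR-9", "time": "2024-01", "tariff": "STD", "value": 40},
]


def unplaced(rows):
    """Return (row index, axis) pairs where a fact key matches no dimension row."""
    return [
        (i, axis)
        for i, row in enumerate(rows)
        for axis, dim in DIMENSIONS.items()
        if row[axis] not in dim
    ]


def priced(rows):
    """Join placed facts to Tariff and return (customer, time, charge) tuples."""
    bad_rows = {i for i, _ in unplaced(rows)}
    return [
        (row["customer"], row["time"], row["value"] * tariffs[row["tariff"]]["unit_rate"])
        for i, row in enumerate(rows)
        if i not in bad_rows
    ]


orphans = unplaced(readings)
charges = priced(readings)
```

The second reading refers to a customer that does not exist on the Customer axis, so it is reported in `orphans` instead of being silently priced. Only the first reading makes it through the join.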