Data Carve-Outs — From Narrative to Structure
A field experience at a Utility Firm showing how structure turns one-off data extraction into a repeatable, defensible system.
🧭 Context
Multiple phased carve-outs were delivered across the business. Each phase built on the last:
• Phase 1 → carve-out of business dataset 01
• Phase 2 → carve-out of business dataset 02
• Phase 3 → carve-out of business dataset 03
⚠️ The Problem
Each phase subtly redefined what “in scope” meant:
• Rules drifted
• Edge cases accumulated
• Reconciliation became harder
We weren’t repeating a process — we were redefining the problem each time.
🔍 Diagnosis
The carve-out existed as a narrative, not a structure.
- No fixed grain (what is a record)
- No stable axes (time, product, ownership)
- No reusable rule set
🛠️ Intervention
To make the concept land with an audience—largely of management accountants—I used a simple library analogy: splitting books by university faculty.
“Splitting books in a library”
- Define shelves → dimensions (time, product, entity)
- Define a book → grain
- Define allocation → rules
Not a platform. Just a consistent way of deciding what belongs.
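The analogy can be made concrete in a few lines. A minimal sketch of "shelves, book, allocation" as a fixed definition; the class, field names, entity names, and rules here are illustrative assumptions, not the firm's actual schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CarveOutSpec:
    dimensions: tuple   # the shelves: axes the data is organised along
    grain: tuple        # the book: keys that identify exactly one record
    rules: dict         # the allocation: which entity's records go where

    def allocate(self, record: dict) -> str:
        """Apply the fixed rule set; unmatched records stay behind."""
        return self.rules.get(record["entity"], "retained")

# Defined once, then reused unchanged for every phase.
spec = CarveOutSpec(
    dimensions=("time", "product", "entity"),
    grain=("account_id", "period"),
    rules={"EntityA": "carve-out", "EntityB": "retained"},
)

record = {"account_id": "A-100", "period": "2023-01", "entity": "EntityA"}
print(spec.allocate(record))  # carve-out
```

Because the spec is frozen, a new phase cannot quietly redefine "in scope": it either reuses the spec or explicitly replaces it.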
🧱 Pushback
The steer from solution architecture was that this had been looked at and discounted on cost grounds — “we don’t have the budget to host a star schema, we just need to do the carve-out.”
To me, what was really being said was: we can’t afford to do it the less expensive way, because we are already doing it the more expensive way.
Translation:
• Structure was seen as infrastructure
• Not as a reusable definition
💡 Core Insight
Without structure, a carve-out is a story.
With structure, it becomes a system.
🔗 The Shape of the Solution
- Date / period → validity window
- Who the data belongs to → ownership
- What caused the change → event
- What gets carved out → outcome

Time + Owner + Event → Outcome
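That shape can be read as a record layout. A hedged sketch, with field names and values invented for illustration:

```python
from datetime import date
from typing import NamedTuple

class CarveOutFact(NamedTuple):
    valid_from: date   # date / period: start of the validity window
    valid_to: date     # date / period: end of the validity window
    owner: str         # who the data belongs to
    event: str         # what caused the change
    outcome: str       # what gets carved out

fact = CarveOutFact(
    valid_from=date(2023, 1, 1),
    valid_to=date(2023, 6, 30),
    owner="EntityB",
    event="sale of business unit",
    outcome="dataset 02",
)
print(fact.owner, "+", fact.event, "->", fact.outcome)
```

Every carved-out record carries its own when, who, and why, so reconciliation is a lookup rather than an argument.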
📉 Outcome Without Structure
- Increasing reconciliation effort
- Ambiguity across phases
- Manual correction and rework
✅ Better Way
Define once, reuse many times.
🧠 Diagnostic Conclusion
The answer was actually quite straightforward — we just weren’t modelling it that way.
The carve-out needed a temporal backbone.
- When did this data exist?
- Who did it belong to at that point in time?
- Why did it change — what event caused it?
Ownership at a point in time is only half the story. Without the causation event, you can’t explain it or repeat it.
Once you introduce time, ownership, and event, the problem stops being interpretive and starts being mechanical.
Same data. Same rules. Different date → predictable outcome.
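That mechanical behaviour is easy to sketch. Assuming a hypothetical ownership timeline (dates, entity names, and events are invented for illustration):

```python
from datetime import date

# One row per ownership window: (valid_from, valid_to, owner, causing event)
TIMELINE = [
    (date(2022, 1, 1), date(2022, 12, 31), "EntityA", "initial load"),
    (date(2023, 1, 1), date(2023, 6, 30), "EntityB", "sale of business unit"),
    (date(2023, 7, 1), date(9999, 12, 31), "EntityC", "restructure"),
]

def owner_at(as_of):
    """Same data, same rules: only the date varies, so the answer is predictable."""
    for valid_from, valid_to, owner, event in TIMELINE:
        if valid_from <= as_of <= valid_to:
            return owner, event
    return None

print(owner_at(date(2022, 6, 1)))  # ('EntityA', 'initial load')
print(owner_at(date(2023, 3, 1)))  # ('EntityB', 'sale of business unit')
```

No interpretation is left to the analyst: change the date and the outcome changes deterministically, with the causing event attached.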
Delivery. Data. Done Properly.
Define the shape once. Execute it many times.
Analytical Geometry for Data
Not guiding rockets — just making sure the data knows where it is and why it matters.
In utility data, a record isn’t just a row — it’s a positioned event.
Customer, asset, time, tariff — that’s a coordinate system.
Get the position right, and everything else follows. Get it wrong, and reconciliation becomes archaeology.
| Geometry | Data Equivalent | Meaning |
|---|---|---|
| Coordinates | Business keys + time | Defines position of a fact |
| Axes | Dimensions | Reference system (star schema) |
| Vectors | Data movement | Change over time |
| Transformation | ETL / ELT | Move without losing meaning |
| Intersection | Join | Where business concepts meet |
A meter reading is not just a value.
It is:
(Customer, Meter, Time, Value)

Lose one axis → lose meaning.
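Losing an axis can be demonstrated directly. A small sketch (the identifiers and values are invented):

```python
from collections import namedtuple

Reading = namedtuple("Reading", ["customer", "meter", "time", "value"])

# Two genuinely different facts: same customer, same moment, different meters.
readings = [
    Reading("C1", "M1", "2024-01-01T00:00", 120.5),
    Reading("C1", "M2", "2024-01-01T00:00", 120.5),
]

full = {(r.customer, r.meter, r.time) for r in readings}
print(len(full))  # 2: the full coordinate keeps the facts distinct

collapsed = {(r.customer, r.time) for r in readings}
print(len(collapsed))  # 1: drop the meter axis and two facts become one
```

With the full coordinate the two readings are distinct; drop one axis and they collapse into a single ambiguous point, which is exactly where reconciliation turns into archaeology.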
Visual Overlay
PlantUML source
@startuml
title Analytical Geometry as Data Model
entity "Customer (Axis X)" as C
entity "Meter (Axis Y)" as M
entity "Time (Axis Z)" as T
entity "Reading (Fact Point)" as R {
value
}
C ||--o{ R
M ||--o{ R
T ||--o{ R
entity "Tariff" as TF
R }o--|| TF
note right of R
A fact is a coordinate:
(Customer, Meter, Time, Value)
end note
@enduml