📚

Data Carve-Outs — From Narrative to Structure

A field experience at a utility firm, showing how structure turns one-off data extraction into a repeatable, defensible system.

🧭 Context

Multiple phased carve-outs were delivered across the business. Each phase built on the last:
• Phase 1 → first carve-out (business dataset 01)
• Phase 2 → second carve-out (business dataset 02)
• Phase 3 → third carve-out (business dataset 03)

⚠️ The Problem

Each phase subtly redefined what “in scope” meant:
• Rules drifted
• Edge cases accumulated
• Reconciliation became harder

We weren’t repeating a process — we were redefining the problem each time.

🔍 Diagnosis

The carve-out existed as a narrative, not a structure.

  • No fixed grain (what is a record)
  • No stable axes (time, product, ownership)
  • No reusable rule set

🛠️ Intervention

To make the concept land with an audience—largely of management accountants—I used a simple library analogy: splitting books by university faculty. It anchored the idea in something familiar, allowing the structure to be understood clearly without relying on technical jargon.

“Splitting books in a library”

  • Define shelves → dimensions (time, product, entity)
  • Define a book → grain
  • Define allocation → rules

Not a platform. Just a consistent way of deciding what belongs.
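The library analogy can be sketched in a few lines of Python. This is purely illustrative (the titles, faculties, and rule are invented, not the firm's actual data): shelves are dimensions, a book is the grain, and allocation is the inclusion rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Book:
    """The grain: one book = one record."""
    title: str
    faculty: str   # dimension: which shelf it can sit on
    year: int      # dimension: time

def belongs_to(book: Book, faculty: str, year_from: int) -> bool:
    """Allocation rule: in scope if it matches the faculty shelf
    and falls inside the time window."""
    return book.faculty == faculty and book.year >= year_from

library = [
    Book("Thermodynamics", "Engineering", 2018),
    Book("Contract Law", "Law", 2015),
    Book("Fluid Mechanics", "Engineering", 2012),
]

# The same rule, reused for any carve-out:
engineering_2015 = [b for b in library if belongs_to(b, "Engineering", 2015)]
```

The point of the sketch: the rule is defined once and parameterised, so a new phase changes the arguments, not the definition.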

🧱 Pushback

The steer from solution architecture was that this had been looked at and discounted on cost grounds: “we don’t have the budget to host a star schema, we just need to do the carve-out.” To me, that reflected a conceptual gap. The cost objection was a gut-feel judgement wrapped in technical language, and structure was being treated as something you host rather than something you define once and reuse.

Translation:
• Structure was seen as infrastructure
• Not as a reusable definition

💡 Core Insight

Without structure, a carve-out is a story.
With structure, it becomes a system.

📉 Outcome

  • Increasing reconciliation effort
  • Ambiguity across phases
  • Manual correction and rework

Cost wasn’t infrastructure — it was delivery friction.

✅ Better Way

Define once, reuse many times:

  • Grain → what is a record
  • Dimensions → axes of selection
  • Rules → inclusion boundaries

Implement as:
• views
• mapping tables
• lightweight staging logic
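A hedged sketch of “define once, reuse many times” in plain Python (the table, column, and entity names are invented for illustration; in practice the mapping would live in a real table referenced by views or staging logic). The carve-out is declared as data, and each phase only changes a parameter value.

```python
# Declared once: the shape of the carve-out.
CARVE_OUT = {
    "grain": ("entity_id", "period"),                  # what is a record
    "dimensions": ("entity_id", "product", "period"),  # axes of selection
}

# Mapping table: which entities belong to which phase.
PHASE_MAP = {
    "E001": "phase_1",
    "E002": "phase_2",
    "E003": "phase_3",
}

def in_scope(row: dict, phase: str) -> bool:
    """Inclusion rule: identical logic every phase; only the phase varies."""
    return PHASE_MAP.get(row["entity_id"]) == phase

rows = [
    {"entity_id": "E001", "product": "gas", "period": "2023-01"},
    {"entity_id": "E002", "product": "power", "period": "2023-01"},
]

phase_1 = [r for r in rows if in_scope(r, "phase_1")]
```

The equivalent in SQL would be a view filtering on the mapping table; either way, the definition is reused rather than redefined per phase.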

Delivery. Data. Done Properly.

Define the shape once. Execute it many times.

Analytical Geometry — Data Modelling Tile
📐

Analytical Geometry for Data

Not guiding rockets — just making sure the data knows where it is and why it matters.

In utility data, a record isn’t just a row — it’s a positioned event.

Customer, asset, time, tariff — that’s a coordinate system.
Get the position right, and everything else follows. Get it wrong, and reconciliation becomes archaeology.

Geometry → Data equivalent → Meaning
• Coordinates → Business keys + time → defines the position of a fact
• Axes → Dimensions → the reference system (star schema)
• Vectors → Data movement → change over time
• Transformation → ETL / ELT → movement without losing meaning
• Intersection → Join → where business concepts meet

Practical read:
A meter reading is not just a value.
It is:
(Customer, Meter, Time, Value)

Lose one axis → lose meaning.
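This can be made concrete with a small Python sketch (customer, meter, and timestamp values are invented for illustration). The full coordinate identifies each value; drop one axis and distinct readings collapse onto the same position.

```python
# A meter reading as a positioned fact: (customer, meter, time) → value.
readings = {
    ("C1", "M1", "2023-01-01"): 120.5,
    ("C1", "M1", "2023-01-02"): 121.0,
    ("C1", "M2", "2023-01-01"): 88.2,
}

# With all three axes, every value has exactly one position.
assert len(readings) == 3

# Lose the meter axis and two readings now share a coordinate:
collapsed = {}
for (customer, meter, ts), value in readings.items():
    collapsed[(customer, ts)] = value  # last write wins: one reading silently lost

assert len(collapsed) == 2  # meaning lost, not just a column dropped
```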

Visual Overlay

Diagram: Analytical Geometry as Data Model. Customer (Axis X), Meter (Axis Y) and Time (Axis Z) each relate one-to-many to the Reading fact point; the Reading relates many-to-one to Tariff (business context). Interpretation: a fact is a coordinate, (Customer, Meter, Time, Value). Why this matters: dimensions define position, and ETL must preserve meaning, not just move rows about.
PlantUML source
@startuml
title Analytical Geometry as Data Model

entity "Customer (Axis X)" as C
entity "Meter (Axis Y)" as M
entity "Time (Axis Z)" as T

entity "Reading (Fact Point)" as R {
  value
}

C ||--o{ R
M ||--o{ R
T ||--o{ R

entity "Tariff" as TF
R }o--|| TF

note right of R
A fact is a coordinate:
(Customer, Meter, Time, Value)
end note

@enduml
Delivery. Data. Done Properly.