AI/ML Maffs | IainToolin

🗝️ Emoji Key — Industry Sectors

📊 Outsourcing General Commercial & IT

📊 Stats — descriptive, CI, A/B, precision/recall, robust outliers
🤖 ML — regression, logistic, regularisation, PCA, Bayes
☎️ Ops/Queues — Poisson, Erlang-C, smoothing, linear programming
🌐 Networks & Reliability — availability, PageRank

⚡️ Electricity Retail & Wholesale

📈 Forecasting — demand, ARIMA, Holt-Winters, weather-normalisation
🤖 ML — GBT/LSTM load & RES forecasts, spike classifiers, anomaly detection, RL battery dispatch
🧰 Operations & Dispatch — economic dispatch, unit commitment, storage, demand response
💹 Risk & Hedging — hedge cost, VaR/CVaR, GARCH, elasticity
🌐 Network & Settlement — DC-OPF/LMPs, congestion rent, loss factors

☎️ Telecommunications Retail & Wholesale

📈 Demand & Forecasting — ARIMAX, Holt-Winters, fraud sanity checks
🤖 ML — churn + survival, CDR fraud (IsoForest + supervised), traffic GBT/LSTM, QoE regression, graph ML, RL for SON
☎️ Traffic & Capacity — Erlang-B/C, Engset, Little’s Law, latency
🛰️ Radio & Throughput — path loss, SINR, Shannon, TCP throughput
💷 Commercial — ARPU, CLV, churn, LCR, interconnect billing
🌐 Network & Reliability — availability, PageRank, MOS/QoE

Outsourcing Model

📊 Stats — Descriptive, CI, A/B, Precision/Recall, Robust Outliers

Name	Formula	Non-technical description	Technical description	Scenario	Pros	Pitfalls	Formulas to address pitfalls	Python libraries	Python code	Formula Steps in Plain English	Step-by-step evaluation
📏 Descriptive stats (mean/variance/std/z)	`x̄ = (1/n) Σ xi`; `s² = (1/(n-1)) Σ (xi−x̄)²`; `z = (x−x̄)/s`	Find average, spread, and how unusual a value is.	Estimators for central tendency & dispersion; z standardises units.	Average handling time (AHT) in a council helpline.	Simple baseline; quick QC.	Skew/outliers distort mean & std.	`median`, `MAD`; robust z: `0.6745(x−med)/MAD`	numpy, pandas	`z = (aht - aht.mean())/aht.std(ddof=1)`	1) Add values → mean. 2) Deviations → square/average → variance. 3) √variance → std. 4) (x−mean)/std → z.	Collect AHT → compute mean/std → flag \|z\|>3 → review calls.
🎯 Confidence interval (mean)	`x̄ ± z_α/2· s/√n`	A safety margin around the average.	Normal/t approximation to mean CI.	Estimate ticket resolve time ±5 min @95%.	Quantifies uncertainty.	Autocorrelation breaks iid assumptions.	Block bootstrap; Newey–West SEs.	scipy.stats, statsmodels	`stats.t.interval(0.95, n-1, loc=x.mean(), scale=x.std(ddof=1)/n**0.5)`	1) Mean. 2) Std error = std/√n. 3) Margin = z×SE. 4) Mean ± margin.	Size target MOE → compute n → sample → report CI on dashboards.
🧪 A/B test (means & proportions)	`t = (x̄₁−x̄₂)/(s_p√(1/n₁+1/n₂))`; `z = (p₁−p₂)/√(p(1−p)(1/n₁+1/n₂))`	Check if a change really helped.	NHST for differences in means/props.	Old vs new IVR transfer rate.	Clear decision-gate.	Multiple tests inflate false positives.	Holm/Bonferroni; power analysis (1−β).	scipy.stats, statsmodels	`from scipy import stats; stats.ttest_ind(a, b, equal_var=False)`	1) Compare group means/props. 2) Compute test statistic. 3) Get p-value from dist. 4) Decide vs α.	Define KPI → randomise → run test → correct for multiplicity → decide rollout.
🧮 Precision / Recall / F1	`P=TP/(TP+FP)`, `R=TP/(TP+FN)`, `F1=2PR/(P+R)`	Quality of yes/no predictions.	Confusion-matrix derived metrics.	Security alert triage (high precision on P1).	Aligns to SLA priorities.	Threshold choice changes trade-off.	PR curve; optimise F_β for business cost.	scikit-learn	`from sklearn.metrics import f1_score; f1 = f1_score(y_true, proba > 0.42)`	1) Count TP/FP/FN. 2) Compute P, R. 3) Combine into F1.	Estimate costs → pick threshold on PR curve → monitor drift.
🧯 Robust outlier detection (MAD)	`MAD = median(\|x−med\|)`; `zᵣ=0.6745(x−med)/MAD`	Outlier score that resists skew.	Median-based; 50% breakdown point.	Flag rogue CPU spikes or ticket times.	Works on messy ops data.	MAD=0 if flat segments.	IQR; STL detrend + MAD on residuals.	numpy, pandas	`import numpy as np; med = np.median(x); mad = np.median(np.abs(x-med)); zr = 0.6745*(x-med)/mad`	1) Median. 2) Absolute deviations. 3) Median of those → MAD. 4) Scale to zᵣ.	Compute zᵣ → set \|zᵣ\|>k (e.g., 3.5) → alert & suppress repeats.
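
The robust z-score row lists "MAD=0 if flat segments" as a pitfall and IQR as a remedy. A minimal standard-library sketch of that guard follows. The handling times, the `eps` tolerance and the IQR/1.349 fallback scale are illustrative assumptions, not part of the table.

```python
import statistics

def robust_z(values, eps=1e-12):
    """Robust z-scores: 0.6745 * (x - median) / MAD, with an IQR fallback when MAD is 0."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad > eps:
        scale = mad / 0.6745
    else:
        # Assumed fallback: IQR / 1.349 approximates sigma for roughly normal data.
        q1, _, q3 = statistics.quantiles(values, n=4)
        scale = (q3 - q1) / 1.349
    if scale <= eps:
        return [0.0 for _ in values]  # completely flat series: nothing to flag
    return [(v - med) / scale for v in values]

aht_seconds = [310, 295, 305, 300, 290, 315, 1200]  # hypothetical handling times
scores = robust_z(aht_seconds)
flagged = [v for v, z in zip(aht_seconds, scores) if abs(z) > 3.5]
print(flagged)
```

The 3.5 cut-off is the example threshold from the table; in practice it would be tuned per queue.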

🤖 ML — Regression, Classification, Regularisation, Optimisation, NLP, PCA, Bayes

Name	Formula	Non-technical description	Technical description	Scenario	Pros	Pitfalls	Formulas to address pitfalls	Python libraries	Python code	Formula Steps in Plain English	Step-by-step evaluation
📈 Linear regression	`β̂=(XᵀX)⁻¹Xᵀy`; minimize `MSE`	Best-fit line through points.	OLS; BLUE under Gauss–Markov.	Forecast backlog from arrivals & staffing.	Interpretable, fast.	Multicollinearity inflates variance.	VIF; Ridge `+λ‖β‖²`.	scikit-learn, numpy	`from sklearn.linear_model import LinearRegression; m = LinearRegression().fit(X, y)`	1) Arrange X (inputs), y (target). 2) Compute β̂. 3) Use Xβ̂ to predict.	Split train/test → fit → residual checks → deploy & monitor.
🔐 Logistic regression (binary)	`P(y=1\|x)=1/(1+e^{-(wᵀx+b)})`; log-loss	Predict yes/no probability.	Bernoulli GLM with logit link; MLE.	Predict SLA breach risk in 24h.	Calibrated probabilities.	Class imbalance; miscalibration.	Class weights; Platt/Isotonic calibration.	scikit-learn	`from sklearn.linear_model import LogisticRegression; m = LogisticRegression(class_weight='balanced').fit(X,y)`	1) Weighted sum → sigmoid. 2) Output 0–1 probability.	Pick threshold for business cost → calibrate → track drift.
🧲 Regularisation (L1/L2/ElasticNet)	Minimise `‖y−Xβ‖² + λ‖β‖²` (L2) or `+ λ‖β‖₁` (L1)	Prevent overfitting; simplify model.	Penalised ERM adds bias, reduces variance.	Many features, few rows.	Stabilises coefficients.	Too much λ underfits.	CV to choose λ; 1-SE rule.	scikit-learn	`from sklearn.linear_model import ElasticNetCV; m = ElasticNetCV(cv=5).fit(X,y)`	1) Fit with penalty. 2) Tune λ via CV. 3) Refit on full data.	Grid search λ/α → check bias/var → lock and version.
🏃 Gradient descent (SGD)	`θ_{t+1}=θ_t − α ∇J(θ_t)`	Repeated nudges downhill.	First-order iterative optimiser.	Train classifiers on high-dim data.	Scales to big data.	Bad learning rate diverges.	Line search; Adam; schedules.	sklearn, PyTorch, TF	`from sklearn.linear_model import SGDClassifier; m = SGDClassifier().fit(X,y)`	1) Start guess. 2) Compute slope. 3) Step opposite direction × α. 4) Repeat.	Monitor loss vs epochs → early stopping → save best weights.
🧾 TF-IDF + Cosine similarity	`tfidf=tf·log(N/df)`; `cosθ=(u·v)/(‖u‖‖v‖)`	Turn text into weighted numbers; compare angle.	Sparse vectorisation; angular similarity.	Auto-route tickets (“password reset”).	Strong baseline for NLP routing.	OOV & vocabulary drift.	n-grams; hashing; sublinear tf.	scikit-learn	`from sklearn.feature_extraction.text import TfidfVectorizer; X = TfidfVectorizer().fit_transform(texts)`	1) Count words. 2) Down-weight common terms. 3) Compare vectors by angle.	Build training corpus → vectorise → nearest-neighbour route → retrain monthly.
🧊 PCA (dimensionality reduction)	`Σv=λv`; project `Z=XV_k`	Compress features but keep variance.	Orthogonal projection onto top eigenvectors.	Reduce 300 KPIs to 10 drivers.	Speeds training; denoises.	Loss of interpretability.	Inspect loadings; varimax rotation.	scikit-learn, numpy	`from sklearn.decomposition import PCA; Z = PCA(10).fit_transform(X)`	1) Standardise. 2) Covariance & eigenvectors. 3) Keep top k. 4) Project.	Choose k by explained variance → sanity-check loadings → use Z downstream.
🧠 Bayes’ theorem	`P(A\|B)=P(B\|A)P(A)/P(B)`	Update belief with new evidence.	Posterior ∝ likelihood × prior.	Probability a ticket is security-related given keywords.	Transparent updating.	Bad priors bias results.	Empirical Bayes; hierarchical models.	numpy, scipy	`post = (like*prior)/(like*prior + like0*(1-prior))`	1) Set prior. 2) Compute likelihoods. 3) Apply formula → posterior.	Estimate priors from historicals → update online → calibrate with reliability curves.
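
To make the Bayes row concrete, here is a small worked sketch for the ticket-triage scenario. The prior and keyword rates are made-up illustrative numbers, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, like_given_a, like_given_not_a):
    """P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|not A)(1 - P(A))]."""
    evidence = like_given_a * prior + like_given_not_a * (1 - prior)
    return like_given_a * prior / evidence

# Assumed figures: 2% of tickets are security-related; the keyword "phishing"
# appears in 60% of security tickets and in 1% of all other tickets.
p_security = posterior(prior=0.02, like_given_a=0.60, like_given_not_a=0.01)
print(round(p_security, 3))
```

Even with a strong keyword signal the posterior stays well short of certainty because the prior is small, which is why the row warns that bad priors bias results.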

☎️ Ops/Queues — Poisson & Little’s Law, Erlang-C, Exponential Smoothing, Linear Programming

Name	Formula	Non-technical description	Technical description	Scenario	Pros	Pitfalls	Formulas to address pitfalls	Python libraries	Python code	Formula Steps in Plain English	Step-by-step evaluation
☎️ Poisson arrivals & Little’s Law	`P(N=k)=e^{-λt}(λt)^k/k!`; `L=λW`	Calls arrive randomly but at a rate.	Poisson process; steady-state relationship L=λW.	Inbound calls per 5-min; infer avg queue.	Simple, insight-rich.	Overdispersion (burstiness).	Negative Binomial (Var=μ+κμ²); λ(t) by time-of-day.	scipy.stats, statsmodels	`from scipy.stats import poisson; p = poisson.pmf(k, lam*t)`	1) Estimate λ. 2) Choose t. 3) Use formula for k arrivals. 4) L=λW links rate & wait.	Fit λ(t) per interval → validate dispersion → apply Little’s Law for WIP & SLAs.
👥 Erlang-C (M/M/c) + ASA	`ρ=λ/(cμ)`; `P_W= (a^c/(c!(1−ρ)))/[Σ₀^{c−1} a^n/n! + a^c/(c!(1−ρ))]`; `a=λ/μ`; `ASA=P_W/(cμ−λ)`	Predict wait probability & agents needed.	Steady-state queueing for c parallel servers.	Size a 24/7 helpline to meet 80/20 SLA.	Industry standard.	Non-Poisson arrivals; skill mismatch.	Skills routing; simulation; use λ(t).	numpy	`def erlang_c(lam, mu, c): import math; a = lam/mu; rho = lam/(c*mu); num = a**c/(math.factorial(c)*(1-rho)); den = sum(a**n/math.factorial(n) for n in range(c)) + num; return num/den`	1) Get λ, μ. 2) Try c. 3) Compute P_W, ASA. 4) Iterate c to hit SLA.	Forecast interval loads → compute c by interval → add shrinkage & occupancy caps.
📅 Exponential smoothing (level)	`ŷ_{t+1}=α y_t + (1−α) ŷ_t`	Fast, adaptive short-term forecast.	EW squared-error minimiser; 0<α<1.	15-min inbound chat forecast.	Tiny & robust.	No seasonality/trend.	Holt-Winters (add trend/seasonal).	statsmodels	`from statsmodels.tsa.holtwinters import ExponentialSmoothing; fit = ExponentialSmoothing(y).fit(); fc = fit.forecast(12)`	1) Pick α. 2) New forecast = α×latest + (1−α)×old forecast.	CV α per queue → roll forecasts → compare MAPE → feed into staffing LP.
🧮📦 Linear programming (rostering)	`min cᵀx` s.t. `Ax ≥ b`, `x ≥ 0`	Optimise resources under limits.	Convex optimisation with feasibility regions.	Minimise staffing cost while meeting 80/20 & occupancy.	Global optimum with proofs.	Infeasible if constraints clash.	Add slack; sensitivity `∂z/∂b`.	PuLP, OR-Tools	`import pulp as pl; x = pl.LpVariable.dicts('shift', S, 0); prob = pl.LpProblem('staff', pl.LpMinimize); prob += pl.lpSum(cost[s]*x[s] for s in S)`	1) Define variables & costs. 2) Add constraints. 3) Solve → schedule.	Encode SLAs as constraints → add shrinkage → solve & export rota.
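
The Erlang-C steps ("try c, compute P_W, iterate c to hit SLA") can be sketched as a small staffing loop. The service-level expression `1 − P_W·e^{−(cμ−λ)T}` is the standard M/M/c result and is not in the table; the arrival rate and handling time are hypothetical, and `agents_needed` is an illustrative helper name.

```python
import math

def erlang_c(lam, mu, c):
    """Probability an arriving call has to wait in an M/M/c queue."""
    a = lam / mu                      # offered load in Erlangs
    rho = a / c                       # utilisation per agent
    if rho >= 1:
        return 1.0                    # unstable queue: every caller waits
    num = a**c / (math.factorial(c) * (1 - rho))
    den = sum(a**n / math.factorial(n) for n in range(c)) + num
    return num / den

def service_level(lam, mu, c, target_s):
    """Fraction of calls answered within target_s seconds."""
    pw = erlang_c(lam, mu, c)
    return max(0.0, 1 - pw * math.exp(-(c * mu - lam) * target_s))

def agents_needed(lam, mu, sl_target=0.80, target_s=20):
    """Smallest c meeting the SLA, e.g. 80% answered within 20 s."""
    c = max(1, math.ceil(lam / mu))
    while service_level(lam, mu, c, target_s) < sl_target:
        c += 1
    return c

lam = 120 / 3600        # assumed: 120 calls per hour, expressed per second
mu = 1 / 300            # assumed: 300 s average handling time
c = agents_needed(lam, mu)
asa = erlang_c(lam, mu, c) / (c * mu - lam)   # average speed of answer, seconds
print(c, round(asa, 1))
```

This gives raw agents on the phones only; shrinkage and occupancy caps from the evaluation column would be applied on top.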

🌐 Networks/Reliability — Availability, PageRank/Centrality

Name	Formula	Non-technical description	Technical description	Scenario	Pros	Pitfalls	Formulas to address pitfalls	Python libraries	Python code	Formula Steps in Plain English	Step-by-step evaluation
🛡️ Availability & redundancy	`A=MTBF/(MTBF+MTTR)`; Series: `∏Aᵢ`; Parallel: `1−∏(1−Aᵢ)`	Uptime of components and combinations.	Steady-state availability; assumes independence.	Data-centre uptime; redundant links justification.	Clear link to SLAs.	Common-mode failures break independence.	Dependency factoring; fault-tree analysis (FTA).	numpy, pandas	`import numpy as np; A = mtbf/(mtbf+mttr); A_par = 1 - np.prod(1-np.array(Ai))`	1) Find MTBF/MTTR. 2) Compute A. 3) Combine in series/parallel.	Map dependencies → compute path availability → identify single points of failure.
🌐 PageRank / eigenvector centrality	`PR(i)=(1−d)/N + d Σ_{j∈In(i)} PR(j)/out(j)`	Importance ranking in a network.	Stationary distribution of a random walk with teleport.	Rank routers/AD servers for patch priority.	Surfaces critical assets.	Dangling nodes; directionality.	Damping 0<d<1; personalise vector; handle dangling mass.	networkx	`import networkx as nx; pr = nx.pagerank(G, alpha=0.85)`	1) Start equal scores. 2) Distribute along edges. 3) Add teleport. 4) Iterate to convergence.	Build graph from CMDB → compute PR → harden top k assets first.
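
The availability formulas combine as follows for a link-plus-router path that is then duplicated. The MTBF/MTTR figures are hypothetical, and the parallel result relies on the independence assumption flagged in the table.

```python
import math

def availability(mtbf_h, mttr_h):
    """Steady-state availability A = MTBF / (MTBF + MTTR)."""
    return mtbf_h / (mtbf_h + mttr_h)

def series(avails):
    """All components must be up: product of availabilities."""
    return math.prod(avails)

def parallel(avails):
    """At least one component must be up (assumes independent failures)."""
    return 1 - math.prod(1 - a for a in avails)

link = availability(mtbf_h=2000, mttr_h=4)      # hypothetical WAN link
router = availability(mtbf_h=8000, mttr_h=8)    # hypothetical router
single_path = series([link, router])
dual_path = parallel([single_path, single_path])
downtime_min_per_year = (1 - dual_path) * 365 * 24 * 60
print(round(single_path, 6), round(dual_path, 8), round(downtime_min_per_year, 1))
```

If both paths share a power feed or duct, the independence assumption fails and the parallel figure is optimistic.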

Electricity Market Model

📈 Forecasting — Holt-Winters, ARIMA/ARIMAX, Weather Normalisation

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
⚡️ Holt-Winters (additive)	`ŷ_{t+h}=ℓ_t+h b_t+s_{t+h-mk}`, updates with α,β,γ; seasonal period `m=48` (HH)	Forecast half-hourly demand using level, trend, seasonality.	Triple exponential smoothing (ETS AAA) minimises exponentially weighted SSE.	Supplier day-ahead demand forecast for hedging.	Fast; strong baseline.	Breaks on regime shifts (price caps, storms).	Refit rolling; add exogenous vars (temp, price): ETSX.	statsmodels	`from statsmodels.tsa.holtwinters import ExponentialSmoothing; fit = ExponentialSmoothing(y,trend='add',seasonal='add',seasonal_periods=48).fit(); fc = fit.forecast(96)`	1) Track current level, trend, seasonality. 2) Update each period. 3) Project forward h steps.	Train/validation split → tune αβγ → backtest by season → nightly refit → push to hedge calc.
📈 ARIMA / ARIMAX	`ϕ(B)(1-B)^d y_t = θ(B)ε_t + βX_t`	Time-series that learns autocorrelation/shocks; can include weather/price.	Box-Jenkins with exogenous regressors; choose (p,d,q) by AIC/BIC.	Forecast IDNO/DUoS area load with temperature and price.	Flexible; interpretable.	Non-stationarity; regime changes.	Differencing; holiday dummies; rolling refit.	pmdarima, statsmodels	`import pmdarima as pm; m = pm.auto_arima(y, X=exog, seasonal=False); fc = m.predict(n_periods=48, X=exog_fc)`	1) Make series stationary. 2) Fit AR & MA parts. 3) Add exogenous drivers. 4) Forecast & invert differencing.	Diagnose residuals → select order → backtest multiple windows → deploy & monitor drift.
🌡️ Weather normalisation (HDD/CDD)	`HDD=max(0,T_b-T)`, `CDD=max(0,T-T_b)`; regression `y=β₀+β₁HDD+β₂CDD+ε`	Adjust demand to “typical weather” for fair KPI comparisons.	Linear model on degree-days; choose base `T_b` (e.g., 15.5 °C UK).	Compare this winter’s usage vs typical.	Fair baselines; portfolio comparability.	Wrong base temp; microclimates.	Grid-search `T_b`; add site fixed-effects.	pandas, scikit-learn	`from sklearn.linear_model import LinearRegression; X = df[['HDD','CDD']]; m = LinearRegression().fit(X, y)`	1) Compute HDD/CDD from temps. 2) Regress demand on them. 3) Predict at “typical” HDD/CDD.	Select base → fit → report normalised demand/KPIs → update monthly.
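
The weather-normalisation steps can be illustrated end to end with a heating-only fit. This is a simplified sketch: it drops the CDD term (which would be handled the same way in a multi-regressor fit), the temperatures and demands are synthetic, and the "typical" HDD value is an assumption.

```python
BASE_C = 15.5  # base temperature quoted in the table for the UK

def hdd(temp_c, base=BASE_C):
    """Heating degree-days for one day's mean temperature."""
    return max(0.0, base - temp_c)

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

temps = [2.0, 5.0, 8.0, 11.0, 14.0, 17.0]       # synthetic daily mean temps, °C
demand = [154.0, 142.0, 130.0, 118.0, 106.0, 100.0]  # synthetic daily MWh
hdds = [hdd(t) for t in temps]
b0, b1 = fit_line(hdds, demand)

TYPICAL_HDD = 8.0  # assumed long-run average for this calendar day
# Remove the part of demand explained by weather being colder/warmer than typical.
normalised = [d - b1 * (h - TYPICAL_HDD) for d, h in zip(demand, hdds)]
print(round(b0, 2), round(b1, 2), [round(v, 1) for v in normalised])
```

On this synthetic data every day normalises to the same value, since weather explains all of the variation by construction.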

🧰 Operations & Dispatch — Economic Dispatch, Unit Commitment, Storage, DR, Load/Capacity & Diversity

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
🧰 Economic dispatch (no network)	Minimise `∑ C_i(P_i)` s.t. `∑P_i=D`, `0≤P_i≤P_i^{max}`	Choose cheapest generator mix to meet demand now.	Convex cost with KKT multiplier (system lambda).	Half-hour balancing by marginal cost stack.	Efficient given costs.	Ignores start-up/min-up-down.	Add UC constraints (MILP).	cvxpy	`import cvxpy as cp; P = cp.Variable(n); prob = cp.Problem(cp.Minimize(c@P), [cp.sum(P)==D, P>=0, P<=Pmax]); prob.solve()`	1) Sort by cost. 2) Fill to meet D. 3) Respect limits.	Validate vs actual stack → quantify savings → iterate costs.
🏭 Unit Commitment (UC)	Binary on/off `u_{i,t}∈{0,1}`, power `P_{i,t}`, min-up/down, ramps, start-ups.	Which plants to have on, considering start/stop costs.	MILP over time horizon; operational constraints.	DA schedule for CCGT/OCGT fleet.	Operationally realistic.	Large MILPs can be slow.	Lagrangian relax; rolling horizon; heuristics.	PuLP, OR-Tools	`# u[i,t] on/off, P[i,t] output; add minup/mindown & ramps`	1) Decide on/off. 2) Constrain time-coupling. 3) Minimise total cost.	Solve horizon → roll forward each HH → compare to market schedule.
🔋 Storage arbitrage	Maximise `∑ π_t e_t` s.t. SOC dynamics, charge/discharge limits, round-trip η	When to charge/discharge to earn from price spreads.	LP with state-of-charge constraints.	2-hour battery on GB day-ahead.	Clear & solvable.	Degradation, cycle limits.	Add cycle constraints; aging models.	cvxpy	`# soc[t+1]=soc[t]+η_c*ch[t]-dis[t]/η_d`	1) Set prices & limits. 2) Constrain SOC. 3) Optimise schedule.	Backtest → add degradation → redeploy daily.
🧲 Demand Response baseline & impact	Baseline `BL_t=avg_{d∈D}(y_{d,t})`, Impact `Δ=BL−y`	Estimate savings vs “what would’ve happened”.	Matched-day/10-of-10 + adj; or ML counterfactual.	Large C&I reduces load on price signal.	Simple accounting.	Baseline inflation/erosion; rebound.	ML baselines; penalty terms for rebound.	pandas, sklearn	`import pandas as pd; bl = pd.concat([y.shift(48*d) for d in range(1, 11)], axis=1).mean(axis=1); impact = bl - y`	1) Build baseline. 2) Compare actual. 3) Sum verified kWh.	QC with weather/occupancy → settle payments on verified savings.
🧮 Load factor (LF)	`LF = Ȳ / P_{max}`	How “flat” your demand is (near 1 = smoother).	Average power divided by peak power over period.	Retail portfolio shaping KPI.	Intuitive, quick.	Misses sub-interval spikes.	Add 95th percentile peak, ramp rate.	pandas	`lf = y.mean()/y.max()`	1) Mean. 2) Max. 3) Divide.	Track LF + 95th pct & ramps for ops decisions.
⚙️ Capacity factor (generation)	`CF = (∑ P_t)/(P_{rated}·T)`	How hard an asset ran vs its capability.	Time-weighted utilisation metric.	Wind farm performance vs P50.	Clear health check.	Curtailment/outages confound.	Decompose by cause codes; weather-adjust.	pandas	`cf = gen.sum()/(nameplate*hours)`	1) Sum output. 2) Divide by rated×hours.	Attribute gaps to wind vs outages vs curtailment.
🤝 Coincidence / Diversity factor	`CF_agg = P_{max,agg} / ∑ P_{max,i}`, Diversity factor = `1/CF_agg`	How individual peaks combine across many customers.	Portfolio peak relative to sum of individual peaks.	Size capacity/hedge at portfolio (not customer) level.	Shows natural smoothing.	Herding creates high coincidence.	Segment cohorts; stress at peak HH.	pandas	`cf = agg.max() / indiv_max.sum()`	1) Portfolio peak. 2) Sum indiv. peaks. 3) Divide.	Use CF in capacity/hedge planning; monitor by segment.
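
The plain-English economic dispatch steps ("sort by cost, fill to meet D, respect limits") amount to a merit-order stack. A minimal sketch follows, valid only when each unit has a constant marginal cost; the unit names, costs and capacities are hypothetical.

```python
def merit_order_dispatch(units, demand):
    """units: list of (name, marginal_cost, p_max). Returns (dispatch, system_lambda)."""
    remaining = demand
    dispatch = {}
    system_lambda = None
    for name, cost, p_max in sorted(units, key=lambda u: u[1]):
        p = min(p_max, remaining)       # fill the cheapest unit first, up to its limit
        dispatch[name] = p
        if p > 0:
            system_lambda = cost        # cost of the last (marginal) unit used
        remaining -= p
    if remaining > 1e-9:
        raise ValueError("demand exceeds total available capacity")
    return dispatch, system_lambda

fleet = [("ocgt", 120.0, 200.0), ("wind", 0.0, 300.0), ("ccgt", 60.0, 500.0)]
dispatch, system_lambda = merit_order_dispatch(fleet, demand=650.0)
print(dispatch, system_lambda)
```

With convex but non-linear cost curves the stack no longer gives the optimum directly, which is where the cvxpy formulation in the table comes in.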

💹 Risk & Hedging — Elasticity, Hedge Cost-to-Serve, Mean-Variance, VaR/CVaR, GARCH, Imbalance

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
🧲 Price elasticity of demand	`ε ≈ (ΔQ/Q)/(ΔP/P)`; log-log: `ln Q = α + ε ln P + …`	How usage changes when price changes.	Elasticity from log regression; beware endogeneity.	TOU tariff response estimation (EV load).	Quantifies DR potential.	Price ↔ demand feedback.	IV regression; diff-in-diff.	statsmodels	`import numpy as np; import statsmodels.api as sm; m = sm.OLS(np.log(Q), sm.add_constant(np.c_[np.log(P), Z])).fit()`	1) Take logs. 2) Regress. 3) Coef on log P = elasticity.	Check sign/CI → simulate tariff scenarios → validate vs trial.
📦 Hedge cost-to-serve	`C=∑ (q_h f_h) + ∑ (d_h - q_h) s_h`	Blended cost from forwards + spot top-up.	Portfolio cost under hedge ratio path.	Choose 80% day-ahead + 20% intraday.	Quantifies hedge value.	Over/under-hedge risk.	Optimise `q_h` via mean-CVaR.	numpy, cvxpy	`C = (q*f + (d-q)*s).sum()`	1) Pick hedge volumes. 2) Multiply by forward. 3) Residual at spot. 4) Sum.	Stress with price scenarios → pick policy → monitor tracking error.
📉 Mean-Variance (hedge portfolio)	Minimise `λ wᵀΣw − μᵀw` s.t. `∑w=1, w≥0`	Choose mix of forwards to balance risk/return.	Markowitz: return = −cost; variance = price risk.	Blend month/quarter/season strips.	Simple & fast.	Fat-tails break Gaussian assumptions.	Use CVaR (coherent) optimisation.	numpy, cvxpy	`import cvxpy as cp; w = cp.Variable(n); obj = cp.Minimize(lam*cp.quad_form(w,Sigma) - mu@w); cons = [cp.sum(w)==1, w>=0]`	1) Estimate μ, Σ. 2) Optimise weights under constraints.	Backtest → set limits → monitor exposures & VaR.
📊 VaR / CVaR (energy P&L)	VaR: quantile; CVaR: tail mean `E[L \| L ≥ VaRα]`	Loss at confidence level; average of worst tail.	Quantile & tail expectation on P&L distribution.	95% daily VaR/CVaR on hedge book.	Focuses on downside.	VaR non-coherent; model risk.	Prefer CVaR; historical/MC scenarios.	numpy, scipy	`var = np.quantile(L, 0.95); cvar = L[L>=var].mean()`	1) Simulate P&L. 2) Take quantile. 3) Average the tail.	Scenario set incl. spikes → set limits & capital.
🌪️ GARCH (price volatility)	`σ_t^2 = ω + α ε_{t-1}^2 + β σ_{t-1}^2`	Models time-varying volatility clustering.	Conditional heteroskedasticity for returns.	Size intraday risk limits.	Captures clustering.	Regime changes, jumps.	EGARCH/GJR; jump components.	arch	`from arch import arch_model; am = arch_model(r, vol='Garch', p=1, q=1).fit()`	1) Fit to returns. 2) Get conditional σ. 3) Forecast risk.	Combine with VaR/CVaR sizing; recalibrate periodically.
🔄 Imbalance cash-out	`Charge = Volume × ImbalancePrice`	What you pay if short/long vs position.	Settlement using system price (pay-as-imbalance).	Supplier under-forecast in tight evening peak.	Clear incentive to balance.	Price spikes → large P&L hits.	Better nowcasts; intraday re-trading.	pandas	`charge = vol * sysprice`	1) Compute delta vs notified position. 2) Multiply by imbalance price.	Attribute P&L by HH → improve forecast & hedge policy.
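
The VaR/CVaR row reduces to "take a quantile, then average the tail". A standard-library sketch on a toy loss sample follows. The losses are synthetic, and VaR is taken as the ⌈αn⌉-th order statistic, which is one of several quantile conventions and an assumption here.

```python
import math

def var_cvar(losses, alpha=0.95):
    """Historical VaR (order-statistic quantile) and CVaR = mean of losses >= VaR."""
    ordered = sorted(losses)
    # Small tolerance so that e.g. 0.95 * 100 is treated as exactly 95.
    idx = math.ceil(alpha * len(ordered) - 1e-9) - 1
    var = ordered[idx]
    tail = [x for x in ordered if x >= var]
    return var, sum(tail) / len(tail)

losses = list(range(1, 101))   # synthetic daily losses, £k: 1, 2, ..., 100
var95, cvar95 = var_cvar(losses, alpha=0.95)
print(var95, cvar95)
```

CVaR is always at least as large as VaR at the same confidence level, which is why the table suggests it for limit setting when tails are fat.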

🌐 Network & Settlement — DC-OPF & LMPs, Congestion Rent, Loss Factors

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
🧭 DC-OPF & LMPs	Minimise cost s.t. balance; line flows `F=Bθ`, `\|F\|≤F^{max}`; `LMP=λ_energy+λ_cong(+λ_loss)`	Price at each node reflects energy + congestion (and losses).	Linearised (DC) OPF; dual variables give LMPs.	Compute nodal prices & congestion rents.	Captures grid constraints.	Ignores reactive/voltage (AC effects).	AC-OPF or PTDF + loss factors.	PYPOWER, pandapower	`import pypower.api as pp; r = pp.runopf(case); LMP = r['bus'][:,13]`	1) Build network model. 2) Optimise dispatch. 3) Read duals → nodal prices.	Analyse hub-node spreads → hedge congestion exposure.
🚦 Congestion rent	`Rent = ∑_ℓ (Δπ_ℓ) · F_ℓ`	Value collected due to constrained lines.	Price difference times flow, summed over lines.	Interconnector constraint revenue estimate.	Quantifies bottlenecks.	Volatile; counterflows.	Scenario DC-OPF expected flows.	numpy	`rent = np.sum((pi_from - pi_to) * flow)`	1) Compute price diff on each line. 2) Multiply by flow. 3) Sum.	Rank lines by rent → target upgrades/hedges.
🔌 Loss factors (settlement)	`E_delivered = E_metered × LLF` (or `1−ℓ_dist`)	Adjust energy for distribution losses so settlement is fair.	Apply LLF by profile class/region/time to meter reads.	Supplier settlement adjustments (e.g., Elexon LLFs).	Simple application.	Wrong MPAN→LLF mapping.	Validate joins; reconciliation vs statements.	pandas	`E_del = E_mtr * llf_series`	1) Map MPAN to LLF. 2) Multiply reads by LLF.	Reconcile with settlement statements; investigate deltas.
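
For the loss-factor row, the main pitfall is a wrong or missing MPAN→LLF mapping. A minimal sketch that applies the multiplier and refuses to settle unmapped meters is shown below. The MPAN labels and LLF values are placeholders, and `apply_llf` is a hypothetical helper name.

```python
def apply_llf(reads_kwh, llf_by_mpan):
    """Scale metered kWh by each meter's line loss factor; fail loudly on unmapped MPANs."""
    missing = sorted(m for m in reads_kwh if m not in llf_by_mpan)
    if missing:
        raise KeyError(f"no LLF mapped for MPAN(s): {missing}")
    return {m: kwh * llf_by_mpan[m] for m, kwh in reads_kwh.items()}

reads = {"MPAN-A": 1000.0, "MPAN-B": 250.0}      # placeholder meter reads, kWh
llfs = {"MPAN-A": 1.05, "MPAN-B": 1.08}          # placeholder loss factors
delivered = apply_llf(reads, llfs)
total_uplift_kwh = sum(delivered.values()) - sum(reads.values())
print(delivered, round(total_uplift_kwh, 2))
```

Raising on a missing mapping is a design choice for this sketch; a production join might instead route unmapped meters to an exceptions report for reconciliation.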

🤖 ML in Electricity — Forecasting, Risk, Networks

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
🌳 Gradient Boosted Trees (STLF)	Stagewise additive model: `F_m(x)=F_{m-1}(x)+ν·h_m(x)` minimising loss	Many small trees, each fixes last one’s mistakes.	Boosting over CARTs on features (lags, weather, calendar).	Half-hourly load forecast with temp & holiday effects.	Strong accuracy; little feature scaling needed.	Overfit if too deep/too many rounds.	Early stopping; shrinkage `ν`; regularisation.	xgboost, lightgbm, sklearn	`import xgboost as xgb; m = xgb.XGBRegressor(n_estimators=500, max_depth=6, learning_rate=0.05); m.fit(X_train, y_train)`	1) Build lag & weather features. 2) Train boosted trees. 3) Stop when val error rises.	Walk-forward backtest → SHAP to sanity-check drivers → deploy nightly.
🧠 LSTM (sequence-to-one)	Recurrent cells: `h_t,f_t,i_t,o_t` gating long/short memory	Neural net that “remembers” recent patterns.	Many-to-one RNN for short-term load/RES output.	PV/wind intra-day forecast with ramps.	Catches non-linear temporal effects.	Needs data & tuning; can drift.	Dropout; early stopping; recalibration schedule.	TensorFlow, PyTorch	`import torch.nn as nn; model = nn.LSTM(input_size=X.shape[-1], hidden_size=64, num_layers=2, batch_first=True)  # PyTorch sketch`	1) Windowise time series. 2) Train on past windows. 3) Predict next horizon.	Backtest rolling origin → compare vs tree baseline → ensemble.
🌬️ Random Forest (RES output)	Bagging: average of many trees to reduce variance	Many decorrelated trees vote an answer.	RF regression on wind/pv with NWP features.	Day-ahead wind farm output P50/P90.	Stable; handles interactions.	Less sharp peaks vs boosting.	Quantile RF for P90; feature selection.	sklearn, skgarden	`from sklearn.ensemble import RandomForestRegressor; m = RandomForestRegressor(n_estimators=400).fit(X,y)`	1) Build met/NWP features. 2) Fit many trees. 3) Average predictions.	Calibrate quantiles → generate P50/P90 bands.
⚠️ XGBoost Classifier (price spike)	Boosted trees minimising log-loss	Predicts probability of a spike event.	Binary classification with class weights.	Imbalance price spike nowcast.	Probabilities for risk sizing.	Class imbalance → false alarms.	Focal loss; threshold from cost curve.	xgboost, sklearn	`clf = xgb.XGBClassifier(scale_pos_weight=20).fit(X,y)`	1) Label spikes. 2) Train classifier. 3) Pick threshold to balance cost.	Track precision/recall over time → retrain monthly.
🕵️ Isolation Forest (anomalies)	Isolation via random splits; anomaly score by path length	“Outsiders” get isolated quickly.	Unsupervised anomaly detection on SCADA/load.	Detect metering or telemetry glitches.	No labels required.	Flags change-points as outliers.	Add STL detrend; use rolling training windows.	sklearn	`from sklearn.ensemble import IsolationForest; iso = IsolationForest().fit(X); score = iso.decision_function(X)`	1) Build feature set. 2) Fit unsupervised. 3) Score & alert top-k.	Whitelist known events; tune contamination.
🎮 RL for Battery Dispatch	Q-learning: `Q(s,a)←(1−α)Q+α[r+γ max_a' Q(s',a')]`	Agent learns charge/discharge to maximise £.	MDP over price & SoC; discrete actions.	Day-ahead + intraday arbitrage.	Adapts to patterns.	Exploration risk; constraint handling.	Reward shaping; safety layer on SoC/DoD.	stable-baselines3, numpy	`# define env(state: SoC, price); train DQN/PPO agent`	1) Define states/actions. 2) Train with reward=profit. 3) Constrain SoC/cycles.	Backtest vs LP benchmark → deploy with guardrails.
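
The Q-learning update in the battery row can be run on a toy problem small enough to check by hand. This sketch is tabular Q-learning, not the DQN/PPO agents named in the table; the four-period price curve, the one-unit battery, the loss-free charging and all hyperparameters are assumptions. Masking infeasible actions plays the role of the "safety layer on SoC".

```python
import random

PRICES = [20.0, 10.0, 50.0, 40.0]   # hypothetical price per period
SOC_MAX = 1                          # battery holds 0 or 1 unit of energy
ACTIONS = (-1, 0, 1)                 # discharge, idle, charge

def allowed(soc):
    """Safety layer: only actions that keep SoC within [0, SOC_MAX]."""
    return [a for a in ACTIONS if 0 <= soc + a <= SOC_MAX]

def train(episodes=2000, alpha=0.1, gamma=1.0, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = {}                            # (period, soc, action) -> value estimate
    for _ in range(episodes):
        soc = 0
        for t, price in enumerate(PRICES):
            acts = allowed(soc)
            if rng.random() < eps:    # epsilon-greedy exploration
                a = rng.choice(acts)
            else:
                a = max(acts, key=lambda x: q.get((t, soc, x), 0.0))
            reward = -price * a       # charging costs money, discharging earns it
            nxt = soc + a
            if t + 1 < len(PRICES):
                best_next = max(q.get((t + 1, nxt, b), 0.0) for b in allowed(nxt))
            else:
                best_next = 0.0       # end of horizon
            old = q.get((t, soc, a), 0.0)
            q[(t, soc, a)] = (1 - alpha) * old + alpha * (reward + gamma * best_next)
            soc = nxt
    return q

def greedy_schedule(q):
    """Roll out the learned policy without exploration."""
    soc, plan = 0, []
    for t in range(len(PRICES)):
        a = max(allowed(soc), key=lambda x: q.get((t, soc, x), 0.0))
        plan.append(a)
        soc += a
    return plan

q_table = train()
plan = greedy_schedule(q_table)
profit = sum(-p * a for p, a in zip(PRICES, plan))
print(plan, profit)
```

On a problem this small the LP benchmark mentioned in the evaluation column is exact, so the learned schedule can be compared against it directly.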

Telecoms Wholesale and Retail

📈 Demand & Forecasting — “What’s coming down the pipe?”

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
📅 Holt–Winters (additive)	`ŷ_{t+h}=ℓ_t+h b_t+s_{t+h-mk}` with α,β,γ; seasonal `m=24` (hourly)	Forecast calls/data sessions with daily/weekly rhythm.	Triple exponential smoothing (ETS AAA) minimising EW-SSE.	Call centre volume by hour; store staffing.	Fast, solid baseline.	Breaks on outages, promos, price shocks.	Refit rolling; ETSX with exogenous (price, promo, outage dummies).	statsmodels	`from statsmodels.tsa.holtwinters import ExponentialSmoothing; fit = ExponentialSmoothing(y,trend='add',seasonal='add',seasonal_periods=24).fit(); fc = fit.forecast(48)`	Track level/trend/seasonal → update each period → project h steps ahead.	Train/validate → tune αβγ → backtest → publish to WFM & routing.
📈 ARIMAX	`ϕ(B)(1−B)^d y_t = θ(B)ε_t + βX_t`	Lets history and drivers (price, weather, promos) steer the forecast.	Box–Jenkins with exogenous regressors.	Predict broadband tickets after a price change.	Flexible; interpretable.	Non-stationary, regime switches.	Differencing; holiday dummies; rolling refits.	pmdarima, statsmodels	`import pmdarima as pm; m = pm.auto_arima(y, X=exog, seasonal=False)`	Make series steady → fit AR & MA → plug in drivers → forecast, undifference.	Residual checks → walk-forward backtest → deploy & watch drift.
🧮 Benford for CDR/fraud sanity	`P(d)=log10(1+1/d)`, d∈{1..9}	First-digit test to sniff odd billing patterns.	Benford distribution on aggregated measures.	Wholesale CDR audit: spot fabricated volumes.	Quick anomaly screen.	Not proof of fraud; small samples flaky.	Complement with MAD outliers & supervised models.	numpy, pandas	`import numpy as np; benford = np.log10(1 + 1/np.arange(1, 10))`	Count first digits → compare to Benford → flag deviations.	Escalate big gaps → deeper checks on rated events.
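
The Benford steps ("count first digits, compare to Benford, flag deviations") can be sketched with a chi-square statistic. The choice of chi-square as the comparison and the two test samples are assumptions for illustration; 15.51 is the 5% critical value for 8 degrees of freedom.

```python
import math
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
CHI2_CRIT_8DOF_5PCT = 15.51

def first_digit(x):
    """Leading non-zero digit, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])

def benford_chi2(values):
    """Chi-square distance between observed first-digit counts and Benford's law."""
    digits = [first_digit(v) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    return sum((counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items())

natural_like = [2 ** k for k in range(100)]   # powers of two follow Benford closely
suspicious = list(range(100, 1000))           # every first digit equally common
print(round(benford_chi2(natural_like), 2), round(benford_chi2(suspicious), 1))
```

As the table notes, a large statistic is a prompt for deeper checks, not proof of fraud, and small samples make the test unreliable.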

☎️ Traffic & Capacity — “How many trunks/agents do we actually need?”

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
📦 Erlang-B (blocking, no queue)	`B(A,c)=(A^c/c!)/Σ_{k=0}^{c} A^k/k!`, traffic `A=λ/μ`	% of calls that get “busy tone” if all trunks busy.	Loss system (M/M/c/c) steady-state blocking probability.	SIP trunks for a retail contact centre.	Gold standard for trunks.	Ignores retrials/queues.	Use Erlang-C if you queue; simulate retrials.	numpy, math	`def erlang_b(A, c): import math; num = A**c/math.factorial(c); den = sum(A**k/math.factorial(k) for k in range(c+1)); return num/den`	Compute traffic A → pick trunks c → get blocking → tweak until GoS met.	Choose GoS (e.g., 1%) → iterate c → add safety for peaks.
⏳ Erlang-C (wait, with queue)	`ρ=λ/(cμ)`, `P_W` via standard Erlang-C; `ASA=P_W/(cμ−λ)`	Chance a caller waits + expected wait time.	M/M/c with infinite buffer; steady-state.	Retail care queue meeting 80/20.	Battle-tested staffing.	Arrival burstiness, skill-mismatch.	Skills-based routing; simulate; λ(t) per interval.	numpy	`# reuse earlier erlang_c; compute ASA from P_W`	Estimate λ, μ → trial c → compute wait prob & ASA → adjust.	Do per 15-min → add shrinkage/occupancy caps in WFM.
👥 Engset (finite sources)	Blocking with finite N callers (formula omitted here for brevity)	When your caller pool is small (e.g., internal helpdesk).	Loss model with finite source population.	IT service desk for 300 staff site.	More realistic than Erlang-B for small N.	Needs N estimate.	Cross-check with sim / B as bound.	custom	`# use an Engset implementation or simulate`	Estimate sources N → calls per head → compute blocking.	Sanity-check with discrete-event simulation.
📜 Little’s Law (packets, calls)	`L=λW`	Average in-system = rate × time. Simple and beautiful.	Applies across queues in steady-state.	Estimate average active calls given arrival & wait.	One-liner insight.	Not for transients.	Use windowed λ(t); exclude incident periods.	pandas	`W = L/lam`	Pick a stable window → apply the identity.	Cross-validate against measured occupancy.
⌛ M/M/1 latency (packet queue)	`W = 1/(μ−λ)`, `ρ=λ/μ`	Delay skyrockets as utilisation nears 1. Ooft.	Single-server queue with Poisson/Exp.	Edge SBC or NAT box sizing.	Back-of-envelope capacity.	Traffic not Poisson; service not Exp.	M/G/1 (Pollaczek–Khinchine), simulations.	numpy	`W = 1/(mu - lam)`	Estimate μ, λ → compute W → keep ρ well below 1.	Target ρ≤0.7 for headroom; monitor p95 latency.
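
For trunk sizing, the Erlang-B steps ("pick trunks c, get blocking, tweak until GoS met") can be sketched with the standard recursion `B(A,n) = A·B(A,n−1) / (n + A·B(A,n−1))`, which avoids the large factorials in the direct formula. The 10-Erlang load is a hypothetical example, and `trunks_for_gos` is an illustrative helper name.

```python
def erlang_b(traffic_erl, trunks):
    """Blocking probability for an M/M/c/c loss system via the stable recursion."""
    b = 1.0                                   # B(A, 0) = 1: no trunks, all calls blocked
    for n in range(1, trunks + 1):
        b = traffic_erl * b / (n + traffic_erl * b)
    return b

def trunks_for_gos(traffic_erl, gos=0.01):
    """Smallest trunk count whose blocking is at or below the grade of service."""
    c = 0
    while erlang_b(traffic_erl, c) > gos:
        c += 1
    return c

busy_hour_erlangs = 10.0                      # assumed offered traffic
needed = trunks_for_gos(busy_hour_erlangs, gos=0.01)
print(needed, round(erlang_b(busy_hour_erlangs, needed), 4))
```

The result is the bare minimum for the stated load; the evaluation column's "add safety for peaks" still applies.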

🛰️ Radio & Throughput — “Will it actually shift the bits?”

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
📡 Friis / Path loss (free space)	`PL(dB)=32.44+20log10(f_MHz)+20log10(d_km) − G_t − G_r`	How loudly the signal fades with distance/frequency.	Link budget basic; add margins & non-LOS losses in reality.	Microwave backhaul feasibility check.	Simple first pass.	Ignores obstacles & rain fade.	ITU-R models; add rain/urban clutter.	numpy	`PL = 32.44 + 20*np.log10(f) + 20*np.log10(d) - Gt - Gr`	Pick f,d → compute loss → check RX level vs sensitivity.	Add fades & margins → confirm availability target.
📶 SINR & spectral efficiency	`SINR = S/(I+N)`; `η≈f(SINR)` via MCS curve	Signal vs noise+interference → how many bits we can cram in.	Maps to modulation/coding → bits/Hz.	5G small cell planning on busy street.	Directly tied to user throughput.	Fast fading & scheduling fairness.	Use distributions (percentiles), not point SINR.	numpy	`sinr = S/(I+N)`	Measure S,I,N → compute SINR → lookup MCS → get η.	Plan for p5/p50/p95 SINR → verify drive-tests.
📡 Shannon–Hartley (upper bound)	`C = B log2(1+SNR)`	Ceiling on throughput for a clean channel.	AWGN capacity bound; reality below due to overheads.	Rough max throughput for fixed wireless.	Gives hard upper bound.	Not achievable with real protocols.	Subtract overheads; use MCS curves.	numpy	`C = B*np.log2(1+snr)`	Bandwidth × log(1+SNR) → bits per second.	Apply protocol overhead → compare to SLA.
🌐 TCP throughput (Mathis)	`T ≈ 1.22 · MSS / (RTT · √p)`	Why long RTT + a sniff of loss kills your speed tests.	TCP Reno steady-state approximation.	International transit troubleshooting.	Explains user complaints fast.	Modern CCAs differ (Cubic/BBR).	Use measured CCA; still a good sanity check.	numpy	`T = 1.22*MSS/(RTT*np.sqrt(p))`	Measure MSS/RTT/loss → compute T → set expectations.	Recommend CDN/BBR/MTU tuning if constrained.
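
The three closed-form rows above (free-space path loss, Shannon–Hartley, Mathis) can be sketched with the standard library alone. The function and parameter names are illustrative assumptions; units follow the formulas in the table (MHz, km, dB, Hz, linear SNR, bytes, seconds).

```python
import math


def fspl_db(f_mhz, d_km, g_t_db=0.0, g_r_db=0.0):
    """Free-space path loss in dB, net of antenna gains."""
    return 32.44 + 20 * math.log10(f_mhz) + 20 * math.log10(d_km) - g_t_db - g_r_db


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """AWGN capacity bound in bits/s; SNR is a linear ratio, not dB."""
    return bandwidth_hz * math.log2(1 + snr_linear)


def mathis_throughput(mss_bytes, rtt_s, loss):
    """Mathis steady-state TCP Reno estimate, in bytes/s."""
    return 1.22 * mss_bytes / (rtt_s * math.sqrt(loss))


# Hypothetical inputs, chosen only to exercise the formulas.
loss_db = fspl_db(f_mhz=1000.0, d_km=1.0)
cap = shannon_capacity_bps(bandwidth_hz=1e6, snr_linear=1.0)
tput = mathis_throughput(mss_bytes=1460, rtt_s=0.1, loss=1e-4)
```

An SNR quoted in dB needs converting first (`10 ** (snr_db / 10)`); feeding dB straight into the Shannon formula is a common slip.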

💷 Commercial & Risk — “ARPU pays the bills, churn steals your lunch.” ▾

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
📊 ARPU & CLV (simple)	`ARPU = Rev/Subs`; `CLV ≈ ARPU · m / c` with margin m, churn rate c = 1 − retention	Average revenue per user and lifetime £ value.	Steady-state approximation for planning.	Retail mobile bundle business case.	Quick portfolio health.	Ignores cohort & discounting.	Discounted CLV: `Σ_t cashflow_t / (1+d)^t` with discount rate d.	pandas, numpy	`clv = arpu*margin/churn`	Compute ARPU → apply margin → divide by churn rate.	Refine with cohort curves & discount rate.
🚪 Churn (logistic) + survival	`P(churn)=σ(wᵀx)`; Kaplan–Meier `S(t)`	Who’s likely to leave, and when.	Binary GLM + survival for tenure timing.	Flag risky SIMs; save offers.	Actionable probabilities.	Imbalance; leakage of future info.	Class weights; time-aware CV; survival features.	sklearn, lifelines	`from sklearn.linear_model import LogisticRegression; m = LogisticRegression(class_weight='balanced').fit(X, y)`	Engineer features → fit → calibrate → threshold on cost curve.	Champion-challenger → monitor drift → retrain monthly.
🔗 Least-Cost Routing (LCR)	Minimise `∑ c_{i,d} x_{i,d}` s.t. demand & quality constraints	Pick the cheapest wholesaler per destination, within QoS.	LP/MIP with capacity, A-Z breaks, quality gates.	Wholesale voice A-Z routing plan.	Direct OPEX impact.	Route flapping; QoS drift.	Add hysteresis; QoS constraints; penalties.	cvxpy, pulp	`# x[i,d] fraction via carrier i to dest d; minimise cost`	Build cost/QoS matrix → solve → generate route tables.	Monitor CDR KPIs → auto-reroute on breaches.
💱 Interconnect settlement	`Charge = ∑ minutes_d × rate_d` (+ surcharges)	Bill each destination at the agreed rate card.	Rating by A-Z prefix with time bands, surcharges.	Monthly invoice to/from carriers.	Transparent and auditable.	Prefix drift; rate card mismatches.	Versioned rate tables; prefix normalisation; diffs.	pandas	`bill = (cdr['mins']*cdr['rate']).groupby(cdr['dest']).sum()`	Join CDRs to rate card → multiply → sum per counterparty.	Reconcile vs partner; chase deltas; adjust prefixes.
🧲 Price elasticity (bundles)	`ε ≈ (ΔQ/Q)/(ΔP/P)`; log-log regression	How subs change when you tweak price.	Elasticity from ln-ln model; beware endogeneity.	Broadband + mobile bundle test.	Guides promo depth.	Confounded by offers/seasonality.	Instruments; diff-in-diff; fixed effects.	statsmodels	`import numpy as np, statsmodels.api as sm; m = sm.OLS(np.log(Q), sm.add_constant(np.c_[np.log(P), Z])).fit()`	Take logs → regress → elasticity is coef on log P.	Simulate scenarios → guardrails for pricing.
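
A rough sketch of the CLV row and its "discounted CLV" pitfall fix. It assumes the common steady-state form in which lifetime value is per-period margin divided by the churn rate (churn = 1 − retention), and a discounted version that sums expected margin over a finite horizon. All names and figures here are hypothetical.

```python
def clv_simple(arpu, margin, churn):
    """Steady-state CLV: per-period margin divided by per-period churn."""
    return arpu * margin / churn


def clv_discounted(arpu, margin, retention, discount, periods):
    """Expected margin per period, weighted by survival and discounted."""
    per_period = arpu * margin
    factor = retention / (1.0 + discount)   # survival x discount, per period
    return sum(per_period * factor ** t for t in range(periods))


# Hypothetical bundle: GBP 20 ARPU, 50% margin, 2% monthly churn.
simple = clv_simple(arpu=20.0, margin=0.5, churn=0.02)
undiscounted = clv_discounted(20.0, 0.5, retention=0.98, discount=0.0, periods=5000)
discounted = clv_discounted(20.0, 0.5, retention=0.98, discount=0.01, periods=5000)
```

With a zero discount rate and a long horizon the sum converges to the simple figure, which is a handy sanity check; any positive discount rate pulls the value below it.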

🌐 Network, Routing & Reliability — “Find the choke points before they find you.” ▾

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
🛡️ Availability (nodes/paths)	`A=MTBF/(MTBF+MTTR)`; series: `∏Aᵢ`; parallel: `1−∏(1−Aᵢ)`	Uptime of gear and combined paths.	Steady-state availability; assumes independence.	Core + access redundancy score.	Clear SLA link.	Common-mode failures spoil party.	Fault-tree analysis; dependency factors.	numpy	`A = mtbf/(mtbf+mttr)`	Get MTBF/MTTR → compute A → combine by topology.	Identify SPoFs → prioritise hardening.
🧭 PageRank / centrality	`PR(i)=(1−d)/N + d Σ PR(j)/out(j)`	Find the “king” routers/edges by influence.	Eigenvector of random-walk on graph.	Prioritise patching/monitoring targets.	Surfaces critical hubs.	Dangling nodes; directionality.	Handle dangling mass; personalise by traffic.	networkx	`import networkx as nx; pr = nx.pagerank(G, alpha=0.85)`	Build graph → run PR → sort nodes by score.	Cross-check with flow; harden top-k.
🪙 MOS / E-model (voice QoE)	`R=R₀−I_s−I_d−I_e+…`; `MOS≈1+0.035R+7×10⁻⁶·R(R−60)(100−R)`	Maps impairments to a “how it sounds” score.	E-model per ITU-T G.107; codec, delay, loss factors.	Wholesale voice route QoE benchmarking.	Single MOS for execs.	Model assumptions; non-speech effects.	Use PESQ/POLQA for lab, E-model for fleet.	numpy	`# compute R then MOS from formula`	Measure loss/jitter/latency → compute R → map to MOS.	Alert on MOS dips → reroute carriers.
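
The availability and MOS rows above reduce to a few lines of arithmetic. This is a sketch under the table's own assumptions (independent failures, R already computed); the clamping of MOS to 1 for R ≤ 0 and 4.5 for R ≥ 100 follows the usual G.107 mapping, and the function names are illustrative.

```python
from math import prod


def availability(mtbf, mttr):
    """Steady-state availability of a single element."""
    return mtbf / (mtbf + mttr)


def series(avails):
    """All elements must be up: multiply availabilities."""
    return prod(avails)


def parallel(avails):
    """At least one element must be up: 1 minus product of unavailabilities."""
    return 1.0 - prod(1.0 - a for a in avails)


def mos_from_r(r):
    """Map an E-model R factor to an estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


# Hypothetical figures: two 99% links, and the default R of 93.2.
two_in_series = series([0.99, 0.99])
two_in_parallel = parallel([0.99, 0.99])
mos = mos_from_r(93.2)
```

The series and parallel results only hold if failures are independent; a shared duct or power feed makes the parallel figure optimistic, which is the common-mode pitfall the table flags.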

🤖 ML in Telecoms — Churn, Fraud, Traffic, QoE, Routing ▾

Name	Formula (📐)	Non-technical description	Technical description	Scenario 🧪	Pros ✅	Pitfalls ⚠️	Formulas to address pitfalls	Python libs 🛠️	Python code	Formula Steps in Plain English 📝	Step-by-step evaluation 👣
🚪 Churn ML (logistic + survival)	`P(churn)=σ(wᵀx)`; survival `S(t)` via Cox/KM	Who leaves, and how soon.	Binary GLM + time-to-event model.	Flag risky broadband/mobile subs.	Actionable; clear uplift targeting.	Leakage; class imbalance.	Time-aware CV; class weights; calibration.	sklearn, lifelines	`from sklearn.linear_model import LogisticRegression; m = LogisticRegression(class_weight='balanced').fit(X, y)`	1) Engineer features. 2) Fit & calibrate. 3) Score & intervene.	Champion–challenger offers → monitor retention lift.
🕵️ CDR Fraud (IsoForest + supervised)	Isolation paths + classifier stack	Spot weird call patterns & confirm with labels.	Unsupervised pre-filter → supervised confirmer.	Wholesale A–Z fraud, IRSF, SIM box.	High recall on oddities.	Alert fatigue.	Two-stage: anomaly score → threshold → classifier.	sklearn, xgboost	`from sklearn.ensemble import IsolationForest; iso = IsolationForest().fit(X); cand = iso.decision_function(X)`	1) Score anomalies. 2) Train classifier on confirmed fraud.	Auto-block high score + human-in-loop review.
📈 Traffic Forecast (GBTs/LSTM)	Boosting or RNN as above	Predict sessions/calls with promos/outages.	Tree or seq2seq with exogenous drivers.	Contact-centre/Wi-Fi offload planning.	Great short-term accuracy.	Promo/incident drift.	Event features; rapid refits; ensembles.	xgboost, PyTorch	`# choose tree or LSTM path per KPI`	1) Build features/windows. 2) Train. 3) Validate & ensemble.	Deploy with drift monitors; auto-refresh models.
🛰️ QoE Estimation (MOS regressor)	Map KPIs → MOS via regression	Predict “how it feels” from KPIs.	Nonlinear regressor on latency/jitter/loss/codec.	Wholesale route MOS for alerts.	Fast, continuous MOS.	MOS is a proxy; lab ≠ field.	Retrain per codec/region; add quantile models.	sklearn	`from sklearn.ensemble import GradientBoostingRegressor`	1) Collect KPIs. 2) Fit MOS model. 3) Alert when predicted MOS dips.	Compare to POLQA samples; recalibrate.
🗺️ Graph ML (routing anomalies)	Node2Vec/GraphSAGE embeddings → classifier	Learn network “shape” to catch odd flows.	Graph embeddings + anomaly/cls.	Weird BGP/route changes, choke points.	Captures topology context.	Needs graph pipeline.	Automate graph ETL; rolling windows.	networkx, stellargraph, pyg	`# build G → node2vec → clf on embeddings`	1) Build graph. 2) Learn embeddings. 3) Detect anomalies.	Alert & correlate with NetFlow/SNMP.
🎮 RL for SON / Routing	Policy gradient / Q-learning rewards on QoE	Auto-tune network knobs for experience.	MDP with KPIs as reward; safe constraints.	eNodeB power/tilt, handover margins.	Adapts to local conditions.	Exploration risk on live users.	Shadow mode; constrained RL; guardrails.	stable-baselines3	`# define env; train PPO with safety constraints`	1) Sim/Shadow learn. 2) Validate. 3) Gradual live rollout.	Compare QoE uplift vs control cells; rollback if worse.

📐 Formulas without business context are the Billy-no-mates of algorithms. Full of potential, waiting for a dataset to play with 📊.

➕ Mean — baseline average (needs context)

➕ Mean (Average)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
📊 Quick “middle ground” check	Add ’em up ➕ and divide ➗	Arithmetic mean	1️⃣ Add 2️⃣ Count 3️⃣ Divide	(2,4,6) → 12 ÷ 3 = 4	⚡ Average monthly electricity bill from weekly readings

🔲 Variance — how spread out values are

🔲 Variance (Spread)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
📉 See how jumpy values are	Spread of values	Mean squared deviation	Subtract → Square → Sum → ÷n	(2,4,6) → Var≈2.67	📈 Month-to-month revenue volatility for a product line

🔗 Covariance — do two variables move together?

🔗 Covariance (Joint Movement)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🤝 Check if two things rise & fall together	Do 2 vars move in sync?	Expected joint deviation	Subtract means → Multiply → Sum → ÷n	X=(1,2,3), Y=(2,3,5) → Cov=1	🛒 Ad spend moving with weekly sales in a campaign

📏 Correlation — strength & direction (−1 to +1)

📏 Correlation (Strength of Link)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
📐 Ask: how strong is the link?	Togetherness measure	Cov ÷ (σXσY)	Cov / product of SDs	X=(1,2,3), Y=(2,3,5) → r≈0.98	📞 Call-centre volume vs website outage duration
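
The mean, variance, covariance and correlation rows can be checked by hand with a few lines of plain Python. This sketch uses the ÷n (population) form that the steps columns describe and the same toy data as the examples: (2,4,6) for mean and variance, X=(1,2,3) with Y=(2,3,5) for the joint measures. The helper names are illustrative.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Population variance: mean squared deviation (divide by n)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def covariance(xs, ys):
    """Population covariance: mean product of deviations (divide by n)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance scaled by the product of standard deviations."""
    return covariance(xs, ys) / math.sqrt(variance(xs) * variance(ys))


X, Y = [1, 2, 3], [2, 3, 5]
avg = mean([2, 4, 6])
var = variance([2, 4, 6])
cov = covariance(X, Y)
r = correlation(X, Y)
```

Dividing by n − 1 instead of n gives the sample versions; the correlation is unchanged because the factor cancels.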

📈 Linear Regression — best-fit line through points

📈 Linear Regression (Best Fit Line)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
📏 Want a line through dots	Best-fit straight line	Least squares fit	β₁=Cov/Var, β₀=ȳ−β₁x̄	ŷ=0.33+1.5x	⚙️ Predict maintenance time from machine age/usage

🧩 OLS Matrix — matrix solution to regression

🧩 Ordinary Least Squares (Matrix Form)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🧮 Need exact solution for line/plane	Matrix shortcut	(XᵀX)⁻¹Xᵀy	Build X → invert → multiply	Slope=1.5, Int=0.33	📊 Fitting BI trendlines across many features fast
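
A small numpy sketch of the matrix route, assuming the toy data X=(1,2,3), Y=(2,3,5) and an added intercept column. It solves the normal equations (XᵀX)β = Xᵀy with `np.linalg.solve` rather than forming the inverse explicitly, which gives the same answer with better numerical behaviour.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 3.0, 5.0])

# Design matrix: a column of ones for the intercept, then the feature.
X = np.column_stack([np.ones_like(x), x])

# Normal equations: (X^T X) beta = X^T y
beta = np.linalg.solve(X.T @ X, X.T @ y)
intercept, slope = beta
```

For many features or a badly conditioned X, `np.linalg.lstsq(X, y, rcond=None)` is the more robust choice.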

🎯 MSE — average squared prediction error

🎯 Mean Squared Error (Loss)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🎯 Compare models’ accuracy	Average squared miss	Quadratic loss	Square → Sum → ÷n	MSE≈0.056	🧪 A/B test: pick the model with the lower error on holdout
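
The row above does not say which data produce its example value. One reading, assumed here, is the fitted line from the regression rows (intercept 1/3, slope 1.5) scored on the points X=(1,2,3), Y=(2,3,5).

```python
def mse(y_true, y_pred):
    """Mean squared error: square the misses, sum, divide by n."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


xs = [1, 2, 3]
ys = [2, 3, 5]

# Predictions from the assumed fitted line y = 1/3 + 1.5x.
preds = [1.0 / 3.0 + 1.5 * x for x in xs]
error = mse(ys, preds)
```

Scoring on the same points used for fitting flatters the model; the scenario column's holdout set is the fairer comparison.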

🔻 Gradient Descent — iterative downhill optimiser

🔻 Gradient Descent (Optimiser)
⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🥾 When algebra too hard, walk it	Iterative optimiser	θ←θ−η∇L	Guess → Gradient → Step → Repeat	w=0→0.8→1.44	🤖 Train a forecasting model on large datasets
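
The table gives the update rule and a sequence of weights but not the loss or learning rate behind them. One setup that reproduces the sequence, assumed here purely for illustration, is L(w) = ½(w − 4)² with η = 0.2, so the gradient is w − 4.

```python
def gradient_descent(grad, w0, lr, steps):
    """Repeat w <- w - lr * grad(w), recording every iterate."""
    w = w0
    path = [w]
    for _ in range(steps):
        w = w - lr * grad(w)
        path.append(w)
    return path


# Assumed loss L(w) = 0.5 * (w - 4)^2, so dL/dw = w - 4.
path = gradient_descent(grad=lambda w: w - 4.0, w0=0.0, lr=0.2, steps=2)
long_run = gradient_descent(grad=lambda w: w - 4.0, w0=0.0, lr=0.2, steps=100)
```

Each step closes 20% of the remaining gap to the minimum at w = 4, so the iterates approach it geometrically; a learning rate above 2 would overshoot and diverge on this loss.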

🚿 SGD — online learner (one sample at a time)

🚿 Stochastic Gradient Descent (Online Learning)

⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🍺 Stream data one by one	Noisy but efficient	Update on each sample	w=0→0.6→1.56	Clickstream learning	Online ads adapting live
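
The row above lists weights without the data that produced them. A setup that happens to reproduce them, assumed here for illustration only, is a one-weight model ŷ = w·x with per-sample loss ½(w·x − y)², learning rate 0.1, and the samples (2,3) then (3,5).

```python
def sgd_pass(samples, w0, lr):
    """One pass of SGD for y_hat = w * x, updating after every sample."""
    w = w0
    path = [w]
    for x, y in samples:
        error = w * x - y      # prediction miss on this sample only
        w -= lr * error * x    # gradient of 0.5 * error^2 w.r.t. w
        path.append(w)
    return path


# Assumed stream and learning rate; not stated in the source table.
path = sgd_pass(samples=[(2.0, 3.0), (3.0, 5.0)], w0=0.0, lr=0.1)
```

Unlike batch gradient descent, each update sees a single sample, so the path is noisier and depends on the order the samples arrive in.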

📈 Logistic Regression — S-curve for probabilities

📈 Logistic Regression (Classification)

⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🔮 Want yes/no with probabilities	Sigmoid transform	ŷ=1/(1+e^−(β₀+β₁x))	β₀=−3, β₁=2, x=2 → p=0.73	Churn yes/no	Customer retention analysis

🌀 Cross-Entropy — penalises confident wrong answers

🌀 Cross-Entropy (Log Loss)

⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
⚖️ Penalise confident errors	Log-loss	−[y·log p + (1−y)·log(1−p)]	y=1, p=0.73 → ≈0.31	Classifier training	Email spam filters
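
The logistic and cross-entropy rows chain together: the sigmoid turns a linear score into a probability, and log loss scores that probability against the true label. A plain-Python sketch using the table's own numbers (β₀ = −3, β₁ = 2, x = 2, y = 1), with natural logs assumed:

```python
import math


def sigmoid(z):
    """Squash any real score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(y, p):
    """Binary cross-entropy for one example, natural log."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


p = sigmoid(-3 + 2 * 2)                 # probability of the positive class
loss = log_loss(1, p)                   # true label is 1
confident_wrong = log_loss(1, 0.01)     # same label, very wrong guess
```

Comparing `loss` with `confident_wrong` shows the penalty shape: a confident miss costs many times more than a mild one.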

🧭 Bayes — update beliefs with evidence

🧭 Bayes’ Theorem (Update Rule)

⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🧠 Update beliefs with evidence	Posterior ∝ Prior×Likelihood	P(A\|B)=P(B\|A)P(A)/P(B)	P(spam\|WIN)=95%	Spam filter	Medical diagnosis updates
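
The row above quotes a 95% posterior without its inputs. The numbers below are hypothetical, picked only because they land on the same figure: half of all mail is spam, "WIN" appears in 19% of spam and 1% of legitimate mail.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(A|B) from P(A), P(B|A) and P(B|not A) via Bayes' theorem."""
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs: P(spam)=0.5, P(WIN|spam)=0.19, P(WIN|ham)=0.01.
p_spam_given_win = posterior(prior=0.5, likelihood=0.19, likelihood_given_not=0.01)

# Same evidence with a much rarer prior.
p_rare = posterior(prior=0.01, likelihood=0.19, likelihood_given_not=0.01)
```

Dropping the prior to 1% shows how strongly the base rate drives the answer, which is the point of the medical diagnosis scenario.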

🧮 Matrix Inversion — algebra behind OLS

🧮 Matrix Inversion (OLS Step)

⏰ When to Apply	🗣️ Plain Speak	📐 Technical	🪜 Steps	✍️ Example	🌍 Scenario
🧮 When you must invert XᵀX	Feature covariance inverse	(XᵀX)⁻¹	[[14,−6],[−6,3]]/6	OLS engine	Running regression in BI tools
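
The example matrix above is consistent with a design matrix that has an intercept column plus x = (1,2,3), which gives XᵀX = [[3,6],[6,14]]. Taking that as an assumption, the 2×2 inverse can be written out with the adjugate formula, using exact fractions to avoid rounding noise.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Invert [[a, b], [c, d]] as (1/det) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


# Assumed X^T X for an intercept column plus x = (1, 2, 3).
xtx = [[3, 6], [6, 14]]
inv = inverse_2x2(xtx)
```

A determinant near zero (here it is 6) signals collinear features; that is when regularisation or a least-squares solver is safer than explicit inversion.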

🦶 note: Machine Learning put into context

The maths is already in use; ML just remembers stuff and evolves its understanding. After all, the world was flat until Aristotle (384–322 BCE) argued otherwise from observation, pointing to the Earth's curved shadow on the Moon during a lunar eclipse!