Key: Industry Sectors
Outsourcing General Commercial & IT
- Stats: descriptive, CI, A/B, precision/recall, robust outliers
- ML: regression, logistic, regularisation, PCA, Bayes
- Ops/Queues: Poisson, Erlang-C, smoothing, linear programming
- Networks & Reliability: availability, PageRank
Electricity Retail & Wholesale
- Forecasting: demand, ARIMA, Holt-Winters, weather-normalisation
- ML: GBT/LSTM load & RES forecasts, spike classifiers, anomaly detection, RL battery dispatch
- Operations & Dispatch: economic dispatch, unit commitment, storage, demand response
- Risk & Hedging: hedge cost, VaR/CVaR, GARCH, elasticity
- Network & Settlement: DC-OPF/LMPs, congestion rent, loss factors
Telecommunications Retail & Wholesale
- Demand & Forecasting: ARIMAX, Holt-Winters, fraud sanity checks
- ML: churn + survival, CDR fraud (IsoForest + supervised), traffic GBT/LSTM, QoE regression, graph ML, RL for SON
- Traffic & Capacity: Erlang-B/C, Engset, Little's Law, latency
- Radio & Throughput: path loss, SINR, Shannon, TCP throughput
- Commercial: ARPU, CLV, churn, LCR, interconnect billing
- Network & Reliability: availability, PageRank, MOS/QoE
Outsourcing Model
Stats: Descriptive, CI, A/B, Precision/Recall, Robust Outliers
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libraries | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Descriptive stats (mean/variance/std/z) | x̄ = (1/n) Σ xᵢ; s² = (1/(n−1)) Σ (xᵢ−x̄)²; z = (x−x̄)/s | Find average, spread, and how unusual a value is. | Estimators for central tendency & dispersion; z standardises units. | Average handling time (AHT) in a council helpline. | Simple baseline; quick QC. | Skew/outliers distort mean & std. | median, MAD; robust z: 0.6745(x−med)/MAD | numpy, pandas | | 1) Add values → mean. 2) Deviations → square/average → variance. 3) √variance → std. 4) (x−mean)/std → z. | Collect AHT → compute mean/std → flag \|z\|>3 → review calls. |
| Confidence interval (mean) | x̄ ± z_{α/2}·s/√n | A safety margin around the average. | Normal/t approximation to mean CI. | Estimate ticket resolve time ±5 min @95%. | Quantifies uncertainty. | Autocorrelation breaks iid assumptions. | Block bootstrap; Newey-West SEs. | scipy.stats, statsmodels | | 1) Mean. 2) Std error = std/√n. 3) Margin = z×SE. 4) Mean ± margin. | Size target MOE → compute n → sample → report CI on dashboards. |
| A/B test (means & proportions) | t = (x̄₁−x̄₂)/(s_p√(1/n₁+1/n₂)); z = (p₁−p₂)/√(p(1−p)(1/n₁+1/n₂)) | Check if a change really helped. | NHST for differences in means/props. | Old vs new IVR transfer rate. | Clear decision-gate. | Multiple tests inflate false positives. | Holm/Bonferroni; power analysis (1−β). | scipy.stats, statsmodels | | 1) Compare group means/props. 2) Compute test statistic. 3) Get p-value from dist. 4) Decide vs α. | Define KPI → randomise → run test → correct for multiplicity → decide rollout. |
| Precision / Recall / F1 | P=TP/(TP+FP), R=TP/(TP+FN), F1=2PR/(P+R) | Quality of yes/no predictions. | Confusion-matrix derived metrics. | Security alert triage (high precision on P1). | Aligns to SLA priorities. | Threshold choice changes trade-off. | PR curve; optimise Fβ for business cost. | scikit-learn | | 1) Count TP/FP/FN. 2) Compute P, R. 3) Combine into F1. | Estimate costs → pick threshold on PR curve → monitor drift. |
| Robust outlier detection (MAD) | MAD = median(\|x−med\|); zᵣ = 0.6745(x−med)/MAD | Outlier score that resists skew. | Median-based; 50% breakdown point. | Flag rogue CPU spikes or ticket times. | Works on messy ops data. | MAD=0 if flat segments. | IQR; STL detrend + MAD on residuals. | numpy, pandas | | 1) Median. 2) Absolute deviations. 3) Median of those → MAD. 4) Scale to zᵣ. | Compute zᵣ → set \|zᵣ\|>k (e.g., 3.5) → alert & suppress repeats. |
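The robust-outlier row can be sketched in a few lines. This is a minimal illustration using only the standard library; the AHT sample is hypothetical, and the 3.5 cut-off is simply the example value of k given in the table.

```python
import statistics

def robust_z_scores(values):
    """Return 0.6745 * (x - median) / MAD for every value."""
    med = statistics.median(values)
    mad = statistics.median(abs(x - med) for x in values)
    if mad == 0:
        # Flat segment: the table suggests falling back to IQR instead.
        raise ValueError("MAD is zero; use an IQR-based rule for this segment")
    return [0.6745 * (x - med) / mad for x in values]

# Hypothetical average-handling-time sample in seconds.
aht_seconds = [310, 295, 305, 300, 290, 315, 900]
scores = robust_z_scores(aht_seconds)
flagged = [x for x, z in zip(aht_seconds, scores) if abs(z) > 3.5]
print(flagged)
```

The single long call is flagged, while the ordinary calls stay well inside the threshold because the median and MAD are barely moved by it.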
ML: Regression, Classification, Regularisation, Optimisation, NLP, PCA, Bayes
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libraries | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Linear regression | β̂=(XᵀX)⁻¹Xᵀy; minimise MSE | Best-fit line through points. | OLS; BLUE under Gauss-Markov. | Forecast backlog from arrivals & staffing. | Interpretable, fast. | Multicollinearity inflates variance. | VIF; Ridge +λ‖β‖². | scikit-learn, numpy | | 1) Arrange X (inputs), y (target). 2) Compute β̂. 3) Use Xβ̂ to predict. | Split train/test → fit → residual checks → deploy & monitor. |
| Logistic regression (binary) | P(y=1\|x)=1/(1+e^{-(wᵀx+b)}); log-loss | Predict yes/no probability. | Bernoulli GLM with logit link; MLE. | Predict SLA breach risk in 24h. | Calibrated probabilities. | Class imbalance; miscalibration. | Class weights; Platt/Isotonic calibration. | scikit-learn | | 1) Weighted sum → sigmoid. 2) Output a probability between 0 and 1. | Pick threshold for business cost → calibrate → track drift. |
| Regularisation (L1/L2/ElasticNet) | Minimise ‖y−Xβ‖² + λ‖β‖² (L2) or + λ‖β‖₁ (L1) | Prevent overfitting; simplify model. | Penalised ERM adds bias, reduces variance. | Many features, few rows. | Stabilises coefficients. | Too much λ underfits. | CV to choose λ; 1-SE rule. | scikit-learn | | 1) Fit with penalty. 2) Tune λ via CV. 3) Refit on full data. | Grid search λ/α → check bias/var → lock and version. |
| Gradient descent (SGD) | θ_{t+1}=θ_t − α ∇J(θ_t) | Repeated nudges downhill. | First-order iterative optimiser. | Train classifiers on high-dim data. | Scales to big data. | Bad learning rate diverges. | Line search; Adam; schedules. | sklearn, PyTorch, TF | | 1) Start guess. 2) Compute slope. 3) Step opposite direction × α. 4) Repeat. | Monitor loss vs epochs → early stopping → save best weights. |
| TF-IDF + Cosine similarity | tfidf=tf·log(N/df); cosθ=(u·v)/(‖u‖‖v‖) | Turn text into weighted numbers; compare angle. | Sparse vectorisation; angular similarity. | Auto-route tickets ("password reset"). | Strong baseline for NLP routing. | OOV & vocabulary drift. | n-grams; hashing; sublinear tf. | scikit-learn | | 1) Count words. 2) Down-weight common terms. 3) Compare vectors by angle. | Build training corpus → vectorise → nearest-neighbour route → retrain monthly. |
| PCA (dimensionality reduction) | Σv=λv; project Z=XV_k | Compress features but keep variance. | Orthogonal projection onto top eigenvectors. | Reduce 300 KPIs to 10 drivers. | Speeds training; denoises. | Loss of interpretability. | Inspect loadings; varimax rotation. | scikit-learn, numpy | | 1) Standardise. 2) Covariance & eigenvectors. 3) Keep top k. 4) Project. | Choose k by explained variance → sanity-check loadings → use Z downstream. |
| Bayes' theorem | P(A\|B)=P(B\|A)P(A)/P(B) | Update belief with new evidence. | Posterior ∝ likelihood × prior. | Probability a ticket is security-related given keywords. | Transparent updating. | Bad priors bias results. | Empirical Bayes; hierarchical models. | numpy, scipy | | 1) Set prior. 2) Compute likelihoods. 3) Apply formula → posterior. | Estimate priors from historicals → update online → calibrate with reliability curves. |
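As a worked instance of the Bayes row, the sketch below computes the probability that a ticket is security-related given that a keyword appears. The prior and the two likelihoods are made-up illustrative numbers, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, p_evidence_given_a, p_evidence_given_not_a):
    """P(A|B) = P(B|A) P(A) / P(B), with P(B) expanded by total probability."""
    evidence = p_evidence_given_a * prior + p_evidence_given_not_a * (1 - prior)
    return p_evidence_given_a * prior / evidence

# Hypothetical inputs: 5% of tickets are security-related; the keyword
# appears in 60% of security tickets and in 2% of all other tickets.
p_security = posterior(prior=0.05,
                       p_evidence_given_a=0.60,
                       p_evidence_given_not_a=0.02)
print(round(p_security, 3))
```

With these inputs a rare class (5%) becomes the more likely explanation once the keyword is seen, which is the point of the "update belief with new evidence" description.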
Ops/Queues: Poisson & Little's Law, Erlang-C, Exponential Smoothing, Linear Programming
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libraries | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Poisson arrivals & Little's Law | P(N=k)=e^{-λt}(λt)^k/k!; L=λW | Calls arrive randomly but at a rate. | Poisson process; steady-state relationship L=λW. | Inbound calls per 5-min; infer avg queue. | Simple, insight-rich. | Overdispersion (burstiness). | Negative Binomial (Var=μ+κμ²); λ(t) by time-of-day. | scipy.stats, statsmodels | | 1) Estimate λ. 2) Choose t. 3) Use formula for k arrivals. 4) L=λW links rate & wait. | Fit λ(t) per interval → validate dispersion → apply Little's Law for WIP & SLAs. |
| Erlang-C (M/M/c) + ASA | ρ=λ/(cμ); P_W = (a^c/(c!(1−ρ)))/[Σ_{n=0}^{c−1} a^n/n! + a^c/(c!(1−ρ))]; a=λ/μ; ASA=P_W/(cμ−λ) | Predict wait probability & agents needed. | Steady-state queueing for c parallel servers. | Size a 24/7 helpline to meet 80/20 SLA. | Industry standard. | Non-Poisson arrivals; skill mismatch. | Skills routing; simulation; use λ(t). | numpy | | 1) Get λ, μ. 2) Try c. 3) Compute P_W, ASA. 4) Iterate c to hit SLA. | Forecast interval loads → compute c by interval → add shrinkage & occupancy caps. |
| Exponential smoothing (level) | ŷ_{t+1}=α y_t + (1−α) ŷ_t | Fast, adaptive short-term forecast. | EW squared-error minimiser; 0<α<1. | 15-min inbound chat forecast. | Tiny & robust. | No seasonality/trend. | Holt-Winters (add trend/seasonal). | statsmodels | | 1) Pick α. 2) New forecast = α×latest + (1−α)×old forecast. | CV α per queue → roll forecasts → compare MAPE → feed into staffing LP. |
| Linear programming (rostering) | min cᵀx s.t. Ax ≥ b, x ≥ 0 | Optimise resources under limits. | Convex optimisation with feasibility regions. | Minimise staffing cost while meeting 80/20 & occupancy. | Global optimum with proofs. | Infeasible if constraints clash. | Add slack; sensitivity ∂z/∂b. | PuLP, OR-Tools | | 1) Define variables & costs. 2) Add constraints. 3) Solve → schedule. | Encode SLAs as constraints → add shrinkage → solve & export rota. |
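The Erlang-C steps ("try c, compute P_W and ASA, iterate c to hit SLA") can be sketched as below. The function names and the example interval are assumptions for illustration. The service-level expression P(wait ≤ T) = 1 − P_W·e^{−(cμ−λ)T} is the standard M/M/c result and is not stated in the table itself.

```python
import math

def erlang_c(arrival_rate, service_rate, agents):
    """Return (P_W, ASA) for an M/M/c queue; both rates use the same time unit."""
    a = arrival_rate / service_rate                    # offered load a = lambda / mu
    rho = arrival_rate / (agents * service_rate)       # utilisation
    if rho >= 1:
        return 1.0, math.inf                           # unstable: everyone waits
    tail = a ** agents / (math.factorial(agents) * (1 - rho))
    head = sum(a ** n / math.factorial(n) for n in range(agents))
    p_wait = tail / (head + tail)
    asa = p_wait / (agents * service_rate - arrival_rate)
    return p_wait, asa

def service_level(arrival_rate, service_rate, agents, threshold):
    """Share of callers answered within `threshold` (standard M/M/c result)."""
    p_wait, _ = erlang_c(arrival_rate, service_rate, agents)
    spare = agents * service_rate - arrival_rate
    return max(0.0, 1 - p_wait * math.exp(-spare * threshold))

def min_agents(arrival_rate, service_rate, target, threshold):
    """Smallest c whose service level reaches the target."""
    agents = int(arrival_rate / service_rate) + 1
    while service_level(arrival_rate, service_rate, agents, threshold) < target:
        agents += 1
    return agents

# Hypothetical interval: 2 calls/min, 5-min handling time (mu = 0.2/min),
# target 80% answered within 20 seconds (the 80/20 SLA from the table).
needed = min_agents(2.0, 0.2, target=0.80, threshold=20 / 60)
print(needed, erlang_c(2.0, 0.2, needed))
```

The result is a raw agent count for one interval; shrinkage and occupancy caps, as the evaluation column notes, would be applied on top.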
Networks/Reliability: Availability, PageRank/Centrality
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libraries | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Availability & redundancy | A=MTBF/(MTBF+MTTR); Series: ∏Aᵢ; Parallel: 1−∏(1−Aᵢ) | Uptime of components and combinations. | Steady-state availability; assumes independence. | Data-centre uptime; redundant links justification. | Clear link to SLAs. | Common-mode failures break independence. | Dependency factoring; fault-tree analysis (FTA). | numpy, pandas | | 1) Find MTBF/MTTR. 2) Compute A. 3) Combine in series/parallel. | Map dependencies → compute path availability → identify single points of failure. |
| PageRank / eigenvector centrality | PR(i)=(1−d)/N + d Σ_{j∈In(i)} PR(j)/out(j) | Importance ranking in a network. | Stationary distribution of a random walk with teleport. | Rank routers/AD servers for patch priority. | Surfaces critical assets. | Dangling nodes; directionality. | Damping 0<d<1; personalise vector; handle dangling mass. | networkx | | 1) Start equal scores. 2) Distribute along edges. 3) Add teleport. 4) Iterate to convergence. | Build graph from CMDB → compute PR → harden top k assets first. |
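The availability formulas combine as shown in this small sketch. The MTBF/MTTR figures and the topology (two redundant links feeding one router) are hypothetical, and independence of failures is assumed, as the table states.

```python
import math

def availability(mtbf_hours, mttr_hours):
    """Steady-state availability A = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def series(availabilities):
    """All components must be up: product of A_i."""
    return math.prod(availabilities)

def parallel(availabilities):
    """At least one component must be up: 1 - product of (1 - A_i)."""
    return 1 - math.prod(1 - a for a in availabilities)

# Hypothetical path: two redundant links (each 999 h MTBF, 1 h MTTR)
# in series with a single router of the same reliability.
link = availability(999, 1)
router = availability(999, 1)
path = series([parallel([link, link]), router])
print(f"{path:.6f}")
```

The redundant pair contributes almost nothing to downtime, so the single router dominates the path figure, which is how a single point of failure shows up numerically.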
Electricity Market Model
Forecasting: Holt-Winters, ARIMA/ARIMAX, Weather Normalisation
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Holt-Winters (additive) | ŷ_{t+h} = ℓ_t + h·b_t + s_{t+h−m(k+1)}, k=⌊(h−1)/m⌋; updates with α, β, γ; seasonal period m=48 (HH) | Forecast half-hourly demand using level, trend, seasonality. | Triple exponential smoothing (ETS AAA) minimises exponentially weighted SSE. | Supplier day-ahead demand forecast for hedging. | Fast; strong baseline. | Breaks on regime shifts (price caps, storms). | Refit rolling; add exogenous vars (temp, price): ETSX. | statsmodels | | 1) Track current level, trend, seasonality. 2) Update each period. 3) Project forward h steps by adding trend and the matching seasonal term. | Train/validation split → tune α, β, γ → backtest by season → nightly refit → push to hedge calc. |
| ARIMA / ARIMAX | φ(B)(1−B)^d y_t = θ(B)ε_t + βX_t | Time-series that learns autocorrelation/shocks; can include weather/price. | Box-Jenkins with exogenous regressors; choose (p,d,q) by AIC/BIC. | Forecast IDNO/DUoS area load with temperature and price. | Flexible; interpretable. | Non-stationarity; regime changes. | Differencing; holiday dummies; rolling refit. | pmdarima, statsmodels | | 1) Make series stationary. 2) Fit AR & MA parts. 3) Add exogenous drivers. 4) Forecast & invert differencing. | Diagnose residuals → select order → backtest multiple windows → deploy & monitor drift. |
| Weather normalisation (HDD/CDD) | HDD=max(0,T_b−T), CDD=max(0,T−T_b); regression y=β₀+β₁HDD+β₂CDD+ε | Adjust demand to "typical weather" for fair KPI comparisons. | Linear model on degree-days; choose base T_b (e.g., 15.5 °C UK). | Compare this winter's usage vs typical. | Fair baselines; portfolio comparability. | Wrong base temp; microclimates. | Grid-search T_b; add site fixed-effects. | pandas, scikit-learn | | 1) Compute HDD/CDD from temps. 2) Regress demand on them. 3) Predict at "typical" HDD/CDD. | Select base → fit → report normalised demand/KPIs → update monthly. |
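A minimal sketch of the weather-normalisation steps follows. To stay dependency-free it regresses demand on HDD only (a heating-only simplification of the table's HDD + CDD model), and the temperatures, demand values and "typical" HDD are invented for illustration.

```python
def degree_days(temps_c, base_c=15.5):
    """Return (HDD, CDD) lists for daily mean temperatures."""
    hdd = [max(0.0, base_c - t) for t in temps_c]
    cdd = [max(0.0, t - base_c) for t in temps_c]
    return hdd, cdd

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (one regressor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: demand rises 4 units per heating degree-day.
daily_temps = [5.5, 10.5, 15.5, 20.5]
daily_demand = [140.0, 120.0, 100.0, 100.0]

hdd, cdd = degree_days(daily_temps)
b0, b1 = fit_line(hdd, daily_demand)

typical_hdd = 6.0                       # assumed long-run average HDD
normalised_demand = b0 + b1 * typical_hdd
print(b0, b1, normalised_demand)
```

Grid-searching the base temperature, as the pitfalls column suggests, would amount to repeating the fit for several `base_c` values and keeping the one with the smallest residual error.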
Operations & Dispatch: Economic Dispatch, Unit Commitment, Storage, DR, Load/Capacity & Diversity
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Economic dispatch (no network) | Minimise ∑ C_i(P_i) s.t. ∑P_i=D, 0≤P_i≤P_i^{max} | Choose cheapest generator mix to meet demand now. | Convex cost with KKT multiplier (system lambda). | Half-hour balancing by marginal cost stack. | Efficient given costs. | Ignores start-up/min-up-down. | Add UC constraints (MILP). | cvxpy | | 1) Sort by cost. 2) Fill to meet D. 3) Respect limits. | Validate vs actual stack → quantify savings → iterate costs. |
| Unit Commitment (UC) | Binary on/off u_{i,t}∈{0,1}, power P_{i,t}, min-up/down, ramps, start-ups. | Which plants to have on, considering start/stop costs. | MILP over time horizon; operational constraints. | DA schedule for CCGT/OCGT fleet. | Operationally realistic. | Large MILPs can be slow. | Lagrangian relax; rolling horizon; heuristics. | PuLP, OR-Tools | | 1) Decide on/off. 2) Constrain time-coupling. 3) Minimise total cost. | Solve horizon → roll forward each HH → compare to market schedule. |
| Storage arbitrage | Maximise ∑ π_t e_t s.t. SOC dynamics, charge/discharge limits, round-trip η | When to charge/discharge to earn from price spreads. | LP with state-of-charge constraints. | 2-hour battery on GB day-ahead. | Clear & solvable. | Degradation, cycle limits. | Add cycle constraints; aging models. | cvxpy | | 1) Set prices & limits. 2) Constrain SOC. 3) Optimise schedule. | Backtest → add degradation → redeploy daily. |
| Demand Response baseline & impact | Baseline BL_t=avg_{d∈D}(y_{d,t}), Impact Δ=BL−y | Estimate savings vs "what would've happened". | Matched-day/10-of-10 + adj; or ML counterfactual. | Large C&I reduces load on price signal. | Simple accounting. | Baseline inflation/erosion; rebound. | ML baselines; penalty terms for rebound. | pandas, sklearn | | 1) Build baseline. 2) Compare actual. 3) Sum verified kWh. | QC with weather/occupancy → settle payments on verified savings. |
| Load factor (LF) | LF = Ȳ / P_{max} | How "flat" your demand is (near 1 = smoother). | Average power divided by peak power over period. | Retail portfolio shaping KPI. | Intuitive, quick. | Misses sub-interval spikes. | Add 95th percentile peak, ramp rate. | pandas | | 1) Mean. 2) Max. 3) Divide. | Track LF + 95th pct & ramps for ops decisions. |
| Capacity factor (generation) | CF = (∑ P_t)/(P_{rated}·T) | How hard an asset ran vs its capability. | Time-weighted utilisation metric. | Wind farm performance vs P50. | Clear health check. | Curtailment/outages confound. | Decompose by cause codes; weather-adjust. | pandas | | 1) Sum output. 2) Divide by rated×hours. | Attribute gaps to wind vs outages vs curtailment. |
| Coincidence / Diversity factor | CF_agg = P_{max,agg} / ∑ P_{max,i}; Diversity factor = 1/CF_agg (≥ 1) | How individual peaks combine across many customers. | Portfolio peak relative to sum of individual peaks. | Size capacity/hedge at portfolio (not customer) level. | Shows natural smoothing. | Herding creates high coincidence. | Segment cohorts; stress at peak HH. | pandas | | 1) Portfolio peak. 2) Sum indiv. peaks. 3) Divide. | Use CF_agg in capacity/hedge planning; monitor by segment. |
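The load-factor and coincidence-factor rows reduce to a few list operations. The two customer profiles below are hypothetical three-interval examples, and the function names are illustrative.

```python
def load_factor(load):
    """Average load divided by peak load over the period."""
    return (sum(load) / len(load)) / max(load)

def coincidence_factor(profiles):
    """Portfolio peak divided by the sum of individual customer peaks."""
    aggregate = [sum(interval) for interval in zip(*profiles)]
    return max(aggregate) / sum(max(profile) for profile in profiles)

# Hypothetical profiles (kW per interval) for two customers who peak
# at different times of day.
customer_a = [1.0, 4.0, 2.0]
customer_b = [3.0, 1.0, 2.0]
portfolio = [a + b for a, b in zip(customer_a, customer_b)]

print(load_factor(portfolio))
print(coincidence_factor([customer_a, customer_b]))
```

Because the two peaks fall in different intervals, the portfolio peak (5 kW) is below the sum of individual peaks (7 kW), which is the natural smoothing the table refers to.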
Risk & Hedging: Elasticity, Hedge Cost-to-Serve, Mean-Variance, VaR/CVaR, GARCH, Imbalance
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Price elasticity of demand | ε ≈ (ΔQ/Q)/(ΔP/P); log-log: ln Q = α + ε ln P + … | How usage changes when price changes. | Elasticity from log regression; beware endogeneity. | TOU tariff response estimation (EV load). | Quantifies DR potential. | Price ↔ demand feedback. | IV regression; diff-in-diff. | statsmodels | | 1) Take logs. 2) Regress. 3) Coef on log P = elasticity. | Check sign/CI → simulate tariff scenarios → validate vs trial. |
| Hedge cost-to-serve | C=∑ (q_h f_h) + ∑ (d_h − q_h) s_h | Blended cost from forwards + spot top-up. | Portfolio cost under hedge ratio path. | Choose 80% day-ahead + 20% intraday. | Quantifies hedge value. | Over/under-hedge risk. | Optimise q_h via mean-CVaR. | numpy, cvxpy | | 1) Pick hedge volumes. 2) Multiply by forward. 3) Residual at spot. 4) Sum. | Stress with price scenarios → pick policy → monitor tracking error. |
| Mean-Variance (hedge portfolio) | Minimise λ wᵀΣw − μᵀw s.t. ∑w=1, w≥0 | Choose mix of forwards to balance risk/return. | Markowitz: return = −cost; variance = price risk. | Blend month/quarter/season strips. | Simple & fast. | Fat-tails break Gaussian assumptions. | Use CVaR (coherent) optimisation. | numpy, cvxpy | | 1) Estimate μ, Σ. 2) Optimise weights under constraints. | Backtest → set limits → monitor exposures & VaR. |
| VaR / CVaR (energy P&L) | VaR: quantile; CVaR: tail mean E[L \| L ≥ VaR_α] | Loss at confidence level; average of worst tail. | Quantile & tail expectation on P&L distribution. | 95% daily VaR/CVaR on hedge book. | Focuses on downside. | VaR non-coherent; model risk. | Prefer CVaR; historical/MC scenarios. | numpy, scipy | | 1) Simulate P&L. 2) Take quantile. 3) Average the tail. | Scenario set incl. spikes → set limits & capital. |
| GARCH (price volatility) | σ_t^2 = ω + α ε_{t-1}^2 + β σ_{t-1}^2 | Models time-varying volatility clustering. | Conditional heteroskedasticity for returns. | Size intraday risk limits. | Captures clustering. | Regime changes, jumps. | EGARCH/GJR; jump components. | arch | | 1) Fit to returns. 2) Get conditional σ. 3) Forecast risk. | Combine with VaR/CVaR sizing; recalibrate periodically. |
| Imbalance cash-out | Charge = Volume × ImbalancePrice | What you pay if short/long vs position. | Settlement using system price (pay-as-imbalance). | Supplier under-forecast in tight evening peak. | Clear incentive to balance. | Price spikes → large P&L hits. | Better nowcasts; intraday re-trading. | pandas | | 1) Compute delta vs notified position. 2) Multiply by imbalance price. | Attribute P&L by HH → improve forecast & hedge policy. |
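The hedge cost-to-serve and VaR/CVaR rows can be illustrated together. This is a sketch on invented numbers; the empirical VaR convention used here (take the worst (1−α) share of scenarios, VaR is the smallest loss in that tail) is one of several in use and is an assumption of this example.

```python
def hedge_cost(demand, hedge, forward, spot):
    """C = sum(q_h * f_h) + sum((d_h - q_h) * s_h) over periods h."""
    return sum(q * f + (d - q) * s
               for d, q, f, s in zip(demand, hedge, forward, spot))

def var_cvar(losses, alpha=0.95):
    """Empirical VaR and CVaR from scenario losses (positive = loss)."""
    ordered = sorted(losses)
    tail_size = max(1, round((1 - alpha) * len(ordered)))
    tail = ordered[-tail_size:]          # worst (1 - alpha) share of scenarios
    return tail[0], sum(tail) / len(tail)

# Hypothetical two-period book: MWh volumes and prices in GBP/MWh.
cost = hedge_cost(demand=[100, 120], hedge=[80, 80],
                  forward=[50, 50], spot=[60, 40])

# Hypothetical scenario losses 1..20 (GBP thousand), 90% confidence.
value_at_risk, cvar = var_cvar(list(range(1, 21)), alpha=0.90)
print(cost, value_at_risk, cvar)
```

CVaR is never below VaR at the same confidence level, since it averages the losses at or beyond the VaR point.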
Network & Settlement: DC-OPF & LMPs, Congestion Rent, Loss Factors
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DC-OPF & LMPs | Minimise cost s.t. balance; line flows F=Bθ, \|F\|≤F^{max}; LMP=λ_energy+λ_cong(+λ_loss) | Price at each node reflects energy + congestion (and losses). | Linearised (DC) OPF; dual variables give LMPs. | Compute nodal prices & congestion rents. | Captures grid constraints. | Ignores reactive/voltage (AC effects). | AC-OPF or PTDF + loss factors. | PYPOWER, pandapower | | 1) Build network model. 2) Optimise dispatch. 3) Read duals → nodal prices. | Analyse hub-node spreads → hedge congestion exposure. |
| Congestion rent | Rent = ∑_ℓ (Δπ_ℓ) · F_ℓ | Value collected due to constrained lines. | Price difference times flow, summed over lines. | Interconnector constraint revenue estimate. | Quantifies bottlenecks. | Volatile; counterflows. | Scenario DC-OPF expected flows. | numpy | | 1) Compute price diff on each line. 2) Multiply by flow. 3) Sum. | Rank lines by rent → target upgrades/hedges. |
| Loss factors (settlement) | E_delivered = E_metered × LLF (or 1−ℓ_dist) | Adjust energy for distribution losses so settlement is fair. | Apply LLF by profile class/region/time to meter reads. | Supplier settlement adjustments (e.g., Elexon LLFs). | Simple application. | Wrong MPAN→LLF mapping. | Validate joins; reconciliation vs statements. | pandas | | 1) Map MPAN to LLF. 2) Multiply reads by LLF. | Reconcile with settlement statements; investigate deltas. |
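The congestion-rent steps are simple enough to show directly. The line names, nodal prices and flows below are hypothetical; the flow is taken as positive from the sending node to the receiving node.

```python
def congestion_rent(lines):
    """Sum of (receiving price - sending price) * flow over all lines."""
    return sum((price_to - price_from) * flow
               for price_from, price_to, flow in lines.values())

# Hypothetical lines: (sending-node price, receiving-node price, flow in MW).
lines = {
    "north-south": (40.0, 55.0, 100.0),   # constrained: prices separate
    "east-west": (50.0, 50.0, 200.0),     # unconstrained: no price spread
}
rent_per_hour = congestion_rent(lines)

# Rank lines by their individual rent, largest first.
ranked = sorted(lines, key=lambda name: (lines[name][1] - lines[name][0]) * lines[name][2],
                reverse=True)
print(rent_per_hour, ranked)
```

An unconstrained line contributes nothing because its end prices are equal, so the ranking isolates the bottlenecks.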
ML in Electricity: Forecasting, Risk, Networks
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gradient Boosted Trees (STLF) | Stagewise additive model: F_m(x)=F_{m-1}(x)+ν·h_m(x) minimising loss | Many small trees, each fixes last one's mistakes. | Boosting over CARTs on features (lags, weather, calendar). | Half-hourly load forecast with temp & holiday effects. | Strong accuracy; little feature scaling needed. | Overfit if too deep/too many rounds. | Early stopping; shrinkage ν; regularisation. | xgboost, lightgbm, sklearn | | 1) Build lag & weather features. 2) Train boosted trees. 3) Stop when val error rises. | Walk-forward backtest → SHAP to sanity-check drivers → deploy nightly. |
| LSTM (sequence-to-one) | Recurrent cells: h_t,f_t,i_t,o_t gating long/short memory | Neural net that "remembers" recent patterns. | Many-to-one RNN for short-term load/RES output. | PV/wind intra-day forecast with ramps. | Catches non-linear temporal effects. | Needs data & tuning; can drift. | Dropout; early stopping; recalibration schedule. | TensorFlow, PyTorch | | 1) Windowise time series. 2) Train on past windows. 3) Predict next horizon. | Backtest rolling origin → compare vs tree baseline → ensemble. |
| Random Forest (RES output) | Bagging: average of many trees to reduce variance | Many decorrelated trees vote an answer. | RF regression on wind/PV with NWP features. | Day-ahead wind farm output P50/P90. | Stable; handles interactions. | Less sharp peaks vs boosting. | Quantile RF for P90; feature selection. | sklearn, skgarden | | 1) Build met/NWP features. 2) Fit many trees. 3) Average predictions. | Calibrate quantiles → generate P50/P90 bands. |
| XGBoost Classifier (price spike) | Boosted trees minimising log-loss | Predicts probability of a spike event. | Binary classification with class weights. | Imbalance price spike nowcast. | Probabilities for risk sizing. | Class imbalance → false alarms. | Focal loss; threshold from cost curve. | xgboost, sklearn | | 1) Label spikes. 2) Train classifier. 3) Pick threshold to balance cost. | Track precision/recall over time → retrain monthly. |
| Isolation Forest (anomalies) | Isolation via random splits; anomaly score by path length | "Outsiders" get isolated quickly. | Unsupervised anomaly detection on SCADA/load. | Detect metering or telemetry glitches. | No labels required. | Flags change-points as outliers. | Add STL detrend; use rolling training windows. | sklearn | | 1) Build feature set. 2) Fit unsupervised. 3) Score & alert top-k. | Whitelist known events; tune contamination. |
| RL for Battery Dispatch | Q-learning: Q(s,a)←(1−α)Q(s,a)+α[r+γ max_a' Q(s',a')] | Agent learns charge/discharge to maximise £. | MDP over price & SoC; discrete actions. | Day-ahead + intraday arbitrage. | Adapts to patterns. | Exploration risk; constraint handling. | Reward shaping; safety layer on SoC/DoD. | stable-baselines3, numpy | | 1) Define states/actions. 2) Train with reward=profit. 3) Constrain SoC/cycles. | Backtest vs LP benchmark → deploy with guardrails. |
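The Q-learning update in the battery-dispatch row can be shown as a single tabular step. This sketch is not a full training loop: the state encoding (price band, state-of-charge level), the action names and the reward are hypothetical, and a real agent would also need exploration and SoC guardrails as the table notes.

```python
from collections import defaultdict

ACTIONS = ("charge", "hold", "discharge")

def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply Q(s,a) <- (1 - alpha) Q(s,a) + alpha [r + gamma max_a' Q(s',a')]."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] = ((1 - alpha) * q[(state, action)]
                          + alpha * (reward + gamma * best_next))
    return q[(state, action)]

# Q-table defaults to 0.0 for unseen (state, action) pairs.
q = defaultdict(float)

# Assume the agent already values discharging a charged battery at a high price.
q[(("high_price", 1), "discharge")] = 20.0

# One step: charging at a low price costs 10 now but leads to that state.
new_value = q_update(q,
                     state=("low_price", 0), action="charge",
                     reward=-10.0, next_state=("high_price", 1))
print(new_value)
```

The negative immediate reward is outweighed by the discounted value of the next state, so the charge action picks up a positive value after one update.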
Telecoms Wholesale and Retail
Demand & Forecasting: "What's coming down the pipe?"
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Holt-Winters (additive) | ŷ_{t+h} = ℓ_t + h·b_t + s_{t+h−m(k+1)}, k=⌊(h−1)/m⌋, with α, β, γ; seasonal m=24 (hourly) | Forecast calls/data sessions with daily/weekly rhythm. | Triple exponential smoothing (ETS AAA) minimising EW-SSE. | Call centre volume by hour; store staffing. | Fast, solid baseline. | Breaks on outages, promos, price shocks. | Refit rolling; ETSX with exogenous (price, promo, outage dummies). | statsmodels | | Track level/trend/seasonal → update each period → project h steps ahead by adding trend and the matching seasonal term. | Train/validate → tune α, β, γ → backtest → publish to WFM & routing. |
| ARIMAX | φ(B)(1−B)^d y_t = θ(B)ε_t + βX_t | Lets history and drivers (price, weather, promos) steer the forecast. | Box-Jenkins with exogenous regressors. | Predict broadband tickets after a price change. | Flexible; interpretable. | Non-stationary, regime switches. | Differencing; holiday dummies; rolling refits. | pmdarima, statsmodels | | Make series steady → fit AR & MA → plug in drivers → forecast, undifference. | Residual checks → walk-forward backtest → deploy & watch drift. |
| Benford for CDR/fraud sanity | P(d)=log10(1+1/d), d∈{1..9} | First-digit test to sniff odd billing patterns. | Benford distribution on aggregated measures. | Wholesale CDR audit: spot fabricated volumes. | Quick anomaly screen. | Not proof of fraud; small samples flaky. | Complement with MAD outliers & supervised models. | numpy, pandas | | Count first digits → compare to Benford → flag deviations. | Escalate big gaps → deeper checks on rated events. |
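The Benford screen ("count first digits, compare to Benford, flag deviations") might look like the sketch below. The sample of powers of two is only a stand-in for real rated volumes, chosen because such sequences are known to follow Benford's law closely; the helper names are illustrative.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit: P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(value):
    """Leading significant digit, read from scientific notation."""
    return int(f"{abs(value):e}"[0])

def benford_gap(values):
    """Observed minus expected share for each leading digit 1..9."""
    digits = [first_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    expected = benford_expected()
    return {d: counts.get(d, 0) / len(digits) - expected[d]
            for d in range(1, 10)}

# Stand-in data: powers of two instead of real CDR volumes.
volumes = [2 ** n for n in range(1, 60)]
gap = benford_gap(volumes)
print({d: round(g, 3) for d, g in gap.items()})
```

Large positive or negative gaps on a sizeable sample are a prompt for deeper checks, not evidence of fraud on their own, as the pitfalls column warns.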
Traffic & Capacity: "How many trunks/agents do we actually need?"
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Erlang-B (blocking, no queue) | B(A,c) = (A^c/c!) / ∑_{k=0}^{c} A^k/k!, traffic A=λ/μ | % of calls that get "busy tone" if all trunks busy. | Loss system (M/M/c/c) steady-state blocking probability. | SIP trunks for a retail contact centre. | Gold standard for trunks. | Ignores retrials/queues. | Use Erlang-C if you queue; simulate retrials. | numpy, math | | Compute traffic A → pick trunks c → get blocking → tweak until GoS met. | Choose GoS (e.g., 1%) → iterate c → add safety for peaks. |
| Erlang-C (wait, with queue) | ρ=λ/(cμ), P_W via standard Erlang-C; ASA=P_W/(cμ−λ) | Chance a caller waits + expected wait time. | M/M/c with infinite buffer; steady-state. | Retail care queue meeting 80/20. | Battle-tested staffing. | Arrival burstiness, skill-mismatch. | Skills-based routing; simulate; λ(t) per interval. | numpy | | Estimate λ, μ → trial c → compute wait prob & ASA → adjust. | Do per 15-min → add shrinkage/occupancy caps in WFM. |
| Engset (finite sources) | Blocking with finite N callers (formula omitted here for brevity) | When your caller pool is small (e.g., internal helpdesk). | Loss model with finite source population. | IT service desk for a 300-staff site. | More realistic than Erlang-B for small N. | Needs N estimate. | Cross-check with sim / B as bound. | custom | | Estimate sources N → calls per head → compute blocking. | Sanity-check with discrete-event simulation. |
| Little's Law (packets, calls) | L=λW | Average in-system = rate × time. Simple and beautiful. | Applies across queues in steady-state. | Estimate average active calls given arrival & wait. | One-liner insight. | Not for transients. | Use windowed λ(t); exclude incident periods. | pandas | | Pick a stable window → apply the identity. | Cross-validate against measured occupancy. |
| M/M/1 latency (packet queue) | W = 1/(μ−λ), ρ=λ/μ | Delay skyrockets as utilisation nears 1. Ooft. | Single-server queue with Poisson/Exp. | Edge SBC or NAT box sizing. | Back-of-envelope capacity. | Traffic not Poisson; service not Exp. | M/G/1 (Pollaczek-Khinchine), simulations. | numpy | | Estimate μ, λ → compute W → keep ρ well below 1. | Target ρ≤0.7 for headroom; monitor p95 latency. |
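Trunk sizing with Erlang-B and the M/M/1 delay blow-up can be sketched as follows. The blocking probability is computed with the usual recursion B(A,0)=1, B(A,c)=A·B(A,c−1)/(c+A·B(A,c−1)), which is algebraically equivalent to the table's closed form but avoids large factorials; the traffic figures and the 1% GoS target are hypothetical.

```python
def erlang_b(traffic, trunks):
    """Blocking probability for offered traffic A (Erlangs) on c trunks."""
    blocking = 1.0                               # B(A, 0) = 1
    for c in range(1, trunks + 1):
        blocking = traffic * blocking / (c + traffic * blocking)
    return blocking

def trunks_for_gos(traffic, gos):
    """Smallest trunk count whose blocking is at or below the GoS target."""
    trunks = 1
    while erlang_b(traffic, trunks) > gos:
        trunks += 1
    return trunks

def mm1_delay(arrival_rate, service_rate):
    """Mean time in system W = 1 / (mu - lambda); requires lambda < mu."""
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable when lambda >= mu")
    return 1.0 / (service_rate - arrival_rate)

# Hypothetical busy hour: 20 Erlangs offered, 1% grade of service.
needed_trunks = trunks_for_gos(20.0, 0.01)
print(needed_trunks, erlang_b(20.0, needed_trunks))

# Delay at 70% vs 95% utilisation for a box serving 1000 packets/s.
print(mm1_delay(700.0, 1000.0), mm1_delay(950.0, 1000.0))
```

Moving from 70% to 95% utilisation multiplies the mean delay by six in this model, which is the reason for the headroom target in the evaluation column.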
Radio & Throughput: "Will it actually shift the bits?"
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Friis / Path loss (free space) | PL(dB)=32.44+20log10(f_MHz)+20log10(d_km) − G_t − G_r | How loudly the signal fades with distance/frequency. | Link budget basic; add margins & non-LOS losses in reality. | Microwave backhaul feasibility check. | Simple first pass. | Ignores obstacles & rain fade. | ITU-R models; add rain/urban clutter. | numpy | | Pick f, d → compute loss → check RX level vs sensitivity. | Add fades & margins → confirm availability target. |
| SINR & spectral efficiency | SINR = S/(I+N); η≈f(SINR) via MCS curve | Signal vs noise+interference → how many bits we can cram in. | Maps to modulation/coding → bits/s/Hz. | 5G small cell planning on busy street. | Directly tied to user throughput. | Fast fading & scheduling fairness. | Use distributions (percentiles), not point SINR. | numpy | | Measure S, I, N → compute SINR → lookup MCS → get η. | Plan for p5/p50/p95 SINR → verify drive-tests. |
| Shannon-Hartley (upper bound) | C = B log2(1+SNR) | Ceiling on throughput for a clean channel. | AWGN capacity bound; reality below due to overheads. | Rough max throughput for fixed wireless. | Gives hard upper bound. | Not achievable with real protocols. | Subtract overheads; use MCS curves. | numpy | | Bandwidth × log2(1+SNR) → bits per second. | Apply protocol overhead → compare to SLA. |
| TCP throughput (Mathis) | T ≈ 1.22 · MSS / (RTT · √p) | Why long RTT + a sniff of loss kills your speed tests. | TCP Reno steady-state approximation. | International transit troubleshooting. | Explains user complaints fast. | Modern CCAs differ (Cubic/BBR). | Use measured CCA; still a good sanity check. | numpy | | Measure MSS/RTT/loss → compute T → set expectations. | Recommend CDN/BBR/MTU tuning if constrained. |
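Three of the formulas above fit in a short sketch: free-space path loss, the Shannon bound and the Mathis TCP estimate. Function names, units in the signatures and the example link are assumptions for illustration; SNR is passed in dB and converted to a linear ratio before use.

```python
import math

def path_loss_db(freq_mhz, dist_km, gain_tx_db=0.0, gain_rx_db=0.0):
    """Free-space loss in dB, net of antenna gains."""
    return (32.44 + 20 * math.log10(freq_mhz) + 20 * math.log10(dist_km)
            - gain_tx_db - gain_rx_db)

def shannon_capacity_bps(bandwidth_hz, snr_db):
    """C = B log2(1 + SNR), with SNR converted from dB to a linear ratio."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """T ~ 1.22 * MSS / (RTT * sqrt(p)), converted from bytes/s to bits/s."""
    return 1.22 * mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

# Hypothetical checks: 1 GHz over 1 km, a 1 MHz channel at 0 dB SNR,
# and a 100 ms path with 0.01% packet loss and a 1460-byte MSS.
print(path_loss_db(1000.0, 1.0))
print(shannon_capacity_bps(1e6, 0.0))
print(mathis_throughput_bps(1460, 0.100, 0.0001))
```

On the last example the estimate is roughly 14 Mbit/s regardless of the access line speed, which is the kind of expectation-setting the Mathis row describes.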
Commercial & Risk: "ARPU pays the bills, churn steals your lunch."
| Name | Formula | Non-technical description | Technical description | Scenario | Pros | Pitfalls | Formulas to address pitfalls | Python libs | Python code | Formula Steps in Plain English | Step-by-step evaluation |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ARPU & CLV (simple) | ARPU = Rev/Subs; CLV ≈ ARPU · m / (1 − r) with margin m, retention rate r (churn = 1 − r) |
Average revenue per user and lifetime £ value. | Steady-state approximation for planning. | Retail mobile bundle business case. | Quick portfolio health. | Ignores cohort & discounting. | Discounted CLV: Σ cashflow_t / (1+d)^t with discount rate d. | pandas, numpy | |
Compute ARPU → apply margin → divide by churn rate (1 − r). | Refine with cohort curves & discount rate. |
| 🚪 Churn (logistic) + survival | P(churn)=σ(wᵀx); Kaplan–Meier S(t) |
Who's likely to leave, and when. | Binary GLM + survival for tenure timing. | Flag risky SIMs; target save offers. | Actionable probabilities. | Imbalance; leakage of future info. | Class weights; time-aware CV; survival features. | sklearn, lifelines | |
Engineer features → fit → calibrate → threshold on cost curve. | Champion-challenger → monitor drift → retrain monthly. |
| Least-Cost Routing (LCR) | Minimise ∑ c_{i,d} x_{i,d} s.t. demand & quality constraints |
Pick the cheapest wholesaler per destination, within QoS. | LP/MIP with capacity, A-Z breaks, quality gates. | Wholesale voice A-Z routing plan. | Direct OPEX impact. | Route flapping; QoS drift. | Add hysteresis; QoS constraints; penalties. | cvxpy, pulp | |
Build cost/QoS matrix → solve → generate route tables. | Monitor CDR KPIs → auto-reroute on breaches. |
| 💱 Interconnect settlement | Charge = ∑ minutes_d × rate_d (+ surcharges) |
Bill each destination at the agreed rate card. | Rating by A-Z prefix with time bands, surcharges. | Monthly invoice to/from carriers. | Transparent and auditable. | Prefix drift; rate card mismatches. | Versioned rate tables; prefix normalisation; diffs. | pandas | |
Join CDRs to rate card → multiply → sum per counterparty. | Reconcile vs partner; chase deltas; adjust prefixes. |
| 🧲 Price elasticity (bundles) | ε ≈ (ΔQ/Q)/(ΔP/P); log-log regression |
How subscriber numbers change when you tweak price. | Elasticity from ln-ln model; beware endogeneity. | Broadband + mobile bundle test. | Guides promo depth. | Confounded by offers/seasonality. | Instruments; diff-in-diff; fixed effects. | statsmodels | |
Take logs → regress → elasticity is coef on log P. | Simulate scenarios → guardrails for pricing. |
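
The interconnect settlement steps in the table above (join CDRs to the rate card, multiply, sum per counterparty) can be sketched as below. The rate card, CDR rows and carrier names are hypothetical, and longest-prefix matching is an assumption about how "rating by A-Z prefix" is resolved; time bands and surcharges are left out.

```python
from collections import defaultdict

# Hypothetical rate card: destination prefix -> rate per minute (GBP).
RATE_CARD = {"44": 0.010, "447": 0.025, "33": 0.012}

# Hypothetical CDRs: (counterparty, dialled number, billable minutes).
CDRS = [
    ("CarrierA", "447700900123", 10.0),
    ("CarrierA", "442079460000", 5.0),
    ("CarrierB", "33123456789", 20.0),
]


def rate_for(number: str, rate_card: dict) -> float:
    """Return the rate of the longest rate-card prefix that matches the number."""
    for length in range(len(number), 0, -1):
        rate = rate_card.get(number[:length])
        if rate is not None:
            return rate
    raise KeyError(f"no rate card prefix matches {number}")


def settle(cdrs, rate_card) -> dict:
    """Charge = sum of minutes x rate, accumulated per counterparty."""
    totals = defaultdict(float)
    for counterparty, number, minutes in cdrs:
        totals[counterparty] += minutes * rate_for(number, rate_card)
    return dict(totals)


charges = settle(CDRS, RATE_CARD)
for counterparty, amount in sorted(charges.items()):
    print(f"{counterparty}: GBP {amount:.2f}")
```

Raising an error on an unmatched prefix, rather than defaulting to zero, keeps prefix drift visible during reconciliation.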
๐ Network, Routing & Reliability โ โFind the choke points before they find you.โ โพ
| Name | Formula | Non-technical description | Technical description | Scenario 🧪 | Pros ✅ | Pitfalls ⚠️ | Formulas to address pitfalls | Python libs 🛠️ | Python code | Formula Steps in Plain English | Step-by-step evaluation 👣 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 🛡️ Availability (nodes/paths) | A=MTBF/(MTBF+MTTR); series: ∏Aᵢ; parallel: 1−∏(1−Aᵢ) |
Uptime of gear and combined paths. | Steady-state availability; assumes independence. | Core + access redundancy score. | Clear SLA link. | Common-mode failures spoil the party. | Fault-tree analysis; dependency factors. | numpy | |
Get MTBF/MTTR → compute A → combine by topology. | Identify SPoFs → prioritise hardening. |
| 🧭 PageRank / centrality | PR(i)=(1−d)/N + d Σ_{j→i} PR(j)/out(j) |
Find the "king" routers/edges by influence. | Principal eigenvector of a damped random walk on the graph. | Prioritise patching/monitoring targets. | Surfaces critical hubs. | Dangling nodes; directionality. | Handle dangling mass; personalise by traffic. | networkx | |
Build graph → run PR → sort nodes by score. | Cross-check with flow; harden top-k. |
| MOS / E-model (voice QoE) | R=R₀−I_s−I_d−I_e+…; MOS≈1+0.035R+7e−6·R(R−60)(100−R) |
Maps impairments to a "how it sounds" score. | E-model per ITU-T G.107; codec, delay, loss factors. | Wholesale voice route QoE benchmarking. | Single MOS for execs. | Model assumptions; non-speech effects. | Use PESQ/POLQA for lab, E-model for fleet. | numpy | |
Measure loss/jitter/latency → compute R → map to MOS. | Alert on MOS dips → reroute carriers. |
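
The availability and R-to-MOS formulas in the table above can be checked numerically with a short sketch. The MTBF/MTTR figures and the two-path topology are hypothetical, and clamping MOS to 1.0 for R ≤ 0 and 4.5 for R ≥ 100 is the usual E-model convention as I understand it, so treat it as an assumption.

```python
import math


def availability(mtbf_h: float, mttr_h: float) -> float:
    """Steady-state availability A = MTBF / (MTBF + MTTR)."""
    return mtbf_h / (mtbf_h + mttr_h)


def series(avails) -> float:
    """All components must work: product of availabilities."""
    return math.prod(avails)


def parallel(avails) -> float:
    """At least one component must work: 1 - product of unavailabilities."""
    return 1 - math.prod(1 - a for a in avails)


def r_to_mos(r: float) -> float:
    """Map an E-model R factor to MOS (clamped outside 0..100, an assumed convention)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)


# Hypothetical gear: core router and access switch in series, duplicated as two paths.
core = availability(mtbf_h=50_000, mttr_h=4)
access = availability(mtbf_h=20_000, mttr_h=8)
one_path = series([core, access])
two_paths = parallel([one_path, one_path])  # assumes independent failures

print(f"single path: {one_path:.6f}, dual path: {two_paths:.9f}")
print(f"MOS at R=80: {r_to_mos(80):.3f}")
```

The dual-path figure is only as good as the independence assumption; shared power or ducts would pull the real number down.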
๐ค ML in Telecoms โ Churn, Fraud, Traffic, QoE, Routing โพ
| Name | Formula | Non-technical description | Technical description | Scenario 🧪 | Pros ✅ | Pitfalls ⚠️ | Formulas to address pitfalls | Python libs 🛠️ | Python code | Formula Steps in Plain English | Step-by-step evaluation 👣 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 🚪 Churn ML (logistic + survival) | P(churn)=σ(wᵀx); survival S(t) via Cox/KM |
Who leaves, and how soon. | Binary GLM + time-to-event model. | Flag risky broadband/mobile subs. | Actionable; clear uplift targeting. | Leakage; class imbalance. | Time-aware CV; class weights; calibration. | sklearn, lifelines | |
1) Engineer features. 2) Fit & calibrate. 3) Score & intervene. | Champion–challenger offers → monitor retention lift. |
| 🕵️ CDR Fraud (IsoForest + supervised) | Isolation paths + classifier stack | Spot weird call patterns & confirm with labels. | Unsupervised pre-filter → supervised confirmer. | Wholesale A-Z fraud, IRSF, SIM box. | High recall on oddities. | Alert fatigue. | Two-stage: anomaly score → threshold → classifier. | sklearn, xgboost | |
1) Score anomalies. 2) Train classifier on confirmed fraud. | Auto-block high score + human-in-loop review. |
| ๐ Traffic Forecast (GBTs/LSTM) | Boosting or RNN as above | Predict sessions/calls with promos/outages. | Tree or seq2seq with exogenous drivers. | Contact-centre/Wi-Fi offload planning. | Great short-term accuracy. | Promo/incident drift. | Event features; rapid refits; ensembles. | xgboost, PyTorch | |
1) Build features/windows. 2) Train. 3) Validate & ensemble. | Deploy with drift monitors; auto-refresh models. |
| 🛰️ QoE Estimation (MOS regressor) | Map KPIs → MOS via regression | Predict "how it feels" from KPIs. | Nonlinear regressor on latency/jitter/loss/codec. | Wholesale route MOS for alerts. | Fast, continuous MOS. | MOS is a proxy; lab ≠ field. | Retrain per codec/region; add quantile models. | sklearn | |
1) Collect KPIs. 2) Fit MOS model. 3) Alert when predicted MOS dips. | Compare to POLQA samples; recalibrate. |
| 🗺️ Graph ML (routing anomalies) | Node2Vec/GraphSAGE embeddings → classifier | Learn network "shape" to catch odd flows. | Graph embeddings + anomaly detector or classifier. | Weird BGP/route changes, choke points. | Captures topology context. | Needs graph pipeline. | Automate graph ETL; rolling windows. | networkx, stellargraph, pyg | |
1) Build graph. 2) Learn embeddings. 3) Detect anomalies. | Alert & correlate with NetFlow/SNMP. |
| ๐ฎ RL for SON / Routing | Policy gradient / Q-learning rewards on QoE | Auto-tune network knobs for experience. | MDP with KPIs as reward; safe constraints. | eNodeB power/tilt, handover margins. | Adapts to local conditions. | Exploration risk on live users. | Shadow mode; constrained RL; guardrails. | stable-baselines3 | |
1) Sim/Shadow learn. 2) Validate. 3) Gradual live rollout. | Compare QoE uplift vs control cells; rollback if worse. |
๐ Formulas without business context are the Billy-no-mates of algorithms. Full of potential, waiting for a dataset to play with ๐.
โ Mean โ baseline average (needs context)
| โ Mean (Average) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| ๐ Quick โmiddle groundโ check | Add โem up โ and divide โ | Arithmetic mean | 1๏ธโฃ Add 2๏ธโฃ Count 3๏ธโฃ Divide |
(2,4,6) → 12 ÷ 3 = 4 | ⚡ Average monthly electricity bill from weekly readings |
๐ฒ Variance โ how spread out values are
| ๐ฒ Variance (Spread) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| See how jumpy values are | Spread of values | Mean squared deviation | Subtract mean → Square → Sum → ÷n | (2,4,6) → Var≈2.67 | Month-to-month revenue volatility for a product line |
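
The Mean and Variance cards can be reproduced with Python's standard `statistics` module. A small sketch using the cards' own sample (2, 4, 6); the variable names are illustrative. The card divides by n (population variance), so the n−1 sample variance is shown alongside it for contrast.

```python
import statistics

values = [2, 4, 6]  # sample data from the Mean and Variance cards

mean = statistics.fmean(values)
population_variance = statistics.pvariance(values)  # divides by n, as in the card
sample_variance = statistics.variance(values)       # divides by n - 1

# z-score of each value, using the population standard deviation
pop_std = statistics.pstdev(values)
z_scores = [(v - mean) / pop_std for v in values]

print(f"mean = {mean:.2f}")
print(f"population variance = {population_variance:.2f}")
print(f"sample variance = {sample_variance:.2f}")
print("z-scores =", [round(z, 3) for z in z_scores])
```

With only three points the two variance definitions differ noticeably, which is worth stating whenever a volatility figure is quoted.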
๐ Covariance โ do two variables move together?
| ๐ Covariance (Joint Movement) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| 🤝 Check if two things rise & fall together | Do 2 vars move in sync? | Expected joint deviation | Subtract means → Multiply → Sum → ÷n | X=(1,2,3), Y=(2,3,5) → Cov=1 | Ad spend moving with weekly sales in a campaign |
๐ Correlation โ strength & direction (โ1 to +1)
| ๐ Correlation (Strength of Link) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| Ask: how strong is the link? | Togetherness measure | Cov ÷ (σX σY) | Cov / product of SDs | X=(1,2,3), Y=(2,3,5) → r≈0.98 | Call-centre volume vs website outage duration |
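
Covariance and correlation for the sample X=(1,2,3), Y=(2,3,5) can be computed directly from their definitions. This is a plain-Python sketch with illustrative helper names; it uses the divide-by-n (population) convention that the Covariance card uses.

```python
import math

x = [1, 2, 3]
y = [2, 3, 5]


def mean(values):
    return sum(values) / len(values)


def pcov(a, b):
    """Population covariance: average product of deviations from the means."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def pearson_r(a, b):
    """Correlation = covariance divided by the product of standard deviations."""
    return pcov(a, b) / math.sqrt(pcov(a, a) * pcov(b, b))


cov_xy = pcov(x, y)
r_xy = pearson_r(x, y)

print(f"cov = {cov_xy:.3f}")
print(f"r   = {r_xy:.3f}")
```

Because the same divisor appears in the numerator and denominator, r comes out the same whether n or n−1 is used.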
๐ Linear Regression โ best-fit line through points
| ๐ Linear Regression (Best Fit Line) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| Want a line through dots | Best-fit straight line | Least squares fit | β₁=Cov(X,Y)/Var(X), β₀=ȳ−β₁x̄ | X=(1,2,3), Y=(2,3,5) → ŷ=0.33+1.5x | ⚙️ Predict maintenance time from machine age/usage |
๐งฉ OLS Matrix โ matrix solution to regression
| ๐งฉ Ordinary Least Squares (Matrix Form) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| 🧮 Need exact solution for line/plane | Matrix shortcut | (XᵀX)⁻¹Xᵀy | Build X → invert XᵀX → multiply by Xᵀy | Slope=1.5, Int=0.33 | Fitting BI trendlines across many features fast |
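
The matrix form can be worked by hand for the same sample, X=(1,2,3), Y=(2,3,5), with an intercept column added. The sketch below is plain Python and spells out the 2×2 inverse explicitly; the variable names are illustrative.

```python
x = [1.0, 2.0, 3.0]
y = [2.0, 3.0, 5.0]

# Design matrix with an intercept column: each row is [1, x_i].
X = [[1.0, xi] for xi in x]

# X^T X (2x2) and X^T y (length 2).
xtx = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(2)]

# Explicit inverse of a 2x2 matrix [[a, b], [c, d]].
(a, b), (c, d) = xtx
det = a * d - b * c
xtx_inv = [[d / det, -b / det], [-c / det, a / det]]

# beta = (X^T X)^-1 X^T y
intercept, slope = (sum(xtx_inv[i][j] * xty[j] for j in range(2)) for i in range(2))

print("X^T X =", xtx, "det =", det)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

For anything larger than a toy problem, a least-squares solver such as `numpy.linalg.lstsq` is usually preferred over forming the inverse, since XᵀX can be badly conditioned.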
๐ฏ MSE โ average squared prediction error
| ๐ฏ Mean Squared Error (Loss) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| 🎯 Compare models' accuracy | Average squared miss | Quadratic loss | Square errors → Sum → ÷n | MSE≈0.056 | 🧪 A/B test: pick the model with the lower error on holdout |
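
The MSE figure can be reproduced from the regression example, assuming it refers to the fitted line ŷ = 1/3 + 1.5x (shown rounded as 0.33 + 1.5x) on X=(1,2,3), Y=(2,3,5). A minimal sketch with an illustrative `mse` helper:

```python
def mse(actual, predicted):
    """Mean squared error: average of squared differences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


x = [1, 2, 3]
y = [2, 3, 5]

# Predictions from the fitted line; 1/3 is the unrounded intercept.
fitted = [1 / 3 + 1.5 * xi for xi in x]
# A naive baseline for comparison: always predict the mean of y.
baseline = [sum(y) / len(y)] * len(y)

fit_error = mse(y, fitted)
baseline_error = mse(y, baseline)

print(f"MSE of fitted line: {fit_error:.3f}")
print(f"MSE of mean-only baseline: {baseline_error:.3f}")
```

Comparing against a baseline gives the number some context; an MSE on its own says little about whether a model is useful.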
๐ป Gradient Descent โ iterative downhill optimiser
| ๐ป Gradient Descent (Optimiser) | |||||
|---|---|---|---|---|---|
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
| 🥾 When the algebra is too hard, walk it | Iterative optimiser | θ←θ−η∇L | Guess → Gradient → Step → Repeat | w=0→0.8→1.44 | 🤖 Train a forecasting model on large datasets |
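
The card does not state which loss or learning rate produced w = 0 → 0.8 → 1.44. One assumption that reproduces those steps is L(w) = (w − 4)² with η = 0.1, which is what this sketch uses; the function name is illustrative.

```python
def gradient_descent(grad, w0: float, learning_rate: float, steps: int):
    """Apply w <- w - learning_rate * grad(w) and return every visited value."""
    path = [w0]
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
        path.append(w)
    return path


# Assumed loss L(w) = (w - 4)^2, so the gradient is dL/dw = 2 * (w - 4).
def loss_gradient(w: float) -> float:
    return 2 * (w - 4.0)


first_steps = gradient_descent(loss_gradient, w0=0.0, learning_rate=0.1, steps=2)
long_run = gradient_descent(loss_gradient, w0=0.0, learning_rate=0.1, steps=100)

print("first steps:", [round(w, 2) for w in first_steps])
print(f"after 100 steps: {long_run[-1]:.4f}")
```

Each step closes 20% of the remaining gap to the minimum at w = 4, which is why the early jumps are large and later ones shrink.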
๐ฟ SGD โ online learner (one sample at a time)
๐ฟ Stochastic Gradient Descent (Online Learning)
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
|---|---|---|---|---|---|
| 📺 Stream data one by one | Noisy but efficient | Update on each sample | w=0→0.6→1.56 | Clickstream learning | Online ads adapting live |
๐ Logistic Regression โ S-curve for probabilities
๐ Logistic Regression (Classification)
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
|---|---|---|---|---|---|
| 🔮 Want yes/no with probabilities | Sigmoid transform | ŷ=1/(1+e^−(β₀+β₁x)) | β₀=−3, β₁=2, x=2 → p=0.73 | Churn yes/no | Customer retention analysis |
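
The logistic example (β₀ = −3, β₁ = 2, x = 2) is a one-line calculation. A minimal sketch; the function names and the extra x values are illustrative.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: squashes any real number into (0, 1)."""
    return 1 / (1 + math.exp(-z))


def churn_probability(x: float, beta0: float = -3.0, beta1: float = 2.0) -> float:
    """p = sigmoid(beta0 + beta1 * x), using the coefficients from the card."""
    return sigmoid(beta0 + beta1 * x)


p = churn_probability(2.0)
print(f"p(x=2) = {p:.2f}")

# The S-curve: probability rises smoothly as x increases.
for x in (0.0, 1.0, 1.5, 2.0, 3.0):
    print(f"x={x:>3}: p={churn_probability(x):.3f}")
```

The decision boundary (p = 0.5) sits where β₀ + β₁x = 0, which is x = 1.5 for these coefficients.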
๐ Cross-Entropy โ penalises confident wrong answers
๐ Cross-Entropy (Log Loss)
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
|---|---|---|---|---|---|
| ⚖️ Penalise confident errors | Log-loss | −[y log p+(1−y) log(1−p)] | y=1, p=0.73 → ≈0.315 | Classifier training | Email spam filters |
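
Binary cross-entropy for a single prediction can be sketched as below, using natural logarithms. The clipping constant `eps` is an assumption added to avoid log(0); it is not part of the card's formula.

```python
import math


def binary_cross_entropy(y: int, p: float, eps: float = 1e-12) -> float:
    """-[y*log(p) + (1-y)*log(1-p)] with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# The card's example: true label 1, predicted probability 0.73.
loss_card = binary_cross_entropy(1, 0.73)

# Same label, but the model is confidently wrong.
loss_confident_wrong = binary_cross_entropy(1, 0.01)

print(f"y=1, p=0.73 -> {loss_card:.3f}")
print(f"y=1, p=0.01 -> {loss_confident_wrong:.3f}")
```

The second loss is more than ten times the first, which is the "penalise confident errors" behaviour in numbers.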
๐งญ Bayes โ update beliefs with evidence
๐งญ Bayesโ Theorem (Update Rule)
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
|---|---|---|---|---|---|
| 🧠 Update beliefs with evidence | Posterior ∝ Prior×Likelihood | P(A\|B)=P(B\|A)P(A)/P(B) | P(spam\|WIN)=95% | Spam filter | Medical diagnosis updates |
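
The card gives the 95% posterior without the inputs behind it. The sketch below uses hypothetical inputs chosen so that the result matches: P(spam) = 0.5, P(WIN | spam) = 0.19 and P(WIN | not spam) = 0.01. These numbers are assumptions for illustration only.

```python
def posterior(prior: float, likelihood: float, likelihood_if_not: float) -> float:
    """Bayes' theorem for a binary hypothesis.

    P(A|B) = P(B|A) P(A) / P(B), where
    P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A)).
    """
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical inputs (not from the card).
p_spam_given_win = posterior(prior=0.5, likelihood=0.19, likelihood_if_not=0.01)
print(f"P(spam | WIN) = {p_spam_given_win:.0%}")

# Same evidence, rarer spam: the posterior drops because the prior matters.
p_with_low_prior = posterior(prior=0.05, likelihood=0.19, likelihood_if_not=0.01)
print(f"P(spam | WIN) with 5% prior = {p_with_low_prior:.0%}")
```

The second call is the usual base-rate lesson: identical evidence gives a much weaker conclusion when the prior is small.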
๐งฎ Matrix Inversion โ algebra behind OLS
๐งฎ Matrix Inversion (OLS Step)
| โฐ When to Apply | ๐ฃ๏ธ Plain Speak | ๐ Technical | ๐ช Steps | โ๏ธ Example | ๐ Scenario |
|---|---|---|---|---|---|
| 🧮 When you must invert XᵀX | Inverse of the Gram matrix (feature cross-products) | (XᵀX)⁻¹ | x=(1,2,3) with intercept: XᵀX=[[3,6],[6,14]] → inverse [[14,−6],[−6,3]]/6 | OLS engine | Running regression in BI tools |
๐ฆถ note: Machine Learning put into context
The maths are already in use; ML just remembers stuff and evolves its understanding. After all, the world was flat until Aristotle (384–322 BCE) worked it out from observation, for example the Earth's curved shadow on the Moon during a lunar eclipse!