Databricks Components Overview

Component	Non-Technical Description 📘	Technical Description ⚙️	Use Case/Scenario 🎯
Workspace	Your collaborative project space	Hosts notebooks, jobs, repos, and ML experiments	Organising analytics or data science projects
Clusters	Computing engine (like a personal computer, but scalable)	Spark-based distributed compute environment (autoscaling or fixed-size)	Running jobs, notebooks, and ML model training
Jobs	Automated tasks or scheduled workflows	DAG-based execution of notebooks, scripts, or JARs	Nightly ETL jobs, ML training pipelines
Notebooks	Interactive workspace for code and output	Supports Python, SQL, Scala, R, Markdown	Exploratory data analysis, prototyping
SQL Editor	GUI to query data tables	Runs queries on Databricks SQL through a BI-friendly interface	Business users querying curated tables
Delta Lake	Like a spreadsheet that remembers everything	ACID-compliant storage layer over Parquet	Reliable data lake tables with versioning
Unity Catalog	Your data’s filing cabinet and bouncer	Central metadata & access control layer with RBAC	Secure, governed data access shared across workspaces
Lakehouse Platform	The Databricks "big idea": warehouse + lake	Combines data lake scalability with data-warehouse-style performance and reliability	Unified platform for batch, stream, ML, and BI
MLflow	Your model's history, packaging, and delivery	Open-source lifecycle management for ML models	Experiment tracking, model registry, deployment
Repos	Built-in Git versioning	Git-backed source control for notebooks & jobs	Code collaboration and CI/CD
Data Explorer (now Catalog Explorer)	Browse your tables like folders	Visual UI to inspect catalogs, schemas, and tables	Data discovery and governance checks
Dashboards	Shareable reports and visuals	BI dashboards built from SQL queries or notebook outputs	Stakeholder insights and KPIs
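
The "DAG-based execution" in the Jobs row can be made concrete with a small sketch. This is not Databricks code and calls no Databricks API: it is a plain-Python toy that takes a job definition shaped roughly like a multi-task job payload (the `task_key`, `depends_on`, `notebook_task`, and `spark_python_task` field names are assumed from the Jobs API; the job name and every path are hypothetical) and works out which tasks could run in which order.

```python
from graphlib import TopologicalSorter

# Hypothetical nightly ETL job. Field names mimic a multi-task job payload;
# the job name and all paths are invented for illustration.
nightly_etl = {
    "name": "nightly_etl",
    "tasks": [
        {"task_key": "ingest",
         "notebook_task": {"notebook_path": "/Repos/demo/ingest"}},
        {"task_key": "clean",
         "depends_on": [{"task_key": "ingest"}],
         "notebook_task": {"notebook_path": "/Repos/demo/clean"}},
        {"task_key": "aggregate",
         "depends_on": [{"task_key": "clean"}],
         "notebook_task": {"notebook_path": "/Repos/demo/aggregate"}},
        {"task_key": "train_model",
         "depends_on": [{"task_key": "clean"}],
         "spark_python_task": {"python_file": "dbfs:/demo/train.py"}},
        {"task_key": "publish",
         "depends_on": [{"task_key": "aggregate"}, {"task_key": "train_model"}],
         "notebook_task": {"notebook_path": "/Repos/demo/publish"}},
    ],
}


def dependency_graph(job):
    """Map each task key to the set of task keys it depends on."""
    return {
        task["task_key"]: {dep["task_key"] for dep in task.get("depends_on", [])}
        for task in job["tasks"]
    }


def parallel_stages(job):
    """Group tasks into stages; tasks in one stage have no unmet dependencies."""
    sorter = TopologicalSorter(dependency_graph(job))
    sorter.prepare()  # raises graphlib.CycleError if the tasks do not form a DAG
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make output stable
        stages.append(ready)
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(parallel_stages(nightly_etl), start=1):
        print(f"stage {number}: {', '.join(stage)}")
```

In this toy, `aggregate` and `train_model` land in the same stage because both depend only on `clean`, which is the kind of parallelism a DAG-based scheduler can exploit. A real job would also carry cluster settings, schedules, and retry policies, all omitted here.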