| Component | Analogy | Technical description | Typical use case |
| --- | --- | --- | --- |
| Workspace | Your collaborative project space | Hosts notebooks, jobs, repos, and ML experiments | Organising analytics or data science projects |
| Clusters | Computing engine (like your personal AI but scalable) | Spark-based distributed compute environment (autoscaling or manual) | Run jobs, notebooks, ML models |
| Jobs | Automated tasks or scheduled workflows | DAG-based execution of notebooks, scripts, or JARs | Nightly ETL jobs, ML training pipelines |
| Notebooks | Interactive workspace for code and output | Supports Python, SQL, Scala, R, Markdown | Exploratory data analysis, prototyping |
| SQL Editor | GUI to query data tables | Uses Databricks SQL for BI-friendly interface | Business users querying curated tables |
| Delta Lake | Like a spreadsheet that remembers everything | ACID-compliant storage layer over Parquet | Reliable data lake tables with versioning |
| Unity Catalog | Your data's filing cabinet and bouncer | Central metadata & access control layer with RBAC | Secure multi-tenant access across clouds |
| Lakehouse Platform | The Databricks "big idea": warehouse + lake | Combines data lake scalability with DB-like performance | Unified platform for batch, stream, ML, and BI |
| MLflow | Your model's history, packaging, and delivery | Open-source lifecycle management for ML models | Experiment tracking, model registry, deployment |
| Repos | Built-in Git versioning | Git-backed source control for notebooks & jobs | Code collaboration and CI/CD |
| Data Explorer | Browse your tables like folders | Visual UI to inspect catalog, schemas, tables | Data discovery and governance check |
| Dashboards | Shareable reports and visuals | BI dashboard powered by SQL or notebooks | Stakeholder insights and KPIs |
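
The Jobs entry above describes "DAG-based execution". As a rough illustration of what that means, the sketch below orders a small pipeline so that every task runs only after the tasks it depends on. It uses only the Python standard library and is not the Databricks Jobs API; the task names, the `pipeline` dictionary, and the `run_order` and `run` helpers are all hypothetical, made up for this example.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical nightly pipeline (names are illustrative only).
# Each task maps to the set of tasks that must finish before it starts.
pipeline = {
    "ingest_raw": set(),
    "clean": {"ingest_raw"},
    "aggregate": {"clean"},
    "train_model": {"clean"},
    "publish_dashboard": {"aggregate", "train_model"},
}


def run_order(dag):
    """Return one valid order in which every task follows its dependencies."""
    # static_order() raises CycleError if the graph is not a DAG.
    return list(TopologicalSorter(dag).static_order())


def run(dag, actions):
    """Call each task's function in dependency order; return the names as run."""
    completed = []
    for task in run_order(dag):
        actions[task]()
        completed.append(task)
    return completed


if __name__ == "__main__":
    # Stand-in actions that only print; a real job would run a notebook or script.
    actions = {name: (lambda n=name: print(f"running {n}")) for name in pipeline}
    run(pipeline, actions)

    try:
        run_order({"a": {"b"}, "b": {"a"}})
    except CycleError:
        print("cycle detected: not a valid DAG")
```

Tasks with no dependency between them, such as `aggregate` and `train_model` here, may appear in either order; a real scheduler could run them in parallel, which this sequential sketch does not attempt.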