Shipping is step one. Operating is the career.
8 weeks. 48 hours live. A full Databricks-native LLMOps curriculum: MLflow tracing, evaluation, Vector Search, prompt registry, agent development, Mosaic AI Model Serving, Asset Bundles, CI/CD, and post-deployment monitoring. You graduate with a deployed, observable, and reproducible LLM application you can walk into any interview with.
✓ Live sessions Saturdays & Sundays · 8:00 PM – 11:00 PM WAT · Cohort 1 · Enrolling
Building an LLM app is easy. Keeping it reproducible, observable, and safe in production is where teams get stuck.
When outputs go wrong, you can't replay what happened. No call graph, no prompt version, no retrieval snapshot. Debugging becomes guesswork and your users notice first.
Prompts edited in the UI, models updated by hand, endpoints configured ad hoc. Environments drift, rollbacks are scary, and nobody can tell you what's actually running in prod.
Token spend triples overnight, hallucination rates creep, and nobody notices until the invoice or the complaint. Without evals and monitoring, you learn about problems from stakeholders.
From workspace-ready engineer to platform operator shipping production LLM applications.
Trace, evaluate, and version everything
Set up your Databricks workspace, build knowledge pipelines, trace every LLM call, evaluate quality systematically, and register prompts as versioned artifacts. By Week 4 your pipeline is fully observable.
Deploy, govern, and monitor at scale
Log agents into Unity Catalog, serve them through Mosaic AI Model Serving, define infrastructure with Asset Bundles, wire up CI/CD, and run your system with live monitoring and drift detection.
Each week you learn the pattern, then ship a piece of the platform. Click a week to see what you'll own by the end of it.
What LLMOps actually means, how it extends MLOps, and the operator mindset. Provision a Databricks workspace, configure compute, set up version-controlled notebooks, and ship your first traced LLM call end-to-end.
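In miniature, "tracing every call" means wrapping each LLM invocation so its inputs, outputs, and latency are captured as spans. The sketch below is a hypothetical stand-in for what MLflow Tracing does for you (MLflow persists spans to the tracking server; here they just land in a list), with a stubbed model call:

```python
import functools
import time

TRACE = []  # in-memory span log; MLflow Tracing persists spans to the tracking server

def traced(name):
    """Minimal illustrative stand-in for a tracing decorator."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE.append({
                "span": name,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "latency_s": round(time.perf_counter() - start, 4),
            })
            return result
        return inner
    return wrap

@traced("llm_call")
def call_llm(prompt: str) -> str:
    # Stubbed response; in the course this is a real model endpoint call.
    return f"echo: {prompt}"

call_llm("hello operator")
```

Once every call emits a span like this, "replaying what happened" is a lookup, not an archaeology dig.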
Ingest a real corpus, pick the right chunking strategy, generate embeddings, and wire up a Databricks Vector Search Index. Everything lands in governed Delta tables so your knowledge layer stays reproducible.
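"Pick the right chunking strategy" starts with the simplest one: fixed-size chunks with overlap so context isn't lost at the boundaries. A minimal sketch (character-based for clarity; the course also covers sentence-, token-, and structure-aware strategies):

```python
def chunk(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Fixed-size character chunking with overlap between adjacent chunks."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "x" * 500
pieces = chunk(doc, size=200, overlap=40)  # three chunks: 200, 200, 180 chars
```

Each chunk then gets an embedding and a row in a Delta table that the Vector Search index syncs from, so the knowledge layer can be rebuilt from source at any time.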
Instrument every LLM call, retrieval, and tool invocation with MLflow Tracing. Build an evaluation harness with curated datasets, LLM-as-judge, and custom metrics so quality is measured, not guessed.
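The shape of an eval harness is simple: a golden dataset of question/answer pairs and a judge that turns each answer into a score. The judge below is a toy keyword check standing in for an LLM-as-judge (the dataset rows and required keywords are illustrative):

```python
def keyword_judge(answer: str, required: list[str]) -> float:
    """Toy stand-in for an LLM-as-judge: fraction of required facts present.
    In practice the judge is itself an LLM scored against a rubric."""
    hits = sum(1 for k in required if k.lower() in answer.lower())
    return hits / len(required)

GOLDEN = [  # hypothetical golden dataset rows
    {"question": "Where does the knowledge layer live?",
     "answer": "Everything lands in governed Delta tables.",
     "required": ["delta", "governed"]},
    {"question": "What instruments the calls?",
     "answer": "MLflow Tracing instruments every call.",
     "required": ["mlflow", "tracing"]},
]

scores = [keyword_judge(row["answer"], row["required"]) for row in GOLDEN]
mean_score = sum(scores) / len(scores)
```

Swap the judge, grow the dataset, and gate deploys on `mean_score`; that's the difference between measured quality and guessed quality.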
Stop treating prompts as code comments. Register them as versioned artifacts, run automated prompt optimization, and A/B test prompt variants against your eval harness with rollback you can trust.
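What "versioned artifacts with rollback" buys you is easiest to see in miniature. This in-memory sketch mimics the contract a prompt registry provides (MLflow's Prompt Registry backs the same idea with the tracking server; class and method names here are illustrative):

```python
class PromptRegistry:
    """In-memory sketch of a versioned prompt store with rollback."""

    def __init__(self):
        self._versions: dict[str, list[str]] = {}
        self._live: dict[str, int] = {}

    def register(self, name: str, template: str) -> int:
        """Append a new immutable version and make it live."""
        self._versions.setdefault(name, []).append(template)
        version = len(self._versions[name])
        self._live[name] = version
        return version

    def rollback(self, name: str, version: int) -> None:
        """Point the live alias at an earlier version; nothing is deleted."""
        if not 1 <= version <= len(self._versions[name]):
            raise ValueError(f"no version {version} of {name}")
        self._live[name] = version

    def load(self, name: str) -> str:
        return self._versions[name][self._live[name] - 1]

reg = PromptRegistry()
reg.register("qa", "Answer briefly: {question}")
reg.register("qa", "Answer with citations: {question}")
reg.rollback("qa", 1)  # v2 regressed on the eval harness; flip back instantly
```

Because old versions are never mutated, "which prompt was in prod last Tuesday" has an exact answer.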
Compose a tool-calling agent, plug in managed Databricks MCP servers, and build custom MCP tools for your own systems. Log the agent using MLflow and register it in Unity Catalog with full lineage.
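The core of a tool-calling agent is a dispatch loop: if the model's reply is a structured tool call, execute the tool and feed the result back; otherwise it's the final answer. A deliberately small sketch, with a hypothetical tool and JSON as the call format (MCP servers formalize this tool contract):

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical tool; in the course, tools are exposed via MCP servers.
    return f"22C in {city}"

TOOLS = {"get_weather": get_weather}

def agent_step(model_reply: str) -> str:
    """One step of a tool-calling loop: dispatch a structured tool call,
    or treat a plain-text reply as the final answer."""
    try:
        call = json.loads(model_reply)
    except json.JSONDecodeError:
        return model_reply  # plain text: final answer
    return TOOLS[call["tool"]](**call["args"])

result = agent_step('{"tool": "get_weather", "args": {"city": "Lagos"}}')
```

Logging the agent with MLflow and registering it in Unity Catalog wraps exactly this loop in lineage: which tools, which prompt version, which model.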
Deploy your registered agent through Mosaic AI Model Serving. Configure autoscaling, route traffic, set token and rate limits, and wire up cost guardrails so production behaves predictably under load.
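A cost guardrail is, at heart, an admission check in front of the endpoint: estimate the request's token cost, reject or queue it when the budget would be exceeded. A minimal sketch (class and field names are illustrative, not a serving API):

```python
class CostGuardrail:
    """Sketch of a per-endpoint daily token budget check."""

    def __init__(self, daily_token_budget: int):
        self.budget = daily_token_budget
        self.spent = 0

    def admit(self, estimated_tokens: int) -> bool:
        """Admit the request only if it fits the remaining budget."""
        if self.spent + estimated_tokens > self.budget:
            return False  # reject or queue instead of silently overspending
        self.spent += estimated_tokens
        return True

guard = CostGuardrail(daily_token_budget=1000)
accepted = [guard.admit(400) for _ in range(3)]  # third request is refused
```

This is the logic that turns "token spend tripled overnight" into a rejected request and an alert instead of an invoice surprise.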
Define your whole system as code with Databricks Asset Bundles. Promote through dev, staging, and prod with GitHub Actions, manage secrets and permissions cleanly, and make deploys boring.
Add post-deployment monitoring, drift detection, and cost dashboards. Tighten your test pyramid for LLM workloads, then present your end-to-end system on Demo Day with traces, evals, and production metrics.
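Drift detection can start as simply as comparing a recent window of quality scores against a baseline window. The rule below is deliberately naive (production monitors run statistical tests across many metrics), but it shows the shape:

```python
from statistics import mean

def drifted(baseline: list[float], recent: list[float], tolerance: float = 0.1) -> bool:
    """Flag drift when the recent window's mean quality score falls more
    than `tolerance` below the baseline window's mean."""
    return mean(recent) < mean(baseline) - tolerance

baseline_scores = [0.90, 0.88, 0.92, 0.91]  # eval scores at launch
recent_scores = [0.72, 0.70, 0.75, 0.71]    # last 24h of sampled traffic
alert = drifted(baseline_scores, recent_scores)
```

Wire the flag to an alert and a dashboard, and you learn about quality regressions from your monitors, not your stakeholders.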
A Databricks-native stack for LLM applications, paired with the open tooling that surrounds it.
Databricks Workspace, Unity Catalog, Delta tables, Clusters & compute
MLflow Tracing, MLflow Evaluation, LLM-as-judge, Golden datasets
MLflow Prompt Registry, Prompt optimization, Versioning & rollback
Databricks Vector Search, Embedding models, Chunking pipelines, Delta sync
Tool calling, Managed MCP servers, Custom MCP apps, Agent logging
Mosaic AI Model Serving, Autoscaling, Rate limits, Cost guardrails
Databricks Asset Bundles, GitHub Actions, Environment promotion, Secrets
Production monitoring, Drift detection, Cost dashboards, Alerting
Unit & integration tests, Eval pipelines, LLMOps vs MLOps testing patterns
Live instruction, production builds, reference architecture, and an operator community. One price. Everything in.
Feedback from engineers who ran LLMOps playbooks inside real teams.
"Week 3 alone paid for the cohort. MLflow Tracing gave us a call graph we'd been reverse-engineering from logs for months. Root-cause time on hallucinations dropped from hours to minutes."
"Asset Bundles changed how our team ships. No more click-ops in the workspace — every prompt, endpoint, and job is code. Our staging environment actually mirrors prod now."
"The Prompt Registry module was the missing piece. We stopped arguing over which prompt was in prod. Versioning, rollback, and A/B against the eval harness is now the standard workflow."
"I'd built agents before. I'd never properly served them. Week 6 walked me through Mosaic AI Serving with autoscaling and cost limits, and our first prod agent has been running for eight weeks without a page."
"The evaluation playbook is what we use across every LLM project now. Golden datasets, LLM-as-judge, CI gates — it's how we give stakeholders a number they can trust."
"I joined as a DevOps engineer trying to understand what my ML team actually needed. I left owning our LLMOps platform. The cohort conversations alone were worth the price."
Practising AI Platform Engineers
The LLMOps Bootcamp is led by SoftBricks Academy professionals — engineers who operate LLM applications on Databricks for real clients every week. The curriculum is distilled from production engagements: platforms we've architected, incidents we've debugged, and evaluation harnesses we've used to defend quality in front of stakeholders. Every module comes with the patterns, templates, and guardrails we use in our own work.
LLMOps is a platform discipline. This cohort is built for people ready to own the whole system.
No upsells. No locked modules. Everything you need to ship and operate LLM applications on Databricks.
✓ Secure checkout · EMI available · Invoice on request
Enroll now and get setup access before Cohort 1 kicks off — provision your workspace, clone the repo, and walk into Week 1 already oriented. The earlier you start, the further you go.
Enroll Now — $887
What operators usually ask before joining.
Plan for 10–15 hours per week. That includes 6 hours of live sessions (two 3-hour sessions on Saturday and Sunday) plus 4–9 hours on the weekly build and self-study. The builds are where the operator instincts get wired in — don't skip them.
No. You need solid Python, comfort with APIs and version control, and basic familiarity with GenAI use cases. We teach the operator discipline from the ground up in Week 1. If you've built an LLM prototype once, you're ready.
A free Databricks trial is enough to follow every build in the course. We walk through workspace setup, compute, and permissions on day one so no one gets stuck on infrastructure. If your employer already has Databricks, even better — you'll be applying the patterns directly to your job.
All sessions are recorded and available within 24 hours. You also get unlimited re-attendance for future cohorts at no extra cost. Life happens — the program is designed for real working engineers.
Agentic AI is about building the systems — how to architect agents, memory, RAG, and multi-agent workflows. LLMOps is about operating them — how to trace, evaluate, deploy, serve, govern, and monitor LLM applications at production quality on Databricks. The two bootcamps are complementary: build with Agentic AI, operate with LLMOps.
Yes. EMI is available at checkout. Employer sponsorship and invoicing also available — email academy@softbricks.ai and we'll send the paperwork.
Live sessions are every Saturday and Sunday, 8:00 PM – 11:00 PM WAT. All sessions are recorded for anyone who can't attend live. The community and async channels are active 24/7.
No bootcamp can promise a job — anyone who does is lying. What you get here is what most AI hires are missing: a deployed, traced, evaluated, and monitored LLM application you can walk through live, plus operator-level fluency that shows up the moment you open your laptop in an interview. That combination is what gets people hired.