Module 7 · Semantic Layer

Build a Fabric Ontology over Your Data

Tables and columns describe how data is stored. An ontology describes what your data means - the entities (Patient, Train, Region), the relationships between them (a Train serves a Region; a Patient is admitted to a Hospital), and the hierarchies that flow through both. This module gives every downstream agent and Copilot experience a shared semantic vocabulary.

25 minutes
🎯 Goal: a graph the agents can reason over
🛠 Fabric Ontology · OneLake

Learning Objectives

Why Add an Ontology?

Up to this point you've built a complete data stack: a real-time Eventhouse, a Lakehouse, a Direct Lake semantic model. Every consumer - dashboards, reports, agents, Copilot - still has to know which table holds which fact and how to join them. That overhead is paid on every new question.

A Fabric Ontology removes that overhead. You define the conceptual model once - "a Patient has Vitals, is admitted to a Hospital, and lives in a Region. A Train runs on a Line and serves Regions" - and bind each entity to the tables that already exist in OneLake. Every Data Agent, M365 Copilot prompt, and Foundry orchestrator you build downstream then speaks the same vocabulary.

💡
The product angle. When you ship UrbanPulse to a second customer, the ontology is the asset you reuse verbatim. The customer's tables can have different names, schemas, or storage locations - only the ontology bindings change. The conceptual model and every agent grounded on it stay identical. This is how a one-off implementation becomes a repeatable product.
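The split between a reusable conceptual model and per-customer bindings can be sketched in plain Python. This is an illustration only, not the Fabric Ontology API: `EntityType`, `Binding`, `physical_column`, and the second customer's table and column names (`pt_master`, `PT_ID`, `DX_TEXT`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EntityType:
    """Conceptual definition: identical for every customer."""
    name: str
    identity: str
    properties: list[str]


@dataclass
class Binding:
    """Physical location: the only part that changes per customer."""
    source: str  # free-form label for the store and table
    columns: dict[str, str] = field(default_factory=dict)  # property -> column


PATIENT = EntityType("Patient", "patient_id",
                     ["age", "gender", "condition", "diagnosis_code"])

# Lab environment: column names match the property names, so no mapping.
lab_bindings = {"Patient": Binding("Eventhouse: HospitalVitals")}

# Hypothetical second customer with different table and column names.
customer_b_bindings = {
    "Patient": Binding("Lakehouse: pt_master",
                       {"patient_id": "PT_ID", "condition": "DX_TEXT"}),
}


def physical_column(entity: EntityType, binding: Binding, prop: str) -> str:
    """Resolve a conceptual property to the column that stores it."""
    if prop != entity.identity and prop not in entity.properties:
        raise KeyError(f"{entity.name} has no property {prop!r}")
    return binding.columns.get(prop, prop)  # unmapped: same name as property


print(physical_column(PATIENT, lab_bindings["Patient"], "condition"))         # condition
print(physical_column(PATIENT, customer_b_bindings["Patient"], "condition"))  # DX_TEXT
```

Anything written against `PATIENT` stays unchanged between the two customers; only the binding dictionary differs.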

The UrbanPulse Ontology You'll Build

Below is the entity-relationship sketch for this lab. It spans both customer domains (health, transit) plus a shared Region entity that stitches them together. That shared entity is what enables cross-domain questions in Module 9.

[Diagram: the UrbanPulse ontology]
🏥 HEALTH: Patient (id, age, condition) · Hospital (id, region) · Vitals (time-series, linked to Patient)
🚆 TRANSIT: Train (id, line, status) · Line (Red, Blue, Green) · Telemetry (time-series, linked to Train)
SHARED ACROSS DOMAINS: Region (id, name, bbox)
Relationships: admitted_to · on_line · in_region · serves_region
Figure 7.1 - The UrbanPulse ontology. Solid borders are entity types, dashed boxes are domains, and the purple Region entity is the shared anchor that makes Module 9's cross-domain queries possible.

Step 1 - Create the Ontology Item

  1. Open your workspace

    From the Fabric portal, navigate back to urbanpulse-{your-user-id}. By now you should have: a Lakehouse, a Mirrored DB, an Eventhouse, a Real-Time Dashboard, and a Direct Lake semantic model.

  2. Create a new Ontology

    Click + New item → search "Ontology" → pick Ontology (preview).

    Name it ont_urbanpulse, then click Create.

    📷
    SCREENSHOT NEEDED
    module-7/01-create-ontology.png
    "New Ontology" item creation dialog with name ont_urbanpulse
    Ontology item vs. semantic model. A Power BI semantic model (Module 6) is tabular and tuned for measures + reports. A Fabric Ontology is a graph, tuned for entity-centric questions and AI grounding. They coexist and complement each other.

Step 2 - Define Entity Types & Bind to OneLake

An entity type is "the conceptual thing" plus a binding to where its data actually lives.

  1. Add the Patient entity

    In the Ontology editor, click + Add entity type. Configure:

    • Name: Patient
    • Display name: Patient
    • Source: Eventhouse eh_urbanpulse_rti → HospitalVitals
    • Identity property: patient_id
    • Other properties: age, gender, condition, diagnosis_code
    📷
    SCREENSHOT NEEDED
    module-7/02-entity-patient.png
    Entity type editor for Patient with binding to HospitalVitals KQL table
  2. Add Train the same way

    Repeat the pattern with this binding:

    Entity | Source                      | Identity | Properties
    Train  | Eventhouse → TrainTelemetry | trainId  | line, status
  3. Add the cross-domain entities

    Two entities don't come from the streaming KQL tables. They're logical, bound to existing tables already in the Lakehouse from Module 2:

    • Hospital → Lakehouse lh_urbanpulse_bronze → mirrored SQL table Hospitals
    • Region → Lakehouse regions table (mirrored from Cosmos)
    Both tables already exist in lh_urbanpulse_bronze from Module 2's mirror + shortcut steps - no extra setup required. If you skipped Module 2, the rest of the ontology still works without these two entities.

Validation

The Ontology editor's left rail should now show 4 entity types: Patient, Train, Hospital, Region. Click any one to see its properties and source binding.
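As a paper check alongside the editor, the four definitions can be written down as plain data and run through a small validator. This is a minimal sketch, not a Fabric API. The `entities` structure and `validate` function are hypothetical, and the `id`, `region`, `name`, and `bbox` fields for Hospital and Region are assumed from the ontology diagram, since Step 2 does not list them.

```python
EXPECTED = {"Patient", "Train", "Hospital", "Region"}

entities = {
    "Patient": {
        "source": "Eventhouse: HospitalVitals",
        "identity": "patient_id",
        "properties": ["age", "gender", "condition", "diagnosis_code"],
    },
    "Train": {
        "source": "Eventhouse: TrainTelemetry",
        "identity": "trainId",
        "properties": ["line", "status"],
    },
    # Identity and properties below are assumptions taken from the diagram.
    "Hospital": {
        "source": "Lakehouse lh_urbanpulse_bronze: Hospitals",
        "identity": "id",
        "properties": ["region"],
    },
    "Region": {
        "source": "Lakehouse lh_urbanpulse_bronze: regions",
        "identity": "id",
        "properties": ["name", "bbox"],
    },
}


def validate(defined: dict) -> list[str]:
    """Return a list of problems; an empty list means the checklist passes."""
    problems = []
    for missing in sorted(EXPECTED - set(defined)):
        problems.append(f"missing entity type: {missing}")
    for name, spec in defined.items():
        if not spec.get("source"):
            problems.append(f"{name}: no source binding")
        if not spec.get("identity"):
            problems.append(f"{name}: no identity property")
    return problems


print(validate(entities))  # []
```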

Step 3 - Define Relationships

This is where an ontology earns its keep. You're declaring the meaningful connections between entities so a downstream agent can traverse them without having to know SQL joins.

  1. Add the same-domain relationships

    In the Ontology editor, click + Add relationship and create these two:

    From    | Name        | To       | Cardinality
    Patient | admitted_to | Hospital | many-to-one
    Train   | on_line     | Line     | many-to-one

    If you skipped Line as an entity, store the line as a string property on Train instead.
  2. Add the cross-domain relationships

    These cross-domain relationships are what enable Module 9 to answer questions spanning hospitals and transit without bespoke SQL:

    From     | Name          | To     | Why it matters
    Hospital | in_region     | Region | "How's hospital load in this region?"
    Line     | serves_region | Region | "Are trains for this region delayed?"
    📷
    SCREENSHOT NEEDED
    module-7/03-relationships.png
    Ontology relationships pane showing the two cross-domain edges into Region
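To see what "traversing without knowing SQL joins" can look like, the four relationships above can be treated as a small directed graph and searched breadth-first. This is a minimal sketch in plain Python, not how Fabric or a Data Agent resolves queries internally; `RELATIONSHIPS` and `find_path` are hypothetical names.

```python
from collections import deque

# (from, relationship name, to), exactly as declared in Step 3.
RELATIONSHIPS = [
    ("Patient", "admitted_to", "Hospital"),
    ("Train", "on_line", "Line"),
    ("Hospital", "in_region", "Region"),
    ("Line", "serves_region", "Region"),
]


def find_path(start, goal, relationships=RELATIONSHIPS):
    """Breadth-first search that follows relationships in their declared direction."""
    adjacency = {}
    for src, name, dst in relationships:
        adjacency.setdefault(src, []).append((name, dst))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for name, dst in adjacency.get(node, []):
            if dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [(node, name, dst)]))
    return None  # no directed path exists


print(find_path("Patient", "Region"))
# [('Patient', 'admitted_to', 'Hospital'), ('Hospital', 'in_region', 'Region')]
print(find_path("Train", "Region"))
# [('Train', 'on_line', 'Line'), ('Line', 'serves_region', 'Region')]
print(find_path("Patient", "Train"))
# None
```

Patient and Train have no direct path between them; both reach Region, which is why the shared entity is the meeting point for cross-domain questions.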

What You Just Built

💡
Why this changes Module 8. Without an ontology, the Hospital agent in Module 8 needs to be told "the table is HospitalVitals, the patient column is patient_id, condition is in column...". With the ontology, you point the agent at ont_urbanpulse and declare "ground on Patient + Vitals." The agent's instructions become shorter, more accurate, and portable across customers.
