
Data Setup (Bring Your Own Tenant)

The lab normally runs against pre-provisioned Azure resources and a shared Fabric workspace. If you do not have that pre-provisioned tenant (for example, you are running the lab on your own Azure subscription after the event), use this page to stand up every Azure resource and seed every dataset yourself.

When to use this. Only follow this page if you are self-provisioning. If your coach gave you a tenant with resources already created, skip straight to Module 0 · Setup & Environment Check and use the credentials there.

Prerequisites

💡
All seed data is deterministic, synthetic, and PII-free. The shared IDs (H-1..H-5, W-1..W-30, R-NORTH..R-SOUTH, Train-Red-1...) line up across every layer so cross-domain joins in M2/M5/M6/M7/M9 return real rows.
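
To make "line up across every layer" concrete, here is a minimal Python sketch that builds two of the ID ranges named above and flags references that would not join. The helper names and the sample references are hypothetical; only the H-1..H-5 and W-1..W-30 formats come from this page.

```python
def id_range(prefix: str, count: int) -> list[str]:
    """Return prefix-1 .. prefix-count, for example H-1 .. H-5."""
    return [f"{prefix}-{n}" for n in range(1, count + 1)]


def dangling(references: list[str], known: list[str]) -> list[str]:
    """Return the references that do not resolve to a known ID, in input order."""
    known_set = set(known)
    return [ref for ref in references if ref not in known_set]


hospital_ids = id_range("H", 5)   # H-1 .. H-5
ward_ids = id_range("W", 30)      # W-1 .. W-30

# Hypothetical references, as they might appear in another layer of the lab.
sample_refs = ["H-1", "H-5", "H-6", "W-30", "W-31"]
print(dangling(sample_refs, hospital_ids + ward_ids))  # ['H-6', 'W-31']
```

An empty result means every reference would find a matching row in a join.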

What you are creating

Four Azure resources in one resource group, plus a Fabric workspace. Each maps to a part of the lab:

Resource                      Holds                                              Used in
Azure SQL Database            Hospitals, Wards, Staff (5 / 30 / 150 rows)        M2 mirror
Azure Cosmos DB (NoSQL)       regions, trainRoutes (5 / 3 docs)                  M2 shortcut
Azure Storage (Blob)          facility_catalog.parquet + 5 facility photos       M2 Files upload
Azure Event Hubs namespace    medicalvitals, medicalmovement, metrotrain         M3 streaming
Fabric workspace              Lakehouse, Eventstream, KQL DB, reports, agents    M1 onward

Step 1 · Create a resource group

Sign in and create one resource group to hold everything. Pick a region close to you and use it consistently.

az login

# pick your own names and region
$rg     = "rg-urbanpulse-lab"
$region = "eastus2"

az group create --name $rg --location $region

Step 2 · Azure SQL Database + seed

Create a logical SQL server and a database, open the firewall to your client, then run the bundled setup script that creates the schema, seeds the data, and enables Change Tracking for mirroring.

  1. Create the server and database:
    $server = "urbanpulse-$(Get-Random)"   # must be globally unique
    $db     = "UrbanPulse"
    $sqlUser = "labadmin"
    $sqlPass = ""   # set a strong password here before running; Azure SQL rejects an empty one
    
    az sql server create --name $server --resource-group $rg --location $region `
        --admin-user $sqlUser --admin-password $sqlPass
    
    az sql db create --resource-group $rg --server $server --name $db `
        --service-objective S0
    
    # allow your client IP through the firewall
    $ip = (Invoke-RestMethod https://api.ipify.org)
    az sql server firewall-rule create --resource-group $rg --server $server `
        --name AllowMyClient --start-ip-address $ip --end-ip-address $ip
  2. Download the setup script and run it against your database. It is idempotent, so it is safe to re-run.
    sqlcmd -S "$server.database.windows.net" -d $db -U $sqlUser -P $sqlPass `
        -i urbanpulse-sql-setup.sql

Download the SQL setup script: urbanpulse-sql-setup.sql (idempotent). It contains the schema, the seed data for Hospitals, Wards, and Staff, and the Change Tracking setup, ready to mirror.

Managed identity for mirroring. To mirror this database into Fabric in M2, the SQL server needs a system-assigned managed identity. In the Azure portal open your SQL server (the logical server, not the database) → Security → Identity, set System assigned managed identity to On, and click Save.
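
The admin password you choose for the SQL server has to satisfy the Azure SQL password policy, otherwise the server create call fails. The following is a rough pre-flight check in Python, based on the policy as I understand it (8 to 128 characters, characters from at least three of four categories, and no three consecutive characters of the login name). The function name is hypothetical and the server remains the final authority.

```python
import string


def password_problems(password: str, login: str) -> list[str]:
    """Return a list of likely policy violations; an empty list means no problem was found."""
    problems = []
    if not 8 <= len(password) <= 128:
        problems.append("length must be 8-128 characters")

    # Count how many of the four character categories are present.
    categories = [
        any(c in string.ascii_uppercase for c in password),
        any(c in string.ascii_lowercase for c in password),
        any(c in string.digits for c in password),
        any(not c.isalnum() for c in password),
    ]
    if sum(categories) < 3:
        problems.append("needs characters from at least three of: upper, lower, digit, symbol")

    # Reject any three consecutive characters of the login name (case-insensitive).
    lowered, login_l = password.lower(), login.lower()
    if any(login_l[i:i + 3] in lowered for i in range(len(login_l) - 2)):
        problems.append("must not contain three or more consecutive characters of the login name")
    return problems


# An empty password fails both the length and the category checks.
print(password_problems("", "labadmin"))
```

Run it with your own candidate password and the login you picked for $sqlUser before calling az sql server create.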

Step 3 · Azure Cosmos DB + seed

Create a Cosmos DB (NoSQL) account, then load the regions and trainRoutes documents with the bundled Python loader. Download the bundle below; it contains the loader plus the two JSON data files. The loader creates the database and both containers if they are missing and upserts every document, so it is safe to re-run.

Download the Cosmos data + loader: urbanpulse-cosmos-data.zip. The bundle contains load_cosmos.py plus the regions.json (5 regions) and trainRoutes.json (3 train routes) seed files.

  1. Create the account:
    $cosmos = "urbanpulse-cosmos-$(Get-Random)"   # globally unique
    
    az cosmosdb create --name $cosmos --resource-group $rg `
        --locations regionName=$region --kind GlobalDocumentDB
  2. Grab the endpoint and primary key:
    $cosmosEndpoint = az cosmosdb show --name $cosmos --resource-group $rg --query documentEndpoint -o tsv
    $cosmosKey      = az cosmosdb keys list --name $cosmos --resource-group $rg --query primaryMasterKey -o tsv
  3. Extract the bundle, then run the loader from the extracted folder (it reads regions.json and trainRoutes.json from next to load_cosmos.py):
    pip install azure-cosmos
    
    $env:COSMOS_ENDPOINT = $cosmosEndpoint
    $env:COSMOS_KEY      = $cosmosKey
    $env:COSMOS_DATABASE = "urbanpulse"
    
    python load_cosmos.py

Partition keys. The loader creates both containers for you with the correct partition keys. If you ever create them by hand, use /regionId for regions and /routeId for trainRoutes; the seed documents and the M2 shortcut both depend on these.

You should end up with database urbanpulse containing regions (PK /regionId, 5 docs) and trainRoutes (PK /routeId, 3 docs).
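
If you create containers by hand or edit the seed files, it can help to confirm that every document carries its partition key field before loading. This is a minimal sketch, not part of load_cosmos.py; the function name and the two sample documents are hypothetical, and only the /regionId path and the R-NORTH and R-SOUTH IDs come from this page.

```python
def missing_partition_key(docs: list[dict], pk_path: str) -> list[str]:
    """Return the id (or position) of each document that lacks a value for the partition key."""
    field = pk_path.lstrip("/")  # "/regionId" -> "regionId"
    bad = []
    for index, doc in enumerate(docs):
        if not doc.get(field):
            bad.append(str(doc.get("id", f"#{index}")))
    return bad


# Hypothetical documents; the real shapes live in regions.json.
sample_regions = [
    {"id": "R-NORTH", "regionId": "R-NORTH"},
    {"id": "R-SOUTH"},  # partition key field missing
]
print(missing_partition_key(sample_regions, "/regionId"))  # ['R-SOUTH']
```

To check the real seed data, load the file with json.load and pass the resulting list, assuming the file holds a top-level JSON array.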

Step 4 · Azure Storage + facility catalog

Create a storage account and a public blob container named facilitycatalog (that exact name is referenced in M2), then upload the facility catalog Parquet file and the five facility photos.

  1. Create the account and container:
    $storage = "urbanpulse$(Get-Random)"   # 3-24 lowercase chars, globally unique
    
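    # new storage accounts disable anonymous blob access by default, so allow it explicitly
    az storage account create --name $storage --resource-group $rg `
        --location $region --sku Standard_LRS --allow-blob-public-access true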
    
    az storage container create --account-name $storage `
        --name facilitycatalog --public-access blob
  2. Download the data bundle and extract it. You should have facility_catalog.parquet next to a facilities/ folder of five JPGs.
  3. Upload both into the container:
    az storage blob upload-batch `
        --account-name $storage `
        --destination facilitycatalog `
        --source .\urbanpulse-lab-data `
        --overwrite

Download the data bundle: urbanpulse-lab-data.zip. One archive with the facility catalog Parquet file plus the five facility photos.

You can also upload these files straight into your Lakehouse Files area instead of Blob Storage. Either route reaches the same end state for M2.
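
Before uploading, you may want to confirm that the extracted folder has the layout the Validation section expects (facility_catalog.parquet plus facilities/H-1.jpg through H-5.jpg). The sketch below is a hypothetical helper, not part of the lab bundle, and it assumes the folder is named urbanpulse-lab-data as in the upload command.

```python
from pathlib import Path

# Relative paths that should exist after extracting the bundle.
EXPECTED = ["facility_catalog.parquet"] + [f"facilities/H-{n}.jpg" for n in range(1, 6)]


def missing_files(root: str) -> list[str]:
    """Return the expected relative paths that are not present under root."""
    base = Path(root)
    if not base.is_dir():
        return list(EXPECTED)
    # Normalise to forward slashes so the names match blob paths on any OS.
    present = {p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()}
    return [name for name in EXPECTED if name not in present]


print(missing_files("urbanpulse-lab-data"))
```

An empty list means the folder is ready for az storage blob upload-batch.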

Step 5 · Event Hubs + streaming simulators

Create an Event Hubs namespace and three hubs, then run the Python simulators to produce live events for M3. The hub names must match exactly. The M3 KQL ingestion mapping reads named JSON fields and silently drops anything that does not match.

  1. Create the namespace and hubs:
    $ehNamespace = "urbanpulse-eh-$(Get-Random)"   # globally unique
    
    az eventhubs namespace create --name $ehNamespace --resource-group $rg `
        --location $region --sku Standard
    
    foreach ($hub in "medicalvitals","medicalmovement","metrotrain") {
        az eventhubs eventhub create --resource-group $rg `
            --namespace-name $ehNamespace --name $hub --partition-count 2
    }
  2. Get a per-hub connection string (it must end with ;EntityPath=<hub>). RootManageSharedAccessKey exists at the namespace level, so read the namespace connection string once and append the hub name:
    $rule = "RootManageSharedAccessKey"
    $base = az eventhubs namespace authorization-rule keys list `
        --resource-group $rg --namespace-name $ehNamespace `
        --name $rule --query primaryConnectionString -o tsv
    foreach ($hub in "medicalvitals","medicalmovement","metrotrain") {
        Write-Host "$hub => $base;EntityPath=$hub"
    }

    For least privilege, create a rule on each hub instead with az eventhubs eventhub authorization-rule create ... --rights Send Listen and read its keys with az eventhubs eventhub authorization-rule keys list. A hub-level connection string already includes ;EntityPath=<hub>.

  3. Run the three simulators from seed-data/eventhub/ in separate terminals:
    pip install azure-eventhub
    
    $env:EVENTHUB_VITALS_CONN_STR   = "...;EntityPath=medicalvitals"
    python seed-data\eventhub\simulate_medicalvitals.py
    
    $env:EVENTHUB_MOVEMENT_CONN_STR = "...;EntityPath=medicalmovement"
    python seed-data\eventhub\simulate_medicalmovement.py
    
    $env:EVENTHUB_TRAIN_CONN_STR    = "...;EntityPath=metrotrain"
    python seed-data\eventhub\simulate_metrotrain.py
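
The ;EntityPath=<hub> requirement from step 2 is easy to get wrong when copying strings by hand. As an illustration, this small Python helper (hypothetical, not part of the simulators) appends the hub name to a connection string and replaces any EntityPath that is already there. The endpoint and key in the example are placeholders.

```python
def with_entity_path(conn_str: str, hub: str) -> str:
    """Return conn_str ending in ;EntityPath=<hub>, replacing any existing EntityPath."""
    parts = [p for p in conn_str.strip().split(";") if p]
    kept = [p for p in parts if not p.lower().startswith("entitypath=")]
    return ";".join(kept + [f"EntityPath={hub}"])


# Placeholder namespace-level connection string, not a real key.
base = (
    "Endpoint=sb://example.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;"
    "SharedAccessKey=PLACEHOLDER"
)
for hub in ("medicalvitals", "medicalmovement", "metrotrain"):
    print(with_entity_path(base, hub))
```

Because the helper drops an existing EntityPath first, it is safe to apply to a string that already targets a hub.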
💡
Each simulator honors SIM_DURATION_SECONDS (default 600). For a quick prime of your KQL tables set $env:SIM_DURATION_SECONDS=30 before running.
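
The simulator sources are not shown on this page, so the following is only a sketch of how a script could honor SIM_DURATION_SECONDS with a default of 600. The function names, the fallback to the default on invalid values, and the one-second send interval are assumptions for illustration.

```python
import os
import time


def sim_duration(env=None, default: int = 600) -> int:
    """Read SIM_DURATION_SECONDS; fall back to the default when unset or invalid."""
    env = os.environ if env is None else env
    try:
        value = int(env.get("SIM_DURATION_SECONDS", ""))
    except ValueError:
        return default
    return value if value > 0 else default


def run(send_one, duration: int, interval: float = 1.0,
        clock=time.monotonic, sleep=time.sleep) -> int:
    """Call send_one repeatedly until duration seconds have elapsed; return the send count."""
    deadline = clock() + duration
    sent = 0
    while clock() < deadline:
        send_one()
        sent += 1
        sleep(interval)
    return sent


print(sim_duration({"SIM_DURATION_SECONDS": "30"}))  # 30
```

The clock and sleep parameters exist only so the loop can be exercised without waiting in real time.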

Step 6 · Create a Fabric workspace

  1. Go to app.fabric.microsoft.com.
  2. Select Workspaces → + New workspace, name it (for example UrbanPulse-Lab), and assign it to your Fabric capacity or trial.
  3. Continue with Module 1 · Workspace Tour. When a module asks for a server name, key, or connection string, use the values from the resources you just created instead of the shared lab credentials.

Validation

  • Azure SQL: SELECT COUNT(*) FROM Hospitals returns 5; Wards 30; Staff 150.
  • Cosmos DB: regions has 5 docs and trainRoutes has 3 docs.
  • Storage: facility_catalog.parquet and facilities/H-1.jpg...H-5.jpg exist in the facilitycatalog container.
  • Event Hubs: the namespace has hubs medicalvitals, medicalmovement, metrotrain and the simulators report events sent.
  • Fabric: your workspace is created and bound to a capacity or trial.
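
The row and document counts above can be compared in one pass once you have collected them, for example from the SQL queries and the Cosmos DB Data Explorer. This is a minimal sketch with a hypothetical helper name; the expected numbers are the ones listed on this page, and the sample input is made up to show a failing case.

```python
# Expected counts, taken from the Validation list above.
EXPECTED_COUNTS = {"Hospitals": 5, "Wards": 30, "Staff": 150, "regions": 5, "trainRoutes": 3}


def count_mismatches(actual: dict) -> dict:
    """Map each mismatching name to (expected, actual); actual is None when it was not supplied."""
    return {
        name: (expected, actual.get(name))
        for name, expected in EXPECTED_COUNTS.items()
        if actual.get(name) != expected
    }


# Made-up observations: Staff is one row short and trainRoutes was not checked.
observed = {"Hospitals": 5, "Wards": 30, "Staff": 149, "regions": 5}
print(count_mismatches(observed))  # {'Staff': (150, 149), 'trainRoutes': (3, None)}
```

An empty dictionary means every count matches.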