Tutorial · April 8, 2026 · 12 min read

Microsoft Clarity API: Complete Python Tutorial (2026)

Microsoft Clarity's Data Export API lets you pull session-level analytics programmatically. This tutorial walks you through authentication, making requests, parsing responses, handling rate limits, and storing everything in SQLite for long-term analysis.

Why use the Clarity API?

The Clarity dashboard is great for quick checks, but it has limitations. You can't query historical data beyond 30 days in bulk, you can't combine Clarity data with other sources, and you can't automate reporting. The Data Export API solves all of these problems.

With the API, you can:

  - Pull raw session-level data and keep it past the dashboard's 30-day window
  - Join Clarity metrics with data from other sources (your CMS, ad platforms, server logs)
  - Automate recurring reports instead of checking the dashboard manually

Rate limit heads-up: The Clarity API allows only 10 requests per day per project and returns a maximum of 3 days of data per request. Plan your data collection strategy accordingly.

Prerequisites

Before you start, make sure you have:

  - A Microsoft Clarity account with at least one active project
  - Python 3.8 or later with the requests library installed (pip install requests)
  - Basic familiarity with SQL (SQLite ships with Python's standard library)

Step 1: Get your API token

The Clarity API uses Bearer token authentication. Here's how to get your token:

  1. Log in to clarity.microsoft.com
  2. Open your project and go to Settings
  3. Navigate to Data Export in the left sidebar
  4. Click Generate API Token
  5. Copy the token and store it securely (you won't see it again)

Security tip: Never hardcode your API token. Store it in an environment variable or a .env file and load it with python-dotenv.

You'll also need your Project ID. Find it in the URL when viewing your Clarity dashboard: clarity.ms/app/your-project-id/dashboard.
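Before wiring the token into requests, it helps to fail fast when the environment is incomplete rather than getting a cryptic 401 later. Here is a minimal sketch; load_clarity_config is an illustrative helper written for this tutorial, not part of any Clarity SDK:

```python
import os

def load_clarity_config():
    """Read Clarity credentials from environment variables.

    Raises a clear error up front instead of failing later
    with an authentication error from the API.
    """
    token = os.environ.get("CLARITY_API_TOKEN")
    project_id = os.environ.get("CLARITY_PROJECT_ID")
    missing = [name for name, value in
               [("CLARITY_API_TOKEN", token),
                ("CLARITY_PROJECT_ID", project_id)]
               if not value]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return token, project_id
```

If you prefer a .env file, call python-dotenv's load_dotenv() before this helper runs; the helper itself only looks at the process environment.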

Step 2: Make your first API request

The Data Export API has a single endpoint for live insights. Here's the basic structure:

import requests
import os

CLARITY_API_TOKEN = os.environ["CLARITY_API_TOKEN"]
PROJECT_ID = os.environ["CLARITY_PROJECT_ID"]

url = "https://www.clarity.ms/export-data/api/v1/project-live-insights"

headers = {
    "Authorization": f"Bearer {CLARITY_API_TOKEN}",
    "Content-Type": "application/json"
}

params = {
    "projectId": PROJECT_ID,
    "numOfDays": 1  # 1, 2, or 3
}

response = requests.get(url, headers=headers, params=params)

if response.status_code == 200:
    data = response.json()
    print(f"Got {len(data)} records")
else:
    print(f"Error: {response.status_code} - {response.text}")

The numOfDays parameter accepts values 1, 2, or 3. It controls how many days of recent data the API returns. You cannot request data older than 3 days from the current date.

Step 3: Understanding the response

The API returns a JSON array of session-level records. Each record contains metrics about a single user session. Here are the key fields:

Field           Type    Description
SessionId       string  Unique session identifier
PageUrl         string  URL of the page visited
Duration        int     Session duration in seconds
PageViews       int     Number of pages viewed in the session
RageClickCount  int     Number of rage clicks detected
DeadClickCount  int     Number of dead clicks detected
ScrollDepth     float   How far the user scrolled (0-100%)
Referrer        string  Traffic source URL
Device          string  Device type (Desktop, Mobile, Tablet)
Browser         string  Browser name and version
Country         string  User's country
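If you prefer working with typed objects over raw dicts, you can map each record onto a small dataclass. A sketch using the field names from the table above (ClaritySession is an illustrative wrapper, not something the API provides; it covers only the numeric-metric fields for brevity):

```python
from dataclasses import dataclass

@dataclass
class ClaritySession:
    """Typed view of one API record (field names as in the table above)."""
    session_id: str
    page_url: str
    duration: int
    page_views: int
    rage_clicks: int
    dead_clicks: int
    scroll_depth: float

    @classmethod
    def from_record(cls, record: dict) -> "ClaritySession":
        # Fall back to neutral defaults, since not every record
        # is guaranteed to carry every metric
        return cls(
            session_id=record.get("SessionId", ""),
            page_url=record.get("PageUrl", ""),
            duration=int(record.get("Duration", 0)),
            page_views=int(record.get("PageViews", 0)),
            rage_clicks=int(record.get("RageClickCount", 0)),
            dead_clicks=int(record.get("DeadClickCount", 0)),
            scroll_depth=float(record.get("ScrollDepth", 0.0)),
        )
```

This also gives you one place to coerce types, which matters because JSON numbers may arrive as strings depending on the serializer.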

Step 4: Store data in SQLite

Since the API only gives you 3 days of data at a time, you need a local database to accumulate historical data. SQLite is perfect for this — no server needed, just a single file.

import sqlite3
from datetime import datetime, timezone

def init_db(db_path="clarity.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS sessions (
            session_id TEXT PRIMARY KEY,
            page_url TEXT,
            duration INTEGER,
            page_views INTEGER,
            rage_clicks INTEGER DEFAULT 0,
            dead_clicks INTEGER DEFAULT 0,
            scroll_depth REAL,
            referrer TEXT,
            device TEXT,
            browser TEXT,
            country TEXT,
            collected_at TEXT
        )
    """)
    conn.commit()
    return conn

def store_sessions(conn, sessions):
    now = datetime.now(timezone.utc).isoformat()
    inserted = 0
    for s in sessions:
        cur = conn.execute("""
            INSERT OR IGNORE INTO sessions
            (session_id, page_url, duration, page_views,
             rage_clicks, dead_clicks, scroll_depth,
             referrer, device, browser, country, collected_at)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            s.get("SessionId"),
            s.get("PageUrl"),
            s.get("Duration", 0),
            s.get("PageViews", 0),
            s.get("RageClickCount", 0),
            s.get("DeadClickCount", 0),
            s.get("ScrollDepth", 0),
            s.get("Referrer", ""),
            s.get("Device", ""),
            s.get("Browser", ""),
            s.get("Country", ""),
            now
        ))
        # OR IGNORE never raises IntegrityError for duplicates; instead,
        # rowcount is 0 when a row was skipped, so only genuinely new
        # sessions are counted
        inserted += cur.rowcount
    conn.commit()
    return inserted

Using INSERT OR IGNORE with the session ID as primary key means you can safely run the collection multiple times without creating duplicates.
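You can verify the idempotency yourself with an in-memory database. This demo uses a trimmed two-column version of the schema purely for illustration:

```python
import sqlite3

# In-memory demo: the same session inserted twice is stored only once
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (session_id TEXT PRIMARY KEY, page_url TEXT)"
)

for _ in range(2):
    conn.execute(
        "INSERT OR IGNORE INTO sessions (session_id, page_url) VALUES (?, ?)",
        ("abc123", "/pricing"),
    )
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
print(count)  # prints 1: the duplicate row was ignored
```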

Step 5: Handle rate limits properly

With only 10 requests per day, you need to be disciplined about how you call the API. Here's a strategy that works well:

Daily collection pattern

Run one request per day with numOfDays=1. This uses just 1 of your 10 daily requests and gives you yesterday's data. Over time, you build a complete historical record.

import sys
import time

def collect_with_retry(max_retries=3):
    """Collect daily data with exponential backoff."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, headers=headers, params=params, timeout=30)

            if response.status_code == 200:
                return response.json()
            elif response.status_code == 429:
                # Daily quota exhausted; retrying won't help until tomorrow
                print("Rate limited. Daily quota likely exceeded.")
                sys.exit(1)
            elif response.status_code == 401:
                print("Authentication failed. Check your API token.")
                sys.exit(1)
            else:
                wait_time = 2 ** attempt * 5
                print(f"Error {response.status_code}, retrying in {wait_time}s...")
                time.sleep(wait_time)

        except requests.exceptions.Timeout:
            print(f"Request timed out (attempt {attempt + 1}/{max_retries})")
            time.sleep(10)

    print("All retries exhausted.")
    sys.exit(1)
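Another practical safeguard: record each day's run locally, so accidentally re-running the script doesn't spend a second request from your quota of 10. A sketch using a small bookkeeping table (collection_log is an illustrative addition for this tutorial, not a Clarity concept):

```python
import sqlite3
from datetime import date

def already_collected_today(conn):
    """Return True if a collection run was already recorded today.

    On the first call of the day this records the run and returns
    False, so the caller knows it is safe to spend an API request.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collection_log (run_date TEXT PRIMARY KEY)"
    )
    today = date.today().isoformat()
    row = conn.execute(
        "SELECT 1 FROM collection_log WHERE run_date = ?", (today,)
    ).fetchone()
    if row:
        return True
    conn.execute("INSERT INTO collection_log (run_date) VALUES (?)", (today,))
    conn.commit()
    return False
```

Call it at the top of your script and exit early when it returns True.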

Pro tip: Schedule your collection script as a daily cron job running at a consistent time (e.g., 7 AM). This ensures you never miss a day, and the 3-day lookback window gives you a safety net if the job fails once.

Backfill strategy

If you're starting fresh, you can use up to 3 of your daily requests to backfill:

# Day 1: Run with numOfDays=3 to get the last 3 days
# Day 2+: Run with numOfDays=1 for daily incremental updates

Step 6: Build a complete collection script

Let's put it all together into a production-ready script:

#!/usr/bin/env python3
"""Clarity API daily data collector."""

import os
import sys
import json
import sqlite3
import requests
from datetime import datetime

# Configuration
API_TOKEN = os.environ.get("CLARITY_API_TOKEN")
PROJECT_ID = os.environ.get("CLARITY_PROJECT_ID")
DB_PATH = os.path.join(os.path.dirname(__file__), "clarity.db")

API_URL = "https://www.clarity.ms/export-data/api/v1/project-live-insights"

def main():
    if not API_TOKEN or not PROJECT_ID:
        print("ERROR: Set CLARITY_API_TOKEN and CLARITY_PROJECT_ID")
        sys.exit(1)

    # Fetch data
    response = requests.get(API_URL, headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json"
    }, params={
        "projectId": PROJECT_ID,
        "numOfDays": 1
    }, timeout=30)

    if response.status_code != 200:
        print(f"API error: {response.status_code}")
        sys.exit(1)

    sessions = response.json()
    print(f"Fetched {len(sessions)} sessions")

    # Store in SQLite
    conn = sqlite3.connect(DB_PATH)
    # ... (init table and insert as shown above)
    conn.close()

    print(f"Collection complete: {datetime.now().isoformat()}")

if __name__ == "__main__":
    main()

Step 7: Query your collected data

Once you have a few days of data, you can run powerful queries. Here are some useful ones:

Pages with the most frustration signals

SELECT page_url,
       SUM(rage_clicks) as total_rage,
       SUM(dead_clicks) as total_dead,
       COUNT(*) as sessions
FROM sessions
WHERE collected_at > date('now', '-7 days')
GROUP BY page_url
ORDER BY (total_rage + total_dead) DESC
LIMIT 10;

Average scroll depth by device type

SELECT device,
       ROUND(AVG(scroll_depth), 1) as avg_scroll,
       COUNT(*) as sessions
FROM sessions
GROUP BY device;

Bounce rate by referrer

SELECT referrer,
       COUNT(*) as sessions,
       ROUND(100.0 * SUM(CASE WHEN page_views = 1 THEN 1 ELSE 0 END) / COUNT(*), 1) as bounce_rate
FROM sessions
WHERE referrer != ''
GROUP BY referrer
HAVING sessions > 5
ORDER BY bounce_rate DESC;
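You can of course run these queries from Python as well. A sketch of the frustration-signals query as a reusable function (top_frustration_pages is a helper written for this tutorial; it omits the 7-day date filter for brevity and assumes the sessions schema from Step 4):

```python
import sqlite3

def top_frustration_pages(conn, limit=10):
    """Return pages ranked by combined rage + dead clicks, as dicts."""
    conn.row_factory = sqlite3.Row  # rows become dict-like by column name
    rows = conn.execute("""
        SELECT page_url,
               SUM(rage_clicks) AS total_rage,
               SUM(dead_clicks) AS total_dead,
               COUNT(*) AS sessions
        FROM sessions
        GROUP BY page_url
        ORDER BY SUM(rage_clicks) + SUM(dead_clicks) DESC
        LIMIT ?
    """, (limit,)).fetchall()
    return [dict(r) for r in rows]
```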

Going further: Automated analysis

Collecting data is only half the story. The real value comes from analyzing trends over time. Some ideas:

  - Track week-over-week changes in rage and dead clicks per page
  - Alert when average scroll depth on a key landing page drops sharply
  - Generate a weekly summary from the SQLite database and email it to your team

Automation tip: Set up two cron jobs: one daily for data collection, one weekly for report generation. This way you always have fresh data and regular insights without manual work.

Common pitfalls and how to avoid them

1. Exceeding the rate limit

If you hit the 10-request daily limit, the API returns a 429 status. Your script should detect this and exit gracefully rather than retrying. Tomorrow is another day.

2. Duplicate data

Always use the session ID as a primary key or unique constraint. The 3-day window means consecutive daily pulls will overlap. INSERT OR IGNORE handles this automatically.

3. Missing the collection window

Data older than 3 days is gone from the API forever. If your cron job fails for 4 consecutive days, you'll have a gap. Monitor your cron jobs and set up alerts.
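A simple way to monitor this is to check the database itself for missing days. A sketch (find_missing_days is an illustrative helper; it assumes the sessions table from Step 4 with collected_at stored as an ISO timestamp):

```python
import sqlite3
from datetime import date, timedelta

def find_missing_days(conn, lookback_days=30):
    """Return ISO dates in the lookback window with no collected sessions."""
    rows = conn.execute(
        "SELECT DISTINCT substr(collected_at, 1, 10) FROM sessions"
    ).fetchall()
    seen = {r[0] for r in rows}
    today = date.today()
    # Check yesterday back through the lookback window
    return [
        (today - timedelta(days=i)).isoformat()
        for i in range(1, lookback_days + 1)
        if (today - timedelta(days=i)).isoformat() not in seen
    ]
```

Run this weekly; any date inside the last 3 days that shows up here can still be recovered with a numOfDays=3 request, while older gaps are permanent.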

4. Token expiration

Clarity API tokens can expire. If you suddenly get 401 errors, regenerate your token in the Clarity Settings panel.

Summary

The Microsoft Clarity API is simple but limited. The key is to work within its constraints: collect daily with numOfDays=1, store everything in SQLite, and build your analysis on top of the accumulated data. This approach turns a rolling 3-day window into an ever-growing historical record from a free analytics tool.

The complete code from this tutorial is enough to get started. For production use, add logging, error notifications, and consider using a tool like ClarityInsights that handles collection, analysis, and reporting automatically.

Stop analyzing Clarity data manually

ClarityInsights sends you AI-powered weekly reports with per-page analysis, frustration signals, and prioritized recommendations.

Join the Waitlist