Tutorial · April 8, 2026 · 12 min read

Microsoft Clarity API: Complete Python Tutorial (2026)

Microsoft Clarity's Data Export API lets you pull session-level analytics programmatically. This tutorial walks you through authentication, making requests, parsing responses, handling rate limits, and storing everything in SQLite for long-term analysis.

Why use the Clarity API?

The Clarity dashboard is great for quick checks, but it has limitations. You can't query historical data beyond 30 days in bulk, you can't combine Clarity data with other sources, and you can't automate reporting. The Data Export API solves all of these problems.

With the API, you can:

  - Pull raw session-level data and keep it past the dashboard's 30-day window
  - Join Clarity metrics with data from other sources (your CMS, ad platforms, server logs)
  - Automate recurring reports instead of checking the dashboard manually

Rate limit heads-up: The Clarity API allows only 10 requests per day per project and returns a maximum of 3 days of data per request. Plan your data collection strategy accordingly.

Prerequisites

Before you start, make sure you have:

  - A Microsoft Clarity account with at least one active project
  - Python 3.8 or later with the requests library installed (pip install requests)
  - Basic familiarity with SQL (SQLite ships with Python's standard library)

Step 1: Get your API token

The Clarity API uses Bearer token authentication. Here's how to get your token:

  1. Log in to clarity.microsoft.com
  2. Open your project and go to Settings
  3. Navigate to Data Export in the left sidebar
  4. Click Generate API Token
  5. Copy the token and store it securely (you won't see it again)

Security tip: Never hardcode your API token. Store it in an environment variable or a .env file and load it with python-dotenv.

You'll also need your Project ID. Find it in the URL when viewing your Clarity dashboard: clarity.ms/app/your-project-id/dashboard.
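Before wiring the token into requests, it helps to fail fast when the environment is incomplete rather than getting a cryptic 401 later. Here is a minimal sketch; load_clarity_config is an illustrative helper written for this tutorial, not part of any Clarity SDK:

```python
import os

def load_clarity_config():
    """Read Clarity credentials from environment variables.

    Raises a clear error up front instead of failing later
    with an authentication error from the API.
    """
    token = os.environ.get("CLARITY_API_TOKEN")
    project_id = os.environ.get("CLARITY_PROJECT_ID")
    missing = [name for name, value in
               [("CLARITY_API_TOKEN", token),
                ("CLARITY_PROJECT_ID", project_id)]
               if not value]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return token, project_id
```

If you prefer a .env file, call python-dotenv's load_dotenv() before this helper runs; the helper itself only looks at the process environment.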

Step 2: Make your first API request

The Data Export API has a single endpoint for live insights. Here's the basic structure:

import requests
import os

CLARITY_API_TOKEN = os.environ["CLARITY_API_TOKEN"]
PROJECT_ID = os.environ["CLARITY_PROJECT_ID"]

url = "https://www.clarity.ms/export-data/api/v1/project-live-insights"

headers = {
    "Authorization": f"Bearer {CLARITY_API_TOKEN}",
    "Content-Type": "application/json"
}

params = {
    "projectId": PROJECT_ID,
    "numOfDays": 1  # 1, 2, or 3
}

response = requests.get(url, headers=headers, params=params)

if response.status_code == 200:
    data = response.json()
    print(f"Got {len(data)} records")
else:
    print(f"Error: {response.status_code} - {response.text}")

The numOfDays parameter accepts values 1, 2, or 3. It controls how many days of recent data the API returns. You cannot request data older than 3 days from the current date.

Step 3: Understanding the response

The API returns a JSON array of session-level records. Each record contains metrics about a single user session. Here are the key fields:

Field           Type    Description
SessionId       string  Unique session identifier
PageUrl         string  URL of the page visited
Duration        int     Session duration in seconds
PageViews       int     Number of pages viewed in the session
RageClickCount  int     Number of rage clicks detected
DeadClickCount  int     Number of dead clicks detected
ScrollDepth     float   How far the user scrolled (0-100%)
Referrer        string  Traffic source URL
Device          string  Device type (Desktop, Mobile, Tablet)
Browser         string  Browser name and version
Country         string  User's country
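If you prefer working with typed objects over raw dicts, you can map each record onto a small dataclass. A sketch using the field names from the table above (ClaritySession is an illustrative wrapper, not something the API provides; it covers only the numeric-metric fields for brevity):

```python
from dataclasses import dataclass

@dataclass
class ClaritySession:
    """Typed view of one API record (field names as in the table above)."""
    session_id: str
    page_url: str
    duration: int
    page_views: int
    rage_clicks: int
    dead_clicks: int
    scroll_depth: float

    @classmethod
    def from_record(cls, record: dict) -> "ClaritySession":
        # Fall back to neutral defaults, since not every record
        # is guaranteed to carry every metric
        return cls(
            session_id=record.get("SessionId", ""),
            page_url=record.get("PageUrl", ""),
            duration=int(record.get("Duration", 0)),
            page_views=int(record.get("PageViews", 0)),
            rage_clicks=int(record.get("RageClickCount", 0)),
            dead_clicks=int(record.get("DeadClickCount", 0)),
            scroll_depth=float(record.get("ScrollDepth", 0.0)),
        )
```

This also gives you one place to coerce types, which matters because JSON numbers may arrive as strings depending on the serializer.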

Step 4: Store data in SQLite

Since the API only gives you 3 days of data at a time, you need a local database to accumulate historical data. SQLite is perfect for this — no server needed, just a single file.

import sqlite3
from datetime import datetime, timezone

def init_db(db_path="clarity.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS sessions (
            session_id TEXT PRIMARY KEY,
            page_url TEXT,
            duration INTEGER,
            page_views INTEGER,
            rage_clicks INTEGER DEFAULT 0,
            dead_clicks INTEGER DEFAULT 0,
            scroll_depth REAL,
            referrer TEXT,
            device TEXT,
            browser TEXT,
            country TEXT,
            collected_at TEXT
        )
    """)
    conn.commit()
    return conn

def store_sessions(conn, sessions):
    now = datetime.now(timezone.utc).isoformat()
    inserted = 0
    for s in sessions:
        cur = conn.execute("""
            INSERT OR IGNORE INTO sessions
            (session_id, page_url, duration, page_views,
             rage_clicks, dead_clicks, scroll_depth,
             referrer, device, browser, country, collected_at)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            s.get("SessionId"),
            s.get("PageUrl"),
            s.get("Duration", 0),
            s.get("PageViews", 0),
            s.get("RageClickCount", 0),
            s.get("DeadClickCount", 0),
            s.get("ScrollDepth", 0),
            s.get("Referrer", ""),
            s.get("Device", ""),
            s.get("Browser", ""),
            s.get("Country", ""),
            now
        ))
        # OR IGNORE never raises IntegrityError for duplicates; instead,
        # rowcount is 0 when a row was skipped, so only genuinely new
        # sessions are counted
        inserted += cur.rowcount
    conn.commit()
    return inserted

Using INSERT OR IGNORE with the session ID as primary key means you can safely run the collection multiple times without creating duplicates.
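You can verify the idempotency yourself with an in-memory database. This demo uses a trimmed two-column version of the schema purely for illustration:

```python
import sqlite3

# In-memory demo: the same session inserted twice is stored only once
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (session_id TEXT PRIMARY KEY, page_url TEXT)"
)

for _ in range(2):
    conn.execute(
        "INSERT OR IGNORE INTO sessions (session_id, page_url) VALUES (?, ?)",
        ("abc123", "/pricing"),
    )
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
print(count)  # prints 1: the duplicate row was ignored
```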

Step 5: Handle rate limits properly

With only 10 requests per day, you need to be disciplined about how you call the API. Here's a strategy that works well:

Daily collection pattern

Run one request per day with numOfDays=1. This uses just 1 of your 10 daily requests and gives you yesterday's data. Over time, you build a complete historical record.

import sys
import time

def collect_with_retry(max_retries=3):
    """Collect daily data with exponential backoff."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, headers=headers, params=params, timeout=30)

            if response.status_code == 200:
                return response.json()
            elif response.status_code == 429:
                # Daily quota exhausted; retrying won't help until tomorrow
                print("Rate limited. Daily quota likely exceeded.")
                sys.exit(1)
            elif response.status_code == 401:
                print("Authentication failed. Check your API token.")
                sys.exit(1)
            else:
                wait_time = 2 ** attempt * 5
                print(f"Error {response.status_code}, retrying in {wait_time}s...")
                time.sleep(wait_time)

        except requests.exceptions.Timeout:
            print(f"Request timed out (attempt {attempt + 1}/{max_retries})")
            time.sleep(10)

    print("All retries exhausted.")
    sys.exit(1)
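Another practical safeguard: record each day's run locally, so accidentally re-running the script doesn't spend a second request from your quota of 10. A sketch using a small bookkeeping table (collection_log is an illustrative addition for this tutorial, not a Clarity concept):

```python
import sqlite3
from datetime import date

def already_collected_today(conn):
    """Return True if a collection run was already recorded today.

    On the first call of the day this records the run and returns
    False, so the caller knows it is safe to spend an API request.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collection_log (run_date TEXT PRIMARY KEY)"
    )
    today = date.today().isoformat()
    row = conn.execute(
        "SELECT 1 FROM collection_log WHERE run_date = ?", (today,)
    ).fetchone()
    if row:
        return True
    conn.execute("INSERT INTO collection_log (run_date) VALUES (?)", (today,))
    conn.commit()
    return False
```

Call it at the top of your script and exit early when it returns True.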

Pro tip: Schedule your collection script as a daily cron job running at a consistent time (e.g., 7 AM). This ensures you never miss a day, and the 3-day lookback window gives you a safety net if the job fails once.

Backfill strategy

If you're starting fresh, you can use up to 3 of your daily requests to backfill:

# Day 1: Run with numOfDays=3 to get the last 3 days
# Day 2+: Run with numOfDays=1 for daily incremental updates

Step 6: Build a complete collection script

Let's put it all together into a production-ready script:

#!/usr/bin/env python3
"""Clarity API daily data collector."""

import os
import sys
import json
import sqlite3
import requests
from datetime import datetime

# Configuration
API_TOKEN = os.environ.get("CLARITY_API_TOKEN")
PROJECT_ID = os.environ.get("CLARITY_PROJECT_ID")
DB_PATH = os.path.join(os.path.dirname(__file__), "clarity.db")

API_URL = "https://www.clarity.ms/export-data/api/v1/project-live-insights"

def main():
    if not API_TOKEN or not PROJECT_ID:
        print("ERROR: Set CLARITY_API_TOKEN and CLARITY_PROJECT_ID")
        sys.exit(1)

    # Fetch data
    response = requests.get(API_URL, headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json"
    }, params={
        "projectId": PROJECT_ID,
        "numOfDays": 1
    }, timeout=30)

    if response.status_code != 200:
        print(f"API error: {response.status_code}")
        sys.exit(1)

    sessions = response.json()
    print(f"Fetched {len(sessions)} sessions")

    # Store in SQLite
    conn = sqlite3.connect(DB_PATH)
    # ... (init table and insert as shown above)
    conn.close()

    print(f"Collection complete: {datetime.now().isoformat()}")

if __name__ == "__main__":
    main()

Step 7: Query your collected data

Once you have a few days of data, you can run powerful queries. Here are some useful ones:

Pages with the most frustration signals

SELECT page_url,
       SUM(rage_clicks) as total_rage,
       SUM(dead_clicks) as total_dead,
       COUNT(*) as sessions
FROM sessions
WHERE collected_at > date('now', '-7 days')
GROUP BY page_url
ORDER BY (total_rage + total_dead) DESC
LIMIT 10;

Average scroll depth by device type

SELECT device,
       ROUND(AVG(scroll_depth), 1) as avg_scroll,
       COUNT(*) as sessions
FROM sessions
GROUP BY device;

Bounce rate by referrer

SELECT referrer,
       COUNT(*) as sessions,
       ROUND(100.0 * SUM(CASE WHEN page_views = 1 THEN 1 ELSE 0 END) / COUNT(*), 1) as bounce_rate
FROM sessions
WHERE referrer != ''
GROUP BY referrer
HAVING sessions > 5
ORDER BY bounce_rate DESC;
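You can of course run these queries from Python as well. A sketch of the frustration-signals query as a reusable function (top_frustration_pages is a helper written for this tutorial; it omits the 7-day date filter for brevity and assumes the sessions schema from Step 4):

```python
import sqlite3

def top_frustration_pages(conn, limit=10):
    """Return pages ranked by combined rage + dead clicks, as dicts."""
    conn.row_factory = sqlite3.Row  # rows become dict-like by column name
    rows = conn.execute("""
        SELECT page_url,
               SUM(rage_clicks) AS total_rage,
               SUM(dead_clicks) AS total_dead,
               COUNT(*) AS sessions
        FROM sessions
        GROUP BY page_url
        ORDER BY SUM(rage_clicks) + SUM(dead_clicks) DESC
        LIMIT ?
    """, (limit,)).fetchall()
    return [dict(r) for r in rows]
```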

Going further: Automated analysis

Collecting data is only half the story. The real value comes from analyzing trends over time. Some ideas:

  - Track week-over-week changes in rage and dead clicks per page
  - Alert when average scroll depth on a key landing page drops sharply
  - Generate a weekly summary from the SQLite database and email it to your team

Automation tip: Set up two cron jobs: one daily for data collection, one weekly for report generation. This way you always have fresh data and regular insights without manual work.

Common pitfalls and how to avoid them

1. Exceeding the rate limit

If you hit the 10-request daily limit, the API returns a 429 status. Your script should detect this and exit gracefully rather than retrying. Tomorrow is another day.

2. Duplicate data

Always use the session ID as a primary key or unique constraint. The 3-day window means consecutive daily pulls will overlap. INSERT OR IGNORE handles this automatically.

3. Missing the collection window

Data older than 3 days is gone from the API forever. If your cron job fails for 4 consecutive days, you'll have a gap. Monitor your cron jobs and set up alerts.
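A simple way to monitor this is to check the database itself for missing days. A sketch (find_missing_days is an illustrative helper; it assumes the sessions table from Step 4 with collected_at stored as an ISO timestamp):

```python
import sqlite3
from datetime import date, timedelta

def find_missing_days(conn, lookback_days=30):
    """Return ISO dates in the lookback window with no collected sessions."""
    rows = conn.execute(
        "SELECT DISTINCT substr(collected_at, 1, 10) FROM sessions"
    ).fetchall()
    seen = {r[0] for r in rows}
    today = date.today()
    # Check yesterday back through the lookback window
    return [
        (today - timedelta(days=i)).isoformat()
        for i in range(1, lookback_days + 1)
        if (today - timedelta(days=i)).isoformat() not in seen
    ]
```

Run this weekly; any date inside the last 3 days that shows up here can still be recovered with a numOfDays=3 request, while older gaps are permanent.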

4. Token expiration

Clarity API tokens can expire. If you suddenly get 401 errors, regenerate your token in the Clarity Settings panel.

Summary

The Microsoft Clarity API is simple but limited. The key is to work within its constraints: collect daily with numOfDays=1, store everything in SQLite, and build your analysis on top of the accumulated data. This approach turns a rolling 3-day window into an ever-growing historical record from a free analytics tool.

The complete code from this tutorial is enough to get started. For production use, add logging, error notifications, and consider using a tool like ClarityInsights that handles collection, analysis, and reporting automatically.

Stop analyzing Clarity data manually

ClarityInsights sends you AI-powered weekly reports with per-page analysis, frustration signals, and prioritized recommendations.

Join the Waitlist