Tutorial April 3, 2026 8 min read

How to Get Historical Data from Microsoft Clarity

Microsoft Clarity is generous with features but stingy with data retention. The API only goes back 3 days, the dashboard keeps 30 days, and then your data is gone. Here's how to build a simple system that preserves your Clarity data indefinitely.

The Data Retention Problem

If you've tried pulling historical data from Microsoft Clarity, you've already hit the wall. Clarity's data export API has strict limitations:

  1. The Data Export API returns at most the last 3 days of data
  2. The API is limited to 10 requests per project per day
  3. The dashboard itself retains only about 30 days of history

This means if you want to compare this month's scroll depth with last quarter's, or track how a redesign affected rage clicks over time, you're out of luck with Clarity alone.

Warning: Clarity does not offer any way to recover data older than 30 days. If you haven't been collecting it, that data is permanently lost. Start collecting now -- even if you don't need historical analysis yet.

Why Historical Data Matters

Without historical data, you're always looking at a snapshot. You can see what's happening now, but you can't answer the questions that actually drive decisions:

Trend Analysis

Is your bounce rate going up or down? Are rage clicks increasing after a recent deploy? Without historical baselines, every metric is just a number without context. A 15% dead click rate on your pricing page means nothing unless you know it was 5% last month.

Seasonality

E-commerce sites see wildly different user behavior during holiday seasons. SaaS products see different patterns during budget cycles. Historical data lets you separate seasonal changes from real problems.

Before/After Comparisons

You redesigned your checkout flow. Did it help? Without pre-redesign data, you're guessing. Historical data turns "I think it's better" into "rage clicks dropped 40% and scroll depth increased by 15%."

Solution: Daily Collection with Cron + SQLite

The workaround is simple: pull data from the Clarity API every day and store it locally. Since the API gives you up to 3 days of data and you only need to grab the latest day, a daily cron job is reliable and stays well within rate limits.

Architecture

  1. A Python script runs once daily via cron
  2. It fetches the latest day's data from the Clarity API
  3. It stores the results in a local SQLite database
  4. Over time, you accumulate months or years of data

Step 1: Set Up the Database

SQLite is the simplest option -- no server to manage, just a file. Here's the schema:

import sqlite3

def init_db(db_path="clarity.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS clarity_metrics (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            date TEXT NOT NULL,
            page_url TEXT,
            total_sessions INTEGER,
            distinct_users INTEGER,
            pages_per_session REAL,
            scroll_depth REAL,
            active_time REAL,
            dead_click_count INTEGER,
            rage_click_count INTEGER,
            quick_back_count INTEGER,
            excessive_scroll_count INTEGER,
            collected_at TEXT DEFAULT CURRENT_TIMESTAMP,
            UNIQUE(date, page_url)
        )
    """)
    conn.commit()
    return conn

The UNIQUE(date, page_url) constraint prevents duplicate entries if the script runs twice in one day.
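You can verify this idempotence directly — a minimal sketch against an in-memory database, using a trimmed-down version of the schema:

```python
import sqlite3

# Show that INSERT OR IGNORE plus the UNIQUE constraint makes a
# re-run of the collector a no-op for rows already stored.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE clarity_metrics (
        date TEXT NOT NULL,
        page_url TEXT,
        total_sessions INTEGER,
        UNIQUE(date, page_url)
    )
""")

row = ("2026-04-03", "/pricing", 120)
for _ in range(2):  # simulate the script running twice in one day
    conn.execute(
        "INSERT OR IGNORE INTO clarity_metrics "
        "(date, page_url, total_sessions) VALUES (?, ?, ?)",
        row,
    )
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM clarity_metrics").fetchone()[0]
print(count)  # 1 — the second insert was silently ignored
```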

Step 2: Fetch Data from the Clarity API

import requests
import os

def fetch_clarity_data():
    url = "https://www.clarity.ms/export-data/api/v1/project-live-insights"
    headers = {
        "Authorization": f"Bearer {os.environ['CLARITY_API_TOKEN']}",
        "Content-Type": "application/json"
    }
    payload = {
        "projectId": os.environ["CLARITY_PROJECT_ID"],
        "numOfDays": 1
    }
    response = requests.post(url, json=payload, headers=headers)
    response.raise_for_status()
    return response.json()

Tip: Always use numOfDays: 1 for daily collection. Using 2 or 3 wastes your rate limit (10 requests/day) on overlapping data you've already stored.
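Since a failed run costs you a full day of data and the quota is only 10 requests, it's worth wrapping the fetch in modest retry logic. A sketch — `fetch_with_retry` and the `flaky` stand-in are illustrative helpers, not part of the Clarity API:

```python
import time

def fetch_with_retry(fetch_fn, retries=3, base_delay=1.0):
    """Retry a flaky zero-argument fetch with exponential backoff.

    With only 10 requests/day available, a few spaced-out retries are
    affordable, but give up quickly rather than burning the whole quota.
    """
    for attempt in range(retries):
        try:
            return fetch_fn()
        except Exception:
            if attempt == retries - 1:
                raise  # out of retries: surface the error to the cron log
            time.sleep(base_delay * (2 ** attempt))

# Demo with a stand-in that fails once, then succeeds:
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("transient")
    return {"results": []}

data = fetch_with_retry(flaky, base_delay=0.01)
print(calls["n"])  # 2 — one failure, one success
```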

Step 3: Store the Data

from datetime import date

def store_metrics(conn, data):
    today = date.today().isoformat()
    # NOTE: adapt this to the actual response shape — the export API may
    # group results by metric name rather than return a flat list.
    rows = data.get("results", [])

    for row in rows:
        conn.execute("""
            INSERT OR IGNORE INTO clarity_metrics
            (date, page_url, total_sessions, distinct_users,
             pages_per_session, scroll_depth, active_time,
             dead_click_count, rage_click_count,
             quick_back_count, excessive_scroll_count)
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            today,
            row.get("url", "/"),
            row.get("totalSessionCount", 0),
            row.get("distinctUserCount", 0),
            row.get("pagesPerSession", 0),
            row.get("scrollDepth", 0),
            row.get("activeTime", 0),
            row.get("deadClickCount", 0),
            row.get("rageClickCount", 0),
            row.get("quickBackCount", 0),
            row.get("excessiveScrollCount", 0),
        ))
    conn.commit()
    print(f"Stored {len(rows)} rows for {today}")
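Tying the steps together, collect.py is just fetch-then-store. The sketch below uses a trimmed-down schema and a canned payload in place of the live API so the flow can be exercised offline; in production, only the fetch function changes:

```python
import sqlite3
from datetime import date

def init_db(db_path=":memory:"):
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS clarity_metrics (
            date TEXT NOT NULL,
            page_url TEXT,
            total_sessions INTEGER,
            UNIQUE(date, page_url)
        )
    """)
    return conn

def store_metrics(conn, data):
    today = date.today().isoformat()
    for row in data.get("results", []):
        conn.execute(
            "INSERT OR IGNORE INTO clarity_metrics "
            "(date, page_url, total_sessions) VALUES (?, ?, ?)",
            (today, row.get("url", "/"), row.get("totalSessionCount", 0)),
        )
    conn.commit()

def run_daily(fetch_fn, conn):
    """One collection cycle: fetch the latest day, store it."""
    store_metrics(conn, fetch_fn())

# Offline dry run with a canned payload instead of the live API:
def fake_fetch():
    return {"results": [{"url": "/pricing", "totalSessionCount": 120}]}

conn = init_db()
run_daily(fake_fetch, conn)
stored = conn.execute("SELECT COUNT(*) FROM clarity_metrics").fetchone()[0]
print(stored)  # 1
```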

Step 4: Schedule with Cron

Add this to your crontab (crontab -e):

# Collect Clarity data daily at 7:00 AM.
# Cron jobs run under /bin/sh, so use "." rather than "source", and wrap
# the .env load in "set -a" so the variables are exported to Python.
0 7 * * * cd /path/to/project && set -a && . ./.env && set +a && python collect.py >> logs/collect.log 2>&1

Scaling Up: Supabase Instead of SQLite

SQLite works great for single-server setups. But if you want to query the data from multiple tools, build dashboards, or share access with your team, consider using Supabase (managed Postgres):

import os
from supabase import create_client

supabase = create_client(
    os.environ["SUPABASE_URL"],
    os.environ["SUPABASE_KEY"]
)

# upsert mirrors the UNIQUE(date, page_url) constraint from the SQLite
# schema, so re-running the script for the same day won't create duplicates
supabase.table("clarity_metrics").upsert({
    "date": today,
    "page_url": row["url"],
    "scroll_depth": row["scrollDepth"],
    "rage_click_count": row["rageClickCount"],
    # ... other fields
}, on_conflict="date,page_url").execute()

This gives you SQL access, a REST API, and a web-based table editor -- useful when you want to build custom reporting later.

Querying Historical Data

Once you have a few weeks of data, the real value starts showing up. Here are some useful queries:

Weekly Rage Click Trend

SELECT
  strftime('%Y-%W', date) as week,
  SUM(rage_click_count) as total_rage_clicks,
  AVG(scroll_depth) as avg_scroll_depth
FROM clarity_metrics
GROUP BY week
ORDER BY week DESC
LIMIT 12;

Before/After a Deploy

SELECT
  CASE WHEN date < '2026-03-15' THEN 'before' ELSE 'after' END as period,
  AVG(scroll_depth) as avg_scroll,
  AVG(rage_click_count) as avg_rage_clicks,
  AVG(dead_click_count) as avg_dead_clicks
FROM clarity_metrics
WHERE page_url = '/pricing'
GROUP BY period;
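These queries run directly from Python via the standard sqlite3 module. A self-contained example of the before/after comparison — the dates and values here are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clarity_metrics (date TEXT, page_url TEXT, "
    "scroll_depth REAL, rage_click_count INTEGER, dead_click_count INTEGER)"
)
sample = [
    ("2026-03-10", "/pricing", 45.0, 12, 8),  # before the deploy
    ("2026-03-12", "/pricing", 47.0, 10, 9),
    ("2026-03-20", "/pricing", 60.0, 4, 3),   # after the deploy
    ("2026-03-22", "/pricing", 62.0, 6, 2),
]
conn.executemany("INSERT INTO clarity_metrics VALUES (?, ?, ?, ?, ?)", sample)

rows = conn.execute("""
    SELECT
      CASE WHEN date < '2026-03-15' THEN 'before' ELSE 'after' END as period,
      AVG(scroll_depth) as avg_scroll,
      AVG(rage_click_count) as avg_rage_clicks,
      AVG(dead_click_count) as avg_dead_clicks
    FROM clarity_metrics
    WHERE page_url = '/pricing'
    GROUP BY period
""").fetchall()

result = {period: (scroll, rage, dead) for period, scroll, rage, dead in rows}
print(result["before"][0], result["after"][0])  # 46.0 61.0
```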

The Automated Alternative

Building this pipeline yourself takes an afternoon, but maintaining it -- handling API changes, adding new metrics, generating reports -- takes ongoing effort. ClarityInsights does all of this automatically: daily collection, long-term storage, and AI-powered weekly reports that highlight what actually changed and why it matters.

Info: Whether you build it yourself or use a tool, the key takeaway is the same: start collecting Clarity data today. Every day you wait is a day of historical data you'll never get back.

Summary

Microsoft Clarity's data retention is limited by design -- it's a free tool, and storage costs money. But with a simple Python script, a SQLite database, and a cron job, you can build an indefinite historical record of your UX metrics. The setup takes 30 minutes. The data it preserves is worth months of insights.

Stop analyzing Clarity data manually

ClarityInsights sends you AI-powered weekly reports with per-page analysis, frustration signals, and prioritized recommendations.

Join the Waitlist