How to Automate Shopify Reports with Python and the Shopify API

·9 min read·Ecommerce

Automate Shopify reports using Python and the Shopify Admin API — set up automatic daily and weekly sales reports, inventory tracking, and ecommerce KPI dashboards without manual spreadsheets or CSV exports.

How to Automate Shopify Reports with Python and the Shopify APIAI Generated Image

Every ecommerce team has the same problem: someone spends hours each week pulling data from Shopify, pasting it into spreadsheets, and formatting reports that are outdated by the time they are sent.

This is the wrong approach. The data already exists. It just needs to flow automatically.

This guide builds a reporting system that pulls ecommerce data via API, aggregates the metrics that matter, and generates formatted reports — on a schedule, without human involvement.

# Who This Is For

  • Ecommerce managers who spend hours each week compiling Shopify reports in spreadsheets
  • Store owners who want daily revenue and inventory summaries without logging into admin panels
  • Vibe coders who already pull data from APIs and want to turn it into polished automated reports
  • Operations teams who need consistent, timely reporting without relying on one person to run it manually

Basic Python knowledge is helpful but the concepts apply broadly. If you understand what a spreadsheet report looks like, you can follow the logic here even if you are not writing the code yourself.

# What We Are Building

flowchart TD
  S[Shopify API] --> F["Fetch orders,\nproducts, customers"]
  F --> C[Clean & validate]
  C --> A["Aggregate by period\n(daily/weekly/monthly)"]
  A --> E[Excel report]
  A --> DB[(Database history)]
  E --> EM[Email to team]
  DB --> DQ[Dashboard queries]

# What You Will Need

bash
pip install requests pandas openpyxl

A Shopify store with API access (Admin API token with read_orders, read_products, and read_customers scopes).

# Step 1: Connect to Shopify API

Shopify uses a REST Admin API. Create a private app or custom app in Shopify admin to get an API token.

python
import requests
import pandas as pd
from datetime import datetime, timedelta

class ShopifyClient:
    """Simple Shopify Admin API client."""

    def __init__(self, shop_url, api_token, api_version="2026-01"):
        self.base_url = f"https://{shop_url}/admin/api/{api_version}"
        self.headers = {
            "X-Shopify-Access-Token": api_token,
            "Content-Type": "application/json",
        }

    def get(self, endpoint, params=None):
        """Fetch data from a Shopify API endpoint with pagination."""
        all_results = []
        url = f"{self.base_url}/{endpoint}.json"

        while url:
            response = requests.get(url, headers=self.headers, params=params, timeout=30)
            response.raise_for_status()
            data = response.json()

            # Extract the resource name from endpoint
            resource = endpoint.split("/")[-1]
            all_results.extend(data.get(resource, []))

            # Handle pagination via Link header
            link = response.headers.get("Link", "")
            url = None
            if 'rel="next"' in link:
                for part in link.split(","):
                    if 'rel="next"' in part:
                        url = part.split(";")[0].strip(" <>")
            params = None  # Only use params on first request

        return all_results

# Step 2: Fetch Order Data

python
def fetch_orders(client, days_back=30):
    """Fetch recent orders and return as clean DataFrame."""
    since = (datetime.now() - timedelta(days=days_back)).strftime("%Y-%m-%dT00:00:00Z")

    raw_orders = client.get("orders", params={
        "created_at_min": since,
        "status": "any",
        "limit": 250,
    })

    records = []
    for order in raw_orders:
        records.append({
            "order_id": order["id"],
            "order_number": order["order_number"],
            "created_at": order["created_at"],
            "total_price": float(order["total_price"]),
            "subtotal_price": float(order.get("subtotal_price", 0)),
            "total_discounts": float(order.get("total_discounts", 0)),
            "total_tax": float(order.get("total_tax", 0)),
            "financial_status": order["financial_status"],
            "fulfillment_status": order.get("fulfillment_status", "unfulfilled"),
            "items_count": len(order.get("line_items", [])),
            "customer_id": order.get("customer", {}).get("id"),
            "currency": order["currency"],
        })

    df = pd.DataFrame(records)
    df["created_at"] = pd.to_datetime(df["created_at"])
    df["date"] = df["created_at"].dt.date

    print(f"Fetched {len(df)} orders from last {days_back} days")
    return df

# Step 3: Build KPI Summaries

These are the metrics ecommerce teams actually track weekly:

python
def calculate_kpis(orders_df):
    """Calculate key ecommerce KPIs from order data."""
    total_orders = len(orders_df)
    total_revenue = orders_df["total_price"].sum()
    avg_order_value = orders_df["total_price"].mean()
    total_discounts = orders_df["total_discounts"].sum()

    # Refund rate
    refunded = orders_df[orders_df["financial_status"] == "refunded"]
    refund_rate = len(refunded) / total_orders * 100 if total_orders > 0 else 0

    # Fulfillment rate
    fulfilled = orders_df[orders_df["fulfillment_status"] == "fulfilled"]
    fulfillment_rate = len(fulfilled) / total_orders * 100 if total_orders > 0 else 0

    # Unique customers
    unique_customers = orders_df["customer_id"].nunique()

    # Repeat customer rate
    customer_orders = orders_df.groupby("customer_id").size()
    repeat_customers = (customer_orders > 1).sum()
    repeat_rate = repeat_customers / unique_customers * 100 if unique_customers > 0 else 0

    return {
        "Total Orders": total_orders,
        "Total Revenue": round(total_revenue, 2),
        "Average Order Value": round(avg_order_value, 2),
        "Total Discounts Given": round(total_discounts, 2),
        "Refund Rate": f"{refund_rate:.1f}%",
        "Fulfillment Rate": f"{fulfillment_rate:.1f}%",
        "Unique Customers": unique_customers,
        "Repeat Customer Rate": f"{repeat_rate:.1f}%",
    }
python
def daily_revenue_summary(orders_df):
    """Aggregate revenue by day."""
    daily = (
        orders_df.groupby("date")
        .agg(
            orders=("order_id", "count"),
            revenue=("total_price", "sum"),
            avg_order_value=("total_price", "mean"),
            discounts=("total_discounts", "sum"),
        )
        .reset_index()
        .sort_values("date")
    )
    daily["revenue"] = daily["revenue"].round(2)
    daily["avg_order_value"] = daily["avg_order_value"].round(2)
    return daily

# Sample Output

date orders revenue avg_order_value discounts
2026-03-25 34 4,280.50 125.90 312.00
2026-03-26 41 5,120.00 124.88 428.50
2026-03-27 38 4,890.75 128.70 265.00

# Step 4: Generate the Report

python
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill, Alignment

def generate_ecommerce_report(kpis, daily_data, output_path):
    """Generate formatted Excel report with KPIs and daily breakdown."""
    wb = Workbook()
    header_font = Font(bold=True, color="FFFFFF", size=11)
    header_fill = PatternFill(start_color="2F5496", end_color="2F5496", fill_type="solid")
    currency_fmt = "£#,##0.00"

    # --- KPI Summary Sheet ---
    ws = wb.active
    ws.title = "KPI Summary"
    ws.merge_cells("A1:B1")
    ws["A1"] = f"Ecommerce KPIs — {datetime.now().strftime('%B %Y')}"
    ws["A1"].font = Font(bold=True, size=16)

    row = 3
    for metric, value in kpis.items():
        ws.cell(row=row, column=1, value=metric).font = Font(bold=True)
        cell = ws.cell(row=row, column=2, value=value)
        if isinstance(value, float):
            cell.number_format = currency_fmt
        row += 1

    ws.column_dimensions["A"].width = 25
    ws.column_dimensions["B"].width = 18

    # --- Daily Revenue Sheet ---
    ws2 = wb.create_sheet("Daily Revenue")
    headers = ["Date", "Orders", "Revenue", "Avg Order Value", "Discounts"]

    for col, h in enumerate(headers, 1):
        cell = ws2.cell(row=1, column=col, value=h)
        cell.font = header_font
        cell.fill = header_fill
        cell.alignment = Alignment(horizontal="center")

    for i, row_data in daily_data.iterrows():
        r = i + 2
        ws2.cell(row=r, column=1, value=str(row_data["date"]))
        ws2.cell(row=r, column=2, value=row_data["orders"])
        ws2.cell(row=r, column=3, value=row_data["revenue"]).number_format = currency_fmt
        ws2.cell(row=r, column=4, value=row_data["avg_order_value"]).number_format = currency_fmt
        ws2.cell(row=r, column=5, value=row_data["discounts"]).number_format = currency_fmt

    for col in ws2.columns:
        max_len = max(len(str(c.value or "")) for c in col)
        ws2.column_dimensions[col[0].column_letter].width = max_len + 4

    wb.save(output_path)
    print(f"Report saved: {output_path}")

# Step 5: Run the Full Pipeline

python
import os

def run_ecommerce_report():
    """Full pipeline: fetch → aggregate → report."""
    client = ShopifyClient(
        shop_url=os.environ["SHOPIFY_STORE"],
        api_token=os.environ["SHOPIFY_API_TOKEN"],
    )

    # Fetch
    orders = fetch_orders(client, days_back=30)

    # Aggregate
    kpis = calculate_kpis(orders)
    daily = daily_revenue_summary(orders)

    # Report
    timestamp = datetime.now().strftime("%Y%m%d")
    generate_ecommerce_report(kpis, daily, f"ecommerce_report_{timestamp}.xlsx")

    # Print KPIs
    for metric, value in kpis.items():
        print(f"  {metric}: {value}")

if __name__ == "__main__":
    run_ecommerce_report()
text
Fetched 312 orders from last 30 days
  Total Orders: 312
  Total Revenue: 39420.50
  Average Order Value: 126.35
  Total Discounts Given: 4280.00
  Refund Rate: 3.2%
  Fulfillment Rate: 89.1%
  Unique Customers: 245
  Repeat Customer Rate: 18.4%
Report saved: ecommerce_report_20260330.xlsx

# What This Replaces

Manual process Automated
Export CSV from Shopify admin API fetch with pagination
Open in Excel, filter date range pd.DataFrame with date filter
Calculate KPIs with formulas calculate_kpis() function
Format and style the report openpyxl formatting
Email to team Schedule + smtplib
Remember to do it weekly cron or Task Scheduler

Time saved: 2–3 hours per week across the team.

# Extending the System

This pipeline covers the core reporting flow. Production ecommerce reporting systems add:

  • Product-level analytics — top sellers, slow movers, margin analysis
  • Inventory tracking — stock levels, reorder alerts
  • Customer segmentation — RFM analysis (recency, frequency, monetary)
  • Multi-channel — combine Shopify, Amazon, wholesale into one report
  • Dashboard output — feed data into a real-time dashboard instead of Excel

For a deeper look at structuring these extensions into a scalable reporting architecture, see Ecommerce Data Pipeline: Reporting Architecture That Scales. To catch performance regressions automatically instead of discovering them in weekly reports, set up Shopify performance monitoring with automated alerts.

# Automated Inventory Reporting

Inventory is the second most-requested automated report after sales. Shopify exposes inventory levels through the Admin API:

python
def fetch_inventory_levels(client):
    """Fetch current inventory levels for all products."""
    products = client.get("products", params={"limit": 250, "fields": "id,title,variants"})

    inventory = []
    for product in products:
        for variant in product.get("variants", []):
            inventory.append({
                "product_id": product["id"],
                "product_title": product["title"],
                "variant_id": variant["id"],
                "variant_title": variant.get("title", "Default"),
                "sku": variant.get("sku", ""),
                "inventory_quantity": variant.get("inventory_quantity", 0),
                "price": float(variant.get("price", 0)),
            })

    df = pd.DataFrame(inventory)
    print(f"Fetched inventory for {len(df)} variants across {len(products)} products")
    return df


def inventory_alerts(inventory_df, low_stock_threshold=5):
    """Identify products that need reordering."""
    low_stock = inventory_df[inventory_df["inventory_quantity"] <= low_stock_threshold]
    out_of_stock = inventory_df[inventory_df["inventory_quantity"] == 0]

    return {
        "total_variants": len(inventory_df),
        "low_stock": low_stock[["product_title", "sku", "inventory_quantity"]].to_dict("records"),
        "out_of_stock": out_of_stock[["product_title", "sku"]].to_dict("records"),
        "total_inventory_value": (inventory_df["inventory_quantity"] * inventory_df["price"]).sum(),
    }

This feeds directly into the same Excel report generator — add an "Inventory" sheet with stock levels and a "Reorder Alerts" sheet highlighting products below threshold.

# Scheduling Reports Automatically

The pipeline works manually, but the real value comes from scheduling:

python
# schedule_reports.py — run via cron or Task Scheduler
import schedule
import time

def daily_sales_report():
    """Generate and email daily sales summary."""
    client = ShopifyClient(
        shop_url=os.environ["SHOPIFY_STORE"],
        api_token=os.environ["SHOPIFY_API_TOKEN"],
    )
    orders = fetch_orders(client, days_back=1)
    kpis = calculate_kpis(orders)
    daily = daily_revenue_summary(orders)

    timestamp = datetime.now().strftime("%Y%m%d")
    output_path = f"reports/daily_sales_{timestamp}.xlsx"
    generate_ecommerce_report(kpis, daily, output_path)

    # Email using smtplib
    send_report_email(
        to=["[email protected]"],
        subject=f"Daily Sales Report — {datetime.now().strftime('%d %B %Y')}",
        attachment=output_path,
    )

# Schedule
schedule.every().day.at("06:00").do(daily_sales_report)
schedule.every().monday.at("07:00").do(weekly_full_report)

while True:
    schedule.run_pending()
    time.sleep(60)

For production use, replace the schedule loop with a cron job (0 6 * * * python schedule_reports.py --daily) or use a workflow orchestrator like Prefect for better error handling and retries.

# Shopify Reports vs Custom Reporting: When to Use Each

Scenario Use Shopify built-in Use custom reporting
Quick sales check
Standard weekly summary
Multi-store aggregation
Custom KPI calculations
Combining with non-Shopify data
Automated email delivery
Historical trend analysis (>90 days)
Team-specific filtered views
Inventory reorder automation

Shopify's built-in reports cover basic analytics well. Custom reporting becomes necessary when you need cross-platform data, automated delivery, or calculations that Shopify does not support natively.

# Common Shopify API and Reporting Mistakes

  1. Not handling pagination — Shopify limits responses to 250 items. Without pagination, you silently lose data. The get() method in this guide handles it via Link headers.

  2. Ignoring rate limits — Shopify allows ~2 requests/second for REST. Add rate limiting or use GraphQL for bulk operations.

  3. Hardcoding API versions — Shopify deprecates API versions quarterly. Use a config variable and update it during maintenance windows.

  4. Not storing historical data — Once an order is refunded or edited, the original values change. Store snapshots in a database for accurate historical reporting.

  5. Reporting on created_at instead of processed_at — For revenue timing, processed_at reflects when payment was captured, not when the order was created.

  6. Missing timezone handling — Shopify returns UTC timestamps. Convert to your store's timezone before grouping by date.

python
import pytz

def localize_orders(orders_df, timezone="Europe/London"):
    """Convert UTC timestamps to local timezone for accurate daily grouping."""
    tz = pytz.timezone(timezone)
    orders_df["created_at"] = orders_df["created_at"].dt.tz_convert(tz)
    orders_df["date"] = orders_df["created_at"].dt.date
    return orders_df

# Building an Ecommerce KPI Dashboard Flow

For teams that need real-time visibility rather than periodic reports, extend the pipeline into a dashboard:

flowchart LR
  S[Shopify API] --> P[Python ETL]
  P --> DB[(PostgreSQL)]
  DB --> D[Dashboard Tool]
  D --> T[Team Views]

  P --> A{Alerts}
  A -->|Low stock| SL[Slack]
  A -->|Revenue drop| EM[Email]

The ETL runs every hour, refreshing the database. The dashboard tool (Metabase, Grafana, or a custom Next.js dashboard) queries the database directly. Alerts trigger on threshold breaches.

This approach scales better than Excel reports for teams that check metrics multiple times per day.

# Next Steps

If your team is manually pulling Shopify exports and formatting them into spreadsheets, this pipeline replaces that entirely.

For more on fixing the performance issues that affect the data these reports capture, see How to Fix Slow Shopify Stores. To understand how these reports connect to conversion improvements, see How to Improve Ecommerce Conversion Using Data and Automation.

For building dashboards from the data this pipeline produces, see How to Build a Data Dashboard Without Excel. For a deeper dive into the API data extraction layer, see Shopify Reporting API: How to Pull Sales, Inventory, and Customer Data Automatically.

Ecommerce optimization services include setting up Shopify reporting automation as part of the performance improvement process.

Get in touch to discuss automated reporting for your store.

Enjoyed this article?

Get notified when I publish new articles on automation, ecommerce, and data engineering.

how to automate shopify reportsautomate shopify reportssetup automatic shopify reportsshopify automated reportsshopify api reporting pythonautomated ecommerce dashboardecommerce kpi trackingshopify sales report automationautomated shopify reporting pythonshopify report automation guide