Build a Price Monitoring Bot with Python and Playwright


Build a price monitoring bot that tracks competitor prices on any website — using Playwright for JavaScript-rendered pages, structured extraction with fallback selectors, change detection, and scheduled alerts when prices move.


You check a competitor's price by hand. You open the page, scan for the number, maybe write it down. Then you forget to check for two weeks and miss the price drop that undercut you.

Ecommerce pricing is not something you should do manually. Competitor prices change constantly — flash sales, seasonal adjustments, inventory clearances. If you are not tracking them automatically, you are reacting instead of competing. By the time you notice a change, your customers have already noticed it too.

This guide builds a price monitoring bot in Python that scrapes competitor prices from any website, handles JavaScript-rendered pages, stores price history, and sends you an alert the moment a price changes by more than a threshold you set.

# Who This Is For

  • Ecommerce store owners who want to track competitor prices without paying for expensive monitoring tools
  • Data engineers building price intelligence systems for marketing or merchandising teams
  • Developers who need to scrape dynamic websites that do not work with simple HTTP requests
  • Anyone who checks competitor websites manually and wants to stop

Basic Python is all you need. The guide covers Playwright setup, CSS selectors, and browser automation from scratch.

# Bot Architecture

flowchart LR
  CFG["Config\n(URLs + selectors)"] --> SCH["Scheduler\n(cron / interval)"]
  SCH --> PW["Playwright\n(headless browser)"]
  PW --> EXT["Extract\n(price + metadata)"]
  EXT --> VAL["Validate\n(parse + clean)"]
  VAL --> DB["Store\n(SQLite history)"]
  DB --> CHK["Change\nDetection"]
  CHK -->|Threshold crossed| ALT["Alert\n(email / Slack)"]
  CHK -->|No change| LOG["Log\n(price stable)"]

Playwright handles the browser rendering. The extraction layer uses multiple selector strategies so a single CSS class change does not break everything. Change detection compares the current price against the last known value and fires alerts when the delta exceeds your threshold.

# What You Will Need

bash
pip install playwright httpx sqlite-utils schedule pandas
playwright install chromium

  • playwright: browser automation (renders JavaScript, handles SPAs)
  • httpx: lightweight HTTP client for robots.txt checks
  • sqlite-utils: simple SQLite wrapper for price history
  • schedule: cron-like scheduling in Python
  • pandas: summary reports over the price history (used in Step 6)

The playwright install chromium command downloads a Chromium binary. Expect a download of roughly 150 MB, depending on platform and Playwright version.
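The dependency list mentions robots.txt checks, but the steps in this guide do not show one. A minimal sketch of such a check, assuming the robots.txt body has already been downloaded (for example with httpx.get(...).text) and using the standard library's urllib.robotparser. The helper name is_allowed, the user agent string "price-bot", and the sample rules are all hypothetical:

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "price-bot") -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules, inlined so the sketch runs without network access.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /checkout/
Allow: /products/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://competitor-a.com/products/widget-pro"))
print(is_allowed(SAMPLE_ROBOTS, "https://competitor-a.com/checkout/cart"))
```

One way to use this would be to run the check once per product URL at startup and skip any page the site disallows.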

# Step 1: Page Configuration

Define what to scrape and where to find the price on each page.

python
from dataclasses import dataclass, field


@dataclass
class ProductConfig:
    """Configuration for monitoring a single product page."""

    name: str
    url: str
    # multiple selectors in priority order — if the first breaks, try the next
    price_selectors: list[str]
    currency: str = "GBP"
    # optional: grab product name from the page to detect if the URL changed
    title_selector: str | None = None

    def __post_init__(self):
        if not self.price_selectors:
            raise ValueError(f"No price selectors for {self.name}")


# example configs — adjust selectors for your target sites
PRODUCTS = [
    ProductConfig(
        name="Competitor A - Widget Pro",
        url="https://competitor-a.com/products/widget-pro",
        price_selectors=[
            "[data-testid='price']",          # most stable: data attributes
            ".product-price .current-price",   # class-based fallback
            "span.price",                      # broad fallback
        ],
        title_selector="h1.product-title",
    ),
    ProductConfig(
        name="Competitor B - Widget Pro",
        url="https://competitor-b.com/widget-pro",
        price_selectors=[
            ".price-box .special-price",
            ".product-info-price .price",
            "[itemprop='price']",              # schema.org microdata
        ],
        title_selector="h1",
    ),
]

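Each product lists three selectors in priority order. Data attributes like data-testid survive redesigns better than CSS classes. Schema.org microdata like [itemprop='price'] is often more stable still, because sites rarely remove structured data that affects their search rankings. One caveat: microdata prices frequently sit in a content attribute on a hidden meta tag rather than in visible text, so a selector like that may need attribute-based extraction instead of text_content().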

# Step 2: Price Extraction with Playwright

python
import re
import logging
from playwright.sync_api import sync_playwright, TimeoutError as PwTimeout

logger = logging.getLogger("price_bot")


def extract_price(page, selectors: list[str]) -> float | None:
    """Try each selector until one returns a valid price.

    Returns the first successfully parsed price, or None if all fail.
    """
    for selector in selectors:
        try:
            el = page.wait_for_selector(selector, timeout=5000)
            if not el:
                continue

            raw_text = el.text_content().strip()
            # strip currency symbols and thousands separators, keep decimal
            cleaned = re.sub(r"[^\d.]", "", raw_text)
            if not cleaned:
                continue

            price = float(cleaned)
            if price <= 0:
                continue

            logger.debug(f"Got price {price} from selector: {selector}")
            return price

        except PwTimeout:
            logger.debug(f"Selector timed out: {selector}")
            continue
        except (ValueError, AttributeError) as e:
            logger.debug(f"Parse error with selector {selector}: {e}")
            continue

    return None


def scrape_product(product: ProductConfig) -> dict | None:
    """Launch a browser, navigate to the product page, extract the price.

    Returns a dict with name, url, price, and title, or None on failure.
    """
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context(
            user_agent=(
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                "AppleWebKit/537.36 (KHTML, like Gecko) "
                "Chrome/125.0.0.0 Safari/537.36"
            ),
            viewport={"width": 1280, "height": 720},
        )
        page = context.new_page()

        try:
            page.goto(product.url, wait_until="networkidle", timeout=30000)

            price = extract_price(page, product.price_selectors)
            if price is None:
                logger.warning(f"Could not extract price for {product.name}")
                return None

            title = None
            if product.title_selector:
                try:
                    title_el = page.query_selector(product.title_selector)
                    title = title_el.text_content().strip() if title_el else None
                except Exception:
                    pass

            return {
                "name": product.name,
                "url": product.url,
                "price": price,
                "currency": product.currency,
                "title": title,
            }

        except Exception as e:
            logger.error(f"Failed to scrape {product.name}: {e}")
            return None

        finally:
            browser.close()

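Notice wait_until="networkidle": this waits until the page has made no network requests for a short period, which is usually when JavaScript-rendered prices are available. Pages that poll or send analytics in the background may never go idle and will hit the timeout instead. For those, wait_until="domcontentloaded" combined with the wait_for_selector call in extract_price is a more dependable signal. The timeout is 30 seconds because some ecommerce sites are slow.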

The user agent is set to a real Chrome string. Some sites serve different content to, or block outright, requests that carry the default headless browser user agent.

# Step 3: Price History Storage

SQLite is perfect for this. No server to manage, the database is a single file, and you can query historical prices with SQL.

python
import sqlite_utils
from datetime import datetime, timezone


class PriceStore:
    """SQLite-backed price history with change detection."""

    def __init__(self, db_path: str = "prices.db"):
        self.db = sqlite_utils.Database(db_path)
        self._ensure_tables()

    def _ensure_tables(self):
        if "prices" not in self.db.table_names():
            self.db["prices"].create({
                "id": int,
                "product_name": str,
                "url": str,
                "price": float,
                "currency": str,
                "title": str,
                "scraped_at": str,
            }, pk="id")
            self.db["prices"].create_index(["product_name", "scraped_at"])

    def record(self, product_name: str, url: str, price: float,
               currency: str, title: str | None = None):
        """Store a price observation."""
        self.db["prices"].insert({
            "product_name": product_name,
            "url": url,
            "price": price,
            "currency": currency,
            "title": title or "",
            "scraped_at": datetime.now(timezone.utc).isoformat(),
        })

    def last_price(self, product_name: str) -> float | None:
        """Get the most recent price for a product."""
        rows = list(self.db.execute(
            "SELECT price FROM prices WHERE product_name = ? "
            "ORDER BY scraped_at DESC LIMIT 1",
            [product_name],
        ).fetchall())
        return rows[0][0] if rows else None

    def price_history(self, product_name: str, days: int = 30) -> list[dict]:
        """Get price history for the last N days."""
        now = datetime.now(timezone.utc).isoformat()
        # build the cutoff in the same 'T'-separated format as scraped_at,
        # so the string comparison lines up character for character
        rows = self.db.execute(
            "SELECT price, scraped_at FROM prices "
            "WHERE product_name = ? "
            "AND scraped_at > strftime('%Y-%m-%dT%H:%M:%S', ?, '-' || ? || ' days') "
            "ORDER BY scraped_at ASC",
            [product_name, now, days],
        ).fetchall()
        return [{"price": r[0], "date": r[1]} for r in rows]

The last_price method is key for change detection. Read the last price before you record the new observation; otherwise you would compare the new price with itself. The index on (product_name, scraped_at) keeps lookups fast even with months of history.
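last_price sorts on a text column, which works because ISO 8601 timestamps in a single timezone sort chronologically as plain strings. A small sketch of that behaviour, using the standard library's sqlite3 module and an in-memory database rather than sqlite-utils. The table is a simplified copy of the prices table, and the rows are made-up illustration data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (id INTEGER PRIMARY KEY, product_name TEXT, "
    "price REAL, scraped_at TEXT)"
)
conn.execute(
    "CREATE INDEX idx_prices_name_time ON prices (product_name, scraped_at)"
)

# Rows are inserted out of chronological order on purpose.
conn.executemany(
    "INSERT INTO prices (product_name, price, scraped_at) VALUES (?, ?, ?)",
    [
        ("Widget Pro", 189.00, "2024-05-02T08:00:00+00:00"),
        ("Widget Pro", 199.00, "2024-05-01T08:00:00+00:00"),
        ("Widget Pro", 179.00, "2024-05-03T08:00:00+00:00"),
    ],
)

# Same query shape as last_price: newest timestamp first, take one row.
row = conn.execute(
    "SELECT price FROM prices WHERE product_name = ? "
    "ORDER BY scraped_at DESC LIMIT 1",
    ("Widget Pro",),
).fetchone()
latest_price = row[0]
print(latest_price)
```

The query returns the 3 May observation even though it was inserted last among mixed dates, which is why every timestamp should be written in UTC with the same format.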

# Step 4: Change Detection and Alerts

python
import smtplib
from email.mime.text import MIMEText
import os


class PriceAlert:
    """Detect price changes and send alerts."""

    def __init__(self, threshold_pct: float = 5.0):
        self.threshold = threshold_pct / 100.0

    def check(self, product_name: str, new_price: float,
              old_price: float | None) -> dict | None:
        """Compare prices and return alert info if threshold is exceeded."""
        if old_price is None:
            # first observation, no comparison possible
            return None

        if old_price == 0:
            return None

        change_pct = (new_price - old_price) / old_price
        abs_change = abs(change_pct)

        if abs_change < self.threshold:
            return None

        direction = "dropped" if change_pct < 0 else "increased"
        return {
            "product": product_name,
            "old_price": old_price,
            "new_price": new_price,
            "change_pct": round(change_pct * 100, 1),
            "direction": direction,
        }


def send_alert_email(alerts: list[dict]):
    """Send a price change summary via email.

    Requires SMTP_HOST, SMTP_USER, SMTP_PASS, ALERT_EMAIL env vars.
    """
    if not alerts:
        return

    body_lines = ["Price changes detected:\n"]
    for a in alerts:
        body_lines.append(
            f"  {a['product']}: {a['old_price']:.2f} -> {a['new_price']:.2f} "
            f"({a['change_pct']:+.1f}% {a['direction']})"
        )

    body = "\n".join(body_lines)
    logger.info(body)

    smtp_host = os.environ.get("SMTP_HOST")
    if not smtp_host:
        logger.warning("SMTP_HOST not set — skipping email, logged above")
        return

    msg = MIMEText(body)
    msg["Subject"] = f"Price Alert: {len(alerts)} product(s) changed"
    msg["From"] = os.environ["SMTP_USER"]
    msg["To"] = os.environ["ALERT_EMAIL"]

    with smtplib.SMTP(smtp_host, 587) as server:
        server.starttls()
        server.login(os.environ["SMTP_USER"], os.environ["SMTP_PASS"])
        server.send_message(msg)

    logger.info(f"Alert email sent for {len(alerts)} price changes")

The threshold is configurable. 5% is a reasonable default — you do not want alerts for a $0.10 fluctuation on a $200 product. For high-value items, you might drop it to 2%. For commodity products where margins are thin, 1%.
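To make the threshold arithmetic concrete, here is a short sketch that mirrors the comparison in PriceAlert.check. The helper names pct_change and crosses_threshold are hypothetical, and the prices are illustrative:

```python
def pct_change(old_price: float, new_price: float) -> float:
    """Relative change from old_price to new_price, as a percentage."""
    return (new_price - old_price) / old_price * 100


def crosses_threshold(old_price: float, new_price: float,
                      threshold_pct: float) -> bool:
    """True when the absolute change is at least threshold_pct."""
    return abs(pct_change(old_price, new_price)) >= threshold_pct


# A 0.10 move on a 200.00 product is 0.05%, well under a 5% threshold.
print(crosses_threshold(200.00, 199.90, 5.0))
# A drop to 189.00 is -5.5%, so it triggers at 5%.
print(crosses_threshold(200.00, 189.00, 5.0))
# A drop to 195.00 is -2.5%: silent at 5%, but it triggers at 2%.
print(crosses_threshold(200.00, 195.00, 5.0))
print(crosses_threshold(200.00, 195.00, 2.0))
```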

# Step 5: The Monitoring Loop

python
import time
import random
import schedule


def run_check():
    """Run a single price check cycle across all configured products."""
    store = PriceStore()
    alerter = PriceAlert(threshold_pct=5.0)
    alerts = []

    for product in PRODUCTS:
        # polite delay between requests — 3-7 seconds
        delay = random.uniform(3.0, 7.0)
        logger.info(f"Checking {product.name} (waiting {delay:.1f}s)")
        time.sleep(delay)

        result = scrape_product(product)
        if result is None:
            logger.warning(f"Skipping {product.name} — scrape failed")
            continue

        old_price = store.last_price(product.name)
        store.record(
            product_name=result["name"],
            url=result["url"],
            price=result["price"],
            currency=result["currency"],
            title=result.get("title"),
        )

        alert = alerter.check(product.name, result["price"], old_price)
        if alert:
            alerts.append(alert)
            logger.info(
                f"PRICE CHANGE: {product.name} "
                f"{alert['old_price']:.2f} -> {alert['new_price']:.2f} "
                f"({alert['change_pct']:+.1f}%)"
            )
        else:
            logger.info(f"{product.name}: {result['price']:.2f} (no change)")

    if alerts:
        send_alert_email(alerts)

    logger.info(f"Check complete: {len(PRODUCTS)} products, {len(alerts)} alerts")


def main():
    """Run the price monitoring bot on a schedule."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [%(name)s] %(levelname)s %(message)s",
    )

    logger.info(f"Starting price monitor for {len(PRODUCTS)} products")

    # run once immediately, then on schedule
    run_check()

    # check every 4 hours, around the clock
    schedule.every(4).hours.do(run_check)

    while True:
        schedule.run_pending()
        time.sleep(60)


if __name__ == "__main__":
    main()

The random delay between 3 and 7 seconds is important. Fixed delays are a bot fingerprint — real users do not click at exact intervals. The schedule runs every 4 hours, which is frequent enough for price intelligence and infrequent enough to avoid getting blocked.
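The scheduler above fires every 4 hours around the clock. If checks should only run during business hours, one option is a small guard at the top of run_check. This is a sketch: the function name within_business_hours and the 09:00 to 18:00 weekday window are assumptions, and it uses whatever timezone the datetime you pass in carries:

```python
from datetime import datetime, time as dtime


def within_business_hours(now: datetime,
                          start: dtime = dtime(9, 0),
                          end: dtime = dtime(18, 0)) -> bool:
    """True on Monday to Friday between start (inclusive) and end (exclusive)."""
    return now.weekday() < 5 and start <= now.time() < end


# 1 May 2024 was a Wednesday, 4 May 2024 a Saturday.
print(within_business_hours(datetime(2024, 5, 1, 10, 0)))
print(within_business_hours(datetime(2024, 5, 1, 22, 0)))
print(within_business_hours(datetime(2024, 5, 4, 10, 0)))
```

Inside run_check, the guard could be as simple as returning early when within_business_hours(datetime.now()) is false.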

# Step 6: Analysing Price History

Once you have a few weeks of data, you can answer useful questions.

python
import pandas as pd


def price_report(db_path: str = "prices.db") -> pd.DataFrame:
    """Generate a summary report of price movements."""
    db = sqlite_utils.Database(db_path)

    df = pd.DataFrame(db.execute("""
        SELECT
            product_name,
            MIN(price) as min_price,
            MAX(price) as max_price,
            AVG(price) as avg_price,
            COUNT(*) as observations,
            MIN(scraped_at) as first_seen,
            MAX(scraped_at) as last_seen
        FROM prices
        GROUP BY product_name
    """).fetchall(), columns=[
        "product", "min", "max", "avg", "obs", "first_seen", "last_seen"
    ])

    df["range_pct"] = ((df["max"] - df["min"]) / df["avg"] * 100).round(1)
    return df


def find_price_drops(db_path: str = "prices.db",
                     min_drop_pct: float = 10.0) -> list[dict]:
    """Find significant price drops in the history.

    Useful for spotting competitor clearance sales or loss leaders.
    """
    db = sqlite_utils.Database(db_path)
    results = []

    for product_name in db.execute(
        "SELECT DISTINCT product_name FROM prices"
    ).fetchall():
        name = product_name[0]
        prices = db.execute(
            "SELECT price, scraped_at FROM prices "
            "WHERE product_name = ? ORDER BY scraped_at ASC",
            [name],
        ).fetchall()

        for i in range(1, len(prices)):
            prev_price = prices[i - 1][0]
            curr_price = prices[i][0]
            if prev_price > 0:
                change = (curr_price - prev_price) / prev_price * 100
                if change <= -min_drop_pct:
                    results.append({
                        "product": name,
                        "from_price": prev_price,
                        "to_price": curr_price,
                        "drop_pct": round(change, 1),
                        "date": prices[i][1],
                    })

    return results

The range_pct column shows how volatile each competitor's pricing is. A product with 30% range is being actively managed — watch it closely. A product with 2% range has stable pricing and probably is not worth checking every 4 hours.
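The range_pct formula is the spread between the highest and lowest observed price, divided by the average. A standalone sketch without pandas, using made-up price series to show the two cases described above:

```python
from statistics import mean


def range_pct(prices: list[float]) -> float:
    """(max - min) / mean, as a percentage rounded to one decimal place."""
    return round((max(prices) - min(prices)) / mean(prices) * 100, 1)


# Illustrative series, not real data.
actively_managed = [100.0, 85.0, 115.0, 100.0]
stable = [50.0, 50.5, 49.5, 50.0]

print(range_pct(actively_managed))
print(range_pct(stable))
```

The first series comes out at 30.0 and the second at 2.0, matching the "watch closely" and "check less often" cases.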

# Step 7: Hardening for Production

# Handle Stale Selectors

python
def selector_health_check(products: list[ProductConfig]):
    """Check if selectors are still working.

    Run this weekly. If a selector starts failing, you know the site
    was redesigned before you lose days of data.
    """
    broken = []
    for product in products:
        result = scrape_product(product)
        if result is None:
            broken.append(product.name)
            logger.error(f"All selectors broken for {product.name}")
        elif result.get("title") is None and product.title_selector:
            logger.warning(f"Title selector broken for {product.name}")

    if broken:
        # reuse the price-alert email format; the zero prices are placeholders
        send_alert_email([{
            "product": name,
            "old_price": 0,
            "new_price": 0,
            "change_pct": 0,
            "direction": "SELECTOR BROKEN - needs manual fix",
        } for name in broken])

    return broken

# Run with System Crontab

For production, use crontab instead of the Python scheduler:

bash
# check prices every 4 hours
0 */4 * * * cd /srv/apps/price-bot && /usr/bin/python3 monitor.py --once >> /var/log/price-bot.log 2>&1

Add a --once flag to your script:

python
import sys

if __name__ == "__main__":
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s [%(name)s] %(levelname)s %(message)s",
    )

    if "--once" in sys.argv:
        run_check()
    else:
        main()  # run the schedule loop

# What This Replaces

| Manual process | Bot equivalent |
| --- | --- |
| Checking competitor sites by hand | Automated checks every 4 hours |
| Spreadsheet of prices updated weekly | SQLite database with full history |
| Missing price drops because you forgot to check | Instant email alerts on threshold changes |
| Not knowing when competitors run sales | Historical analysis shows pricing patterns |
| Paying $200/month for a SaaS price monitoring tool | Your own bot for the cost of compute |

# Next Steps

For building the web scraping foundations that this bot builds on, see Web Scraping to Structured Data: Building Reliable Extraction Pipelines. For automating your Shopify store reports alongside competitor monitoring, see How to Automate Shopify Reports Using the Python API. For building alerting systems that actually get read, see Build a Notification System That Actually Gets Read. For securing the credentials this bot uses, see Python Secrets Management for Automation Pipelines.

Ecommerce optimisation services include building competitor price monitoring systems and automated pricing intelligence.

Get in touch to discuss setting up price monitoring for your ecommerce store.

# Frequently Asked Questions

Why use Playwright instead of BeautifulSoup or requests?
Many ecommerce sites render prices with JavaScript. Requests and BeautifulSoup only see the raw HTML before JavaScript runs, so the price elements are often empty or missing. Playwright runs a real browser that executes JavaScript, waits for the page to load, and then extracts the rendered content.

Will I get blocked by websites?
Possibly, if you scrape too aggressively. The bot in this guide uses randomized delays between requests, sends a realistic user agent, and runs on a conservative schedule (every 4 hours, not every second). It is also worth checking each site's robots.txt before adding it to the config. For most competitor monitoring use cases, checking prices a few times per day is enough and unlikely to trigger blocks.

Can this monitor prices on Amazon or Shopify stores?
In principle, yes. The bot uses CSS selectors that you configure per site, so it works on any site where the price appears in the rendered page. The example configs show the pattern; adjust the selectors for each target. Amazon has aggressive bot detection, so for Amazon specifically you may want to look at their Product Advertising API instead.

How do I run this on a schedule?
The guide uses the schedule library for simple cases and system crontab for production. You can also wrap it in a Prefect flow if you already use Prefect for orchestration.
