Python Automation: Real Workflows That Replace Manual Processes


Common automation workflows — Excel processing, API integrations, reporting pipelines — with before-and-after comparisons and working Python code for each.

Most "automation" guides show one trick. This one covers the workflows that actually matter in business — the ones people spend hours on every week and never question.

Every workflow below follows the same structure: what the manual process looks like, what replaces it, and the working Python code.

# The Automation Pattern

Every workflow automation I build follows this:

```text
Manual trigger → Python script → Data in → Transform → Output → Scheduled
```

The specifics change. The pattern does not.
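As a sketch, the pattern is three steps composed behind one entry point that a scheduler can call. The `load`, `transform`, and `write` helpers below are placeholders for whatever your sources and outputs require, not library calls:

```python
def load(source):
    """Placeholder: read one source (file, API, database)."""
    return {"source": source}

def transform(frames):
    """Placeholder: clean, merge, and aggregate the raw inputs."""
    return {"rows": len(frames)}

def write(result, path):
    """Placeholder: write the output artifact."""
    with open(path, "w") as f:
        f.write(str(result))

def run_pipeline(sources, output_path):
    """Data in -> transform -> output; the scheduler calls this."""
    raw = [load(s) for s in sources]
    result = transform(raw)
    write(result, output_path)
    return output_path
```

Each workflow below is this skeleton with real loaders and writers plugged in.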

# Workflow 1: Excel Report Generation

# Before

  1. Open 3 spreadsheets
  2. Copy data into a master sheet
  3. Update formulas
  4. Fix formatting
  5. Email the file
  6. Repeat next week

Time: ~2 hours/week

# After

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
from datetime import datetime

def generate_weekly_report(source_files, output_path):
    # Combine all sources
    frames = [pd.read_excel(f) for f in source_files]
    df = pd.concat(frames, ignore_index=True)

    # Clean
    df = df.dropna(subset=["revenue"])
    df["date"] = pd.to_datetime(df["date"])

    # Aggregate
    summary = (
        df.groupby("department")
        .agg(
            total_revenue=("revenue", "sum"),
            total_orders=("order_id", "count"),
            avg_order=("revenue", "mean"),
        )
        .reset_index()
    )

    # Write formatted Excel
    wb = Workbook()
    ws = wb.active
    ws.title = "Weekly Summary"

    headers = ["Department", "Revenue", "Orders", "Avg Order"]
    header_fill = PatternFill(start_color="2F5496", end_color="2F5496", fill_type="solid")

    for col, h in enumerate(headers, 1):
        cell = ws.cell(row=1, column=col, value=h)
        cell.font = Font(bold=True, color="FFFFFF")
        cell.fill = header_fill

    for i, row in summary.iterrows():
        ws.cell(row=i + 2, column=1, value=row["department"])
        ws.cell(row=i + 2, column=2, value=round(row["total_revenue"], 2))
        ws.cell(row=i + 2, column=3, value=row["total_orders"])
        ws.cell(row=i + 2, column=4, value=round(row["avg_order"], 2))

    wb.save(output_path)
    return output_path

# Run
generate_weekly_report(
    ["sales_north.xlsx", "sales_south.xlsx", "sales_west.xlsx"],
    f"weekly_report_{datetime.now().strftime('%Y%m%d')}.xlsx",
)
```

Time after: ~5 seconds. Scheduled to run every Monday at 7 AM.
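On Linux, that Monday 7 AM schedule is a single crontab entry (the interpreter and script paths below are placeholders; Windows users would create an equivalent Task Scheduler job):

```shell
# min hour day-of-month month day-of-week  command
0 7 * * 1 /usr/bin/python3 /path/to/weekly_report.py
```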

# What Changed

| Before | After |
| --- | --- |
| 3 files opened manually | All sources combined automatically |
| Copy-paste between sheets | `pd.concat()` handles merging |
| Manual formula updates | Aggregation in code, always correct |
| Formatting by hand | Consistent formatting every run |
| 2 hours/week | 5 seconds, zero human involvement |

# Workflow 2: API Data Collection

# Before

  1. Log into a dashboard
  2. Export CSV
  3. Open in Excel
  4. Filter relevant rows
  5. Paste into a tracker spreadsheet

Time: ~45 minutes/day

# After

```python
import requests
import pandas as pd
from datetime import datetime, timedelta

def fetch_daily_metrics(api_url, api_key, days_back=1):
    """Fetch metrics from API and return clean DataFrame."""
    since = (datetime.now() - timedelta(days=days_back)).isoformat()

    response = requests.get(
        f"{api_url}/metrics",
        headers={"Authorization": f"Bearer {api_key}"},
        params={"since": since},
        timeout=30,
    )
    response.raise_for_status()
    data = response.json()

    df = pd.DataFrame(data["results"])
    df["date"] = pd.to_datetime(df["timestamp"]).dt.date
    df = df[["date", "page_views", "conversions", "revenue"]]

    return df

def append_to_tracker(new_data, tracker_path):
    """Append new data to existing tracker spreadsheet."""
    try:
        existing = pd.read_excel(tracker_path)
        combined = pd.concat([existing, new_data], ignore_index=True)
        combined = combined.drop_duplicates(subset=["date"], keep="last")
    except FileNotFoundError:
        combined = new_data

    combined.to_excel(tracker_path, index=False)
    return len(new_data)

# Run
metrics = fetch_daily_metrics("https://api.example.com", "your-api-key")
rows_added = append_to_tracker(metrics, "kpi_tracker.xlsx")
print(f"Added {rows_added} rows to tracker")
```

Time after: ~3 seconds. Runs daily via cron.

# Workflow 3: Multi-Source Data Pipeline

This is the most common real workflow — pulling from multiple sources, cleaning, combining, and producing a single output.

# The Pipeline

```text
Database (SQL) ──┐
API (JSON)    ───┼── Python ── Clean ── Merge ── Aggregate ── Report
Excel (manual) ──┘
```

# Working Code

```python
import pandas as pd
import sqlite3

def build_combined_report():
    # Source 1: Database
    conn = sqlite3.connect("operations.db")
    db_data = pd.read_sql("SELECT * FROM orders WHERE date >= date('now', '-30 days')", conn)
    conn.close()

    # Source 2: Excel (manually maintained by another team)
    manual_data = pd.read_excel("team_targets.xlsx")

    # Source 3: an API feed (e.g. fetch_daily_metrics from Workflow 2)
    # would slot in here the same way

    # Clean each source
    db_data["date"] = pd.to_datetime(db_data["date"])
    db_data = db_data.dropna(subset=["amount"])

    manual_data.columns = manual_data.columns.str.strip().str.lower()

    # Merge on shared key
    combined = db_data.merge(manual_data, on="department", how="left")

    # Calculate vs target
    summary = (
        combined.groupby("department")
        .agg(actual=("amount", "sum"), target=("monthly_target", "first"))
        .reset_index()
    )
    summary["achievement"] = (summary["actual"] / summary["target"] * 100).round(1)

    # Output
    summary.to_excel("monthly_performance.xlsx", index=False)
    return summary

report = build_combined_report()
print(report.to_string(index=False))
```
Example output:

```text
 department   actual   target  achievement
    Finance  45200.0  50000.0         90.4
 Operations  62800.0  60000.0        104.7
      Sales  78500.0  75000.0        104.7
```

# Workflow 4: Scheduled Email Delivery

The last mile of any automation — getting the output to the right people without manual intervention.

```python
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email import encoders
import os

def send_report(filepath, recipients, subject):
    """Send report as email attachment via SMTP."""
    msg = MIMEMultipart()
    msg["Subject"] = subject
    msg["From"] = os.environ["SMTP_FROM"]
    msg["To"] = ", ".join(recipients)

    body = MIMEText(
        f"Automated report attached.\nGenerated: {os.path.basename(filepath)}",
        "plain",
    )
    msg.attach(body)

    with open(filepath, "rb") as f:
        part = MIMEBase("application", "octet-stream")
        part.set_payload(f.read())
        encoders.encode_base64(part)
        part.add_header(
            "Content-Disposition",
            f"attachment; filename={os.path.basename(filepath)}",
        )
        msg.attach(part)

    with smtplib.SMTP(os.environ["SMTP_HOST"], int(os.environ["SMTP_PORT"])) as server:
        server.starttls()
        server.login(os.environ["SMTP_USER"], os.environ["SMTP_PASS"])
        server.send_message(msg)
```

Combine this with any of the workflows above and the entire chain — data collection, processing, formatting, delivery — runs without anyone touching it.
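As a sketch of that chain, assuming the `generate_weekly_report` and `send_report` functions above are available: the glue is a few lines. Here the two steps are passed in as callables so each stays independently testable and swappable:

```python
from datetime import datetime

def run_weekly_chain(generate, send, source_files, recipients):
    """Glue step: build the report, then email it.

    `generate` and `send` stand in for generate_weekly_report
    and send_report from the workflows above.
    """
    output = generate(
        source_files,
        f"weekly_report_{datetime.now().strftime('%Y%m%d')}.xlsx",
    )
    send(output, recipients, subject=f"Weekly report {datetime.now():%Y-%m-%d}")
    return output
```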

# Before vs After: The Numbers

| Workflow | Manual time | Automated time | Weekly saving |
| --- | --- | --- | --- |
| Excel reports | 2 hours/week | 5 seconds | ~2 hours |
| API data collection | 45 min/day | 3 seconds | ~3.75 hours |
| Multi-source pipeline | 3 hours/week | 10 seconds | ~3 hours |
| Report distribution | 30 min/week | Instant | ~30 min |
| **Total** | ~9 hours/week | < 1 minute | ~9 hours |

Over a year, that is roughly 468 hours (nearly 12 full work weeks) returned to the team.

# Tools Used

| Tool | Purpose |
| --- | --- |
| pandas | Data loading, cleaning, aggregation |
| openpyxl | Excel file creation with formatting |
| requests | API data fetching |
| sqlite3 | Database queries |
| smtplib | Email delivery |
| cron / Task Scheduler | Scheduling |

Everything here ships with Python except pandas, openpyxl, and requests, which are free, widely used open-source packages. No paid tools, no vendor lock-in.

# When to Automate

Automate when:

  • The process happens weekly or more often
  • The steps are predictable and repeatable
  • Multiple people do the same data handling
  • Errors from manual work cause downstream problems

Do not automate when:

  • The process changes every time
  • It requires human judgement at every step
  • It happens once a year

# Next Steps

These workflows are starting points. Production systems add error handling, logging, alerting, and monitoring. But the core pattern — data in, transform, output, schedule — stays the same.
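A retry wrapper with logging is usually the first hardening step. A minimal sketch (the attempt count, delay, and logger name are arbitrary choices, not a prescribed standard):

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline")

def with_retries(step, attempts=3, delay=5):
    """Run a pipeline step, retrying on failure and logging each attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            logger.exception("Attempt %d/%d failed", attempt, attempts)
            if attempt == attempts:
                raise  # surface the failure so the scheduler can alert
            time.sleep(delay)
```

Wrapping each stage (fetch, transform, send) separately means one flaky API call does not force the whole chain to rerun.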

If your team is spending hours on any of these workflows, automation services can eliminate them. The ROI is usually measurable in weeks, not months.

For a deep dive into the Excel reporting workflow specifically, see Automate Excel Reports with Python.

Get in touch to discuss automating your workflows.
