Python Automation: Real Workflows That Replace Manual Processes
Common automation workflows — Excel processing, API integrations, reporting pipelines — with before-and-after comparisons and working Python code for each.
Most "automation" guides show one trick. This one covers the workflows that actually matter in business — the ones people spend hours on every week and never question.
Every workflow below follows the same structure: what the manual process looks like, what replaces it, and the working Python code.
# The Automation Pattern
Every workflow automation I build follows this:
Manual trigger → Python script → Data in → Transform → Output → Scheduled
The specifics change. The pattern does not.
# Workflow 1: Excel Report Generation
# Before
- Open 3 spreadsheets
- Copy data into a master sheet
- Update formulas
- Fix formatting
- Email the file
- Repeat next week
Time: ~2 hours/week
# After
import pandas as pd
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
from datetime import datetime
def generate_weekly_report(source_files, output_path):
# Combine all sources
frames = [pd.read_excel(f) for f in source_files]
df = pd.concat(frames, ignore_index=True)
# Clean
df = df.dropna(subset=["revenue"])
df["date"] = pd.to_datetime(df["date"])
# Aggregate
summary = (
df.groupby("department")
.agg(
total_revenue=("revenue", "sum"),
total_orders=("order_id", "count"),
avg_order=("revenue", "mean"),
)
.reset_index()
)
# Write formatted Excel
wb = Workbook()
ws = wb.active
ws.title = "Weekly Summary"
headers = ["Department", "Revenue", "Orders", "Avg Order"]
header_fill = PatternFill(start_color="2F5496", end_color="2F5496", fill_type="solid")
for col, h in enumerate(headers, 1):
cell = ws.cell(row=1, column=col, value=h)
cell.font = Font(bold=True, color="FFFFFF")
cell.fill = header_fill
for i, row in summary.iterrows():
ws.cell(row=i + 2, column=1, value=row["department"])
ws.cell(row=i + 2, column=2, value=round(row["total_revenue"], 2))
ws.cell(row=i + 2, column=3, value=row["total_orders"])
ws.cell(row=i + 2, column=4, value=round(row["avg_order"], 2))
wb.save(output_path)
return output_path
# Run
generate_weekly_report(
["sales_north.xlsx", "sales_south.xlsx", "sales_west.xlsx"],
f"weekly_report_{datetime.now().strftime('%Y%m%d')}.xlsx",
)
Time after: ~5 seconds. Scheduled to run every Monday at 7 AM.
# What Changed
| Before | After |
|---|---|
| 3 files opened manually | All sources combined automatically |
| Copy-paste between sheets | pd.concat() handles merging |
| Manual formula updates | Aggregation in code — always correct |
| Formatting by hand | Consistent formatting every run |
| 2 hours/week | 5 seconds, zero human involvement |
# Workflow 2: API Data Collection
# Before
- Log into a dashboard
- Export CSV
- Open in Excel
- Filter relevant rows
- Paste into a tracker spreadsheet
Time: ~45 minutes/day
# After
import requests
import pandas as pd
from datetime import datetime, timedelta
def fetch_daily_metrics(api_url, api_key, days_back=1):
"""Fetch metrics from API and return clean DataFrame."""
since = (datetime.now() - timedelta(days=days_back)).isoformat()
response = requests.get(
f"{api_url}/metrics",
headers={"Authorization": f"Bearer {api_key}"},
params={"since": since},
timeout=30,
)
response.raise_for_status()
data = response.json()
df = pd.DataFrame(data["results"])
df["date"] = pd.to_datetime(df["timestamp"]).dt.date
df = df[["date", "page_views", "conversions", "revenue"]]
return df
def append_to_tracker(new_data, tracker_path):
"""Append new data to existing tracker spreadsheet."""
try:
existing = pd.read_excel(tracker_path)
combined = pd.concat([existing, new_data], ignore_index=True)
combined = combined.drop_duplicates(subset=["date"], keep="last")
except FileNotFoundError:
combined = new_data
combined.to_excel(tracker_path, index=False)
return len(new_data)
# Run
metrics = fetch_daily_metrics("https://api.example.com", "your-api-key")
rows_added = append_to_tracker(metrics, "kpi_tracker.xlsx")
print(f"Added {rows_added} rows to tracker")
Time after: ~3 seconds. Runs daily via cron.
# Workflow 3: Multi-Source Data Pipeline
This is the most common real workflow — pulling from multiple sources, cleaning, combining, and producing a single output.
# The Pipeline
Database (SQL) ──┐
├── Python ── Clean ── Merge ── Aggregate ── Report
API (JSON) ───┘
Excel (manual) ──┘
# Working Code
import pandas as pd
import sqlite3
def build_combined_report():
# Source 1: Database
conn = sqlite3.connect("operations.db")
db_data = pd.read_sql("SELECT * FROM orders WHERE date >= date('now', '-30 days')", conn)
conn.close()
# Source 2: Excel (manually maintained by another team)
manual_data = pd.read_excel("team_targets.xlsx")
# Clean each source
db_data["date"] = pd.to_datetime(db_data["date"])
db_data = db_data.dropna(subset=["amount"])
manual_data.columns = manual_data.columns.str.strip().str.lower()
# Merge on shared key
combined = db_data.merge(manual_data, on="department", how="left")
# Calculate vs target
summary = (
combined.groupby("department")
.agg(actual=("amount", "sum"), target=("monthly_target", "first"))
.reset_index()
)
summary["achievement"] = (summary["actual"] / summary["target"] * 100).round(1)
# Output
summary.to_excel("monthly_performance.xlsx", index=False)
return summary
report = build_combined_report()
print(report.to_string(index=False))
department actual target achievement
Finance 45200.0 50000.0 90.4
Operations 62800.0 60000.0 104.7
Sales 78500.0 75000.0 104.7
# Workflow 4: Scheduled Email Delivery
The last mile of any automation — getting the output to the right people without manual intervention.
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email import encoders
import os
def send_report(filepath, recipients, subject):
"""Send report as email attachment via SMTP."""
msg = MIMEMultipart()
msg["Subject"] = subject
msg["From"] = os.environ["SMTP_FROM"]
msg["To"] = ", ".join(recipients)
body = MIMEText(
f"Automated report attached.\nGenerated: {os.path.basename(filepath)}",
"plain",
)
msg.attach(body)
with open(filepath, "rb") as f:
part = MIMEBase("application", "octet-stream")
part.set_payload(f.read())
encoders.encode_base64(part)
part.add_header(
"Content-Disposition",
f"attachment; filename={os.path.basename(filepath)}",
)
msg.attach(part)
with smtplib.SMTP(os.environ["SMTP_HOST"], int(os.environ["SMTP_PORT"])) as server:
server.starttls()
server.login(os.environ["SMTP_USER"], os.environ["SMTP_PASS"])
server.send_message(msg)
Combine this with any of the workflows above and the entire chain — data collection, processing, formatting, delivery — runs without anyone touching it.
# Before vs After: The Numbers
| Workflow | Manual time | Automated time | Weekly saving |
|---|---|---|---|
| Excel reports | 2 hours/week | 5 seconds | ~2 hours |
| API data collection | 45 min/day | 3 seconds | ~3.5 hours |
| Multi-source pipeline | 3 hours/week | 10 seconds | ~3 hours |
| Report distribution | 30 min/week | Instant | ~30 min |
| Total | ~9 hours/week | < 1 minute | ~9 hours |
Over a year, that is ~468 hours — nearly 12 full work weeks — returned to the team.
# Tools Used
| Tool | Purpose |
|---|---|
| pandas | Data loading, cleaning, aggregation |
| openpyxl | Excel file creation with formatting |
| requests | API data fetching |
| sqlite3 | Database queries |
| smtplib | Email delivery |
| cron / Task Scheduler | Scheduling |
All standard Python libraries except pandas and openpyxl. No paid tools, no vendor lock-in.
# When to Automate
Automate when:
- The process happens weekly or more often
- The steps are predictable and repeatable
- Multiple people do the same data handling
- Errors from manual work cause downstream problems
Do not automate when:
- The process changes every time
- It requires human judgement at every step
- It happens once a year
# Next Steps
These workflows are starting points. Production systems add error handling, logging, alerting, and monitoring. But the core pattern — data in, transform, output, schedule — stays the same.
If your team is spending hours on any of these workflows, automation services can eliminate them. The ROI is usually measurable in weeks, not months.
For a deep dive into the Excel reporting workflow specifically, see Automate Excel Reports with Python.
Get in touch to discuss automating your workflows.
Enjoyed this article?
Get notified when I publish new articles on automation, ecommerce, and data engineering.