Building an AI Email Triage System

I manage IT for a handful of small professional offices. That means I’m responsible for multiple M365 tenants, a few Gmail accounts, and the steady stream of alerts, notifications, vendor emails, and actual important messages that flow through all of them.

The volume isn’t overwhelming. It’s manageable. But “manageable” means I have to actually manage it: check inboxes, scan subjects, decide what matters, decide what doesn’t. Multiply that across accounts and the daily triage adds up. Not in hours, but in attention. Switching between six inboxes and deciding “does this matter right now?” gets old fast, even when the answer is usually no.

So I built an AI agent to do it.

The setup

The agent is Triss, running on OpenClaw on a Beelink SER5 mini PC under my desk. She monitors email on a cron schedule, every 30 minutes during work hours, and delivers a summary to Telegram. Only what matters. Everything else gets suppressed.
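To make the schedule concrete, here is a minimal sketch of "every 30 minutes during work hours". The hours and weekdays are assumptions (the post does not state them), and the names `is_monitor_slot` and `CRON_EXPRESSION` are hypothetical, not part of OpenClaw.

```python
from datetime import datetime

# Assumed values: the post only says "every 30 minutes during work hours".
WORK_START_HOUR = 8          # hypothetical start of the work day
WORK_END_HOUR = 18           # hypothetical end of the work day (exclusive)
WORKDAYS = {0, 1, 2, 3, 4}   # Monday through Friday (datetime.weekday())

# A standard crontab expression that matches the same assumptions:
# minute 0 and 30, hours 8 through 17, Monday through Friday.
CRON_EXPRESSION = "*/30 8-17 * * 1-5"


def is_monitor_slot(now: datetime) -> bool:
    """Return True if a monitor run would fire at this exact minute."""
    return (
        now.weekday() in WORKDAYS
        and WORK_START_HOUR <= now.hour < WORK_END_HOUR
        and now.minute % 30 == 0
    )
```

Under these assumptions the last run of the day fires at 5:30 PM, and nothing runs on weekends.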

The triage rules are opinionated and specific to how I work:

Always surface: VIP contacts (a short list of people whose email I always want to see immediately), financial alerts (bank notifications, failed payments, Zelle from unknown senders), security items (quarantine alerts, authentication emails, credential resets), and anything time-sensitive.

Smart suppression: Routine vendor marketing, newsletter digests, automated receipts from known recurring payments, notification spam. If I’ve seen the same type of email from the same sender 50 times and never acted on it, Triss learns to skip it.

The gray area: Everything in between gets a judgment call from the model based on the triage rules. When in doubt, surface it. I’d rather dismiss a false positive in Telegram than miss something real.
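The precedence of those three buckets can be sketched in code. In the actual setup the model applies the rules itself, so this deterministic version is only an illustration of the ordering. Every address, keyword, and function name here is a made-up placeholder, and plain keyword matching is far cruder than a model's judgment (for example, it cannot tell a failed payment from a routine recurring receipt).

```python
from collections import Counter
from dataclasses import dataclass

# Illustrative placeholders only, not the real lists.
VIP_SENDERS = {"jay@example.com"}
FINANCIAL_KEYWORDS = ("payment", "zelle", "bank")
SECURITY_KEYWORDS = ("quarantine", "password reset", "verification code")
NOISE_KEYWORDS = ("newsletter", "unsubscribe", "% off")
SKIP_THRESHOLD = 50  # "seen ... 50 times and never acted on it"


@dataclass
class Email:
    sender: str
    subject: str


def triage(email, ignored_counts):
    """Return (decision, reason); decision is 'surface', 'suppress', or 'model'."""
    sender = email.sender.lower()
    subject = email.subject.lower()
    # Always-surface rules are checked first so they can never be suppressed.
    if sender in VIP_SENDERS:
        return ("surface", "VIP")
    if any(k in subject for k in FINANCIAL_KEYWORDS):
        return ("surface", "FINANCIAL")
    if any(k in subject for k in SECURITY_KEYWORDS):
        return ("surface", "SECURITY")
    # Suppression: long history of ignored mail, or obvious marketing noise.
    if ignored_counts.get(sender, 0) >= SKIP_THRESHOLD:
        return ("suppress", "ignored history")
    if any(k in subject for k in NOISE_KEYWORDS):
        return ("suppress", "routine")
    # Everything else is the gray area and goes to the model.
    return ("model", "gray area")


def final_decision(decision, model_verdict=None):
    """When in doubt, surface: only an explicit 'suppress' verdict hides mail."""
    if decision != "model":
        return decision
    return "suppress" if model_verdict == "suppress" else "surface"
```

A quick usage example: `triage(Email("someone@client.example", "Question about the printer"), Counter())` falls through to the gray area, and `final_decision("model")` then surfaces it because no verdict said otherwise.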

What delivery looks like

Triss sends a Telegram message that looks something like this:

📬 Inbox Monitor — 2:07 PM

Hopewell:
→ [VIP] Jay responded to the fiber install thread — confirming Tuesday
→ Azure CPU alert on SRV02 — brief spike, resolved
→ (3 routine notifications suppressed)

StoneCreek:
→ [FINANCIAL] Intuit payment confirmation — $247.50
→ New ScreenConnect OTP request from cloud@screenconnect.com
→ (5 routine notifications suppressed)

Gmail: All clear.

Short and scannable. If something needs a response, there’s a one-line action suggestion. If it’s informational, it says so and moves on. The suppressed count tells me the system is working without listing every piece of noise it filtered.
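A message in that shape is easy to assemble from structured triage results. The sketch below is one possible way to render the per-account sections. The function names and the `(tag, summary)` tuple shape are assumptions made for illustration, and the timestamp header is left out for brevity.

```python
def format_account(name, items, suppressed):
    """Render one account's section.

    items: list of (tag, summary) pairs; tag may be None for untagged items.
    suppressed: how many routine messages were filtered out.
    """
    if not items and suppressed == 0:
        return f"{name}: All clear."
    lines = [f"{name}:"]
    for tag, summary in items:
        prefix = f"[{tag}] " if tag else ""
        lines.append(f"→ {prefix}{summary}")
    if suppressed:
        noun = "notification" if suppressed == 1 else "notifications"
        lines.append(f"→ ({suppressed} routine {noun} suppressed)")
    return "\n".join(lines)


def format_body(accounts):
    """Join account sections with a blank line between them."""
    return "\n\n".join(format_account(*account) for account in accounts)
```

The resulting string is plain text, so it could be passed as the `text` field of the Telegram Bot API's `sendMessage` method without any special formatting mode.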

The morning briefing

Separate from the 30-minute monitor, Triss delivers a morning briefing at 7 AM. This one’s broader: overnight email across all accounts, today’s calendar, upcoming appointments for the next few days, and anything flagged overnight that I should know about before the day starts.

The morning briefing uses a 12-hour lookback window instead of the monitor’s 1-hour window, so it catches everything that came in between the last evening check and wake-up. I learned this the hard way: the first version used the same 1-hour window as the monitor, which meant the 7 AM briefing only showed email that arrived between 6 and 7 AM. Everything that arrived at midnight or 3 AM was invisible.
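The bug is easy to see once the window is written down. Here is a small sketch, with hypothetical names (`LOOKBACK`, `in_window`), of a per-job lookback using the 1-hour and 12-hour values mentioned above:

```python
from datetime import datetime, timedelta

# Lookback per job type, using the values from the post.
LOOKBACK = {
    "monitor": timedelta(hours=1),
    "briefing": timedelta(hours=12),
}


def window_start(now: datetime, job: str) -> datetime:
    """Earliest received-time that a run of this job will still include."""
    return now - LOOKBACK[job]


def in_window(received: datetime, now: datetime, job: str) -> bool:
    """True if an email received at `received` is visible to this run."""
    return window_start(now, job) <= received <= now


briefing_time = datetime(2024, 1, 8, 7, 0)    # a 7 AM briefing
overnight_mail = datetime(2024, 1, 8, 3, 0)   # arrived at 3 AM

# With the monitor's 1-hour window the 3 AM email is invisible;
# with the 12-hour briefing window it is included.
missed = not in_window(overnight_mail, briefing_time, "monitor")
caught = in_window(overnight_mail, briefing_time, "briefing")
```

Keeping the window in one table keyed by job type means a new job cannot silently inherit the wrong lookback.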

What worked from day one

The triage logic. The model understood priority immediately. VIP contacts got flagged. Financial alerts got surfaced. Noise got suppressed. The rules were clear enough that the model applied them consistently, and the few edge cases I caught were easy to fix by tightening the rule definitions.

What didn’t work

Everything about the data pipeline. The model was calling APIs directly, parsing raw HTML email bodies, and trying to build its own context from scratch every run. Expensive, fragile, and prone to breaking in ways that took real debugging to track down.

I ended up rebuilding the entire data layer. That’s a separate post.