Does this mean I should fire my SDR?

No. The SDR in this build was redirected to client strategy and relationship work -- tasks that require judgment, not volume. The sourcing engine handles the repeatable part. A good SDR applied to client-facing work is worth more than an SDR doing list pulls.

How does the split-pipeline architecture actually fail safer?

When sourcing and outbound share one flow, a failed enrichment step can poison the send queue or block it entirely. With the two split at a staging table, a bad enrichment row just sits there flagged while the outbound queue keeps running from the rows that passed.

Why write rejection reasons into the scoring prompt?

A yes/no output tells you who got rejected. Rejection reasons tell you why, at scale. When you can query a rejection_reason column across 2,000 rows, you can fix the source filter in minutes. Without reasons, you audit manually or not at all.

Can I run this with Smartlead or Lemlist instead of Instantly?

Yes. The outbound tool is swappable. The architecture lives in n8n and the staging table. Instantly was already in this agency's stack, so we kept it. Smartlead works the same way at the sequence layer.

What is the cheapest viable version of this stack?

Apollo free tier for lookup, n8n self-hosted at roughly $30/mo on a small VPS, and Claude Haiku for scoring via the API. Clay is optional if you do enrichment inside n8n directly. You can start under $100/mo and scale tooling as volume justifies it.

How a 5-Person Recruiting Agency Replaced Its SDR With a $0.08 Per Lead Sourcing Engine

Before	After
SDR at $55K/yr plus tool stack	2,073 qualified leads in 2 weeks from 4 LinkedIn sources
~300 qualified leads/quarter	~$0.08 per qualified lead
Single n8n flow where any failure cascaded	Sourcing + outbound pipelines split by a staging table
Qualification was a yes/no without traceable reasons	Scoring decisions auditable via SQL instead of manual review

Why the SDR-vs-software debate is the wrong frame

The conversation in recruiting circles usually goes one of two ways. Either "AI will replace SDRs" or "you can't automate relationship-building." Both are strawmen.

The actual question is: which part of the SDR's job is repeatable enough to automate, and which part requires a human who knows your clients?

For a 5-person boutique recruiting agency doing $500K to $2M in revenue, the answer is nearly everything upstream of the first real conversation. Sourcing names from LinkedIn, enriching contact data, scoring against a qualification rubric, routing to the right outbound sequence -- none of that requires a person. It requires a system that runs reliably at 3am without forgetting the criteria.

This agency was paying one SDR $55K/yr plus tool stack to produce about 300 qualified leads per quarter. That is roughly 23 qualified leads per week from a full-time hire. The ceiling was the human. The question was how high you could push volume if you removed that ceiling.

The two architecture choices that made this build work

Most small-agency automations fail for one of two reasons: a single pipeline failure kills the whole run, or qualification logic is so opaque that no one can debug why a lead was rejected six months later.

Both are architecture problems, and this build made one deliberate decision to address each.

Why we split sourcing from outbound into two pipelines

The original setup -- common for agencies that have patched together automations over time -- ran sourcing and outbound as one continuous n8n flow. A failed Clay enrichment would stall the sequence. An Instantly template error would back up the sourcing queue. One failure propagated sideways.

The fix is a staging table between the two pipelines. n8n orchestrates both sides, but they do not share a flow. Sourcing writes every enriched, scored lead to a Supabase staging table. Outbound reads from that table on its own schedule. A failed enrichment row sits in the table flagged for review. It does not touch the send queue.

The practical result: during a two-week live run pulling from four LinkedIn influencer follower lists, there were enrichment failures. They affected zero outbound sends, because the pipeline boundary absorbed them. You fix the sourcing side independently without touching the outbound side.

This is the same pattern data engineers use for ETL pipelines. A staging layer between ingest and serve means failures stay local. Recruiting automation has been slow to adopt it because most agencies build their first automation as one big flow and never go back to refactor it.
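The boundary described in this section can be sketched in a few lines. This is a minimal illustration, not the agency's actual build: SQLite stands in for the Supabase staging table, and the `sent` column, the `enrichment_failed` status value, and the function names are all hypothetical.

```python
import sqlite3

# Hypothetical schema. status is 'qualified', 'rejected', or 'enrichment_failed'.
SCHEMA = """
CREATE TABLE leads (
    id INTEGER PRIMARY KEY,
    email TEXT,
    status TEXT NOT NULL,
    rejection_reason TEXT,
    sent INTEGER NOT NULL DEFAULT 0
)
"""

def write_lead(conn, email, status, rejection_reason=None):
    """Sourcing side: record every lead, including failures. Never sends."""
    conn.execute(
        "INSERT INTO leads (email, status, rejection_reason) VALUES (?, ?, ?)",
        (email, status, rejection_reason),
    )

def next_outbound_batch(conn, limit=50):
    """Outbound side: read only qualified, unsent rows, on its own schedule."""
    return conn.execute(
        "SELECT id, email FROM leads "
        "WHERE status = 'qualified' AND sent = 0 ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()

def mark_sent(conn, lead_ids):
    """Outbound side: flag rows once they have been handed to the sender."""
    conn.executemany("UPDATE leads SET sent = 1 WHERE id = ?", [(i,) for i in lead_ids])

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
write_lead(conn, "a@example.com", "qualified")
write_lead(conn, None, "enrichment_failed")            # stays flagged for review
write_lead(conn, "b@example.com", "rejected", "title mismatch")
batch = next_outbound_batch(conn)                       # only the qualified row
```

The failed enrichment row is still in the table for whoever fixes the sourcing side, but the outbound reader never sees it, which is the whole point of the split.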

Why the scoring prompt had to give rejection reasons

Clay handles enrichment. Apollo handles contact lookup. Claude handles qualification scoring.

The initial instinct was to use Claude as a binary gate: score the lead, pass or reject, move on. That works for volume. It does not work for improvement.

When you run 2,000 leads through a binary scorer, you end up with 1,400 rejections and no way to know why. Was it company size? Title mismatch? Industry filter? You have to manually sample rejected rows, which at any real volume means you either hire someone to audit or you just accept that the system is a black box.

The change was requiring explicit rejection reasons in the prompt output. Every rejected lead gets a structured reason field: "title mismatch," "company too large," "industry outside scope," "insufficient seniority signal." That field writes to a column in the staging table.
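One way to enforce that structure is to validate the model's reply before anything is written to the staging table. The sketch below assumes the model is asked to answer in JSON; the prompt fragment, the field names, and the `parse_score` helper are hypothetical, and the reason list simply reuses the four categories named above.

```python
import json

# Reason vocabulary taken from the categories named above.
ALLOWED_REASONS = {
    "title mismatch",
    "company too large",
    "industry outside scope",
    "insufficient seniority signal",
}

# Hypothetical prompt fragment; the real scoring rubric is not shown here.
OUTPUT_INSTRUCTIONS = (
    'Respond with JSON only: {"status": "qualified" or "rejected", '
    '"rejection_reason": one of ' + json.dumps(sorted(ALLOWED_REASONS)) + " or null}"
)

def parse_score(raw: str) -> dict:
    """Validate a scoring reply so only well-formed rows reach the table."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    status = data.get("status")
    reason = data.get("rejection_reason")
    if status not in ("qualified", "rejected"):
        raise ValueError(f"unexpected status: {status!r}")
    if status == "rejected" and reason not in ALLOWED_REASONS:
        raise ValueError(f"missing or unknown rejection_reason: {reason!r}")
    if status == "qualified":
        reason = None  # qualified leads carry no rejection reason
    return {"status": status, "rejection_reason": reason}

row = parse_score('{"status": "rejected", "rejection_reason": "title mismatch"}')
```

Keeping the vocabulary closed matters for the audit step: free-text reasons would fragment into hundreds of near-duplicates and make a GROUP BY useless.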

Now the audit is a SQL query:

SELECT rejection_reason, COUNT(*) FROM leads WHERE status = 'rejected' GROUP BY rejection_reason;

The result tells you whether your LinkedIn source is pulling the wrong audience or whether your qualification criteria are too narrow. A half-day audit becomes a 30-second query.
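To see the shape of the result, here is the same query run against a handful of made-up rows. SQLite stands in for the staging database, the sample data is invented for illustration, and the `ORDER BY` clause is an addition so the largest category comes first.

```python
import sqlite3

AUDIT_QUERY = """
SELECT rejection_reason, COUNT(*) AS n
FROM leads
WHERE status = 'rejected'
GROUP BY rejection_reason
ORDER BY n DESC, rejection_reason
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (status TEXT, rejection_reason TEXT)")
conn.executemany(
    "INSERT INTO leads VALUES (?, ?)",
    [
        # Invented sample rows, not data from the live run.
        ("rejected", "title mismatch"),
        ("rejected", "title mismatch"),
        ("rejected", "company too large"),
        ("qualified", None),
    ],
)
breakdown = conn.execute(AUDIT_QUERY).fetchall()
for reason, n in breakdown:
    print(f"{reason}: {n}")
```

If one reason dominates the breakdown, the fix usually belongs upstream in the source list or filter rather than in the scoring prompt.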

Anthropic's documentation on structured outputs and prompt design covers the approach in detail. The key is treating the model's output as structured data from the start -- not as text you parse after the fact.

Two weeks of live runs: what the numbers looked like

The sourcing layer pulled follower lists from four LinkedIn influencers whose audiences overlap with the agency's target client profile. Not every follower qualifies. That is expected. The question is what percentage survives the qualification filter and at what cost.

First two weeks: 2,073 leads reached "qualified" status and entered the outbound staging table. From four source lists. In 14 days. The SDR's prior output was roughly 300 per quarter, or about 46 over the same two-week window.

The system ran without manual intervention once the initial source lists were loaded. Enrichment failures were logged. Rejections were categorized. Qualified leads were routed to the appropriate Instantly sequence based on company size.

The SDR role did not disappear. It shifted. The agency founder redirected that person toward client relationship management and closing conversations -- work that actually requires judgment and relationship history. The sourcing engine does not have a relationship with anyone. The SDR does.

The cost math

Stack costs at time of build:

Apollo: $500/mo for contact lookup and enrichment
Clay: $400/mo for enrichment workflows and waterfall logic
n8n self-hosted: approximately $30/mo on a small VPS
Instantly: already in the agency's existing stack
Claude via Anthropic API: minimal at Haiku rates for classification volume -- under $20/mo at 2,000 leads

Total new tooling cost: roughly $950/mo.

At 2,073 qualified leads over two weeks, that works out to approximately $0.08 per qualified lead when you annualize the tool cost against the lead volume. The SDR, at $55K/yr for roughly 1,200 qualified leads a year, cost about $46 per qualified lead on salary alone, before benefits and tool overhead. That is more than two orders of magnitude higher.
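The per-lead arithmetic is easy to rerun with your own numbers. The sketch below makes one specific assumption that may differ from the allocation used above: it charges the full $950/mo stack to the two-week window. Under that assumption the engine comes out nearer $0.21 per lead, so the per-lead figure is sensitive to which tools you charge to the sourcing run. The `cost_per_lead` helper is hypothetical.

```python
def cost_per_lead(monthly_cost: float, leads: int, window_days: int) -> float:
    """Tool cost allocated pro rata to the window, divided by leads produced in it."""
    window_cost = monthly_cost * 12 / 365 * window_days
    return window_cost / leads

# Apollo + Clay + n8n VPS + Claude (upper bound), per the stack list above.
stack_monthly = 500 + 400 + 30 + 20

engine = cost_per_lead(stack_monthly, leads=2073, window_days=14)
sdr = 55_000 / (300 * 4)  # salary only, at 300 qualified leads per quarter

print(f"engine: ${engine:.2f}/lead, SDR: ${sdr:.2f}/lead")
# prints: engine: $0.21/lead, SDR: $45.83/lead
```

Whichever allocation you use, the gap to the SDR baseline stays above two orders of magnitude.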

The build is priced per scope, fixed during the AI Operations X-Ray. Payback came from the first placed candidate -- for a boutique recruiting agency, a single placement covers the implementation cost with margin left over.

Who this applies to

This architecture is a fit for recruiting and staffing agencies with 3 to 15 people where:

At least one person's primary job is sourcing and qualifying leads
Lead volume is capped by how much time that person has
The qualification criteria exist somewhere in someone's head but are not written down in a way a system can execute
The agency is already paying for one or more of these tools and using them manually

It is not a fit for agencies where the founder personally knows every client and the business runs on referrals. In that case, the bottleneck is not sourcing volume -- it is relationship capacity, and automation does not solve that.

What I'd revisit

The LinkedIn sourcing layer depends on the quality of each influencer's follower list. Some niches have tight, well-defined audiences. Others have a lot of noise. A cleaner version of this build would layer Apollo company filters as a second pre-enrichment pass, so only companies that match the basic firmographic criteria get enriched at all. That would reduce Clay credit spend on leads that were never going to qualify.

I would also add a confidence score to the Claude rejection reason field. Right now you know the rejection category. With a confidence score, you can identify borderline rejections and route them to a manual review queue instead of dropping them entirely.
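A rough sketch of what that routing could look like, assuming the model returns a confidence between 0 and 1 alongside the reason. The 0.7 cutoff, the status names, and the `route_rejection` helper are hypothetical and would need tuning against real rejections.

```python
REVIEW_THRESHOLD = 0.7  # hypothetical cutoff; tune against sampled rejections

def route_rejection(reason: str, confidence: float) -> str:
    """Keep confident rejections; send borderline ones to a human."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    if confidence >= REVIEW_THRESHOLD:
        return "rejected"
    return "manual_review"

decisions = [
    route_rejection("company too large", 0.95),
    route_rejection("insufficient seniority signal", 0.55),
]
```

The review queue then stays small: only rejections the model itself was unsure about reach a person.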

Want this for your agency? Run the AI Operations X-Ray.