Email Scraper, MCA Lead Gen
An automated lead generation pipeline that scrapes court records and UCC filings for businesses with Merchant Cash Advance debt, enriches them with contact data, and sends targeted cold email sequences, producing 250-600 raw leads per day on autopilot
MCA Lead Generation System
One-Line Summary: An automated lead generation pipeline that scrapes court records and UCC filings for businesses with Merchant Cash Advance debt, enriches them with contact data, and sends targeted cold email sequences, producing 250-600 raw leads per day on autopilot.
--
Problem Statement
The client's core business is helping businesses resolve Merchant Cash Advance (MCA) debt. Finding these businesses is the fundamental growth challenge: companies struggling with MCA obligations rarely advertise that fact. Traditional lead generation methods (paid ads, referrals, purchased lists) are expensive and produce low-intent leads. Meanwhile, public records (court filings where MCA lenders sue borrowers, and UCC filings where lenders place liens) contain a goldmine of high-intent prospects who demonstrably have MCA debt and may need help resolving it. But manually searching court websites and Secretary of State databases is impractical at scale.
Solution
Built a three-stage automated pipeline that runs daily at 6 AM:
- Scrape: Two independent scrapers harvest leads from public records. The Tier 1 scraper targets NY court records for MCA lawsuit defendants (the highest-value leads, since these businesses are actively being sued by MCA lenders). The Tier 2 scraper targets Secretary of State UCC filing databases for businesses with MCA liens.
- Enrich: Raw leads (business name, sometimes just a docket number) are enriched with email addresses and phone numbers via the Apollo.io API and web scraping of company websites.
- Send: Enriched leads are deduplicated against previous runs and fed into Instantly.ai for automated cold email sequences.
Tech Stack
| Layer | Technology |
|---|---|
| Court Scraping | Selenium + BeautifulSoup (dynamic JS-rendered court search pages) |
| UCC Scraping | Selenium + BeautifulSoup (Secretary of State UCC databases) |
| Web Enrichment | Selenium (Google search + company website scraping for contact info) |
| API Enrichment | Apollo.io API (email/phone lookup) |
| Email Delivery | Instantly.ai API (cold email sequences) |
| Data Processing | pandas (deduplication, normalization, CSV/Excel export) |
| Scheduling | schedule library (daily 6 AM job) |
| Browser Driver | Chrome + chromedriver (via webdriver-manager) |
| Language | Python 3 |
Key Features
Tier 1: Court Record Scraper (Hottest Leads)
- Searches NY Courts website for MCA lender names as plaintiffs (30+ known MCA lenders in the search list: Libertas Funding, Yellowstone Capital, Pearl Capital, Credibly, OnDeck, BlueVine, etc.)
- Extracts defendant business names, case numbers, filing dates, court locations, and case status
- These are the highest-value leads: businesses currently being sued by their MCA lender are the most motivated to seek debt resolution
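The extraction step above can be sketched as follows. This is a minimal illustration, not the project's scraper: the table layout, the column order in `COLUMNS`, and the function names are all hypothetical, the sample row is made up, and the sketch uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra dependencies. It assumes Selenium has already rendered the page and handed over the HTML source.

```python
from html.parser import HTMLParser

# Hypothetical column order for one result row; the real page layout may differ.
COLUMNS = ["case_number", "defendant", "filing_date", "court", "status"]


class ResultTableParser(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Collapse runs of whitespace inside the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_court_results(html):
    """Return one dict per row that has the expected number of cells."""
    parser = ResultTableParser()
    parser.feed(html)
    return [dict(zip(COLUMNS, row)) for row in parser.rows if len(row) == len(COLUMNS)]


# Made-up sample markup, standing in for a rendered search results page.
SAMPLE_HTML = """
<table>
  <tr><th>Case</th><th>Defendant</th><th>Filed</th><th>Court</th><th>Status</th></tr>
  <tr><td>000001/2024</td><td>Example  Bakery LLC</td><td>2024-01-15</td>
      <td>Kings County</td><td>Active</td></tr>
</table>
"""

leads = parse_court_results(SAMPLE_HTML)
```

Rows with an unexpected number of cells are skipped rather than guessed at, which keeps a layout change on the court site from silently producing misaligned records.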
Tier 2: UCC Filing Scraper (Warm Leads)
- Searches Secretary of State UCC-1 filing databases for MCA lenders listed as secured parties
- Extracts debtor business names, filing numbers, dates, and secured party details
- Businesses with UCC liens from MCA lenders have active MCA obligations and may be candidates for restructuring
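Matching secured parties against the lender list might look like the sketch below. The suffix list, the prefix-matching rule, the `secured_party` field name, and the function names are assumptions made for illustration; only the lender names come from the search list mentioned earlier, and the real list is longer.

```python
import re

# A few lender names from the search list; the full list has 30+ entries.
KNOWN_LENDERS = [
    "Libertas Funding", "Yellowstone Capital", "Pearl Capital",
    "Credibly", "OnDeck", "BlueVine",
]

# Corporate suffixes dropped before comparison (an assumed, incomplete set).
SUFFIXES = {"llc", "inc", "corp", "co", "ltd", "lp"}


def normalize_name(name):
    """Lowercase, replace punctuation with spaces, and drop corporate suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


NORMALIZED_LENDERS = {normalize_name(n): n for n in KNOWN_LENDERS}


def match_lender(secured_party):
    """Return the known lender whose normalized name equals or prefixes the input, else None."""
    candidate = normalize_name(secured_party)
    for key, original in NORMALIZED_LENDERS.items():
        if candidate == key or candidate.startswith(key + " "):
            return original
    return None


def filter_mca_filings(filings):
    """Keep filings whose secured party matches a known lender, tagging each with the match."""
    matched = []
    for filing in filings:
        lender = match_lender(filing.get("secured_party", ""))
        if lender is not None:
            matched.append({**filing, "matched_lender": lender})
    return matched
```

Prefix matching is deliberately conservative: it catches variants such as "YELLOWSTONE CAPITAL, LLC" but will miss spellings like "On Deck Capital", so a production list would likely need aliases per lender.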
Contact Enrichment
- Web scraping layer: For each business, searches Google for the company website, navigates to it, and extracts email addresses and phone numbers using regex patterns. Follows "Contact" page links for deeper extraction.
- Apollo.io API layer: Supplements web scraping with Apollo.io's database for verified business email and phone data
- Filtering: Excludes invalid emails (image files, sentry, example domains) and normalizes phone formats
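A minimal sketch of the regex extraction and filtering described above. The patterns, the blocked-domain list, the US-only phone format, and the sample page text are assumptions, not the project's actual rules.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# US-style 10-digit numbers only: (212) 555-0100, 212-555-0100, 212.555.0100
PHONE_RE = re.compile(r"(?<!\d)\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)")

# Asset names such as logo@2x.png look like emails to the regex.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
# Substrings that mark error-tracking or placeholder domains (assumed list).
BLOCKED_DOMAIN_PARTS = ("sentry", "example.")


def extract_emails(text):
    """Return unique, lowercased emails in order of appearance, minus filtered ones."""
    seen, result = set(), []
    for raw in EMAIL_RE.findall(text):
        email = raw.lower()
        domain = email.split("@", 1)[1]
        if email.endswith(IMAGE_SUFFIXES):
            continue
        if any(part in domain for part in BLOCKED_DOMAIN_PARTS):
            continue
        if email not in seen:
            seen.add(email)
            result.append(email)
    return result


def normalize_phone(raw):
    """Format a 10-digit US number as (NNN) NNN-NNNN, or return None."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def extract_phones(text):
    """Return unique normalized phone numbers in order of appearance."""
    seen, result = set(), []
    for raw in PHONE_RE.findall(text):
        phone = normalize_phone(raw)
        if phone and phone not in seen:
            seen.add(phone)
            result.append(phone)
    return result


# Made-up page text with one valid email, three that should be filtered,
# and the same phone number written two ways.
SAMPLE_PAGE = (
    'Contact us at Info@acme-plumbing.test or call (212) 555-0100. '
    '<img src="logo@2x.png"> abc@sentry.io, test@example.com, 212.555.0100'
)

emails = extract_emails(SAMPLE_PAGE)
phones = extract_phones(SAMPLE_PAGE)
```

Normalizing before deduplicating means the two spellings of the same phone number collapse into one entry.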
Email Automation
- Enriched leads are pushed to Instantly.ai campaigns via API
- Automated multi-touch cold email sequences
- Deduplication against previous runs prevents duplicate outreach
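Deduplication against previous runs could be sketched as below. The tech stack lists pandas for this step; the sketch swaps in a plain set persisted as JSON to stay dependency-free, and the key rule (email first, business name as fallback), the field names, and the function names are assumptions.

```python
import json
from pathlib import Path


def lead_key(lead):
    """Dedup key: lowercased email if present, else the normalized business name."""
    email = (lead.get("email") or "").strip().lower()
    if email:
        return "email:" + email
    return "name:" + " ".join(lead.get("business_name", "").lower().split())


def load_seen(path):
    """Load the keys recorded by earlier runs; an absent file means a first run."""
    path = Path(path)
    return set(json.loads(path.read_text())) if path.exists() else set()


def save_seen(path, seen):
    """Persist the keys so the next run can skip them."""
    Path(path).write_text(json.dumps(sorted(seen)))


def filter_new_leads(leads, seen):
    """Return leads whose key is not in `seen`, adding each new key to `seen`."""
    fresh = []
    for lead in leads:
        key = lead_key(lead)
        if key not in seen:
            seen.add(key)
            fresh.append(lead)
    return fresh
```

A run would call `load_seen`, pass the enriched leads through `filter_new_leads`, push only the returned leads to the email campaign, and then call `save_seen`. Because `seen` is updated while filtering, duplicates within a single run are dropped as well.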
Daily Automation
- Pipeline scheduled to run daily at 6 AM
- Scrape, dedupe against historical data, enrich, and send, fully unattended
- Per-run logging with timestamped log files
- Output organized into subdirectories: court_records/, ucc_filings/, enriched/
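The run setup described above might be sketched like this, using only the standard library. The `logs/` directory, the log file naming, and the function names are assumptions; the three output subdirectories are the ones listed. With the `schedule` library from the tech stack, the daily trigger is typically registered as `schedule.every().day.at("06:00").do(job)` and driven by a loop that calls `schedule.run_pending()`; `next_run_at` below only illustrates the same timing rule.

```python
import logging
from datetime import datetime, timedelta
from pathlib import Path

OUTPUT_SUBDIRS = ("court_records", "ucc_filings", "enriched")


def next_run_at(now, hour=6):
    """Return the next datetime strictly after `now` that falls on `hour`:00."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    return candidate if candidate > now else candidate + timedelta(days=1)


def prepare_run(base_dir, now):
    """Create the output subdirectories and a timestamped log file for one run."""
    base = Path(base_dir)
    for name in OUTPUT_SUBDIRS:
        (base / name).mkdir(parents=True, exist_ok=True)

    log_dir = base / "logs"  # assumed location for per-run logs
    log_dir.mkdir(parents=True, exist_ok=True)
    log_path = log_dir / f"run_{now:%Y%m%d_%H%M%S}.log"

    # One logger per run, so each run writes to its own file.
    logger = logging.getLogger(f"pipeline.{log_path.stem}")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, log_path
```

A run started at 06:00 on 15 January 2024 would log to `logs/run_20240115_060000.log` under the base directory.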
Impact / Metrics
- Expected output: 250-600 raw leads per day from combined court and UCC sources
- Projected conversion: 3-12 new clients per month at typical MCA debt resolution conversion rates
- Cost: Near-zero marginal cost per lead (Apollo.io free tier + Instantly.ai at $37/month) compared to $50-200+ per lead from paid acquisition channels
- Lead quality: Court defendants and UCC debtors are demonstrably carrying MCA debt, a far stronger intent signal than cold outreach to general business lists
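As a rough sanity check, the figures above imply the following raw-lead-to-client conversion range. The 30-day month and the assumption that the pipeline runs every day are mine; the lead and client ranges are the ones stated in this section.

```python
DAYS_PER_MONTH = 30  # assumption: the pipeline runs every day of a 30-day month

leads_per_day = (250, 600)     # expected raw leads per day (low, high)
clients_per_month = (3, 12)    # projected new clients per month (low, high)

# Monthly raw lead volume at each end of the range.
monthly_leads = tuple(n * DAYS_PER_MONTH for n in leads_per_day)

# Widest implied conversion range: fewest clients over most leads,
# and most clients over fewest leads.
lowest_rate = clients_per_month[0] / monthly_leads[1]
highest_rate = clients_per_month[1] / monthly_leads[0]

print(f"Monthly raw leads: {monthly_leads[0]:,} to {monthly_leads[1]:,}")
print(f"Implied conversion: {lowest_rate:.3%} to {highest_rate:.3%}")
```

Under these assumptions the projection corresponds to roughly 7,500 to 18,000 raw leads per month and a conversion rate well under 1% of raw leads, so the client estimate does not depend on an aggressive close rate.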
Status
Built and tested. Scrapers validated against live court and UCC databases. Pipeline orchestrator configured for daily 6 AM runs. Apollo.io and Instantly.ai integrations ready for activation. Scaling to additional states beyond NY is planned.