Email Scraper, MCA Lead Gen

An automated lead generation pipeline that scrapes court records and UCC filings for businesses with Merchant Cash Advance debt, enriches them with contact data, and sends targeted cold email sequences, producing 250-600 raw leads per day on autopilot.

250-600 leads per day generated
30+ MCA lender search terms
Near-zero marginal cost per lead

MCA Lead Generation System

One-Line Summary: An automated lead generation pipeline that scrapes court records and UCC filings for businesses with Merchant Cash Advance debt, enriches them with contact data, and sends targeted cold email sequences, producing 250-600 raw leads per day on autopilot.

--

Problem Statement

The client's core business is helping businesses resolve Merchant Cash Advance (MCA) debt. Finding these businesses is the fundamental growth challenge: companies struggling with MCA obligations rarely advertise that fact. Traditional lead generation methods (paid ads, referrals, purchased lists) are expensive and produce low-intent leads. Meanwhile, public records (court filings where MCA lenders sue borrowers, and UCC filings where lenders place liens) contain a goldmine of high-intent prospects who demonstrably have MCA debt and may need help resolving it. Manually searching court websites and Secretary of State databases, however, is impractical at scale.

Solution

Built a three-stage automated pipeline that runs daily at 6 AM:

  1. Scrape: Two independent scrapers harvest leads from public records. The Tier 1 scraper targets NY court records for MCA lawsuit defendants (the highest-value leads, since these businesses are actively being sued by MCA lenders). The Tier 2 scraper targets Secretary of State UCC filing databases for businesses with MCA liens.

  2. Enrich: Raw leads (business name, sometimes just a docket number) are enriched with email addresses and phone numbers via the Apollo.io API and web scraping of company websites.

  3. Send: Enriched leads are deduplicated against previous runs and fed into Instantly.ai for automated cold email sequences.
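The three stages above can be sketched as a small orchestrator. This is a minimal illustration, not the project's actual code: `run_pipeline`, the stub scrapers, and the sample business names are all hypothetical, and the real stages would call Selenium, Apollo.io, and Instantly.ai instead of the stubs shown here.

```python
from typing import Callable, Dict, Iterable, List

Lead = Dict[str, str]  # hypothetical record shape, e.g. {"business_name": ..., "source": ...}


def run_pipeline(
    scrapers: Iterable[Callable[[], List[Lead]]],
    enrich: Callable[[Lead], Lead],
    drop_seen: Callable[[List[Lead]], List[Lead]],
    send: Callable[[List[Lead]], None],
) -> List[Lead]:
    """Run scrape -> enrich -> dedupe -> send and return the leads that were sent."""
    # Stage 1: collect raw leads from every scraper (court records, UCC filings).
    raw: List[Lead] = []
    for scrape in scrapers:
        raw.extend(scrape())
    # Stage 2: attach contact data to each raw lead.
    enriched = [enrich(lead) for lead in raw]
    # Stage 3: drop leads contacted in earlier runs, then hand off for outreach.
    fresh = drop_seen(enriched)
    send(fresh)
    return fresh


# Stub stages with made-up data, standing in for the real integrations.
def court_scraper() -> List[Lead]:
    return [{"business_name": "Acme Bakery LLC", "source": "court"}]


def ucc_scraper() -> List[Lead]:
    return [{"business_name": "Delta Towing Inc", "source": "ucc"}]


def stub_enrich(lead: Lead) -> Lead:
    return {**lead, "email": "owner@acmebakery.test"}


previously_contacted = {"delta towing inc"}


def stub_drop_seen(leads: List[Lead]) -> List[Lead]:
    return [l for l in leads if l["business_name"].lower() not in previously_contacted]


outbox: List[Lead] = []
fresh = run_pipeline([court_scraper, ucc_scraper], stub_enrich, stub_drop_seen, outbox.extend)
print([lead["business_name"] for lead in fresh])
```

Passing each stage in as a callable keeps the orchestrator independent of any one data source, which would also make it easier to add scrapers for additional states.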

Tech Stack

Layer | Technology
Court Scraping | Selenium + BeautifulSoup (dynamic JS-rendered court search pages)
UCC Scraping | Selenium + BeautifulSoup (Secretary of State UCC databases)
Web Enrichment | Selenium (Google search + company website scraping for contact info)
API Enrichment | Apollo.io API (email/phone lookup)
Email Delivery | Instantly.ai API (cold email sequences)
Data Processing | pandas (deduplication, normalization, CSV/Excel export)
Scheduling | schedule library (daily 6 AM job)
Browser Driver | Chrome + chromedriver (via webdriver-manager)
Language | Python 3

Key Features

Tier 1: Court Record Scraper (Hottest Leads)

  • Searches NY Courts website for MCA lender names as plaintiffs (30+ known MCA lenders in the search list: Libertas Funding, Yellowstone Capital, Pearl Capital, Credibly, OnDeck, BlueVine, etc.)
  • Extracts defendant business names, case numbers, filing dates, court locations, and case status
  • These are the highest-value leads: businesses currently being sued by their MCA lender are the most motivated to seek debt resolution

Tier 2: UCC Filing Scraper (Warm Leads)

  • Searches Secretary of State UCC-1 filing databases for MCA lenders listed as secured parties
  • Extracts debtor business names, filing numbers, dates, and secured party details
  • Businesses with UCC liens from MCA lenders have active MCA obligations and may be candidates for restructuring
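Both tiers yield the same kind of record with slightly different fields, so one way to carry them through the rest of the pipeline is a shared record type. The sketch below is an assumption about how that could look: `RawLead`, the row keys, and the sample rows are hypothetical and do not reflect the real scrapers' output format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RawLead:
    business_name: str
    tier: int                      # 1 = court defendant, 2 = UCC debtor
    lender: str                    # plaintiff (Tier 1) or secured party (Tier 2)
    record_id: str                 # case number or UCC filing number
    filed_on: str                  # ISO date string, e.g. "2024-01-15"
    court_location: Optional[str] = None
    case_status: Optional[str] = None


def from_court_row(row: dict) -> RawLead:
    """Map one scraped court search result (hypothetical keys) to a RawLead."""
    return RawLead(
        business_name=row["defendant"].strip(),
        tier=1,
        lender=row["plaintiff"].strip(),
        record_id=row["case_number"],
        filed_on=row["filing_date"],
        court_location=row.get("court"),
        case_status=row.get("status"),
    )


def from_ucc_row(row: dict) -> RawLead:
    """Map one scraped UCC-1 filing (hypothetical keys) to a RawLead."""
    return RawLead(
        business_name=row["debtor"].strip(),
        tier=2,
        lender=row["secured_party"].strip(),
        record_id=row["filing_number"],
        filed_on=row["filing_date"],
    )


# Made-up sample rows for illustration only.
leads = [
    from_ucc_row({"debtor": " Delta Towing Inc ", "secured_party": "Example Funding LLC",
                  "filing_number": "UCC-0001", "filing_date": "2024-01-10"}),
    from_court_row({"defendant": "Acme Bakery LLC", "plaintiff": "Example Funding LLC",
                    "case_number": "CASE-0001", "filing_date": "2024-01-12",
                    "court": "Kings County", "status": "Active"}),
]

# Work the hottest leads first: Tier 1 before Tier 2, oldest filing first within a tier.
prioritized = sorted(leads, key=lambda lead: (lead.tier, lead.filed_on))
print([(lead.tier, lead.business_name) for lead in prioritized])
```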

Contact Enrichment

  • Web scraping layer: For each business, searches Google for the company website, navigates to it, and extracts email addresses and phone numbers using regex patterns. Follows "Contact" page links for deeper extraction.
  • Apollo.io API layer: Supplements web scraping with Apollo.io's database for verified business email and phone data
  • Filtering: Excludes invalid emails (image files, sentry, example domains) and normalizes phone formats
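The regex extraction and filtering described above might look roughly like the following. The patterns, the blocklist, and the output phone format are illustrative assumptions rather than the project's actual rules, and the phone pattern only handles US-style 10-digit numbers.

```python
import re
from typing import List

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional +1 prefix, then 3-3-4 digits separated by spaces, dots, or dashes.
PHONE_RE = re.compile(r"(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})")

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")
BLOCKED_DOMAIN_PARTS = {"sentry", "example"}  # assumed blocklist


def extract_emails(text: str) -> List[str]:
    """Return unique, lowercased emails, skipping asset names and junk domains."""
    found: List[str] = []
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        domain = email.split("@", 1)[1]
        if email.endswith(IMAGE_SUFFIXES):
            continue  # e.g. "logo@2x.png" is an image file name, not an address
        if BLOCKED_DOMAIN_PARTS & set(domain.split(".")):
            continue  # error-tracker DSNs and placeholder domains
        if email not in found:
            found.append(email)
    return found


def extract_phones(text: str) -> List[str]:
    """Return unique phone numbers normalized to "(AAA) BBB-CCCC"."""
    found: List[str] = []
    for area, prefix, line in PHONE_RE.findall(text):
        phone = f"({area}) {prefix}-{line}"
        if phone not in found:
            found.append(phone)
    return found


# Made-up page text for illustration.
page_text = """
Contact: Info@AcmeBakery.test or call 212.555.0147 / (212) 555-0199.
<img src="logo@2x.png"> dsn=abc123@o12345.ingest.sentry.io user@example.com
"""

emails = extract_emails(page_text)
phones = extract_phones(page_text)
print(emails)
print(phones)
```

A pattern this loose can still match 10-digit runs that are not phone numbers, so a production version would likely validate area codes or restrict matching to visible page text.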

Email Automation

  • Enriched leads are pushed to Instantly.ai campaigns via API
  • Automated multi-touch cold email sequences
  • Deduplication against previous runs prevents duplicate outreach
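Deduplication against previous runs needs a stable key per business and a record of what has already been sent. The pipeline itself uses pandas for this; the sketch below shows the same idea with only the standard library. The `dedupe_key` normalization rules, the suffix list, and the one-column history file are assumptions made for illustration.

```python
import csv
import re
import tempfile
from pathlib import Path
from typing import Dict, List

LEGAL_SUFFIXES = {"llc", "inc", "corp", "co", "ltd"}  # assumed list


def dedupe_key(name: str) -> str:
    """Lowercase, strip punctuation and trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def filter_new(leads: List[Dict[str, str]], history_path: Path) -> List[Dict[str, str]]:
    """Return leads not seen in earlier runs and append their keys to the history file."""
    seen = set()
    if history_path.exists():
        with history_path.open(newline="") as fh:
            seen = {row[0] for row in csv.reader(fh) if row}
    fresh = []
    for lead in leads:
        key = dedupe_key(lead["business_name"])
        if key and key not in seen:
            seen.add(key)  # also catches duplicates within the same run
            fresh.append(lead)
    with history_path.open("a", newline="") as fh:
        writer = csv.writer(fh)
        for lead in fresh:
            writer.writerow([dedupe_key(lead["business_name"])])
    return fresh


# Two simulated daily runs against a temporary history file (made-up names).
history = Path(tempfile.mkdtemp()) / "sent_history.csv"
day_one = filter_new(
    [{"business_name": "Acme Bakery, LLC"}, {"business_name": "Delta Towing Inc."}],
    history,
)
day_two = filter_new(
    [{"business_name": "ACME BAKERY LLC"}, {"business_name": "Harbor Cafe Corp"}],
    history,
)
print(len(day_one), [lead["business_name"] for lead in day_two])
```

With pandas, the equivalent would be to build the same key column and combine `drop_duplicates` with an `isin` check against the historical keys.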

Daily Automation

  • Pipeline scheduled to run daily at 6 AM
  • Scrape, dedupe against historical data, enrich, and send, fully unattended
  • Per-run logging with timestamped log files
  • Output organized into subdirectories: court_records/, ucc_filings/, enriched/
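The run setup described above (a 6 AM trigger, a timestamped log file, and per-source output folders) can be sketched with the standard library alone. With the schedule library, the trigger would typically be registered as `schedule.every().day.at("06:00").do(job)` and driven by a loop calling `schedule.run_pending()`; the `next_run` helper here only illustrates the timing logic. The `logs/` directory and the log file naming are assumptions, since the source names only the three output subdirectories.

```python
import logging
import tempfile
from datetime import datetime, time, timedelta
from pathlib import Path
from typing import Tuple

OUTPUT_SUBDIRS = ("court_records", "ucc_filings", "enriched")


def next_run(now: datetime, run_at: time = time(6, 0)) -> datetime:
    """Return the next occurrence of run_at strictly after now."""
    candidate = datetime.combine(now.date(), run_at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def prepare_run(base_dir: Path, started: datetime) -> Tuple[logging.Logger, Path]:
    """Create output folders and a logger writing to a timestamped file."""
    for name in OUTPUT_SUBDIRS:
        (base_dir / name).mkdir(parents=True, exist_ok=True)
    log_dir = base_dir / "logs"  # assumed location for log files
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = f"{started:%Y%m%d_%H%M%S}"
    log_path = log_dir / f"run_{stamp}.log"
    logger = logging.getLogger(f"pipeline.{stamp}")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger, log_path


# Example run in a temporary directory.
base = Path(tempfile.mkdtemp())
logger, log_path = prepare_run(base, datetime(2024, 1, 15, 6, 0, 0))
logger.info("pipeline started")
print(log_path.name, next_run(datetime(2024, 1, 15, 5, 30)))
```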

Impact / Metrics

  • Expected output: 250-600 raw leads per day from combined court and UCC sources
  • Projected conversion: 3-12 new clients per month at typical MCA debt resolution conversion rates
  • Cost: Near-zero marginal cost per lead (Apollo.io free tier + Instantly.ai at $37/month) compared to $50-200+ per lead from paid acquisition channels
  • Lead quality: Court defendants and UCC debtors are demonstrably carrying MCA debt, which signals far higher intent than cold outreach to general business lists
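As a rough sanity check on the figures above, the arithmetic below derives the monthly lead volume, the raw-lead-to-client conversion rate implied by the projection, and the tool cost per raw lead. It assumes a 30-day month, that the pipeline runs every day, and that the $37/month Instantly.ai plan is the only recurring cost (the Apollo.io free tier covering the volume is the source's claim, not verified here).

```python
DAYS_PER_MONTH = 30                 # assumption: pipeline runs every day
LEADS_PER_DAY = (250, 600)          # low and high estimates from the metrics above
CLIENTS_PER_MONTH = (3, 12)         # projected new clients per month
MONTHLY_TOOL_COST = 37.0            # Instantly.ai plan, USD

# Raw leads per month at the low and high daily volumes: (7500, 18000).
leads_per_month = tuple(n * DAYS_PER_MONTH for n in LEADS_PER_DAY)

# Implied raw-lead-to-client conversion, from the most pessimistic pairing
# (fewest clients, most leads) to the most optimistic (most clients, fewest leads).
lowest_rate = CLIENTS_PER_MONTH[0] / leads_per_month[1]
highest_rate = CLIENTS_PER_MONTH[1] / leads_per_month[0]

# Tool cost per raw lead at each volume.
cost_per_lead = tuple(MONTHLY_TOOL_COST / n for n in leads_per_month)

print(f"leads/month: {leads_per_month[0]}-{leads_per_month[1]}")
print(f"implied conversion: {lowest_rate:.4%} to {highest_rate:.4%}")
print(f"tool cost per raw lead: ${cost_per_lead[1]:.4f} to ${cost_per_lead[0]:.4f}")
```

These rates are per raw lead; the rate per enriched, emailed lead would be higher, since not every raw lead ends up with a usable email address.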

Status

Built and tested. Scrapers validated against live court and UCC databases. Pipeline orchestrator configured for daily 6 AM runs. Apollo.io and Instantly.ai integrations ready for activation. Scaling to additional states beyond NY is planned.

Selenium, BeautifulSoup, Apollo.io, Instantly.ai, pandas
