ReplyLabs
© 2026 Empra Consultancy LTD. All rights reserved.


How waterfall enrichment works (and how to do it in Sheets)

Waterfall enrichment queries multiple data sources in sequence and keeps the first verified match. This post covers how it lifts coverage from roughly 50% to 80% or more, and how to run a version of it in Google Sheets.

By Hugo Dupont · 7 min read

Waterfall enrichment is a method that queries several data sources in sequence for the same field (an email, a phone number, a firmographic value) and keeps the first verified match, paying for the next source only when the previous one comes up empty. It exists because no single provider has complete coverage: one vendor typically resolves verified contact data for only 40 to 60 percent of a B2B list, so a single source leaves nearly half your rows blank. Stacking sources closes that gap. This post explains how the waterfall mechanic actually works, the difference between sequential and parallel waterfalls, the coverage and cost maths, and how to run the practical version inside Google Sheets.

How does waterfall enrichment work?

The core mechanic is a cascade. When a record is missing a field, the system queries the first provider. If that provider returns a verified value, the waterfall stops and the row is done. If it comes back empty, the workflow falls through to the second provider, then the third, and so on until a match is confirmed or the sources are exhausted.

The key word is verified. A good waterfall validates each result before accepting it. For an email, that means syntax, domain, and SMTP checks, so the cascade does not stop at the first source that returns a plausible-looking but dead address. Each step both finds and checks, then either accepts or cascades down.

This is why the order of sources matters. You put your cheapest, highest-coverage provider first so most rows resolve early and cheaply, and reserve the expensive specialists for the tail of hard-to-find records that the earlier sources missed.
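The cascade described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the provider functions, the sample addresses, and `toy_verify` are hypothetical stand-ins, and a real verifier would run the syntax, domain, and SMTP checks mentioned earlier.

```python
from typing import Callable, Optional, Sequence, Tuple

# A provider takes a record and returns a candidate value, or None on a miss.
Provider = Callable[[dict], Optional[str]]


def run_waterfall(
    record: dict,
    providers: Sequence[Tuple[str, Provider]],
    verify: Callable[[str], bool],
) -> Tuple[Optional[str], Optional[str], int]:
    """Return (value, source_name, lookups_made) for one record."""
    lookups = 0
    for name, lookup in providers:
        lookups += 1
        candidate = lookup(record)
        # Accept only a verified value; otherwise fall through to the next source.
        if candidate is not None and verify(candidate):
            return candidate, name, lookups
    return None, None, lookups


# Hypothetical providers, ordered cheapest and broadest first.
def cheap_broad(record: dict) -> Optional[str]:
    known = {"acme.com": "jane@acme.com", "globex.com": "old@globex.invalid"}
    return known.get(record["domain"])


def pricey_specialist(record: dict) -> Optional[str]:
    return {"globex.com": "sam@globex.com"}.get(record["domain"])


def toy_verify(email: str) -> bool:
    # Placeholder for real syntax, domain, and SMTP checks.
    return "@" in email and not email.endswith(".invalid")


PROVIDERS = [("cheap_broad", cheap_broad), ("pricey_specialist", pricey_specialist)]

easy = run_waterfall({"domain": "acme.com"}, PROVIDERS, toy_verify)
hard = run_waterfall({"domain": "globex.com"}, PROVIDERS, toy_verify)
miss = run_waterfall({"domain": "initech.com"}, PROVIDERS, toy_verify)
```

In this toy run the first record resolves at the cheap source after one lookup, while the second falls through because the cheap source returned an address that fails verification. The third exhausts both sources and stays empty.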

Why does waterfall enrichment beat single-source enrichment?

Because every provider has gaps, and the gaps rarely line up. One vendor might have strong email coverage for North American tech companies but thin data for European healthcare; another is the reverse. Querying just one means living with its blind spots.

The numbers are consistent across the market. Single-source tools return verified contact data for roughly 40 to 60 percent of a typical B2B list. Waterfall enrichment, by stacking providers so each fills the slice the previous one missed, pushes match rates to 80 percent or above, and the best multi-source services report match rates in the mid-90s. On a 1,000-contact list, the difference is often 300 or more additional resolved records. Bounce rates fall too: single-source emails bounce at 5 to 15 percent, while a waterfall with verification at each step can push the bounce rate below 1 percent.
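To see roughly where a figure like 80 percent comes from, here is a back-of-envelope calculation. It assumes each source's misses are independent of the others, which is a simplification (real providers overlap in ways this ignores), and the three match rates are illustrative, not vendor data.

```python
def stacked_coverage(rates):
    """Combined match rate if each source's misses were independent."""
    miss = 1.0
    for rate in rates:
        miss *= 1.0 - rate  # share of rows still unresolved after this source
    return 1.0 - miss


rates = [0.50, 0.40, 0.30]  # illustrative per-source match rates (assumed)
combined = stacked_coverage(rates)
extra_per_1000 = round((combined - rates[0]) * 1000)

print(f"combined coverage: {combined:.0%}")
print(f"extra resolved rows per 1,000 vs the first source alone: {extra_per_1000}")
```

Under those assumptions, three middling sources stack to about 79 percent, or roughly 290 more resolved rows per 1,000 than the first source alone. That is the same order of magnitude as the figures quoted above.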

Sequential vs parallel waterfalls

There are two shapes, and the choice is a cost-versus-speed trade-off.

A sequential waterfall queries one source at a time and stops at the first verified match. It is the cheaper shape because you only pay the next provider when the previous one failed, so you buy the most coverage at the lowest cost. The downside is latency: a hard-to-find row may have to fall through five providers before it resolves.

A parallel waterfall fires every source at once and picks the highest-confidence answer. It is faster and useful when you need a cold list resolved in minutes, but it pays for every lookup on every row, including the ones the first source would have answered alone. Most teams default to sequential and reserve parallel for time-critical runs.
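For contrast with the sequential cascade, a parallel fan-out might look like the sketch below. The three sources and their confidence scores are made up for illustration, and treating every call as a billed lookup is an assumption about pricing, not a statement about any particular tool.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(record, providers):
    """Query every provider at once and keep the highest-confidence answer.

    Each provider returns (value, confidence) or None on a miss.
    """
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        results = list(pool.map(lambda provider: provider(record), providers))
    answers = [result for result in results if result is not None]
    best = max(answers, key=lambda answer: answer[1], default=None)
    # Every provider ran, so every provider counts as a lookup (assumed billing).
    return best, len(providers)


# Hypothetical sources with invented confidence scores.
def source_a(record):
    return ("info@globex.com", 0.55)


def source_b(record):
    return None


def source_c(record):
    return ("sam@globex.com", 0.90)


best, billed = run_parallel({"domain": "globex.com"}, [source_a, source_b, source_c])
```

The row resolves in the time of the slowest single call rather than the sum of all of them, but three lookups are consumed even though one good answer would have been enough.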

The cost trap most people miss

The part that quietly drains budgets is billing on failure. Many tools charge a credit when a lookup runs, not when it succeeds, so a low-coverage source can consume 20 to 30 percent of an allocation on rows it never resolved. In a sequential waterfall this is doubly painful, because the rows that fall furthest down the cascade are exactly the ones that hit the most providers and rack up the most failed charges.

The honest version of a waterfall bills only for fields it actually filled. When you are choosing or building one, that single rule (pay on success, not on attempt) matters more to the real cost per lead than the headline credit price.
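A quick model makes the gap concrete. Everything here is an assumption chosen for illustration: a flat price of 0.10 per lookup at every source, the same price under both billing models, and hit rates of 50, 40, and 30 percent on the rows that reach each source.

```python
def waterfall_cost(rows, hit_rates, price, bill_on_success):
    """Expected spend for a sequential waterfall.

    hit_rates[i] is the share of rows reaching source i that it resolves.
    Returns (total_spend, rows_resolved, cost_per_resolved_row).
    """
    remaining = float(rows)
    spend = 0.0
    resolved = 0.0
    for rate in hit_rates:
        hits = remaining * rate
        # Pay-on-attempt bills every row that reaches this source;
        # pay-on-success bills only the rows it actually filled.
        billed = hits if bill_on_success else remaining
        spend += billed * price
        resolved += hits
        remaining -= hits
    return spend, resolved, spend / resolved


on_attempt = waterfall_cost(1000, [0.5, 0.4, 0.3], 0.10, bill_on_success=False)
on_success = waterfall_cost(1000, [0.5, 0.4, 0.3], 0.10, bill_on_success=True)

print(f"pay on attempt: {on_attempt[0]:.2f} total, {on_attempt[2]:.3f} per resolved row")
print(f"pay on success: {on_success[0]:.2f} total, {on_success[2]:.3f} per resolved row")
```

With these numbers both models resolve the same 790 rows, but billing on attempt costs about 180 against 79 for billing on success, more than double the cost per resolved row at an identical headline price.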

How to run a practical waterfall in Google Sheets

You do not need a 15-vendor orchestration platform to get the benefit of waterfall logic for company-level data. A spreadsheet add-on that can scrape pages and extract fields with AI runs the practical version of a waterfall, because a large share of firmographic data lives on the open web across several pages.

Here is the pipeline, run from a sidebar:

  1. Scrape the most likely source first. Open the sidebar with Extensions > ReplyLabs > Open sidebar, select your rows, and run Scrape on the domain column. The about, product, and careers text lands in a context column. This is your first "provider."
  2. Extract the fields with AI. In the AI tab, write a prompt that reads the scraped text: "From {{Scraped text}}, return the industry, employee range, and headquarters city, one per column. If the page does not say, return Unknown." Rows that resolve are done.
  3. Cascade the misses to a second source. Filter to the rows that came back Unknown, scrape a different page for those (a press release, a careers listing, a LinkedIn company page), and re-run the extraction on just those rows. This is the waterfall step: a second source filling the gap the first one left.
  4. Verify the emails on the survivors. Run Verify on the email column so dead addresses never enter a send, the same SMTP-style check a contact waterfall applies at each step.

Because every step writes to its own column, you can see exactly which source filled which field, and you re-run only the rows that are still thin rather than re-paying for the whole list. The full chained version (scrape, extract, score, verify) is laid out in the lead enrichment in Google Sheets guide, and the web scraping in Google Sheets page covers the scrape step in depth.
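If it helps to see the row logic outside the spreadsheet, the cascade in steps 1 to 3 can be modelled in plain Python. This is only a model of the filtering, not the add-on's code: the rows are dictionaries standing in for sheet rows, and the two extractor functions are hypothetical stand-ins for "scrape a page, then extract with AI".

```python
UNKNOWN = "Unknown"


def cascade(rows, sources, field):
    """Fill `field` from the first source that returns a value.

    Each source also writes to its own column, so you can see which one
    filled the field and which ones came back empty.
    """
    for source_name, extract in sources:
        for row in rows:
            if row.get(field, UNKNOWN) != UNKNOWN:
                continue  # resolved by an earlier source; do not re-run it
            value = extract(row)
            row[f"{field} ({source_name})"] = value
            row[field] = value  # stays Unknown until some source resolves it
    return rows


# Hypothetical extractors standing in for scrape + AI extraction.
def from_about_page(row):
    return {"acme.com": "Software"}.get(row["domain"], UNKNOWN)


def from_careers_page(row):
    return {"globex.com": "Manufacturing"}.get(row["domain"], UNKNOWN)


rows = [{"domain": "acme.com"}, {"domain": "globex.com"}, {"domain": "initech.com"}]
sources = [("about page", from_about_page), ("careers page", from_careers_page)]
cascade(rows, sources, "industry")
```

After the run, the first row has only an about-page column because it never reached the second source, the second row shows Unknown from the about page and a value from the careers page, and the third is still Unknown in both, which marks it as a row to try against another source.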

Where a Sheets waterfall stops, honestly

This approach handles the practical waterfall for company-level data that lives on public pages: scrape the most likely source, extract, then cascade the misses to another source and re-run. What it does not do is fan a single row across 15 licensed contact databases for high-coverage personal mobile numbers and deep technographics. That data is bought, not scraped, and a dedicated platform like Clay or a specialist data vendor will out-cover an open-web approach there. The trade-off is cost: the Sheets waterfall is far cheaper and fully auditable, and it wins for firmographic enrichment and AI personalisation, which is the common case for most outbound teams. For the full comparison see how to enrich leads without Clay, and to model your own numbers use the AI cost calculator.

Common questions

What is waterfall enrichment in simple terms?

It is querying several data sources one after another for the same field and keeping the first verified match, only paying for the next source when the previous one comes up empty. It exists because no single provider covers every record.

How much does waterfall enrichment improve coverage?

Single-source tools typically resolve 40 to 60 percent of a B2B list. A waterfall that stacks providers usually reaches 80 percent or higher, often 30 to 40 percentage points better, because each source fills the gaps of the one before it.

Is sequential or parallel waterfall better?

Sequential is cheaper because it stops at the first match and only pays the next source on a miss. Parallel is faster because it queries every source at once, but it pays for every lookup. Most teams use sequential and reserve parallel for time-critical runs.

Can I do waterfall enrichment in Google Sheets?

You can run the practical version for company-level data: scrape the most likely page, extract with AI, then cascade the rows that came back blank to a second source and re-run. It does not replace a 15-vendor licensed contact waterfall, but it covers firmographic enrichment at a fraction of the cost.

Why do failed lookups still cost money on some tools?

Because many platforms charge a credit when a lookup runs, not when it succeeds, so low-coverage sources drain credits on rows they never resolved. ReplyLabs charges only for rows that return a result, and new accounts start with a free $20 credit.
