You enrich your leads with Apollo, Lusha, or Cognism — but do you actually know where that data comes from? Behind every found email and retrieved phone number, there’s a specific data source: an official registry, a professional network, a crowdsourced database, or a web crawler.
This isn’t a trivial question. The source directly determines data freshness, accuracy, GDPR compliance, and geographic coverage. A tool that performs brilliantly on the US market can be nearly useless in France or Germany if its underlying sources don’t cover those regions well.
In this article, we break down the 6 major families of B2B data sources, the tools that rely on each one, and — most importantly — which profile should use which database depending on their context.
Enrich your leads directly in Google Sheets
Derrick aggregates multiple data sources to find emails, phone numbers, and company data for your prospects — without leaving Google Sheets.
Why the data source changes everything in B2B
Before comparing enrichment tools, you need to understand something that’s often overlooked: two tools offering the same feature can produce radically different results, simply because they draw from different sources.
Here’s a concrete example. Mike, an SDR at a B2B SaaS company, uses a US-based enrichment tool to prospect in France. He gets a 38% email completion rate. His colleague, using a tool built for the European market, achieves 71% on the exact same list. Same workflow, same target — different data sources.
This comes down to three factors:
Geographic coverage. Most major databases were built primarily from North American sources. Their coverage drops significantly in European markets, especially France, Germany, and Southern Europe.
Data freshness. Some sources update in real time (LinkedIn, official registries), while others are static snapshots that age quickly. According to a Scalability study, 32% of sales reps’ time is lost contacting bad prospects due to inaccurate or incomplete data.
The nature of the data collected. A source might be excellent for firmographic data (company size, industry, revenue) but poor for direct contact emails — or vice versa.
With that context in mind, let’s break down each source family in detail.
Source #1: LinkedIn — the gold mine of contact data
LinkedIn is by far the most widely used source for B2B contact enrichment. With over 1 billion members worldwide, the platform is the most up-to-date professional registry in existence — because users themselves keep their profiles current.
What LinkedIn provides
Data accessible through LinkedIn includes current job title, company, past experience, location, declared skills, and network connections. It’s the primary source for contact profile data (title, seniority, department) and declarative firmographic data (LinkedIn headcount, industry, HQ location).
LinkedIn doesn’t directly provide emails or phone numbers — but tools that use LinkedIn as an entry point can cross-reference this data with other databases to build a fuller profile.
Who uses this source?
LinkedIn-oriented tools (scrapers, Sales Navigator import solutions, and profile enrichers) rely on LinkedIn as their primary source. That’s the case with Derrick, whose LinkedIn Profile Scraper enriches a profile with 50+ attributes from a LinkedIn URL, and whose Phone Finder from LinkedIn finds phone numbers directly from LinkedIn profiles.
Who is it for?
LinkedIn as a data source is ideal for SDRs and BDRs prospecting on target lists built in Sales Navigator, recruiters sourcing candidates, and account-based marketing teams working from named account lists. Global coverage is solid, with particularly strong depth in tech, finance, and professional services.
Source #2: Official registries and government open data
In Europe, a significant share of firmographic data comes from official sources: Companies House in the UK, the INSEE SIRENE database in France (11 million French companies), the INPI, trade court registries, and Eurostat for aggregate statistics at the European level. In the US, the closest equivalents are SEC filings and state business registries, with Dun & Bradstreet (D&B) serving as a widely used private reference database.
These sources provide extremely reliable legal data: company registration numbers, legal name, industry classification, registered address, headcount (legal), revenue, directors, incorporation date, and legal status.
What this source delivers
Government open data — sourced from trade courts, INSEE, and the INPI in France — lets you enrich a B2B database with company directors, revenue, legal headcount, subsidiaries, and registered office. It’s one of the most reliable data sources available.
Reliability is the main advantage here. Legal data is certified and regularly updated, and its purely firmographic component raises few GDPR concerns (director names, however, remain personal data).
Who uses this source?
Platforms like Societeinfo, Pappers, and Kompass rely heavily on these registries. French and Europe-oriented enrichment solutions — including Pharow — use “SIRENization” (matching a company name to its official SIREN number) as the backbone of firmographic enrichment.
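To make SIRENization more concrete, here is a minimal sketch of the matching step. It assumes a tiny in-memory registry extract; the company names, the SIREN numbers, the `LEGAL_FORMS` list, and the `sirenize` helper are all made up for illustration. A real pipeline would query the SIRENE database and handle ambiguous matches far more carefully.

```python
import difflib
import re
import unicodedata
from typing import Optional

# Hypothetical registry extract: normalized legal name -> SIREN (fake numbers).
REGISTRY = {
    "acme logistique": "123456789",
    "boulangerie dupont": "987654321",
}

# A few common French legal forms to ignore when comparing names (illustrative list).
LEGAL_FORMS = {"sas", "sarl", "sa", "eurl", "sasu"}


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and drop legal forms."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(c for c in text if not unicodedata.combining(c))
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(w for w in words if w not in LEGAL_FORMS)


def sirenize(name: str, cutoff: float = 0.85) -> Optional[str]:
    """Return the SIREN of the closest registry name, or None if nothing is close enough."""
    key = normalize(name)
    if key in REGISTRY:
        return REGISTRY[key]
    close = difflib.get_close_matches(key, list(REGISTRY), n=1, cutoff=cutoff)
    return REGISTRY[close[0]] if close else None
```

Calling `sirenize("ACME Logistique SAS")` resolves through the exact normalized match, while a misspelling such as "Boulangerie Dupond" falls back to fuzzy matching. The 0.85 cutoff is an arbitrary starting point: set it too low and you attach the wrong SIREN to a company, which is worse than no match at all.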
Who is it for?
This type of source is particularly well-suited to teams prospecting in France or Europe, finance or legal teams that need certified company data, and anyone working from lists of companies without named contacts (sector-based, geographic, or firmographic targeting).
Source #3: Crowdsourced (contributive) databases
A third family of sources is built on user-contributed data. The principle: a community of users (sales reps, recruiters, marketers) shares the contact data they encounter. The platform aggregates, validates, and redistributes this data across the community.
How it works
When a user visits a profile or uses a Chrome extension to view a prospect’s email, that information is potentially stored (with their consent) and made available to other users. The larger the user base, the more data is available.
This is the model used by platforms like Apollo, Lusha, and Kaspr. Lusha, for instance, is an affordable and effective tool for direct contact data enrichment, particularly popular with small and mid-sized businesses.
The limits of this model
Crowdsourced data has a clear advantage: volume. But it carries risks around freshness (data shared 18 months ago may already be outdated) and compliance. In Europe, questions around consent and the legal basis for processing are particularly acute for this type of data.
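One practical way to manage the freshness risk is to flag records by age before using them. The sketch below assumes each record carries a `last_verified` date, which is a hypothetical field name (not every provider exposes one), and it reuses the 18-month figure above purely as an example threshold.

```python
from datetime import date

# Roughly 18 months; an illustrative threshold, not an industry standard.
MAX_AGE_DAYS = 18 * 30


def is_stale(last_verified: date, today: date, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """A record is stale when it was last verified more than max_age_days ago."""
    return (today - last_verified).days > max_age_days


def split_by_freshness(records, today):
    """Split records into (fresh, stale) lists based on their last_verified date."""
    fresh, stale = [], []
    for record in records:
        target = stale if is_stale(record["last_verified"], today) else fresh
        target.append(record)
    return fresh, stale
```

Stale records don't have to be thrown away. Re-verifying them before a campaign is often enough.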
Who is it for?
This model works well for sales teams at startups and scale-ups who want fast access to large contact volumes, primarily in English-speaking markets (US, UK, Australia) where coverage is strongest. For French or Southern European markets, results tend to be less consistent.
Source #4: Web scraping and publicly available online data
Another major source is the web itself: public LinkedIn pages, company websites, professional directories, “About” or “Team” pages, customer review platforms like G2 or Capterra, and event or conference listings.
Enrichment tools that draw from this source deploy crawlers and scrapers to extract structured data from public pages — generic contact emails (info@, contact@), executive names, technology stacks, social media links.
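As a rough illustration of that extraction step, the sketch below pulls email addresses and social links out of a page’s HTML using only the Python standard library. It works on an HTML string you already have; fetching pages, respecting robots.txt, and rate limiting are left out. The `ContactExtractor` class and the list of social domains are assumptions made for the example, not how any specific tool works.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Illustrative list of social domains to look for.
SOCIAL_DOMAINS = ("linkedin.com", "x.com", "twitter.com", "facebook.com")


def is_social(url: str) -> bool:
    """True when the URL's hostname is one of the social domains or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in SOCIAL_DOMAINS)


class ContactExtractor(HTMLParser):
    """Collects emails (from mailto: links and visible text) and social profile links."""

    def __init__(self):
        super().__init__()
        self.emails = set()
        self.social_links = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith("mailto:"):
            # Drop any "?subject=..." suffix and normalize case.
            self.emails.add(href[len("mailto:"):].split("?")[0].lower())
        elif is_social(href):
            self.social_links.add(href)

    def handle_data(self, data):
        self.emails.update(match.lower() for match in EMAIL_RE.findall(data))


def extract_contacts(html: str):
    """Return (sorted emails, sorted social links) found in the HTML."""
    parser = ContactExtractor()
    parser.feed(html)
    return sorted(parser.emails), sorted(parser.social_links)
```

Matching on the parsed hostname rather than a plain substring avoids false positives, such as treating any URL that merely contains "x.com" as a social link.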
What this source delivers
Web scraping surfaces data not available in official registries: technologies deployed on a website, G2 presence, homepage content, social media links. It’s the data source behind Derrick’s Website Tech Lookup, which identifies the tech stack of any domain, and the Website Email & Social Media Extractor, which pulls emails and social links from a URL.
By 2025, B2B databases have moved well beyond static information. AI now automates contact enrichment by pulling up-to-date data from multiple sources, including public web pages and professional networks.
Who is it for?
Web scraping is especially useful for growth marketers who want to quickly qualify large volumes of companies, RevOps teams doing technographic targeting (prospecting companies that use HubSpot, Salesforce, or a specific competitor), and lead gen agencies working on niche verticals.
For a deeper look at web-based enrichment, check out our article on database enrichment.
Source #5: Intent data
Intent data is a category unto itself. These aren’t contact or firmographic data points — they’re behavioral signals indicating that a company or decision-maker is actively researching a solution.
These signals can include: visiting competitor pricing pages, downloading a whitepaper on a specific topic, activity on professional forums, searching for certain keywords on Google, or change signals (new hire, funding round, rapid headcount growth).
How this data is collected
Intent data providers rely on networks of publisher partners (B2B websites that share visitor behavior data), IP tracking tools, and social signal aggregators. Bombora is one of the best-known providers in this category. Platforms like ZoomInfo and Demandbase also embed this type of data into their offerings.
For more on how to build intent-driven campaigns, see our guide on intent marketing.
Who is it for?
Intent data is most relevant for large enterprise sales teams running ABM campaigns at scale, with long sales cycles and a limited number of target accounts. Demandbase offers a comprehensive account-based marketing platform with strong CRM integration — built around this kind of data.
For a startup or SMB prospecting with a broad ICP, the cost of intent data is often disproportionate to the value it delivers.
Source #6: Proprietary databases from major vendors
Finally, some players have built — over years of operation — large proprietary databases assembled by aggregating and cross-referencing all the sources above: scraping, user contributions, official registries, partner data.
This is the model behind ZoomInfo, which claims over 100 million company accounts and hundreds of millions of professional contacts worldwide. ZoomInfo is a comprehensive B2B intelligence platform covering contact enrichment, firmographics, sales triggers, intent data, and go-to-market automation tools — with deep integrations into CRM and marketing platforms.
It’s also the model behind Cognism, widely recognized for its European coverage and GDPR-compliant approach, making it a strong pick for EMEA-focused teams who want the highest-quality, most compliant contact data.
The limits of this model
Proprietary databases have an obvious advantage: depth. But they come with two significant limitations. First, cost: these platforms are typically reserved for large sales teams with substantial budgets. Second, geographic relevance: ZoomInfo performs very well in the US market but is often less accurate in Europe.
Who is it for?
These databases are best suited for enterprise teams (100+ sales reps) prospecting internationally, particularly on the US or anglophone market. For smaller teams or those focused on Europe, specialized alternatives often offer a better value-to-cost ratio.
Comparison table: sources, tools, and use cases
| Source | Data available | Typical tools | Coverage | Best for |
|---|---|---|---|---|
| LinkedIn | Job title, company, full profile | Derrick, Evaboot, Phantombuster | Global (strong in tech/finance) | SDRs, recruiters, ABM |
| Official registries | Registration number, revenue, legal headcount, directors | Societeinfo, Pappers, Pharow | France / Europe | FR/EU teams, legal, finance |
| Crowdsourced | Direct email, phone number | Apollo, Lusha, Kaspr | US/UK strong, Europe variable | Startups, scale-ups (US market) |
| Web scraping | Tech stack, generic emails, social links | Derrick, BuiltWith | Global | Growth, RevOps, technographic targeting |
| Intent data | Behavioral signals, active research | Bombora, Demandbase | US strong | Enterprise ABM |
| Proprietary databases | All types (aggregated) | ZoomInfo, Cognism | US strong, Europe improving | Large teams, international |
How to choose your data source based on your profile
Now that you understand the main source families, let’s look at how to choose based on your actual context.
You’re prospecting in France or Europe
Prioritize tools that rely on official registries for firmographic data and LinkedIn for contact data. Be cautious about large US-based databases whose European coverage remains inconsistent.
For French firmographic data (SIREN, revenue, legal headcount), open government sources like the INSEE SIRENE database offer unmatched reliability — updated daily.
You’re targeting the US or English-speaking markets
Crowdsourced databases (Apollo, Lusha) and large proprietary vendors (ZoomInfo) have significantly stronger coverage here. Contact data density is much higher on these markets.
You need technographic data
To know whether a company uses HubSpot, Salesforce, or a specific CMS, you need tools that scrape website technologies. This data doesn’t exist in official registries or on LinkedIn — it comes exclusively from web scraping.
Derrick’s Website Tech Lookup identifies the tech stack of any domain directly in Google Sheets, making it straightforward to build technographic prospect lists.
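To give a sense of how technographic detection can work in principle, here is a minimal sketch that looks for telltale substrings in a page’s HTML. It is not a description of how any particular product works internally. The `SIGNATURES` table is a small, illustrative assumption; production detectors rely on much larger rule sets and also inspect HTTP headers, cookies, and scripts.

```python
# Illustrative signatures only: substrings commonly seen in page source.
SIGNATURES = {
    "HubSpot": ("js.hs-scripts.com", "js.hsforms.net"),
    "WordPress": ("/wp-content/", "/wp-includes/"),
    "Shopify": ("cdn.shopify.com",),
}


def detect_technologies(html: str) -> list[str]:
    """Return the names of technologies whose markers appear in the HTML."""
    page = html.lower()
    return sorted(
        name
        for name, markers in SIGNATURES.items()
        if any(marker in page for marker in markers)
    )
```

Substring matching is crude and can miss technologies loaded dynamically, so treat a negative result as "not detected" rather than "not used".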
You’re working at high volume
Waterfall enrichment is an approach that queries multiple sources sequentially to maximize completion rates. If source A doesn’t have the email, the system automatically tries source B, then C. Tools like FullEnrich specialize in this approach.
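The waterfall logic is simple enough to sketch in a few lines. In the example below, each "source" is just a function that takes a contact identifier and returns an email or `None`. The provider names and the dictionary-backed lookups are stand-ins invented for the example; real providers would be API calls with their own authentication and pricing.

```python
from typing import Callable, Optional

# A lookup takes a contact identifier and returns an email, or None if not found.
Lookup = Callable[[str], Optional[str]]


def waterfall_email(contact_id: str, sources: list[tuple[str, Lookup]]):
    """Try each source in order and return (email, source_name) for the first hit."""
    for name, lookup in sources:
        try:
            email = lookup(contact_id)
        except Exception:
            continue  # a failing provider should not stop the waterfall
        if email:
            return email, name
    return None, None


# Stand-in providers backed by dictionaries (hypothetical data).
source_a = {"c1": "ana@example.com"}.get
source_b = {"c2": "ben@example.com"}.get
sources = [("A", source_a), ("B", source_b)]
```

Because the function stops at the first hit, the order of `sources` matters: putting the cheapest or most reliable provider first keeps cost per enrichment down. Returning the source name alongside the email also lets you track which provider actually fills the gaps.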
Data sources and GDPR compliance: what you need to know
The question of data source is central to GDPR compliance. Professional data can be lawfully processed if its collection and use rest on a valid legal basis — typically legitimate interest for B2B prospecting.
But the source directly shapes that compliance picture:
Official registries (Companies House, INSEE, trade courts) are public records. Using them for legitimate business purposes raises no particular legal concern.
LinkedIn makes available data that users have chosen to make public. Using that data for professional prospecting can rely on legitimate interest, provided you respect the rights of the individuals concerned — including the right to object.
Crowdsourced databases raise more questions. How was the data originally collected? What legal basis was used? Can individuals exercise their rights? These questions become especially important when the tool is US-based, introducing a cross-border data transfer issue under GDPR Chapter V.
To dive deeper into the legal implications for outbound prospecting, read our guide on cold emailing and GDPR.
The multi-source approach: why the best tools have adopted it
The major market trend in 2026 is source aggregation. No single source covers all markets, all industries, and all data types perfectly.
The best-performing enrichment tools today combine:
- LinkedIn for profile and contact data
- Official registries for certified firmographic data
- Web scraping for technographic data and generic emails
- Crowdsourced or proprietary databases to fill gaps on underserved markets
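A simple way to picture this combination is a per-field priority order: for each field, take the value from the most trusted source that has one. The sketch below is only an illustration under that assumption. The `FIELD_PRIORITY` mapping and the source labels are hypothetical, and real tools are likely to weigh freshness and confidence scores as well.

```python
# Hypothetical priority: which source to trust first for each field.
FIELD_PRIORITY = {
    "job_title": ["linkedin", "crowdsourced"],
    "revenue": ["registry", "crowdsourced"],
    "tech_stack": ["web"],
    "email": ["crowdsourced", "web"],
}


def merge_profile(by_source: dict) -> dict:
    """Build one profile by taking each field from the first source that provides it."""
    merged = {}
    for field, order in FIELD_PRIORITY.items():
        for source in order:
            value = by_source.get(source, {}).get(field)
            if value:
                merged[field] = {"value": value, "source": source}
                break
    return merged
```

Keeping the source next to each value makes the result auditable, which helps when you later need to explain where a given data point came from.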
This logic underpins Derrick’s philosophy: aggregate multiple sources through a unified interface in Google Sheets, maximizing completion rates without multiplying tools and CSV exports. Derrick’s Lead Email Finder, for example, cross-references multiple sources to find professional emails with real-time validation — all from within your spreadsheet.
For a full overview of what Derrick’s enrichment covers, explore the G2 Company Insights feature, which adds a review-data layer on top of standard firmographic enrichment.
Key takeaways
- There are 6 major B2B data source families — each with distinct strengths and blind spots
- Geographic coverage is often the deciding factor: a database that dominates in the US may be weak in France or Germany
- Official registries (INSEE, Companies House) are the most reliable source for local firmographic data in Europe
- LinkedIn remains the strongest source for contact profile data, provided you use tools that can properly leverage it
- Crowdsourced databases offer volume but raise freshness and GDPR compliance questions
- Waterfall enrichment (multi-source) maximizes completion rates and is the approach adopted by leading platforms
- GDPR compliance depends directly on the source — always verify how data was originally collected
Conclusion: choose your data source before you choose your tool
Before comparing pricing and features on an enrichment tool, ask the right question first: what does its database actually draw from? A tool with a polished interface but a database that barely covers your target market won’t move the needle.
Based on your profile:
- SDR prospecting in Europe → prioritize LinkedIn + official registries
- Growth marketer targeting tech stacks → prioritize web scraping
- ABM team on the US market → crowdsourced or proprietary US databases
- Sales Ops maximizing completion rates → multi-source waterfall enrichment
One source layer, one tool, one workflow — in Google Sheets
Derrick aggregates LinkedIn, web scraping, and company data to enrich your prospects without leaving your spreadsheet.
FAQ
What’s the difference between a B2B database and an enrichment tool?
A B2B database is a structured set of contacts and companies you buy or rent to prospect from. An enrichment tool completes existing data (your CRM, a lead list) with missing information. Both rely on data sources, but the use cases are different.
Why do US-based tools often underperform in Europe?
Most major US databases were built primarily from North American sources and contributions. Their coverage is dense on the US/UK market but drops sharply in France, Germany, or Southern Europe. Tools built specifically for the European market, or those that integrate local official registries, consistently outperform on these geographies.
Is data sourced from LinkedIn GDPR-compliant?
Using publicly accessible professional data from LinkedIn for B2B prospecting can rely on legitimate interest, a recognized legal basis under GDPR. This requires informing individuals, allowing them to exercise their rights, and only collecting data that’s necessary for the specific purpose.
What is waterfall enrichment?
Waterfall enrichment is an approach that queries multiple data sources sequentially for the same field. If source A doesn’t find a prospect’s email, the system automatically tries source B, then source C. This method maximizes the final completion rate while optimizing cost per enrichment.
How do I evaluate a data source’s quality before committing?
Test the tool on a representative sample of your target market (100–200 contacts). Measure completion rate (non-empty fields), email bounce rate (run a test send), and firmographic accuracy against what you already know about your targets. Most tools offer a freemium plan or trial period specifically for this kind of evaluation.
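The first two metrics are easy to compute yourself once you export the enriched sample. This is a minimal sketch assuming the sample is a list of dictionaries and that you count bounces from your own test send; the field names are placeholders.

```python
def completion_rate(rows, field):
    """Share of rows where the given field is non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field))
    return filled / len(rows)


def bounce_rate(sent, bounced):
    """Share of sent emails that bounced."""
    return bounced / sent if sent else 0.0
```

Run the same sample through each tool you are comparing, so that any difference in the numbers reflects the data source and not the list.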