Cold Email · Data · Email Statistics · Benchmarks

What 10,000 Cold Emails Taught Us About What Actually Gets Replies

April 20, 2026 · 8 min read · BongoBot Team

Most advice about cold email is based on intuition, anecdote, or a single campaign someone ran two years ago. We wanted something better.

Over the past year, we have collected aggregate performance data across thousands of cold email campaigns sent through BongoBot. The dataset covers B2B outreach across dozens of industries, company sizes, and offer types. What follows are the patterns that emerged when we looked at what actually moves reply rates — and what turned out to be noise.

A note before we dig in: these are averages as of March 2026. Individual results vary by industry, offer, and audience. But the directional trends are consistent enough to be worth acting on.

The Baseline: Where Most Cold Email Lands

First, some context. The widely cited industry average for cold email click-through rates hovers around 2.6%. The average reply rate for B2B cold outreach sits around 1%, give or take depending on whose data you trust.

Across campaigns in our dataset, the aggregate averages are:

  • 12.7% click-through rate (vs. 2.6% industry average)
  • 4.8% reply rate (vs. ~1% baseline)

Those numbers are not magic. They are the result of specific, measurable choices around personalization, copy structure, testing, and iteration. Here is what the data says about each one.

Subject Lines: Shorter Wins, but Specificity Wins More

We compared reply rates across subject lines grouped by character count:

Subject Line Length    Relative Reply Rate
Under 30 characters    +18% above average
30-50 characters       Baseline
50-70 characters       -9% below average
Over 70 characters     -22% below average

Short subject lines outperform longer ones. But length alone does not explain the gap. When we controlled for content type, the real differentiator was specificity.

Subject lines that referenced something concrete about the recipient's business — a product name, a recent hire, a market they operate in — outperformed generic subject lines by 31%, regardless of length. The best-performing subject lines in our dataset tend to be short and specific: under 40 characters with a direct reference to the recipient's world.

What this means in practice

"Your hiring push in Q1" outperforms "Quick question about your growth plans." The first is specific and verifiable. The second could have been sent to anyone.

Email Length: The 50-125 Word Sweet Spot

We measured reply rates against total word count in the email body (excluding signature):

Word Count        Relative Reply Rate
Under 50 words    -14% below average
50-125 words      +23% above average
125-200 words     Baseline
Over 200 words    -19% below average

The 50-125 word range consistently outperformed every other bracket. Emails under 50 words often lacked enough context to be compelling. Emails over 200 words lost readers before they reached the ask.

This tracks with what we see in open-ended feedback from recipients who reply positively: they frequently mention that the email was "short" or "easy to read." Nobody has ever replied to a cold email saying they wished it were longer.
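If you want to run this bracket analysis on your own send data, the shape of it is simple. A minimal sketch, assuming a list of (word count, replied) pairs; the bracket boundaries come from the table above, while the function names and data layout are our own illustration:

```python
def word_count_bracket(n_words: int) -> str:
    """Assign an email body length to one of the brackets above."""
    if n_words < 50:
        return "under-50"
    if n_words <= 125:
        return "50-125"
    if n_words <= 200:
        return "125-200"
    return "over-200"

def relative_reply_rates(emails):
    """emails: list of (word_count, replied_as_0_or_1) pairs.

    Returns each bracket's reply rate relative to the overall average,
    e.g. +0.23 for 23% above average.
    """
    overall = sum(replied for _, replied in emails) / len(emails)
    buckets = {}
    for words, replied in emails:
        buckets.setdefault(word_count_bracket(words), []).append(replied)
    return {b: (sum(v) / len(v)) / overall - 1 for b, v in buckets.items()}
```

Feed it a season's worth of sends and you can check whether your audience matches the aggregate pattern or has its own sweet spot.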

Personalization Depth: The Biggest Single Lever

This is the finding we keep coming back to. We categorized emails by personalization depth:

  • Level 1: Merge fields only — First name, company name, industry
  • Level 2: Light research — References to the company's general market position or public profile
  • Level 3: Website-research personalization — Specific references to content, products, services, or recent activity found on the prospect's website

The results:

Personalization Level         Average Reply Rate
Level 1 (merge fields)        1.2%
Level 2 (light research)      2.9%
Level 3 (website research)    4.8%

Website-research personalization generates 3-4x more replies than merge-field-only emails. And the gap between Level 2 and Level 3 (1.9 percentage points) is actually larger than the gap between Level 1 and Level 2 (1.7 points), which means the difference between mentioning someone's industry and referencing something specific on their website is enormous.

This is not surprising if you think about it from the recipient's perspective. A merge field tells them you have their contact information. A specific reference to their business tells them you spent time understanding who they are. One signals a list. The other signals intent.
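The arithmetic behind those claims is worth making explicit. The rates are the table's averages; the variable names are ours:

```python
# Average reply rates from the personalization table above
level_1 = 0.012  # merge fields only
level_2 = 0.029  # light research
level_3 = 0.048  # website research

# Website research vs. merge fields: a 4x multiplier
lift_vs_merge_fields = level_3 / level_1

# Step sizes between adjacent levels, in percentage points
gap_1_to_2 = (level_2 - level_1) * 100  # 1.7 points
gap_2_to_3 = (level_3 - level_2) * 100  # 1.9 points
```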

Send Timing: Tuesday Through Thursday Mornings

We looked at reply rates by day of week and time of send (adjusted for the recipient's local timezone):

Best-performing days:

  • Tuesday, Wednesday, and Thursday mornings (8-11 AM local) consistently outperform other windows
  • Tuesday and Wednesday are effectively tied for top performance
  • Thursday shows a slight decline toward the afternoon

Underperforming windows:

  • Monday mornings underperform by 16%, likely due to inbox overload from the weekend
  • Friday afternoons underperform by 28%
  • Weekend sends underperform by 34%

The timing effect is real but smaller than most people assume. The difference between the best and worst send windows is meaningful, but it is dwarfed by the impact of personalization depth. A well-researched email sent on a Friday will still outperform a generic email sent on a Tuesday.

Optimize your send timing, but do not treat it as a substitute for better emails.

Follow-Up Sequences: The Third Email Still Matters

We analyzed reply rates across follow-up sequences to understand where responses actually come from:

Email in Sequence    Share of Total Replies
Initial email        48%
1st follow-up        22%
2nd follow-up        17%
3rd follow-up        9%
4th+ follow-up       4%

Nearly half of all replies come from the first email, which makes sense. But the 2nd and 3rd follow-ups together account for 26% of total replies. That is more than one in four responses that you would never receive if you stopped after the first follow-up.

The 3rd follow-up is the most underused email in outreach. Most senders either stop at one follow-up or send follow-ups that are just shorter versions of the original email. The campaigns in our data that perform best on follow-ups tend to introduce new angles — a different benefit, a different proof point, a different framing of the same problem — rather than simply re-sending the original pitch with "just bumping this to the top of your inbox."
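One way to reason about sequence length is to ask what share of replies you forfeit by stopping early. A small sketch using the distribution above (the shares are the article's; the function and step names are ours):

```python
# Share of total replies by position in the sequence, from the table above
reply_share = {
    "initial": 48,
    "follow_up_1": 22,
    "follow_up_2": 17,
    "follow_up_3": 9,
    "follow_up_4_plus": 4,
}

def replies_missed_if_stopping_after(step: str) -> int:
    """Percentage of total replies forfeited by ending the sequence at `step`."""
    order = list(reply_share)  # dicts preserve insertion order in Python 3.7+
    idx = order.index(step)
    return sum(reply_share[s] for s in order[idx + 1:])
```

By this accounting, stopping after the first follow-up leaves 30% of replies on the table: the 26% from the 2nd and 3rd follow-ups, plus the long tail.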

Variant Diversity: More Approaches Beat More Tweaks

This one surprised us. We compared two testing strategies:

  • Narrow testing: 2-3 variants of the same core approach (different subject lines, same body structure)
  • Wide testing: 8-10 fundamentally different approaches (different angles, structures, and value propositions)

Campaigns using wide testing outperformed campaigns that ran only narrow tests by a significant margin. The reason is straightforward: when you test small variations, you optimize within a local maximum. When you test fundamentally different approaches, you discover which framing resonates with your audience in the first place.

A subject line test tells you which words work better. An approach test tells you which argument works better. The second question is more important, especially early in a campaign.

The Self-Optimization Effect: Campaigns Get Better Over Time

The most compelling pattern in our data is what happens across campaign iterations. When we track the same sender's campaigns over time:

  • Campaign 1 establishes a baseline
  • Campaign 2 typically shows a 20-30% improvement in reply rate as initial learnings are applied
  • By Campaign 3, reply rates improve an average of 47% over the first campaign

This is not a one-time bump. It is the compounding effect of testing multiple approaches, identifying what resonates, and feeding those insights into the next iteration. The campaigns that improve the most aggressively are the ones that tested the widest range of approaches early, giving the system more signal to work with.

The implication is clear: your first campaign is not your best campaign. It is your learning campaign. The value of systematic outreach compounds over time.
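To make the compounding concrete, here is a worked example. The lifts are the article's averages; the 3.0% starting rate is a hypothetical we chose for illustration:

```python
# Hypothetical campaign-1 reply rate, in percent (our assumption)
baseline = 3.0

# Campaign 2: midpoint of the 20-30% lift reported above
campaign_2 = baseline * 1.25  # 3.75%

# Campaign 3: +47% over the first campaign
campaign_3 = baseline * 1.47  # 4.41%

# Implied additional lift from campaign 2 to campaign 3:
# 1.47 / 1.25 is roughly 1.18, i.e. a further ~18% improvement
step_2_to_3 = 1.47 / 1.25
```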

What the Numbers Add Up To

If we had to distill 10,000 emails into a short list of priorities, it would be this:

  1. Invest in deep personalization. It is the single biggest driver of reply rates, worth 3-4x more than merge fields alone.
  2. Keep emails between 50 and 125 words. Say what you need to say and stop.
  3. Test different approaches, not just different subject lines. Wide variant diversity uncovers what your audience actually responds to.
  4. Send 3 follow-ups with new angles. More than a quarter of replies come from the 2nd and 3rd follow-up.
  5. Iterate across campaigns. The 47% improvement by Campaign 3 is not accidental — it is the result of compounding what works.

These are not hacks. They are structural choices about how you approach outreach. The senders in our data who perform best are not writing better one-liners. They are building better systems.


BongoBot automates the research, personalization, and optimization behind these patterns — so each campaign learns from the last. See how it works.

Ready to put this into practice?

BongoBot automates personalized outreach so you can focus on closing.

Start Free