Using CT Logs to Catch Phishing Domains Before They Hit Your Users

Someone registered your-brand-login.com. You found out from a customer complaint.

That's the reality for most companies. Attackers register lookalike domains, grab a free Let's Encrypt certificate, spin up a credential harvesting page, and your security team doesn't know until someone falls for it. By then the damage is done, credentials are exfiltrated, and you're writing an incident report.

But here's what most teams miss: the attacker needed a certificate. And that certificate was logged publicly in Certificate Transparency (CT) logs, usually before the phishing page even went live. You had a window. You just weren't watching.

Why CT logs are your early warning system

Virtually every publicly trusted certificate gets submitted to Certificate Transparency logs. Chrome has required this for newly issued certificates since April 2018, so CAs log as a matter of course. When an attacker registers paypa1-secure-login.com and grabs a cert from Let's Encrypt, that issuance shows up in CT logs within minutes. Sometimes seconds.

The trick is turning that firehose of data into actionable alerts. CT logs process millions of certificates per day. You need to filter for patterns that look like your domains.

Setting up monitoring with certstream

The simplest way to tap into CT logs in real time is certstream, an open-source project from Cali Dog Security that aggregates CT logs into a single WebSocket feed of newly issued certificates. Install the Python client and you can start watching in under five minutes.

pip install certstream

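# basic_monitor.py
import certstream

# Substrings to watch for; replace with your own brand names
WATCHED_KEYWORDS = ['certguard', 'paypal', 'your-brand']

def callback(message, context):
    # Ignore heartbeats and other non-certificate messages
    if message['message_type'] != 'certificate_update':
        return
    # all_domains lists every DNS name on the certificate
    all_domains = message['data']['leaf_cert']['all_domains']
    for domain in all_domains:
        name = domain.lower()
        # Alert once per domain, even if several keywords match
        if any(keyword in name for keyword in WATCHED_KEYWORDS):
            print(f"ALERT: {domain}")
            # send to Slack, PagerDuty, whatever

# Blocks and processes events until interrupted
certstream.listen_for_events(callback, url='wss://certstream.calidog.io/')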

This works. It's also incredibly noisy. You'll get hits for legitimate subdomains, unrelated businesses with similar names, and test certificates. The real work is in the filtering.
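
One cheap way to cut the first kind of noise is to drop names under domains you already control before alerting. A minimal sketch, assuming a hypothetical OWNED_DOMAINS set that you maintain yourself (the name and its contents are illustrative, not part of certstream):

```python
# Hypothetical allowlist: registrable domains you actually control.
OWNED_DOMAINS = {"certguard.com", "certguard.app"}

def is_owned(domain, owned=OWNED_DOMAINS):
    """Return True if domain is an owned domain or a subdomain of one."""
    name = domain.lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]  # treat *.certguard.com like certguard.com
    # Require an exact match or a dot boundary, so evilcertguard.com
    # and certguard.com.evil.net are NOT treated as owned.
    return any(name == d or name.endswith("." + d) for d in owned)
```

Calling is_owned(domain) at the top of the loop and skipping matches keeps your own certificate renewals out of the alert channel.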

Smart filtering: reducing noise without missing threats

Raw keyword matching catches maybe 40% of real phishing attempts while generating a mountain of false positives. You need to layer multiple signals.

Levenshtein distance is your friend here. Calculate the edit distance between the discovered domain and your real domains. Anything with an edit distance of 1-3 is suspicious. certgurad.com (two letters swapped) has an edit distance of 2 from certguard.com. That's almost certainly malicious.

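# requires: pip install Levenshtein
from Levenshtein import distance

REAL_DOMAINS = ['certguard.app', 'certguard.com']
REAL_BASES = [(real.split('.')[0], real) for real in REAL_DOMAINS]
# lookalike sequence -> the character it imitates
HOMOGLYPHS = {'0': 'o', '1': 'l', 'rn': 'm', 'vv': 'w'}

def normalize(label):
    for fake, real_char in HOMOGLYPHS.items():
        label = label.replace(fake, real_char)
    return label

def is_suspicious(discovered_domain):
    name = discovered_domain.lower().rstrip('.')
    # check every label except the TLD, so www.certgurad.com is caught too
    labels = [l for l in name.split('.')[:-1] if l != '*']
    for base in labels:
        for real_base, real in REAL_BASES:
            # homoglyph check: normalize both sides before comparing
            if base != real_base and normalize(base) == normalize(real_base):
                return True, "homoglyph_match"
            # typo check: small, non-zero edit distance
            d = distance(base, real_base)
            if 0 < d <= 3:
                return True, f"edit_distance={d} from {real}"
    return False, None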

Combine this with TLD analysis. Your brand on a .xyz or .top domain? That's a red flag. Legitimate businesses rarely use those TLDs, but phishers love them because they cost $2 to register.
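
One way to fold the TLD into the decision is to treat it as one more signal in a rough score instead of a hard rule. The sketch below is illustrative only: RISKY_TLDS, the weights, and the function names are assumptions to tune against your own alert history.

```python
# Illustrative only: extend or trim this set based on what you actually see.
RISKY_TLDS = {"xyz", "top"}

def tld_risk(domain, risky=RISKY_TLDS):
    """Return (is_risky, tld) based on the last label of the domain."""
    tld = domain.lower().rstrip(".").rsplit(".", 1)[-1]
    return tld in risky, tld

def score(domain, lookalike):
    """Combine signals into a rough 0-100 score. Weights are assumptions.

    lookalike: True if some other check already flagged the name
    as resembling one of your brands.
    """
    points = 0
    if lookalike:
        points += 60
    if tld_risk(domain)[0]:
        points += 30
    return points
```

A lookalike on a risky TLD scores 90 here, a lookalike on .com scores 60, and an unrelated name on .xyz scores 30, which keeps the TLD from generating alerts on its own.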

The timing advantage most teams waste

Here's what makes CT log monitoring genuinely valuable. There's a gap between when a certificate is issued and when the phishing page goes live. Sometimes it's hours. Sometimes days. The attacker needs to set up hosting, configure DNS, build the phishing kit, maybe test it.

If you catch the certificate issuance in that window, you can act preemptively. File a domain takedown request. Add the domain to your email gateway blocklist. Push it to your web proxy's deny list. Alert your SOC. All before a single user sees the phishing page.
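
To make those responses repeatable, it can help to write down which action fires at which confidence level. A minimal sketch, assuming you already compute some 0-100 confidence score per domain; the thresholds and action names are hypothetical placeholders for your own integrations:

```python
def plan_response(domain, score):
    """Map a 0-100 confidence score to a list of (action, domain) pairs.

    Thresholds and action names are assumptions for illustration.
    """
    actions = []
    if score >= 30:
        actions.append(("notify_soc", domain))
    if score >= 60:
        actions.append(("block_email_gateway", domain))
        actions.append(("block_web_proxy", domain))
    if score >= 90:
        actions.append(("request_takedown", domain))
    return actions
```

Keeping the plan as plain data makes it easy to log what would have happened before you let anything fire automatically.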

I've seen teams cut their phishing incident rate by 60% just by getting takedown requests in before the campaign launched. Registrars are surprisingly responsive when you can show them a lookalike domain with a freshly minted cert.

Scaling beyond a Python script

The certstream approach works for small operations. But if you're monitoring dozens of brands or need historical lookback, you'll want something more robust.

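crt.sh is the go-to for historical queries. It's a free search service run by Sectigo that indexes certificates from the public CT logs. You can query it over HTTP and get JSON back (the %25 in the URL is an encoded %, which crt.sh treats as a wildcard):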

# find all certs ever issued for domains matching a pattern
curl -s "https://crt.sh/?q=%25certguard%25&output=json" | \
  jq -r '.[].name_value' | \
  sort -u

Run that weekly. Diff against previous results. New entries are worth investigating.
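
The weekly diff can be sketched in Python as well. This is a minimal version using only the standard library; the function names are illustrative, and it assumes crt.sh keeps returning a JSON list of records with a name_value field, as the jq command above relies on.

```python
import json
import urllib.parse
import urllib.request

def fetch_crtsh(keyword, timeout=60):
    """Query crt.sh for certificates whose names contain keyword."""
    query = urllib.parse.quote(f"%{keyword}%")  # -> %25keyword%25
    url = f"https://crt.sh/?q={query}&output=json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)

def extract_names(records):
    """Flatten crt.sh records into a set of lowercase DNS names.

    name_value can hold several names separated by newlines.
    """
    names = set()
    for record in records:
        for name in record.get("name_value", "").splitlines():
            if name.strip():
                names.add(name.strip().lower())
    return names

def new_names(previous, current):
    """Return names seen in this run that were not in the last one."""
    return sorted(set(current) - set(previous))
```

In a weekly job you would load last week's set from disk, call extract_names(fetch_crtsh("certguard")), pass both to new_names, and then save the current set for next time.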

For production setups, consider feeding CT data into your SIEM. Most modern SIEMs (Splunk, Elastic, Sentinel) can ingest certstream data through a simple forwarder. Then you write correlation rules that combine CT alerts with DNS queries and web proxy logs. A user visiting a domain that triggered a CT alert last week? That's a strong sign they reached a phishing page, and it deserves immediate follow-up.
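
In a real deployment the correlation rule would live in your SIEM's own query language. The logic is simple enough to sketch in Python, though. The event field names and the seven-day window below are assumptions, not any SIEM's schema.

```python
import json
from datetime import datetime, timedelta, timezone

def to_siem_event(domain, reason, seen_at):
    """Serialize a CT alert as one JSON line for a log forwarder."""
    return json.dumps({
        "source": "ct-monitor",  # hypothetical source tag
        "domain": domain,
        "reason": reason,
        "seen_at": seen_at.isoformat(),
    })

def correlate(ct_alerts, proxy_hits, window=timedelta(days=7)):
    """Return proxy hits to domains that CT flagged within the window.

    ct_alerts: dict mapping lowercase domain -> datetime it was flagged
    proxy_hits: iterable of (user, domain, visited_at) tuples
    """
    matches = []
    for user, domain, visited_at in proxy_hits:
        flagged_at = ct_alerts.get(domain.lower())
        if flagged_at is None:
            continue
        # Only count visits that happened after the alert, within the window
        if timedelta(0) <= visited_at - flagged_at <= window:
            matches.append((user, domain, visited_at))
    return matches
```

Each match names a user who visited a flagged domain, which is the list your SOC would want for password resets and session revocation.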

Common mistakes in CT monitoring

Monitoring only exact matches. Attackers don't register certguard.com, they register certguard-secure.com or my-certguard.com or certgu4rd.com. Your matching needs to be fuzzy.

Ignoring wildcard certs. A wildcard certificate for *.evil-certguard.com won't show individual subdomains in CT logs. You'll see the wildcard entry and that's it. The attacker could spin up login.evil-certguard.com, secure.evil-certguard.com, fifty different subdomains, all under one cert.

Not automating the response. Getting an alert is useless if someone has to manually triage it at 2 AM. Build automated workflows: CT alert triggers domain reputation check, reputation check triggers auto-block in email gateway, high-confidence matches trigger auto-takedown request.

Over-monitoring. If you add every possible brand keyword, you'll drown in alerts. Start with your primary domain and top 3 product names. Expand gradually as you tune the false positive rate down.
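
The first two mistakes above can be addressed in one matcher: split names into tokens so a brand buried in a hyphenated label is still seen, undo common digit substitutions, and record when the match came from a wildcard entry. The sketch below is a starting point under stated assumptions: BRANDS, LEET, and max_distance are illustrative, and the small pure-Python edit_distance only keeps the example dependency-free (the Levenshtein package's distance function could replace it).

```python
import re

BRANDS = ["certguard"]  # hypothetical watch list
# Common digit-for-letter swaps; extend as you see new tricks
LEET = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s"})

def edit_distance(a, b):
    """Plain Levenshtein distance (dynamic programming, two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def match_name(name, brands=BRANDS, max_distance=1):
    """Return (brand, reason, is_wildcard) for a lookalike, else None."""
    name = name.lower().rstrip(".")
    is_wildcard = name.startswith("*.")
    if is_wildcard:
        name = name[2:]
    # Split on dots and hyphens so "my-certguard" yields "certguard"
    tokens = [t for t in re.split(r"[.-]", name) if t]
    for token in tokens:
        plain = token.translate(LEET)
        for brand in brands:
            if token == brand:
                return brand, "brand_token", is_wildcard
            if plain == brand:
                return brand, "digit_substitution", is_wildcard
            if edit_distance(token, brand) <= max_distance:
                return brand, "near_miss", is_wildcard
    return None
```

This matcher also fires on your real domains, so filter out names you own before calling it. When is_wildcard is true, treat the alert as covering every possible subdomain and block the parent domain, not a single hostname.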

What about attackers who don't use CT-logged certs?

They can't. Not really. Chrome has required SCTs (Signed Certificate Timestamps) from CT logs for all certificates issued since April 2018. Safari has similar requirements. If an attacker gets a certificate that isn't CT-logged, those browsers will reject it with a security warning. Which defeats the purpose of phishing.

There's an edge case with privately-trusted CAs in enterprise environments, but those certs won't work for public-facing phishing. And yes, an attacker could use plain HTTP without a certificate. But browsers now show aggressive "Not Secure" warnings on HTTP pages, and users are increasingly trained to look for the padlock. Most phishing kits include HTTPS setup because the conversion rate tanks without it.

A practical starting point

You don't need a massive infrastructure to start. Run the certstream script on a $5/month VPS. Pipe alerts to a Slack channel. Have someone glance at it once a day. That's already more CT monitoring than 90% of companies do.
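
Piping alerts to Slack needs nothing beyond the standard library if you use an incoming webhook. A minimal sketch, assuming you have created a webhook in your workspace; the function names and message format are illustrative:

```python
import json
import urllib.request

def build_slack_request(webhook_url, domain, reason):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = {"text": f"CT alert: `{domain}` ({reason})"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send_alert(webhook_url, domain, reason, timeout=10):
    """Send the alert and return the HTTP status code."""
    req = build_slack_request(webhook_url, domain, reason)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status
```

Call send_alert from the certstream callback in place of the print statement. Keep the webhook URL in an environment variable, not in the script.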

Then iterate. Add Levenshtein matching. Add homoglyph detection. Connect it to your ticketing system. Build automated takedown workflows. Each step makes the window between "attacker gets cert" and "domain is blocked" shorter. And that window is where phishing campaigns live or die.