May 14th, 2023

This is the third blog post on the topic of the centralization of the internet. The first post, discussing diversity of authoritative name servers, can be found here; the second post, discussing diversity of MX records, here.

That's right, no need to be picky: any certificate authority can sign any domain name, so you can pick from literally hundreds, since that is the number of trusted CA root certificates baked into your browser[1] or included in most operating systems:
$ security find-certificate -a \
-Z /System/Library/Keychains/SystemRootCertificates.keychain | \
sed -n -e 's/.*alis"<blob>=//p' | wc -l
166
$ security find-certificate -a \
-Z /System/Library/Keychains/SystemRootCertificates.keychain | \
sed -n -e 's/.*alis"<blob>=//p' | more
"Go Daddy Root Certificate Authority - G2"
"HARICA TLS ECC Root CA 2021"
"SwissSign Platinum CA - G2"
"NAVER Global Root Certification Authority"
"chambersignroot@chambersign.org"
"OISTE WISeKey Global Root GA CA"
"KISA RootCA 1"
"Actalis Authentication Root CA"
"D-TRUST Root CA 3 2013"
"Apple Root CA - G2"
"StartCom Certification Authority G2"
"SSL.com EV Root Certification Authority ECC"
"Hellenic Academic and Research Institutions RootCA 2015"
"ePKI Root Certification Authority"
"AAA Certificate Services"
"VeriSign Class 3 Public Primary Certification Authority - G5"
"VeriSign Class 3 Public Primary Certification Authority - G3"
"Trustis FPS Root CA"
"Apple Root CA - G3"
[...]
But chances are, you really only want a very small number of CAs to do that -- the ones that you have a business relationship with or that you use for free. To solve that problem, the industry has tried a few things with varying degrees of success.

Possible Alternatives to CAA Records

For a while, we tried to tell the browser which CAs can issue a cert for a given domain via dynamic HTTP Public Key Pinning (HPKP), an HTTP response header (Public-Key-Pins). Like HTTP Strict Transport Security (HSTS), for example, this does not address the Trust on First Use issue; in addition, it was quickly identified as a pretty big footgun and was deprecated again, with support for it removed from the browsers. Except, of course, static HPKP, whereby pins are baked into the browsers, remains alive[2] (and likely forgotten by the various companies who submitted their pins years ago).

Certificate Transparency[3] was supposed to make up for (dynamic) HPKP being deprecated, but of course that shifts the defense mechanism from prevention to detection. Monitoring all certs in the logs for all of your domains is far from trivial: accounting for typo- and bitflip squatting, insult domains, and reserving every language variant of your trademark in the almost 1500 TLDs, many large organizations end up with literally thousands of domains to keep track of. No surprise CT Monitoring As A (Paid) Service is now a thing...

And of course there's also a solution that works perfectly well but isn't used at all because it depends on DNSSEC: pinning your cert in the DNS using DNS-based Authentication of Named Entities, aka DANE, but that aside...

CAA Records

...the preventative mechanism that has seen at least some adoption is the use of Certification Authority Authorization or CAA DNS Resource Records, specified in RFC8659. Checking CAA records was made a requirement for Certificate Authorities via CA/B Forum Ballot 187 in 2017[4].
The idea here is that you specify in the CAA records the name of the CAs that you wish to grant authorization to issue certificates for the domain in question. Sounds straightforward, right? Unfortunately, there are a few pitfalls to consider.

For one, the determination of the CAA record to use for a given FQDN is performed as a left-to-right first match. This is useful, because it allows you to have different records for sub.domain.example.com and domain.example.com, with perhaps a catch-all record set on the second-level domain (example.com). (And yes, you can have CAA records on a TLD, but as of early May 2023, no TLD has one set.) Where this gets complicated, however, is when it comes to CNAME records. Per RFC2181, a given label in the DNS may not have any other records if it has a CNAME record (except the associated DNSSEC records), and the CAA resolution must follow the CNAME. This gets messy quickly.

For another, you have to have your act together for all of your domains: you need to know which domains are used where and how, which may have subdomains CNAMEd to third parties, which have subdomains you delegate, which you use for internal versus which you use for external use, etc. Many large organizations are really, really bad at this.

But alright, as so often, it is what it is. Still better than allowing Honest Ahmed and everybody else to issue certs in your domains. So let's take a look at how widely used CAA records actually are.

Use of CAA records

Like before for NS and MX records, I once again pulled down the various gTLD zone files and combined them with whatever ccTLD data I could get my hands on, ending up with just around 214 million domain names in almost 1200 TLDs. In addition, I also took a look at the Tranco Top 1M Domains list and compared results for all TLDs and the Top 1M domains. In total, fewer than 3 million domains have CAA records; fewer than 50K for the Top 1M domains.
That's barely 1.4% of the domains across all TLDs or 4.8% of the Top 1M domains -- not that great, adoption-wise.
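As an aside on the lookup rule described earlier, here is a minimal sketch of the "first match while climbing towards the root" search. It runs against an in-memory dictionary instead of the DNS; the `ZONE` contents and the function name `relevant_caa_set` are made up for illustration, and CNAME handling is left out entirely (a real resolver would follow aliases when answering each query).

```python
# Hypothetical CAA data keyed by owner name; a real implementation
# would query the DNS here instead of reading a dict.
ZONE = {
    "example.com": ['0 issue "letsencrypt.org"'],
    "sub.domain.example.com": ['0 issue ";"'],
}


def relevant_caa_set(fqdn, lookup):
    """Return (owner, records) for the first non-empty CAA RRset found
    while stripping labels from the left, or (None, []) if none exists."""
    labels = fqdn.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        name = ".".join(labels[i:])
        records = lookup(name)
        if records:
            return name, records
    return None, []


for host in ("www.sub.domain.example.com", "domain.example.com", "example.org"):
    print(host, "->", relevant_caa_set(host, ZONE.get))
```

With this data, www.sub.domain.example.com picks up the records set on sub.domain.example.com, domain.example.com falls through to the catch-all on example.com, and example.org finds nothing at all, which means any CA may issue.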
Of those domains that do have CAA records, what do they look like? RFC8659 defines the resource record to be of the format CAA <flags> <tag> <value>. The tag-value portion is called a property, and each domain may have zero, one, or more properties defined. The majority of domains that do have CAA records set appear to use a small number of CAs, commonly <=5, which then adds around 10 records in total, which is indeed the most frequently seen number of CAA records:
Of course there are outliers, too: almost 900 domains have over 20 CAA records, and some domains have even more than 50!
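For reference, the `<flags> <tag> <value>` layout described above can be picked apart with a few lines of Python. This is only a sketch for the textual form that tools like host(1) print; the names `CAARecord` and `parse_caa` are my own, and the regular expression is deliberately lenient rather than a faithful implementation of the RFC grammar.

```python
import re
from typing import NamedTuple


class CAARecord(NamedTuple):
    flags: int
    tag: str
    value: str


# <flags> <tag> <value>, with the value optionally wrapped in double quotes.
_CAA_RE = re.compile(r'^\s*(\d+)\s+([A-Za-z0-9]+)\s+"?(.*?)"?\s*$')


def parse_caa(rdata: str) -> CAARecord:
    """Split the textual RDATA of a CAA record into its three fields."""
    match = _CAA_RE.match(rdata)
    if match is None:
        raise ValueError(f"not a CAA record: {rdata!r}")
    flags, tag, value = match.groups()
    # Tags are matched case-insensitively, so normalize to lower case.
    return CAARecord(int(flags), tag.lower(), value)


print(parse_caa('0 issue "letsencrypt.org"'))
print(parse_caa('128 issuewild ";"'))
```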
CAA flags

The flags field should practically be exactly either 0 or 128, as no other values are currently defined. But this being an RFC, it's of course needlessly complicated and easy to misunderstand: the Issuer Critical Flag is bit 0 of the flags field, and not the value of this field. That is, to set bit 0, you have to specify a value of 128; a value of 1 still leaves that bit unset. It's therefore not surprising to find the top flags encountered to be:
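To make the bit numbering concrete: the RFC counts bits from the most significant end of the octet, so "bit 0" corresponds to the value 128 (0x80), not 1. A small sketch, with constant and function names of my own choosing:

```python
ISSUER_CRITICAL = 0x80  # "bit 0" in the RFC's MSB-first numbering, i.e. 128
RESERVED_MASK = 0x7F    # the remaining seven bits, currently undefined


def is_issuer_critical(flags: int) -> bool:
    """True if the Issuer Critical Flag is set in a CAA flags octet."""
    if not 0 <= flags <= 255:
        raise ValueError("flags must fit into a single octet")
    return bool(flags & ISSUER_CRITICAL)


def reserved_bits(flags: int) -> int:
    """Return whichever undefined bits are set (0 for a clean record)."""
    return flags & RESERVED_MASK


for flags in (0, 1, 128, 129):
    print(flags, is_issuer_critical(flags), reserved_bits(flags))
```

A flags value of 1 comes out as "not critical, one reserved bit set", which is presumably not what whoever wrote that record had in mind.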
(There are an additional 50 other values found, ranging from 2 to 250, with no clear indication what people thought those values might mean.)

CAA properties

RFC8659 defines three different properties: issue, issuewild, and iodef. That's it.[5] But of course you won't be surprised to find that across all the domains analyzed, we find over 100 additional words, including different misspellings of those three properties (e.g., issiue, issuewld, iodev) and what seems like guesswork based on expected functionality (e.g., enable). The overwhelming majority of records are, however, correct, and break down across the three valid properties like so:
Not surprisingly, the majority of organizations implementing CAA records want to restrict issuance, with most also utilizing wildcard issuance restrictions. What is a bit surprising, perhaps, is that only a very small number of organizations appear interested in receiving reports of attempted unauthorized issue requests. (But that is likely explained by the fact that RFC6844 makes honoring iodef optional ("...MAY report..."), and at least Let's Encrypt has publicly stated that they do not send mails on failed issuance due to CAA.) The number of domains using any combination of these three properties is shown in more detail in the table below.
iodef

RFC8659 defines three valid methods for CAs to report requests for issuance that violate the policy: mailto and http(s). For the most part, domains get this right, and not surprisingly prefer the simpler mailto reporting mechanism:
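Checking whether an iodef value uses one of those three schemes is easy enough to automate. The following is a rough sketch using Python's `urllib.parse`; the function name `iodef_kind` and the sample values are hypothetical, and the check is intentionally shallow (it does not validate the mail address or the URL beyond their basic shape).

```python
from urllib.parse import urlsplit


def iodef_kind(value: str) -> str:
    """Classify an iodef value as 'mailto', 'http', 'https', or 'invalid'."""
    try:
        parts = urlsplit(value.strip())
    except ValueError:
        return "invalid"
    if parts.scheme == "mailto" and "@" in parts.path:
        return "mailto"
    if parts.scheme in ("http", "https") and parts.netloc:
        return parts.scheme
    return "invalid"


# Hypothetical sample values, not taken from the data set.
for sample in (
    "mailto:security@example.com",
    "https://example.com/caa-report",
    "example.com",
):
    print(f"{sample!r}: {iodef_kind(sample)}")
```

A bare domain name such as example.com has no scheme at all and is classified as invalid, which is the kind of mistake that shows up in real records.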
Most domains have a single iodef record, although some have multiple, while others clearly misunderstood the proper syntax of the RR, and at least one is using the record as a Log4Shell canary:
$ host -t caa elevate.services | grep iodef
elevate.services has CAA record 0 iodef "mailto:imdomains@intermedia.net"
elevate.services has CAA record 0 iodef "mailto:hostmaster@elevate.services"
elevate.services has CAA record 0 iodef "mailto:hostmaster@intermedia.net"
$ host -t caa smartroom.com | grep iodef
smartroom.com has CAA record 0 iodef "comodoca.com"
smartroom.com has CAA record 0 iodef "usertrust.com"
smartroom.com has CAA record 0 iodef "trust-provider.com"
smartroom.com has CAA record 0 iodef "mailto:domains@bmcgroup.com"
smartroom.com has CAA record 0 iodef "sectigo.com"
$ host -t caa kyhwana.org | grep iodef
kyhwana.org has CAA record 0 iodef "mailto:kyhwana@gmail.com"
kyhwana.org has CAA record 0 iodef "${jndi:ldap://baylwjkcgkp30xx2ut082owpu.canarytokens.com/a}"
The most frequently used iodef records are shown below:
Note the dominance of security@yahoo-inc.com for the iodef records. I'm pleased to see this, since setting the right CAA policy and adding default CAA records for all of Yahoo's (many) parked domains was something I pushed for during my time there. Yay! \o/

issue and issuewild

Ok, so now let's see what CAs the different domains authorize. In total, I found almost 2,200 distinct issue records (for domains in all TLDs, 456 distinct for the Top 1M domains) and 878 issuewild records (all TLDs, 227 Top 1M). The various misspellings and otherwise invalid records aside, the top 20 CAs in these records are:
What you see here shows the overwhelming majority of CAA records using just a handful of CAs. (The use of ; signals that no CA is allowed to issue a certificate for the domain in question; this is used primarily for parked and otherwise unused domains.) But recall what RFC8659 says about the meaning of these records:

    If the issue Property Tag is present in the Relevant RRset for an FQDN, it is a request that Issuers:
    1. Perform CAA issue restriction processing for the FQDN, and
    2. Grant authorization to issue certificates containing that FQDN to the holder of the issuer-domain-name or a party acting under the explicit authority of the holder of the issuer-domain-name.

(Emphasis mine.) Who is the "holder of the issuer-domain-name" for geotrust.com, rapidssl.com, or thawte.com? That's right: DigiCert. That is, by specifying, say, geotrust.com in your CAA record, you are implicitly also granting the various DigiCert subsidiaries authorization. So we can collate many of the above records, which then gives us a breakdown of the most popular CAs used in CAA issue and issuewild records:
Or, if you prefer Pareto charts:
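The collation step itself is nothing more than a lookup table from issuer-domain-name to its holder, followed by a sum. A minimal sketch: the three DigiCert mappings are the ones named above, while the function name `collate` and all counts are invented placeholders purely for illustration, not numbers from my data set.

```python
from collections import Counter

# issuer-domain-name -> holder; only the mappings mentioned in the text.
HOLDER = {
    "digicert.com": "DigiCert",
    "geotrust.com": "DigiCert",
    "rapidssl.com": "DigiCert",
    "thawte.com": "DigiCert",
}


def collate(issuer_counts):
    """Sum per-issuer counts under the organization holding each name.

    Issuers without a known holder are kept under their own name."""
    totals = Counter()
    for issuer, count in issuer_counts.items():
        totals[HOLDER.get(issuer, issuer)] += count
    return totals


# Placeholder counts, NOT real measurements.
sample = {"digicert.com": 5, "geotrust.com": 2, "thawte.com": 1, "letsencrypt.org": 10}
print(collate(sample).most_common())
```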
Extensions

As noted above, even authorizing a given CA can still end up being rather broad, and you may well want to have much tighter restrictions, such as specifying which specific account under a given CA may request certificates for a domain, or how the CA should validate the request. For this, RFC8657 specifies a few extensions: the accounturi parameter and the validationmethods parameter. There is also a draft on Signed HTTP Exchanges within the Web Packages group that adds another parameter: cansignhttpexchanges. As of May 2023, it looks like the only CAs supporting this parameter are digicert.com and pki.goog (see e.g., DigiCert's documentation as well as a discussion on the Let's Encrypt forum), although I also saw a very small number of domains setting this parameter on records authorizing letsencrypt.org, sectigo.com, amazon.com, and globalsign.com. (I'm guessing those were set, but not honored.)

In addition, I encountered three more extension parameters that appear to not be well documented: policy=ev (found only in combination with comodo.com), root=g1-class3 (found only in combination with cacert.org), and account= (found only in combination with letsencrypt.org, digicert.com, cacert.org, and Amazon's CAs). It is not clear to me whether these are actually supported by the different CAs, or if they are opportunistically or mistakenly set by the domain owner.

The use of these extension parameters broken down by number of domains using them looks like this:
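For context, these parameters are appended to the issuer-domain-name in the issue or issuewild value as semicolon-separated name=value pairs. Here is a rough sketch of splitting such a value; the function name `parse_issue_value` is my own, the account number in the example is made up, and the parsing is more forgiving than the formal grammar.

```python
def parse_issue_value(value: str):
    """Split an issue/issuewild value into (issuer, parameters).

    An empty issuer (as in the value ";") means no CA is authorized."""
    issuer, _, rest = value.partition(";")
    params = {}
    for chunk in rest.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        name, sep, val = chunk.partition("=")
        if not sep:
            raise ValueError(f"parameter without '=': {chunk!r}")
        params[name.strip()] = val.strip()
    return issuer.strip().lower(), params


# Hypothetical record value; the account number is invented.
issuer, params = parse_issue_value(
    "letsencrypt.org; "
    "accounturi=https://acme-v02.api.letsencrypt.org/acme/acct/1234; "
    "validationmethods=dns-01"
)
print(issuer)
print(params["accounturi"])
print(params["validationmethods"].split(","))  # comma-separated list
```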
validationmethods encountered were dns-01 (dominant), http-01 and tls-alpn-01; accounturis were primarily under https://acme-v02.api.letsencrypt.org/, with just a handful under https://acme-v01.api.letsencrypt.org/ and https://acme-staging-v02.api.letsencrypt.org/.

Summary

Having analyzed around 214 million domain names, here are my main findings:
- CAA records are still not widely used.
- Most people don't set iodef.
- Extensions are not widely used.
- A small number of CAs dominate.

Even though this only covers the small percentage of domains that do set CAA records, I would not be surprised if the overall use of CAs across all domains followed a similar distribution. (In some markets, regional players will play a bigger role; once again the inability to get access to all ccTLD zones makes this difficult to assess.)

If you're wondering whether you really need to have over 160 different CAs in your trust bundle, I suspect the answer is "no"; you could likely get away with fewer than 20 and wouldn't notice the difference. But whether that's a good thing, whether it's wise for the entire internet to place all -- well, >99% -- of its certificates/eggs into fewer than 10 CAs/baskets seems more than questionable.

Footnotes:

[1] I use the term "browsers" here as if all browsers implemented the same features. Of course there are differences, but since basically all browsers are Chrome now anyway, they are, sadly, becoming increasingly less relevant. Whatever Chrome does is what "the browsers" do now.

[2] NB: as of 2023-05-16, it looks like only Google, Facebook, the TorProject, and Yahoo have static pins in Chrome. Considering that changing or updating your static pins requires the release and propagation across all markets you care about -- of multiple browsers, no less -- it might be time to deprecate that, too.

[3] CT is nowadays enforced in the browsers[1], which is why the Expect-CT header, defined in RFC9163, was pretty short-lived.

[4] It's worth noting that compliance with CAA records, like Certificate Transparency and some other restrictions, is not required for root certs that you (or your organization's IT policy) installed in your trust bundle yourself.

[5] CA/B Forum Ballot SC13 and Ballot SC14 added contactemail and contactphone to allow domain owners to provide information that increasingly is hidden in WHOIS. But these are not defined in the RFC and very rarely used: only 741 out of all domains observed used contactemail (54 out of the Top 1M Domains), and 23 used contactphone (3 out of the Top 1M Domains).