Resynth

Cloudflare's Email Protection Does Not Work

This isn't the way to fix email spam.


I've recently taken an interest in Cloudflare's wide range of fascinating JavaScript functions. They host scripts like "Rocket Loader", which is meant to make scripts load faster, though I have yet to investigate that.

Let's take a look at one of Cloudflare's long-standing security options: email protection. 🔒

If you browse the web with JavaScript disabled, you may have run into this before. In short, Cloudflare can obfuscate email addresses, forcing you to enable JavaScript to view them. Supposedly, this thwarts bots, but we'll investigate that below.

De-obfuscating the de-obfuscation

(This section is quite advanced; feel free to skip to the next section.)

If you've enabled Cloudflare's email protection on your page, you'll probably notice that all mailto: links, and any text of the form abc@abc.abc, are in fact obfuscated. A JavaScript file is also injected into your page, which runs the de-obfuscation process.

Here's an example of an obfuscated email:

<a href="/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="532136202a3d273b626a67601331323d323d327d373c34">[email&#160;protected]</a>

We'll deobfuscate this below.

Relatively speaking, this seemingly cryptic process is surprisingly simple. If we view the script in the browser, we'll see something along the lines of this (the full script):

!function(){"use strict";function e(e){try{if("undefined"==typeof console)return;"error"in console?console.error(e):console.log(e)}catch(e){}}function t(e){return d.innerHTML='<a href="'+e.replace(/"/g,"&quot;")+'"></a>',d.childNodes[0].getAttribute("href")||""}function r(e,t){var r=e.substr(t,2);return parseInt(r,16)}function n(n,c){for(var o="",a=r(n,c),i=c+2;i<n.length;i+=2){var l=r(n,i)^a;o+=String.fromCharCode(l)}try{o=decodeURIComponent(escape(o))}catch(u){e(u)}return t(o)}function c(t){for(var r=t.querySelectorAll("a"),c=0;c<r.length;c++)try{var o=r[c],a=o.href.indexOf(l);a>-1&&(o.href="mailto:"+n(o.href,a+l.length))}catch(i){e(i)}}function o(t){for(var r=t.querySelectorAll(u),c=0;c<r.length;c++)try{var o=r[c],a=o.parentNode,i=o.getAttribute(f);if(i){var l=n(i,0),d=document.createTextNode(l);a.replaceChild(d,o)}}catch(h){e(h)}}function a(t){for(var r=t.querySelectorAll("template"),n=0;n<r.length;n++)try{i(r[n].content)}catch(c){e(c)}}function i(t){try{c(t),o(t),a(t)}catch(r){e(r)}}var l="/cdn-cgi/l/email-protection#",u=".__cf_email__",f="data-cfemail",d=document.createElement("div");i(document),function(){var e=document.currentScript||document.scripts[document.scripts.length-1];e.parentNode.removeChild(e)}()}();

The script is minified, which means a tool has been used to shrink it by stripping whitespace and shortening names. This makes the script look rather esoteric, but once you beautify it, it's a lot easier to read.

Let's cut out the bloat and get to the important parts of the code: three constants, plus the three functions that do most of the de-obfuscation work:

  // The prefix of the `href` on any masked `<a>` tag; the encoded data follows the `#`.
  var l = "/cdn-cgi/l/email-protection#",
    // A CSS selector (for `querySelectorAll`) matching masked elements.
    u = ".__cf_email__",
    // The name of the attribute that holds the masked data. This is important.
    f = "data-cfemail";
 
  // Takes the two characters of `e` starting at index `t` and parses them as one hexadecimal byte.
  function r(e, t) {
    // String.substr :: (start: number, length?: number) => string;
    var r = e.substr(t, 2);
    
    // This parses the number, in hexadecimal form.
    return parseInt(r, 16);
  }
  
  function n(n, c) {
    for (var o = "", a = r(n, c), i = c + 2; i < n.length; i += 2) {
      var l = r(n, i) ^ a;
      
      // Get the character from the UTF-16 code unit.
      o += String.fromCharCode(l);
    }
    try {
      // Pass the output through `escape`, then `decodeURIComponent`.
      // Together these reinterpret the decoded bytes as UTF-8 text.
      o = decodeURIComponent(escape(o));
    } catch (u) {
      // Error logging function.
      e(u);
    }
    // `t` (see the full script) round-trips the string through an `<a href>` attribute.
    return t(o);
  }
  function c(t) {
    // Iterates over all available links in page.
    for (var r = t.querySelectorAll("a"), c = 0; c < r.length; c++)
      try {
        // Get the link.
        var o = r[c],
        
          // Check index of /cdn-cgi/l/email-protection# in string.
          a = o.href.indexOf(l);
        // Make sure the link is right, then...
        // ...set the `href` of the `<a>` to the unpacked email.
        a > -1 && (o.href = "mailto:" + n(o.href, a + l.length));
      } catch (i) {
        // Error logging function.
        e(i);
      }
  }

Stripped of the DOM handling, the de-obfuscation process is as simple as this:

// Here, `hash` is the value of the `data-cfemail` attribute: the email, XOR-encoded and written as hex.
function parse (hash, index) {
  const sliced = hash.substr(index, 2);
  return parseInt(sliced, 16);
}

function extract (hash) {
  let output = '';
  
  for (let index = 2, start = parse(hash, 0); index < hash.length; index = index + 2) {
    const charCode = parse(hash, index) ^ start;
    output += String.fromCharCode(charCode);
  }
  
  output = escape(output);
  output = decodeURIComponent(output);
  return output;
}
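
The link form handled by function c works the same way, except that the hex payload sits after the # in the href instead of in data-cfemail. Here is a minimal sketch of that case. The href below is built by hand from the earlier data-cfemail value rather than copied from a real page, the names decodeHex and decodeHref are my own, and the UTF-8 step is left out, so it only handles ASCII addresses:

```javascript
// Prefix Cloudflare puts in front of the hex payload in rewritten links
// (the value of `l` in the script above).
const PREFIX = "/cdn-cgi/l/email-protection#";

// Decode a hex payload starting at `start`: the first byte is the XOR key,
// every following byte is one character of the address.
function decodeHex(hex, start = 0) {
  const key = parseInt(hex.slice(start, start + 2), 16);
  let output = "";
  for (let i = start + 2; i < hex.length; i += 2) {
    output += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16) ^ key);
  }
  return output; // ASCII-only sketch: no UTF-8 decoding step
}

// Hypothetical helper: returns a mailto: URL, or null if the href is not masked.
function decodeHref(href) {
  const at = href.indexOf(PREFIX);
  return at === -1 ? null : "mailto:" + decodeHex(href, at + PREFIX.length);
}

const href = "/cdn-cgi/l/email-protection#532136202a3d273b626a67601331323d323d327d373c34";
console.log(decodeHref(href)); // mailto:resynth1943@banana.dog
```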

Returning to our earlier example, we're going to extract the email from this HTML:

<a href="/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="532136202a3d273b626a67601331323d323d327d373c34">[email&#160;protected]</a>

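Firstly, we can discard everything but the value of the data-cfemail attribute, which holds the encoded email (I call it the "hash" here, although strictly it is a reversible encoding rather than a true hash):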

532136202a3d273b626a67601331323d323d327d373c34

Secondly, we'll use our extract function to unpack the email from the hash.

When we do that, we'll end up with "resynth1943@banana.dog"! 🎉️
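
To see where that result comes from, here is a small trace of the first few steps. It is only an illustration (the `rows` table is my own addition): the first hex pair, 53, is the key, and every later pair is XORed with it to give one character code.

```javascript
const hash = "532136202a3d273b626a67601331323d323d327d373c34";
const key = parseInt(hash.slice(0, 2), 16); // 0x53

// Build one row per encoded byte: hex pair, XOR result, resulting character.
const rows = [];
for (let i = 2; i < hash.length; i += 2) {
  const pair = hash.slice(i, i + 2);
  const code = parseInt(pair, 16) ^ key;
  rows.push({ pair, code: code.toString(16), char: String.fromCharCode(code) });
}

// Print the first four steps.
for (const { pair, code, char } of rows.slice(0, 4)) {
  console.log(`0x${pair} ^ 0x${key.toString(16)} = 0x${code} -> ${char}`);
}
// 0x21 ^ 0x53 = 0x72 -> r
// 0x36 ^ 0x53 = 0x65 -> e
// 0x20 ^ 0x53 = 0x73 -> s
// 0x2a ^ 0x53 = 0x79 -> y
```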

Let's move on to the algorithm behind it all.

The Algorithm

I'll break down the above JavaScript into pseudocode, for our less JavaScript-ey readers. Here it is:

Function Parse (Hash, Index) Do
  Return ParseNumber (CreateSubString (Hash, Index, 2), 16)
End

Function Extract (Hash) Do
  Variable Output = ""
  Variable Start = Parse (Hash, 0)
  Variable Index = 2

  While Index Is Less Than Length (Hash) Do
    Variable CharacterCode = Parse (Hash, Index) XOR Start
    Append To Output: Get String Of UTF16 Character Code (CharacterCode)

    Increment Index By 2
  End

  Escape Output
  DecodeURIComponent Output
  Return Output
End

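The algorithm is surprisingly simple. It has a few "magic" areas, e.g. the XOR and the parse, but the rest is pretty straightforward: the first two hex digits are a one-byte key, and every following pair of digits is one byte of the email XORed with that key.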
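
The XOR is less magical than it looks: XORing a value with the same key twice gives back the original, so the encoder is just the decoder run in reverse. The sketch below makes that concrete. Note that `encode` is my own guess at the inverse, inferred from the client-side script and not taken from Cloudflare's server code, and that I've swapped the escape/decodeURIComponent pair for TextEncoder/TextDecoder to handle the UTF-8 step.

```javascript
// Hypothetical encoder: the inverse of the decoding loop. One key byte,
// followed by each UTF-8 byte of the email XORed with that key.
function encode(email, key) {
  const hex = (n) => n.toString(16).padStart(2, "0");
  let out = hex(key);
  for (const byte of new TextEncoder().encode(email)) {
    out += hex(byte ^ key);
  }
  return out;
}

// Decoder: XOR every byte after the first with the key, then read as UTF-8.
function decode(hash) {
  const key = parseInt(hash.slice(0, 2), 16);
  const bytes = [];
  for (let i = 2; i < hash.length; i += 2) {
    bytes.push(parseInt(hash.slice(i, i + 2), 16) ^ key);
  }
  return new TextDecoder().decode(new Uint8Array(bytes));
}

const email = "resynth1943@banana.dog";
console.log(encode(email, 0x53)); // 532136202a3d273b626a67601331323d323d327d373c34
console.log(decode(encode(email, 0xa7)) === email); // true
```

Any key from 0x00 to 0xff round-trips, which is why the same address can show up with a different-looking payload.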

Your Turn

Here's a challenge. Check out a few more IDs from the banana.dog homeserver:

ec958dac8e8d828d828dc288838b
8ff6eeeefffccfedeee1eee1eea1ebe0e8
3f5c5e4d4d565e585a7f5d5e515e515e115b5058
7511101700121b1c1b1f143517141b141b145b111a12
0266676077656c6b6c6863426f6371766d666d6c2c71677065636e2c6d7065
bfeccfd7dacdd6dcded3eccfd7dacddaffddded1ded1de91dbd0d8
65170a070017110a2507040b040b044b010a02

Try and de-obfuscate them! You could also try improving on my de-obfuscation function... 🐾
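
If you want a starting point for the "improving" part, here is one possible direction, offered as a sketch and not a finished tool. The name decodeCfEmail and the choice to return null on bad input are my own. It checks that the input really is hex before decoding, and rejects malformed UTF-8 instead of logging an error:

```javascript
// Returns the decoded string, or null when the input is not a valid payload.
function decodeCfEmail(hash) {
  // Require an even number of hex digits: one key byte plus at least one data byte.
  if (typeof hash !== "string" || !/^(?:[0-9a-f]{2}){2,}$/i.test(hash)) {
    return null;
  }
  const key = parseInt(hash.slice(0, 2), 16);
  const bytes = new Uint8Array(hash.length / 2 - 1);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hash.slice(2 * i + 2, 2 * i + 4), 16) ^ key;
  }
  try {
    // `fatal: true` makes malformed UTF-8 throw instead of being silently replaced.
    return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    return null;
  }
}
```

I'll leave running it over the list above to you.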

Conclusion

Cloudflare's attempts to protect emails from spammers are commendable, but they do not work. Even if they changed the algorithm, there are a few inherent problems with using JavaScript for de-obfuscation:

  • The spammers could enable JavaScript!
  • The JavaScript can easily be reverse-engineered
  • It requires extra, unnecessary DOM manipulation / computations
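
In fact, a scraper doesn't even need to run Cloudflare's script or build a DOM. As a rough sketch (the harvest function and the regular expression are my own, and real pages may format the attribute differently), scanning the raw HTML for data-cfemail values is enough:

```javascript
// Find every data-cfemail payload in raw HTML and decode it, no DOM required.
function harvest(html) {
  const found = [];
  for (const match of html.matchAll(/data-cfemail="([0-9a-f]+)"/gi)) {
    const hash = match[1];
    const key = parseInt(hash.slice(0, 2), 16); // first byte is the XOR key
    let email = "";
    for (let i = 2; i < hash.length; i += 2) {
      email += String.fromCharCode(parseInt(hash.slice(i, i + 2), 16) ^ key);
    }
    found.push(email); // ASCII-only sketch: no UTF-8 decoding step
  }
  return found;
}

const page =
  '<a href="/cdn-cgi/l/email-protection" class="__cf_email__" ' +
  'data-cfemail="532136202a3d273b626a67601331323d323d327d373c34">[email&#160;protected]</a>';
console.log(harvest(page)[0]); // resynth1943@banana.dog
```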

On top of this, relying on JavaScript places more constraints on people using the Tor Browser, for little to no gain. Depending on the user's threat model, JavaScript may be seen as a danger to anonymity. Many Tor users disable it.

Cloudflare's email protection also follows a predictable pattern. Instead of natural obfuscation techniques, it uses a programmatic, reverse-engineerable algorithm, which puts the users who trust this software at risk. This functionality seems to be a total smokescreen.

I would definitely not recommend you use this "protection". There are much better alternatives that simply aren't as predictable, such as writing the address as "resynth1943 @@ tutanota .. com". These are easier to use, and they require no DOM manipulation, no useless number-crunching, and no JavaScript.

Nothing is perfect. There's no fool-proof way to stop email spam; it's just about minimising it.

When you enable this "protection", you're not going to stop spam. You're just slowing it down. For proper anti-spam techniques... that's for another article.

resynth1943.article