Christian Huitema's blog

DMARC or not, can email evolve?

21 Apr 2014

Many years ago, I worked on email standards, developing for example a gateway between SMTP/TCP-IP and X.400. We used it in the very early years of the Internet, from 1983 to about 1990, when European research networks finally gave up on the OSI/ITU standards and fully embraced the Internet. Since then, there have been hits and misses. The big success was the standardization of MIME, which allowed multimedia content. Everybody uses it to send pictures or presentations as email attachments. The big failure has been security.

Email security has progressed in the last 20 years, but only very slowly. The IETF, the Internet's standards community, made several attempts at defining "secure mail" extensions. "Privacy Enhanced Mail" never got deployed. S/MIME got widely implemented but hardly anybody uses it. PGP is popular among security experts and paranoid users, but the general public pretty much ignores it. The vast majority of email is spam, despite attempts at filtering, blacklisting, and origin verification protocols like SPF or DKIM. And then, there is "phishing," which turns out to be a very vicious threat.

Phishing operates by sending an email to the target, the "phish." The email will typically contain some kind of trap, maybe an attachment that conceals a virus, or maybe a link to an infected web site. If the phishes open the attachment or follow the link, they are hooked. In large part, phishing is enabled by the very weak "origin control" in email messages. It is the same weakness that we used to exploit for practical jokes, such as sending this kind of email to a newly hired coworker:

From: Bill Gates

To: John Newbee

Subject: New Hire Welcome Party

Please join me for the new hire welcome party today at 5 pm in the Building 7 lobby.

Hilarity ensued when the poor guy prepared for and tried to join the non-existent party. (There is no building 7 on the Microsoft campus.) The prank was very easy to do: just telnet to port 25 of an email server, and tell that server that there was an incoming mail from "billg@microsoft.com." Back then, the message was just accepted, no questions asked. Things have progressed somewhat with the deployment of DKIM and SPF, and the "real" sender of the message will be tracked. The same prank is still doable, but it will require creating a relay domain under the control of the prankster, plus a fair amount of configuration of that domain. If the sender field is displayed, the receiver will have more information:

Sender: Postmaster <postmaster @ mailrelay.example.net>

From: Bill Gates <billg @ microsoft.com>

To: John Newbee

Subject: New Hire Welcome Party

Please join me for the new hire welcome party today at 5 pm in the Building 7 lobby.

In theory, an alert reader will notice that the "sender" and "from" fields do not match, and will discard the message as a joke. But in practice, many people will still be fooled. Some mail programs may not have a good convention for displaying the sender information, and some do not display it at all. Users may or may not understand the significance of the "sender" field.
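The check that an alert reader performs by eye can also be done by software. Here is a minimal sketch, assuming a raw message with the same headers as the example above; the helper names `address_domain` and `sender_mismatch` are made up for this illustration, and the comparison is deliberately naive (exact domain match only).

```python
from email import message_from_string
from email.utils import parseaddr


def address_domain(header_value):
    """Return the lowercased domain of the address in a header, or None."""
    _, addr = parseaddr(header_value or "")
    if "@" not in addr:
        return None
    return addr.rsplit("@", 1)[1].lower()


def sender_mismatch(raw_message):
    """True when a Sender header is present and its domain differs from From."""
    msg = message_from_string(raw_message)
    sender = address_domain(msg["Sender"])
    origin = address_domain(msg["From"])
    return sender is not None and sender != origin


# Same headers as the prank message above, with the spaces around "@" removed.
RAW = (
    "Sender: Postmaster <postmaster@mailrelay.example.net>\n"
    "From: Bill Gates <billg@microsoft.com>\n"
    "To: John Newbee\n"
    "Subject: New Hire Welcome Party\n"
    "\n"
    "Please join me for the new hire welcome party today at 5 pm.\n"
)

print(sender_mismatch(RAW))  # True: the relay domain is not microsoft.com
```

A mail program could use such a test to show a warning next to the "from" line, rather than relying on the user to compare the two fields.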

Phishing is often a numbers game, in which the attackers may try many potential phishes in a target organization. They only need to hook one of them to get a beachhead inside the target, and proceed from there on their way to the trove of secrets that they are coveting. A single lapse of judgment by one user is enough to compromise the whole organization. And that's why we now see a proposed escalation on top of SPF and DKIM, with DMARC.

DMARC allows domain owners to set policy on the handling of email coming from their domains. It basically directs email recipients to use SPF and DKIM to check the origin of the email, and to verify that the "sender" domain matches the "from" information. If there is a mismatch, the recipients are instructed to either flag the message for further inspection, or possibly to reject it outright. The sending domain chooses between a "flag" or "reject" policy, and the receivers are expected to just follow orders.
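To make the "follow orders" part concrete, here is a rough sketch of how a receiver might apply a published policy. DMARC policies are published as DNS TXT records made of `tag=value` pairs, with the `p` tag set to `none`, `quarantine` or `reject`. Mapping `quarantine` to "flag" is an assumption made to match the wording above, the record string is a made-up example, and the `aligned` argument stands in for the real SPF and DKIM checks, which are not implemented here.

```python
def parse_dmarc_record(txt):
    """Split a DMARC TXT record such as 'v=DMARC1; p=reject' into a tag dict."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            tags[name.strip().lower()] = value.strip()
    return tags


def disposition(record_txt, aligned):
    """Decide what to do with a message, following the sending domain's policy.

    aligned: True when the SPF or DKIM check passed for the "from" domain
    (computed elsewhere; this sketch does not perform those checks).
    """
    tags = parse_dmarc_record(record_txt)
    if tags.get("v") != "DMARC1":
        return "deliver"  # no usable policy published
    if aligned:
        return "deliver"
    policy = tags.get("p", "none").lower()
    # Assumption: "quarantine" plays the role of the "flag" policy in the text.
    actions = {"none": "deliver", "quarantine": "flag", "reject": "reject"}
    return actions.get(policy, "deliver")


# Hypothetical record, as a domain with a strict policy might publish it.
RECORD = "v=DMARC1; p=reject; rua=mailto:reports@example.com"

print(disposition(RECORD, aligned=True))   # deliver
print(disposition(RECORD, aligned=False))  # reject
```

Note that the receiver has no say in the outcome: the only inputs are the sender's record and the result of the origin checks.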

In theory, DMARC would make some kinds of phishing much harder. In practice, it turns out to be incompatible with the existing practice of remailers and mailing lists. For example, if I send a message to the IETF mailing list, the recipients will see it appear as:

Sender: ietf <ietf-bounces@ietf.org>

From: Christian Huitema <huitema@microsoft.com>

This is exactly the same relay structure that could be abused by phishers, so of course it is incompatible with DMARC. All mail delivered by such mailing lists will thus be flagged as "potential phishing message" by the DMARC filter. Yahoo went one step further, and asked DMARC compliant recipients to automatically reject such messages, causing a great uproar among mailing list managers. Just check the IETF list archive, and you will see hundreds of messages in a few days, most stating that "Yahoo broke mailing lists, they need to fix it," with a few DMARC supporters asserting that "mailing lists are stuck in the 80's, they need to evolve."

The problem of course is that mailing lists cannot evolve easily. Many mailing list agents reformat the subject field or add some text to the email, which breaks DKIM. All mailing list operations include sending the message through a relay that has no relation with the "from" address, which either breaks SPF or requires different "sender" and "from" addresses, breaking the DMARC "same origin" rule. The only way for list agents to comply would be to completely rewrite the message and create a new origin within the mailing list domain, something like:

Sender: ietf <ietf-bounces@ietf.org>

From: Christian Huitema <ietf-christian-huitema@ietf.org>

Reply-To: Christian Huitema <huitema@microsoft.com>
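Such a rewrite is mechanical enough to sketch in a few lines. This is only an illustration, not the behavior of any actual list software: the function name `rewrite_for_list` and the `<list>-<name>` naming scheme for the new local part are assumptions modeled on the example headers above.

```python
from email.message import EmailMessage
from email.utils import formataddr, parseaddr


def rewrite_for_list(msg, list_name, list_domain):
    """Move From into the list's domain and keep the author in Reply-To."""
    display_name, author_addr = parseaddr(str(msg["From"]))
    # Hypothetical naming scheme: <list>-<display-name-with-dashes>
    local = f"{list_name}-{display_name.lower().replace(' ', '-')}"
    # Delete before setting: EmailMessage refuses a second From or Sender.
    for name in ("Sender", "From", "Reply-To"):
        del msg[name]
    msg["Sender"] = formataddr((list_name, f"{list_name}-bounces@{list_domain}"))
    msg["From"] = formataddr((display_name, f"{local}@{list_domain}"))
    msg["Reply-To"] = formataddr((display_name, author_addr))
    return msg


original = EmailMessage()
original["From"] = "Christian Huitema <huitema@microsoft.com>"
original["Subject"] = "DMARC and mailing lists"
original.set_content("Some text.")

rewritten = rewrite_for_list(original, "ietf", "ietf.org")
for name in ("Sender", "From", "Reply-To"):
    print(f"{name}: {rewritten[name]}")
```

The only trace of the original author's address is now in the "reply-to" field, which is exactly what the receiving side of a DMARC check does not look at.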

Such a rewrite would comply with DMARC because the origin of the message would be clearly in the IETF domain. Of course, we would want private replies to also work, and for that the mailing list agent would need to put the original sender address in the "reply-to" field. In short, we would get a "DMARC compliant mailing list" that works, at some increased operational cost. But the problem is that the result is not really different from another form of phishing, in which the phisher creates addresses in his own domain, such as:

From: Christian Huitema <christian-huitema@all-your-mail-belong-to-us.info>

By encouraging mailing lists to rewrite addresses, we would encourage users to disregard the "email address" part of the "from" field, because it varies from list to list. Instead, users would just look at the common name, which can be easily forged by phishers. Strict DMARC policy would cause mail agents to create workarounds that make phishing easier, not harder.

I don't know what will happen next. Reasonable minds would think that Yahoo would revert their DMARC policy from "reject" to "flag." After all, once a message is flagged, inspection software can reasonably be expected to tell the difference between a reputable mailing list and a phishing domain. The software could learn the mailing lists that a user is subscribed to. At the same time, we may expect mailing list practices to evolve and not break DKIM, for example by placing the mailing list specific information in a new header instead of rewriting subject and message fields that are covered by the DKIM signature. All that will probably happen within the next month, and hopefully phishing will become harder.
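The difference between rewriting a signed field and adding a new header can be shown with a toy model. The sketch below is not DKIM: there is no key, no canonicalization and no real signature, just a SHA-256 digest over the headers assumed to be covered (here "from" and "subject") and the body. `List-Id` is used as an example of a header that can carry list information; its value here is made up.

```python
import hashlib

# Assumption: the headers that the (toy) signature covers.
SIGNED_HEADERS = ("from", "subject")


def signed_digest(headers, body):
    """Hash the covered headers and the body; a stand-in for a signature."""
    h = hashlib.sha256()
    for name in SIGNED_HEADERS:
        h.update(f"{name}:{headers.get(name, '')}\r\n".encode())
    h.update(body.encode())
    return h.hexdigest()


body = "Some text.\r\n"
original = {
    "from": "Christian Huitema <huitema@microsoft.com>",
    "subject": "DMARC and mailing lists",
}
reference = signed_digest(original, body)

# Common list practice: tag the subject, which is covered by the signature.
retagged = dict(original, subject="[ietf] DMARC and mailing lists")

# Alternative: leave covered fields alone and add the list info in a header.
with_list_id = dict(original)
with_list_id["list-id"] = "<ietf.ietf.org>"

print(signed_digest(retagged, body) == reference)      # False
print(signed_digest(with_list_id, body) == reference)  # True
```

The same reasoning applies to text appended to the message body: anything covered by the signature has to come through the list untouched.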

The main lesson of this debate is that changing an old standard is really hard. Email has been around since the beginnings of the Internet, and each little detail of the e-mail headers is probably there to serve some group of users somewhere. Security changes have to accommodate all these constituencies. That will be slow!