The recent brouhaha with Site Finder

Peter Seebach (crankyuser@seebs.plethora.net), Freelance writer, 24 October 2003

Abstract: When VeriSign launched its Site Finder service, it included a number of side effects. One was the redirection of mistyped domain names to its own Web site. Under pressure from the Internet Corporation for Assigned Names and Numbers (ICANN), VeriSign has suspended the new service pending further discussion. This article addresses some of the effects of the service.

When people talk about usability, they normally think in terms of end-user applications. The Internet's underlying infrastructure is generally not an issue. Packets move from one place to another; host names are resolved; e-mail is delivered. These things happen, without much user attention, because the underlying infrastructure has been carefully designed with robustness in mind, even in the face of large-scale catastrophic hardware failures.

Many contend that the Internet infrastructure was not designed to withstand changes to the code that handles standard name service. On September 14th and 15th of this year, VeriSign -- one of the companies responsible for maintaining this name service -- introduced such changes, in the form of edits to the zone files for the .com and .net Top-Level Domains (TLDs), causing invalid domain names in those TLDs to resolve, not to an NXDOMAIN error, but to a machine that hosted the VeriSign Site Finder service.

How it works

A brief summary of how this works is perhaps in order. Name service is a broadly distributed service. In general, when you need to look up a host name, you send a query to the root name servers -- a set of name servers distributed around the world and run by multiple organizations.

These name servers normally don't themselves give you an answer. Instead, they provide directions to the authoritative name servers for the domain in question. For example, if you look up www-106.ibm.com, the root name servers will direct you to the servers for the .com TLD, which in turn direct you to IBM's name servers, which actually look up the host for you.

A TLD's zone file contains the records that the TLD's name servers need to deliver this information reliably. What VeriSign did was very simple: At the end of this file, it attached a wildcard record -- a record that matched any string entered -- that returned the address of a VeriSign machine. So, if a domain doesn't match one of the specific rules, the wildcard record is used. If no authoritative name server exists for the host you looked up, instead of giving you the traditional error, the lookup returns the address of a machine that serves a Web page that, as VeriSign describes in a recent press release, "offers a search box, a 'Did You Mean?' listing of similar domain names, and a listing of popular categories related to the search request."
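The effect of a wildcard on lookups can be pictured with a toy model. The following is only a sketch in Python, not how any real name server is implemented; the zone contents, the server name ns.ibm.example, and the address 192.0.2.1 are placeholders invented for illustration.

```python
# Toy model of a TLD zone: exact names delegate to name servers, and an
# optional wildcard entry catches everything else.
# All names and addresses below are illustrative placeholders.
WILDCARD = "*"


def lookup(zone, name):
    """Return (kind, value): a delegation, a wildcard answer, or NXDOMAIN."""
    if name in zone:
        return ("NS", zone[name])       # normal referral to the domain's servers
    if WILDCARD in zone:
        return ("A", zone[WILDCARD])    # wildcard matches any unregistered name
    return ("NXDOMAIN", None)           # traditional "no such domain" result


zone_before = {"ibm.com": "ns.ibm.example"}
zone_after = {**zone_before, WILDCARD: "192.0.2.1"}

print(lookup(zone_before, "ibn.com"))   # a typo gets NXDOMAIN
print(lookup(zone_after, "ibn.com"))    # the same typo now gets an address
```

Registered names behave the same in both zones; only the answer for names that do not exist changes.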

This might seem harmless enough. All VeriSign is doing is catching every single typo anyone will ever make and redirecting it to their own search engine page. There's certainly some validity to VeriSign's claim that "Internet users consider the service a helpful tool to navigate the Web." But VeriSign also enjoyed an apparently substantial benefit itself: In a single day, without spending a penny on advertising, Site Finder became one of the most widely used sites in the world. So what's the impact?

Estimating the impact

All the effects are difficult to assess, because they are so widespread. Most early coverage of this issue talks only about the effect on Web sites. However, e-mail is affected too. Anything, anywhere, that uses DNS can be affected by this issue.

Let's look at a few real-world cases. Sometimes, I'll visit an old bookmark and find that the company is gone. Often, a domain squatter has replaced the company. If I have trouble reaching a page, I might write it off as a temporary failure. However, if I try to visit a Web site and find some domain squatter's "Find what you're looking for" page, I know the company has failed and the site is no longer live.

For example, suppose you tried to visit my company's page, but entered a typo. Traditionally this would direct you to an error page indicating "that domain name is invalid" -- encouraging you to review and correct your typo. But if, instead, you are directed to the Site Finder Web page, you might conclude that my domain has been acquired by VeriSign, and that my company has gone bust. You might know better, but you might be misled. This is not helpful, and it is hard to work around.

E-mail suffers worse. A traditional tactic of spammers is to put invalid domains in e-mail, so that mail to them bounces. This allows users to filter all mail with invalid domains in the headers; this solution captures only a small percentage of spam, but it's very close to 100% reliable and substantially more accurate than nearly any other filter. With Site Finder active, such filters didn't work at all.
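A filter of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any particular mail server's code: the function names, the set of registered domains, and the two stand-in resolver functions are all hypothetical, and a real filter would make actual DNS queries.

```python
def accept_sender(mail_from, resolves):
    """Accept mail only if the envelope sender's domain resolves.

    `resolves` is a caller-supplied function (hypothetical here) that
    returns True when a DNS lookup for the domain succeeds.
    """
    _local, sep, domain = mail_from.rpartition("@")
    if not sep or not domain:
        return False                    # no domain part at all
    return resolves(domain.lower())


# Stand-ins for real DNS lookups; the domain list is made up.
REGISTERED = {"momandpopisp.com"}


def resolves_normally(domain):
    # Without a wildcard, only registered domains resolve.
    return domain in REGISTERED


def resolves_with_wildcard(domain):
    # With a wildcard in .com and .net, every name under them resolves.
    return domain in REGISTERED or domain.endswith((".com", ".net"))


forged = "someone@no-such-domain-at-all.com"
print(accept_sender(forged, resolves_normally))       # rejected
print(accept_sender(forged, resolves_with_wildcard))  # accepted
```

The filter's logic is unchanged in both cases; it is the resolver's answer for nonexistent domains that differs.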

I spoke with a postmaster at a large company who explained the difficulty.

"The real issue is forged domains in sender envelopes. It was the first line of defense (second, if you use the RBLs -- Realtime Blackhole Lists -- to block the connection itself). Now, Joe Spammer can send MAIL FROM:yourmother@verisign.haha.com and you have no way to differentiate that domain from momandpopisp.com. It returns an A record, so you accept it and pass it on. At my job, we have three to four tiers of Mail Transfer Agents, or MTAs. The border is used to block the worst of things. Now, we have to accept it, and it gets two to three levels in before being undeliverable. And now we have to pass the bounce back [to VeriSign's MTA, which rejects everything]. My machines are NOT happy today. This is worse than SoBig as far as our machines' performance is concerned."

The SoBig e-mail worm was a fairly major load on mail servers worldwide. Site Finder had a greater effect than SoBig, for at least one major company. And, according to the postmaster I just quoted, that was just on the first day that it was in use.

It gets worse. DNS blacklists, used to reject spam, work by doing DNS lookups. For instance, if I want to know whether a given IP address, say 1.2.3.4, is in the MAPS (Mail Abuse Protection System) RBL, I reverse its octets, append the list's domain, and look up the resulting name, 4.3.2.1.blackholes.mail-abuse.org. If this name resolves, then the address is in the RBL; if it doesn't, the address is not in the RBL.
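The query name in that example can be built mechanically. Here is a minimal Python sketch; the function names are my own, and the `resolves` parameter is a hypothetical stand-in for a real DNS lookup.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the name to look up: reversed octets followed by the list's zone."""
    # IPv4Address raises ValueError for anything that is not a valid address.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone, resolves):
    """An address counts as listed exactly when the query name resolves.

    `resolves` is a caller-supplied function standing in for a DNS lookup.
    """
    return resolves(dnsbl_query_name(ip, zone))


print(dnsbl_query_name("1.2.3.4", "blackholes.mail-abuse.org"))
# 4.3.2.1.blackholes.mail-abuse.org
```

Note that the scheme has only two outcomes, "resolves" and "does not resolve," so it depends entirely on nonexistent names failing to resolve.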

Some blacklists have gone out of service and no longer have authoritative name servers. For instance, a DNS Block List, or DNSBL, called "Dorkslayers," was quite popular for a while, but is now defunct. When Dorkslayers stopped answering queries, all queries came back negative. Until September 15th. When Site Finder was active, any attempt to query an address anywhere in the "dorkslayers.com" domain came back with VeriSign's address. That meant that a lookup of any IP address -- including the Web server that served you this page, or the machine you're reading it on -- indicated that it was a known spam source.

That's just the tip of the iceberg. Other side effects turn up. The entire Internet is, arguably, founded in part on the assumption that a DNS query for a nonexistent host will return empty-handed. In fact, some DNS software maintainers have already produced patches to treat VeriSign's search engine as an invalid result. These patches are unneeded, for the moment: VeriSign has turned off the questionable entries. However, the patches will probably stay around in case the issue resurfaces. VeriSign, on the other hand, indicates it may revive Site Finder. Executives at the company stated they were "...considering turning on Site Finder again, but disabling the 'wild card' service for e-mail deliveries to nonexistent domains, which could solve many of the e-mail problems that ICANN [Internet Corporation for Assigned Names and Numbers] described." This is curious, because Site Finder was never e-mail-specific to begin with.
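The idea behind patches that treat the search engine's address as an invalid result can be pictured roughly as follows. This Python sketch is my own illustration, not code from any actual patch; the address 192.0.2.1 is a documentation placeholder, not the real Site Finder address, and the function name is hypothetical.

```python
# Placeholder address from a documentation range; NOT the real Site Finder
# address. A real deployment would list whatever the wildcard returns.
WILDCARD_ADDRESSES = frozenset({"192.0.2.1"})


def filter_answer(addresses, wildcard_addresses=WILDCARD_ADDRESSES):
    """Drop known wildcard addresses from a DNS answer.

    Returns the remaining addresses, or None (meaning "no such host")
    when nothing but wildcard addresses came back.
    """
    real = [a for a in addresses if a not in wildcard_addresses]
    return real or None


print(filter_answer(["192.0.2.1"]))      # treated as nonexistent
print(filter_answer(["198.51.100.7"]))   # passed through unchanged
```

A filter like this has to be updated whenever the wildcard's address changes, which is one reason it is a workaround rather than a fix.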

Systems that use other name resolution methods often used DNS first, then fell through to try other methods if DNS failed. With Site Finder up, the correct information kept in the other databases -- such as NetInfo, NIS, or regular hardcoded hosts files -- was never found.
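The fall-through behavior is easy to model. The Python sketch below assumes each name source is a function that returns an address or None; the host name, the addresses, and the two DNS stand-ins are invented for illustration.

```python
def resolve(name, sources):
    """Try each source in order; return the first answer, or None."""
    for source in sources:
        answer = source(name)
        if answer is not None:
            return answer
    return None


# Illustrative local data, as might live in a hosts file.
HOSTS = {"buildbox.example.com": "10.0.0.5"}


def hosts_lookup(name):
    return HOSTS.get(name)


def dns_without_wildcard(name):
    # The name is not in public DNS, so the lookup fails.
    return None


def dns_with_wildcard(name):
    # The wildcard answers for every .com and .net name.
    return "192.0.2.1" if name.endswith((".com", ".net")) else None


print(resolve("buildbox.example.com", [dns_without_wildcard, hosts_lookup]))
print(resolve("buildbox.example.com", [dns_with_wildcard, hosts_lookup]))
```

In the second call the hosts data is never consulted, because DNS no longer fails.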

Other issues arose for users who don't primarily use English. If your Web browser can't find a page, the chances are the error message will be in your language of choice. The Site Finder page was in English only.

Finally, even privacy becomes a concern. The Site Finder page passed user information to a company called Omniture, which provides various data-mining services. This was not something made obvious to the user. Worse, the site's policies said that, by using the site, you agreed to its terms. This can be justifiable (if a bit hard to manage chronologically) for a site you go to on purpose, but for a site you are most likely to hit while trying to go somewhere else, it's a serious matter.

What can you do?

The Internet was designed to be secure against catastrophes, but its design did not anticipate the type of changes VeriSign has introduced. A few technical measures can be taken. New versions of the ISC's BIND software won't accept address records from top-level name servers by default; they will require that the root zones delegate only to other name servers, as designed.
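The delegation-only idea can be sketched as a filter on what a top-level server is allowed to say. This Python fragment is a deliberately simplified picture, not BIND's actual logic; the record representation and the function name are assumptions made for the example.

```python
def enforce_delegation_only(records):
    """Keep only referrals (NS records) from a top-level server's response.

    `records` is a list of (record_type, value) pairs. If the response
    contains no referral, return None, meaning "treat as nonexistent".
    """
    referrals = [(rtype, value) for rtype, value in records if rtype == "NS"]
    return referrals or None


# A normal referral passes through; a bare address record does not.
print(enforce_delegation_only([("NS", "ns.ibm.example")]))
print(enforce_delegation_only([("A", "192.0.2.1")]))
```

Unlike a filter keyed to one address, this check does not care which address the wildcard returns.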

In the long run, the technical problems with wildcards in top-level domains are probably insurmountable; more importantly, any usability benefit to users from such things is unclear. It's easy enough for a Web browser to implement such a feature, and the resulting feature can be enabled or disabled by users as they wish. Wildcard DNS records, by contrast, affect every aspect of network functionality, not just the browser; in doing so, they have a substantial negative impact on quality of service for all services. As much as I hope the case is now closed, VeriSign seems to indicate otherwise.

About the author

Peter Seebach has been having trouble navigating through badly designed pages since before frames and JavaScript technology existed. He continues to believe that, some day, pages will be designed to be usable, rather than designed to look impressive. You can reach him at crankyuser@seebs.plethora.net.