Wednesday, April 01, 2009

clues that a URL is probably going to be a dead link..

.. without even having to click on it. In no particular order.
  1. It's at an .edu domain - Students usually lose their school-hosted web pages when they graduate, you know. Plus, any such link is probably at least five years old. Current college students don't bother with these kinds of pages; they just use social networking sites for everything. And what if it's not a student page but an official university page? Still probably dead, because university webmasters move shit around with no regard for concepts like "cool URLs don't change".
  2. It's at an ISP's domain. AOL, Comcast, Cox, RR, etc. This means that if the person has ever changed service providers, which is likely, the link will be dead because their old account will be canceled.
  3. It contains a tilde - Firstly, any such URL is probably also a few years old, since not nearly as many sites create URLs like this as used to. Then, there's the very nature of the thing. A tilde generally indicates the web space that comes with a Unix shell account, and that the person who authored the web page is not the owner of the domain, but just a user of a multi-user system. Most such sites are either educational or ISP-related (see the first two bullets in this list). And even if they're not, it probably indicates something else ephemeral, such as an employee's homepage on an employer's website.
  4. Geocities. Actually, this is just as likely to turn up a page that still exists but has not been updated since about 2001. (I myself am guilty here: I have never taken down my Geocities page.)
  5. It's from a domain that is easily recognizable as a shitty small town newspaper. These still do not get such concepts as "permalink".
  6. It contains obvious implementation details, such as ending with "/cgi-bin/". The less the people who built the site care about hiding this kind of sausage-making, the less likely they are to care about more advanced concepts like permalinks. That URL will almost certainly break if they switch from .NET to LAMP or vice versa. There is probably also some hierarchy of which language-or-platform-indicating extensions (.jsp, .pl, .php, .cfm, .asp, etc) are more likely to be dead than others.
  7. It's a god damn IP address. This really means you shouldn't click on it at all, because it's probably phishing or worse. But if you do, it'll probably be dead anyway.
  8. It links to the main splash page of a well-known site. You have no idea how many times I've clicked some jackass's two-year-old link to an awesome comic, that leads me straight to or
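
For the curious, most of these clues can be checked mechanically. Here's a rough sketch in Python, and it's only a sketch: the function name `dead_link_clues`, the `ISP_DOMAINS` list, and the example URLs are all made up by me for illustration, and the ISP hostnames are just my guesses at what those providers use. Clue 5 (the small town newspaper) is left out because there's no way to spot one without a human, and the last check can only tell that a URL points at a site's front page, not whether the site is well-known.

```python
import re
from urllib.parse import urlsplit

# Assumption: a short, hypothetical list of ISP hostnames standing in for
# "AOL, Comcast, Cox, RR, etc." A real list would be much longer.
ISP_DOMAINS = ("aol.com", "comcast.net", "cox.net", "rr.com")

# The language-or-platform-indicating extensions named in clue 6.
PLATFORM_EXTENSIONS = (".jsp", ".pl", ".php", ".cfm", ".asp")

# A host made of four dot-separated groups of digits (IPv4 only).
IPV4_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def dead_link_clues(url):
    """Return the names of the clues that a URL matches.

    Expects a full URL including the scheme ("http://..."); without one,
    urlsplit does not recognize the host and only path checks can fire.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path
    lower_path = path.lower()
    clues = []

    if host.endswith(".edu"):
        clues.append("edu domain")
    if any(host == d or host.endswith("." + d) for d in ISP_DOMAINS):
        clues.append("ISP domain")
    if "~" in path:
        clues.append("tilde")
    if host == "geocities.com" or host.endswith(".geocities.com"):
        clues.append("Geocities")
    if "/cgi-bin/" in lower_path or lower_path.endswith(PLATFORM_EXTENSIONS):
        clues.append("implementation details")
    if IPV4_HOST.match(host):
        clues.append("IP address")
    if path in ("", "/") and not parts.query:
        clues.append("front page")

    return clues


if __name__ == "__main__":
    # Hypothetical URLs, not real pages.
    for example in (
        "http://www.example.edu/~jdoe/index.html",
        "http://192.0.2.10/cgi-bin/",
        "http://www.example.com/",
    ):
        print(example, dead_link_clues(example))
```

This only lists which clues match. It makes no attempt to weigh them against each other, since I have no data on which ones are actually the worst offenders.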