If you run a blog or Wiki you will be only too aware of the Google PageRank™ system. In case you have been on a rather extended holiday and/or in a long coma, it’s a system whereby your site climbs the search engine results page if lots of other people are linking to you. It’s not quite that simple, but that’s the gist. In competition with each other to promote sales of Viagra, or to get people hooked on gambling, various crooked characters deface public sites with gay abandon. They leave a trail of links pointing at their own sites, often with Chinese titles, all to boost their own PageRank. All to climb Google.
These comment spammers are not nice people.
They will happily destroy the content of a Wiki and overwrite every page. If they don’t get every one, then it’s usually because their script is too stupid to keep track of the pages it has already written over, and so cannot reach the newly orphaned pages. These scripts hammer the site while they operate. Not only that, but the frequency of attacks has now reached epidemic proportions. I get about three separate attacks a day on my blog and about five major attacks a day on this PageRank 7 wiki. Faced with the brutality and increasing frequency of these incursions, ISPs can take down servers, believing that they are under a denial of service attack. Even if they understand the phenomenon, such attacks cause too much server load for the value of having the small blog customer. ISPs are starting to ban the use of tools like WordPress and Movable Type on their end user accounts.
OK, it’s not just Google to blame here, but all of the search engines. It’s just that Google’s system is the best known and this has historically made it the main spam target. In a tacit acknowledgement of this, Google have decided to help the bloggers.
Er…sort of.
Their solution is to allow you to take away the PageRank value of selected links. If you are maintaining Wiki/blog software then comment field links should have a “rel” attribute (uh?) set to “nofollow”. That way the spammers will lose the incentive to spam you, because they will get no benefit from the links they leave. “Drat” they say, as they abandon their get-rich-quick scheme and go off to earn an honest wage.
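For anyone who has not seen the attribute in the flesh, here is a minimal sketch of what the blog or Wiki software is being asked to do to every comment before displaying it. The function name `add_nofollow` and the regex approach are my own illustration, not taken from any particular package, and a real application would more likely hook into whatever HTML sanitiser it already has.

```python
import re

# Matches an opening <a ...> tag and captures its attribute text.
_ANCHOR = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches an existing quoted rel="..." attribute inside that text.
_REL = re.compile(r"""\brel\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)


def add_nofollow(comment_html: str) -> str:
    """Return comment_html with rel="nofollow" on every <a> tag.

    Hypothetical helper: a rough sketch that assumes reasonably
    well-formed markup, not a full HTML parser.
    """

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        rel = _REL.search(attrs)
        if rel is None:
            # No rel attribute yet, so append one.
            return f'<a{attrs} rel="nofollow">'
        values = rel.group(2).split()
        if "nofollow" in (v.lower() for v in values):
            # Already marked, leave the tag untouched.
            return match.group(0)
        values.append("nofollow")
        new_rel = 'rel="' + " ".join(values) + '"'
        return "<a" + attrs[: rel.start()] + new_rel + attrs[rel.end():] + ">"

    return _ANCHOR.sub(fix, comment_html)


if __name__ == "__main__":
    print(add_nofollow('Nice post! <a href="http://pills.example/">cheap pills</a>'))
```

Note that the link still works for a human visitor. The only difference is that a search engine honouring the attribute is asked not to count it.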
The plan is so idiotic it’s almost surreal. It obviously in no way penalises the spammers, who are playing a percentage game anyway. So what if a few spams are ploughed into stony ground? It does make the engine spider’s life a little easier of course, because it can spend less time indexing blogs. Lucky old engines, poor old webmasters who are expected to upgrade all of their software. Software that has been heavily customised and, given that few of these applications are design masterpieces, heavily hacked. I certainly won’t be upgrading when there is zero benefit. Even if I do, the new attribute has to survive RSS feeds and some old and not so smart news aggregators.
Really I won’t have time anyway because I am too busy fighting spam.
What’s even more surreal though, is that the software authors are jumping on board and working on adding this as a feature. There is even talk of making it part of the HTML standard. This attribute is about as useful as the blink tag.
Suppose the engines had tackled it differently. Suppose that when your site was spammed, you could dispatch the content of the spam straight to Google, Yahoo, etc. They could then ban all of the links promoted with the dubious posting. A sort of “SpamBack”. This changes the market forces significantly from the peddlers’ point of view. Far from ploughing on less fertile ground, they are now ploughing a minefield. Rather than one hundred percent of everybody having to manually instruct the GoogleBot, all it would take would be a small percentage of spam-aware applications to fight back. The spammers could not risk dumb spamming for fear of tripping these alarms.
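To make the “SpamBack” idea a little more concrete, here is a rough sketch of the blog-side half: pulling the promoted links out of a spam comment and packaging them up as a report. Everything here is hypothetical. No search engine offers such a reporting service as far as I know, so the function `build_spam_report` and the field names in the report are invented purely for illustration, and the actual sending of the report is deliberately left out.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlparse


class _LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def build_spam_report(spam_html: str, reporting_site: str) -> dict:
    """Build a hypothetical "SpamBack" report for one spam comment.

    The report format is an assumption made up for this sketch.
    """
    collector = _LinkCollector()
    collector.feed(spam_html)
    collector.close()
    # Reduce the links to the distinct host names being promoted.
    hosts = {urlparse(link).hostname for link in collector.links}
    domains = sorted(host for host in hosts if host)
    return {
        "reporting_site": reporting_site,
        "promoted_links": collector.links,
        "promoted_domains": domains,
    }


if __name__ == "__main__":
    spam = (
        '<a href="http://pills.example/buy">cheap</a> '
        '<a href="http://casino.example/">win big</a>'
    )
    print(json.dumps(build_spam_report(spam, "myblog.example"), indent=2))
```

The point of the sketch is that the work on the webmaster's side is small: the application already knows the comment is spam, so handing over the links it was promoting costs next to nothing.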
I bet there are other simple solutions as well.
So how did Google get it wrong? There are smart people in Google, so did they not allocate enough time to this? Perhaps they lost touch? Can you see blogger peons working in a lowly office from the hallowed windows of a “plex”? Perhaps the Google blog could explain, as it’s hardly a public relations coup. Whole sites have sprung up against “nofollow”.