What You Should Know About Trackback Spam

Trackback facilitates communication between blogs. When a blogger writes a new entry that comments on or refers to an entry at another blog, the commenting blogger can notify the other blog with a Trackback ping. The receiving blog then displays summaries of, and links to, all the commenting entries below the original entry. Trackback spam occurs when someone sends Trackback pings to a site only to plant links that direct viewers to a totally unrelated URL.

Trackback Explained

Trackback was first released both as an open protocol specification and as a feature of Movable Type 2.2, which contained the first implementation. It was always planned as an open system that could easily be implemented in other blogging tools, because the real value of Trackback can only be realized when many sites support it.

Basically, Trackback is designed to provide a method of notification between websites. It is a way for one person to say to another, “This is something you may be interested in,” by sending a Trackback ping.

Trackback is a form of remote commenting. A person who wishes to comment on a post in another person’s blog writes a post on his or her own weblog rather than posting the comment directly on the other person’s weblog, then sends a Trackback ping to notify the other blog. Of course, this is only possible when both blogging tools support the Trackback protocol.

Trackback is likewise a form of content aggregation. When a person writes a post on a topic that a group of people is interested in, he or she sends a Trackback ping to a central server, where visitors can read all posts about the topic. Anyone interested in a specific topic can check that site to stay updated on what bloggers have to say about it.

Blogging software that supports the Trackback protocol displays a “Trackback URL” along with every entry. The commenting blogger’s software sends information about the new entry (its URL and, optionally, its title, an excerpt, and the blog name) to this URL in an HTTP POST request, and the receiving blog replies with a short XML message indicating success or failure. Some blogging tools can discover the Trackback URL automatically, while others require the commenting blogger to enter it manually.
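
To make the exchange concrete, here is a minimal sketch in Python of what a commenting blog’s software might do. It assumes the behavior of the published Trackback specification: the ping is an HTTP POST with form-encoded fields (url, title, excerpt, blog_name), and the reply is a small XML document whose error element is 0 on success. The function names and the example addresses are invented for illustration and do not come from any particular blogging tool.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_ping(trackback_url, entry_url, title="", excerpt="", blog_name=""):
    """Build (but do not send) the HTTP POST request for a Trackback ping."""
    fields = {"url": entry_url, "title": title,
              "excerpt": excerpt, "blog_name": blog_name}
    # Only "url" is required; empty optional fields are left out.
    body = urlencode({k: v for k, v in fields.items() if v}).encode("utf-8")
    return Request(
        trackback_url,
        data=body,
        headers={"Content-Type":
                 "application/x-www-form-urlencoded; charset=utf-8"},
        method="POST",
    )


def parse_response(xml_data):
    """Return (ok, message) from the receiving blog's XML reply."""
    root = ET.fromstring(xml_data)
    error = (root.findtext("error") or "").strip()
    message = (root.findtext("message") or "").strip()
    return error == "0", message


def send_ping(trackback_url, entry_url, **fields):
    """Send the ping over the network and interpret the reply."""
    request = build_ping(trackback_url, entry_url, **fields)
    with urlopen(request, timeout=10) as resp:
        return parse_response(resp.read())
```

A call such as send_ping("http://blog.example/trackback/42", "http://example.org/post", title="My Post") would notify the hypothetical receiving blog and report whether the ping was accepted.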

The protocol of Trackback is based on the principle of initiating the connection when sharing of information is desired rather than waiting for this same information to be discovered by other websites. Sites can communicate about related resources and are able to accomplish the automatic listing of all sites that have referenced a particular post. The ping also provides a firm, explicit link between entries as opposed to an implicit link that depends upon outside action.

Trackback is particularly useful for finding out whether other people think enough of what a person has written on a weblog to actually link to it. However, allowing Trackback links requires more site maintenance, since links that are no longer valid have to be removed. The ability to have one’s link listed on someone else’s blog simply by sending a ping can also be abused by spammers.

Trackback Spam

A flood of Trackback pings initiated by spammers can put a strain on server resources. In Movable Type, the amount of Trackback spam that a site is getting can be seen by clicking “Trackbacks” on the main blog menu and selecting “Junk Trackbacks”. Repeated pinging of one’s server by spammers, sometimes hundreds of times an hour, can cause server CPU overloads and crashes, and can result in the web host shutting down the affected account.

There are some defensive measures that can be taken. One is to moderate all Trackbacks. Movable Type 3.2 allows every Trackback to be approved before it is posted to a site. Closely related to this is limiting unnecessary Trackback usage. Trackbacks are pointless when no one tracks a site back. Not everything needs to be “Trackbackable”, so prudence in deciding which entries accept Trackbacks is required. This move is all about giving spammers fewer opportunities to play at one’s expense.

Another option is the powerful anti-spam Movable Type plugin called Spam Lookup. A flood of Trackback spam can be stopped by identifying the common unwanted words or specific strings and blocking them. Spam Lookup uses Perl regular expressions, so adding a few characters to a keyword gives more flexibility in what is blocked.
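
The idea of keyword blocking with regular expressions can be sketched as follows. This is an illustration in Python, not Spam Lookup’s own code; Python’s regular expression syntax is close to Perl’s but not identical, and the patterns and function names below are hypothetical examples rather than a recommended list.

```python
import re

# Hypothetical block list; real lists are longer and site-specific.
BLOCK_PATTERNS = [
    r"\bcasino\b",               # whole word only
    r"\bpoker\b",
    r"cheap\s+(?:pills|meds)",   # a few extra characters cover several phrases
    r"v[i1]agra",                # character class catches a simple obfuscation
]

BLOCK_RE = [re.compile(p, re.IGNORECASE) for p in BLOCK_PATTERNS]


def matched_patterns(text):
    """Return the patterns from the block list that occur in the text."""
    return [rx.pattern for rx in BLOCK_RE if rx.search(text)]


def is_junk(ping):
    """Check every text field of a ping (a dict) against the block list."""
    fields = ("title", "excerpt", "url", "blog_name")
    text = " ".join(ping.get(name, "") for name in fields)
    return bool(matched_patterns(text))
```

Word boundaries keep the filter from firing on harmless words that merely contain a blocked keyword, which matters when the block list grows.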

Spam Lookup can be configured at the blog or installation level. Configuring at the installation level is suitable for those who have just one blog or want any setting to apply across all the blogs on one’s installation of Movable Type. When settings are intended only to apply to one blog, one can configure Spam Lookup using the Plugins Tab of the Settings Item on the weblog menu.

The plugin has three lookup options in its anti-spam arsenal. First, it looks up the source IP address of the comment or Trackback and compares it with several centralized blacklist servers. There is an option to force moderation of the comment and adjust its junk status when the IP address is found on a blacklist server. Second, it looks up the domain names of the posted links. Third, the plugin can compare the IP address of the Trackback’s source URL with the IP address the ping was sent from. The blog software sending a legitimate ping is usually on the same server as the blog itself, whereas most spam is sent from zombie machines rather than from the website being advertised, so this sort of spam can be detected.
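
The IP checks described above can be illustrated with a short Python sketch. It is not the plugin’s code. The first function compares the sender’s address with the addresses of the source URL’s host; the second builds the name that DNS-based blacklists conventionally expect, with the IPv4 octets reversed in front of the list’s zone. The function names and the zone dnsbl.example are placeholders.

```python
import socket
from urllib.parse import urlsplit


def resolve_ips(hostname):
    """Return the set of IP addresses that the hostname resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(hostname, None)}


def sender_matches_source(source_url, sender_ip, resolver=resolve_ips):
    """True if the ping came from an address of the source URL's host."""
    host = urlsplit(source_url).hostname
    if not host:
        return False
    try:
        return sender_ip in resolver(host)
    except OSError:  # DNS failure: the source cannot be verified
        return False


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name to look up for an IPv4 address in a blacklist zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone
```

A mismatch is only a hint, since some legitimate blogs send pings from a different machine than the one serving their pages, which is why forcing moderation is a safer reaction than outright rejection.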

Link settings are also examined. A comment that has no links is unlikely to be spam, since blog spam generally aims to link to a dodgy site to improve its rankings in search engines. Any comment or Trackback that has more than a set number of links is forced into moderation.
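
A link-count rule of this kind might look like the sketch below. The threshold of two links, the labels, and the simple URL pattern are assumptions chosen for illustration; the plugin’s actual settings are configurable and are not reproduced here.

```python
import re

# Rough URL matcher; good enough for counting, not for validation.
LINK_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def count_links(text):
    """Count the http(s) URLs that appear in the text."""
    return len(LINK_RE.findall(text))


def classify_by_links(text, max_links=2):
    """Classify feedback by how many links it carries.

    max_links is a hypothetical threshold; tune it per site.
    """
    n = count_links(text)
    if n == 0:
        return "likely_ok"   # no links: unlikely to be spam
    if n > max_links:
        return "moderate"    # too many links: hold for review
    return "neutral"         # a few links: leave to the other tests
```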

The keyword filter setting acts upon keywords in comments and replicates some of the functionality of MT-Blacklist. This is an incredibly powerful feature, except that by default the plugin has hardly any keywords in it. The WordPress Wiki is a good place to find a list that can be pasted in.

The Trackback validator plugin for WordPress performs a simple but very effective test on all Trackbacks in order to stop spam. The plugin retrieves the web page located at the URL included in the Trackback when one is received. The Trackback is approved when the page contains a link to one’s weblog. If the page does not link, the Trackback is flagged as spam and rejected. Since Trackback spammers do not set up custom web pages linking to the weblogs they attack, this test would quickly reveal illegitimate Trackbacks.
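
The test performed by such a validator can be sketched in Python as follows. This is an illustration of the idea only, not the WordPress plugin’s code; the function names, the host myblog.example, and the choice to treat an unreachable page as spam are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit
from urllib.request import urlopen


class _LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def page_links_to(html, my_blog_host):
    """True if any link on the page points at my_blog_host (lowercase)."""
    collector = _LinkCollector()
    collector.feed(html)
    for href in collector.hrefs:
        try:
            host = urlsplit(href).hostname or ""
        except ValueError:  # malformed link: ignore it
            continue
        if host == my_blog_host or host.endswith("." + my_blog_host):
            return True
    return False


def fetch(url):
    """Download a page and return it as text."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def validate_trackback(source_url, my_blog_host, fetcher=fetch):
    """Approve the Trackback only if its source page links back."""
    try:
        html = fetcher(source_url)
    except (OSError, ValueError):  # unreachable or invalid source URL
        return "spam"
    return "approved" if page_links_to(html, my_blog_host) else "spam"
```

Comparing host names instead of searching the raw page text avoids approving a page that merely mentions the blog’s address without linking to it.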