Googlebot is evil

… but only when websites are carelessly programmed. For example, when a CMS does only a sloppy job of checking user permissions.

A user copied and pasted some content from one page to another, including an "edit" hyperlink to edit the content on the page. Normally, this wouldn't be an issue, since an outside user would need to enter a name and password. But the CMS authentication subsystem didn't take into account the sophisticated hacking techniques of Google's spider. Whoops.

As it turns out, Google's spider doesn't use cookies, which means it easily bypasses a check that only rejects visitors whose "isLoggedOn" cookie is set to "false". It also doesn't pay attention to JavaScript, which would normally prompt and redirect users who are not logged on. It does, however, follow every hyperlink on every page it finds, including those with "Delete Page" in the title. Whoops.
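The original post doesn't show the CMS code, but the flaw is easy to reconstruct. Here is a minimal sketch, assuming an Express-style Node backend; the route paths, the sessionId cookie and deletePage are illustrative stand-ins, not taken from the actual CMS:

```typescript
import express, { Request } from "express";

const app = express();

// Hypothetical stand-ins for the CMS internals described above.
const sessions = new Map<string, { user: string }>();
function lookupSession(req: Request) {
  const match = /sessionId=([^;]+)/.exec(req.headers.cookie ?? "");
  return match ? sessions.get(match[1]) : undefined;
}

// Broken pattern from the story: a crawler sends no cookies at all, so a
// check that only rejects an explicit isLoggedOn=false never fires;
// Googlebot sails straight past it and "clicks" the Delete Page link.
app.get("/delete-page/:id", (req, res) => {
  if ((req.headers.cookie ?? "").includes("isLoggedOn=false")) {
    res.redirect("/login"); // only stops logged-out browsers
    return;
  }
  // deletePage(req.params.id);  // would run for Googlebot, too. Whoops.
  res.send("deleted");
});

// Safer pattern: verify a real server-side session, and never let a plain
// GET link mutate state; well-behaved crawlers follow GETs, not POSTs.
app.post("/admin/delete-page/:id", (req, res) => {
  if (!lookupSession(req)) {
    res.status(401).send("not logged in");
    return;
  }
  // deletePage(req.params.id);
  res.status(204).end();
});

app.listen(3000);
```

The second handler fixes both halves of the bug: authorization is checked on the server instead of in a cookie or JavaScript, and the destructive action requires a POST, which no crawler will ever issue from a plain hyperlink.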

A not entirely serious suggestion on my part: why not simply lock Googlebot out via robots.txt? :mrgreen:
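For what it's worth, the joke would even work: two lines in robots.txt keep a well-behaved Googlebot off the entire site, though of course they do nothing about the actual bug or about crawlers that ignore the file:

```
User-agent: Googlebot
Disallow: /
```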

(via Google Blogoscoped and Digg)
