# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
# Mainly to reduce server load from bots, we block pages which are
# actions or searches. We also block /feed/, as RSS readers (rightly,
# I think) don't seem to check robots.txt.
# Note: Can delay Bing's crawler with:
# Crawl-delay: 1
# http://www.bing.com/community/blogs/webmaster/archive/2009/08/10/crawl-delay-and-the-bing-crawler-msnbot.aspx
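# For example (illustrative only, not enabled here), that directive
# would go in its own group for Bing's crawler:
#   User-agent: msnbot
#   Crawl-delay: 1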
# This file uses the non-standard extension characters * and $, which
# are supported by Google and Yahoo!, and the 'Allow' directive, which
# is supported by Google.
# http://code.google.com/web/controlcrawlindex/docs/robots_txt.html
# http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html
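# For example, in the rules below '*/search/*' matches any URL
# containing a /search/ path segment, '*/body/*/view_email$' only
# matches URLs that end in /view_email, and the 'Allow' line makes an
# exception for attachment pages within otherwise-disallowed responses.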
User-agent: *
Disallow: */annotate/*
Disallow: */new/*
Disallow: */search/*
Disallow: */similar/*
Disallow: */track/*
Disallow: */upload/*
Disallow: */user/contact/*
Disallow: */feed/*
Disallow: */profile/*
Disallow: */signin*
Allow: */request/*/response/*/attach/*
Disallow: */request/*/response/*
Disallow: */body/*/view_email$
Disallow: */change_request/*
Disallow: */outgoing_messages/*/mail_server_logs*
# The following was added in Jan 2012 to stop robots crawling pages
# generated in error (see
# https://github.com/mysociety/alaveteli/issues/311). It can be removed
# later in 2012 when the error pages have been dropped from the index.
Disallow: *.json.j*