# magpie-crawler

magpie-crawler is the web crawler of Brandwatch, a social media monitoring and digital consumer intelligence company. It scans public web pages, blogs, and forums for mentions of the keywords, brands, and products that Brandwatch's clients track. It is a well-behaved crawler that respects `robots.txt` directives.

## What is magpie-crawler?

magpie-crawler is a web crawler operated by Brandwatch. It works as a conventional web scraper, systematically collecting public content from blogs, forums, news sites, and social media platforms. The crawler identifies itself in server logs with a user-agent string like `magpie-crawler/1.1`. It follows standard web protocols, including `robots.txt`, which makes it a legitimate and transparent crawler.

## Why is magpie-crawler crawling my site?

magpie-crawler visits your website to collect public information that may be relevant to Brandwatch's clients. Your site likely contains content about brands, products, or industry keywords that are being monitored. The crawler is particularly interested in content that expresses opinions or reviews. How often it visits depends on how relevant your content is to what Brandwatch's clients are monitoring. Because it only accesses publicly available content and honors `robots.txt`, this crawling is generally considered authorized.

## What is the purpose of magpie-crawler?

magpie-crawler supports Brandwatch's social listening and digital consumer intelligence platform.
The data it collects helps organizations monitor online mentions of their brands, analyze market trends, and gain insight into consumer behavior. Website owners get no direct benefit from being crawled, since the service exists for Brandwatch's clients. However, the crawler is designed to be respectful of server resources and should not cause performance issues.

## How do I block magpie-crawler?

To keep magpie-crawler off your website, add a disallow rule for it to your `robots.txt` file. This is the standard method for managing access by legitimate web crawlers. The following lines block magpie-crawler from the entire site:

```
User-agent: magpie-crawler
Disallow: /
```

## Canonical

A human-friendly reader version of this article is available at [magpie-crawler](https://plainsignal.com/agents/magpie-crawler).

## Copyright

(c) 2025 [PlainSignal](https://plainsignal.com/ "Privacy-focused, simple website analytics")