Why we crawl
To help people discover and do what they love, we’re creating a database of billions of Pins on Pinterest. To protect our users and provide the highest-quality content we can, we use web crawlers to identify the data behind those Pins.
The pages that Pins link to contain rich signals that help us make better recommendations, fight spam, and display useful information. This creates a rich, relevant, and safe experience for Pinners and Partners. To take full advantage of these signals, we regularly fetch, store, and process the page content associated with Pins.
How Pinterest accesses your site
When the Pinterest crawler visits your website, it sends a valid Pinterest User-Agent and connects from a network operated by Pinterest. In addition to respecting the Robots Exclusion Standard, the crawler automatically rate limits concurrent requests to your website to reduce the additional load. The crawler identifies itself with one of the following User-Agent strings:
- Pinterest/0.2 (+https://www.pinterest.com/bot.html)
- Mozilla/5.0 (compatible; Pinterestbot/1.0; +https://www.pinterest.com/bot.html)
- Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Pinterestbot/1.0; +https://www.pinterest.com/bot.html)
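As a first-pass filter, a log-processing script could check whether a request's User-Agent header carries one of the product tokens listed above. The sketch below is a minimal illustration in Python; the function name `claims_pinterest_crawler` and the regular expression are assumptions made for this example, not something Pinterest publishes. Because any client can send these strings, a match only shows what the client claims to be and does not replace the DNS verification.

```python
import re

# Matches the product tokens that appear in the User-Agent strings listed
# above: "Pinterest/0.2" and "Pinterestbot/1.0". The pattern is an
# assumption for illustration, not an official specification.
PINTEREST_UA = re.compile(r"\bPinterest(?:bot)?/\d+(?:\.\d+)*", re.IGNORECASE)


def claims_pinterest_crawler(user_agent):
    """Return True if the User-Agent header claims to be the Pinterest crawler."""
    return bool(PINTEREST_UA.search(user_agent or ""))
```

Requests that pass this check are candidates for the reverse and forward DNS check; requests that fail it can be ignored for crawler-specific handling.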
How to verify the Pinterest Crawler
A genuine Pinterest crawler connects from a network operated by Pinterest. We recommend that webmasters avoid hard-coding the crawler's IP addresses in their site configuration, because the addresses the crawler uses may change in the future without notice.
You can perform the following steps to verify the Pinterest crawler:
1. Using the host command, run a reverse DNS lookup on the IP address from your logs.
2. Verify that the domain name in the response ends with pinterest.com.
3. Again using the host command, run a forward DNS lookup on the domain name returned in step 1.
4. Verify that the IP address returned is the same as the IP address from step 1.
> host 54.236.1.11
11.1.236.54.in-addr.arpa domain name pointer crawl-54-236-1-11.pinterest.com.
> host crawl-54-236-1-11.pinterest.com
crawl-54-236-1-11.pinterest.com has address 54.236.1.11
If you receive a consistent volume of traffic from a client sending a valid Pinterest User-Agent but it does not pass the above DNS test, please open a support ticket.
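The four verification steps can also be scripted. The following is a minimal sketch in Python using only the standard `socket` module; the function names (`reverse_lookup`, `forward_lookup`, `is_pinterest_crawler`) are hypothetical, and the check handles IPv4 addresses only. It is slightly stricter than step 2 as written: it requires the hostname to end with `.pinterest.com` (with the leading dot), so that an unrelated domain which merely ends in the same letters does not pass.

```python
import socket


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None


def forward_lookup(hostname):
    """Return the set of IPv4 addresses that hostname resolves to."""
    try:
        return set(socket.gethostbyname_ex(hostname)[2])
    except OSError:
        return set()


def is_pinterest_crawler(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS check for an IP address taken from logs."""
    hostname = reverse(ip)                        # step 1: reverse lookup
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(".pinterest.com"):   # step 2: check the domain
        return False
    return ip in forward(hostname)                # steps 3 and 4: forward lookup, compare
```

Calling `is_pinterest_crawler(ip_from_logs)` performs live DNS queries. The `reverse` and `forward` parameters exist so that the logic can be exercised with stub resolvers, and so that results can be cached by a wrapper if the check runs on many log lines.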
How to restrict Pinterest from accessing your site
In order to modify the behavior of the Pinterest crawler, you will need to update your site’s robots.txt file. The Pinterest crawler obeys the following directives:
Reducing Crawl Rate
If you would like to increase the number of seconds to wait between subsequent visits to your site, you can use the Crawl-Delay directive.
Delay subsequent visits to 10 seconds apart
Crawl-delay: 10
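To check how a parser reads this directive before publishing a robots.txt file, the rule can be loaded into Python's standard `urllib.robotparser`. This is only a sketch: the `User-agent: Pinterest` group line is an assumption (robots.txt directives are written inside a User-agent group, and the token here is taken from the User-Agent strings listed earlier), and Python's parser is not necessarily identical to the one the Pinterest crawler uses.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; the "Pinterest" group name is an illustration.
ROBOTS_TXT = """\
User-agent: Pinterest
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Number of seconds this parser would wait between visits for that agent.
delay = parser.crawl_delay("Pinterest")
print(delay)
```

If the directive is misspelled, for example written without the hyphen, `crawl_delay` returns `None` because the line is not recognized.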
Block one file
Block one directory
Block all access
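The three blocking cases named above are normally written with the standard `Disallow` directive of the Robots Exclusion Standard. The sketch below shows one plausible form of each rule and checks it with Python's `urllib.robotparser`. The paths (`/private/page.html`, `/private/`), the `example.com` URLs, the `User-agent: Pinterest` group name, and the helper `allowed` are all hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Block one file (hypothetical path).
BLOCK_ONE_FILE = """\
User-agent: Pinterest
Disallow: /private/page.html
"""

# Block one directory (hypothetical path).
BLOCK_ONE_DIRECTORY = """\
User-agent: Pinterest
Disallow: /private/
"""

# Block all access.
BLOCK_ALL = """\
User-agent: Pinterest
Disallow: /
"""


def allowed(robots_txt, url, agent="Pinterest"):
    """Return True if agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(allowed(BLOCK_ONE_FILE, "https://example.com/private/page.html"))
print(allowed(BLOCK_ONE_DIRECTORY, "https://example.com/public/index.html"))
print(allowed(BLOCK_ALL, "https://example.com/"))
```

Each rule matches by path prefix, so the directory rule covers every URL under that directory while leaving the rest of the site crawlable.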