Scraping Automations

Automate SEO workflows

Scraping is easy to automate with Unbuild AI. Combine our pre-built SEO automations to create custom SEO tools.

Build your custom SEO tool in 5 minutes. 🚀

Scrape Google Search results for a list of keywords. The automation first creates Google Search queries from the keywords and settings, then fetches the Google Search results for those queries.
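The query-building step can be pictured roughly as follows. This is only a sketch, not Unbuild AI's actual implementation: the function name build_search_urls and its parameters are hypothetical, and it assumes the country and language settings map to Google's gl and hl URL parameters.

```python
from urllib.parse import urlencode


def build_search_urls(keywords, country="us", language="en"):
    """Turn keywords plus settings into Google Search URLs (hypothetical helper)."""
    urls = []
    for keyword in keywords:
        # Assumption: country maps to "gl" and language maps to "hl".
        params = {"q": keyword, "gl": country, "hl": language}
        urls.append("https://www.google.com/search?" + urlencode(params))
    return urls


queries = build_search_urls(["seo tools", "keyword research"])
```

Each keyword becomes one query URL, so the second step (fetching results) can simply iterate over the returned list.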

Get the Google Search autocomplete suggestions for a list of keywords. The autocomplete suggestions are based on the most popular searches related to the keyword. Allows setting the country, language, and other settings.

Find the sitemaps for a list of domains and extract the URLs from them. Automatically looks for the sitemaps in common locations like /sitemap.xml, /sitemap_index.xml, /robots.txt, etc. Also follows the sitemap index files to extract all URLs.
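As an illustration of the discovery logic described above, here is a minimal sketch that works on already-downloaded text, so it needs no network access. The function names and the COMMON_PATHS list are assumptions for this example, not the product's actual code. It relies on the standard sitemap XML namespace and on the Sitemap: directive that robots.txt files may contain.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
COMMON_PATHS = ["/sitemap.xml", "/sitemap_index.xml"]  # assumed shortlist


def candidate_sitemap_urls(domain):
    """Build the common sitemap locations to try for a domain."""
    base = f"https://{domain}"
    return [urljoin(base, path) for path in COMMON_PATHS]


def sitemaps_from_robots(robots_txt):
    """Collect the values of 'Sitemap:' lines in a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def parse_sitemap(xml_text):
    """Return (is_index, locs) for a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]
    is_index = root.tag == f"{SITEMAP_NS}sitemapindex"
    return is_index, locs
```

When parse_sitemap reports an index, each returned location is itself a sitemap to download and parse in turn; otherwise the locations are the page URLs.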

Scrape a list of URLs. Returns the HTTP status code, content, headers and other information of the URLs. Allows setting the timeout.

Get the Google Search related searches for a list of keywords. The related keywords are based on the "searches related to" SERP element for the keyword. Allows setting the recursion depth, country, language, and other settings.

Scrape links from a list of URLs. Returns the source URL, target URL, rel attribute, external/internal, target attribute, relative/absolute, and hashbang of the links. Allows setting the timeout.
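To make the returned fields concrete, the sketch below extracts the same kind of information from an HTML string using only Python's standard library. The class name LinkExtractor and the field names are assumptions chosen for this example; the real automation may define internal/external or relative/absolute differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkExtractor(HTMLParser):
    """Collect <a href> links from a page (hypothetical helper)."""

    def __init__(self, source_url):
        super().__init__()
        self.source_url = source_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        target_url = urljoin(self.source_url, href)  # resolve relative links
        parts = urlsplit(href)
        source_host = urlsplit(self.source_url).netloc
        self.links.append({
            "source_url": self.source_url,
            "target_url": target_url,
            "rel": attr.get("rel"),
            "target": attr.get("target"),
            # Assumption: "internal" means the same host as the source page.
            "is_internal": urlsplit(target_url).netloc == source_host,
            "is_relative": not parts.scheme and not parts.netloc,
            "has_hashbang": "#!" in href,
        })


extractor = LinkExtractor("https://example.com/page")
extractor.feed('<a href="/about" rel="nofollow">About</a>'
               '<a href="https://other.org/#!/x" target="_blank">Other</a>')
links = extractor.links
```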

Remove duplicate URLs from a list of URLs. Allows setting whether URLs count as duplicates only when they are identical, or also when they match after removing the protocol and www.
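The two duplicate-matching modes could look something like this. It is a sketch under assumptions: the names url_key, dedupe_urls, and ignore_protocol_and_www are invented for the example, and input URLs are assumed to include a scheme such as https://.

```python
from urllib.parse import urlsplit


def url_key(url, ignore_protocol_and_www=False):
    """Return the value used to decide whether two URLs are duplicates."""
    if not ignore_protocol_and_www:
        return url  # strict mode: only identical strings match
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    rest = parts.path
    if parts.query:
        rest += "?" + parts.query
    if parts.fragment:
        rest += "#" + parts.fragment
    return host + rest


def dedupe_urls(urls, ignore_protocol_and_www=False):
    """Keep the first occurrence of each URL, preserving input order."""
    seen = set()
    unique = []
    for url in urls:
        key = url_key(url, ignore_protocol_and_www)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


urls = [
    "https://www.example.com/a",
    "http://example.com/a",
    "https://www.example.com/a",
    "https://example.com/b",
]
strict = dedupe_urls(urls)
loose = dedupe_urls(urls, ignore_protocol_and_www=True)
```

In strict mode only the exact repeat is dropped; in the looser mode the http:// variant without www is also treated as a duplicate of the first URL.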

Parse the domains from a list of URLs. Allows setting if the subdomains should be included in the domain.
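A simplified sketch of the subdomain setting follows. The function name parse_domain and its include_subdomains flag are hypothetical. Note the deliberate shortcut: keeping only the last two labels is wrong for suffixes such as co.uk, so a production version would more likely consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def parse_domain(url, include_subdomains=True):
    """Extract the host from a URL, optionally dropping subdomains."""
    # .hostname lowercases the host and strips any port or credentials.
    host = urlsplit(url).hostname or ""
    if include_subdomains:
        return host
    # Naive assumption: the registrable domain is the last two labels.
    labels = host.split(".")
    return ".".join(labels[-2:])


with_subdomains = parse_domain("https://Blog.Example.com:8080/post")
without_subdomains = parse_domain("https://Blog.Example.com:8080/post",
                                  include_subdomains=False)
```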

Remove duplicate domains from a list of domains. Allows setting whether domains count as duplicates only when they are identical, or also when they match after removing the subdomain.

Scrape a list of URLs with a browser. Useful when JavaScript needs to be executed or to circumvent anti-scraping measures. Returns the HTTP status code, rendered browser content, headers and other information of the URLs. Allows setting the timeout.