Scrape data from websites into Google Sheets

Want to grab data from the internet? Google Sheets has a built-in function called ImportXML which can be used to scrape publicly available structured data from websites. ImportXML imports data from any of various structured data types including XML, HTML, CSV, TSV, and RSS and ATOM XML feeds.

IMPORTXML(url, xpath_query)

url - The URL of the page to examine, including protocol (e.g. http://). The value for url must either be enclosed in quotation marks or be a reference to a cell containing the appropriate text.

xpath_query - The XPath query to run on the structured data.

XPath is a major element in the XSLT standard. XPath can be used to navigate through elements and attributes in an XML document. The xpath_query argument is the XPath query to run on the data of the given URL. XPath is a query language used to retrieve pieces of information from websites.

Google Chrome includes the ability to copy Xpath as part of its DevTools feature. To access it, inspect the element you’re interested in by right-clicking it and selecting Inspect. In the elements panel, right-click the element you’ve highlighted and Copy > Copy XPath.

IMPORTXML("https://en.wikipedia.org/wiki/Moon_landing", "//a/@href") IMPORTXML(A2,B2)

Originally published at https://ronaldsvilcins.com on June 3, 2022.

--

--

I drink a lot of tea and build beautiful websites. If you are looking to work together or just start a conversation 👉 https://www.ronaldsvilcins.com/

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Ronalds Vilcins

I drink a lot of tea and build beautiful websites. If you are looking to work together or just start a conversation 👉 https://www.ronaldsvilcins.com/