RE/pinboard

Vancouver Data Blog by Neil McGuigan: Web scraping with Google Spreadsheets and XPath

In this first video, I show how to grab parts of a web page (scraping) using Google Docs Spreadsheets and XPath. Google Spreadsheets has a handy function called IMPORTXML, which reads in a web page from a URL. You can then apply an XPath expression to that page to pull out the parts you want, such as one particular value or all of the hyperlinks. This is a convenient method because the data ends up in a spreadsheet, which is easy to download as an Excel file.
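As a rough illustration of the idea (not taken from the video), a spreadsheet cell containing something like =IMPORTXML("https://example.com", "//a/@href") would list every link on a page; the URL and the XPath here are placeholders. The same two selections, all hyperlinks and one particular value, can be sketched in Python with the standard library. The sample page, the function names, and the paths below are all assumptions made up for this example, and ElementTree only supports a limited subset of XPath and needs well-formed markup, so treat it as a sketch rather than a general scraper.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

# Hypothetical, well-formed sample page standing in for a fetched URL.
SAMPLE_PAGE = """<html>
  <body>
    <h1 id="title">Vancouver Data</h1>
    <p>Price: <span class="price">42</span></p>
    <a href="https://example.com/one">One</a>
    <a href="https://example.com/two">Two</a>
  </body>
</html>"""


def all_links(page: str) -> List[str]:
    """Return the href of every <a> element, similar in spirit to //a/@href."""
    root = ET.fromstring(page)
    # ElementTree cannot select attributes directly, so select the
    # elements and read the attribute from each one.
    return [a.get("href") for a in root.findall(".//a") if a.get("href")]


def single_value(page: str, path: str) -> Optional[str]:
    """Return the text of the first element matching path, or None."""
    root = ET.fromstring(page)
    node = root.find(path)
    return node.text if node is not None else None


if __name__ == "__main__":
    print(all_links(SAMPLE_PAGE))
    print(single_value(SAMPLE_PAGE, ".//span[@class='price']"))
```

Real web pages are often not well-formed XML, which is one reason a tool that handles messy HTML for you, as the spreadsheet function is described as doing above, can be more convenient than this sketch.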