Digital Methods and Tools Seminar Series #3 – Introduction to Web Scraping

Wednesday, 15 August, 1pm in Locke 611A

Dr Christopher Thomson (English & Digital Humanities)

Chris will introduce web scraping, an approach to collecting research data and automating research tasks. First, we’ll briefly consider the types of data that may interest us and ask when web scraping is the right approach for collecting them. Second, we’ll cover some concepts needed to understand how web scraping works. Then we’ll put these ideas into practice with the Web Scraper extension for the Chrome browser (https://tinyurl.com/o9cncoa), collecting some texts that could be used for discourse analysis, as described in Donald’s talk last week. This will be more of a walk-through than an interactive tutorial, but you are welcome to bring a laptop with the extension installed if you would like to follow along. If there’s time, we’ll also identify some limitations we are likely to encounter and provide some starting points for programming your own web scraper.
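
For anyone curious about what programming your own web scraper might look like, here is a minimal sketch in Python using only the standard library. It is not material from the seminar: the class name `ParagraphExtractor`, the helper `extract_paragraphs`, and the sample page are all hypothetical, and the sketch assumes the texts of interest sit inside `<p>` elements that are not nested.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text of every <p> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraph strings
        self._buffer = None    # text pieces while inside a <p>, else None

    def handle_starttag(self, tag, attrs):
        # Start collecting text when a paragraph opens.
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while we are inside a paragraph.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # When the paragraph closes, normalise whitespace and store it.
        if tag == "p" and self._buffer is not None:
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def extract_paragraphs(html):
    """Return a list of paragraph texts found in an HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Hypothetical sample page, standing in for a page downloaded from the web.
SAMPLE_PAGE = """
<html><body>
  <h1>Example article</h1>
  <p>First paragraph of   <em>sample</em> text.</p>
  <div>Navigation, not wanted</div>
  <p>Second paragraph.</p>
</body></html>
"""

if __name__ == "__main__":
    for paragraph in extract_paragraphs(SAMPLE_PAGE):
        print(paragraph)
```

In a real script the HTML string would come from a downloaded page (for example via `urllib.request.urlopen`) rather than a hard-coded sample, and it is worth checking a site's terms of use before collecting from it.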