Post contributed by Owen Avery ‘25, Digital Scholarship Assistant
Almost 500 million tweets are sent every day, and thanks to Twitter’s fairly generous API (application programming interface, a way to request Twitter data directly instead of browsing the site itself), there are loads of tools available for collecting large datasets of tweets that match specific criteria. Here are some great tools that anyone, from the beginner researcher to the experienced data analyst, can use.
If you have no programming experience and do not want to spend any money, then TAGS (Twitter Archiving Google Sheet) is a great starting point. It requires no extra software and lets you collect tweets by specific users, hashtags, time periods, and other filters. You can also set up automatic collection at set intervals based on specific criteria, so you can forget about it and let the sheet do all the heavy lifting.
If you need something more professional and have some funding for your project, then Apify is the way to go. Apify is a large web-scraping service offering hundreds of what the site calls ‘actors’: prepackaged sets of instructions for scraping a specific site’s data, letting you collect data from Twitter (and almost any other website!). Apify’s free tier gives users $5 in Apify credit per month to spend on any of the actors. How far that credit stretches depends on what you want to analyze, but applied to the Twitter scraper it translates to roughly 20,000 tweets for free! If you run out of credits, the next tier starts at $45 per month, although with a bit of communication with Apify support I have found that they will offer a $25 option.
If you have some knowledge of the Python programming language and the command-line interface, then the open-source tool Twarc is great for Twitter analysis. Once installed and configured via the command line, it lets you bulk-download tweets matching specific search criteria into a spreadsheet. The tool is completely free and allows you to download as many tweets as you want, in batches of 3,200! There is also a companion visualization tool that can plot the tweets you have downloaded on a map.
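To give a sense of what working with a Twarc download in Python can look like once the tweets are on your machine, here is a minimal sketch that flattens a few tweet records into a CSV file any spreadsheet program can open. The records and field names below are hypothetical sample data, loosely modeled on the kind of JSON the Twitter API returns, not output from a real download:

```python
import csv

# Hypothetical sample tweet records (illustrative field names only,
# standing in for the JSON a real Twarc download would contain).
tweets = [
    {"id": "1", "author": "alice", "created_at": "2023-04-01", "text": "Hello #DH"},
    {"id": "2", "author": "bob", "created_at": "2023-04-01", "text": "Scraping tips #DH"},
]

# Write the records to a CSV file that opens in any spreadsheet program.
with open("tweets.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "author", "created_at", "text"])
    writer.writeheader()
    writer.writerows(tweets)
```

Once your data is in CSV form like this, you can sort, filter, and chart it in Google Sheets or Excel just like a TAGS archive.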
For more Twitter data collection tools and pros and cons of each, I have made a tool comparison table available here.
Please feel free to reach out to me (firstname.lastname@example.org) if you would like help getting started with any of these tools!
If you’re interested in social media data and web scraping, join the Data Mining Faculty Interest Group sponsored by the library! Our next meeting will be held April 1 at noon, and we will focus on tools for scraping websites. Register here.