I’m publishing a series of tutorials that teach the fundamentals of quantitative text analysis for social scientists. The emphasis is on application. How can you collect and analyze thousands of web pages or Tweets? What are the best practices for turning words into numbers?
The tutorials are designed for people who may be familiar with a standard statistical program, such as Stata or SPSS, or perhaps a qualitative analysis program like NVivo, but who haven’t done any quantitative text analysis or used Python. Python is an open-source programming language that is quite popular among programmers. Computational linguists and computer scientists have developed a large number of add-ons for Python that make it a popular choice for quantitative text analysis.
The tutorials are written in a cumulative fashion, following the flow of a workshop on collecting and analyzing data from the web that I lead at Carolina. Each tutorial assumes you have read through the prior ones. The first set is especially important as it introduces the basic concepts of using Python. If you want to jump directly to something of particular interest, go right ahead, but if you get lost, you might want to refer back to the introductory posts.
Special thanks to Sarah Gaby for beta testing the posts.
You can subscribe to this page using your RSS reader if you want to find out when new posts are published.
The basics of Python and working with text files to compute something interesting.
An introduction to text analysis with Python, Part 1
An introduction to text analysis with Python, Part 2
An introduction to text analysis with Python, Part 3
Using the Google Maps API
Inequality from Space
Who’s offering domestic partnership benefits?
Fun Demographic Data from Facebook
“Our findings show”: What words to use in an abstract
The 102 most cited articles in sociology
The most cited articles in sociology by journal
Using the Twitter Streaming API
Using the Facebook API
Cleaning Factiva and LexisNexis Files
K-Means and Other Clustering Algorithms
Fun with Python
Setting up Python
More Python Basics