Open Source Code

I develop and maintain several open source packages for computational text and network analysis. By providing easy-to-use implementations of advanced methods, I hope to help the computational social science community grow as a field.

Text as Data

CorEx Topic Model
The CorEx topic model allows users to integrate domain knowledge through user-specified “anchor words.” These anchor words allow substantive experts from the computational social sciences and digital humanities to interact with and refine topics in ways that are not possible with traditional topic models.

Word Shift Graphs
Word shift graphs are interpretable data visualizations for understanding which words contribute to the differences between two texts, and how they contribute. They can be used with a variety of text comparison measures including proportions, sentiment, entropy, and more complex measures like the Kullback-Leibler and Jensen-Shannon divergences.
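
To give a sense of the quantities a word shift graph displays, here is a minimal sketch in plain Python that splits the Jensen-Shannon divergence between two texts into per-word contributions. It is an illustration only and does not use the package itself: the function name jsd_word_contributions, the whitespace tokenization, and the equal weighting of the two texts are all assumptions made for this example.

```python
import math
from collections import Counter


def jsd_word_contributions(text_1, text_2):
    """Return {word: (contribution, direction)} for two texts.

    Hypothetical helper for illustration. `contribution` is the word's
    share of the Jensen-Shannon divergence (base 2, equal weights), and
    `direction` is q - p: positive if the word is relatively more
    frequent in text_2, negative if more frequent in text_1.
    """
    # Naive tokenization (assumption): lowercase and split on whitespace.
    counts_1 = Counter(text_1.lower().split())
    counts_2 = Counter(text_2.lower().split())
    total_1 = sum(counts_1.values())
    total_2 = sum(counts_2.values())
    if total_1 == 0 or total_2 == 0:
        raise ValueError("both texts must contain at least one word")

    contributions = {}
    for word in set(counts_1) | set(counts_2):
        p = counts_1[word] / total_1  # relative frequency in text 1
        q = counts_2[word] / total_2  # relative frequency in text 2
        m = 0.5 * (p + q)             # frequency in the mixed text
        term = 0.0
        if p > 0:
            term += 0.5 * p * math.log2(p / m)
        if q > 0:
            term += 0.5 * q * math.log2(q / m)
        contributions[word] = (term, q - p)
    return contributions


example = jsd_word_contributions(
    "the cat sat on the mat",
    "the dog sat on the log",
)
# Rank words by how much they contribute to the divergence.
ranked = sorted(example.items(), key=lambda item: item[1][0], reverse=True)
```

The per-word contributions sum to the total divergence, so ranking words by contribution and coloring them by direction gives the basic ingredients of a word shift graph for this measure.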

Network Science

Bayesian Stochastic Block Models
Bayesian stochastic block models are statistical models of a network’s mesoscale structure. Here, I have implemented two block models for inferring different types of core-periphery structure, a fundamental structure found in networks across many different domains.
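
As a rough illustration of what a block model describes, the sketch below computes the edge densities within and between the blocks of a given core-periphery partition. It is not the Bayesian inference implemented in the package, which infers the partition from the network; it only summarizes a partition that is already known. The function name block_densities and the toy network are assumptions made for this example.

```python
def block_densities(nodes, edges, core):
    """Edge densities for a two-block core-periphery partition.

    Hypothetical helper for illustration. Assumes an undirected simple
    graph: duplicate edges are merged and self-loops are ignored.
    """
    nodes = set(nodes)
    core = set(core) & nodes
    periphery = nodes - core

    # Deduplicate undirected edges and drop self-loops.
    unique_edges = {frozenset((u, v)) for u, v in edges if u != v}

    counts = {"core-core": 0, "core-periphery": 0, "periphery-periphery": 0}
    for edge in unique_edges:
        n_core_ends = sum(1 for node in edge if node in core)
        if n_core_ends == 2:
            counts["core-core"] += 1
        elif n_core_ends == 1:
            counts["core-periphery"] += 1
        else:
            counts["periphery-periphery"] += 1

    # Number of node pairs that could be connected in each block pair.
    n_c, n_p = len(core), len(periphery)
    possible = {
        "core-core": n_c * (n_c - 1) // 2,
        "core-periphery": n_c * n_p,
        "periphery-periphery": n_p * (n_p - 1) // 2,
    }
    # Report a density of 0.0 when a block pair has no possible edges.
    return {
        key: (counts[key] / possible[key] if possible[key] else 0.0)
        for key in counts
    }


# Toy network (assumption): nodes 0 and 1 form the core.
toy_nodes = range(5)
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 3), (1, 4)]
densities = block_densities(toy_nodes, toy_edges, core={0, 1})
```

In a network with core-periphery structure, one would typically expect the core-core density to be highest and the periphery-periphery density to be lowest, as in this toy example.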