Open Source Code & Demos
Code and notebooks developed on the EXPERT project fuse a variety of multilingual, heterogeneous open-source data streams (e.g., publications, institutional web pages, conference pages, and researcher profiles), convert unstructured data into knowledge summaries, and construct dynamically evolving proliferation expertise graphs for descriptive, predictive, and prescriptive analytics.

TExplore is an interactive tool for rapid and reproducible identification of informative trends over time in unstructured text datasets. Using TExplore, users can quickly probe different axes of interest to see how behavioral patterns persist or differ over time across combinations of those axes. The tool summarizes the prominence over time of text elements (e.g., words, n-grams, keyword phrases) in datasets or inputs across groups, such as predictive model output categories (e.g., predictions, confidence bin labels) or other categorical annotations in a dataset (e.g., topics, locations).
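
To make "prominence over time across groups" concrete, here is a minimal standalone sketch. It is not TExplore's API: the function name `term_prominence`, the `(period, group, text)` record layout, and the sample records are all assumptions made for illustration, and "prominence" is approximated here as a term's share of whitespace-separated tokens.

```python
from collections import Counter, defaultdict


def term_prominence(records, term):
    """Return {group: {period: share}} for one term.

    `records` is an iterable of (period, group, text) tuples (a hypothetical
    layout). `share` is the fraction of tokens in that group and period that
    equal `term`, using naive lowercase whitespace tokenization.
    """
    counts = defaultdict(Counter)  # (group, period) -> token counts
    for period, group, text in records:
        counts[(group, period)].update(text.lower().split())

    prominence = defaultdict(dict)
    for (group, period), token_counts in counts.items():
        total = sum(token_counts.values())
        prominence[group][period] = token_counts[term] / total if total else 0.0
    return dict(prominence)


# Illustrative records only; the groups mimic confidence bin labels.
records = [
    ("2020-01", "high_confidence", "reactor fuel reactor"),
    ("2020-02", "high_confidence", "fuel cycle"),
    ("2020-01", "low_confidence", "reactor design notes"),
]
trend = term_prominence(records, "reactor")
```

Reading `trend["high_confidence"]` from one period to the next shows whether the term persists or fades within that group, and comparing groups at the same period shows how they differ.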

CrossCheck is an interactive tool for rapid cross-model comparison and reproducible error analysis. It enables users to make informed decisions when choosing between multiple models, identify when the models are correct and for which examples, investigate whether the models make the same mistakes as humans, evaluate the models’ generalizability, and highlight their limitations, strengths, and weaknesses.
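
The kind of per-example comparison described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the tool's implementation: the function `compare_models`, the bucket names, and the toy labels are hypothetical.

```python
def compare_models(gold, preds_a, preds_b):
    """Bucket example indices by which of two models predicted the gold label."""
    buckets = {"both_correct": [], "only_a": [], "only_b": [], "both_wrong": []}
    for index, (label, a, b) in enumerate(zip(gold, preds_a, preds_b)):
        a_ok, b_ok = a == label, b == label
        if a_ok and b_ok:
            buckets["both_correct"].append(index)
        elif a_ok:
            buckets["only_a"].append(index)
        elif b_ok:
            buckets["only_b"].append(index)
        else:
            buckets["both_wrong"].append(index)
    return buckets


# Toy labels and predictions, for illustration only.
gold = ["pos", "neg", "pos", "neg"]
preds_a = ["pos", "pos", "neg", "neg"]
preds_b = ["pos", "neg", "neg", "pos"]
result = compare_models(gold, preds_a, preds_b)
```

The `both_wrong` bucket is where shared failure modes would show up, while `only_a` and `only_b` point to examples on which the two models complement each other.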

SocialSim is a comprehensive Python package with 100+ measurements for quantifying many properties of online information spread. You can examine the spread of one or more pieces of information online across different entity types, groups, temporal scales, and behaviors or phenomena. The package provides functionality to measure both simulation and ground truth representations of events (who, what, where, when) at a range of resolutions (user, community, population) and to compare the results, evaluating the simulations against the ground truth using the metrics provided.
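
The measure-then-compare workflow can be illustrated with a small standalone sketch. It does not use the SocialSim package or reproduce any of its measurements or metrics: the `(who, what, where, when)` tuple layout, the `events_per_user` measurement, and the choice of RMSE as the comparison metric are assumptions made for this example.

```python
import math
from collections import Counter


def events_per_user(events):
    """User-level measurement: number of events attributed to each user."""
    return Counter(who for who, _what, _where, _when in events)


def rmse(measured_sim, measured_truth):
    """Root-mean-square error over the union of keys; a missing key counts as 0."""
    keys = set(measured_sim) | set(measured_truth)
    if not keys:
        return 0.0
    squared = sum(
        (measured_sim.get(k, 0) - measured_truth.get(k, 0)) ** 2 for k in keys
    )
    return math.sqrt(squared / len(keys))


# Hypothetical (who, what, where, when) events, for illustration only.
ground_truth = [
    ("alice", "post", "forum", "2020-01-01"),
    ("alice", "reply", "forum", "2020-01-02"),
    ("bob", "post", "forum", "2020-01-02"),
]
simulation = [
    ("alice", "post", "forum", "2020-01-01"),
    ("bob", "post", "forum", "2020-01-03"),
    ("carol", "post", "forum", "2020-01-03"),
]
score = rmse(events_per_user(simulation), events_per_user(ground_truth))
```

The same pattern applies at other resolutions: swap the measurement for one that aggregates by community or over the whole population, and keep the comparison step unchanged.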