The project aims to synthesize long time-series mobility chains and to evaluate the performance of the synthesizers.

Open-source code for synthesizing and evaluating synthetic time-series data is available in our GitHub repository, which includes two projects:

  • synthesize_mobility_chains: The repository contains code that analyzes repositories of long mobility chains (such as the location traces of an app user over several weeks) and synthesizes location traces using long short-term memory networks (LSTMs), Markov chains (MC), and variable-order Markov models.
  • evaluate_synthetic_time_series: The repository contains code that computes evaluation measures for synthetic time-series data, such as synthesized sequences of the locations a set of people visit over a couple of weeks. The scripts compare the synthetic data with the original data and analyze how well the synthesis preserves the privacy of the original data subjects, as well as the statistical similarity, the per-instance similarity, and the diversity of the synthetic data.
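
As a rough illustration of the two ideas above, the following minimal Python sketch fits a first-order Markov chain to a few toy mobility chains, samples synthetic chains from it, and compares visit frequencies between the original and synthetic data. It is not taken from either repository: the function names, the toy location labels, and the choice of total variation distance as a statistical-similarity measure are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def fit_markov_chain(chains):
    """Count first-order transitions between consecutive locations."""
    transitions = defaultdict(Counter)
    for chain in chains:
        for current, following in zip(chain, chain[1:]):
            transitions[current][following] += 1
    return transitions


def synthesize_chain(transitions, start, length, rng):
    """Sample a chain of up to `length` locations, starting at `start`."""
    chain = [start]
    while len(chain) < length:
        counts = transitions.get(chain[-1])
        if not counts:
            break  # no observed transition out of this location
        locations = list(counts)
        weights = [counts[loc] for loc in locations]
        chain.append(rng.choices(locations, weights=weights, k=1)[0])
    return chain


def visit_distribution(chains):
    """Relative frequency of each location across all chains."""
    counts = Counter(loc for chain in chains for loc in chain)
    total = sum(counts.values())
    return {loc: n / total for loc, n in counts.items()}


def total_variation_distance(p, q):
    """Half the L1 distance between two discrete distributions (0 = identical)."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Toy data (hypothetical): each chain is one person's ordered location visits.
original = [
    ["home", "work", "gym", "home"],
    ["home", "work", "home", "cafe", "home"],
]

rng = random.Random(0)  # fixed seed so the run is repeatable
model = fit_markov_chain(original)
synthetic = [synthesize_chain(model, "home", 5, rng) for _ in range(100)]

tvd = total_variation_distance(
    visit_distribution(original), visit_distribution(synthetic)
)
print(f"Total variation distance between visit distributions: {tvd:.3f}")
```

A lower distance suggests that the synthetic chains visit locations in similar proportions to the original ones. This covers only one aspect of statistical similarity and says nothing about privacy, per-instance similarity, or diversity, which would each need their own measures.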

The first paper in this project, “Synthesis of Longitudinal Human Location Sequences: Balancing Utility and Privacy”, was recently accepted to the ACM Transactions on Knowledge Discovery from Data (TKDD).