EAGER: Spatiotemporal Big Data Analysis to Understand COVID-19 Effects
Principal Investigator(s): Shashi Shekhar, McKnight Distinguished Professor, Computer Science and Engineering
The COVID-19 pandemic has harmed public health, causing a large number of deaths, and has ravaged the economy, driving unemployment to a historically high level. The goal of this project is to investigate the potential of novel spatiotemporal big data to help identify COVID-19-related geographic patterns, such as locations that groups of people visit for long, overlapping periods, and travel to and from hotspots. Such patterns are of great interest to policymakers and public health researchers but are difficult to find in traditional mobility datasets such as infrequent travel surveys and urban highway traffic data. Examples of spatiotemporal big data include privacy-protected, aggregated location traces of mobile devices that have recently been made available for COVID-19 research. If successful, the results will inform disease spread models and policy interventions to save lives and reopen the economy safely.
This project is expected to result in multiple data science and computer science innovations. First, it will define and quantify hangout-venues, a novel spatiotemporal pattern family that models places with many overlapping long visits. Examples include full-service dine-in restaurants, which have many long visits, but not limited-service restaurants, which mostly have short visits. Second, it will probe new interest measures that not only distinguish between patterns (e.g., full-service restaurants) and non-patterns (e.g., limited-service restaurants) but also support the design of computationally efficient algorithms based on properties such as anti-monotonicity. Third, it will design novel and scalable algorithms for mining hangout-venues from the large (tens of terabytes) dataset. Fourth, it will investigate the impact of selection bias and of the noise introduced by differential privacy schemes. The results have the potential to transform data science knowledge with novel pattern families (e.g., hangout-venue) and improve the understanding of how selection bias and noise added by differential privacy schemes affect pattern mining methods and their results. Furthermore, the project will co-produce knowledge in close collaboration with public health researchers and policymakers. The results have the potential to transform the understanding of public mobility for modeling disease transmission dynamics by leveraging the emerging spatiotemporal big data.
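To make the hangout-venue idea concrete, the following is a minimal Python sketch of one possible interest measure: the peak number of simultaneous long visits at a venue. It is an illustration only, not the project's definition. The function names (`peak_overlap`, `hangout_venues`), the thresholds (`min_duration`, `min_overlap`), the visit record layout, and the toy data are all assumptions introduced here.

```python
from collections import defaultdict


def peak_overlap(visits, min_duration):
    """Map each venue to the peak number of simultaneous long visits.

    visits: iterable of (venue, start, end) tuples with numeric times
            (hypothetical layout, e.g. minutes since opening).
    min_duration: a visit counts as "long" if end - start >= min_duration.
    """
    events = defaultdict(list)
    for venue, start, end in visits:
        if end - start >= min_duration:
            events[venue].append((start, 1))   # a long visit begins
            events[venue].append((end, -1))    # a long visit ends

    peaks = {}
    for venue, venue_events in events.items():
        # Sweep in time order. At equal times the -1 event sorts first,
        # so visits that merely touch are not counted as overlapping.
        venue_events.sort()
        current = peak = 0
        for _, delta in venue_events:
            current += delta
            peak = max(peak, current)
        peaks[venue] = peak
    return peaks


def hangout_venues(visits, min_duration, min_overlap):
    """Return the sorted venues whose peak long-visit overlap reaches min_overlap."""
    peaks = peak_overlap(visits, min_duration)
    return sorted(v for v, p in peaks.items() if p >= min_overlap)


# Synthetic toy data for illustration only (times in minutes).
visits = [
    ("full_service", 0, 90), ("full_service", 10, 100), ("full_service", 30, 120),
    ("limited_service", 0, 10), ("limited_service", 5, 15), ("limited_service", 20, 95),
]
print(hangout_venues(visits, min_duration=60, min_overlap=2))
```

Under this assumed measure, raising `min_duration` can only shrink the set of qualifying visits, so a venue's peak overlap never increases as the threshold grows. Monotonic behavior of this kind is what allows candidate venues to be pruned early, in the spirit of the anti-monotone property mentioned above.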