Abstract
Social media sensing has emerged as a new application paradigm for collecting observations about the physical environment from online social media users. A fundamental problem in social media sensing applications lies in estimating the evolving truth of the measured variables and the reliability of the data sources without knowing either of them a priori. This problem is referred to as dynamic truth discovery. Current truth discovery solutions have two major limitations: i) they cannot effectively address the missing truth problem, where measured variables have no reported measurements from the data sources; ii) they do not fully capture and utilize the latent correlations among the measured variables. In this paper, we propose a Reliable Missing Truth Finder (RMTF) to address these limitations in social media sensing applications. In particular, we develop a novel data-driven technique to identify the lagged and latent correlations among measured variables, and incorporate this correlation information into a holistic spatiotemporal inference model to infer the missing truth. We evaluate RMTF using real-world Twitter data feeds. The results show that the RMTF scheme significantly outperforms state-of-the-art truth discovery solutions by correctly inferring the missing truth of the measured variables.