11th Pacific Rim International Symposium on Dependable Computing (PRDC'05)
pp. 31-40
Privacy-Preserving Bayesian Network Structure Learning on Distributed Heterogeneous Data
Wang Hongmei, Tianjin University, Tianjin 300072, China
Zhao Zheng, Tianjin University, Tianjin 300072, China
Sun Zhiwei, Tianjin University, Tianjin 300072, China
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/PRDC.2005.49
Abstract
Privacy concerns often prevent different parties from sharing their data
in order to carry out data mining applications on their joint data.
Privacy-preserving data mining provides a solution by creating distributed
data mining algorithms in which the underlying data is not revealed. In
this paper, we address a particular data mining problem: learning the
structure of a Bayesian network on distributed heterogeneous data. In this
setting, three or more parties owning confidential databases wish to learn
the structure on the combination of their databases without revealing
anything about their data to each other. We provide a private generalized
scalar product share protocol for learning the empirical entropy. We then
give an effective, privacy-preserving version of the B&B_MDL algorithm to
construct the structure of a Bayesian network for the parties' joint data.
In comparison to the previously known solution for this problem (Wright,
Yang, 2004), which is based on the K2 algorithm, our solution provides
complete accuracy, full privacy, ideal universality, and better
performance. In particular, our solution is fully private, in that the
only things the parties learn about each other's inputs are the desired
output and the number of values of the stochastic variables; more
universal, in that the databases are vertically partitioned among three or
more parties; and completely accurate, in that the computed structure is
exactly what it would be if the data were centralized. In addition, our
solution works for both binary and non-binary discrete data.
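
To illustrate why a scalar product is a natural building block for entropy on vertically partitioned data, the sketch below computes joint counts as a generalized scalar product of per-party indicator vectors, and shows what it means to hold a count as additive shares. It is a plain, non-private simulation and not the paper's protocol: it contains no cryptography, it runs in one process, and every function name, the modulus, and the toy data are assumptions made for illustration only.

```python
import math
import random
from itertools import product


def indicator(column, value):
    """0/1 vector marking the records where a party's attribute equals `value`."""
    return [1 if v == value else 0 for v in column]


def generalized_scalar_product(vectors):
    """Sum over records of the elementwise product of all vectors."""
    return sum(math.prod(row) for row in zip(*vectors))


def additive_shares(secret, n_parties, modulus, rng):
    """Split `secret` into n random shares that sum to it modulo `modulus`."""
    shares = [rng.randrange(modulus) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % modulus)
    return shares


def joint_entropy(columns):
    """Empirical joint entropy in bits, using counts built from scalar products."""
    n = len(columns[0])
    domains = [sorted(set(col)) for col in columns]
    entropy = 0.0
    for values in product(*domains):
        # Count of records matching this value combination across all parties.
        count = generalized_scalar_product(
            [indicator(col, v) for col, v in zip(columns, values)]
        )
        if count:
            p = count / n
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical toy data: three parties, one attribute each, same four records.
party_a = [0, 0, 1, 1]
party_b = ["x", "y", "x", "y"]
party_c = [2, 2, 2, 2]

columns = [party_a, party_b, party_c]
count = generalized_scalar_product(
    [indicator(party_a, 0), indicator(party_b, "x"), indicator(party_c, 2)]
)
# In a share-based protocol each party would hold one share, never the count.
shares = additive_shares(count, n_parties=3, modulus=2**31 - 1, rng=random.Random(0))
print(count, joint_entropy(columns))
```

In this simulation the count is computed in the clear and then split. In a private protocol the parties would obtain only their shares, and the entropy would be evaluated on shares without reconstructing the counts.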
Citation:
Wang Hongmei, Zhao Zheng, Sun Zhiwei, "Privacy-Preserving Bayesian Network Structure Learning on Distributed Heterogeneous Data," 11th Pacific Rim International Symposium on Dependable Computing (PRDC'05), pp. 31-40, 2005.