TY - GEN
T1 - Clustering-based k-anonymity
AU - He, Xianmang
AU - Chen, Hua Hui
AU - Chen, Yefang
AU - Dong, Yihong
AU - Wang, Peng
AU - Huang, Zhenhua
PY - 2012
Y1 - 2012
N2 - Privacy is one of the major concerns when data containing sensitive information needs to be released for ad hoc analysis, and it has attracted wide research interest in privacy-preserving data publishing over the past few years. One strategy for anonymizing data is generalization. In a typical generalization approach, the tuples in a table are first divided into many QI (quasi-identifier) groups such that the size of each QI-group is no less than k. Clustering partitions the tuples into many clusters such that the points within a cluster are more similar to each other than to points in different clusters. The two methods share a common feature: both distribute the tuples into many small groups. Motivated by this observation, we propose a clustering-based k-anonymity algorithm, which achieves k-anonymity through clustering. Extensive experiments on real data sets are also conducted, showing that our approach improves utility.
AB - Privacy is one of the major concerns when data containing sensitive information needs to be released for ad hoc analysis, and it has attracted wide research interest in privacy-preserving data publishing over the past few years. One strategy for anonymizing data is generalization. In a typical generalization approach, the tuples in a table are first divided into many QI (quasi-identifier) groups such that the size of each QI-group is no less than k. Clustering partitions the tuples into many clusters such that the points within a cluster are more similar to each other than to points in different clusters. The two methods share a common feature: both distribute the tuples into many small groups. Motivated by this observation, we propose a clustering-based k-anonymity algorithm, which achieves k-anonymity through clustering. Extensive experiments on real data sets are also conducted, showing that our approach improves utility.
KW - algorithm
KW - privacy preservation
KW - proximity privacy
UR - http://www.scopus.com/inward/record.url?scp=84861422569&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-30217-6_34
DO - 10.1007/978-3-642-30217-6_34
M3 - Conference contribution
AN - SCOPUS:84861422569
SN - 9783642302169
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 405
EP - 417
BT - Advances in Knowledge Discovery and Data Mining - 16th Pacific-Asia Conference, PAKDD 2012, Proceedings
T2 - 16th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2012
Y2 - 29 May 2012 through 1 June 2012
ER -