Advanced Search
CS Search Google Search
Subscribers, please login

Published Articles >> Table of Contents >> Abstract

Publication Home Page
August 2007 (Vol. 19, No. 8)   pp. 1072-1088
Efficient Skyline and Top-k Retrieval in Subspaces

Full Article Text: View linked HTML of full textDownload PDF of full textBuy this articleGet full text from IEEE Xplore

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2007.1051
Send link to a friend

Abstract
Skyline and top-k queries are two popular operations for preference retrieval. In practice, applications that require these operations usually provide numerous candidate attributes, whereas, depending on their interests, users may issue queries regarding different subsets of the dimensions. The existing algorithms are inadequate for subspace skyline/top-k search because they have at least one of the following defects: 1) They require scanning the entire database at least once, 2) they are optimized for one subspace but incur significant overhead for other subspaces, or 3) they demand expensive maintenance cost or space consumption. In this paper, we propose a technique SUBSKY, which settles both types of queries by using purely relational technologies. The core of SUBSKY is a transformation that converts multidimensional data to one-dimensional (1D) values. These values are indexed by a simple B-tree, which allows us to answer subspace queries by accessing a fraction of the database. SUBSKY entails low maintenance overhead, which equals the cost of updating a traditional B-tree. Extensive experiments with real data confirm that our technique outperforms alternative solutions significantly in both efficiency and scalability.
References
[1] W.-T. Balke, U. Guntzer, and J.X. Zheng, “Efficient Distributed Skylining for Web Information Systems,” Proc. Ninth Int'l Conf. Extending Database Technology (EDBT '04), pp. 256-273, 2004.
[2] N. Beckmann, H.-P. Kriegel, R. Schneider, and B. Seeger, “The ${\rm R}^{\ast}$ -Tree: An Efficient and Robust Access Method for Points and Rectangles,” Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '90), pp. 322-331, 1990.
[3] S. Berchtold, D.A. Keim, and H.-P. Kriegel, “The X-Tree : An Index Structure for High-Dimensional Data,” Proc. Int'l Conf. Very Large Data Bases (VLDB '96), pp. 28-39, 1996.
[4] C. Bohm, “A Cost Model for Query Processing in High-Dimensional Data Spaces,” ACM Trans. Database Systems, vol. 25, no. 2, pp. 129-178, 2000.
[5] S. Borzsonyi, D. Kossmann, and K. Stocker, “The Skyline Operator,” Proc. 17th IEEE Int'l Conf. Data Eng. (ICDE '01), pp.421-430, 2001.
[6] C.-Y. Chan, P.-K. Eng, and K.-L. Tan, “Stratified Computation of Skylines with Partially-Ordered Domains,” Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '05), pp. 203-214, 2005.
[7] C.-Y. Chan, H. Jagadish, K.-L. Tan, A. Tung, and Z. Zhang, “On High Dimensional Skylines,” Proc. 10th Int'l Conf. Extending Database Technology (EDBT '06), pp. 478-495, 2006.
[8] C.-Y. Chan, H.V. Jagadish, K.-L. Tan, A.K.H. Tung, and Z. Zhang, “Finding k-Dominant Skylines in High Dimensional Space,” Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '06), pp.503-514, 2006.
[9] Y.-C. Chang, L.D. Bergman, V. Castelli, C.-S. Li, M.-L. Lo, and J.R. Smith, “The Onion Technique: Indexing for Linear Optimization Queries,” Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '00), pp. 391-402, 2000.
[10] J. Chomicki, P. Godfrey, J. Gryz, and D. Liang, “Skyline with Presorting,” Proc. 19th IEEE Int'l Conf. Data Eng. (ICDE '03), pp.717-719, 2003.
[11] R.O. Duda and P.E. Hart, Pattern Classification and Scene Analysis. Wiley, 1973.
[12] R. Fagin, “Combining Fuzzy Information from Multiple Systems (Extended Abstract),” Proc. 15th ACM Symp. Principles of Database Systems (PODS '96), pp. 216-226, 1996.
[13] P. Godfrey, “Skyline Cardinality for Relational Processing,” Proc. Third Int'l Symp. Foundations of Information and Knowledge Systems (FoIKS '04), pp. 78-97, 2004.
[14] P. Godfrey, R. Shipley, and J. Gryz, “Maximal Vector Computation in Large Data Sets,” Proc. 31st Int'l Conf. Very Large Data Bases (VLDB '05), pp. 229-240, 2005.
[15] G.R. Hjaltason and H. Samet, “Distance Browsing in Spatial Databases,” ACM Trans. Database Systems, vol. 24, no. 2, pp. 265-318, 1999.
[16] V. Hristidis and Y. Papakonstantinou, “Algorithms and Applications for Answering Ranked Queries Using Ranked Views,” VLDBJ., vol. 13, no. 1, pp. 49-70, 2004.
[17] Z. Huang, C.S. Jensen, H. Lu, and B.C. Ooi, “Skyline Queries against Mobile Lightweight Devices in MANETs,” Proc. 22nd IEEE Int'l Conf. Data Eng. (ICDE '06), 2006.
[18] D. Kossmann, F. Ramsak, and S. Rost, “Shooting Stars in the Sky: An Online Algorithm for Skyline Queries,” Proc. 28th Int'l Conf. Very Large Data Bases (VLDB '02), pp. 275-286, 2002.
[19] X. Lin, Y. Yuan, W. Wang, and H. Lu, “Stabbing the Sky: Efficient Skyline Computation over Sliding Windows,” Proc. 21st IEEE Int'l Conf. Data Eng. (ICDE '05), 2005.
[20] S. Michel, P. Triantafillou, and G. Weikum, “Klee: A Framework for Distributed Top-k Query Algorithms,” Proc. 31st Int'l Conf. Very Large Data Bases, pp. 637-648, 2005.
[21] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “An Optimal and Progressive Algorithm for Skyline Queries,” Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '03), pp. 467-478, 2003.
[22] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “Progressive Skyline Computation in Database Systems,” ACM Trans. Database Systems, vol. 30, no. 1, pp. 41-82, 2005.
[23] J. Pei, W. Jin, M. Ester, and Y. Tao, “Catching the Best Views of Skyline: A Semantic Approach Based on Decisive Subspaces,” Proc. 31st Int'l Conf. Very Large Data Bases (VLDB '05), pp. 253-264, 2005.
[24] D. Pelleg and A.W. Moore, “X-Means: Extending k-Means with Efficient Estimation of the Number of Clusters,” Proc. 17th Int'l Conf. Machine Learning (ICML '00), pp. 727-734, 2000.
[25] R.K. Surajit Chaudhuri and N. Dalvi, “Robust Cardinality and Cost Estimation for Skyline Operator,” Proc. 23rd IEEE Int'l Conf. Data Eng. (ICDE '06), 2006.
[26] K.-L. Tan, P.-K. Eng, and B.C. Ooi, “Efficient Progressive Skyline Computation,” Proc. 27th Int'l Conf. Very Large Data Bases (VLDB '01), pp. 301-310, 2001.
[27] Y. Tao, V. Hristidis, D. Papadias, and Y. Papakonstantinou, “Branch-and-Bound Processing of Ranked Queries,” Information Systems, vol. 32, no. 3, pp. 424-445, 2007.
[28] P. Tsaparas, T. Palpanas, Y. Kotidis, N. Koudas, and D. Srivastava, “Ranked Join Indices,” Proc. 19th IEEE Int'l Conf. Data Eng. (ICDE '03), pp. 277-288, 2003.
[29] T. Xia and D. Zhang, “Refreshing the Sky: The Compressed Skycube with Efficient Support for Frequent Updates,” Proc. ACM SIGMOD Int'l Conf. Management of Data (SIGMOD '06), pp. 491-502, 2006.
[30] K. Yi, H. Yu, J. Yang, G. Xia, and Y. Chen, “Efficient Maintenance of Materialized Top-k Views,” Proc. 19th IEEE Int'l Conf. Data Eng. (ICDE '03), pp. 189-200, 2003.
[31] Y. Yuan, X. Lin, Q. Liu, W. Wang, J.X. Yu, and Q. Zhang, “Efficient Computation of the Skyline Cube,” Proc. 31st Int'l Conf. Very Large Data Bases (VLDB '05), pp. 241-252, 2005.
Additional Information
Index Terms- Skyline, top-k, subspace, B-tree.

Citation:  Yufei Tao, Xiaokui Xiao, Jian Pei, "Efficient Skyline and Top-k Retrieval in Subspaces," IEEE Transactions on Knowledge and Data Engineering, vol. 19,  no. 8,  pp. 1072-1088,  Aug.,  2007

RSS Feed

Similar Articles

Abstract Contents
Abstract
References
Index Terms
Citation




Free access to

  • Abstracts
  • Selected PDFs

Electronic subscribers login to:

  • Access HTML/PDFs of full text articles

Subscription information

Get a Web account

Peer Review Notice

Give us Feedback