File Download
There are no files associated with this item.
Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1023/B:JIIS.0000029669.16825.54
- Scopus: eid_2-s2.0-3042834703
- WOS: WOS:000221745000002
- Find via
Supplementary
- Citations:
- Appears in Collections:
Article: Optimization in data cube system design
Title | Optimization in data cube system design |
---|---|
Authors | |
Keywords | Approximate algorithm Data cube system design Data warehouse OLAP Optimization |
Issue Date | 2004 |
Publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0925-9902 |
Citation | Journal Of Intelligent Information Systems, 2004, v. 23 n. 1, p. 17-45 How to Cite? |
Abstract | The design of an OLAP system for supporting real-time queries is one of the major research issues. One approach is to use data cubes, which are materialized precompiled multidimensional views of data in a data warehouse. We can derive a set of data cubes to answer each frequently asked query directly. However, there are two practical problems: (1) the maintenance cost of the data cubes, and (2) the query cost to answer those queries. Maintaining a data cube requires disk storage and CPU computation, so the maintenance cost is related to the total size as well as the total number of data cubes materialized. In most cases, materializing all data cubes is impractical. The maintenance cost may be reduced by merging some data cubes. However, the resulting larger data cubes will increase the query cost of answering some queries. If the bounds on the maintenance cost and the query cost are too strict, we help the user decide which queries to be sacrificed and not taken into consideration. We have defined an optimization problem in data cube system design. Given a maintenance-cost bound, a query-cost bound and a set of frequently asked queries, it is necessary to determine a set of data cubes such that the system can answer a largest subset of the queries without violating the two bounds. This is an NP-hard problem. We propose approximate Greedy algorithms GR, 2GM and 2GMM, which are shown to be both effective and efficient by experiments done on a census data set and a forest-cover-type data set. |
Persistent Identifier | http://hdl.handle.net/10722/88999 |
ISSN | 2015 Impact Factor: 1.0 2015 SCImago Journal Rankings: 0.691 |
ISI Accession Number ID | |
References |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hung, E | en_HK |
dc.contributor.author | Cheung, DW | en_HK |
dc.contributor.author | Kao, B | en_HK |
dc.date.accessioned | 2010-09-06T09:51:07Z | - |
dc.date.available | 2010-09-06T09:51:07Z | - |
dc.date.issued | 2004 | en_HK |
dc.identifier.citation | Journal Of Intelligent Information Systems, 2004, v. 23 n. 1, p. 17-45 | en_HK |
dc.identifier.issn | 0925-9902 | en_HK |
dc.identifier.uri | http://hdl.handle.net/10722/88999 | - |
dc.description.abstract | The design of an OLAP system for supporting real-time queries is one of the major research issues. One approach is to use data cubes, which are materialized precompiled multidimensional views of data in a data warehouse. We can derive a set of data cubes to answer each frequently asked query directly. However, there are two practical problems: (1) the maintenance cost of the data cubes, and (2) the query cost to answer those queries. Maintaining a data cube requires disk storage and CPU computation, so the maintenance cost is related to the total size as well as the total number of data cubes materialized. In most cases, materializing all data cubes is impractical. The maintenance cost may be reduced by merging some data cubes. However, the resulting larger data cubes will increase the query cost of answering some queries. If the bounds on the maintenance cost and the query cost are too strict, we help the user decide which queries to be sacrificed and not taken into consideration. We have defined an optimization problem in data cube system design. Given a maintenance-cost bound, a query-cost bound and a set of frequently asked queries, it is necessary to determine a set of data cubes such that the system can answer a largest subset of the queries without violating the two bounds. This is an NP-hard problem. We propose approximate Greedy algorithms GR, 2GM and 2GMM, which are shown to be both effective and efficient by experiments done on a census data set and a forest-cover-type data set. | en_HK |
dc.language | eng | en_HK |
dc.publisher | Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0925-9902 | en_HK |
dc.relation.ispartof | Journal of Intelligent Information Systems | en_HK |
dc.subject | Approximate algorithm | en_HK |
dc.subject | Data cube system design | en_HK |
dc.subject | Data warehouse | en_HK |
dc.subject | OLAP | en_HK |
dc.subject | Optimization | en_HK |
dc.title | Optimization in data cube system design | en_HK |
dc.type | Article | en_HK |
dc.identifier.openurl | http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0925-9902&volume=23&issue=1&spage=17&epage=45&date=2004&atitle=Optimization+in+Data+Cube+System+Design | en_HK |
dc.identifier.email | Cheung, DW:dcheung@cs.hku.hk | en_HK |
dc.identifier.email | Kao, B:kao@cs.hku.hk | en_HK |
dc.identifier.authority | Cheung, DW=rp00101 | en_HK |
dc.identifier.authority | Kao, B=rp00123 | en_HK |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1023/B:JIIS.0000029669.16825.54 | en_HK |
dc.identifier.scopus | eid_2-s2.0-3042834703 | en_HK |
dc.identifier.hkuros | 93316 | en_HK |
dc.relation.references | http://www.scopus.com/mlt/select.url?eid=2-s2.0-3042834703&selection=ref&src=s&origin=recordpage | en_HK |
dc.identifier.volume | 23 | en_HK |
dc.identifier.issue | 1 | en_HK |
dc.identifier.spage | 17 | en_HK |
dc.identifier.epage | 45 | en_HK |
dc.identifier.isi | WOS:000221745000002 | - |
dc.publisher.place | United States | en_HK |
dc.identifier.scopusauthorid | Hung, E=7004256336 | en_HK |
dc.identifier.scopusauthorid | Cheung, DW=34567902600 | en_HK |
dc.identifier.scopusauthorid | Kao, B=35221592600 | en_HK |
dc.identifier.citeulike | 33451 | - |