Optimization in data cube system design

Hung, E; Cheung, DW; Kao, B

File Download

There are no files associated with this item.

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1023/B:JIIS.0000029669.16825.54
Scopus: eid_2-s2.0-3042834703
WOS: WOS:000221745000002
Find via

Supplementary

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

See more details

Article: Optimization in data cube system design

Title	Optimization in data cube system design
Authors	Hung, E Cheung, DW Kao, B
Keywords	Approximate algorithm Data cube system design Data warehouse OLAP Optimization
Issue Date	2004
Publisher	Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0925-9902
Citation	Journal Of Intelligent Information Systems, 2004, v. 23 n. 1, p. 17-45 How to Cite? DOI: http://dx.doi.org/10.1023/B:JIIS.0000029669.16825.54
Abstract	The design of an OLAP system for supporting real-time queries is one of the major research issues. One approach is to use data cubes, which are materialized precompiled multidimensional views of data in a data warehouse. We can derive a set of data cubes to answer each frequently asked query directly. However, there are two practical problems: (1) the maintenance cost of the data cubes, and (2) the query cost to answer those queries. Maintaining a data cube requires disk storage and CPU computation, so the maintenance cost is related to the total size as well as the total number of data cubes materialized. In most cases, materializing all data cubes is impractical. The maintenance cost may be reduced by merging some data cubes. However, the resulting larger data cubes will increase the query cost of answering some queries. If the bounds on the maintenance cost and the query cost are too strict, we help the user decide which queries to be sacrificed and not taken into consideration. We have defined an optimization problem in data cube system design. Given a maintenance-cost bound, a query-cost bound and a set of frequently asked queries, it is necessary to determine a set of data cubes such that the system can answer a largest subset of the queries without violating the two bounds. This is an NP-hard problem. We propose approximate Greedy algorithms GR, 2GM and 2GMM, which are shown to be both effective and efficient by experiments done on a census data set and a forest-cover-type data set.
Persistent Identifier	http://hdl.handle.net/10722/88999
ISSN	0925-9902 2023 Impact Factor: 2.3 2023 SCImago Journal Rankings: 0.835
ISI Accession Number ID	WOS:000221745000002
References	References in Scopus

DC Field	Value	Language
dc.contributor.author	Hung, E	en_HK
dc.contributor.author	Cheung, DW	en_HK
dc.contributor.author	Kao, B	en_HK
dc.date.accessioned	2010-09-06T09:51:07Z	-
dc.date.available	2010-09-06T09:51:07Z	-
dc.date.issued	2004	en_HK
dc.identifier.citation	Journal Of Intelligent Information Systems, 2004, v. 23 n. 1, p. 17-45	en_HK
dc.identifier.issn	0925-9902	en_HK
dc.identifier.uri	http://hdl.handle.net/10722/88999	-
dc.description.abstract	The design of an OLAP system for supporting real-time queries is one of the major research issues. One approach is to use data cubes, which are materialized precompiled multidimensional views of data in a data warehouse. We can derive a set of data cubes to answer each frequently asked query directly. However, there are two practical problems: (1) the maintenance cost of the data cubes, and (2) the query cost to answer those queries. Maintaining a data cube requires disk storage and CPU computation, so the maintenance cost is related to the total size as well as the total number of data cubes materialized. In most cases, materializing all data cubes is impractical. The maintenance cost may be reduced by merging some data cubes. However, the resulting larger data cubes will increase the query cost of answering some queries. If the bounds on the maintenance cost and the query cost are too strict, we help the user decide which queries to be sacrificed and not taken into consideration. We have defined an optimization problem in data cube system design. Given a maintenance-cost bound, a query-cost bound and a set of frequently asked queries, it is necessary to determine a set of data cubes such that the system can answer a largest subset of the queries without violating the two bounds. This is an NP-hard problem. We propose approximate Greedy algorithms GR, 2GM and 2GMM, which are shown to be both effective and efficient by experiments done on a census data set and a forest-cover-type data set.	en_HK
dc.language	eng	en_HK
dc.publisher	Springer New York LLC. The Journal's web site is located at http://springerlink.metapress.com/openurl.asp?genre=journal&issn=0925-9902	en_HK
dc.relation.ispartof	Journal of Intelligent Information Systems	en_HK
dc.subject	Approximate algorithm	en_HK
dc.subject	Data cube system design	en_HK
dc.subject	Data warehouse	en_HK
dc.subject	OLAP	en_HK
dc.subject	Optimization	en_HK
dc.title	Optimization in data cube system design	en_HK
dc.type	Article	en_HK
dc.identifier.openurl	http://library.hku.hk:4550/resserv?sid=HKU:IR&issn=0925-9902&volume=23&issue=1&spage=17&epage=45&date=2004&atitle=Optimization+in+Data+Cube+System+Design	en_HK
dc.identifier.email	Cheung, DW:dcheung@cs.hku.hk	en_HK
dc.identifier.email	Kao, B:kao@cs.hku.hk	en_HK
dc.identifier.authority	Cheung, DW=rp00101	en_HK
dc.identifier.authority	Kao, B=rp00123	en_HK
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1023/B:JIIS.0000029669.16825.54	en_HK
dc.identifier.scopus	eid_2-s2.0-3042834703	en_HK
dc.identifier.hkuros	93316	en_HK
dc.relation.references	http://www.scopus.com/mlt/select.url?eid=2-s2.0-3042834703&selection=ref&src=s&origin=recordpage	en_HK
dc.identifier.volume	23	en_HK
dc.identifier.issue	1	en_HK
dc.identifier.spage	17	en_HK
dc.identifier.epage	45	en_HK
dc.identifier.isi	WOS:000221745000002	-
dc.publisher.place	United States	en_HK
dc.identifier.scopusauthorid	Hung, E=7004256336	en_HK
dc.identifier.scopusauthorid	Cheung, DW=34567902600	en_HK
dc.identifier.scopusauthorid	Kao, B=35221592600	en_HK
dc.identifier.citeulike	33451	-
dc.identifier.issnl	0925-9902	-

File Download

Links for fulltext

(May Require Subscription)

Supplementary

Article: Optimization in data cube system design

Export via OAI-PMH Interface in XML Formats

OR

Export to Other Non-XML Formats