DSpace Community: http://hdl.handle.net/10722/1045
Updated: 2024-03-29T05:55:42Z

URI: http://hdl.handle.net/10722/225165
Updated: 2019-09-18T07:00:01Z
Published: 2003-01-01T00:00:00Z
Title: A Data Cube Model for Prediction-based Web Prefetching
Authors: Yang, Q; Huang, JZ; Ng, KP
Abstract: Reducing web latency is one of the primary concerns of Internet research. Web caching and web prefetching are two effective techniques for latency reduction. A primary method for intelligent prefetching is to rank potential web documents using prediction models trained on past web server and proxy server log data, and to prefetch the highly ranked objects. For this method to work well, the prediction model must be updated constantly, and different queries must be answered efficiently. In this paper we present a data-cube model that represents Web access sessions for data mining in support of prediction model construction. The cube model organizes session data into three dimensions. With the data cube in place, we apply efficient data mining algorithms for clustering and correlation analysis. The resulting web page clusters can then be used to guide the prefetching system. We also propose an integrated web-caching and web-prefetching model in which the issues of prefetching aggressiveness, replacement policy and increased network traffic are addressed together in one framework. The core of our integrated solution is a prediction model based on statistical correlation between web objects. This model can be updated frequently by querying the data cube of web server logs. To our knowledge, this integrated data-cube and prediction-based prefetching framework is the first such effort.

URI: http://hdl.handle.net/10722/223762
Updated: 2020-11-23T06:56:20Z
Published: 2009-01-01T00:00:00Z
Title: A Semantic DOM Approach For Webpage Information Extraction
Authors: Fei, Y; Luo, Z; Xu, Y; Zhang, W
Abstract: With the development of electronic technology and e-commerce, Web page technology has attracted considerable research effort and has recently become one of the hottest topics. This paper proposes a semantic DOM (SDOM) approach for extracting information from e-commerce Web pages. By combining content and structure information, the approach achieves good precision and recall, as shown in our experiments on the listpage and tablepage data sets.

URI: http://hdl.handle.net/10722/223760
Updated: 2020-11-23T06:56:20Z
Published: 2008-01-01T00:00:00Z
Title: Building a Decision Cluster Classification Model for High Dimensional Data by a Variable Weighting k-Means Method
Authors: Li, Y; Hung, E; Chung, K; Huang, JZ
Abstract: In this paper, a new classification method (ADCC) for high dimensional data is proposed. In this method, a decision cluster classification (DCC) model consists of a set of disjoint decision clusters, each labeled with a dominant class that determines the class of new objects falling in the cluster. A cluster tree is first generated from a training data set by recursively calling a variable weighting k-means algorithm. The DCC model is then selected from the tree. The Anderson-Darling test is used to determine the stopping condition for tree growth. A series of experiments on both synthetic and real data sets has shown that the new classification method (ADCC) performed better in accuracy and scalability than the existing methods of k-NN, decision tree and SVM. It is particularly suitable for large, high dimensional data with many classes.

URI: http://hdl.handle.net/10722/223759
Updated: 2020-11-23T06:56:20Z
Published: 2008-01-01T00:00:00Z
Title: A Changing Window Approach to Exploring Gene Expression Patterns
Authors: Wang, Q; Ye, YM; Huang, JZ
Abstract: This paper presents a changing window approach to exploring gene expression patterns in 'snapshot windows'. A snapshot window is a sub-matrix of co-expressed microarray data representing a certain expression pattern. In this approach, we use a feature weighting k-means subspace clustering algorithm to generate a set of clusters; each cluster defines a set of 'snapshot windows' characterized by different sets of ordered sample weights assigned by the clustering algorithm. We define an accumulated weighting threshold (AWT) as the sum of the weights of the samples in a 'snapshot window'. Given a cluster, different 'snapshot windows' can be obtained by changing the AWT to explore all possible local expression patterns in the cluster. Experimental results show that our approach is effective and flexible in exploring various expression patterns and identifying novel ones.
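
Note: the accumulated weighting threshold (AWT) described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a window is formed by taking a cluster's samples in descending weight order until their accumulated weight reaches the threshold. The function name snapshot_window and the example weights are hypothetical.

```python
def snapshot_window(sample_weights, awt):
    """Return the samples forming a window for the threshold `awt`.

    Assumption (not stated in the abstract): samples are added in
    descending weight order until the accumulated weight reaches `awt`.
    """
    ordered = sorted(sample_weights.items(), key=lambda kv: kv[1], reverse=True)
    window, total = [], 0.0
    for sample, weight in ordered:
        if total >= awt:
            break
        window.append(sample)
        total += weight
    return window


# Hypothetical sample weights for one cluster, as a clustering
# algorithm might assign them.
weights = {"s1": 0.4, "s2": 0.3, "s3": 0.2, "s4": 0.1}

small = snapshot_window(weights, 0.5)  # lower AWT, narrower window
large = snapshot_window(weights, 0.8)  # higher AWT, wider window
```

Under this assumption, raising the AWT widens the window to include lower-weighted samples, which is one way to read "changing AWT" as a means of exploring different local patterns within a cluster.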