
postgraduate thesis: Relationship analysis for web content adaptation

Title: Relationship analysis for web content adaptation
Author(s): Lai, Po-yan (賴寶欣)
Advisor(s): Lau, FCM
Issue Date: 2014
Publisher: The University of Hong Kong (Pokfulam, Hong Kong)
Citation: Lai, P. [賴寶欣]. (2014). Relationship analysis for web content adaptation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5194775
Abstract: The use of mobile devices to access the World Wide Web is becoming more prevalent. When browsing webpages on small-screen devices, it is difficult to locate information of interest because the limited screen space can be densely packed with information. Browsing Web tables on small-screen devices is also a non-trivial problem: when a large table is fitted to a small screen, the association between data values and their corresponding headers may be disrupted, and it is difficult to locate information accurately once the meaning of the data is lost. For visually impaired users, the problem is even more challenging. Sequential presentation of a webpage by a screen reader is too time-consuming if the information of interest is placed at or near the end of the page. There is therefore a need to reorganize useful information in webpages to make it easier to find on small-screen devices. In this thesis, various adaptations are proposed by exploring and exploiting relationships between Web elements in the webpage. In the current literature, some proposed heuristics are based on specific HTML elements and cannot be generalized. Other algorithms assume a correct DOM structure and fail if the webpage is not properly marked up. Many algorithms extract blocks without assigning them proper titles. This gap needs to be filled so that extracted blocks are given proper titles by exploring the relationships between semantic elements. In this thesis, I propose to integrate relationship analysis and DOM-tree structure traversal to identify logical sections together with their section headings. By extracting all the section headings, a table of contents can be constructed to provide efficient direct access to sections of interest. Relationship analysis is a critical complement to the DOM structure for identifying the semantic content hierarchy when a webpage is not properly marked up.
By exploring relationships between table cells, the structure of an unstructured Web table can be extracted. The semantic meanings of the data values are retained by preserving the association between the data values and their corresponding headers. A novel way of accessing a webpage, which converts the page itself and its Web table into a menu-based presentation, is then proposed. Converting the webpage into an Interactive Voice Response System introduces yet another mode of access, which can enhance the accessibility of the webpage. In addition to improving mobile accessibility, the proposed adaptations can also benefit visually impaired users. Experiments show that adaptation with direct access improves average effectiveness and efficiency by 18% and 15% respectively, compared with the case without adaptation. Adapting the Web table into a series of menu pages improves effectiveness and efficiency by 61% and 37% respectively. In the evaluations with visually impaired users, adaptation with direct access improves efficiency by 85%. Some complicated Web tables could not be properly interpreted by visually impaired users at all; the Web table adaptation makes them accessible. Information finding becomes more efficient and effective when using the adapted versions.
Degree: Doctor of Philosophy
Subject: Web sites - Design
Dept/Program: Computer Science
Persistent Identifier: http://hdl.handle.net/10722/197531
HKU Library Item ID: b5194775
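
The abstract describes building a table of contents from extracted section headings, including on pages that are not properly marked up. The Python sketch below illustrates that general idea only; it is not the thesis's relationship analysis. It assumes a deliberately simple stand-in heuristic: a short paragraph whose entire text is bold is treated as a heading one level below the last real `<h1>`-`<h6>` heading. The names `HeadingCollector`, `build_toc`, `MAX_HEADING_CHARS`, and the sample page are hypothetical.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BOLD_TAGS = {"b", "strong"}
MAX_HEADING_CHARS = 60  # hypothetical cutoff for a "short" bold paragraph


class HeadingCollector(HTMLParser):
    """Collect (level, text) heading candidates in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._block = None        # tag of the block currently being captured
        self._text = []           # all text inside that block
        self._bold_text = []      # text inside <b>/<strong> within that block
        self._bold_depth = 0
        self._last_real_level = 0  # level of the last <h1>-<h6> seen

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS or tag == "p":
            self._block = tag
            self._text, self._bold_text, self._bold_depth = [], [], 0
        elif tag in BOLD_TAGS and self._block:
            self._bold_depth += 1

    def handle_endtag(self, tag):
        if tag in BOLD_TAGS and self._bold_depth:
            self._bold_depth -= 1
        elif tag == self._block:
            text = " ".join("".join(self._text).split())
            bold = " ".join("".join(self._bold_text).split())
            if tag in HEADING_TAGS and text:
                self._last_real_level = int(tag[1])
                self.headings.append((self._last_real_level, text))
            elif tag == "p" and text and text == bold and len(text) <= MAX_HEADING_CHARS:
                # Assumed heuristic: an all-bold short paragraph acts as a heading.
                self.headings.append((self._last_real_level + 1, text))
            self._block = None

    def handle_data(self, data):
        if self._block:
            self._text.append(data)
            if self._bold_depth:
                self._bold_text.append(data)


def build_toc(html):
    """Return (level, heading text, anchor id) entries for a table of contents."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return [(level, text, f"section-{i}")
            for i, (level, text) in enumerate(collector.headings, 1)]


# Made-up sample page: two section headings are marked up only as bold text.
PAGE = """
<h1>City Guide</h1>
<p><b>Getting there</b></p>
<p>Take the <b>airport express</b> to the centre.</p>
<p><strong>Where to stay</strong></p>
<p>Hotels cluster around the harbour.</p>
"""

for level, text, anchor in build_toc(PAGE):
    print("  " * (level - 1) + f"{text} (#{anchor})")
```

The second paragraph is skipped because only part of its text is bold, which shows how a heuristic like this can separate visual headings from inline emphasis. A real adaptation system would need stronger evidence than bold text alone.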

 

DC Field: Value
dc.contributor.advisor: Lau, FCM
dc.contributor.author: Lai, Po-yan
dc.contributor.author: 賴寶欣
dc.date.accessioned: 2014-05-27T23:16:41Z
dc.date.available: 2014-05-27T23:16:41Z
dc.date.issued: 2014
dc.identifier.citation: Lai, P. [賴寶欣]. (2014). Relationship analysis for web content adaptation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5194775
dc.identifier.uri: http://hdl.handle.net/10722/197531
dc.language: eng
dc.publisher: The University of Hong Kong (Pokfulam, Hong Kong)
dc.relation.ispartof: HKU Theses Online (HKUTO)
dc.rights: The author retains all proprietary rights, (such as patent rights) and the right to use in future works.
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject.lcsh: Web sites - Design
dc.title: Relationship analysis for web content adaptation
dc.type: PG_Thesis
dc.identifier.hkul: b5194775
dc.description.thesisname: Doctor of Philosophy
dc.description.thesislevel: Doctoral
dc.description.thesisdiscipline: Computer Science
dc.description.nature: published_or_final_version
dc.identifier.doi: 10.5353/th_b5194775
dc.identifier.mmsid: 991036878769703414
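
The abstract describes adapting a Web table into a series of menu pages while keeping each data value attached to its header. The Python sketch below shows one way such a conversion could look. It assumes the header structure is already known (first row holds column headers, first column holds row headers), so it does not cover the thesis's step of extracting structure from an unstructured table. The function names `table_to_menu` and `render_menu_page` and the timetable data are made up for illustration.

```python
def table_to_menu(rows):
    """Map each row header to a list of "column header: value" strings.

    Assumes rows[0] holds the column headers and the first cell of every
    other row is that row's header.
    """
    header, *body = rows
    column_headers = header[1:]
    menu = {}
    for row in body:
        row_header, *values = row
        # Pair every value with its column header so the meaning is not lost
        # when the value is shown on its own small-screen page.
        menu[row_header] = [f"{col}: {val}"
                            for col, val in zip(column_headers, values)]
    return menu


def render_menu_page(title, options):
    """Render one menu page as numbered plain-text options."""
    lines = [title] + [f"{i}. {opt}" for i, opt in enumerate(options, 1)]
    return "\n".join(lines)


# Made-up illustrative timetable, not data from the thesis.
TABLE = [
    ["Route", "Departs", "Arrives", "Fare"],
    ["Ferry A", "08:00", "08:45", "$12"],
    ["Ferry B", "09:30", "10:20", "$15"],
]

menu = table_to_menu(TABLE)

# Top-level page: choose a row by its header.
print(render_menu_page(TABLE[0][0], list(menu)))
print()
# Second-level page: the chosen row, with every value labelled by its header.
print(render_menu_page("Ferry B", menu["Ferry B"]))
```

Each page holds only a few short lines, so it fits a small screen, and the numbered options could equally be read out by a screen reader or a voice menu.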
