
Presentation: Beyond bibliographic metadata, augmenting the HKU IR

Title: Beyond bibliographic metadata, augmenting the HKU IR
Authors: Palmer, DT; Lo, CY; Liu, E
Keywords: Knowledge Exchange; Theses; Patents; Grants; CRIS
Issue Date: 2012
Citation: Open Repositories 2012, Edinburgh, UK, 9 July 2012
Abstract: The HKU Scholars Hub (The Hub) is the institutional repository of the University of Hong Kong (HKU), and has placed items in open access since 2005. In 2009 the university launched a new initiative, Knowledge Exchange, which recognizes the third mission of universities: to engage with their communities for bilateral benefit. Under this initiative The Hub received funding from the HKU Office of Knowledge Exchange to make HKU research and researchers highly visible, in the expectation that this will greatly increase opportunities for contract and collaborative research, the creation of cross-institutional multidisciplinary teams for e-research, etc. To meet this need, The Hub, which is hosted in DSpace, had to quickly augment its existing bibliographic metadata, add metadata on other objects such as people, grants, and patents, and augment that metadata as well.

Bibliographic Metadata: We receive a data feed of publication metadata from the HKU Research Output System (ROS) database, managed by the HKU Registry. This metadata is usually dirty and thin. To improve it, we created several procedures:
1) A web service on ROS that receives a DOI or ISBN and then auto-populates the record with data from CrossRef or OCLC, without any other input from the user. Our next planned step is to refuse entry into ROS without a DOI or ISBN in most cases.
2) Thin metadata from ROS, and even data from CrossRef or OCLC, can be augmented or further cleansed to add more value and, of course, more access points for search engine optimization. We have had an ongoing project for several years to clean HKU metadata in Scopus, so the Scopus AU-IDs for HKU people are “mostly” correct. We now use the Scopus API, matching on the corresponding AU-IDs, to bring back publisher-supplied metadata on HKU publications. After giving HKU authors one last chance to disclaim ownership of this metadata, we overlay our existing thin and dirty metadata with full and correct metadata.
3) To show metadata for publications that authors published before their HKU tenure, we retrieve RIS or XML data from Scopus, or from author-supplied EndNote files.
4) With a correct DOI, we can bring in, on the fly, the corresponding citation counts from Scopus, Web of Science, PubMed Central, etc.

Author Metadata: We created ResearcherPages for each of the approximately 1,400 scholars eligible to apply for grant funding. These pages show their publication lists, linked to full text or to remotely subscribed full text. However, we soon found that some disciplines do not value publications as indications of scholarly worth. For these we added other types of metadata, such as Awards, which greatly pleased the Faculty of Architecture and those receiving teaching awards. Other facets that we added include HKU Committees, Supervision of Graduate Students, Grants, Bibliometrics, etc.

Thesis Metadata: HKU has three separate, and separately controlled, silos for thesis metadata and full-text objects, each with issues of dirty or thin metadata: a) the Graduate School, b) the Student Information Service (Registry), and c) library catalogue records. The first two are dark databases, allowing no public access. We have had to create processes to update each silo with supplementary data from the others. For The Hub, this means that we can now show, for each supervisor, a complete list of supervised students, with each correct thesis title hyperlinked to the full-text thesis, also in The Hub. At the other end of the hyperlink, the thesis record now shows a link to its supervisor. We have become a publisher in CrossRef and are issuing a DOI to each HKU thesis. This will make theses more visible and eventually allow us to show citation counts beside each thesis.

Grant Metadata: We import this metadata from a dark database controlled by the Registry. It is now publicly accessible in The Hub, with hyperlinks to the ResearcherPages of the PI and Co-Is, as well as to grants with a similar sponsor, panel, etc. Recently we found ways to hyperlink from a grant record to a publication resulting from that grant.

Patent Metadata: We receive metadata from the HKU Technology Transfer Office, which is limited and focused upon patents in the US. We have built VB.NET scripts that take the patent publication or issued number and search the patent offices of the US, China, the EU, Japan, and others to bring back complete metadata, including priority dates, patent numbers of applications and continuations, and numbers of corresponding patents by the same HKU inventors in other jurisdictions, also known as the “patent family”. We then use these newly received numbers to retrieve metadata for their complete records and display them in The Hub. In the end, we have patent records that show a complete history from application to issuance, with hyperlinks to patents in the same family.

All of this additional metadata, which is augmented and becoming cleaner, has increased the utility of The Hub and enabled content re-use:
1) We can now show hyperlinked visualizations of networked people, based upon co-authorship, co-investigatorship, same committee, same keyword in publication, patent, research interest, etc.
2) We provide a web service that lets HKU departments extract metadata from The Hub and build author profiles in their departmental pages.
3) Some HKU offices, realizing that we can present better data and attract more eyes, have asked us to take over services that they once provided.
4) We can send files of augmented and clean metadata back to our original data providers, thus allowing their records to benefit from our work.
5) We use an API provided by Thomson Reuters to auto-create and auto-populate the TR ResearcherID for each of our 1,400 authors.

Each Hub record on people, publications, grants, and patents will show download and view counts. These internal measures, anecdotes, and external measures such as the Webometrics “Ranking Web of World Repositories” indicate that we have succeeded in making visible the research and researchers of HKU.
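
The DOI-driven auto-population and overlay described under Bibliographic Metadata can be sketched roughly as follows. This is a minimal illustration and not the ROS web service itself: it assumes an input shaped like the `message` object returned by the present-day public Crossref REST API (`https://api.crossref.org/works/{doi}`), and the function names `flatten_crossref` and `overlay`, the flat field names, and the sample records are all hypothetical placeholders.

```python
from typing import Any


def flatten_crossref(message: dict[str, Any]) -> dict[str, Any]:
    """Reduce a Crossref-style work record to a flat metadata dict."""
    # Build "Family, Given" strings, tolerating a missing name part.
    authors = [
        ", ".join(p for p in (a.get("family"), a.get("given")) if p)
        for a in message.get("author", [])
    ]
    # "issued" holds nested date parts such as [[2012, 7]]; keep the year.
    date_parts = message.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    return {
        "doi": message.get("DOI"),
        "title": (message.get("title") or [None])[0],
        "authors": authors,
        "journal": (message.get("container-title") or [None])[0],
        "year": year,
    }


def overlay(thin: dict[str, Any], full: dict[str, Any]) -> dict[str, Any]:
    """Overlay authoritative values on a thin record.

    A field from `full` wins whenever it is non-empty; fields that only
    the thin record carries are kept unchanged.
    """
    merged = dict(thin)
    for key, value in full.items():
        if value not in (None, "", []):
            merged[key] = value
    return merged


# Placeholder data, invented for illustration only (not a real DOI or record).
thin_record = {"doi": "10.1234/example", "title": "sample ttl", "ros_id": "R-001"}
crossref_message = {
    "DOI": "10.1234/example",
    "title": ["A Sample Title"],
    "author": [{"given": "A.", "family": "Author"}],
    "container-title": ["Journal of Examples"],
    "issued": {"date-parts": [[2012, 7]]},
}

record = overlay(thin_record, flatten_crossref(crossref_message))
print(record)
```

In this sketch the overlay replaces a local value only when the external source supplies a non-empty one, so local identifiers such as the hypothetical `ros_id` survive. The step in which authors get a last chance to disclaim ownership before the overlay is omitted here.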
Persistent Identifier: http://hdl.handle.net/10722/152525

 

DC Field | Value | Language
dc.contributor.author | Palmer, DT | -
dc.contributor.author | Lo, CY | -
dc.contributor.author | Liu, E | -
dc.date.accessioned | 2012-07-06T09:19:28Z | -
dc.date.available | 2012-07-06T09:19:28Z | -
dc.date.issued | 2012 | -
dc.identifier.citation | Open Repositories 2012, Edinburgh, UK, 9 July 2012 | en_US
dc.identifier.uri | http://hdl.handle.net/10722/152525 | -
dc.description.abstract | (identical to the Abstract above) | en_US
dc.language.iso | eng | en_US
dc.relation.ispartof | Open Repositories 2012 | -
dc.subject | Knowledge Exchange | en_US
dc.subject | Theses | en_US
dc.subject | Patents | en_US
dc.subject | Grants | en_US
dc.subject | CRIS | en_US
dc.title | Beyond bibliographic metadata, augmenting the HKU IR | en_US
dc.type | Presentation | en_US
dc.description.nature | published_or_final_version | -
dc.identifier.hkuros | 205621 | -
