Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1016/j.jclepro.2024.144572
- WOS: WOS:001409239500001
Article: ESGReveal: An LLM-based approach for extracting structured data from ESG reports
| Title | ESGReveal: An LLM-based approach for extracting structured data from ESG reports |
|---|---|
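| Authors | Zou, Yi; Shi, Mengying; Chen, Zhongjie; Deng, Zhu; Lei, Zongxiong; Zeng, Zihan; Yang, Shiming; Tong, Hongxiang; Xiao, Lei; Zhou, Wenwen |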
| Issue Date | 15-Jan-2025 |
| Publisher | Elsevier |
| Citation | Journal of Cleaner Production, 2025, v. 489 |
| Abstract | As ESG reports are an important source for disclosing a company's environmental, social, and governance (ESG) performance, stock exchanges are gradually strengthening their requirements for listed companies to submit these reports periodically. However, these documents are often unstructured, making it difficult to evaluate a company's disclosure level and performance quantitatively. In this study, we develop a quantitative framework, ESGReveal, for assessing corporate ESG performance based on large language model (LLM) techniques. Specifically, by integrating retrieval-augmented generation (RAG) technology with LLMs, we extract relevant performance data from complex corporate ESG reports. The ESGReveal framework consists of three primary modules: an ESG Metadata module for standardized queries, a Report Preprocessing module for database construction, and an LLM Agent module for data extraction. We evaluated the performance of various LLMs, including GPT-3.5, GPT-4, ChatGLM, and QWEN, and found that GPT-4 achieved 76.9% accuracy in data extraction and 83.7% accuracy in disclosure analysis, showing the largest improvement over baseline models. We applied the ESGReveal model to 2249 ESG reports published by 166 companies across 12 industries listed on the Hong Kong Stock Exchange (HKEx), analyzing the disclosure and performance of key ESG indicators. Results show that for mandatory environmental and social indicators required by HKEx, the sample companies achieved disclosure rates of 69.5% and 57.2%, respectively. Different industries exhibited varying performance in key ESG indicators, such as the proportion of direct and indirect greenhouse gas emissions, highlighting key areas for future emission reduction efforts. These findings underscore the need to strengthen ESG practices across sectors and emphasize both general and sector-specific ESG initiatives. In summary, by leveraging the capabilities of LLM and RAG technologies, ESGReveal offers a practical and efficient solution to the pressing need for consistent and accurate ESG information retrieval. |
| Persistent Identifier | http://hdl.handle.net/10722/358232 |
| ISSN | 0959-6526 (2023 Impact Factor: 9.7; 2023 SCImago Journal Rankings: 2.058) |
| ISI Accession Number ID | WOS:001409239500001 |

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Zou, Yi | - |
| dc.contributor.author | Shi, Mengying | - |
| dc.contributor.author | Chen, Zhongjie | - |
| dc.contributor.author | Deng, Zhu | - |
| dc.contributor.author | Lei, Zongxiong | - |
| dc.contributor.author | Zeng, Zihan | - |
| dc.contributor.author | Yang, Shiming | - |
| dc.contributor.author | Tong, Hongxiang | - |
| dc.contributor.author | Xiao, Lei | - |
| dc.contributor.author | Zhou, Wenwen | - |
| dc.date.accessioned | 2025-07-26T00:30:30Z | - |
| dc.date.available | 2025-07-26T00:30:30Z | - |
| dc.date.issued | 2025-01-15 | - |
| dc.identifier.citation | Journal of Cleaner Production, 2025, v. 489 | - |
| dc.identifier.issn | 0959-6526 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/358232 | - |
| dc.description.abstract | As ESG reports are an important source for disclosing a company's environmental, social, and governance (ESG) performance, stock exchanges are gradually strengthening their requirements for listed companies to submit these reports periodically. However, these documents are often unstructured, making it difficult to evaluate a company's disclosure level and performance quantitatively. In this study, we develop a quantitative framework, ESGReveal, for assessing corporate ESG performance based on large language model (LLM) techniques. Specifically, by integrating retrieval-augmented generation (RAG) technology with LLMs, we extract relevant performance data from complex corporate ESG reports. The ESGReveal framework consists of three primary modules: an ESG Metadata module for standardized queries, a Report Preprocessing module for database construction, and an LLM Agent module for data extraction. We evaluated the performance of various LLMs, including GPT-3.5, GPT-4, ChatGLM, and QWEN, and found that GPT-4 achieved 76.9% accuracy in data extraction and 83.7% accuracy in disclosure analysis, showing the largest improvement over baseline models. We applied the ESGReveal model to 2249 ESG reports published by 166 companies across 12 industries listed on the Hong Kong Stock Exchange (HKEx), analyzing the disclosure and performance of key ESG indicators. Results show that for mandatory environmental and social indicators required by HKEx, the sample companies achieved disclosure rates of 69.5% and 57.2%, respectively. Different industries exhibited varying performance in key ESG indicators, such as the proportion of direct and indirect greenhouse gas emissions, highlighting key areas for future emission reduction efforts. These findings underscore the need to strengthen ESG practices across sectors and emphasize both general and sector-specific ESG initiatives. In summary, by leveraging the capabilities of LLM and RAG technologies, ESGReveal offers a practical and efficient solution to the pressing need for consistent and accurate ESG information retrieval. | - |
| dc.language | eng | - |
| dc.publisher | Elsevier | - |
| dc.relation.ispartof | Journal of Cleaner Production | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.title | ESGReveal: An LLM-based approach for extracting structured data from ESG reports | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1016/j.jclepro.2024.144572 | - |
| dc.identifier.volume | 489 | - |
| dc.identifier.eissn | 1879-1786 | - |
| dc.identifier.isi | WOS:001409239500001 | - |
| dc.identifier.issnl | 0959-6526 | - |
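
The abstract describes three modules (ESG Metadata, Report Preprocessing, LLM Agent) and reports disclosure rates. The sketch below is only a toy illustration of how such a pipeline might fit together, not the authors' implementation. Every name in it (`Indicator`, `preprocess`, `retrieve`, `extract`, `disclosure_rate`, the sample report text) is hypothetical. It also makes two loud substitutions: word-overlap ranking stands in for the retrieval step of RAG, and a regular expression stands in for the LLM call.

```python
import re
from dataclasses import dataclass


@dataclass
class Indicator:
    """One entry of a hypothetical ESG metadata catalogue (name + standardized query)."""
    name: str
    query: str


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def preprocess(report_text):
    """Stand-in for report preprocessing: split a report into paragraph chunks."""
    return [chunk.strip() for chunk in report_text.split("\n\n") if chunk.strip()]


def retrieve(query, chunks, k=1):
    """Toy retriever: rank chunks by word overlap with the query (assumption, not real RAG)."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:k]


def extract(indicator, context):
    """Stub for the LLM agent: return the first number in a chunk that shares words with the query."""
    q = tokenize(indicator.query)
    for chunk in context:
        match = re.search(r"\d[\d,]*(?:\.\d+)?", chunk)
        if match and q & tokenize(chunk):
            return float(match.group().replace(",", ""))
    return None  # treated as "not disclosed"


def run(report_text, indicators):
    """Run the three stages for every indicator and collect the extracted values."""
    chunks = preprocess(report_text)
    return {i.name: extract(i, retrieve(i.query, chunks)) for i in indicators}


def disclosure_rate(results):
    """Share of indicators for which a value was found."""
    return sum(v is not None for v in results.values()) / len(results)


# Made-up report text and indicator list, for illustration only.
REPORT = """Direct greenhouse gas emissions were 1,250 tonnes CO2e.

Total water consumption was 3400 cubic metres.

The board met four times during the year."""

INDICATORS = [
    Indicator("ghg_direct", "direct greenhouse gas emissions"),
    Indicator("water", "water consumption"),
    Indicator("injuries", "work related injuries"),
]

results = run(REPORT, INDICATORS)
rate = disclosure_rate(results)
```

In this toy run the third indicator has no matching passage, so it counts as undisclosed; a rate computed this way is the kind of per-indicator disclosure figure the abstract refers to, under the assumption that "disclosed" means "a value could be extracted".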
