Deep learning based image analysis with enhanced reasoning capability

Zhao, Gangming; 趙剛明

Appears in Collections:
- HKU Theses Online
- Computer Science: Theses

postgraduate thesis: Deep learning based image analysis with enhanced reasoning capability

Title	Deep learning based image analysis with enhanced reasoning capability
Authors	Zhao, Gangming 趙剛明
Advisors	Yu, Y
Issue Date	2024
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Zhao, G. [趙剛明]. (2024). Deep learning based image analysis with enhanced reasoning capability. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Creating high-quality hybrid models interactively, whether based on Convolutional Neural Networks (CNNs) or on cross-module learning from diverse feature representations, usually runs into the challenging task of combining multiple kinds of latent information from a sparse input, such as a semantic object, a medical image, or an image with special topology. In this thesis, we use deep learning strategies to present novel algorithms for three problems: representing basic object semantic information, creating a hybrid model of CNNs and Graph Neural Networks (GNNs) for 3D nodule recognition, and learning special topological information for 3D medical vessel images.
First, to represent object semantic information, we propose a novel Graph Feature Pyramid Network (GraphFPN). Feature pyramids have proven powerful in image understanding tasks that require multi-scale features. State-of-the-art methods for multi-scale feature learning focus on performing feature interactions across space and scales using neural networks with a fixed topology. In this part, we propose graph feature pyramid networks capable of adapting their topological structures to varying intrinsic image structures and supporting simultaneous feature interactions across all scales. We first define an image-specific superpixel hierarchy for each input image to represent its intrinsic image structures. The graph feature pyramid network inherits its structure from this superpixel hierarchy. Contextual and hierarchical layers are designed to achieve feature interactions within the same scale and across different scales, respectively.
Second, in clinical practice, doctors often use attributes, e.g. morphological and appearance characteristics of a lesion, to aid disease diagnosis. Effectively modeling all relationships among attributes could boost the accuracy of medical image diagnosis. In this part, we introduce a hybrid neural-probabilistic reasoning algorithm for interpretable attribute-based medical image diagnosis. Our hybrid algorithm has two parallel branches: a Bayesian network branch performing probabilistic causal relationship reasoning, and a graph convolutional network branch performing more generic relational modeling and reasoning using a feature representation. Tight coupling between these two branches is achieved via a cross-network attention mechanism and the fusion of their classification results.
Finally, vessel segmentation is widely used to help with vascular disease diagnosis. Vessels reconstructed using existing methods are often not sufficiently accurate to meet clinical use standards, because 3D vessel structures are highly complicated and exhibit unique characteristics, including sparsity and anisotropy. In this part, we propose a novel hybrid deep neural network for vessel segmentation. Our network consists of two cascaded subnetworks performing initial and refined segmentation, respectively. The second subnetwork in turn has two tightly coupled components, a traditional CNN-based U-Net and a graph U-Net. Cross-network multi-scale feature fusion is performed between these two U-shaped networks to effectively support high-quality vessel segmentation. The entire cascaded network can be trained end to end. The graph in the second subnetwork is constructed according to a vessel probability map as well as appearance and semantic similarities in the original CT volume.
Degree	Doctor of Philosophy
Subject	Image analysis - Data processing; Deep learning (Machine learning)
Dept/Program	Computer Science
Persistent Identifier	http://hdl.handle.net/10722/343770
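
The abstract describes contextual layers (feature interactions within one scale) and hierarchical layers (interactions across scales) on a graph that inherits its structure from a superpixel hierarchy. The sketch below is only a rough illustration of that idea and not the thesis's actual layers: plain mean aggregation stands in for the learned layers, and the node names, the two-level hierarchy, the feature values, and the `propagate` helper are all hypothetical.

```python
from collections import defaultdict


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def propagate(features, edges):
    """One round of mean aggregation over undirected edges.

    Each node's new feature is the mean of its own feature and the
    features of its neighbours. This is an assumed stand-in for a
    learned graph layer, not the method used in the thesis.
    """
    neighbours = defaultdict(list)
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    return {
        node: mean([feat] + [features[m] for m in neighbours[node]])
        for node, feat in features.items()
    }


# Hypothetical two-level superpixel hierarchy: four fine superpixels
# (f0..f3) merged pairwise into two coarse superpixels (c0, c1).
features = {
    "f0": [1.0, 0.0], "f1": [3.0, 0.0], "f2": [0.0, 2.0], "f3": [0.0, 4.0],
    "c0": [0.0, 0.0], "c1": [0.0, 0.0],
}

# Contextual edges link adjacent superpixels at the same scale.
contextual_edges = [("f0", "f1"), ("f1", "f2"), ("f2", "f3"), ("c0", "c1")]
# Hierarchical edges link each fine superpixel to its coarse parent.
hierarchical_edges = [("f0", "c0"), ("f1", "c0"), ("f2", "c1"), ("f3", "c1")]

after_contextual = propagate(features, contextual_edges)
after_hierarchical = propagate(after_contextual, hierarchical_edges)
```

Because the edge lists are derived from the hierarchy of one particular image, a different image would yield a different graph, which is the sense in which the topology adapts to image structure in this toy setting.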

DC Field	Value	Language
dc.contributor.advisor	Yu, Y	-
dc.contributor.author	Zhao, Gangming	-
dc.contributor.author	趙剛明	-
dc.date.accessioned	2024-06-06T01:04:51Z	-
dc.date.available	2024-06-06T01:04:51Z	-
dc.date.issued	2024	-
dc.identifier.citation	Zhao, G. [趙剛明]. (2024). Deep learning based image analysis with enhanced reasoning capability. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/343770	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Image analysis - Data processing	-
dc.subject.lcsh	Deep learning (Machine learning)	-
dc.title	Deep learning based image analysis with enhanced reasoning capability	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Computer Science	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2024	-
dc.identifier.mmsid	991044809206303414	-