Advanced neural networks and their application for medical image segmentation

Qi, Wenbo; 祁文博

Appears in Collections:
- HKU Theses Online
- Electrical & Electronic Engineering: Theses

postgraduate thesis: Advanced neural networks and their application for medical image segmentation

Title	Advanced neural networks and their application for medical image segmentation
Authors	Qi, Wenbo 祁文博
Advisors	Chan, SC; Wu, YC
Issue Date	2025
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Qi, W. [祁文博]. (2025). Advanced neural networks and their application for medical image segmentation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.
Abstract	Convolutional neural networks have shown significant promise in medical image segmentation, providing crucial insights for early diagnosis, biopsy planning, and clinical therapies. However, accurate segmentation is challenging due to inherent image characteristics (e.g., low contrast and speckle noise in ultrasound), variations in target shape and size, and domain shift across datasets. This thesis aims to develop advanced neural networks for medical image segmentation. Specifically, we consider breast tumor segmentation from ultrasound images, ultrasound thyroid nodule segmentation, and domain generalization in prostate MRI segmentation. Firstly, we propose an innovative Multi-scale Dynamic Fusion Network (MDF-Net) to segment breast tumors in ultrasound images. It is structured as an end-to-end two-stage architecture, comprising a trunk sub-network responsible for multi-scale feature selection and a refinement sub-network that is optimized to enhance feature exploration and fusion, thereby minimizing impairments. Building upon UNet++, the trunk network features dense skip connections to facilitate connectivity between features across different scales. Additionally, we introduce multi-scale deep supervision to capture more discriminative features and attenuate inaccuracies stemming from speckle noise. The refinement sub-network leverages a structurally optimized MDF mechanism to enhance the initial segmentation at coarser scales and to capture inter-subject variation at finer scales. Evaluation on two publicly available datasets demonstrates that the proposed MDF-Net outperforms state-of-the-art approaches in terms of Dice coefficient and other evaluation metrics. Secondly, existing multi-task learning methods for thyroid nodule segmentation suffer from 1) the distribution gap between different datasets and 2) inconsistency in loss calculation for different tasks. We propose a novel STR-Net to address these issues.
Specifically, we propose a Multi-mix Data Augmentation that randomly crops the foreground and background of gland and nodule images and mixes them to generate new samples. Furthermore, we propose a new Thyroid-Region Prior Guided Refinement Network by adding Multi-scale Deep Supervision and a Nodule Refinement Structure. Moreover, a Teacher-Student Semi-supervised Framework is constructed with the proposed network to maintain consistency in the multi-task feature alignment. Finally, an Edge Distance Regularization method is proposed for post-processing to make nodule segmentation boundaries smoother and flatter. Extensive experiments on two datasets demonstrate the effectiveness of the method. Lastly, we propose a bidirectional Gated Recurrent Unit (GRU) based refinement network with simple and effective Patch Mixing and Risk Extrapolation (PMRE) schemes for multi-site prostate MRI segmentation. It employs a large convolution kernel-based multiscale feature encoder to extract multiscale features from consecutive 2D slices and a recurrent bidirectional ConvGRU-based contracting decoder to fuse the 3D segmentation features in a coarse-to-fine manner. To enhance the generalization capability and robustness of the network across diverse target domains, a novel PMRE domain generalization approach is introduced by leveraging data manipulation, network design optimizations, and risk extrapolation. Specifically, a Domain Patch Mixing mechanism, which interpolates patches from different domains, is proposed for effective data augmentation. A Segmentation Risk Extrapolation scheme is proposed to minimize the performance spread of the network over all the multi-site samples. Experimental results on six commonly used prostate MRI source domains show that the proposed framework performs better than state-of-the-art algorithms.
Degree	Doctor of Philosophy
Subject	Neural networks (Computer science); Diagnostic imaging - Data processing
Dept/Program	Electrical and Electronic Engineering
Persistent Identifier	http://hdl.handle.net/10722/356571
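
The abstract reports results in terms of the Dice coefficient. As a reminder of what that metric measures, here is a minimal sketch for two binary masks given as nested lists. The function name `dice_coefficient` and the choice to return 1.0 for two empty masks are assumptions made for illustration; the thesis's own evaluation code is not part of this record.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks (nested lists)."""
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumption: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x3 masks: each has 3 foreground pixels, 2 of which overlap.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground pixels.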

DC Field	Value	Language
dc.contributor.advisor	Chan, SC	-
dc.contributor.advisor	Wu, YC	-
dc.contributor.author	Qi, Wenbo	-
dc.contributor.author	祁文博	-
dc.date.accessioned	2025-06-05T09:31:11Z	-
dc.date.available	2025-06-05T09:31:11Z	-
dc.date.issued	2025	-
dc.identifier.citation	Qi, W. [祁文博]. (2025). Advanced neural networks and their application for medical image segmentation. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR.	-
dc.identifier.uri	http://hdl.handle.net/10722/356571	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.subject.lcsh	Neural networks (Computer science)	-
dc.subject.lcsh	Diagnostic imaging - Data processing	-
dc.title	Advanced neural networks and their application for medical image segmentation	-
dc.type	PG_Thesis	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Electrical and Electronic Engineering	-
dc.description.nature	published_or_final_version	-
dc.date.hkucongregation	2025	-
dc.identifier.mmsid	991044970873903414	-
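
The abstract describes a Segmentation Risk Extrapolation scheme that minimizes the spread of performance across sites. This record does not give its exact form. One common formulation of risk extrapolation in the domain generalization literature adds a penalty on the variance of the per-domain training risks, and the sketch below shows that variant only as an assumption. The name `variance_penalized_risk` and the weight `beta` are hypothetical.

```python
def variance_penalized_risk(domain_risks, beta=1.0):
    """Mean per-domain risk plus beta times the variance of those risks.

    Assumption: a variance-penalty form of risk extrapolation, not
    necessarily the scheme used in the thesis.
    """
    if not domain_risks:
        raise ValueError("at least one domain risk is required")

    n = len(domain_risks)
    mean_risk = sum(domain_risks) / n
    # Population variance: large when some sites are much harder than others.
    variance = sum((r - mean_risk) ** 2 for r in domain_risks) / n
    return mean_risk + beta * variance


# Hypothetical per-site segmentation losses with the same mean (0.4).
balanced = variance_penalized_risk([0.4, 0.4, 0.4], beta=10.0)
uneven = variance_penalized_risk([0.2, 0.4, 0.6], beta=10.0)
```

With equal mean risk, the uneven set of sites receives the larger objective value, so minimizing it pushes the network toward similar performance on every site.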
