Artificial training samples for the improvement of pattern recognition systems

Ni, Zhibo.; 倪志博.

DOI: 10.5353/th_b4784964

Appears in Collections:
- Electrical & Electronic Engineering: Theses
- HKU Theses Online

Postgraduate thesis: Artificial training samples for the improvement of pattern recognition systems

Title	Artificial training samples for the improvement of pattern recognition systems
Authors	Ni, Zhibo.; 倪志博.
Advisors	Leung, CH
Issue Date	2012
Publisher	The University of Hong Kong (Pokfulam, Hong Kong)
Citation	Ni, Z. [倪志博]. (2012). Artificial training samples for the improvement of pattern recognition systems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4784964
Abstract	Pattern recognition is the assignment of a label to a given input value or instance according to a specific learning algorithm. Recognition performance is directly linked to the quality and size of the training data. However, in many real pattern recognition applications, such as face recognition or Chinese character recognition, it is difficult or inconvenient to collect enough samples to train the classifier. In view of this shortage of training samples, the main objective of our research is to investigate the generation and use of artificial samples for improving recognition performance. Besides enhancing learning, artificial samples are also used in a novel way so that a conventional Chinese character recognizer can read half or combined Chinese character segments. This greatly simplifies the segmentation procedure and reduces the error introduced by segmentation. Two novel generation models have been developed to evaluate the effectiveness of supplementing the training data with artificial samples. One model generates artificial faces with various facial expressions or lighting conditions by morphing and warping two given sample faces. We tested our face generation model on three popular 2D face databases, which contain both grayscale and color images. Experiments show that the generated faces look quite natural and improve recognition rates by a large margin. The other model uses stroke and radical information to build new Chinese characters. Artificial Chinese characters are produced by Bezier curves passing through specified points. With variations at both the stroke level and the radical level, this model is more flexible in generating artificial handwritten characters than merely distorting genuine samples. Another feature of this character generation model is that it does not require any real handwritten character samples.
In other words, we can train a conventional character classifier and perform character recognition tasks without collecting handwritten samples. Experimental results confirm that this is feasible, and the recognition rate remains acceptable. Besides tackling the small sample size problem in face recognition and isolated character recognition, we improve the performance of a bank check legal amount recognizer by proposing character segment recognition and applying a Hidden Markov Model (HMM). It is hoped that this thesis can provide some insights for future research in artificial sample generation, face morphing, Chinese character segmentation, text recognition, and related issues.
Degree	Doctor of Philosophy
Subject	Pattern recognition systems.
Dept/Program	Electrical and Electronic Engineering
Persistent Identifier	http://hdl.handle.net/10722/174522
HKU Library Item ID	b4784964

DC Field	Value	Language
dc.contributor.advisor	Leung, CH	-
dc.contributor.author	Ni, Zhibo.	-
dc.contributor.author	倪志博.	-
dc.date.issued	2012	-
dc.identifier.citation	Ni, Z. [倪志博]. (2012). Artificial training samples for the improvement of pattern recognition systems. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b4784964	-
dc.identifier.uri	http://hdl.handle.net/10722/174522	-
dc.language	eng	-
dc.publisher	The University of Hong Kong (Pokfulam, Hong Kong)	-
dc.relation.ispartof	HKU Theses Online (HKUTO)	-
dc.rights	The author retains all proprietary rights, (such as patent rights) and the right to use in future works.	-
dc.rights	This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.	-
dc.source.uri	http://hub.hku.hk/bib/B47849642	-
dc.subject.lcsh	Pattern recognition systems.	-
dc.title	Artificial training samples for the improvement of pattern recognition systems	-
dc.type	PG_Thesis	-
dc.identifier.hkul	b4784964	-
dc.description.thesisname	Doctor of Philosophy	-
dc.description.thesislevel	Doctoral	-
dc.description.thesisdiscipline	Electrical and Electronic Engineering	-
dc.description.nature	published_or_final_version	-
dc.identifier.doi	10.5353/th_b4784964	-
dc.date.hkucongregation	2012	-
dc.identifier.mmsid	991033486089703414	-
