File Download
Supplementary
-
Citations:
- Appears in Collections:
postgraduate thesis: Similarity search with earth mover's distance at scale
Title | Similarity search with earth mover's distance at scale |
---|---|
Authors | |
Advisors | |
Issue Date | 2013 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Tang, Y. [唐宇]. (2013). Similarity search with earth mover's distance at scale. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5204926 |
Abstract | Earth Mover's Distance (EMD), as a similarity measure, has received a lot of attention in the fields of multimedia and probabilistic databases, computer vision, image retrieval, machine learning, etc. EMD on multidimensional histograms provides better distinguishability between the objects approximated by the histograms (e.g., images), compared to classic measures like Euclidean distance. Despite its usefulness, EMD has a high computational cost; therefore, a number of effective filtering methods have been proposed, to reduce the pairs of histograms for which the exact EMD has to be computed, during similarity search. Still, EMD calculations in the refinement step remain the bottleneck of the whole similarity search process. In this thesis, we focus on optimizing the refinement phase of EMD-based similarity search by (i) adapting an efficient min-cost flow algorithm (SIA) for the EMD computation, (ii) proposing a dynamic distance bound, which is progressively updated and tightened during the refinement process and can be used to terminate an EMD refinement early, and (iii) proposing a dynamic refinement order for the candidates which, paired with a concurrent EMD refinement strategy, reduces the amount of needless computations. Our proposed techniques are orthogonal to and can be easily integrated with the state-of-the-art filtering techniques, reducing the cost of EMD-based similarity queries by orders of magnitude. |
Degree | Master of Philosophy |
Subject | Computer algorithms Information retrieval Image processing Electronic information resource searching |
Dept/Program | Computer Science |
Persistent Identifier | http://hdl.handle.net/10722/198819 |
HKU Library Item ID | b5204926 |
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Cheng, CK | - |
dc.contributor.advisor | Mamoulis, N | - |
dc.contributor.author | Tang, Yu | - |
dc.contributor.author | 唐宇 | - |
dc.date.accessioned | 2014-07-10T04:10:17Z | - |
dc.date.available | 2014-07-10T04:10:17Z | - |
dc.date.issued | 2013 | - |
dc.identifier.citation | Tang, Y. [唐宇]. (2013). Similarity search with earth mover's distance at scale. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. Retrieved from http://dx.doi.org/10.5353/th_b5204926 | - |
dc.identifier.uri | http://hdl.handle.net/10722/198819 | - |
dc.description.abstract | Earth Mover's Distance (EMD), as a similarity measure, has received a lot of attention in the fields of multimedia and probabilistic databases, computer vision, image retrieval, machine learning, etc. EMD on multidimensional histograms provides better distinguishability between the objects approximated by the histograms (e.g., images), compared to classic measures like Euclidean distance. Despite its usefulness, EMD has a high computational cost; therefore, a number of effective filtering methods have been proposed, to reduce the pairs of histograms for which the exact EMD has to be computed, during similarity search. Still, EMD calculations in the refinement step remain the bottleneck of the whole similarity search process. In this thesis, we focus on optimizing the refinement phase of EMD-based similarity search by (i) adapting an efficient min-cost flow algorithm (SIA) for the EMD computation, (ii) proposing a dynamic distance bound, which is progressively updated and tightened during the refinement process and can be used to terminate an EMD refinement early, and (iii) proposing a dynamic refinement order for the candidates which, paired with a concurrent EMD refinement strategy, reduces the amount of needless computations. Our proposed techniques are orthogonal to and can be easily integrated with the state-of-the-art filtering techniques, reducing the cost of EMD-based similarity queries by orders of magnitude. | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.rights | The author retains all proprietary rights, (such as patent rights) and the right to use in future works. | - |
dc.subject.lcsh | Computer algorithms | - |
dc.subject.lcsh | Information retrieval | - |
dc.subject.lcsh | Image processing | - |
dc.subject.lcsh | Electronic information resource searching | - |
dc.title | Similarity search with earth mover's distance at scale | - |
dc.type | PG_Thesis | - |
dc.identifier.hkul | b5204926 | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Computer Science | - |
dc.description.nature | published_or_final_version | - |
dc.identifier.doi | 10.5353/th_b5204926 | - |
dc.identifier.mmsid | 991036905599703414 | - |