Recursive visual sound separation using minus-plus net

Xu, Xudong; Dai, Bo; Lin, Dahua

Publisher Website: 10.1109/ICCV.2019.00097
Scopus: eid_2-s2.0-85081894297
WOS: WOS:000531438101001

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- HKU Musketeers Foundation Institute of Data Science: Conference papers

Conference Paper: Recursive visual sound separation using minus-plus net

Title	Recursive visual sound separation using minus-plus net
Authors	Xu, Xudong; Dai, Bo; Lin, Dahua
Issue Date	2019
Citation	Proceedings of the IEEE International Conference on Computer Vision, 2019, v. 2019-October, p. 882-891. DOI: http://dx.doi.org/10.1109/ICCV.2019.00097
Abstract	Sounds provide rich semantics, complementary to visual data, for many tasks. However, in practice, sounds from multiple sources are often mixed together. In this paper we propose a novel framework, referred to as MinusPlus Network (MP-Net), for the task of visual sound separation. MP-Net separates sounds recursively in the order of average energy, removing the separated sound from the mixture at the end of each prediction, until the mixture becomes empty or contains only noise. In this way, MP-Net could be applied to sound mixtures with arbitrary numbers and types of sounds. Moreover, while MP-Net keeps removing sounds with large energy from the mixture, sounds with small energy could emerge and become clearer, so that the separation is more accurate. Compared to previous methods, MP-Net obtains state-of-the-art results on two large scale datasets, across mixtures with different types and numbers of sounds.
Persistent Identifier	http://hdl.handle.net/10722/352188
ISSN	1550-5499
2023 SCImago Journal Rankings	12.263
ISI Accession Number ID	WOS:000531438101001
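
The recursive procedure described in the abstract (separate one sound, remove it from the mixture, repeat until the mixture is empty or only noise remains) can be illustrated with a toy sketch. This is not the authors' implementation: the record does not describe MP-Net's networks or its visual inputs, so `separate_one`, `loudest_band_separator`, `noise_floor`, and `max_steps` are hypothetical names introduced here, and the stand-in separator simply assumes each sound occupies its own known frequency band. Spectrogram entries are treated as nonnegative energies.

```python
import numpy as np


def recursive_separate(mixture, separate_one, noise_floor=1e-3, max_steps=10):
    """Peel sounds off a spectrogram one at a time (toy sketch).

    mixture:      2-D array of nonnegative energies (frequency x time).
    separate_one: hypothetical callable mapping the current residual to an
                  estimate of one sound; a real system would use a learned,
                  visually guided model here.
    """
    residual = np.asarray(mixture, dtype=float).copy()
    separated = []
    for _ in range(max_steps):
        # Stop once the remaining mixture is empty or only noise-level.
        if residual.mean() <= noise_floor:
            break
        # Clip the estimate so the residual never becomes negative.
        estimate = np.minimum(separate_one(residual), residual)
        separated.append(estimate)
        # Remove the separated sound before the next prediction.
        residual = residual - estimate
    return separated, residual


def loudest_band_separator(bands):
    """Stand-in separator: return the band with the largest average energy."""
    def separate_one(residual):
        energies = [residual[band].mean() for band in bands]
        loudest = bands[int(np.argmax(energies))]
        estimate = np.zeros_like(residual)
        estimate[loudest] = residual[loudest]
        return estimate
    return separate_one


# Toy mixture: three sounds in disjoint frequency bands with different energy.
bands = [slice(0, 2), slice(2, 4), slice(4, 6)]
mixture = np.zeros((6, 4))
mixture[bands[0]] = 4.0
mixture[bands[1]] = 1.0
mixture[bands[2]] = 0.25

separated, residual = recursive_separate(mixture, loudest_band_separator(bands))
for step, sound in enumerate(separated, start=1):
    print(f"step {step}: average energy {sound.mean():.4f}")
```

In this toy setting the loop never needs to know the number of sounds in advance; it stops when the residual falls below the assumed noise floor, which mirrors the stopping condition stated in the abstract.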

DC Field	Value	Language
dc.contributor.author	Xu, Xudong	-
dc.contributor.author	Dai, Bo	-
dc.contributor.author	Lin, Dahua	-
dc.date.accessioned	2024-12-16T03:57:12Z	-
dc.date.available	2024-12-16T03:57:12Z	-
dc.date.issued	2019	-
dc.identifier.citation	Proceedings of the IEEE International Conference on Computer Vision, 2019, v. 2019-October, p. 882-891	-
dc.identifier.issn	1550-5499	-
dc.identifier.uri	http://hdl.handle.net/10722/352188	-
dc.description.abstract	Sounds provide rich semantics, complementary to visual data, for many tasks. However, in practice, sounds from multiple sources are often mixed together. In this paper we propose a novel framework, referred to as MinusPlus Network (MP-Net), for the task of visual sound separation. MP-Net separates sounds recursively in the order of average energy, removing the separated sound from the mixture at the end of each prediction, until the mixture becomes empty or contains only noise. In this way, MP-Net could be applied to sound mixtures with arbitrary numbers and types of sounds. Moreover, while MP-Net keeps removing sounds with large energy from the mixture, sounds with small energy could emerge and become clearer, so that the separation is more accurate. Compared to previous methods, MP-Net obtains state-of-the-art results on two large scale datasets, across mixtures with different types and numbers of sounds.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings of the IEEE International Conference on Computer Vision	-
dc.title	Recursive visual sound separation using minus-plus net	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/ICCV.2019.00097	-
dc.identifier.scopus	eid_2-s2.0-85081894297	-
dc.identifier.volume	2019-October	-
dc.identifier.spage	882	-
dc.identifier.epage	891	-
dc.identifier.isi	WOS:000531438101001	-
