Article: UniDetector: Towards Universal Object Detection with Heterogeneous Supervision
| Title | UniDetector: Towards Universal Object Detection with Heterogeneous Supervision |
|---|---|
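| Authors | Wang, Zhenyu; Li, Yali; Chen, Xi; Lim, Ser Nam; Torralba, Antonio; Zhao, Hengshuang; Wang, Shengjin |
| Keywords | Annotations; Detectors; Heterogeneous Label Spaces; Heterogeneous Supervision Learning; Object detection; Open World and Universal Object Detection; Proposals; Task analysis; Training; Vocabulary |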
| Issue Date | 1-Jan-2024 |
| Publisher | Institute of Electrical and Electronics Engineers |
| Citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, v. 47, n. 7, p. 5205-5222 |
| Abstract | In this paper, we formally address universal object detection, which aims to detect every category in every scene. The dependence on human annotations, the limited visual information, and the novel categories in the open world severely restrict the universality of detectors. We propose UniDetector, a universal object detector that recognizes an enormous number of categories in the open world. The critical points for UniDetector are: 1) it leverages images from multiple sources and heterogeneous label spaces in training through image-text alignment, which guarantees sufficient information for universal representations; 2) it involves heterogeneous supervision training, which alleviates the dependence on the limited fully-labeled images; 3) it generalizes to the open world easily while keeping the balance between seen and unseen classes; 4) it further promotes generalization to novel categories through our proposed decoupling training manner and probability calibration. These contributions allow UniDetector to detect over 7k categories, the largest measurable size so far, with only about 500 classes participating in training. UniDetector exhibits strong zero-shot ability on large-vocabulary datasets: it surpasses supervised baselines by more than 5% without seeing any corresponding images. On 13 detection datasets with various scenes, UniDetector also achieves state-of-the-art performance with only 3% of the training data. |
| Persistent Identifier | http://hdl.handle.net/10722/362200 |
| ISSN | 0162-8828 (2023 Impact Factor: 20.8; 2023 SCImago Journal Rankings: 6.158) |

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Wang, Zhenyu | - |
| dc.contributor.author | Li, Yali | - |
| dc.contributor.author | Chen, Xi | - |
| dc.contributor.author | Lim, Ser Nam | - |
| dc.contributor.author | Torralba, Antonio | - |
| dc.contributor.author | Zhao, Hengshuang | - |
| dc.contributor.author | Wang, Shengjin | - |
| dc.date.accessioned | 2025-09-20T00:30:43Z | - |
| dc.date.available | 2025-09-20T00:30:43Z | - |
| dc.date.issued | 2024-01-01 | - |
| dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, v. 47, n. 7, p. 5205-5222 | - |
| dc.identifier.issn | 0162-8828 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/362200 | - |
| dc.description.abstract | In this paper, we formally address universal object detection, which aims to detect every category in every scene. The dependence on human annotations, the limited visual information, and the novel categories in the open world severely restrict the universality of detectors. We propose UniDetector, a universal object detector that recognizes an enormous number of categories in the open world. The critical points for UniDetector are: 1) it leverages images from multiple sources and heterogeneous label spaces in training through image-text alignment, which guarantees sufficient information for universal representations; 2) it involves heterogeneous supervision training, which alleviates the dependence on the limited fully-labeled images; 3) it generalizes to the open world easily while keeping the balance between seen and unseen classes; 4) it further promotes generalization to novel categories through our proposed decoupling training manner and probability calibration. These contributions allow UniDetector to detect over 7k categories, the largest measurable size so far, with only about 500 classes participating in training. UniDetector exhibits strong zero-shot ability on large-vocabulary datasets: it surpasses supervised baselines by more than 5% without seeing any corresponding images. On 13 detection datasets with various scenes, UniDetector also achieves state-of-the-art performance with only 3% of the training data. | - |
| dc.language | eng | - |
| dc.publisher | Institute of Electrical and Electronics Engineers | - |
| dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
| dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
| dc.subject | Annotations | - |
| dc.subject | Detectors | - |
| dc.subject | Heterogeneous Label Spaces | - |
| dc.subject | Heterogeneous Supervision Learning | - |
| dc.subject | Object detection | - |
| dc.subject | Open World and Universal Object Detection | - |
| dc.subject | Proposals | - |
| dc.subject | Task analysis | - |
| dc.subject | Training | - |
| dc.subject | Vocabulary | - |
| dc.title | UniDetector: Towards Universal Object Detection with Heterogeneous Supervision | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1109/TPAMI.2024.3411595 | - |
| dc.identifier.scopus | eid_2-s2.0-85196104478 | - |
| dc.identifier.volume | 47 | - |
| dc.identifier.issue | 7 | - |
| dc.identifier.spage | 5205 | - |
| dc.identifier.epage | 5222 | - |
| dc.identifier.eissn | 1939-3539 | - |
| dc.identifier.issnl | 0162-8828 | - |
