Article: Adaptive Perspective Distillation for Semantic Segmentation

Title: Adaptive Perspective Distillation for Semantic Segmentation
Authors: Tian, Zhuotao; Chen, Pengguang; Lai, Xin; Jiang, Li; Liu, Shu; Zhao, Hengshuang; Yu, Bei; Yang, Ming Chang; Jia, Jiaya
Keywords: Knowledge distillation; scene understanding; semantic segmentation
Issue Date: 2023
Citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, v. 45, n. 2, p. 1372-1387
Abstract: Strong semantic segmentation models require large backbones to achieve promising performance, making them hard to deploy in real applications where efficient real-time algorithms are needed. Knowledge distillation tackles this issue by letting the smaller model (student) produce pixel-wise predictions similar to those of a larger model (teacher), as sketched below. However, the classifier, which can be viewed as the perspective through which a model perceives the encoded features to yield observations (i.e., predictions), is shared by all training samples and fits a universal feature distribution. Since, at a fixed capacity, good generalization to the entire distribution can come at the cost of specialization to individual samples, the shared universal perspective often overlooks details present in each sample, degrading knowledge distillation. In this paper, we propose Adaptive Perspective Distillation (APD), which creates an adaptive local perspective for each individual training sample. It extracts detailed contextual information from each training sample specifically, mining more details from the teacher and thus achieving better distillation results on the student. APD places no structural constraints on either the teacher or the student model, and thus generalizes well across semantic segmentation models. Extensive experiments on Cityscapes, ADE20K, and PASCAL-Context demonstrate the effectiveness of the proposed APD. Moreover, APD yields favorable performance gains for models in both object detection and instance segmentation without bells and whistles.
Persistent Identifier: http://hdl.handle.net/10722/332255
ISSN: 0162-8828 (eISSN: 1939-3539)
2023 Impact Factor: 20.8
2023 SCImago Journal Rankings: 6.158
ISI Accession Number: WOS:000912386000003
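
The pixel-wise knowledge distillation that the abstract builds on is conventionally implemented as a temperature-scaled KL divergence between the teacher's and the student's per-pixel class distributions. The following is a minimal PyTorch sketch of that generic baseline, not of APD itself; the function name, default temperature, and loss weighting are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def pixelwise_kd_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          temperature: float = 4.0) -> torch.Tensor:
        """Temperature-scaled KL divergence between the teacher's and the
        student's class distributions, computed independently at each pixel.

        Both inputs have shape (N, C, H, W), where C is the number of classes.
        This is the generic pixel-wise distillation baseline, not APD.
        """
        n, c, h, w = student_logits.shape
        # Flatten spatial positions so each pixel is one distillation sample.
        s = student_logits.permute(0, 2, 3, 1).reshape(-1, c)
        t = teacher_logits.permute(0, 2, 3, 1).reshape(-1, c)
        log_p_student = F.log_softmax(s / temperature, dim=1)
        p_teacher = F.softmax(t / temperature, dim=1)
        # 'batchmean' averages the KL over all N*H*W pixel distributions;
        # the temperature**2 factor keeps gradient scale comparable across
        # temperature choices.
        return F.kl_div(log_p_student, p_teacher,
                        reduction="batchmean") * temperature ** 2

In training, such a term is typically added to the standard cross-entropy loss with a weighting coefficient, e.g. loss = F.cross_entropy(student_logits, labels) + 0.5 * pixelwise_kd_loss(student_logits, teacher_logits.detach()). APD's contribution, per the abstract, is to replace the single classifier perspective shared across all samples with one adapted to each training sample; see the paper for that construction.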


DC Field: Value
dc.contributor.author: Tian, Zhuotao
dc.contributor.author: Chen, Pengguang
dc.contributor.author: Lai, Xin
dc.contributor.author: Jiang, Li
dc.contributor.author: Liu, Shu
dc.contributor.author: Zhao, Hengshuang
dc.contributor.author: Yu, Bei
dc.contributor.author: Yang, Ming Chang
dc.contributor.author: Jia, Jiaya
dc.date.accessioned: 2023-10-06T05:10:04Z
dc.date.available: 2023-10-06T05:10:04Z
dc.date.issued: 2023
dc.identifier.citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, v. 45, n. 2, p. 1372-1387
dc.identifier.issn: 0162-8828
dc.identifier.uri: http://hdl.handle.net/10722/332255
dc.language: eng
dc.relation.ispartof: IEEE Transactions on Pattern Analysis and Machine Intelligence
dc.subject: Knowledge distillation
dc.subject: scene understanding
dc.subject: semantic segmentation
dc.title: Adaptive Perspective Distillation for Semantic Segmentation
dc.type: Article
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1109/TPAMI.2022.3159581
dc.identifier.pmid: 35294341
dc.identifier.scopus: eid_2-s2.0-85126511871
dc.identifier.volume: 45
dc.identifier.issue: 2
dc.identifier.spage: 1372
dc.identifier.epage: 1387
dc.identifier.eissn: 1939-3539
dc.identifier.isi: WOS:000912386000003
