Conference Paper: PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning

Title: PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning
Authors: Wei, Wei; Tang, Jiabin; Xia, Lianghao; Jiang, Yangqin; Huang, Chao
Keywords: bias in recommender system; collaborative filtering; content-based recommendation; data sparsity; graph learning; knowledge distillation; multi-modal learning; multi-modal recommendation; prompt-tuning
Issue Date: 2024
Citation: WWW 2024 - Proceedings of the ACM Web Conference, 2024, p. 3217-3228
Abstract: Multimedia online platforms (e.g., Amazon, TikTok) have greatly benefited from incorporating multimedia (e.g., visual, textual, and acoustic) content into their personalized recommender systems. These modalities provide intuitive semantics that facilitate modality-aware user preference modeling. However, two key challenges in multi-modal recommenders remain unresolved: i) The introduction of multi-modal encoders with a large number of additional parameters causes overfitting, given the high-dimensional multi-modal features provided by extractors (e.g., ViT, BERT). ii) Side information inevitably introduces inaccuracies and redundancies, which prevent the modality-interaction dependency from reflecting true user preferences. To tackle these problems, we propose to simplify and empower recommenders through Multi-modal Knowledge Distillation (PromptMM) with prompt-tuning that enables adaptive quality distillation. Specifically, PromptMM performs model compression by distilling u-i edge relationships and multi-modal node content from cumbersome teachers, relieving students of the additional feature-reduction parameters. To bridge the semantic gap between multi-modal context and collaborative signals, and to empower the overfitting teacher, soft prompt-tuning is introduced to make the distillation adaptive to the student's task. Additionally, to adjust for the impact of inaccuracies in multimedia data, a disentangled multi-modal list-wise distillation is developed with a modality-aware re-weighting mechanism. Experiments on real-world data demonstrate PromptMM's superiority over existing techniques, ablation tests confirm the effectiveness of its key components, and additional tests show its efficiency.
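The abstract's "list-wise distillation with a modality-aware re-weighting mechanism" can be illustrated with a minimal sketch: for each modality, the teacher's and student's ranking scores over the same candidate items are turned into distributions, and a weighted KL divergence pushes the student toward the teacher, with per-modality weights down-weighting noisier modalities. All function and variable names below are illustrative assumptions, not PromptMM's actual API.

```python
import numpy as np

def softmax(scores, tau=1.0):
    # Temperature-scaled softmax over a list of candidate-item scores.
    z = np.asarray(scores, dtype=float) / tau
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def listwise_kd_loss(teacher_scores, student_scores, modality_weights, tau=2.0):
    """Modality-aware re-weighted list-wise distillation loss (hypothetical sketch).

    teacher_scores / student_scores: dicts mapping a modality name to a list of
    ranking scores over the same candidate items. modality_weights: dict mapping
    a modality name to its re-weighting coefficient (lower = noisier modality).
    """
    loss = 0.0
    for modality, weight in modality_weights.items():
        p = softmax(teacher_scores[modality], tau)  # teacher ranking distribution
        q = softmax(student_scores[modality], tau)  # student ranking distribution
        # KL(p || q): zero when the student matches the teacher's ranking.
        kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)))
        loss += weight * kl
    return loss
```

When the student's per-modality scores match the teacher's, the loss vanishes; disagreement on noisy modalities contributes less because of the smaller weight.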
Persistent Identifier: http://hdl.handle.net/10722/355967


DC Field | Value | Language
dc.contributor.author | Wei, Wei | -
dc.contributor.author | Tang, Jiabin | -
dc.contributor.author | Xia, Lianghao | -
dc.contributor.author | Jiang, Yangqin | -
dc.contributor.author | Huang, Chao | -
dc.date.accessioned | 2025-05-19T05:46:57Z | -
dc.date.available | 2025-05-19T05:46:57Z | -
dc.date.issued | 2024 | -
dc.identifier.citation | WWW 2024 - Proceedings of the ACM Web Conference, 2024, p. 3217-3228 | -
dc.identifier.uri | http://hdl.handle.net/10722/355967 | -
dc.description.abstract | Multimedia online platforms (e.g., Amazon, TikTok) have greatly benefited from incorporating multimedia (e.g., visual, textual, and acoustic) content into their personalized recommender systems. These modalities provide intuitive semantics that facilitate modality-aware user preference modeling. However, two key challenges in multi-modal recommenders remain unresolved: i) The introduction of multi-modal encoders with a large number of additional parameters causes overfitting, given the high-dimensional multi-modal features provided by extractors (e.g., ViT, BERT). ii) Side information inevitably introduces inaccuracies and redundancies, which prevent the modality-interaction dependency from reflecting true user preferences. To tackle these problems, we propose to simplify and empower recommenders through Multi-modal Knowledge Distillation (PromptMM) with prompt-tuning that enables adaptive quality distillation. Specifically, PromptMM performs model compression by distilling u-i edge relationships and multi-modal node content from cumbersome teachers, relieving students of the additional feature-reduction parameters. To bridge the semantic gap between multi-modal context and collaborative signals, and to empower the overfitting teacher, soft prompt-tuning is introduced to make the distillation adaptive to the student's task. Additionally, to adjust for the impact of inaccuracies in multimedia data, a disentangled multi-modal list-wise distillation is developed with a modality-aware re-weighting mechanism. Experiments on real-world data demonstrate PromptMM's superiority over existing techniques, ablation tests confirm the effectiveness of its key components, and additional tests show its efficiency. | -
dc.language | eng | -
dc.relation.ispartof | WWW 2024 - Proceedings of the ACM Web Conference | -
dc.subject | bias in recommender system | -
dc.subject | collaborative filtering | -
dc.subject | content-based recommendation | -
dc.subject | data sparsity | -
dc.subject | graph learning | -
dc.subject | knowledge distillation | -
dc.subject | multi-modal learning | -
dc.subject | multi-modal recommendation | -
dc.subject | prompt-tuning | -
dc.title | PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1145/3589334.3645359 | -
dc.identifier.scopus | eid_2-s2.0-85191036365 | -
dc.identifier.spage | 3217 | -
dc.identifier.epage | 3228 | -
