File Download
There are no files associated with this item.
Supplementary
-
Citations:
- Scopus: 0
- Appears in Collections:
Conference Paper: AS-MLP: AN AXIAL SHIFTED MLP ARCHITECTURE FOR VISION
Title | AS-MLP: AN AXIAL SHIFTED MLP ARCHITECTURE FOR VISION |
---|---|
Authors | |
Issue Date | 2022 |
Citation | ICLR 2022 - 10th International Conference on Learning Representations, 2022 How to Cite? |
Abstract | An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper. Different from MLP-Mixer, where the global spatial feature is encoded for information flow through matrix transposition and one token-mixing MLP, we pay more attention to the local features interaction. By axially shifting channels of the feature map, AS-MLP is able to obtain the information flow from different axial directions, which captures the local dependencies. Such an operation enables us to utilize a pure MLP architecture to achieve the same local receptive field as CNN-like architecture. We can also design the receptive field size and dilation of blocks of AS-MLP, etc, in the same spirit of convolutional neural networks. With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset. Such a simple yet effective architecture outperforms all MLP-based architectures and achieves competitive performance compared to the transformer-based architectures (e.g., Swin Transformer) even with slightly lower FLOPs. In addition, AS-MLP is also the first MLP-based architecture to be applied to the downstream tasks (e.g., object detection and semantic segmentation). The experimental results are also impressive. Our proposed AS-MLP obtains 51.5 mAP on the COCO validation set and 49.5 MS mIoU on the ADE20K dataset, which is competitive compared to the transformer-based architectures. Our AS-MLP establishes a strong baseline of MLP-based architecture. Code is available at https://github.com/svip-lab/AS-MLP. |
Persistent Identifier | http://hdl.handle.net/10722/345308 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Lian, Dongze | - |
dc.contributor.author | Yu, Zehao | - |
dc.contributor.author | Sun, Xing | - |
dc.contributor.author | Gao, Shenghua | - |
dc.date.accessioned | 2024-08-15T09:26:32Z | - |
dc.date.available | 2024-08-15T09:26:32Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | ICLR 2022 - 10th International Conference on Learning Representations, 2022 | - |
dc.identifier.uri | http://hdl.handle.net/10722/345308 | - |
dc.description.abstract | An Axial Shifted MLP architecture (AS-MLP) is proposed in this paper. Different from MLP-Mixer, where the global spatial feature is encoded for information flow through matrix transposition and one token-mixing MLP, we pay more attention to the local features interaction. By axially shifting channels of the feature map, AS-MLP is able to obtain the information flow from different axial directions, which captures the local dependencies. Such an operation enables us to utilize a pure MLP architecture to achieve the same local receptive field as CNN-like architecture. We can also design the receptive field size and dilation of blocks of AS-MLP, etc, in the same spirit of convolutional neural networks. With the proposed AS-MLP architecture, our model obtains 83.3% Top-1 accuracy with 88M parameters and 15.2 GFLOPs on the ImageNet-1K dataset. Such a simple yet effective architecture outperforms all MLP-based architectures and achieves competitive performance compared to the transformer-based architectures (e.g., Swin Transformer) even with slightly lower FLOPs. In addition, AS-MLP is also the first MLP-based architecture to be applied to the downstream tasks (e.g., object detection and semantic segmentation). The experimental results are also impressive. Our proposed AS-MLP obtains 51.5 mAP on the COCO validation set and 49.5 MS mIoU on the ADE20K dataset, which is competitive compared to the transformer-based architectures. Our AS-MLP establishes a strong baseline of MLP-based architecture. Code is available at https://github.com/svip-lab/AS-MLP. | - |
dc.language | eng | - |
dc.relation.ispartof | ICLR 2022 - 10th International Conference on Learning Representations | - |
dc.title | AS-MLP: AN AXIAL SHIFTED MLP ARCHITECTURE FOR VISION | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.scopus | eid_2-s2.0-85148322296 | - |