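Introduction:
In artificial intelligence and machine learning, graph neural networks (GNNs) and graph Transformers have advanced considerably in recent years and are now widely applied in computer vision. Based on a survey recently published in IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), this article summarizes recent progress on GNNs and graph Transformers in computer vision.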

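I. Background
In recent years, the strong performance of GNNs in graph representation learning and on non-grid data has led to their adoption across many fields. The survey reviews recent progress on GNNs in computer vision and aims to help researchers grasp both the classic methods and the latest developments in the area.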

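II. Contents of the Survey
1. History of GNNs
GNNs first emerged in recurrent form (recurrent GNNs), used to extract node representations from directed acyclic graphs. As research progressed, GNNs were extended to more types of graph structure, such as cyclic graphs and undirected graphs. Inspired by convolutional neural networks (CNNs) in deep learning, researchers then developed methods that generalize convolution to the graph domain, chiefly spectral (frequency-domain) methods and spatial methods.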
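The spatial-domain idea mentioned above can be made concrete with a minimal sketch of one graph-convolution step in the style of a GCN layer: each node combines its own features with those of its neighbors using symmetric degree normalization, then applies a linear map and a ReLU. The function name `gcn_layer`, the toy graph, and the identity weight matrix are illustrative assumptions, not taken from the survey.

```python
import math

def gcn_layer(adj, feats, weight):
    """One spatial graph-convolution step: relu(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    d_in, d_out = len(feats[0]), len(weight[0])
    # Aggregate features over each node's neighborhood.
    agg = [[sum(norm[i][k] * feats[k][c] for k in range(n)) for c in range(d_in)]
           for i in range(n)]
    # Linear transform followed by ReLU.
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)]
            for i in range(n)]

# Toy undirected path graph 0 - 1 - 2 with 2-dimensional node features.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the output shows pure aggregation
out = gcn_layer(adj, feats, weight)
print(out)
```

Stacking several such steps lets information travel over longer paths in the graph. Spectral methods instead define the filter through the eigendecomposition of the graph Laplacian.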

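2. Applications of GNNs in computer vision
The survey groups the applications of GNNs in computer vision into five categories:
(1) Natural images (2D): image classification, object detection, semantic segmentation, and scene graph generation.
(2) Video: video action recognition, temporal action localization, multi-object tracking, human motion prediction, and trajectory prediction.
(3) Vision + language: visual question answering, visual grounding, image captioning, image-text matching, and vision-language navigation.
(4) 3D data: 3D representation learning (point clouds, meshes), 3D understanding (point cloud segmentation, 3D object detection, 3D visual grounding), and 3D generation (point cloud completion, 3D data denoising, 3D reconstruction).
(5) Medical imaging: brain activity analysis and disease diagnosis (brain diseases, chest diseases, and others).

3. Applications of graph Transformers in computer vision
The survey also covers Transformer-based graph neural network methods in computer vision, including graph Transformers applied to image classification, object detection, semantic segmentation, and video action recognition.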
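As a rough illustration of how Transformer-style attention can be combined with graph structure, the sketch below restricts single-head dot-product attention to each node's graph neighbors (plus itself). This is only one possible design: actual graph Transformers differ, and some attend over all nodes while injecting structure through positional or edge encodings. The function name `masked_graph_attention` and the toy data are assumptions for illustration, and learned query/key/value projections are omitted to keep the example short.

```python
import math

def masked_graph_attention(adj, feats):
    """Dot-product self-attention where node i may only attend to itself
    and to nodes j with adj[i][j] != 0. Raw features serve as Q, K and V."""
    n, d = len(feats), len(feats[0])
    outputs, attn = [], []
    for i in range(n):
        allowed = [j for j in range(n) if j == i or adj[i][j]]
        # Scaled dot-product scores against the allowed nodes only.
        scores = [sum(feats[i][c] * feats[j][c] for c in range(d)) / math.sqrt(d)
                  for j in allowed]
        # Numerically stable softmax over the allowed nodes.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        row = [0.0] * n
        for j, e in zip(allowed, exps):
            row[j] = e / z
        attn.append(row)
        # Weighted sum of the attended nodes' features.
        outputs.append([sum(row[j] * feats[j][c] for j in range(n))
                        for c in range(d)])
    return outputs, attn

# Toy undirected path graph 0 - 1 - 2: nodes 0 and 2 are not connected.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = masked_graph_attention(adj, feats)
print(attn)
```

Each row of `attn` sums to 1, and the entries for non-adjacent pairs (here nodes 0 and 2) stay at zero, so the graph's connectivity directly shapes which nodes exchange information.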

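III. Summary
This article summarized recent progress on GNNs and graph Transformers in computer vision, covering their history, application areas, and specific tasks. These methods have broad potential in computer vision, and the survey offers a useful reference for researchers in the field.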

References:
[1] Chaoqi Chen, Hong-Yu Zhou, Yizhou Yu. A Survey on Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(5): 2677-2703.
[2] Yushuang Wu, Mutian Xu, Xiaoguang Han. Graph Neural Networks and Graph Transformers in Computer Vision: A Task-Oriented Perspective[J]. arXiv preprint arXiv:2209.13232, 2022.

