Local multi-headed self-attention · pyg-team/pytorch_geometric#8972


I am unable to find a clean implementation of local multi-headed self-attention in PyTorch Geometric. I found three layers that use multi-head attention. The first is TransformerConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.TransformerConv.html#torch_geometric.nn.conv.TransformerConv), but it calculates a linear combination of all features with different attention weights, as opposed to dividing the features into multiple heads and taking their linear combination. The second, RGATConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.RGATConv.html), goes in a similar direction. The third, GPSConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.GPSConv.html), does perform multi-head attention, but it is global.

I think it would be nice to have an implementation of local self-attention with multiple heads, where each head looks at its own part of the feature dimension.


Research direction: Study the existing implementations in PyTorch Geometric: TransformerConv (torch_geometric/nn/conv/transformer_conv.py) performs a linear combination over all features, while GPSConv (torch_geometric/nn/conv/gps_conv.py) performs global multi-head attention. Design a local version that splits the features into heads and computes attention within a local neighborhood, using a message-passing scheme. Refer to RGATConv (torch_geometric/nn/conv/rgat_conv.py) for relational attention, but adapt it to local self-attention without relations. The implementation should be a new convolution class that takes the node features and edge_index and returns updated node features computed with local multi-head attention.
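To make the request concrete, here is a minimal sketch of what such a layer could look like. It is written in plain PyTorch so that it runs without PyTorch Geometric installed; the class name LocalMultiheadSelfAttention, its parameters, and the choice of scaled dot-product scores are assumptions for illustration, not an existing PyG API. An actual contribution would more likely subclass torch_geometric.nn.MessagePassing and use torch_geometric.utils.softmax for the per-neighborhood normalization.

```python
import math

import torch
from torch import Tensor, nn


class LocalMultiheadSelfAttention(nn.Module):
    """Hypothetical layer: multi-head self-attention restricted to graph neighbors."""

    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        if channels % heads != 0:
            raise ValueError("channels must be divisible by heads")
        self.heads = heads
        self.head_dim = channels // heads
        self.lin_q = nn.Linear(channels, channels)
        self.lin_k = nn.Linear(channels, channels)
        self.lin_v = nn.Linear(channels, channels)
        self.lin_out = nn.Linear(channels, channels)

    def forward(self, x: Tensor, edge_index: Tensor) -> Tensor:
        # edge_index has shape [2, E]: row 0 holds sources j, row 1 targets i.
        src, dst = edge_index
        n, h, c = x.size(0), self.heads, self.head_dim

        # Project, then split the feature dimension into `heads` slices.
        q = self.lin_q(x).view(n, h, c)
        k = self.lin_k(x).view(n, h, c)
        v = self.lin_v(x).view(n, h, c)

        # One scaled dot-product score per edge and per head: [E, H].
        score = (q[dst] * k[src]).sum(dim=-1) / math.sqrt(c)

        # Softmax over the incoming edges of each target node.
        # Subtracting the per-node maximum keeps exp() numerically stable.
        idx = dst.unsqueeze(-1).expand_as(score)
        node_max = score.new_full((n, h), float("-inf")).scatter_reduce(
            0, idx, score.detach(), reduce="amax"
        )
        exp = (score - node_max[dst]).exp()
        denom = score.new_zeros(n, h).index_add_(0, dst, exp)
        alpha = exp / denom[dst]

        # Attention-weighted sum of neighbor values, then merge the heads.
        out = x.new_zeros(n, h, c).index_add_(0, dst, alpha.unsqueeze(-1) * v[src])
        return self.lin_out(out.reshape(n, h * c))
```

With x of shape [num_nodes, channels] and edge_index of shape [2, num_edges], calling LocalMultiheadSelfAttention(channels, heads)(x, edge_index) returns a tensor of the same shape as x. In this sketch a node only attends to itself if edge_index contains a self-loop, and a node with no incoming edges receives only the bias of the output projection.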
Tech stack: Python
Domain: machine learning
Issue type: Feature
Difficulty: 3
Estimated time: 3-5 days
Activity status: Old
Clarity: Clear
Prerequisites: PyTorch basics, attention mechanism understanding, PyTorch Geometric conventions
Beginner accessibility: 50