pyg-team/pytorch_geometric

Local multi-headed self-attention

Open

#8972 opened on Feb 26, 2024

Labels: feature, help wanted

Description

🚀 The feature, motivation and pitch

I am unable to find a clean implementation of local multi-headed self-attention in PyTorch Geometric. I found three layers that offer multi-head attention. The first is TransformerConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.TransformerConv.html#torch_geometric.nn.conv.TransformerConv), but it calculates a linear combination of all features with different attention weights, as opposed to dividing the features into multiple heads and taking their linear combination. The second, RGATConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.RGATConv.html), goes in a similar direction. The third, GPSConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.GPSConv.html), does perform multi-head attention, but its attention is global.

Alternatives

It would be nice to have an implementation of local self-attention with multiple heads, where attention is restricted to each node's neighbors and each head looks at its own part of the feature dimension.
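
To make the request concrete, here is a minimal sketch of what such a layer could look like in plain PyTorch. It is not an existing PyG layer: the class name `LocalMultiHeadSelfAttention`, its constructor arguments, and the choice of scaled dot-product scoring are all assumptions made for illustration. It also assumes `edge_index` uses the `[2, E]` layout with sources in row 0 and targets in row 1.

```python
import math

import torch
from torch import nn


class LocalMultiHeadSelfAttention(nn.Module):
    """Hypothetical layer, not part of PyG: neighborhood-restricted attention
    in which each head works on its own slice of the projected features."""

    def __init__(self, channels: int, heads: int):
        super().__init__()
        if channels % heads != 0:
            raise ValueError("channels must be divisible by heads")
        self.heads = heads
        self.head_dim = channels // heads
        self.lin_q = nn.Linear(channels, channels)
        self.lin_k = nn.Linear(channels, channels)
        self.lin_v = nn.Linear(channels, channels)
        self.lin_out = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # x: [N, channels]; edge_index: [2, E], row 0 = source, row 1 = target.
        src, dst = edge_index
        n, h, d = x.size(0), self.heads, self.head_dim

        # Project, then split the feature dimension into h slices of size d.
        q = self.lin_q(x).view(n, h, d)
        k = self.lin_k(x).view(n, h, d)
        v = self.lin_v(x).view(n, h, d)

        # One scaled dot-product score per edge and head: [E, h].
        scores = (q[dst] * k[src]).sum(dim=-1) / math.sqrt(d)

        # Softmax over the incoming edges of each target node. The per-target
        # maximum is subtracted first for numerical stability.
        index = dst.view(-1, 1).expand_as(scores)
        max_per_dst = scores.new_full((n, h), float("-inf")).scatter_reduce(
            0, index, scores.detach(), reduce="amax"
        )
        exp = (scores - max_per_dst[dst]).exp()
        denom = scores.new_zeros(n, h).index_add(0, dst, exp)
        alpha = exp / denom[dst]  # [E, h]

        # Weighted sum of neighbor values per head, then merge the heads.
        messages = alpha.unsqueeze(-1) * v[src]  # [E, h, d]
        out = x.new_zeros(n, h, d).index_add(0, dst, messages)
        return self.lin_out(out.reshape(n, h * d))


torch.manual_seed(0)
layer = LocalMultiHeadSelfAttention(channels=8, heads=2)
x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 2, 3, 0], [1, 2, 3, 0, 2]])
out = layer(x, edge_index)
print(out.shape)  # torch.Size([4, 8])
```

In this sketch a node without incoming edges receives only the bias of the output projection; adding self-loops to `edge_index` would let every node attend to itself as well.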

Additional context

No response
