pyg-team/pytorch_geometric

Local multi-headed self-attention

Open

#8972 opened on Feb 26, 2024

Labels: feature, help wanted

Description

🚀 The feature, motivation and pitch

I am unable to find a clean implementation of local multi-headed self-attention in PyTorch Geometric. I found three layers that use multi-head attention. The first is TransformerConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.TransformerConv.html#torch_geometric.nn.conv.TransformerConv), but it calculates a linear combination of all features with different attention weights per head, as opposed to dividing the features into multiple heads and taking their linear combination. The second is RGATConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.RGATConv.html), which goes in a similar direction. The third is GPSConv (https://pytorch-geometric.readthedocs.io/en/latest/generated/torch_geometric.nn.conv.GPSConv.html), which does multi-head attention but is global rather than local.

Alternatives

It would be nice to have an implementation of local self-attention with multiple heads, where each head attends over its own part of the feature dimension.
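
To make the request concrete, here is a rough sketch of what I have in mind. It is only an illustration under my own assumptions, not an existing PyG layer: the class name `LocalMultiHeadSelfAttention` and its parameters are hypothetical, it uses plain PyTorch instead of `MessagePassing`, and it assumes the usual `edge_index` convention (row 0 is the source node, row 1 is the target node). The projected features are split into `heads` slices, and each head runs a softmax only over the incoming edges of each node.

```python
import math

import torch
from torch import nn


class LocalMultiHeadSelfAttention(nn.Module):
    """Hypothetical layer for illustration; not part of PyTorch Geometric."""

    def __init__(self, channels: int, heads: int):
        super().__init__()
        if channels % heads != 0:
            raise ValueError("channels must be divisible by heads")
        self.heads = heads
        self.head_dim = channels // heads
        self.lin_q = nn.Linear(channels, channels)
        self.lin_k = nn.Linear(channels, channels)
        self.lin_v = nn.Linear(channels, channels)
        self.lin_out = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        src, dst = edge_index  # assumed: row 0 = source j, row 1 = target i
        n, h, d = x.size(0), self.heads, self.head_dim

        # Project, then split the feature dimension into h slices of size d.
        q = self.lin_q(x).view(n, h, d)
        k = self.lin_k(x).view(n, h, d)
        v = self.lin_v(x).view(n, h, d)

        # One scaled dot-product score per edge and per head: shape [E, h].
        score = (q[dst] * k[src]).sum(dim=-1) / math.sqrt(d)

        # Softmax over the incoming edges of each target node. The per-target
        # maximum is subtracted first for numerical stability.
        index = dst.unsqueeze(-1).expand(-1, h)
        peak = score.new_full((n, h), float("-inf")).scatter_reduce(
            0, index, score.detach(), reduce="amax"
        )
        weight = (score - peak[dst]).exp()
        norm = score.new_zeros(n, h).index_add_(0, dst, weight)
        alpha = weight / norm[dst]

        # Weighted sum of neighbour values per head, then merge the heads.
        out = score.new_zeros(n, h, d).index_add_(
            0, dst, alpha.unsqueeze(-1) * v[src]
        )
        return self.lin_out(out.reshape(n, h * d))


x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 2, 3], [0, 0, 3, 3]])
layer = LocalMultiHeadSelfAttention(channels=8, heads=2)
print(layer(x, edge_index).shape)  # torch.Size([4, 8])
```

In this sketch, nodes without incoming edges receive only the output bias, so self-loops would probably need to be added beforehand. In an actual PyG layer, the manual softmax could likely be replaced by `torch_geometric.utils.softmax`.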

Additional context

No response
