[Proposal] Documentation: Map the Act Names to the Transformer #644

@juvogt

Proposal

Create a figure that maps the act names to the transformer architecture.

Motivation

Names are just conventions. I find it hard to determine the exact position within the transformer block from the act name alone. For example, resid_pre could refer to the point before the residual stream splits into a branch or to the point before the branch is merged back in. So I put it in context with the other act names and work by process of elimination, or I modify the activation and see which values change.

Pitch

I suggest using the figures from the Vaswani et al. paper and adding labeled arrows pointing to the hook positions.

Alternatives

A list or table of (act name, description) pairs.
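
To make this alternative concrete, here is a rough sketch of what such a table could look like for the residual-stream act names of a single block. It assumes a sequential pre-LN (GPT-2 style) block and the `blocks.{layer}.hook_{name}` naming pattern; the descriptions are my own reading rather than official documentation, and the helpers `full_hook_name` and `render_table` are hypothetical names made up for this example.

```python
# Hypothetical (act name, description) table for ONE sequential pre-LN block.
# Assumption: act names and their order follow the GPT-2 style layout;
# the descriptions are a reading of the architecture, not official docs.
ACT_NAMES = [
    ("resid_pre", "residual stream entering the block, before the attention branch splits off"),
    ("attn_out", "attention layer output, before it is added back to the residual stream"),
    ("resid_mid", "residual stream after attn_out is added, before the MLP branch splits off"),
    ("mlp_out", "MLP layer output, before it is added back to the residual stream"),
    ("resid_post", "residual stream after mlp_out is added, leaving the block"),
]


def full_hook_name(act_name: str, layer: int) -> str:
    """Build a per-layer hook name, e.g. 'blocks.0.hook_resid_pre' (assumed pattern)."""
    return f"blocks.{layer}.hook_{act_name}"


def render_table(layer: int) -> str:
    """Render the table as aligned plain-text rows, in forward-pass order."""
    rows = [f"{full_hook_name(name, layer):<28} {desc}" for name, desc in ACT_NAMES]
    return "\n".join(rows)


print(render_table(0))
```

Under these assumptions, resid_pre is the point before the split and resid_mid is the point after the attention merge. Models that run attention and MLP in parallel would need a different table.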

Checklist

  • I have checked that there is no similar issue in the repo (required)

Metadata

    Labels

    complexity-moderate: Moderately complicated issues for people who have intermediate experience with the code
    documentation: Improvements or additions to documentation
