fix: decode str_tokens per token for transformers >= 5 #177
Open
robbiebusinessacc wants to merge 1 commit into
On transformers >= 5, batch_decode treats a 1-D tensor as a single sequence and returns one joined string instead of per-token strings. This silently broke token highlighting in the explainer and classifier scorers and raised IndexError in the intruder scorer. Add decode_per_token to delphi.utils and use it at the four sites that need per-token strings (samplers, non-activating constructor, OpenAI simulator). Fixes EleutherAI#176
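To make the two failure modes concrete, here is a minimal sketch of how a length-1 `str_tokens` list can lose highlighting without raising an error and still raise `IndexError` on positional access. The `highlight` helper, the `<< >>` delimiters, and the sample data are hypothetical illustrations, not Delphi's actual code.

```python
def highlight(str_tokens, activations, threshold=0.5):
    """Hypothetical helper: wrap tokens whose activation exceeds threshold in << >>."""
    # zip() stops at the shorter input, so a length-1 token list raises no error;
    # the remaining activations are simply never looked at.
    return "".join(
        f"<<{tok}>>" if act > threshold else tok
        for tok, act in zip(str_tokens, activations)
    )


activations = [0.0, 0.9, 0.0]

per_token = ["Hello", " world", "!"]  # one string per token (expected shape)
joined = ["Hello world!"]             # single joined string (the broken shape)

good = highlight(per_token, activations)  # second token is marked
bad = highlight(joined, activations)      # no token is marked, no error raised

# Code that indexes tokens by position fails loudly instead.
try:
    joined[1]
except IndexError:
    index_failed = True
else:
    index_failed = False
```

In this sketch the silent case comes from `zip` truncating to the shorter input, which is one plausible way a prompt builder could drop highlighting without any exception.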
Fixes #176

On transformers >= 5, `batch_decode` treats a 1-D tensor as a single sequence and returns one joined string instead of per-token strings. Delphi calls it on 1-D `(ctx_len,)` token tensors when building `str_tokens`, so on a fresh install (transformers is unpinned) token highlighting silently disappears from explainer and fuzz/detection prompts, the simulator's `ActivationRecord` gets a length-1 token list, and the intruder scorer raises `IndexError`.

Changes

- Add `decode_per_token` to `delphi.utils`: decodes a 1-D tensor of token ids into one string per token via `batch_decode(tokens.unsqueeze(-1))`. This restores the transformers 4.x behavior exactly and works on both 4.x and 5.x. As noted in #176, `convert_ids_to_tokens` is not a substitute since it leaks BPE artifacts like `Ġworld`.
- Use it in `latents/samplers.py` (train and test examples), `latents/constructors.py` (`prepare_non_activating_examples`), and `scorers/simulator/simulation/oai_simulator.py`. The two sites in `constructors.py` that immediately `"".join(...)` the result still produce the intended text, so they are left unchanged.
- Add tests in `tests/test_utils.py` asserting per-token output and that the decoded strings round-trip to the original text. The per-token assertion fails on transformers 5.x without the fix and passes with it. Behavior on 4.x is unchanged: verified element-wise against the old `batch_decode` output on both BPE (pythia-70m) and WordPiece (bert-base-uncased) tokenizers, including special tokens.
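The idea behind the helper and its round-trip check can be sketched without torch or transformers. This is an illustration only: `ToyTokenizer` is a hypothetical stand-in that mimics the 5.x behavior described above, and the list-based `decode_per_token` here is an analogue of the `unsqueeze(-1)` approach, not the code added to `delphi.utils`.

```python
from typing import Sequence


class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer; not the transformers API."""

    def __init__(self, vocab: dict[int, str]) -> None:
        self.vocab = vocab

    def decode(self, ids: Sequence[int]) -> str:
        return "".join(self.vocab[i] for i in ids)

    def batch_decode(self, batch) -> list[str]:
        # Mimics the behavior described in this PR for transformers >= 5:
        # a flat (1-D) input is ONE sequence, a nested (2-D) input is a batch.
        if len(batch) > 0 and isinstance(batch[0], int):
            return [self.decode(batch)]
        return [self.decode(row) for row in batch]


def decode_per_token(tokenizer, token_ids: Sequence[int]) -> list[str]:
    # Give every token its own length-1 row, the list analogue of
    # tokens.unsqueeze(-1) on a 1-D tensor, so each row decodes separately.
    return tokenizer.batch_decode([[int(t)] for t in token_ids])


tok = ToyTokenizer({0: "Hello", 1: " world", 2: "!"})
ids = [0, 1, 2]

joined = tok.batch_decode(ids)          # one joined string for the whole input
per_token = decode_per_token(tok, ids)  # one string per token
round_trip = "".join(per_token)         # should equal tok.decode(ids)
```

Reshaping to one token per row makes the result independent of how a flat input is interpreted, which is why the same call can behave identically on both major versions.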
One possible follow-up I left out of scope: `tests/client_test.py` builds `str_tokens` with the same 1-D `batch_decode` pattern, but it is a manual vLLM script rather than a collected test.
Credit to @LakeSJS for the diagnosis in #176.