[Refactor] Decouple RoPE Scaling Logic #1180

@rubik-hua

Description

Background

The current RoPE implementation tightly couples the scaling logic to the core computation loop. Specific scaling types (such as LongRopeConfig) are handled through if-else branches and std::dynamic_pointer_cast inside initialize_cache. This violates the Open/Closed Principle: adding a new RoPE scaling variant (e.g., Llama 3, YaRN) requires modifying the core loop.

Proposed Changes

  1. Architectural Decoupling: Extract the ScalingConfig base class and LongRopeConfig into dedicated rope_scaling_configs.hpp/.cc files.
  2. Polymorphic Interface: Introduce virtual methods get_freq_scale and get_magnitude_scale in the base class, providing default implementations that return 1.0f.
  3. Core Loop Simplification: Refactor initialize_cache in rope.cc to eliminate type-checking branches, relying instead on the polymorphic interface:
    float base_inv_freq = 1.0f / std::pow(...);
    float freq_scale = scaling_ ? scaling_->get_freq_scale(...) : 1.0f;
    float mag_scale = scaling_ ? scaling_->get_magnitude_scale(...) : 1.0f;
    float angle = static_cast<float>(pos) * base_inv_freq * freq_scale;
    sin_data[...] = std::sin(angle) * mag_scale;
  4. Llama3 Skeleton: Add the Llama3Config class to support Llama-3/3.1 models.
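
The interface described in items 2 and 3 could look roughly like the sketch below. It is only an illustration under assumptions: the argument lists of get_freq_scale and get_magnitude_scale are elided in this issue, so the signatures here are guesses, and LinearScalingConfig and build_sin_cache are hypothetical names used to show that a new variant plugs in without touching the loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <memory>
#include <vector>

// Base class: the defaults leave RoPE unscaled.
// The parameter lists are assumptions; the issue elides the real ones.
class ScalingConfig {
public:
    virtual ~ScalingConfig() = default;
    virtual float get_freq_scale(std::size_t /*dim_pair*/, float /*base_inv_freq*/) const { return 1.0f; }
    virtual float get_magnitude_scale() const { return 1.0f; }
};

// Hypothetical variant (plain linear position interpolation), present only
// to demonstrate the extension point.
class LinearScalingConfig : public ScalingConfig {
public:
    explicit LinearScalingConfig(float factor) : factor_(factor) {}
    float get_freq_scale(std::size_t, float) const override { return 1.0f / factor_; }
private:
    float factor_;
};

// Fills a [max_pos x head_dim/2] sine table without inspecting the
// concrete scaling type (hypothetical stand-in for initialize_cache).
std::vector<float> build_sin_cache(std::size_t max_pos, std::size_t head_dim, float theta,
                                   const std::shared_ptr<ScalingConfig>& scaling) {
    assert(head_dim % 2 == 0);
    const std::size_t half = head_dim / 2;
    std::vector<float> sin_data(max_pos * half);
    for (std::size_t pos = 0; pos < max_pos; ++pos) {
        for (std::size_t i = 0; i < half; ++i) {
            const float exponent = 2.0f * static_cast<float>(i) / static_cast<float>(head_dim);
            const float base_inv_freq = 1.0f / std::pow(theta, exponent);
            const float freq_scale = scaling ? scaling->get_freq_scale(i, base_inv_freq) : 1.0f;
            const float mag_scale = scaling ? scaling->get_magnitude_scale() : 1.0f;
            const float angle = static_cast<float>(pos) * base_inv_freq * freq_scale;
            sin_data[pos * half + i] = std::sin(angle) * mag_scale;
        }
    }
    return sin_data;
}
```

With this shape, passing a null pointer or a plain ScalingConfig yields the same table, and a new variant only needs to override the two virtual methods.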

Action Items / TODOs

  • Implement Llama3 Scaling Logic: The current Llama3Config::get_freq_scale is a placeholder returning 1.0f. It needs to be implemented with the wavelength-based smooth interpolation logic to ensure correct context extension for Llama 3 models.
  • Fix Implicit Conversion Warning: In the LongRopeConfig constructor, explicitly cast original_max_position_embeddings_ to double when passing it to std::log to avoid implicit type conversion warnings.
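
For the first TODO, a minimal sketch of the wavelength-based smooth interpolation is shown below. It follows the formulation used by publicly available Llama 3.1 reference code as I understand it, not code from this repository. The struct Llama3Params, the free function llama3_freq_scale, and the default values (taken from the rope_scaling settings published with Llama 3.1 checkpoints) are assumptions and should be checked against the target model's config.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical parameter bundle; defaults are assumed Llama 3.1 values.
struct Llama3Params {
    float factor = 8.0f;
    float low_freq_factor = 1.0f;
    float high_freq_factor = 4.0f;
    float original_max_position_embeddings = 8192.0f;
};

// Returns the multiplier applied to base_inv_freq for one dimension pair.
inline float llama3_freq_scale(float base_inv_freq, const Llama3Params& p) {
    assert(base_inv_freq > 0.0f);
    assert(p.high_freq_factor > p.low_freq_factor);

    const float two_pi = 6.28318530717958647692f;
    const float wavelen = two_pi / base_inv_freq;
    const float low_freq_wavelen = p.original_max_position_embeddings / p.low_freq_factor;
    const float high_freq_wavelen = p.original_max_position_embeddings / p.high_freq_factor;

    // Short wavelengths (high frequencies) are left untouched.
    if (wavelen < high_freq_wavelen) {
        return 1.0f;
    }
    // Long wavelengths (low frequencies) are divided by the full factor.
    if (wavelen > low_freq_wavelen) {
        return 1.0f / p.factor;
    }
    // In between, blend linearly between the scaled and unscaled frequency.
    const float smooth =
        (p.original_max_position_embeddings / wavelen - p.low_freq_factor) /
        (p.high_freq_factor - p.low_freq_factor);
    return (1.0f - smooth) / p.factor + smooth;
}
```

Llama3Config::get_freq_scale could delegate to a helper of this shape once its actual signature is settled.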
