adaptive-computation

Here are 17 public repositories matching this topic...

raymin0223 / mixture_of_recursions

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation (NeurIPS 2025)

router early-exiting adaptive-computation kv-cache llm recursive-transformers

Updated Sep 26, 2025
Python

LINs-lab / DynMoE

[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models

moe language-model mixture-of-experts adaptive-computation multimodal-large-language-models

Updated Jul 9, 2025
Python

koayon / awesome-adaptive-computation

A curated reading list of research in Adaptive Computation, Inference-Time Computation & Mixture of Experts (MoE).

nlp machine-learning computer-vision tensorflow transformers pytorch mixture-of-experts adaptive-computation

Updated Jan 1, 2025

lucidrains / self-reasoning-tokens-pytorch

Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto

deep-learning transformers artificial-intelligence attention-mechanism adaptive-computation

Updated May 17, 2024
Python

lucidrains / pause-transformer

Yet another random morning idea, to be tried quickly and the architecture shared if it works: allow the transformer to pause for any amount of time on any token

deep-learning transformers artificial-intelligence attention-mechanisms adaptive-computation

Updated Oct 22, 2023
Python

RightNow-AI / ouroboros

Dynamic weight generation for recursive transformers via input-conditioned LoRA modulation

deep-learning transformers pytorch lora hypernetworks adaptive-computation recursive-transformers

Updated Apr 2, 2026
Python

Lee-Gihun / MicroNet_OSI-AI

(NeurIPS-2019 MicroNet Challenge - 3rd Winner) Open source code for "SIPA: A simple framework for efficient networks"

pruning model-compression micronet model-acceleration neurips-2019 micronet-challenge early-exiting adaptive-computation compact-neural-network

Updated Dec 18, 2022
Python

vignesh2027 / TemporalMesh-Transformer

TemporalMesh-Transformer: described by its author as the first architecture to fuse dynamic graph topology, token-level adaptive compute, and temporal semantic decay into a single unified model, with no prior work combining all three.

nlp machine-learning natural-language-processing research deep-learning pytorch artificial-intelligence transformer language-model attention-mechanism preprint graph-neural-network adaptive-computation efficient-transformers sparse-attention

Updated Jun 7, 2026
Python

USArmyResearchLab / ARL-Hierarchical-Multiscale-Framework

The ARL Hierarchical MultiScale Framework (ARL-HMS) is a software library for development of multiscale models on heterogeneous high-performance computing systems.

adaptive-computation multiscale-modeling scale-bridging

Updated Jul 18, 2024
C++

gbyuvd / Mo2BERTa-v2-proto

Frozen KV Context for Mixture-of-Recursions on a Modernized BERT

research representation-learning bert encoder-model adaptive-computation masked-language-modeling efficient-ai mixture-of-recursions frozen-kv

Updated Mar 31, 2026
Jupyter Notebook

srose69 / ViperLLM

Volumetric language model with Triangle Cross-Scan State Modelling. No attention, with hints of Neural Turing Machines (NTM) and Differentiable Neural Computers (DNC)

viper transformer echo-state-networks domain-specific-language state-space-model neural-turing-machine tensor-programming adaptive-computation kolmogorov-arnold-networks content-addressable-memory viper-llm noattention neural-programming predictive-state-coupling

Updated May 23, 2026
Python

WilliamK112 / ctm-adaptive-thinking-benchmark

PyTorch benchmark for CTM-style adaptive computation, sparse-retrieval failure analysis, adaptive halting, and attention-supervised recovery.

benchmark machine-learning pytorch interpretability research-engineering adaptive-computation continuous-thought-machines

Updated May 17, 2026
Python

CNCLgithub / AdaptiveComputation

Model implementation for "Adaptive computation as a new mechanism of dynamic human attention"

julia attention object-tracking adaptive-computation

Updated Jun 2, 2025
Julia

olanokhin / rci-inference

Recursive Convergent Inference — dynamic MoE with convergence-gated stopping. Unexpected finding: model-relative complexity diverges from human difficulty labels

machine-learning research inference arxiv mixture-of-experts adaptive-computation ollama