airawatraj/README.md

technologist | experimenter | seeker

"the future belongs to those who understand at a very deep level how to combine their unique expertise with what algorithms do best" — Pedro Domingos, The Master Algorithm


Portfolio

Independent Research & Systems.

Building local-first AI systems across reasoning, tool use, long-context agents, and multimodal workflows including text, image, audio, video-as-frames, voice notes, Telegram agents, Claude Code backends, and personal/household intelligence.

  • Cogni-Brain Qwen Omni: Latest high-context Cogni-Brain iteration for DGX Spark: Qwen3.5-122B-A10B, a native multimodal model supporting text, image, and video inputs, served via spark-vLLM Docker. Compared with the earlier Nemotron brain, this variant expands local context from 131K to 262K, improves speed from ~24 to ~40 tok/s, and reaches 100/100 Tool-Eval.
  • Cogni-Brain Omni Vision: Dedicated local multimodal perception agent for sandboxed DGX Spark workflows: Gemma 4 12B via vLLM with image input, audio/voice-note workflows, video-as-frames perception, multilingual chat, tool calling, 196K context, and reproducible local benchmarks.
  • Cogni-Brain Qwen Fast: Fast local tool-agent brain for sandboxed DGX Spark workflows: Qwen 3.6-35B via Atlas with 218.85 tok/s, 100/100 Tool-Eval, and local NVFP4 acceleration for responsive Claude Code and agent loops.
  • Cogni-Brain Nemotron: Original large local reasoning brain for sandboxed DGX Spark workflows: Nemotron-3-Super-120B via vLLM with stable 131K local context, ~24 tok/s, 93/100 Tool-Eval, hardened agentic stack, and reproducible benchmark methodology.
  • SageGPT-7.5M: Small language model trained from scratch on ~140M pure Sanskrit tokens on NVIDIA DGX Spark. 6 layers, 8 attention heads, 256-dim embeddings, 1024-token context.
  • SageGPT-7M MLX: Small language model trained from scratch on ~57M Sanskrit tokens on Apple Silicon. 4 layers, 8 attention heads, 256-dim embeddings, 256-token context.
  • Cogni.chat: Local-first multimodal AI ecosystem for personal and household intelligence, designed to support memory, planning, wellbeing, learning, creative work, family coordination, and multimodal interaction across text, voice, images, and personal context.
  • Fiduciary-Ops-Agent: Autonomous enterprise governance agent utilising a strict Check-then-Act protocol via Gemini 2.5 Flash Lite; enforces real-time fiduciary risk-alignment using tool-first orchestration.


"I have no special talent. I am only passionately curious" — Albert Einstein

Pinned

  1. dgx-spark-qwen-omni-super-agent Public

    Stable long-context Cogni-Brain agent on DGX Spark: Qwen3.5-122B-A10B INT4 AutoRound, 262K context, ~40 TPS, 100/100 Tool-Eval, vLLM.

    Python

  2. dgx-spark-gemma4-omni-agent Public

    Cogni-Brain Omni: Gemma 4 12B on DGX Spark via vLLM. Multimodal input, Telegram voice notes, multilingual chat, tools, MTP, 196K context, and reproducible local benchmarks.

    Python 1

  3. dgx-spark-qwen-super-agent Public

    Cogni-Brain-2: Qwen 3.6-35B on DGX Spark via Atlas. 218.85 tok/s, 100/100 Tool-Eval, local NVFP4 acceleration.

    Python 1

  4. dgx-spark-nemotron-super-agent Public

    Cogni-Brain: Nemotron-3-Super-120B on DGX Spark via vLLM. Stable 131K local context, ~24 tok/s, hardened agentic stack, and reproducible benchmark methodology.

    Python 6 3

  5. sage-gpt Public

    SageGPT (7.5M-parameter SLM): A Transformer trained from scratch on ~140M pure Sanskrit tokens on NVIDIA DGX Spark. 6 layers, 8 attention heads, 256-dim embeddings, 1024-token context, ~8K vocab.

    Python

  6. fiduciary-ops-agent Public

    An autonomous Fiduciary Agent powered by Gemini Flash Lite that enforces enterprise risk governance (CLV vs. refund) using strict tool-first protocols.

    Jupyter Notebook