DataArcTech

Welcome to DataArc Tech Inc.

⚡DataArcTech⚡

👉 Data-Driven, Intelligently Synthesized

🔥 We specialize in intelligent synthetic data generation and knowledge-augmented LLM reasoning technologies.

🌟 With a focus on context graphs and multi-agent systems, we build more efficient and trustworthy next-generation data and model infrastructure.

🚀 Through open-source projects and in-depth research, we explore the full technical cycle from data synthesis and continual pre-training to model evaluation.

👋 Join us in contributing high-quality algorithms, data, and insights to the open-source community.

Popular repositories

  1. DataArc-SynData-Toolkit Public

    Synthetic Data Generation Platform By DataArcTech

    Python · 1.6k stars · 49 forks

  2. ToG Public

    This is the official GitHub repo of Think-on-Graph (ICLR 2024). If you are interested in our work or would like to join our research team in Shenzhen, please feel free to contact us by email (xuchengj…

    Python · 644 stars · 70 forks

  3. LLM-as-a-Judge Public

    174 stars · 6 forks

  4. SQL-R1 Public

    [NeurIPS'25] Official Repository for the Paper "SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning"

    Python · 137 stars · 18 forks

  5. ToG-2 Public

    Python · 110 stars · 20 forks

  6. ChartMoE Public

    [ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding

    Jupyter Notebook · 98 stars · 9 forks

Repositories

Showing 10 of 34 repositories
  • ToG-3 Public

    Think-on-Graph 3.0: Efficient and Adaptive LLM Reasoning on Heterogeneous Graphs via Multi-Agent Dual-Evolving Context Retrieval

    Python · 85 stars · MIT · 13 forks · 6 issues · 0 pull requests · Updated Apr 9, 2026
  • GraphSearch Public

    GraphSearch: An Agentic Deep Searching Workflow for Graph Retrieval-Augmented Generation

    Python · 93 stars · Apache-2.0 · 11 forks · 0 issues · 0 pull requests · Updated Apr 9, 2026
  • Awesome-FinLLMs Public

    🥇 A curated list of awesome large language models in finance (FinLLMs), including papers, models, datasets, and codebases, with an emphasis on Chinese-English bilingual models.

    60 stars · Apache-2.0 · 7 forks · 0 issues · 0 pull requests · Updated Apr 7, 2026
  • Awesome-LLMs-for-Mathematical-Modeling Public

    🥇 A curated list of awesome Large Language Models/Agents for Mathematical Modeling tasks, including papers, models, datasets, and codebases.

    6 stars · Apache-2.0 · 0 forks · 0 issues · 0 pull requests · Updated Apr 7, 2026
  • F5-TTS-NPU Public Forked from SWivid/F5-TTS

    NPU version of "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

    Python · 0 stars · MIT · 2,140 forks · 0 issues · 0 pull requests · Updated Mar 27, 2026
  • ai-video-dubber Public

    AI Video Dubbing Agent — automatically dubs videos from any language to any language with speaker cloning, Gemini multimodal optimization, and ElevenLabs TTS

    Python · 0 stars · 0 forks · 0 issues · 0 pull requests · Updated Mar 23, 2026
  • team-dev-agent Public

    AI-driven software development workflow for teams. Claude Code + Codex best practices, frontend UI prompts, quality gates, and automation scripts.

    Shell · 2 stars · 1 fork · 0 issues · 0 pull requests · Updated Mar 20, 2026
  • RAG-ARC Public

    A modular, high-performance Retrieval-Augmented Generation framework with multi-path retrieval, graph extraction, and fusion ranking

    Python · 42 stars · MIT · 12 forks · 0 issues · 1 pull request · Updated Mar 4, 2026
  • DataArc-SynData-Toolkit Public

    Synthetic Data Generation Platform By DataArcTech

    Python · 1,616 stars · 49 forks · 8 issues · 1 pull request · Updated Feb 28, 2026
  • .github Public
    0 stars · 0 forks · 0 issues · 0 pull requests · Updated Feb 9, 2026
