Reddit Research Toolkit

A self-contained tool for sourcing audience insight from Reddit. Runs a 4-stage pipeline — search → quote extraction → quality control → enrichment — and stores structured quotes in your own Supabase project. Built to be driven by Claude Code: open this repo in Claude Code, ask it to run research on a theme, and the included skill orchestrates query generation, pipeline runs, and pattern analysis.

What it does

Given a theme (e.g. "working moms") and a set of Reddit search queries:

Search Reddit posts & comments via the ScrapeCreators API
Extract verbatim first-person quotes (≥140 chars) via Gemini
Quality control each quote: keep or reject with a reason
Enrich kept quotes with structured dimensions — dominant emotion, journey stage, villain, breaking point, specificity score, signal score

Output lands in three Postgres tables (research_projects, research_jobs, research_quotes) plus two query-friendly views (v_research_dimension_counts, v_research_top_quotes).
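
To make the enrichment dimensions concrete, here is a minimal TypeScript sketch of what a kept, enriched quote might look like, together with the 140-character length rule from the extraction stage. The fields quote_text, source_url, dominant_emotion, villain, and specificity_score match the columns used in the pattern queries in this README. Everything else (the EnrichedQuote type name, the journey_stage, breaking_point, and signal_score field names, and the meetsMinLength helper) is an assumption for illustration and may not match the toolkit's actual types.

```typescript
// Hypothetical shape of an enriched quote; not the toolkit's actual type.
interface EnrichedQuote {
  quote_text: string;
  source_url: string;
  dominant_emotion: string;
  journey_stage: string; // assumed field name
  villain: string;
  breaking_point: string; // assumed field name
  specificity_score: number;
  signal_score: number; // assumed field name
}

// Minimum quote length stated in this README (at least 140 characters).
const MIN_QUOTE_CHARS = 140;

// Returns true when the quote text is long enough to keep.
// Trimming surrounding whitespace first is an assumption of this sketch.
function meetsMinLength(quote: Pick<EnrichedQuote, "quote_text">): boolean {
  return quote.quote_text.trim().length >= MIN_QUOTE_CHARS;
}
```

In the real pipeline the length rule is presumably enforced inside the Gemini extraction step; a local check like this is only useful as a sanity filter on rows you read back.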

Prerequisites

You'll need three things — all free or pay-as-you-go:

Service	What it does	Sign up
Supabase	Postgres database for storing quotes	https://supabase.com → New project (free tier is enough)
Google Gemini	LLM for extraction / QC / enrichment	https://aistudio.google.com/app/apikey (free tier covers thousands of calls/day)
ScrapeCreators	Reddit search API	https://scrapecreators.com (paid)

You'll also need Node 20+ and pnpm (npm i -g pnpm).

Setup

git clone <this-repo-url> reddit-research-toolkit
cd reddit-research-toolkit
pnpm install
cp .env.example .env
# Fill in .env with your four keys
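
Before the first run it can help to confirm that no value in .env was left blank. The sketch below shows one way such a check could work. The four variable names are guesses for illustration only (the real names are whatever .env.example lists), and findMissingEnv is a hypothetical helper, not the validation code that ships in src/config/.

```typescript
// Hypothetical variable names; check .env.example for the real ones.
const REQUIRED_ENV_VARS: string[] = [
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
  "GEMINI_API_KEY",
  "SCRAPECREATORS_API_KEY",
];

// Returns the names of required variables that are unset or blank.
function findMissingEnv(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_ENV_VARS
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node script you would pass process.env after loading the .env file and stop with a clear message if the returned list is not empty.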

Then apply the schema to your Supabase project. Two options:

Option A — Dashboard (easiest): open your project → SQL Editor → New Query → paste the contents of supabase/migrations/0001_research_schema.sql → Run.

Option B — Supabase CLI: supabase link --project-ref <your-ref> && supabase db push.

Verify it worked:

-- In the Supabase SQL editor:
SELECT * FROM v_research_dimension_counts;
-- Should return an empty result (not an error).

First run

pnpm research:run \
  --theme "Working moms" \
  --theme_definition "US working moms balancing full-time work and young kids." \
  --queries_file ./queries/example.json

The pipeline prints colored progress to your terminal as it moves through each stage. On success it reports kept/rejected/enriched counts and a project ID. Query your results in Supabase using the views.

Using with Claude Code

The repo ships with a Claude Code skill at .claude/skills/reddit-research/SKILL.md. Open this directory in Claude Code and ask:

"Run research on working moms"

Claude reads the skill, proposes 10–15 search queries for the theme, asks you to confirm, then calls pnpm research:run for you. After the run completes it summarizes results and offers pattern queries.

You can edit the skill directly to change query strategy, the runner script (src/scripts/research-run.ts) to change CLI behavior, or the services (src/services/) to change pipeline logic.

Pattern queries

-- Dimension distribution for a project
SELECT dimension_name, dimension_value, count
FROM v_research_dimension_counts
WHERE project_id = '<project-id>'
ORDER BY dimension_name, count DESC;

-- Top quotes by signal
SELECT quote_text, dominant_emotion, villain, specificity_score, source_url
FROM v_research_top_quotes
WHERE project_id = '<project-id>'
LIMIT 20;
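
If you pull the dimension-distribution rows into a script, grouping them by dimension makes them easier to scan. The following is a minimal sketch, assuming the rows arrive as plain objects with the three columns selected in the first query (dimension_name, dimension_value, count). The DimensionCountRow type and groupByDimension helper are hypothetical names, and how you fetch the rows (Supabase client, psql export, and so on) is left out.

```typescript
// Row shape taken from the columns selected in the dimension query.
interface DimensionCountRow {
  dimension_name: string;
  dimension_value: string;
  count: number;
}

// Groups rows by dimension_name and sorts each group by count, descending.
function groupByDimension(
  rows: DimensionCountRow[]
): Record<string, DimensionCountRow[]> {
  const grouped: Record<string, DimensionCountRow[]> = {};
  for (const row of rows) {
    if (!grouped[row.dimension_name]) {
      grouped[row.dimension_name] = [];
    }
    grouped[row.dimension_name].push(row);
  }
  for (const name of Object.keys(grouped)) {
    grouped[name].sort((a, b) => b.count - a.count);
  }
  return grouped;
}
```

The first entry of each group is then the most common value for that dimension in the project.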

Project layout

src/
  config/         Supabase client + env validation
  services/       4-stage pipeline (orchestrator, Gemini, ScrapeCreators, events, rate-limiter, types)
  scripts/        CLI entrypoint (research-run.ts)
supabase/
  migrations/     SQL schema
queries/          Example query files
.claude/
  skills/         Reddit research skill for Claude Code

Notes

The Supabase service-role key has admin rights. Fine for a local single-user tool; never commit your .env and never expose this key from a browser.
Default model is gemini-2.0-flash. Change it in src/services/research.types.ts (RESEARCH_DEFAULTS.gemini_model).
The pipeline targets roughly 1,500 pending and 1,000 kept quotes per run. Adjust these targets via RESEARCH_DEFAULTS in the same file.
