querygym provides several state-of-the-art query reformulation methods.
genqr: simple keyword expansion using an LLM.
import querygym as qg
reformulator = qg.create_reformulator("genqr", model="gpt-4")
result = reformulator.reformulate(qg.QueryItem("q1", "neural networks"))

genqr_ensemble: an ensemble of multiple keyword expansion prompts for better coverage.
reformulator = qg.create_reformulator(
    "genqr_ensemble",
    model="gpt-4",
    params={"repeat_query_weight": 3}
)

Parameters:
repeat_query_weight (int): Number of times to repeat the original query (default: 3)
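
To make the effect of repeat_query_weight concrete, here is a minimal sketch of one common way such a weight can be applied: the original query is repeated a number of times and the generated keywords are appended, so the original terms keep more weight in sparse retrieval. The function name `build_expanded_query` and the plain space-joined output are assumptions for illustration, not querygym's actual implementation.

```python
def build_expanded_query(query: str, expansions: list[str], repeat_query_weight: int = 3) -> str:
    """Repeat the original query, then append the expansion terms (illustrative only)."""
    if repeat_query_weight < 1:
        raise ValueError("repeat_query_weight must be at least 1")
    # Repeating the query increases the term frequency of the original terms
    # relative to the generated keywords.
    parts = [query] * repeat_query_weight + list(expansions)
    return " ".join(parts)


expanded = build_expanded_query(
    "neural networks",
    ["deep learning", "backpropagation"],  # hypothetical LLM-generated keywords
    repeat_query_weight=3,
)
print(expanded)
```

A higher weight keeps the reformulated query closer to the user's original intent, while a lower weight lets the generated terms dominate.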
query2doc: generates pseudo-documents relevant to the query.
reformulator = qg.create_reformulator("query2doc", model="gpt-4")

This method supports both zero-shot and chain-of-thought variants.
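
The difference between a zero-shot and a chain-of-thought variant can be sketched as two prompt templates for the same query. The template wording, the variant names `zero_shot` and `cot`, and the helper `build_prompt` below are hypothetical and are not taken from querygym's prompt bank.

```python
# Hypothetical prompt templates; the real prompts may differ.
ZERO_SHOT_TEMPLATE = (
    "Write a passage that answers the following query.\n"
    "Query: {query}\n"
    "Passage:"
)
COT_TEMPLATE = (
    "Answer the following query. Think step by step, "
    "then write a passage that answers it.\n"
    "Query: {query}\n"
    "Reasoning:"
)


def build_prompt(query: str, variant: str = "zero_shot") -> str:
    """Return the prompt text for the chosen variant (illustrative only)."""
    templates = {"zero_shot": ZERO_SHOT_TEMPLATE, "cot": COT_TEMPLATE}
    if variant not in templates:
        raise ValueError(f"unknown variant: {variant!r}")
    return templates[variant].format(query=query)


zero_shot_prompt = build_prompt("neural networks")
cot_prompt = build_prompt("neural networks", variant="cot")
print(zero_shot_prompt)
print(cot_prompt)
```

In both cases the generated passage would then be combined with the original query before retrieval; only the instruction given to the model changes.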
qa_expand: decomposes the query into sub-questions, generates answers, and refines the answers.
reformulator = qg.create_reformulator("qa_expand", model="gpt-4")

mugi: multi-granularity information expansion.
reformulator = qg.create_reformulator("mugi", model="gpt-4")

lamer: context-based passage synthesis using retrieved documents.
# Load queries and the contexts retrieved for them
queries = qg.load_queries("queries.tsv")
contexts = qg.load_contexts("contexts.jsonl")
# Create reformulator
reformulator = qg.create_reformulator("lamer", model="gpt-4")
# Reformulate with contexts
results = reformulator.reformulate_batch(queries, contexts=contexts)

Note: LameR requires contexts from initial retrieval.
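
Because context-based methods depend on an initial retrieval step, it can help to check that every query actually has contexts before starting a batch. The sketch below assumes contexts are held as a plain dictionary mapping query IDs to lists of passage strings; that layout and the helper `find_missing_contexts` are assumptions for illustration, since the structure returned by load_contexts is not described here.

```python
def find_missing_contexts(query_ids: list[str], contexts: dict[str, list[str]]) -> list[str]:
    """Return the IDs of queries that have no retrieved passages (illustrative only)."""
    # A query counts as missing if it is absent from the mapping
    # or if its list of passages is empty.
    return [qid for qid in query_ids if not contexts.get(qid)]


query_ids = ["q1", "q2", "q3"]
contexts = {
    "q1": ["passage a", "passage b"],  # hypothetical retrieved passages
    "q2": [],                          # retrieval returned nothing
}
missing = find_missing_contexts(query_ids, contexts)
print(missing)  # ['q2', 'q3']
```

Queries reported as missing could be re-run through initial retrieval or handled by a method that does not need contexts.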
query2e: query-to-entity expansion.
reformulator = qg.create_reformulator("query2e", model="gpt-4")

csqe: context-based sentence extraction from retrieved documents.
# Requires contexts from initial retrieval
queries = qg.load_queries("queries.tsv")
contexts = qg.load_contexts("contexts.jsonl")
reformulator = qg.create_reformulator("csqe", model="gpt-4")
results = reformulator.reformulate_batch(queries, contexts=contexts)

thinkqe: multi-round, reasoning-based expansion with iterative corpus feedback.
# searcher must be a retrieval searcher object created beforehand (not shown here)
reformulator = qg.create_reformulator(
    "thinkqe",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-14B",
    params={
        "searcher": searcher,
        "num_interaction": 3,
        "keep_passage_num": 5,
        "gen_num": 2,
        "accumulate": True,
        "use_passage_filter": True,
        "search_k": 1000,
    },
)

| Method | Requires Context | Type | Best For |
|---|---|---|---|
| genqr | No | Keyword expansion | General queries |
| genqr_ensemble | No | Keyword expansion | Robust expansion |
| query2doc | No | Pseudo-document | Dense retrieval |
| qa_expand | No | QA-based | Complex queries |
| mugi | No | Multi-granular | Diverse expansion |
| lamer | Yes | Context synthesis | Re-ranking |
| query2e | No | Entity expansion | Entity queries |
| csqe | Yes | Sentence extraction | Precision-focused |
| thinkqe | Yes | Iterative reasoning | Multi-round feedback |
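
The comparison table can also be expressed as a small lookup, which is handy when choosing a method programmatically. The values below are copied from the "Requires Context" column; the dictionary and the helpers `methods_without_context` and `can_run` are hypothetical names and are not part of querygym.

```python
# "Requires Context" column of the table above, as a lookup.
REQUIRES_CONTEXT = {
    "genqr": False,
    "genqr_ensemble": False,
    "query2doc": False,
    "qa_expand": False,
    "mugi": False,
    "lamer": True,
    "query2e": False,
    "csqe": True,
    "thinkqe": True,
}


def methods_without_context() -> list[str]:
    """List the methods that can run on queries alone."""
    return [name for name, needs in REQUIRES_CONTEXT.items() if not needs]


def can_run(method: str, have_contexts: bool) -> bool:
    """Return True if the method's context requirement is satisfied."""
    if method not in REQUIRES_CONTEXT:
        raise KeyError(f"unknown method: {method!r}")
    return have_contexts or not REQUIRES_CONTEXT[method]


print(methods_without_context())
print(can_run("lamer", have_contexts=False))
```
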
All methods support custom parameters:
reformulator = qg.create_reformulator(
    "genqr_ensemble",
    model="gpt-4",
    params={
        "repeat_query_weight": 5,
        "temperature": 0.7
    },
    llm_config={
        "temperature": 0.8,
        "max_tokens": 512
    }
)

Process multiple queries efficiently:
queries = qg.load_queries("queries.tsv")
reformulator = qg.create_reformulator("genqr", model="gpt-4")
# Batch reformulation with progress bar
results = reformulator.reformulate_batch(queries)

See the Prompt Bank for details on customizing prompts.