Weaver is a modular pipeline that dynamically combines SQL and Large Language Models (LLMs) for advanced table-based question answering. Unlike rigid approaches, Weaver generates flexible execution plans that use SQL for structured data operations and LLMs for semantic reasoning, automatically deciding the best tool for each subtask. Our method consistently outperforms state-of-the-art approaches across four major TableQA datasets while reducing API costs and improving accuracy through intelligent query decomposition.
**Paper:** [Weaver: Interweaving SQL and LLM for Table Reasoning](https://arxiv.org/abs/2505.18961)
## Installation

- Clone the repository:

  ```shell
  git clone https://github.com/rohitkhoja/weaver.git
  cd weaver
  ```

- Install dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Install Weaver in editable mode:

  ```shell
  pip install -e .
  ```
## Configuration

- Copy the environment template:

  ```shell
  cp .env.example .env
  ```

- Configure your `.env` file with the following essential settings:
```shell
# REQUIRED: LLM API key (choose one provider)
OPENAI_API_KEY=your-openai-api-key-here

# REQUIRED: LLM model (LiteLLM format: provider/model)
LLM_MODEL=openai/gpt-4o-mini

# REQUIRED: dataset directory (where your CSV files are stored)
WEAVER_DATASETS_DIR=./datasets

# REQUIRED: database configuration (MySQL recommended)
WEAVER_DB_TYPE=mysql
WEAVER_DB_HOST=localhost
WEAVER_DB_PORT=3306
WEAVER_DB_NAME=weaver_db
WEAVER_DB_USER=root
WEAVER_DB_PASSWORD=your-mysql-password

# Optional: logging level
WEAVER_LOG_LEVEL=INFO
```
> ⚠️ **Important:** For now, use MySQL as the database backend. Support for other databases is in progress.
Make sure you have MySQL installed and running, then create the database:

```shell
# Create database
mysql -u root -p
CREATE DATABASE weaver_db;
exit
```
## Quick Start

### Single Question

```python
from weaver import TableQA, WeaverConfig

# Initialize with environment configuration
config = WeaverConfig.from_env()
qa = TableQA(config)

# Ask a question using the JSON object format
question_obj = {
    "table_id": "example-001",
    "question": "Which country had the most cyclists finish within the top 10?",
    "table_file_name": "./datasets/WikiTableQuestions/csv/203-csv/733.csv",
    "target_value": "Italy",
    "table_name": "2008 Clásica de San Sebastián"
}

result = qa.ask(question_obj)
print(f"Answer: {result.answer}")
print(f"Correct: {result.is_correct}")
```
### Batch Evaluation

```python
from weaver import TableQA, WeaverConfig

config = WeaverConfig.from_env()
qa = TableQA(config)

# Process multiple questions from a dataset
results = qa.evaluate_dataset(
    dataset_name="wikitq",
    data_path="./datasets/wikitq.json",
    num_samples=100
)

# Calculate accuracy
accuracy = sum(r.is_correct for r in results) / len(results)
print(f"Accuracy: {accuracy:.2%}")
```
### Questions with Textual Context (e.g., FinQA)

```python
question_obj = {
    "table_id": "ADI/2011/page_61.pdf",
    "question": "What is the percentage change in cash flow hedges in 2011 compared to 2010?",
    "table_file_name": "./datasets/FINQA/csv/ADI_2011_page_61.csv",
    "target_value": "9.9%",
    "table_name": "ADI/2011/page_61.pdf",
    "paragraphs": "Additional context about cash flow hedges and financial data..."
}

result = qa.ask(question_obj)
print(f"Answer: {result.answer}")
```
## Command-Line Interface

```shell
# Ask a single question
python -m weaver.cli.main ask "Which country won the most medals?" \
    --table-path ./datasets/olympics.csv

# Evaluate on a dataset
python -m weaver.cli.main evaluate wikitq \
    --data-path ./datasets/wikitq.json \
    --num-samples 50

# Show configuration
python -m weaver.cli.main config-info
```
## Datasets

Weaver supports multiple TableQA datasets. Place your data in the directory specified by `WEAVER_DATASETS_DIR`, using this structure:
```
datasets/
├── WikiTableQuestions/
│   └── csv/
│       └── 203-csv/
│           └── 733.csv
├── FINQA/
│   └── csv/
│       └── ADI_2011_page_61.csv
├── TabFact/
│   └── csv/
├── OTT-QA/
│   └── tables/
├── wikitq.json    # Question dataset
├── finqa.json     # Question dataset
├── tabfact.json   # Question dataset
└── ott-qa.json    # Question dataset
```
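A quick sanity check of this layout can save a failed run later. The `check_layout` helper below is a hypothetical utility, not part of Weaver's API; the `EXPECTED` list simply mirrors the tree above:

```python
from pathlib import Path

# Expected entries under WEAVER_DATASETS_DIR (illustrative, not exhaustive)
EXPECTED = [
    "WikiTableQuestions/csv",
    "FINQA/csv",
    "TabFact/csv",
    "OTT-QA/tables",
    "wikitq.json",
    "finqa.json",
    "tabfact.json",
    "ott-qa.json",
]

def check_layout(root: str) -> list[str]:
    """Return the expected paths that are missing under `root`."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).exists()]

missing = check_layout("./datasets")
if missing:
    print("Missing:", ", ".join(missing))
```

Only the datasets you actually evaluate need to be present; treat anything reported as missing accordingly.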
Each question dataset is a JSON array of question objects:

```json
[
  {
    "table_id": "nu-0",
    "question": "Which country had the most cyclists finish within the top 10?",
    "table_file_name": "./datasets/WikiTableQuestions/csv/203-csv/733.csv",
    "target_value": "Italy",
    "table_name": "2008 Clásica de San Sebastián",
    "paragraphs": "Optional context text..."
  }
]
```
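Before a batch run, it can help to verify that every entry carries the required keys. The `REQUIRED_KEYS` set and `validate_questions` helper below are illustrative assumptions based on the format shown, not part of Weaver's API (`paragraphs` is treated as optional):

```python
import json

# Keys every question object must carry; "paragraphs" is optional context
REQUIRED_KEYS = {"table_id", "question", "table_file_name", "target_value", "table_name"}

def validate_questions(entries: list[dict]) -> list[tuple[int, set]]:
    """Return (index, missing_keys) for each malformed entry."""
    problems = []
    for i, entry in enumerate(entries):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append((i, missing))
    return problems

data = json.loads("""[
  {"table_id": "nu-0",
   "question": "Which country had the most cyclists finish within the top 10?",
   "table_file_name": "./datasets/WikiTableQuestions/csv/203-csv/733.csv",
   "target_value": "Italy",
   "table_name": "2008 Clasica de San Sebastian"}
]""")
print(validate_questions(data))  # an empty list means every entry is well-formed
```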
## Environment Variables

| Variable | Description | Example | Required |
|---|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key | `sk-proj-...` | ✅ |
| `LLM_MODEL` | LLM model in LiteLLM format | `openai/gpt-4o-mini` | ✅ |
| `WEAVER_DATASETS_DIR` | Path to datasets directory | `./datasets` | ✅ |
| `WEAVER_DB_TYPE` | Database type | `mysql` | ✅ |
| `WEAVER_DB_HOST` | Database host | `localhost` | ✅ |
| `WEAVER_DB_PORT` | Database port | `3306` | ✅ |
| `WEAVER_DB_NAME` | Database name | `weaver_db` | ✅ |
| `WEAVER_DB_USER` | Database username | `root` | ✅ |
| `WEAVER_DB_PASSWORD` | Database password | `your_password` | ✅ |
| `WEAVER_LOG_LEVEL` | Logging level | `INFO` | ⚪ |
| `LLM_TEMPERATURE` | Model temperature | `0.01` | ⚪ |
| `LLM_MAX_TOKENS` | Max output tokens | `2048` | ⚪ |
## LLM Providers

Weaver uses LiteLLM and supports 100+ LLM providers:

```shell
# OpenAI
export OPENAI_API_KEY="sk-..."
export LLM_MODEL="openai/gpt-4o-mini"

# Anthropic Claude
export ANTHROPIC_API_KEY="sk-ant-..."
export LLM_MODEL="anthropic/claude-3-sonnet-20240229"
```
## Evaluation

Weaver has been evaluated on four major TableQA datasets:

- **WikiTableQuestions**: Complex reasoning over Wikipedia tables
- **TabFact**: Fact verification over tables
- **FinQA**: Financial reasoning with numerical tables
- **OTT-QA**: Open table-and-text QA
Our experiments show that Weaver consistently outperforms state-of-the-art methods while reducing API calls and error rates.
For detailed results and analysis, see our paper.
## Architecture

Weaver's modular pipeline consists of:

- **Table Preprocessor**: Handles table loading and column filtering
- **Context Manager**: Manages paragraphs and external context
- **Plan Generator**: Creates step-by-step execution plans
- **SQL-LLM Executor**: Dynamically executes SQL and LLM operations
- **Answer Extractor**: Formats and validates final answers
The system dynamically decides when to use SQL for structured operations and when to leverage LLMs for semantic reasoning.
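The dispatch idea can be sketched with a toy plan whose steps are tagged `sql` or `llm` and routed to the matching executor. All names here (`Step`, `run_plan`, the stub executors) are illustrative, not Weaver's actual classes:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    tool: str         # "sql" for structured operations, "llm" for semantic reasoning
    instruction: str  # SQL query or natural-language directive

def run_plan(steps: list[Step],
             executors: dict[str, Callable[[str, str], str]],
             state: str = "") -> str:
    """Thread an intermediate result through each step's executor."""
    for step in steps:
        state = executors[step.tool](step.instruction, state)
    return state

# Stub executors standing in for a real SQL engine and an LLM call
executors = {
    "sql": lambda instr, state: f"rows({instr})",
    "llm": lambda instr, state: f"reasoned({instr} over {state})",
}

plan = [
    Step("sql", "SELECT country, rank FROM riders WHERE rank <= 10"),
    Step("llm", "group riders by country and pick the most frequent"),
]
print(run_plan(plan, executors))
```

The real Plan Generator produces such step sequences from the question, and the SQL-LLM Executor plays the role of the dispatch loop.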
## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
## Citation

If you use Weaver in your research, please cite our paper:

```bibtex
@misc{khoja2025weaverinterweavingsqlllm,
  title={Weaver: Interweaving SQL and LLM for Table Reasoning},
  author={Rohit Khoja and Devanshu Gupta and Yanjie Fu and Dan Roth and Vivek Gupta},
  year={2025},
  eprint={2505.18961},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2505.18961},
}
```
## Acknowledgments

This work was inspired by and builds upon several important contributions in the field:

- **BlendSQL**: A Scalable Dialect for Unifying Hybrid Question Answering in Relational Algebra
- **ProTrix**: Building Models for Planning and Reasoning over Tables with Sentence Context
- **H-STAR**: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables
- **Binder**: Binding Language Models in Symbolic Languages
## License

This project is licensed under the MIT License - see the LICENSE file for details.