A Mastra template that demonstrates how to protect against token limits by generating AI summaries of large datasets before returning them as tool-call output.
🎯 Key Learning: This template shows how to use large context window models (OpenAI GPT-4.1 Mini) as a "summarization layer" to compress large documents into focused summaries, enabling efficient downstream processing without hitting token limits.
This template showcases a crucial architectural pattern for working with large documents and LLMs:
🚨 The Problem: Large PDFs can contain 50,000+ tokens, overwhelming context windows and making every processing pass expensive.
✅ The Solution: Use a large context window model (OpenAI GPT-4.1 Mini) to generate focused summaries, then use those summaries for downstream processing.
- Input: PDF URL
- Download & Summarize: Fetch PDF, extract text, and generate AI summary using OpenAI GPT-4.1 Mini
- Generate Questions: Create focused questions from the summary (not the full text)
- 📉 Token Reduction: 80-95% reduction in token usage
- 🎯 Better Quality: More focused questions from key insights
- 💰 Cost Savings: Dramatically reduced processing costs
- ⚡ Faster Processing: Summaries are much faster to process than full text
- Node.js 20.9.0 or higher
- OpenAI API key (for both summarization and question generation)
1. Clone and install dependencies:

   ```bash
   git clone <repository-url>
   cd template-pdf-questions
   pnpm install
   ```

2. Set up environment variables:

   ```bash
   cp .env.example .env
   ```

   Edit `.env` and add your API key:

   ```bash
   OPENAI_API_KEY="your-openai-api-key-here"
   ```

3. Run the example:

   ```bash
   npx tsx example.ts
   ```
This template demonstrates a crucial pattern for working with large datasets in LLM applications:
When processing large documents (PDFs, reports, transcripts), you often encounter:
- Token limits: Documents can exceed context windows
- High costs: Processing 50,000+ tokens repeatedly is expensive
- Poor quality: LLMs perform worse on extremely long inputs
- Slow processing: Large inputs take longer to process
Instead of passing raw data through your pipeline:
- Use a large context window model (OpenAI GPT-4.1 Mini) to digest the full content
- Generate focused summaries that capture key information
- Pass summaries to downstream processing instead of raw data
```typescript
// ❌ BAD: Pass full text through pipeline
const questions = await generateQuestions(fullPdfText); // 50,000 tokens!

// ✅ GOOD: Summarize first, then process
const summary = await summarizeWithGPT41Mini(fullPdfText); // 2,000 tokens
const focusedQuestions = await generateQuestions(summary); // Much better!
```
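For reference, a helper along the lines of `summarizeWithGPT41Mini` can be sketched with the AI SDK (the prompt, model wiring, and function body here are assumptions, not the template's actual implementation):

```typescript
import { openai } from '@ai-sdk/openai';
import { generateText } from 'ai';

// Sketch of the summarization layer: compress a full document into a focused
// summary using a large context window model (GPT-4.1 Mini).
async function summarizeWithGPT41Mini(fullText: string): Promise<string> {
  const { text } = await generateText({
    model: openai('gpt-4.1-mini'),
    prompt: `Summarize the following document, preserving key findings, methods, and conclusions:\n\n${fullText}`,
  });
  return text;
}
```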
- Large documents: PDFs, reports, transcripts
- Batch processing: Multiple documents
- Cost optimization: Reduce token usage
- Quality improvement: More focused processing
- Chain operations: Multiple LLM calls on same data
Using the workflow:

```typescript
import { mastra } from './src/mastra/index';

const run = await mastra.getWorkflow('pdfToQuestionsWorkflow').createRunAsync();

// Using a PDF URL
const result = await run.start({
  inputData: {
    pdfUrl: 'https://example.com/document.pdf',
  },
});

if (result.status === 'success') {
  console.log(result.result.questions);
}
```
The agent can handle the full process with natural language:

```typescript
import { mastra } from './src/mastra/index';

const agent = mastra.getAgent('pdfQuestionsAgent');

const response = await agent.stream([
  {
    role: 'user',
    content: 'Please download this PDF and generate questions from it: https://example.com/document.pdf',
  },
]);

for await (const chunk of response.textStream) {
  console.log(chunk);
}
```
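If streaming isn't needed, the same request works as a one-shot call (assuming Mastra's standard `generate` method on agents):

```typescript
// One-shot variant of the call above; response.text holds the complete reply.
const response = await agent.generate(
  'Please download this PDF and generate questions from it: https://example.com/document.pdf',
);
console.log(response.text);
```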
The tools can also be called individually:

```typescript
import { RuntimeContext } from '@mastra/core/runtime-context';
import { mastra } from './src/mastra/index';
import { pdfFetcherTool } from './src/mastra/tools/download-pdf-tool';
import { generateQuestionsFromTextTool } from './src/mastra/tools/generate-questions-from-text-tool';

// Step 1: Download PDF and generate summary
const pdfResult = await pdfFetcherTool.execute({
  context: { pdfUrl: 'https://example.com/document.pdf' },
  mastra,
  runtimeContext: new RuntimeContext(),
});

console.log(`Downloaded ${pdfResult.fileSize} bytes from ${pdfResult.pagesCount} pages`);
console.log(`Generated ${pdfResult.summary.length} character summary`);

// Step 2: Generate questions from summary
const questionsResult = await generateQuestionsFromTextTool.execute({
  context: {
    extractedText: pdfResult.summary,
    maxQuestions: 10,
  },
  mastra,
  runtimeContext: new RuntimeContext(),
});

console.log(questionsResult.questions);
```
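The same call shape extends to batches; a sketch that maps `pdfFetcherTool` over several URLs (reusing the imports above; the URLs are placeholders):

```typescript
// Batch sketch: summarize several PDFs up front, then work only with the compact summaries.
const urls = ['https://example.com/a.pdf', 'https://example.com/b.pdf'];

const summaries = await Promise.all(
  urls.map(async (pdfUrl) => {
    const result = await pdfFetcherTool.execute({
      context: { pdfUrl },
      mastra,
      runtimeContext: new RuntimeContext(),
    });
    return result.summary; // a few thousand characters instead of a full document
  }),
);
```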
A successful workflow run returns:

```javascript
{
  status: 'success',
  result: {
    questions: [
      "What is the main objective of the research presented in this paper?",
      "Which methodology was used to collect the data?",
      "What are the key findings of the study?",
      // ... more questions
    ],
    success: true
  }
}
```
Key components:

- `pdfToQuestionsWorkflow`: Main workflow orchestrating the process
- `textQuestionAgent`: Mastra agent specialized in generating educational questions
- `pdfQuestionAgent`: Complete agent that can handle the full PDF-to-questions pipeline
- `pdfFetcherTool`: Downloads PDF files from URLs, extracts text, and generates AI summaries
- `generateQuestionsFromTextTool`: Generates comprehensive questions from summarized content
- `download-and-summarize-pdf` (workflow step): Downloads the PDF from the provided URL and generates an AI summary
- `generate-questions-from-summary` (workflow step): Creates comprehensive questions from the AI summary
- ✅ Token Limit Protection: Demonstrates how to handle large datasets without hitting context limits
- ✅ 80-95% Token Reduction: AI summarization drastically reduces processing costs
- ✅ Large Context Window: Uses OpenAI GPT-4.1 Mini to handle large documents efficiently
- ✅ Zero System Dependencies: Pure JavaScript solution
- ✅ Single API Setup: OpenAI for both summarization and question generation
- ✅ Fast Text Extraction: Direct PDF parsing (no OCR needed for text-based PDFs)
- ✅ Educational Focus: Generates focused learning questions from key insights
- ✅ Multiple Interfaces: Workflow, Agent, and individual tools available
This template uses a pure JavaScript approach that works for most PDFs:
1. Text-based PDFs (90% of cases): Direct text extraction using `pdf2json` (sketched below)
   - ⚡ Fast and reliable
   - 🔧 No system dependencies
   - ✅ Works out of the box
2. Scanned PDFs: Would require OCR, but most PDFs today contain embedded text
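For illustration, direct extraction with `pdf2json` can look like the sketch below (the template's actual helper lives in `src/mastra/lib/util.ts`; treat the event wiring here as an approximation of the pdf2json API rather than the template's exact code):

```typescript
import PDFParser from 'pdf2json';

// Sketch: pull embedded text straight out of a PDF buffer (no OCR involved).
function extractTextFromPdf(buffer: Buffer): Promise<string> {
  return new Promise((resolve, reject) => {
    const parser = new PDFParser(null, true); // second argument requests raw text content
    parser.on('pdfParser_dataError', (err) => reject(err.parserError));
    parser.on('pdfParser_dataReady', () => resolve(parser.getRawTextContent()));
    parser.parseBuffer(buffer);
  });
}
```

Skipping OCR entirely is what keeps the design simple: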
- Simplicity: No GraphicsMagick, ImageMagick, or other system tools needed
- Speed: Direct text extraction is much faster than OCR
- Reliability: Works consistently across different environments
- Educational: Easy for developers to understand and modify
- Single Path: One clear workflow with no complex branching
Required environment variable:

```bash
OPENAI_API_KEY=your_openai_api_key_here
```
You can customize the question generation by modifying the `textQuestionAgent`:
```typescript
import { openai } from '@ai-sdk/openai';
import { Agent } from '@mastra/core/agent';

export const textQuestionAgent = new Agent({
  name: 'Generate questions from text agent',
  instructions: `
    You are an expert educational content creator...
    // Customize instructions here
  `,
  model: openai('gpt-4o'),
});
```
```
src/mastra/
├── agents/
│   ├── pdf-question-agent.ts                     # PDF processing and question generation agent
│   └── text-question-agent.ts                    # Text to questions generation agent
├── tools/
│   ├── download-pdf-tool.ts                      # PDF download tool
│   ├── extract-text-from-pdf-tool.ts             # PDF text extraction tool
│   └── generate-questions-from-text-tool.ts      # Question generation tool
├── workflows/
│   └── generate-questions-from-pdf-workflow.ts   # Main workflow
├── lib/
│   └── util.ts                                   # Utility functions including PDF text extraction
└── index.ts                                      # Mastra configuration
```
```bash
# Run with a test PDF
export OPENAI_API_KEY="your-api-key"
npx tsx example.ts
```
**OpenAI API errors:**

- Make sure you've set the `OPENAI_API_KEY` environment variable
- Check that your API key is valid and has sufficient credits

**PDF download failures:**

- Verify the PDF URL is accessible and publicly available
- Check network connectivity
- Ensure the URL points to a valid PDF file
- Some servers may require authentication or have restrictions

**Text extraction problems:**

- The PDF might be password-protected
- Very large PDFs might take longer to process
- Scanned PDFs without embedded text won't work (rare with modern PDFs)

**Token limit errors:**

- Solution: Use a smaller PDF file (under ~5-10 pages)
- Automatic Truncation: The tool automatically uses only the first 4000 characters of very large documents (see the sketch below)
- Helpful Errors: Clear messages guide you to use smaller PDFs when needed
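A minimal sketch of that truncation guard (the 4000-character cap is the behavior described above; the variable names are illustrative):

```typescript
// Cap very large inputs before summarization, mirroring the tool's truncation behavior.
const MAX_CHARS = 4000;
const textForSummary = extractedText.slice(0, MAX_CHARS); // no-op for shorter documents
```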
- Single dependency for PDF processing (`pdf2json`)
- No system tools or complex setup required
- Works immediately after `pnpm install`
- Multiple usage patterns (workflow, agent, tools)
- Direct text extraction (no image conversion)
- Much faster than OCR-based approaches
- Handles reasonably-sized documents efficiently
- Pure JavaScript/TypeScript
- Easy to understand and modify
- Clear separation of concerns
- Simple error handling with helpful messages
- Generates multiple question types
- Covers different comprehension levels
- Perfect for creating study materials
This token limit protection pattern can be applied to many other scenarios:
- Legal documents: Summarize contracts before analysis
- Research papers: Extract key findings before comparison
- Technical manuals: Create focused summaries for specific topics
- Social media: Summarize large thread conversations
- Customer feedback: Compress reviews before sentiment analysis
- Meeting transcripts: Extract action items and decisions
- Log analysis: Summarize error patterns before classification
- Survey responses: Compress feedback before theme extraction
- Code reviews: Summarize changes before generating reports
- Use OpenAI GPT-4.1 Mini for initial summarization (large context window)
- Pass summaries to downstream tools, not raw data
- Chain summaries for multi-step processing (see the sketch below)
- Preserve metadata (file size, page count) for context
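As a sketch of that chaining idea, reusing the hypothetical `summarizeWithGPT41Mini` helper from earlier (`documents` and `generateQuestions` are placeholders, not template exports):

```typescript
// Hypothetical chain: summarize each document, then digest the summaries,
// so the downstream step sees a few thousand tokens instead of 50,000+.
const perDocSummaries = await Promise.all(documents.map((doc) => summarizeWithGPT41Mini(doc)));
const digest = await summarizeWithGPT41Mini(perDocSummaries.join('\n\n---\n\n'));
const questions = await generateQuestions(digest);
```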
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request