Quantum Minds Document Operators

Introduction

Document operators in Quantum Minds let you work with unstructured and semi-structured content, including PDFs, web pages, and raw text. They use Retrieval Augmented Generation (RAG) to extract insights, summarize content, and answer questions grounded in your document collections.

Available Document Operators

Operator	Description	Common Use Cases
RAGSummarize	Summarizes content using RAG techniques	Document summarization, content overview, knowledge extraction
RAGSummarizeV2	Enhanced RAG summarization with improved context	Complex document analysis, multi-document summarization
TextSummarize	Generates summaries from text input	Content condensation, key point extraction
RAGParse	Parses documents using specialized models	Structure extraction, form processing, data extraction
InvoiceExtractor	Extracts structured data from invoices	Invoice processing, financial document handling
InvoiceExtractorV2	Enhanced invoice data extraction	Complex invoice layouts, multi-page invoices
RAGSubQuestions	Breaks complex queries into sub-questions	Comprehensive document Q&A, detailed research
CreateKnowledgeBase	Creates knowledge bases from documents	Knowledge management, searchable document repositories
GeminiMultiModal	Processes multiple content types with Gemini	Multi-format content analysis, audio processing
ClaudeMultiModal	Processes multiple content types with Claude	Image and document understanding

RAGSummarize

The RAGSummarize operator uses Retrieval Augmented Generation to summarize content from document collections, providing accurate and contextually relevant summaries.

Inputs

Parameter	Type	Required	Description
prompt	string	Yes	Instructions for what to summarize or question to answer
collection	string	Yes	Document collection to query
trigger	string	No	Optional control signal

Outputs

Parameter	Type	Description
type	string	Output format (markdown)
content	string	Generated summary or answer

Example Usage

Prompt: "Summarize the key benefits of our new product offering based on the marketing materials"
Collection: "product_marketing_docs"

Output: Comprehensive summary of product benefits extracted from the marketing materials

Best Practices

Be specific about what information you're looking for
Include context about why you need the information
Specify any required structure for the output
For large collections, narrow the focus to relevant documents
Request citations or reference points when needed
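
The practices above can be folded into a small prompt builder. This is a minimal sketch, not part of Quantum Minds: build_prompt and its arguments are hypothetical helpers, and the inputs dictionary only mirrors the parameter names from the Inputs table.

```python
def build_prompt(request, context=None, output_format=None, focus=None, cite=False):
    """Assemble one prompt string from the best-practice ingredients."""
    parts = [request.strip()]
    if context:
        # Why the information is needed.
        parts.append(f"Context: {context.strip()}")
    if focus:
        # Narrow the query for large collections.
        parts.append(f"Focus only on: {focus.strip()}")
    if output_format:
        # Required structure of the answer.
        parts.append(f"Format the answer as: {output_format.strip()}")
    if cite:
        parts.append("Cite the source document for each point.")
    return "\n".join(parts)


# Hypothetical input payload; keys follow the Inputs table (prompt, collection).
inputs = {
    "prompt": build_prompt(
        "Summarize the key benefits of our new product offering "
        "based on the marketing materials.",
        context="The summary will be used in a sales enablement deck.",
        output_format="five bullet points",
        cite=True,
    ),
    "collection": "product_marketing_docs",
}
print(inputs["prompt"])
```

Keeping each ingredient on its own line makes it easy to see which practice a given prompt skipped.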

RAGSummarizeV2

RAGSummarizeV2 is an enhanced version of RAGSummarize with improved context handling and better performance on complex document collections.

Key Improvements Over V1

Better handling of large document collections
Improved accuracy on complex topics
Enhanced context preservation
More efficient retrieval mechanism
Better handling of diverse document formats

When to Use V2 Instead of V1

For large document collections (100+ documents)
When dealing with technical or specialized content
When accuracy is critical
For multi-document summarization tasks
When you need to resolve contradictions across documents
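
The criteria above can be written as a simple selection rule. This is an illustrative sketch only: choose_summarizer and its flags are hypothetical names, and the 100-document threshold is taken from the first criterion in the list.

```python
def choose_summarizer(doc_count, technical=False, accuracy_critical=False,
                      multi_document=False, has_contradictions=False):
    """Return the operator name suggested by the V2 criteria."""
    needs_v2 = any([
        doc_count >= 100,      # large document collections (100+ documents)
        technical,             # technical or specialized content
        accuracy_critical,     # accuracy is critical
        multi_document,        # multi-document summarization
        has_contradictions,    # contradictions across documents
    ])
    return "RAGSummarizeV2" if needs_v2 else "RAGSummarize"


print(choose_summarizer(20))
print(choose_summarizer(20, technical=True))
```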

TextSummarize

The TextSummarize operator generates concise summaries of text input without requiring a document collection.

Inputs

Parameter	Type	Required	Description
prompt	string	Yes	Text to summarize, together with any summarization instructions
trigger	string	No	Optional control signal

Outputs

Parameter	Type	Description
type	string	Output format (markdown)
content	string	Generated summary

Example Usage

Prompt: "Summarize the following meeting transcript: [transcript text]"

Output: Concise summary of the key points discussed in the meeting

Best Practices

Specify desired length or level of detail
Indicate any specific aspects to focus on
Include all text to be summarized in the prompt
Consider breaking very long texts into logical sections
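
One way to break a very long text into logical sections is to group whole paragraphs until a size budget is reached. The sketch below assumes paragraphs are separated by blank lines; split_into_sections and the 4000-character default are illustrative choices, not documented limits of TextSummarize.

```python
def split_into_sections(text, max_chars=4000):
    """Group paragraphs into sections that aim to stay under max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut
    mid-sentence, so a section can exceed the budget in that case.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    sections, current = [], ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the section.
            sections.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections


transcript = "Opening remarks.\n\nBudget discussion.\n\nAction items."
prompts = [
    f"Summarize the following meeting transcript section: {section}"
    for section in split_into_sections(transcript, max_chars=40)
]
print(len(prompts))
```

Each resulting prompt can be sent to TextSummarize separately, and the partial summaries combined in a final pass.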

RAGParse

The RAGParse operator extracts structured information from documents using specialized parsing models.

Inputs

Parameter	Type	Required	Description
prompt	string	Yes	Instructions for what information to extract
collection	string	Yes	Document collection to process
trigger	string	No	Optional control signal

Outputs

Parameter	Type	Description
type	string	Output format (markdown)
content	string	Extracted structured information

Example Usage

Prompt: "Extract all tables from the financial reports and present them in a structured format with proper headers"
Collection: "quarterly_financial_reports"

Output: Structured representation of all tables from the financial reports

Best Practices

Be specific about the structures you want to extract
Provide examples of the desired output format
For complex documents, focus on specific sections
Consider pre-processing complex documents into smaller collections

InvoiceExtractor and InvoiceExtractorV2

Specialized operators for extracting structured data from invoice documents.

InvoiceExtractor

Inputs

Parameter	Type	Required	Description
file	file	Yes	Invoice file to process
trigger	string	No	Optional control signal

Outputs

Parameter	Type	Description
type	string	Output format (object)
content	string	Extracted invoice data

InvoiceExtractorV2

Enhanced version with improved accuracy and expanded field recognition.

Key Improvements Over V1

Better handling of complex layouts
Enhanced field detection for non-standard invoices
Multi-page invoice support
Improved line item extraction
Better handling of diverse currencies and formats

Example Usage

File: [Uploaded invoice PDF]

Output:
{
  "vendor": "Acme Corp",
  "invoice_number": "INV-12345",
  "date": "2023-09-15",
  "due_date": "2023-10-15",
  "total_amount": 1250.00,
  "tax_amount": 250.00,
  "currency": "USD",
  "line_items": [
    {
      "description": "Professional Services",
      "quantity": 5,
      "unit_price": 200.00,
      "amount": 1000.00
    }
  ],
  "payment_details": {
    "bank_name": "First Bank",
    "account_number": "XXXX1234"
  }
}

Best Practices

Use high-quality scans when possible
Test with a variety of invoice formats
Validate extracted data for critical fields
Consider post-processing for specialized formats
Use V2 for complex or non-standard invoice layouts
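
Validating critical fields can be as simple as checking presence and arithmetic consistency. The sketch below works on the sample output shown above. The list of required fields and the rule that total_amount equals the line items plus tax_amount are assumptions that happen to hold for that sample; adjust them to your own invoices.

```python
import json
import math

# Assumed set of critical fields; not a documented schema.
REQUIRED_FIELDS = ("vendor", "invoice_number", "date", "total_amount", "line_items")


def validate_invoice(invoice, tolerance=0.01):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if invoice.get(field) in (None, "", []):
            problems.append(f"missing field: {field}")

    items = invoice.get("line_items") or []
    for index, item in enumerate(items):
        expected = item.get("quantity", 0) * item.get("unit_price", 0)
        if not math.isclose(expected, item.get("amount", 0), abs_tol=tolerance):
            problems.append(f"line item {index}: quantity * unit_price != amount")

    total = invoice.get("total_amount")
    if isinstance(total, (int, float)):
        expected_total = sum(i.get("amount", 0) for i in items) + invoice.get("tax_amount", 0)
        if not math.isclose(expected_total, total, abs_tol=tolerance):
            problems.append("line items plus tax do not add up to total_amount")
    return problems


sample = json.loads("""
{
  "vendor": "Acme Corp",
  "invoice_number": "INV-12345",
  "date": "2023-09-15",
  "total_amount": 1250.00,
  "tax_amount": 250.00,
  "line_items": [
    {"description": "Professional Services", "quantity": 5,
     "unit_price": 200.00, "amount": 1000.00}
  ]
}
""")
print(validate_invoice(sample))
```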

RAGSubQuestions

The RAGSubQuestions operator breaks down complex queries into manageable sub-questions and then synthesizes a comprehensive answer.

Inputs

Parameter	Type	Required	Description
prompt	string	Yes	Complex question or research topic
collection	string	Yes	Document collection to query
trigger	string	No	Optional control signal

Outputs

Parameter	Type	Description
type	string	Output format (markdown)
content	string	Comprehensive answer

Example Usage

Prompt: "What are the regulatory implications, integration challenges, and potential customer benefits of implementing blockchain in our payment processing system?"
Collection: "fintech_research_documents"

Output: 
Comprehensive analysis that breaks down the question into sub-aspects:
1. Regulatory implications of blockchain in payments
2. Technical integration challenges
3. Customer benefits and experience improvements
Each sub-question is researched and answered before synthesizing a complete response.

Best Practices

Present complex, multi-faceted questions
Provide context about your specific situation
Allow sufficient processing time for thorough analysis
Consider specifying aspects if you have particular areas of interest
Review sub-questions in the response to ensure alignment with your intent
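
Reviewing the sub-questions can be partly automated. The sketch below assumes the markdown response lists its sub-questions as numbered lines, as in the example output above; the actual layout may differ, and numbered_items and missing_aspects are hypothetical helpers.

```python
import re


def numbered_items(markdown):
    """Collect the text of lines that look like '1. something'."""
    pattern = re.compile(r"^[ \t]*\d+\.[ \t]+(.+)$", flags=re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(markdown)]


def missing_aspects(items, aspects):
    """Return the aspects that no extracted item mentions (case-insensitive)."""
    lowered = [item.lower() for item in items]
    return [a for a in aspects if not any(a.lower() in item for item in lowered)]


response = """Comprehensive analysis that breaks down the question into sub-aspects:
1. Regulatory implications of blockchain in payments
2. Technical integration challenges
3. Customer benefits and experience improvements
"""

items = numbered_items(response)
print(missing_aspects(items, ["regulatory", "integration", "customer"]))
```

A non-empty result is a hint to rephrase the prompt so the missing aspect is named explicitly.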

CreateKnowledgeBase

The CreateKnowledgeBase operator builds vector database collections from document content for use with RAG operators.

Inputs

Parameter	Type	Required	Description
process	string	Yes	Process type (e.g., "create", "update")
dataframe	string	Yes	Document content as dataframe
collection_name	string	No	Name for the new collection
trigger	string	No	Optional control signal

Outputs

Parameter	Type	Description
type	string	Output format (markdown)
content	string	Status and information about the created collection
collection	string	Reference to the created collection

Example Usage

Process: "create"
Dataframe: [Document content data]
Collection_name: "legal_contracts_2023"

Output: Confirmation of knowledge base creation with access information
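
Before running the operator, it can help to check a set of inputs against the Inputs table. The sketch below is plain Python with no Quantum Minds API involved: the parameter names and required flags are copied from the table, while check_inputs and the placeholder dataframe value are hypothetical.

```python
# Parameter name -> required?, copied from the CreateKnowledgeBase Inputs table.
CREATE_KB_INPUTS = {
    "process": True,
    "dataframe": True,
    "collection_name": False,
    "trigger": False,
}


def check_inputs(inputs, spec):
    """Return (missing required names, names not in the spec), both sorted."""
    missing = sorted(name for name, required in spec.items()
                     if required and name not in inputs)
    unknown = sorted(name for name in inputs if name not in spec)
    return missing, unknown


example = {
    "process": "create",
    "dataframe": "<document content data>",  # placeholder, not a real dataframe
    "collection_name": "legal_contracts_2023",
}
print(check_inputs(example, CREATE_KB_INPUTS))
```

The same check works for any other operator on this page by swapping in its own parameter table.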

Best Practices

Organize related documents into cohesive collections
Use clear, descriptive names for collections
Consider pre-processing documents for optimal embedding
Structure document metadata for better retrieval
Regularly update collections with new content

MultiModal Operators

GeminiMultiModal

Processes multiple content types (text, images, PDFs, audio) using Google's Gemini model.

Inputs

Parameter	Type	Required	Description
prompt	string	Yes	Instructions or questions
file	string	No	File to analyze (PDF, image, audio)
trigger	string	No	Optional control signal
audio out	Enum	No	Whether to generate audio output

Outputs

Parameter	Type	Description
type	string	Output format (markdown)
content	string	Analysis results
audio	string	Audio response (if requested)

ClaudeMultiModal

Processes text, images, and PDFs using Anthropic's Claude model.

Inputs

Parameter	Type	Required	Description
prompt	string	Yes	Instructions or questions
file	string	No	File to analyze (PDF, image)
trigger	string	No	Optional control signal

Outputs

Parameter	Type	Description
type	string	Output format (markdown)
content	string	Analysis results

Example Usage

# GeminiMultiModal
Prompt: "Analyze this quarterly report and summarize the key financial metrics and business outlook. Also extract the revenue forecast chart and explain its significance."
File: [Quarterly report PDF]
Audio out: "True"

Output: 
- Text analysis of the quarterly report
- Extracted financial metrics
- Analysis of the revenue forecast chart
- Audio narration of the analysis

# ClaudeMultiModal
Prompt: "What information can you extract from this product diagram? List all components and explain how they interact."
File: [Product diagram image]

Output: Detailed analysis of the product diagram with component identification and relationship explanation

Choosing Between MultiModal Operators

Feature	GeminiMultiModal	ClaudeMultiModal
Audio processing	Yes	No
Audio output	Yes	No
Complex document understanding	Good	Excellent
Image analysis	Excellent	Good
Response quality	Concise	Detailed
Processing speed	Faster	Slower, more thorough
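
The comparison can be read as a rough selection rule: audio needs point to GeminiMultiModal, image analysis favors it as well, and complex documents favor ClaudeMultiModal. The sketch below encodes only that reading; choose_multimodal and its arguments are hypothetical.

```python
SUPPORTED_KINDS = {"pdf", "image", "audio"}


def choose_multimodal(file_kind, needs_audio_output=False):
    """Suggest an operator name from the comparison table."""
    if file_kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported file kind: {file_kind!r}")
    if file_kind == "audio" or needs_audio_output:
        # Only GeminiMultiModal lists audio processing and audio output.
        return "GeminiMultiModal"
    if file_kind == "image":
        # Image analysis is rated Excellent for Gemini, Good for Claude.
        return "GeminiMultiModal"
    # Complex document understanding is rated Excellent for Claude.
    return "ClaudeMultiModal"


print(choose_multimodal("pdf"))
print(choose_multimodal("pdf", needs_audio_output=True))
```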

Common Document Operation Patterns

Document Q&A System

CreateKnowledgeBase → RAGSummarizeV2

Comprehensive Research

CreateKnowledgeBase → RAGSubQuestions → TableToTextSummary

Document Processing Pipeline

InvoiceExtractorV2 → TableRowProcessor → DataFrameMerge → CreateDataset

Multi-Modal Analysis

GeminiMultiModal → RAGSummarize → CardGenerator
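
The patterns above can be kept as plain data, which makes it easy to list the hand-offs between steps and to spot operators that come from other categories. This is a sketch for planning purposes only; it does not execute anything in Quantum Minds, and the helper names are hypothetical.

```python
# Operator chains copied from the patterns above.
PATTERNS = {
    "Document Q&A System": ["CreateKnowledgeBase", "RAGSummarizeV2"],
    "Comprehensive Research": ["CreateKnowledgeBase", "RAGSubQuestions", "TableToTextSummary"],
    "Document Processing Pipeline": ["InvoiceExtractorV2", "TableRowProcessor",
                                     "DataFrameMerge", "CreateDataset"],
    "Multi-Modal Analysis": ["GeminiMultiModal", "RAGSummarize", "CardGenerator"],
}

# Operators described on this page.
DOCUMENT_OPERATORS = {
    "RAGSummarize", "RAGSummarizeV2", "TextSummarize", "RAGParse",
    "InvoiceExtractor", "InvoiceExtractorV2", "RAGSubQuestions",
    "CreateKnowledgeBase", "GeminiMultiModal", "ClaudeMultiModal",
}


def describe(pattern):
    """Render a pattern in the arrow notation used above."""
    return " → ".join(PATTERNS[pattern])


def handoffs(pattern):
    """Pairs of (producer, consumer) for each step boundary."""
    steps = PATTERNS[pattern]
    return list(zip(steps, steps[1:]))


def external_operators(pattern):
    """Operators in the pattern that are not document operators."""
    return [step for step in PATTERNS[pattern] if step not in DOCUMENT_OPERATORS]


print(describe("Document Q&A System"))
print(external_operators("Document Processing Pipeline"))
```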

Integration with Lightning RAG

Document operators in Quantum Minds are designed to work seamlessly with collections created in Lightning RAG. This integration enables:

Collection Reuse: Use the same document collections across Lightning RAG chat interfaces and Quantum Minds workflows
Enhanced Processing: Apply specialized document operators to collections for deeper analysis
Workflow Automation: Build automated workflows that process documents and generate insights
Output Augmentation: Combine RAG results with other data sources for comprehensive analysis

Setting Up Lightning RAG Integration

Create and build collections in Lightning RAG following the standard process
In Quantum Minds, reference these collections by name in document operators
Use the results in downstream operators for visualization, summarization, or further processing

Best Practices for Document Operators

Collection Management

Organize by Topic: Create focused collections around specific subjects
Size Considerations: Aim for collections under 1000 documents for optimal performance
Update Strategy: Establish a process for refreshing collections with new content
Quality Control: Include only relevant, high-quality documents

Prompt Engineering

Be Specific: Clearly state what information you're looking for
Provide Context: Include background information relevant to your query
Output Format: Specify desired output structure (bullet points, paragraphs, etc.)
Depth vs. Breadth: Indicate whether you want detailed analysis of specific points or broader coverage

Processing Optimization

Pre-processing: Clean and structure documents before collection creation
Chunking Strategy: Consider how documents are chunked for embedding
Metadata Enrichment: Add useful metadata to improve retrieval
Selective Processing: Target specific document sections when possible

Error Handling

Document Quality Checks: Validate document quality before processing
Format Compatibility: Ensure documents are in supported formats
Fallback Mechanisms: Design minds with alternative paths if document processing fails
Validation Steps: Include validation of extracted information
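
A fallback mechanism boils down to trying alternative paths in order until one returns usable content. The sketch below shows that control flow with stub functions standing in for operators; run_with_fallback and the stubs are hypothetical, and only the type and content output fields come from this page.

```python
def run_with_fallback(steps, payload):
    """Try each (name, callable) in order; return the first usable result.

    Returns (name_used, result, errors_so_far). Raises RuntimeError if
    every path fails or returns empty content.
    """
    errors = []
    for name, run in steps:
        try:
            result = run(payload)
        except Exception as exc:  # record the failure and try the next path
            errors.append(f"{name}: {exc}")
            continue
        if result and result.get("content"):
            return name, result, errors
        errors.append(f"{name}: empty content")
    raise RuntimeError("all paths failed: " + "; ".join(errors))


# Stubs standing in for a primary path and an alternative path.
def primary_path(payload):
    raise ValueError("unsupported file format")


def alternative_path(payload):
    return {"type": "markdown", "content": f"Summary of {payload['file']}"}


used, result, errors = run_with_fallback(
    [("primary", primary_path), ("alternative", alternative_path)],
    {"file": "report.pdf"},
)
print(used, errors)
```

Collecting the errors instead of discarding them keeps a record of why the primary path was skipped.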

Next Steps

Explore how Document Operators can be combined with MongoDB Operators for handling both unstructured documents and semi-structured data.
