Lightning ⚡ RAG User Documentation
Introduction
Lightning RAG (L⚡RAG) is an advanced Retrieval Augmented Generation platform designed to transform how you interact with your data. By combining powerful document processing, database connectivity, and natural language understanding, L⚡RAG enables you to have intelligent conversations with all your information sources through an intuitive chat interface.
This documentation provides comprehensive guidance on how to use Lightning RAG effectively, from creating your first collection to building sophisticated data interactions.
Getting Started
System Requirements
- Modern web browser (Chrome, Firefox, Safari, or Edge)
- Internet connection for cloud-based features
- Minimum screen resolution: 1280x800
Accessing Lightning RAG
- Navigate to your organization's Lightning RAG instance URL
- Log in with your credentials
- You'll be directed to the Collections dashboard
Core Concepts
Collections
Collections are the fundamental building blocks in Lightning RAG. A collection is a set of related data from a specific source type that has been processed and optimized for conversational AI interaction.
Collection Types
Lightning RAG supports three primary data structure categories:
Unstructured Data
- PDF Collections
  - Document-based collections with free-form text, tables, images
  - Supports multiple document formats (PDF, DOCX, TXT)
  - Uses OCR and document understanding technology
Semi-structured Data
- MongoDB Collections
  - Works with NoSQL document databases
  - Handles nested document structures
  - Supports aggregation pipelines
- API Collections
  - Connects to external APIs
  - Auto-generates schema from OpenAPI/Swagger definitions
  - Handles authentication and parameter mapping
Structured Data
- SQL Collections
  - Connects to relational databases with rigid schemas
  - Supports schema understanding and query generation
  - Compatible with PostgreSQL, MySQL, SQL Server, and more
- Excel Collections
  - Processes tabular spreadsheet data
  - Handles multiple sheets and complex formulas
  - Auto-converts to optimized SQL structures internally
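The relationship between the three data structure categories and the five collection types can be summarized as a small lookup table. This is only an illustrative sketch: the names `DataStructure`, `COLLECTION_TYPES`, and `types_in` are made up for this example and are not part of Lightning RAG.

```python
from enum import Enum


class DataStructure(Enum):
    UNSTRUCTURED = "Unstructured"
    SEMI_STRUCTURED = "Semi-structured"
    STRUCTURED = "Structured"


# Collection type -> data structure category, as listed in this section.
COLLECTION_TYPES = {
    "PDF": DataStructure.UNSTRUCTURED,
    "MongoDB": DataStructure.SEMI_STRUCTURED,
    "API": DataStructure.SEMI_STRUCTURED,
    "SQL": DataStructure.STRUCTURED,
    "Excel": DataStructure.STRUCTURED,
}


def types_in(category):
    """Return the collection types that belong to a category, sorted by name."""
    return sorted(name for name, cat in COLLECTION_TYPES.items() if cat is category)
```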
Embedding Types
For unstructured data collections (PDF), Lightning RAG offers three embedding types. Each one names the document-processing engine that parses your files before embeddings are generated:
PaddleOCR
- High-accuracy document processing optimized for complex layouts
- Best for documents with mixed content (text, tables, images)
- Default processing engine
Llama Parse (Cloud)
- Advanced cloud-based parsing for sophisticated document structures
- Superior handling of tables and structured data
- Requires internet connectivity
Docling (On-Premise)
- Secure, locally-hosted solution for sensitive document processing
- Ensures data never leaves your infrastructure
- Ideal for confidential or regulated information
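The guidance above can be read as a simple decision rule. The following sketch encodes it; the function name and its three flags are assumptions made for illustration, and the product itself leaves the choice to you.

```python
def suggest_embedding_type(sensitive: bool, complex_tables: bool,
                           cloud_allowed: bool = True) -> str:
    """Suggest an embedding type from the trade-offs described in this section."""
    # Confidential or regulated content should stay on your own infrastructure.
    if sensitive or not cloud_allowed:
        return "Docling"
    # Llama Parse is cloud-based and stronger on tables and complex structures.
    if complex_tables:
        return "Llama Parse"
    # PaddleOCR is the default engine for general mixed-content documents.
    return "PaddleOCR"
```

For example, a confidential contract with many tables would still map to Docling, because the sensitivity check comes first.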
User Interface Overview
Navigation
- Collections: Manage and interact with your data sources
- Dashboards: View insights and analytics across collections
- Analytics: Track usage, performance metrics, and user engagement
- Settings: Configure system preferences and user access
Collections Dashboard
The Collections dashboard displays all your available collections with key information:
- Collection name and type
- Item count (documents, tables, etc.)
- Status (Ready, Processing)
- Action buttons (Build, Chat, Share)
Filtering and Sorting
- Filter collections by data structure (Unstructured, Semi-structured, Structured) or specific type
- Search collections by name using the search bar
- Sort collections by name, type, creation date, or status
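Conceptually, the dashboard controls combine a filter, a name search, and a sort. The sketch below mimics that behavior on an in-memory list; `CollectionCard`, `filter_cards`, `sort_cards`, and the sample data are all hypothetical and do not reflect an actual Lightning RAG API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CollectionCard:
    name: str
    type: str        # e.g. "PDF", "SQL"
    structure: str   # "Unstructured", "Semi-structured", or "Structured"
    created: date
    status: str      # "Ready" or "Processing"


def filter_cards(cards, structure=None, type_=None, search=""):
    """Keep cards matching the optional structure/type filters and name search."""
    needle = search.lower()
    return [
        c for c in cards
        if (structure is None or c.structure == structure)
        and (type_ is None or c.type == type_)
        and needle in c.name.lower()
    ]


def sort_cards(cards, key="name", descending=False):
    """Sort by one of the fields the dashboard offers."""
    if key not in {"name", "type", "created", "status"}:
        raise ValueError(f"unsupported sort key: {key}")
    return sorted(cards, key=lambda c: getattr(c, key), reverse=descending)


# Made-up sample data for illustration only.
cards = [
    CollectionCard("Sales DB", "SQL", "Structured", date(2024, 3, 1), "Ready"),
    CollectionCard("HR Policies", "PDF", "Unstructured", date(2024, 1, 15), "Ready"),
    CollectionCard("Sales Forecast", "Excel", "Structured", date(2024, 2, 10), "Processing"),
]
structured_sales = sort_cards(
    filter_cards(cards, structure="Structured", search="sales"), key="created"
)
```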
Creating Collections
Step 1: Initiate Collection Creation
- Click the "+ New Collection" button in the top right corner
- Select the data structure category and collection type
- Enter a name for your collection
Step 2: Configure Source
Depending on the collection type, you'll see different source options:
For Unstructured Data (PDF Collections)
Choose a source:
- Upload: Upload files from your computer
- Web Scraper: Extract content from websites
- URL: Import documents from direct links
Select embedding type:
- PaddleOCR: Best for general documents
- Llama Parse: Optimal for complex structures
- Docling: For sensitive information
For Semi-structured Data
MongoDB Collections:
- Enter connection details:
  - Connection URI
  - Database name
  - Collection names
  - Authentication credentials
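If you assemble the Connection URI by hand, remember that the standard MongoDB connection string format requires special characters in the username and password to be percent-encoded. A minimal sketch follows; `build_mongo_uri` is a hypothetical helper, the host and credentials are placeholders, and whether your instance expects credentials inside the URI or in separate fields is an assumption to check in the form itself.

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, port=27017, username=None, password=None, database=""):
    """Build a standard mongodb:// connection string with encoded credentials."""
    auth = ""
    if username:
        auth = quote_plus(username)
        if password:
            # Characters such as "@" and "/" would otherwise break URI parsing.
            auth += ":" + quote_plus(password)
        auth += "@"
    return f"mongodb://{auth}{host}:{port}/{database}"


uri = build_mongo_uri("db.example.com", username="rag",
                      password="p@ss/word", database="sales")
```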
API Collections:
- Enter API details:
  - API name and description
  - Base URL
  - Authentication method
  - Endpoint configuration
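To get a feel for what an OpenAPI/Swagger definition contributes to endpoint configuration, the sketch below walks the `paths` object of a definition and lists each operation with its declared parameter names. It illustrates the general idea only; how Lightning RAG builds its schema internally is not documented here, and `list_operations` and the sample definition are hypothetical.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def list_operations(spec):
    """Return (METHOD, path, [parameter names]) for each operation in the spec."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "summary" or "parameters"
            # Parameters given as "$ref" objects have no inline name; skip them here.
            names = [p["name"] for p in op.get("parameters", []) if "name" in p]
            operations.append((method.upper(), path, names))
    return operations


# Made-up definition for illustration only.
spec = {
    "openapi": "3.0.0",
    "paths": {
        "/orders/{id}": {
            "get": {"parameters": [{"name": "id", "in": "path", "required": True}]},
        },
        "/orders": {"post": {}},
    },
}
operations = list_operations(spec)
```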
For Structured Data
SQL Collections:
- Enter database connection details:
  - Database type (PostgreSQL, MySQL, etc.)
  - Host and port
  - Database name
  - Authentication credentials
  - Table selection (optional)
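Schema analysis for a SQL collection amounts to reading table and column metadata, optionally restricted to the tables you selected. The sketch below shows the idea using Python's built-in `sqlite3` module as a stand-in database; SQLite is used here only so the example runs anywhere, and `describe_schema` is a hypothetical helper rather than anything Lightning RAG exposes.

```python
import sqlite3


def describe_schema(conn, tables=None):
    """Map each table name to its (column, declared type) pairs."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    names = [row[0] for row in rows]
    if tables is not None:
        # Mirrors the optional table selection in the form above.
        names = [n for n in names if n in tables]
    schema = {}
    for name in names:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        cols = conn.execute(f'PRAGMA table_info("{name}")').fetchall()
        schema[name] = [(c[1], c[2]) for c in cols]
    return schema


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
selected = describe_schema(conn, tables={"orders"})
```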
Excel Collections:
- Upload Excel files or provide a URL
- Select sheets to include
- Choose embedding options (Schema only or Data enhanced)
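Excel collections are converted to SQL structures internally. The exact conversion is not described in this guide, but the general idea of turning one sheet into one queryable table can be sketched as follows, again with `sqlite3` as a stand-in. The helper `load_sheet`, the sheet name, and the rows are made up for illustration.

```python
import sqlite3


def load_sheet(conn, table, header, rows):
    """Create one table per sheet: header cells become columns, rows become records."""
    columns = ", ".join(f'"{h}"' for h in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)


conn = sqlite3.connect(":memory:")
load_sheet(conn, "Q1", ["region", "revenue"], [("North", 120.5), ("South", 80)])

# Once loaded, the sheet can be queried like any other SQL table.
total = conn.execute('SELECT SUM("revenue") FROM "Q1"').fetchone()[0]
```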
Step 3: Create and Build
- Click "Create Collection" to initialize your collection
- The system will automatically begin the build process
- Building includes:
  - Document parsing and OCR (for unstructured data)
  - Schema analysis (for structured and semi-structured data)
  - Vector embedding generation
  - Index optimization
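The embedding and indexing steps can be pictured with a toy example. The sketch below uses word-count vectors and cosine similarity in place of a real embedding model and vector database, neither of which this guide specifies; every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(chunks):
    """Build step: store each parsed chunk alongside its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index, query, k=2):
    """Query step: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


chunks = [
    "Invoices are due within 30 days.",
    "The warranty covers parts for two years.",
    "Support is available on weekdays.",
]
index = build_index(chunks)
best = search(index, "How long is the warranty?", k=1)
```

A production system would use dense vectors from a trained model, but the build-then-search shape is the same.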
Working with Collections
Collection States
Collections exist in one of two states:
- Processing: The collection is being built or updated
- Ready: The collection is available for chat interaction
Collection Actions
Build/Rebuild
The Build process prepares your collection for chat interaction:
- Click the "Build" button on a collection card
- Monitor progress in the detail view
- Building time varies based on collection size and complexity
Chat
Start a conversation with your collection:
- Click the "Chat" button on a Ready collection
- Type natural language questions in the chat interface
- Receive AI-generated responses based on your collection data
- Follow up with additional questions for context-aware responses
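Context-aware follow-ups generally work by sending the retrieved passages and the conversation so far along with each new question. The sketch below shows one plausible way to assemble such a prompt; the wording of the prompt and the `build_prompt` helper are assumptions for illustration, not the prompt Lightning RAG actually uses.

```python
def build_prompt(question, retrieved_chunks, history=()):
    """Combine retrieved context, earlier turns, and the new question into one prompt."""
    lines = ["Answer using only the context below.", "", "Context:"]
    lines += [f"- {chunk}" for chunk in retrieved_chunks]
    if history:
        # Earlier turns let the model resolve references such as "and for parts?".
        lines += ["", "Conversation so far:"]
        for asked, answered in history:
            lines += [f"User: {asked}", f"Assistant: {answered}"]
    lines += ["", f"User: {question}"]
    return "\n".join(lines)


prompt = build_prompt(
    "And for parts?",
    ["The warranty covers parts for two years."],
    history=[("How long is the warranty?", "Two years.")],
)
```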
Share
Share collections with team members:
- Click the "Share" button on a collection
- Set access permissions (View, Chat, Edit)
- Enter recipient email addresses or copy the shareable link
- Optionally, add an expiration date or password protection
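The interaction between permission levels and an expiration date can be modeled as below. Note the loud assumption: this sketch treats the three levels as cumulative (Edit includes Chat, which includes View), which this guide does not state. `ShareGrant` and its fields are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Assumed ordering from least to most access; not confirmed by the product docs.
PERMISSIONS = ("View", "Chat", "Edit")


@dataclass
class ShareGrant:
    recipient: str
    permission: str
    expires_at: Optional[datetime] = None

    def allows(self, action, now=None):
        """True if the grant has not expired and covers the requested action."""
        now = now or datetime.now(timezone.utc)
        if self.expires_at is not None and now >= self.expires_at:
            return False
        return PERMISSIONS.index(action) <= PERMISSIONS.index(self.permission)


grant = ShareGrant("ana@example.com", "Chat",
                   expires_at=datetime(2025, 1, 1, tzinfo=timezone.utc))
```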
Collection Details
Access detailed information and settings by clicking on a collection card:
Overview Tab
- Collection metadata (type, data structure category, creation date, size)
- Status and processing information
- Recent activity log
Content Tab
- List of items in the collection (documents, tables, endpoints)
- Preview functionality for supported content types
- Item-specific metadata
Settings Tab
- Rename collection
- Modify embedding type (for unstructured data collections)
- Configure refresh settings
- Delete collection
Analytics Tab
- Usage statistics (query count, user count)
- Performance metrics (response time, accuracy)
- Popular queries and topics
Advanced Features
Dynamic Collection Mapping
For enterprise users, Lightning RAG supports dynamic collection mapping:
- Create collection templates with variable placeholders
- Set up mapping rules based on user roles or session parameters
- Collections automatically adapt to the current user context
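As a rough mental model, a collection template is a name with placeholders, and a mapping rule picks a template based on the user context. The sketch below uses Python's `string.Template` for the placeholders; the `${region}` syntax, the rule list, and the role names are all invented for illustration and may differ from how templates are written in Lightning RAG.

```python
from string import Template

# Hypothetical rules: (predicate over the user context, collection-name template).
MAPPING_RULES = [
    (lambda ctx: ctx.get("role") == "manager", "sales_${region}_full"),
    (lambda ctx: True, "sales_${region}_summary"),  # fallback for everyone else
]


def resolve_collection(context):
    """Return the collection name for the first rule that matches the context."""
    for matches, template in MAPPING_RULES:
        if matches(context):
            # Raises KeyError if the context lacks a placeholder value.
            return Template(template).substitute(context)
    raise LookupError("no mapping rule matched")
```

With these rules, a manager in the `emea` region resolves to `sales_emea_full`, while any other role in the same region resolves to `sales_emea_summary`.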
Published Links
Create embeddable Lightning RAG dashboards:
- Configure a collection for publishing
- Generate a MINDSHARE_KEY for secure access
- Set refresh rate for dashboard data
- Embed the published link in other applications
RBAC Settings
Control access with role-based permissions:
- Configure organization-level access policies
- Assign roles to team members
- Set granular permissions for collections and features
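Role-based access control reduces to two lookups: which roles a user holds, and which permissions each role grants. A minimal sketch follows; the role names and permission strings are hypothetical examples, not Lightning RAG's built-in roles.

```python
# Hypothetical role -> permission sets for illustration.
ROLE_PERMISSIONS = {
    "admin": {"collection:create", "collection:chat", "collection:share", "settings:edit"},
    "editor": {"collection:create", "collection:chat", "collection:share"},
    "viewer": {"collection:chat"},
}


def is_allowed(user_roles, permission):
    """A user is allowed if any of their roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)
```

Unknown roles grant nothing, so a typo in a role name fails closed rather than open.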
Troubleshooting
Common Issues
Collection Building Fails
- Check source file formats and compatibility
- Verify database connection details
- Ensure API endpoints are accessible
- Review error logs in the detail view
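Before digging into error logs, it can help to confirm that the database or API host is reachable at all from a machine on the same network as your Lightning RAG instance. The sketch below performs a plain TCP connection test with Python's standard `socket` module; `can_connect` is a hypothetical helper, and a successful connection only shows the port is open, not that the credentials are valid.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_connect("db.example.com", 5432)` would check a PostgreSQL server on its default port, with the hostname replaced by your own.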
Chat Responses Are Inaccurate
- Rebuild the collection with updated content
- Try refining your question with more specific details
- Check if the information exists in your collection
- For unstructured data, consider changing the embedding type
Performance Issues
- Split large collections into smaller, focused collections
- Optimize database queries for structured data collections
- Reduce the scope of semi-structured collections
- Use a locally hosted embedding type (for example, Docling) for faster processing of unstructured data
Best Practices
Collection Organization
- Create purpose-specific collections rather than catch-all repositories
- Use clear, descriptive naming conventions
- Consider organizing by data structure type for efficient management
- Regularly audit and clean up unused collections
Data Type Selection Guidelines
- Unstructured Data: Ideal for narrative content, reports, manuals, and documents with mixed content types
- Semi-structured Data: Best for flexible data models, nested information, and varying schemas
- Structured Data: Optimal for relational information, tabular data, and precise querying needs
Effective Querying
- Start with simple, direct questions
- Provide context in your questions
- Ask for specific formats when needed (tables, lists, summaries)
- Use follow-up questions to refine results
Data Management
- Keep source data updated for optimal results
- Schedule regular rebuilds for frequently changing collections
- Monitor analytics to identify usage patterns and improvement areas
- Implement version control for critical collections
Glossary
- Collection: A processed dataset optimized for conversational AI
- Embedding: Vector representations of content for semantic search
- MINDSHARE_KEY: Secure access token for published dashboards
- OCR: Optical Character Recognition for extracting text from images
- RAG: Retrieval Augmented Generation - combining retrieval systems with generative AI
- RBAC: Role-Based Access Control for managing permissions
- Structured Data: Data organized according to a predefined schema (SQL, Excel)
- Semi-structured Data: Data with some organizational properties but flexible schema (MongoDB, APIs)
- Unstructured Data: Data without predefined organization (PDF, documents)
- Vector Database: Database optimized for similarity search of embeddings