Lexis.ai is an industry-grade AI-powered legal document intelligence platform that combines OCR, LLM-based extraction, risk analytics, and conversational AI to help you analyze contracts and uncover hidden risks.
- Smart Upload: Drag-and-drop document upload with automatic processing
- OCR Extraction: Extract text from images and PDFs using Tesseract
- AI Classification: Automatically classify documents (invoices, receipts, contracts)
- Structured Extraction: Extract key fields (vendor, amount, date, invoice number)
- Thumbnail Generation: Auto-generate document previews
- Chat with Documents: Ask natural language questions about your documents
- Context-Aware: Select multiple documents for cross-document queries
- Smart Responses: Powered by local LLM (Ollama) for privacy
- Financial Analytics: Track total spend, average amounts, highest transactions
- Vendor Analysis: See spending breakdown by vendor
- Monthly Trends: Visualize spending patterns over time
- Natural Language Queries: Ask questions like "What's my total spend with Acme Corp?"
- Pre-built Workflows: Invoice processing, receipt categorization, contract analysis
- Real-time Progress: Visual feedback for each workflow step
- Export Options: CSV, QuickBooks IIF, Excel formats
- Modern Design: Purple gradient theme with glassmorphism effects
- Smooth Animations: Framer Motion powered interactions
- Dark Mode: Full dark mode support
- Responsive: Works on desktop, tablet, and mobile
- Framework: React 18 with TypeScript
- Styling: Tailwind CSS + shadcn/ui components
- Animations: Framer Motion
- State Management: React Hooks
- Database: Supabase (PostgreSQL)
- Authentication: Supabase Auth
- Framework: FastAPI
- LLM: Ollama (local) with Gemma 3:4b model
- OCR: Tesseract + pytesseract
- PDF Processing: pdfplumber
- Image Processing: Pillow
- Cloud Integration: Vultr Object Storage (simulated)
- Local LLM: Ollama (privacy-first, no external API calls)
- Model: Gemma 3:4b (efficient, fast, accurate)
- RAG: Retrieval-Augmented Generation for document Q&A
- Extraction: LLM-based structured data extraction
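The LLM-based extraction step can be sketched against Ollama's local HTTP API (which listens on port 11434 by default). The prompt wording, field names, and `build_payload`/`extract_fields` helpers below are illustrative assumptions, not the project's actual code:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(ocr_text: str, model: str = "gemma3:4b") -> dict:
    """Build a request body for Ollama's /api/generate endpoint."""
    prompt = (
        "Extract vendor, amount, date, and invoice_number from the document "
        "below. Respond with JSON only.\n\n" + ocr_text
    )
    # "format": "json" asks Ollama to constrain the model to valid JSON output
    return {"model": model, "prompt": prompt, "stream": False, "format": "json"}

def extract_fields(ocr_text: str) -> dict:
    """Send OCR text to the local LLM and parse the JSON fields it returns."""
    data = json.dumps(build_payload(ocr_text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return json.loads(body["response"])
```

Because everything goes to `localhost`, no document text ever leaves the machine, which is the privacy property the stack above relies on.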
- Node.js 18+ and npm
- Python 3.10+
- Ollama (for local LLM)
- Tesseract OCR
- Supabase Account (free tier works)
git clone https://github.com/yourusername/Lexis.ai.git
cd Lexis.ai
cd backend
# Create virtual environment
python -m venv venv
# Activate virtual environment
# Windows:
venv\Scripts\activate
# Mac/Linux:
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Copy environment template
copy .env.example .env # Windows
# cp .env.example .env # Mac/Linux
# Install Ollama and pull the model
ollama pull gemma3:4b
# Run the backend
uvicorn main:app --reload
The backend will run on http://localhost:8000
cd frontend
# Install dependencies
npm install
# Copy environment template
copy .env.example .env.local # Windows
# cp .env.example .env.local # Mac/Linux
# Edit .env.local and add your Supabase credentials
# VITE_SUPABASE_URL=your_supabase_url
# VITE_SUPABASE_PUBLISHABLE_KEY=your_supabase_anon_key
# Run the frontend
npm run dev
The frontend will run on http://localhost:8080
- Create a free account at supabase.com
- Create a new project
- Run the SQL schema (see database/schema.sql)
- Copy your project URL and anon key to .env.local
- React 18
- TypeScript
- Vite
- Tailwind CSS
- shadcn/ui
- Framer Motion
- React Router
- Supabase Client
- FastAPI
- Python 3.10+
- Ollama (LLM)
- Tesseract OCR
- pdfplumber
- Pillow
- python-dotenv
- Supabase (PostgreSQL)
- Supabase Auth
- Ollama
- Gemma 3:4b
- Tesseract OCR
- ✅ No API Keys in Code: All secrets in environment variables
- ✅ Local LLM: Privacy-first with Ollama (no external API calls)
- ✅ Secure Auth: Supabase authentication with JWT
- ✅ CORS Protection: Configured for production
- ✅ .gitignore: Protects sensitive files
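The CORS protection mentioned above can be configured in FastAPI roughly as follows; this is a sketch, and the allowed origin shown is the local dev frontend from this README, which you would replace with your production domain:

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

# Allow only the known frontend origin instead of "*" in production.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:8080"],  # dev frontend; swap for prod URL
    allow_credentials=True,
    allow_methods=["GET", "POST"],
    allow_headers=["Authorization", "Content-Type"],
)
```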
# Optional: only needed if using Gemini instead of Ollama
GEMINI_API_KEY=your_key_here
# Ollama configuration (backend .env)
OLLAMA_MODEL=gemma3:4b
# Frontend .env.local
VITE_SUPABASE_URL=your_supabase_project_url
VITE_SUPABASE_PUBLISHABLE_KEY=your_supabase_anon_key
- Upload → User uploads a document (PDF/image)
- OCR → Tesseract extracts text
- Classification → LLM identifies the document type
- Extraction → LLM extracts structured fields
- Storage → Saved to Supabase + Vultr backup
- Ready → Available for chat and analytics
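The OCR step of this pipeline routes PDFs to pdfplumber and images to Tesseract. A minimal sketch, assuming a single `extract_text` entry point (the helper names here are illustrative, not the project's actual module layout):

```python
def choose_engine(path: str) -> str:
    """Pick an extraction engine from the file extension."""
    return "pdfplumber" if path.lower().endswith(".pdf") else "tesseract"

def extract_text(path: str) -> str:
    """OCR step of the pipeline: pull text out of a PDF or a scanned image."""
    if choose_engine(path) == "pdfplumber":
        import pdfplumber  # third-party; lazy import keeps the sketch self-contained
        with pdfplumber.open(path) as pdf:
            # extract_text() can return None for image-only pages
            return "\n".join(page.extract_text() or "" for page in pdf.pages)
    import pytesseract  # Python bindings for the Tesseract engine
    from PIL import Image
    return pytesseract.image_to_string(Image.open(path))
```

Scanned PDFs with no embedded text layer would need an extra rasterize-then-OCR pass, which is omitted here for brevity.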
- Uses RAG (Retrieval-Augmented Generation)
- Combines extracted data + OCR text for context
- Local LLM ensures privacy
- Supports multi-document queries
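The RAG context assembly described above amounts to concatenating each selected document's extracted fields and OCR text ahead of the user's question. A hedged sketch; the `build_chat_prompt` helper and the document dict shape (`name`, `fields`, `ocr_text`) are assumptions for illustration:

```python
def build_chat_prompt(question: str, docs: list[dict]) -> str:
    """Combine extracted fields + OCR text from each selected document,
    then append the user's question — supports multi-document queries."""
    context = "\n\n".join(
        f"Document: {d['name']}\nFields: {d['fields']}\nText: {d['ocr_text'][:1000]}"
        for d in docs  # truncate OCR text to keep the prompt within context limits
    )
    return (
        "Answer using only the documents below. "
        "If the answer is not in them, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The assembled prompt is then sent to the local Ollama model, so the retrieval context never leaves the machine.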
- Real-time calculation from extracted data
- Vendor aggregation
- Monthly trend analysis
- Natural language query support
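The vendor and monthly aggregations above reduce to grouping the extracted invoice fields. A minimal sketch, assuming invoices are dicts with `vendor`, `amount`, and ISO-formatted `date` fields (the helper names are illustrative):

```python
from collections import defaultdict

def vendor_totals(invoices: list[dict]) -> dict[str, float]:
    """Sum extracted amounts per vendor."""
    totals: dict[str, float] = defaultdict(float)
    for inv in invoices:
        totals[inv["vendor"]] += float(inv["amount"])
    return dict(totals)

def monthly_totals(invoices: list[dict]) -> dict[str, float]:
    """Group spend by YYYY-MM, assuming ISO dates like 2024-05-01."""
    totals: dict[str, float] = defaultdict(float)
    for inv in invoices:
        totals[inv["date"][:7]] += float(inv["amount"])
    return dict(totals)
```

A query such as "What's my total spend with Acme Corp?" then becomes a lookup into `vendor_totals(...)` once the LLM has identified the vendor name.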
- Rate Limits: The demo is limited to 3 uploads per user and 5 questions per document
- OCR Accuracy: Depends on image quality
- LLM Speed: Local Ollama may be slower than cloud APIs
- File Size: Large PDFs (>10MB) may take longer to process
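The demo rate limits above can be enforced with a simple in-memory counter. This `DemoLimiter` class is a hypothetical sketch of that bookkeeping, not the project's actual implementation:

```python
from collections import defaultdict

UPLOAD_LIMIT = 3    # demo: max uploads per user
QUESTION_LIMIT = 5  # demo: max questions per document

class DemoLimiter:
    """Track demo usage per user and per document (in-memory, resets on restart)."""

    def __init__(self) -> None:
        self.uploads: dict[str, int] = defaultdict(int)
        self.questions: dict[str, int] = defaultdict(int)

    def allow_upload(self, user_id: str) -> bool:
        if self.uploads[user_id] >= UPLOAD_LIMIT:
            return False
        self.uploads[user_id] += 1
        return True

    def allow_question(self, doc_id: str) -> bool:
        if self.questions[doc_id] >= QUESTION_LIMIT:
            return False
        self.questions[doc_id] += 1
        return True
```

A production deployment would persist these counters (e.g. in Supabase) rather than holding them in process memory.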
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License.
- Ollama - Local LLM runtime
- Supabase - Backend as a Service
- Tesseract - OCR engine
- shadcn/ui - Beautiful UI components
- Vultr - Cloud infrastructure partner
For questions or support, please open an issue on GitHub.
Built with ❤️ for legal intelligence and contract protection