Charles Grandjean committed on
Commit
851f2ed
·
0 Parent(s):

first commit

Browse files
Files changed (10) hide show
  1. Dockerfile +93 -0
  2. README.md +313 -0
  3. agent_api.py +257 -0
  4. agent_state.py +101 -0
  5. docker-compose.yml +24 -0
  6. langraph_agent.py +260 -0
  7. prompts.py +133 -0
  8. requirements.txt +19 -0
  9. startup.sh +49 -0
  10. utils.py +274 -0
Dockerfile ADDED
@@ -0,0 +1,93 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Use Python 3.11 slim base image
2
+ FROM python:3.11-slim
3
+
4
+ # Set working directory
5
+ WORKDIR /app
6
+
7
+ # Set environment variables
8
+ ENV PYTHONPATH=/app
9
+ ENV PYTHONUNBUFFERED=1
10
+ ENV PYTHONIOENCODING=utf-8
11
+ ENV LIGHTRAG_HOST=127.0.0.1
12
+ ENV LIGHTRAG_PORT=9621
13
+ ENV API_PORT=8000
14
+
15
+ # Install system dependencies
16
+ RUN apt-get update && apt-get install -y \
17
+ build-essential \
18
+ curl \
19
+ && rm -rf /var/lib/apt/lists/*
20
+
21
+ # Copy requirements first for better caching
22
+ COPY requirements.txt .
23
+
24
+ # Install Python dependencies
25
+ RUN pip install --no-cache-dir --upgrade pip && \
26
+ pip install --no-cache-dir -r requirements.txt
27
+
28
+ # Copy application files
29
+ COPY . .
30
+
31
+ # Create non-root user for security
32
+ RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
33
+ USER appuser
34
+
35
+ # Create startup script
36
+ RUN echo '#!/bin/bash\n\
37
+ set -e\n\
38
+ \n\
39
+ echo "🚀 Starting CyberLegal AI Stack..."\n\
40
+ echo "Step 1: Starting LightRAG server..."\n\
41
+ \n\
42
+ # Start LightRAG server in background\n\
43
+ lightrag-server --host $LIGHTRAG_HOST --port $LIGHTRAG_PORT &\n\
44
+ LIGHTRAG_PID=$!\n\
45
+ \n\
46
+ # Wait for LightRAG to be ready\n\
47
+ echo "Waiting for LightRAG server to be ready..."\n\
48
+ max_attempts=30\n\
49
+ attempt=1\n\
50
+ while [ $attempt -le $max_attempts ]; do\n\
51
+ if curl -f http://$LIGHTRAG_HOST:$LIGHTRAG_PORT/health > /dev/null 2>&1; then\n\
52
+ echo "✅ LightRAG server is ready!"\n\
53
+ break\n\
54
+ fi\n\
55
+ echo "Attempt $attempt/$max_attempts: LightRAG not ready yet..."\n\
56
+ sleep 2\n\
57
+ attempt=$((attempt + 1))\n\
58
+ done\n\
59
+ \n\
60
+ if [ $attempt -gt $max_attempts ]; then\n\
61
+ echo "❌ LightRAG server failed to start"\n\
62
+ exit 1\n\
63
+ fi\n\
64
+ \n\
65
+ echo "Step 2: Starting LangGraph API server..."\n\
66
+ echo "🌐 API will be available at: http://localhost:$API_PORT"\n\
67
+ echo "📚 LightRAG server running at: http://$LIGHTRAG_HOST:$LIGHTRAG_PORT"\n\
68
+ echo ""\n\
69
+ echo "Available endpoints:"\n\
70
+ echo " - GET /health - Health check"\n\
71
+ echo " - GET / - API info"\n\
72
+ echo " - POST /chat - Chat with assistant"\n\
73
+ echo ""\n\
74
+ echo "🎉 CyberLegal AI is ready!"\n\
75
+ \n\
76
+ # Start the API server\n\
77
+ python agent_api.py\n\
78
+ \n\
79
+ # Cleanup\n\
80
+ kill $LIGHTRAG_PID 2>/dev/null || true\n\
81
+ ' > /app/startup.sh
82
+
83
+ RUN chmod +x /app/startup.sh
84
+
85
+ # Expose ports (API only for security, LightRAG stays internal)
86
+ EXPOSE 8000
87
+
88
+ # Health check for the API
89
+ HEALTHCHECK --interval=30s --timeout=30s --start-period=60s --retries=3 \
90
+ CMD curl -f http://localhost:8000/health || exit 1
91
+
92
+ # Run the startup script
93
+ CMD ["/app/startup.sh"]
README.md ADDED
@@ -0,0 +1,313 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # CyberLegal AI - LangGraph Agent
2
+
3
+ Advanced cyber-legal assistant powered by LangGraph + LightRAG + GPT-5-Nano for European regulations expertise.
4
+
5
+ ## 🏗️ Architecture
6
+
7
+ ```
8
+ ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
9
+ │ Client API │────│ LangGraph Agent │────│ LightRAG Server│
10
+ │ (Port 8000) │ │ (Orchestration)│ │ (Port 9621) │
11
+ └─────────────────┘ └──────────────────┘ └─────────────────┘
12
+
13
+ ┌──────────────┐
14
+ │ GPT-5-Nano │
15
+ │ (Reasoning) │
16
+ └──────────────┘
17
+ ```
18
+
19
+ ## 🚀 Quick Start
20
+
21
+ ### Using Docker Compose (Recommended)
22
+
23
+ 1. **Environment Setup**
24
+ ```bash
25
+ # Copy and configure environment
26
+ cp .env.example .env
27
+
28
+ # Edit .env with your API keys
29
+ # OPENAI_API_KEY=your_openai_key
30
+ # LIGHTRAG_API_KEY=your_lightrag_key (optional)
31
+ ```
32
+
33
+ 2. **Deploy**
34
+ ```bash
35
+ docker-compose up -d
36
+ ```
37
+
38
+ 3. **Verify Deployment**
39
+ ```bash
40
+ curl http://localhost:8000/health
41
+ ```
42
+
43
+ ### Using Docker Directly
44
+
45
+ ```bash
46
+ # Build the image
47
+ docker build -t cyberlegal-ai .
48
+
49
+ # Run the container
50
+ docker run -d \
51
+ --name cyberlegal-ai \
52
+ -p 8000:8000 \
53
+ -e OPENAI_API_KEY=your_key \
54
+ -v $(pwd)/rag_storage:/app/rag_storage \
55
+ cyberlegal-ai
56
+ ```
57
+
58
+ ## 📡 API Usage
59
+
60
+ ### Base URL
61
+ ```
62
+ http://localhost:8000
63
+ ```
64
+
65
+ ### Endpoints
66
+
67
+ #### Chat with Assistant
68
+ ```bash
69
+ curl -X POST "http://localhost:8000/chat" \
70
+ -H "Content-Type: application/json" \
71
+ -d '{
72
+ "message": "What are the main obligations under GDPR?",
73
+ "role": "client",
74
+ "jurisdiction": "EU",
75
+ "conversationHistory": []
76
+ }'
77
+ ```
78
+
79
+ #### Health Check
80
+ ```bash
81
+ curl http://localhost:8000/health
82
+ ```
83
+
84
+ #### API Info
85
+ ```bash
86
+ curl http://localhost:8000/
87
+ ```
88
+
89
+ ## 📝 Request Format
90
+
91
+ ```json
92
+ {
93
+ "message": "User's legal question",
94
+ "role": "client" | "lawyer",
95
+ "jurisdiction": "EU" | "France" | "Germany" | "Italy" | "Spain" | "Romania" | "Netherlands" | "Belgium",
96
+ "conversationHistory": [
97
+ {"role": "user|assistant", "content": "Previous message"}
98
+ ]
99
+ }
100
+ ```
101
+
102
+ ## 📤 Response Format
103
+
104
+ ```json
105
+ {
106
+ "response": "Detailed legal answer with references",
107
+ "confidence": 0.85,
108
+ "processing_time": 2.34,
109
+ "references": ["gdpr_2016_679.txt", "nis2_2022_2555.txt"],
110
+ "timestamp": "2025-01-15T10:30:00Z",
111
+ "error": null
112
+ }
113
+ ```
114
+
115
+ ## 🧠 Expertise Areas
116
+
117
+ - **GDPR** (General Data Protection Regulation)
118
+ - **NIS2** (Network and Information Systems Directive 2)
119
+ - **DORA** (Digital Operational Resilience Act)
120
+ - **CRA** (Cyber Resilience Act)
121
+ - **eIDAS 2.0** (Electronic Identification, Authentication and Trust Services)
122
+ - Romanian Civil Code provisions
123
+
124
+ ## 🔄 Workflow
125
+
126
+ 1. **User Query** → API receives request with role/jurisdiction context
127
+ 2. **LightRAG Retrieval** → Searches legal documents for relevant information
128
+ 3. **LangGraph Processing** → Orchestrates the workflow through nodes:
129
+ - Query validation
130
+ - LightRAG integration
131
+ - Context enhancement with GPT-5-Nano
132
+ - Response formatting
133
+ 4. **Enhanced Response** → Returns structured answer with confidence score
134
+
135
+ ## 🛠️ Development
136
+
137
+ ### Local Development
138
+
139
+ ```bash
140
+ # Install dependencies
141
+ pip install -r requirements.txt
142
+
143
+ # Start LightRAG server (required)
144
+ lightrag-server --host 127.0.0.1 --port 9621
145
+
146
+ # Start the API
147
+ python agent_api.py
148
+ ```
149
+
150
+ ### Environment Variables
151
+
152
+ ```bash
153
+ OPENAI_API_KEY=your_openai_api_key
154
+ LIGHTRAG_API_KEY=your_lightrag_api_key
155
+ LIGHTRAG_HOST=127.0.0.1
156
+ LIGHTRAG_PORT=9621
157
+ API_PORT=8000
158
+ ```
159
+
160
+ ## 📁 Project Structure
161
+
162
+ ```
163
+ CyberlegalAI/
164
+ ├── agent_api.py # FastAPI server
165
+ ├── langraph_agent.py # Main LangGraph workflow
166
+ ├── agent_state.py # State management
167
+ ├── prompts.py # System prompts
168
+ ├── utils.py # LightRAG integration
169
+ ├── requirements.txt # Python dependencies
170
+ ├── Dockerfile # Container configuration
171
+ ├── docker-compose.yml # Orchestration
172
+ ├── rag_storage/ # LightRAG data persistence
173
+ └── .env # Environment variables
174
+ ```
175
+
176
+ ## 🔧 Configuration
177
+
178
+ ### Port Management
179
+ - **Port 8000**: API (exposed externally)
180
+ - **Port 9621**: LightRAG (internal only, for security)
181
+
182
+ ### Security Features
183
+ - LightRAG server not exposed externally
184
+ - API key authentication support
185
+ - Non-root container execution
186
+ - Health checks and monitoring
187
+
188
+ ## 📊 Monitoring
189
+
190
+ ### Health Checks
191
+ ```bash
192
+ # Container health
193
+ docker ps
194
+
195
+ # Service health
196
+ curl http://localhost:8000/health
197
+
198
+ # Logs
199
+ docker logs cyberlegal-ai
200
+ ```
201
+
202
+ ### Performance Metrics
203
+ The API returns:
204
+ - Processing time per request
205
+ - Confidence scores
206
+ - Referenced documents
207
+ - Error tracking
208
+
209
+ ## 🚨 Error Handling
210
+
211
+ The API gracefully handles:
212
+ - LightRAG server unavailability
213
+ - OpenAI API errors
214
+ - Invalid request format
215
+ - Network timeouts
216
+
217
+ ## 📚 API Examples
218
+
219
+ ### Client Role Example
220
+ ```json
221
+ {
222
+ "message": "What should my small business do to comply with GDPR?",
223
+ "role": "client",
224
+ "jurisdiction": "France"
225
+ }
226
+ ```
227
+
228
+ ### Lawyer Role Example
229
+ ```json
230
+ {
231
+ "message": "Analyze the legal implications of NIS2 for financial institutions",
232
+ "role": "lawyer",
233
+ "jurisdiction": "EU"
234
+ }
235
+ ```
236
+
237
+ ### Comparison Query
238
+ ```json
239
+ {
240
+ "message": "Compare incident reporting requirements between NIS2 and DORA",
241
+ "role": "client",
242
+ "jurisdiction": "EU"
243
+ }
244
+ ```
245
+
246
+ ## 🤝 Integration Examples
247
+
248
+ ### Python Client
249
+ ```python
250
+ import requests
251
+
252
+ response = requests.post("http://localhost:8000/chat", json={
253
+ "message": "What are GDPR penalties?",
254
+ "role": "client",
255
+ "jurisdiction": "EU",
256
+ "conversationHistory": []
257
+ })
258
+
259
+ result = response.json()
260
+ print(result["response"])
261
+ ```
262
+
263
+ ### JavaScript Client
264
+ ```javascript
265
+ const response = await fetch('http://localhost:8000/chat', {
266
+ method: 'POST',
267
+ headers: { 'Content-Type': 'application/json' },
268
+ body: JSON.stringify({
269
+ message: 'GDPR requirements',
270
+ role: 'client',
271
+ jurisdiction: 'EU',
272
+ conversationHistory: []
273
+ })
274
+ });
275
+
276
+ const result = await response.json();
277
+ console.log(result.response);
278
+ ```
279
+
280
+ ## 📋 Troubleshooting
281
+
282
+ ### Common Issues
283
+
284
+ 1. **LightRAG Connection Failed**
285
+ - Verify LightRAG server is running on port 9621
286
+ - Check container logs: `docker logs cyberlegal-ai`
287
+
288
+ 2. **OpenAI API Errors**
289
+ - Verify OPENAI_API_KEY is set correctly
290
+ - Check API key permissions and quota
291
+
292
+ 3. **Slow Responses**
293
+ - Monitor processing time in API response
294
+ - Check LightRAG document indexing
295
+
296
+ ### Debug Mode
297
+
298
+ Enable debug logging:
299
+ ```bash
300
+ docker-compose logs -f cyberlegal-api
301
+ ```
302
+
303
+ ## 📜 License
304
+
305
+ This project provides general legal information and is not a substitute for professional legal advice.
306
+
307
+ ## 🔄 Updates
308
+
309
+ The system automatically:
310
+ - Retrieves latest regulatory documents
311
+ - Updates knowledge base through LightRAG
312
+ - Maintains conversation context
313
+ - Provides confidence scoring
agent_api.py ADDED
@@ -0,0 +1,257 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ FastAPI interface for the LangGraph cyber-legal assistant
4
+ """
5
+
6
+ import os
7
+ import asyncio
8
+ from typing import Dict, List, Any, Optional
9
+ from datetime import datetime
10
+ from fastapi import FastAPI, HTTPException, BackgroundTasks
11
+ from pydantic import BaseModel, Field
12
+ from fastapi.middleware.cors import CORSMiddleware
13
+ from fastapi.responses import JSONResponse
14
+ import uvicorn
15
+ from dotenv import load_dotenv
16
+
17
+ from langraph_agent import CyberLegalAgent
18
+ from agent_state import ConversationManager
19
+ from utils import validate_query
20
+
21
+ # Load environment variables
22
+ load_dotenv(dotenv_path=".env", override=False)
23
+
24
+ # Initialize FastAPI app
25
+ app = FastAPI(
26
+ title="CyberLegal AI API",
27
+ description="LangGraph-powered cyber-legal assistant API",
28
+ version="1.0.0"
29
+ )
30
+
31
+ # Add CORS middleware
32
+ app.add_middleware(
33
+ CORSMiddleware,
34
+ allow_origins=["*"],
35
+ allow_credentials=True,
36
+ allow_methods=["*"],
37
+ allow_headers=["*"],
38
+ )
39
+
40
+ # Pydantic models for request/response
41
+ class Message(BaseModel):
42
+ role: str = Field(..., description="Role: 'user' or 'assistant'")
43
+ content: str = Field(..., description="Message content")
44
+
45
class ChatRequest(BaseModel):
    """Incoming /chat payload: the question plus role/jurisdiction context."""

    message: str = Field(..., description="User's question")
    role: str = Field(..., description="User role: 'client' or 'lawyer'")
    jurisdiction: str = Field(..., description="Selected jurisdiction")
    # default_factory gives each model its own list (idiomatic for mutable
    # defaults; equivalent to default=[] under pydantic but explicit).
    conversationHistory: Optional[List[Message]] = Field(default_factory=list, description="Previous conversation messages")
50
+
51
class ChatResponse(BaseModel):
    """Outgoing /chat payload: the answer plus run metadata."""

    response: str = Field(..., description="Assistant's response")
    confidence: float = Field(..., description="Confidence score (0.0-1.0)")
    processing_time: float = Field(..., description="Processing time in seconds")
    # default_factory gives each model its own list (idiomatic for mutable
    # defaults; equivalent to default=[] under pydantic but explicit).
    references: List[str] = Field(default_factory=list, description="Referenced documents")
    timestamp: str = Field(..., description="Response timestamp")
    error: Optional[str] = Field(None, description="Error message if any")
58
+
59
+ class HealthResponse(BaseModel):
60
+ status: str = Field(..., description="Health status")
61
+ agent_ready: bool = Field(..., description="Whether agent is ready")
62
+ lightrag_healthy: bool = Field(..., description="Whether LightRAG is healthy")
63
+ timestamp: str = Field(..., description="Health check timestamp")
64
+
65
+ # Global agent instance
66
+ agent_instance = None
67
+
68
class CyberLegalAPI:
    """
    API wrapper for the LangGraph agent.

    Adapts HTTP request/response models to the agent's plain-dict interface.
    """

    def __init__(self):
        self.agent = CyberLegalAgent()
        self.conversation_manager = ConversationManager()

    async def process_request(self, request: ChatRequest) -> ChatResponse:
        """
        Validate, enrich and run a chat request through the agent.

        Raises:
            HTTPException: 400 on an invalid message, 500 on processing failure.
        """
        # Validate message
        is_valid, error_msg = validate_query(request.message)
        if not is_valid:
            raise HTTPException(status_code=400, detail=error_msg)

        # Convert conversation history into the plain dicts the agent expects
        conversation_history = [
            {"role": msg.role, "content": msg.content}
            for msg in request.conversationHistory or []
        ]

        try:
            # Create enhanced query with context
            enhanced_query = self._create_enhanced_query(request)

            # Process through agent
            result = await self.agent.process_query(
                user_query=enhanced_query,
                conversation_history=conversation_history
            )

            return ChatResponse(
                response=result["response"],
                confidence=result.get("confidence", 0.0),
                processing_time=result.get("processing_time", 0.0),
                references=result.get("references", []),
                timestamp=result.get("timestamp", datetime.now().isoformat()),
                error=result.get("error")
            )

        except HTTPException:
            # BUG FIX: don't re-wrap deliberate HTTP errors as generic 500s.
            raise
        except Exception as e:
            raise HTTPException(
                status_code=500,
                detail=f"Processing failed: {str(e)}"
            )

    def _create_enhanced_query(self, request: ChatRequest) -> str:
        """
        Wrap the user's question with role and jurisdiction context so the
        agent can tailor its answer.
        """
        base_query = request.message

        # Role-specific framing
        role_context = ""
        if request.role == "client":
            role_context = "Answer from the perspective of advising a client who needs practical guidance."
        elif request.role == "lawyer":
            role_context = "Answer from the perspective of providing legal analysis for a legal professional."

        # Jurisdiction-specific framing
        jurisdiction_context = f"Focus on the legal framework in {request.jurisdiction}."

        enhanced_query = f"""{base_query}

Context:
- User Role: {request.role}
- Jurisdiction: {request.jurisdiction}
- Special Instructions: {role_context} {jurisdiction_context}

Please provide a response tailored to this context."""

        return enhanced_query

    async def health_check(self) -> HealthResponse:
        """
        Report API and LightRAG health. Never raises: any failure while
        probing collapses to an "unhealthy" response.
        """
        try:
            lightrag_healthy = self.agent.lightrag_client.health_check()

            return HealthResponse(
                status="healthy" if lightrag_healthy else "degraded",
                agent_ready=True,
                lightrag_healthy=lightrag_healthy,
                timestamp=datetime.now().isoformat()
            )

        except Exception:
            # (unused exception variable removed)
            return HealthResponse(
                status="unhealthy",
                agent_ready=False,
                lightrag_healthy=False,
                timestamp=datetime.now().isoformat()
            )
172
+
173
+ # Initialize API instance
174
+ api = CyberLegalAPI()
175
+
176
@app.on_event("startup")
async def startup_event():
    """Print the startup banner and the list of available routes."""
    banner = [
        "🚀 Starting CyberLegal AI API...",
        "🔧 Powered by: LangGraph + LightRAG + GPT-5-Nano",
        "📍 API endpoints:",
        " - POST /chat - Chat with the assistant",
        " - GET /health - Health check",
        " - GET / - API info",
    ]
    for line in banner:
        print(line)
187
+
188
@app.post("/chat", response_model=ChatResponse)
async def chat_endpoint(request: ChatRequest):
    """
    Chat with the cyber-legal assistant.

    Args:
        request: Chat request with message, role, jurisdiction, and history

    Returns:
        ChatResponse with assistant's response and metadata
    """
    result = await api.process_request(request)
    return result
200
+
201
@app.get("/health", response_model=HealthResponse)
async def health_endpoint():
    """
    Health check endpoint.

    Returns:
        HealthResponse with system status
    """
    status_report = await api.health_check()
    return status_report
210
+
211
@app.get("/")
async def root():
    """Describe the API: endpoints, jurisdictions, roles and expertise areas."""
    endpoints = {
        "chat": "POST /chat - Chat with the assistant",
        "health": "GET /health - Health check"
    }
    jurisdictions = [
        "EU", "France", "Germany", "Italy", "Spain", "Romania", "Netherlands", "Belgium"
    ]
    expertise = [
        "GDPR", "NIS2", "DORA", "Cyber Resilience Act", "eIDAS 2.0"
    ]
    return {
        "name": "CyberLegal AI API",
        "version": "1.0.0",
        "description": "LangGraph-powered cyber-legal assistant API",
        "technology": "LangGraph + LightRAG + GPT-5-Nano",
        "endpoints": endpoints,
        "supported_jurisdictions": jurisdictions,
        "user_roles": ["client", "lawyer"],
        "expertise": expertise
    }
233
+
234
@app.exception_handler(Exception)
async def global_exception_handler(request, exc):
    """Convert any unhandled exception into a JSON 500 response."""
    payload = {
        "error": "Internal server error",
        "detail": str(exc),
        "timestamp": datetime.now().isoformat()
    }
    return JSONResponse(status_code=500, content=payload)
247
+
248
if __name__ == "__main__":
    # Prefer a platform-injected PORT, then API_PORT, then the 8000 default.
    listen_port = int(os.getenv("PORT", os.getenv("API_PORT", "8000")))

    uvicorn.run(
        "agent_api:app",
        host="0.0.0.0",
        port=listen_port,
        reload=False,
        log_level="info",
    )
agent_state.py ADDED
@@ -0,0 +1,101 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Agent state management for the LangGraph cyber-legal assistant
4
+ """
5
+
6
+ from typing import TypedDict, List, Dict, Any, Optional
7
+ from datetime import datetime
8
+
9
+
10
+ class AgentState(TypedDict):
11
+ """
12
+ State definition for the LangGraph agent workflow
13
+ """
14
+ # User interaction
15
+ user_query: str
16
+ conversation_history: List[Dict[str, str]]
17
+
18
+ # LightRAG integration
19
+ lightrag_response: Optional[Dict[str, Any]]
20
+ lightrag_error: Optional[str]
21
+
22
+ # Context processing
23
+ processed_context: Optional[str]
24
+ relevant_documents: List[str]
25
+
26
+ # Agent reasoning
27
+ analysis_thoughts: Optional[str]
28
+ needs_clarification: bool
29
+ clarification_question: Optional[str]
30
+
31
+ # Final output
32
+ final_response: Optional[str]
33
+ confidence_score: Optional[float]
34
+
35
+ # Metadata
36
+ query_timestamp: str
37
+ processing_time: Optional[float]
38
+ query_type: Optional[str] # "comparison", "explanation", "compliance", "general"
39
+
40
+
41
class ConversationManager:
    """
    Manages conversation history and context.

    History entries are ``{"role", "content", "timestamp"}`` dicts. Methods
    never mutate the caller's list — they return new lists.
    """

    def __init__(self, max_history: int = 10):
        # Maximum number of user/assistant exchange *pairs* to retain.
        self.max_history = max_history

    def add_exchange(self, history: List[Dict[str, str]], user_query: str, agent_response: str) -> List[Dict[str, str]]:
        """
        Return a copy of *history* with one new user/assistant exchange
        appended, truncated to the most recent ``max_history`` pairs.
        """
        # One timestamp shared by both halves of the exchange, so the pair
        # sorts/groups together.
        now = datetime.now().isoformat()
        updated_history = history + [
            {"role": "user", "content": user_query, "timestamp": now},
            {"role": "assistant", "content": agent_response, "timestamp": now},
        ]

        # Keep only the last max_history exchanges (pairs)
        if len(updated_history) > self.max_history * 2:
            updated_history = updated_history[-self.max_history * 2:]

        return updated_history

    def format_for_lightrag(self, history: List[Dict[str, str]]) -> List[Dict[str, str]]:
        """
        Format conversation history for the LightRAG API: keep only the
        role/content keys (drops timestamps).
        """
        return [
            {"role": exchange["role"], "content": exchange["content"]}
            for exchange in history
        ]

    def get_context_summary(self, history: List[Dict[str, str]]) -> str:
        """
        Render the last 3 exchanges (6 messages) as a readable transcript.
        """
        if not history:
            return "No previous conversation context."

        context_parts = []
        for exchange in history[-6:]:  # Last 3 exchanges
            role = "User" if exchange["role"] == "user" else "Assistant"
            context_parts.append(f"{role}: {exchange['content']}")

        return "\n".join(context_parts)
docker-compose.yml ADDED
@@ -0,0 +1,24 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: '3.8'
2
+
3
+ services:
4
+ cyberlegal-api:
5
+ build: .
6
+ container_name: cyberlegal-ai
7
+ ports:
8
+ - "8000:8000" # API port only
9
+ environment:
10
+ - OPENAI_API_KEY=${OPENAI_API_KEY}
11
+ - LIGHTRAG_API_KEY=${LIGHTRAG_API_KEY}
12
+ - LIGHTRAG_HOST=127.0.0.1
13
+ - LIGHTRAG_PORT=9621
14
+ - API_PORT=8000
15
+ volumes:
16
+ - ./rag_storage:/app/rag_storage # Persist LightRAG data
17
+ - ./.env:/app/.env # Environment file
18
+ restart: unless-stopped
19
+ healthcheck:
20
+ test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
21
+ interval: 30s
22
+ timeout: 10s
23
+ retries: 3
24
+ start_period: 60s
langraph_agent.py ADDED
@@ -0,0 +1,260 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Simplified LangGraph agent implementation for cyber-legal assistant
4
+ """
5
+
6
+ import os
7
+ from typing import Dict, Any, List, Optional
8
+ from datetime import datetime
9
+ from langgraph.graph import StateGraph, END
10
+ from langchain_openai import ChatOpenAI
11
+ from langchain_core.messages import HumanMessage, SystemMessage
12
+
13
+ from agent_state import AgentState, ConversationManager
14
+ from prompts import SYSTEM_PROMPT, ERROR_HANDLING_PROMPT
15
+ from utils import LightRAGClient, ConversationFormatter, PerformanceMonitor
16
+
17
+
18
class CyberLegalAgent:
    """
    Simplified LangGraph-based cyber-legal assistant agent.

    Workflow: query_lightrag -> answer_with_context on success, or
    query_lightrag -> handle_error when retrieval fails.
    """

    def __init__(self, openai_api_key: Optional[str] = None):
        """
        Args:
            openai_api_key: Explicit API key; falls back to OPENAI_API_KEY.
        """
        # NOTE(review): confirm this model id is actually served for the key
        # in use (the original comment still referred to gpt-4o-mini).
        self.llm = ChatOpenAI(
            model="gpt-5-nano-2025-08-07",
            temperature=0.1,
            openai_api_key=openai_api_key or os.getenv("OPENAI_API_KEY")
        )

        # Initialize components
        self.lightrag_client = LightRAGClient()
        self.conversation_manager = ConversationManager()
        self.performance_monitor = PerformanceMonitor()

        # Build and compile the workflow graph
        self.workflow = self._build_workflow()

    def _build_workflow(self) -> StateGraph:
        """
        Build the simplified LangGraph workflow.

        Routing out of query_lightrag is handled exclusively by the
        conditional edge.
        """
        workflow = StateGraph(AgentState)

        # Add nodes
        workflow.add_node("query_lightrag", self._query_lightrag)
        workflow.add_node("answer_with_context", self._answer_with_context)
        workflow.add_node("handle_error", self._handle_error)

        workflow.set_entry_point("query_lightrag")

        # BUG FIX: the previous version also added an unconditional edge
        # query_lightrag -> answer_with_context on top of this conditional
        # edge, which made the answer branch run even when an error was
        # routed to handle_error. The conditional edge alone does the routing.
        workflow.add_conditional_edges(
            "query_lightrag",
            self._should_handle_error,
            {
                "error": "handle_error",
                "continue": "answer_with_context"
            }
        )
        workflow.add_edge("answer_with_context", END)
        workflow.add_edge("handle_error", END)

        return workflow.compile()

    def _should_handle_error(self, state: AgentState) -> str:
        """Route to the error node when LightRAG retrieval recorded an error."""
        if state.get("lightrag_error"):
            return "error"
        return "continue"

    async def _query_lightrag(self, state: AgentState) -> AgentState:
        """
        Query LightRAG for legal information; store the raw response and the
        extracted references (or an error message) on the state.
        """
        self.performance_monitor.start_timer("lightrag_query")
        try:
            # Fail fast if the retrieval backend is down.
            if not self.lightrag_client.health_check():
                state["lightrag_error"] = "LightRAG server is not healthy"
                return state

            # Prepare conversation history for LightRAG
            history = state.get("conversation_history", [])
            formatted_history = ConversationFormatter.build_conversation_history(history)

            response = self.lightrag_client.query(
                query=state["user_query"],
                conversation_history=formatted_history
            )

            if "error" in response:
                state["lightrag_error"] = response["error"]
            else:
                state["lightrag_response"] = response
                state["relevant_documents"] = self.lightrag_client.get_references(response)

        except Exception as e:
            state["lightrag_error"] = f"LightRAG query failed: {str(e)}"
        finally:
            # BUG FIX: end the timer on every path (the early return above
            # previously skipped it).
            self.performance_monitor.end_timer("lightrag_query")

        return state

    async def _answer_with_context(self, state: AgentState) -> AgentState:
        """
        Generate the final answer from the retrieved LightRAG context,
        appending references and the standard disclaimer.
        """
        self.performance_monitor.start_timer("answer_generation")
        try:
            if not state.get("lightrag_response"):
                state["lightrag_error"] = "No response from LightRAG"
                return state

            # Extract context from LightRAG response
            context = state["lightrag_response"].get("response", "")
            if not context:
                state["final_response"] = "I apologize, but I couldn't find relevant information for your query."
                return state

            # Prompt the LLM to answer strictly from the retrieved context
            answer_prompt = f"""Based on the following retrieved legal information, please answer the user's question accurately and comprehensively.

**User Question:** {state["user_query"]}

**Retrieved Legal Context:**
{context}

**Instructions:**
1. Answer the user's question directly based on the provided context
2. If the context doesn't fully answer the question, acknowledge the limitations
3. Provide specific legal references when available in the context
4. Include practical implications for organizations
5. Add a disclaimer that this is for guidance purposes only

Please provide a clear, well-structured response."""

            messages = [
                SystemMessage(content=SYSTEM_PROMPT),
                HumanMessage(content=answer_prompt)
            ]
            response = await self.llm.ainvoke(messages)
            answer = response.content

            # Append up to 3 source references
            references = state.get("relevant_documents", [])
            if references:
                answer += "\n\n**📚 References:**\n"
                for ref in references[:3]:
                    answer += f"• {ref}\n"

            # Standard disclaimer
            answer += "\n\n**Disclaimer:** This information is for guidance purposes only and not legal advice. For specific legal matters, consult with qualified legal counsel."

            state["final_response"] = answer
            state["confidence_score"] = 0.8  # High confidence when LightRAG provides good context

        except Exception as e:
            state["lightrag_error"] = f"Answer generation failed: {str(e)}"
        finally:
            # BUG FIX: timing/metadata is now recorded on every path,
            # including the early returns above.
            self.performance_monitor.end_timer("answer_generation")
            state["processing_time"] = self._total_processing_time()
            state["query_timestamp"] = datetime.now().isoformat()

        return state

    async def _handle_error(self, state: AgentState) -> AgentState:
        """
        Produce a graceful user-facing message when the workflow failed.
        """
        error = state.get("lightrag_error", "Unknown error occurred")
        error_prompt = ERROR_HANDLING_PROMPT.format(error_message=error)

        try:
            messages = [
                SystemMessage(content=SYSTEM_PROMPT),
                HumanMessage(content=error_prompt)
            ]
            response = await self.llm.ainvoke(messages)
            state["final_response"] = response.content

        except Exception:
            # The LLM is unavailable too — fall back to a static apology.
            state["final_response"] = f"I apologize, but an error occurred: {error}"

        state["confidence_score"] = 0.2  # Low confidence for errors
        # BUG FIX: previously the whole metrics *dict* was assigned here even
        # though processing_time is declared Optional[float].
        state["processing_time"] = self._total_processing_time()
        state["query_timestamp"] = datetime.now().isoformat()

        return state

    def _total_processing_time(self) -> float:
        """Sum the recorded per-stage durations into one total, in seconds."""
        metrics = self.performance_monitor.get_metrics()
        return sum(
            metrics.get(f"{op}_duration", 0)
            for op in ("lightrag_query", "answer_generation")
        )

    async def process_query(
        self,
        user_query: str,
        conversation_history: Optional[List[Dict[str, str]]] = None
    ) -> Dict[str, Any]:
        """
        Process a user query through the agent workflow.

        Returns a plain dict: response, confidence, processing_time,
        references, error, timestamp.
        """
        initial_state: AgentState = {
            "user_query": user_query,
            "conversation_history": conversation_history or [],
            "lightrag_response": None,
            "lightrag_error": None,
            "processed_context": None,
            "relevant_documents": [],
            "analysis_thoughts": None,
            "needs_clarification": False,
            "clarification_question": None,
            "final_response": None,
            "confidence_score": None,
            "query_timestamp": datetime.now().isoformat(),
            "processing_time": None,
            "query_type": None
        }

        # Fresh timings for each query
        self.performance_monitor.reset()

        try:
            final_state = await self.workflow.ainvoke(initial_state)

            return {
                "response": final_state.get("final_response", ""),
                "confidence": final_state.get("confidence_score", 0.0),
                "processing_time": final_state.get("processing_time", 0.0),
                "references": final_state.get("relevant_documents", []),
                "error": final_state.get("lightrag_error"),
                "timestamp": final_state.get("query_timestamp")
            }

        except Exception as e:
            return {
                "response": f"I apologize, but a critical error occurred: {str(e)}",
                "confidence": 0.0,
                "processing_time": 0.0,
                "references": [],
                "error": str(e),
                "timestamp": datetime.now().isoformat()
            }
prompts.py ADDED
@@ -0,0 +1,133 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
#!/usr/bin/env python3
"""
System prompts for the LangGraph cyber-legal assistant
"""

# Base persona injected into every LLM call: defines the assistant's
# regulatory scope, response structure, and the mandatory
# not-legal-advice disclaimer.
SYSTEM_PROMPT = """You are an expert cyber-legal assistant specializing in European Union regulations and directives.
Your expertise covers:

- GDPR (General Data Protection Regulation)
- NIS2 Directive (Network and Information Systems Directive 2)
- DORA (Digital Operational Resilience Act)
- Cyber Resilience Act (CRA)
- eIDAS 2.0 (Electronic Identification, Authentication and Trust Services)
- Romanian Civil Code provisions relevant to cyber security

**Your Role:**
Provide accurate, clear, and practical information about cyber-legal regulations. Always base your responses on the retrieved legal documents and context provided.

**Guidelines:**
1. Be precise and accurate with legal information
2. Provide practical examples when helpful
3. Clarify jurisdiction (EU-wide vs member state implementation)
4. Mention important dates, deadlines, or transitional periods
5. Include relevant penalties or enforcement mechanisms when applicable
6. Suggest official sources for further reading

**Response Structure:**
1. Direct answer to the user's question
2. Relevant legal basis (specific articles, sections)
3. Practical implications
4. Related compliance requirements
5. References to source documents

**Important Disclaimer:**
Always include a note that this information is for guidance purposes and not legal advice. For specific legal matters, consult with qualified legal counsel."""

# Template used to rewrite a raw LightRAG answer into a fuller response.
# Format placeholders: {lightrag_response}, {conversation_context}, {user_query}.
CONTEXT_ENHANCEMENT_PROMPT = """Based on the following RAG response about European cyber-legal regulations, enhance the information by:

1. **Structuring**: Organize the information in a clear, logical manner
2. **Context**: Add relevant background information about the regulation
3. **Practicality**: Include practical implications for organizations
4. **Completeness**: Fill in gaps with general knowledge about EU regulations
5. **Clarity**: Ensure complex legal concepts are explained clearly

**RAG Response:**
{lightrag_response}

**Conversation Context:**
{conversation_context}

**User Query:**
{user_query}

Please provide an enhanced response that is more comprehensive and user-friendly while maintaining accuracy."""

# User-facing fallback shown when the RAG backend fails.
# Format placeholder: {error_message}.
ERROR_HANDLING_PROMPT = """I apologize, but I encountered an issue while retrieving information from the legal database.

**Error Details:**
{error_message}

**What you can do:**
1. Try rephrasing your question
2. Check if the regulation name is spelled correctly
3. Ask about a specific aspect of the regulation
4. Try a more general question about the topic

**Available Regulations:**
- GDPR (Data Protection)
- NIS2 (Cybersecurity for critical entities)
- DORA (Financial sector operational resilience)
- Cyber Resilience Act (Product security requirements)
- eIDAS 2.0 (Digital identity and trust services)

Would you like to try asking your question in a different way?"""

# Asks the user for more detail before querying the database.
# Format placeholders: {user_query}, {clarification_question}.
CLARIFICATION_PROMPT = """To provide you with the most accurate information, I need a bit more detail about your question.

**Your Question:** {user_query}

**Clarification Needed:** {clarification_question}

This will help me search the specific legal provisions that are most relevant to your situation."""

# Formatting instructions applied to the final answer.
# Format placeholders: {content}, {user_query}.
RESPONSE_FORMATTING_PROMPT = """Format the final response according to these guidelines:

1. **Clear Heading**: Start with a clear, direct answer
2. **Legal Basis**: Reference specific articles or sections when available
3. **Key Points**: Use bullet points for important information
4. **Practical Impact**: Explain what this means for organizations
5. **References**: List source documents
6. **Disclaimer**: Include the standard legal disclaimer

**Content to Format:**
{content}

**User Query:** {user_query}"""

# Generates 3-4 follow-up question suggestions for the user.
# Format placeholder: {user_query}.
FOLLOW_UP_SUGGESTIONS_PROMPT = """Based on the user's query about "{user_query}", suggest relevant follow-up questions that might be helpful:

Consider:
1. Related regulations they might need to know about
2. Implementation or compliance aspects
3. Similar scenarios or use cases
4. Recent updates or changes

Provide 3-4 relevant follow-up suggestions."""

# Summarises a whole conversation for context carry-over.
# Format placeholder: {conversation_history}.
CONVERSATION_SUMMARY_PROMPT = """Summarize the key points discussed in this conversation about European cyber-legal regulations:

**Conversation History:**
{conversation_history}

**Focus Areas:**
- Main regulations discussed
- Key compliance points mentioned
- Important deadlines or requirements
- Any specific scenarios covered

Provide a concise summary that captures the essence of the legal discussion."""

# Asks the LLM to self-score an answer on a 0.0-1.0 scale.
# Format placeholders: {response}, {user_query}.
CONFIDENCE_ASSESSMENT_PROMPT = """Assess the confidence level of the provided response based on:

1. **Source Quality**: How reliable are the referenced documents?
2. **Information Completeness**: Does the response fully address the query?
3. **Legal Specificity**: How specific and accurate are the legal references?
4. **Context Relevance**: How well does it match the user's needs?

**Response to Assess:**
{response}

**User Query:** {user_query}

Provide a confidence score (0.0-1.0) and brief reasoning."""
requirements.txt ADDED
@@ -0,0 +1,19 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Core dependencies
2
+ gradio>=4.0.0
3
+ requests>=2.25.0
4
+ python-dotenv
5
+ lightrag-hku[api]
6
+
7
+ # LangGraph and LangChain dependencies
8
+ langgraph>=0.0.26
9
+ langchain>=0.1.0
10
+ langchain-openai>=0.1.0
11
+ langchain-community>=0.0.20
12
+
13
+ # FastAPI and server dependencies
14
+ fastapi>=0.104.0
15
+ uvicorn[standard]>=0.24.0
16
+
17
+ # Additional utilities
18
+ pydantic>=2.0.0
19
+ typing-extensions>=4.0.0
startup.sh ADDED
@@ -0,0 +1,49 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
#!/usr/bin/env bash
# Boot script: launch the LightRAG backend, wait for its health check to
# pass, then start the public-facing LangGraph API server in the foreground.
set -euo pipefail

LIGHTRAG_HOST="${LIGHTRAG_HOST:-127.0.0.1}"
LIGHTRAG_PORT="${LIGHTRAG_PORT:-9621}"

# Platform public port (Render/Koyeb/etc) or local fallback
PUBLIC_PORT="${PORT:-${API_PORT:-8000}}"

echo "🚀 Starting CyberLegal AI Stack..."
echo "Step 1: Starting LightRAG server on ${LIGHTRAG_HOST}:${LIGHTRAG_PORT} ..."

lightrag-server --host "${LIGHTRAG_HOST}" --port "${LIGHTRAG_PORT}" &
LIGHTRAG_PID=$!

# Always tear down the background RAG process, whether we exit normally,
# fail, or get interrupted.
cleanup() {
    echo "🧹 Shutting down..."
    kill -TERM "${LIGHTRAG_PID}" 2>/dev/null || true
    wait "${LIGHTRAG_PID}" 2>/dev/null || true
}
trap cleanup EXIT INT TERM

echo "Waiting for LightRAG server to be ready..."
max_attempts=30
ready=0
for attempt in $(seq 1 "${max_attempts}"); do
    if curl -fsS "http://${LIGHTRAG_HOST}:${LIGHTRAG_PORT}/health" >/dev/null 2>&1; then
        echo "✅ LightRAG server is ready!"
        ready=1
        break
    fi
    echo "Attempt ${attempt}/${max_attempts}: LightRAG not ready yet..."
    sleep 2
done

if [ "${ready}" -ne 1 ]; then
    echo "❌ LightRAG server failed to start"
    exit 1
fi

echo "Step 2: Starting LangGraph API server on 0.0.0.0:${PUBLIC_PORT} ..."
echo "🌐 API: http://localhost:${PUBLIC_PORT}"
echo "📚 RAG: http://${LIGHTRAG_HOST}:${LIGHTRAG_PORT}"
echo "🎉 Ready!"

# Ensure FastAPI reads the correct port on platforms
export PORT="${PUBLIC_PORT}"

python agent_api.py
utils.py ADDED
@@ -0,0 +1,274 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env python3
2
+ """
3
+ Utility functions for LightRAG integration and agent operations
4
+ """
5
+
6
+ import os
7
+ import requests
8
+ import time
9
+ from typing import Dict, List, Any, Optional, Tuple
10
+ from dotenv import load_dotenv
11
+ from datetime import datetime
12
+ import logging
13
+
14
# Load environment variables from a local .env file; values already set in
# the process environment win because override=False.
load_dotenv(dotenv_path=".env", override=False)

# Configure module-level logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# LightRAG configuration: where the backing RAG server listens and the
# optional API key forwarded on every request.
LIGHTRAG_PORT = int(os.getenv("LIGHTRAG_PORT", "9621"))
LIGHTRAG_HOST = os.getenv("LIGHTRAG_HOST", "127.0.0.1")
SERVER_URL = f"http://{LIGHTRAG_HOST}:{LIGHTRAG_PORT}"
API_KEY = os.getenv("LIGHTRAG_API_KEY")  # None -> no X-API-Key header sent
26
+
27
class LightRAGClient:
    """
    Thin HTTP client for the LightRAG server.

    Wraps the /health and /query endpoints, attaching the optional API key
    header and adding retry-with-backoff behaviour around queries.
    """

    def __init__(self, server_url: str = SERVER_URL, api_key: Optional[str] = API_KEY):
        self.server_url = server_url
        self.api_key = api_key
        self.timeout = 60  # seconds allowed per /query request

    def health_check(self, timeout: float = 1.5) -> bool:
        """
        Check if LightRAG server is healthy.

        Returns True only on an HTTP 200 from /health; any connection
        error or timeout is logged and reported as unhealthy.
        """
        try:
            response = requests.get(f"{self.server_url}/health", timeout=timeout)
            return response.status_code == 200
        except Exception as e:
            logger.warning(f"Health check failed: {e}")
            return False

    def query(
        self,
        query: str,
        mode: str = "mix",
        include_references: bool = True,
        conversation_history: Optional[List[Dict[str, str]]] = None,
        max_retries: int = 3
    ) -> Dict[str, Any]:
        """
        Query LightRAG server with retry logic.

        Transient failures (timeouts, connection errors, 5xx) are retried
        with exponential backoff. Client errors (4xx other than 408/429)
        are NOT retried, since repeating an invalid request cannot succeed.

        Returns the parsed JSON response on success, otherwise a dict with
        an "error" key.
        """
        headers = {"Content-Type": "application/json"}
        if self.api_key:
            headers["X-API-Key"] = self.api_key

        payload = {
            "query": query,
            "mode": mode,
            "include_references": include_references,
            "conversation_history": conversation_history or [],
        }

        for attempt in range(max_retries):
            try:
                response = requests.post(
                    f"{self.server_url}/query",
                    json=payload,
                    headers=headers,
                    timeout=self.timeout
                )

                if response.status_code == 200:
                    # May still raise if the body is not valid JSON; that is
                    # caught below and treated as a retryable failure.
                    return response.json()

                logger.warning(f"Query failed with status {response.status_code}, attempt {attempt + 1}")

                # Fix: do not retry non-transient client errors. 408
                # (request timeout) and 429 (rate limited) remain retryable.
                if 400 <= response.status_code < 500 and response.status_code not in (408, 429):
                    return {"error": f"Query rejected with status {response.status_code}"}

            except requests.exceptions.Timeout:
                logger.warning(f"Query timeout, attempt {attempt + 1}")
            except Exception as e:
                logger.warning(f"Query error: {e}, attempt {attempt + 1}")

            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff

        return {"error": f"Query failed after {max_retries} attempts"}

    def get_references(self, response_data: Dict[str, Any]) -> List[str]:
        """
        Extract reference file names from a LightRAG response.

        Takes the top 5 entries of the "references" list and returns just
        the basename of each referenced file path.
        """
        references = response_data.get("references", []) or []
        ref_list = []

        for ref in references[:5]:  # Limit to top 5 references
            file_name = str(ref.get("file_path", "Unknown file")).split("/")[-1]
            ref_list.append(file_name)

        return ref_list
106
+
107
+
108
class ResponseProcessor:
    """
    Helpers for post-processing LightRAG responses.
    """

    @staticmethod
    def extract_main_content(response: Dict[str, Any]) -> str:
        """Return the main answer text, with a fixed fallback when absent."""
        return response.get("response", "No response available.")

    @staticmethod
    def format_references(references: List[str]) -> str:
        """
        Render a bulleted reference section for display.

        Returns an empty string when there are no references.
        """
        if not references:
            return ""

        bullets = "".join(f"• {ref}\n" for ref in references)
        return "\n\n**📚 References:**\n" + bullets

    @staticmethod
    def extract_key_entities(response: Dict[str, Any]) -> List[str]:
        """
        Return the EU regulations mentioned in the response text.

        Matching is a case-insensitive substring scan over a fixed list of
        regulation names; duplicates are collapsed (result order is
        therefore unspecified).
        """
        # Could be enhanced if LightRAG ever exposes entity information.
        text = response.get("response", "").lower()
        known_regulations = ["GDPR", "NIS2", "DORA", "CRA", "eIDAS", "Cyber Resilience Act"]

        matches = {name for name in known_regulations if name.lower() in text}
        return list(matches)
151
+
152
+
153
class ConversationFormatter:
    """
    Format conversation data for different purposes.
    """

    @staticmethod
    def build_conversation_history(history: List[Dict[str, str]], max_turns: int = 10) -> List[Dict[str, str]]:
        """
        Normalise chat history for the LightRAG API.

        Keeps only the most recent ``max_turns`` user/assistant pairs and
        strips each message down to its role and content fields.
        """
        if not history:
            return []

        window = history[-(max_turns * 2):]
        return [{"role": msg["role"], "content": msg["content"]} for msg in window]

    @staticmethod
    def create_context_summary(history: List[Dict[str, str]]) -> str:
        """
        Produce a short transcript of the last two exchanges, truncating
        each message to 100 characters.
        """
        if not history:
            return "No previous conversation."

        lines = []
        for msg in history[-4:]:  # last 2 user/assistant exchanges
            speaker = "User" if msg["role"] == "user" else "Assistant"
            text = msg["content"]
            if len(text) > 100:
                text = text[:100] + "..."
            lines.append(f"{speaker}: {text}")

        return "\n".join(lines)
195
+
196
+
197
class PerformanceMonitor:
    """
    Collects wall-clock timings for named agent operations.

    Timings live in one flat dict: "<op>_start" holds the start timestamp
    and "<op>_duration" the elapsed seconds.
    """

    def __init__(self):
        self.metrics = {}

    def start_timer(self, operation: str) -> None:
        """Record the start timestamp for *operation*."""
        self.metrics[f"{operation}_start"] = time.time()

    def end_timer(self, operation: str) -> float:
        """
        Stop timing *operation* and return its duration in seconds.

        Returns 0.0 when no matching start_timer() call was recorded.
        """
        started = self.metrics.get(f"{operation}_start")
        if not started:
            return 0.0
        elapsed = time.time() - started
        self.metrics[f"{operation}_duration"] = elapsed
        return elapsed

    def get_metrics(self) -> Dict[str, Any]:
        """Return a shallow copy of all collected metrics."""
        return self.metrics.copy()

    def reset(self) -> None:
        """Discard all recorded metrics."""
        self.metrics.clear()
233
+
234
+
235
def validate_query(query: str) -> Tuple[bool, Optional[str]]:
    """
    Check that a user query is non-empty and within the length limit.

    Returns (True, None) for valid input, otherwise (False, reason).
    """
    if not (query and query.strip()):
        return False, "Query cannot be empty."

    if len(query) > 1000:
        return False, "Query is too long. Please keep it under 1000 characters."

    return True, None
246
+
247
+
248
def format_error_message(error: str) -> str:
    """
    Map raw error text to a friendly, user-facing message.

    The first known pattern found (case-insensitively) in the error text
    wins; unrecognised errors fall back to a generic wrapper.
    """
    friendly = {
        "Server unreachable": "❌ The legal database is currently unavailable. Please try again in a moment.",
        "timeout": "❌ The request timed out. Please try again.",
        "invalid json": "❌ There was an issue processing the response. Please try again.",
        "health check failed": "❌ The system is initializing. Please wait a moment and try again."
    }

    lowered = error.lower()
    matched = next((msg for key, msg in friendly.items() if key.lower() in lowered), None)
    return matched if matched is not None else f"❌ An error occurred: {error}"
264
+
265
+
266
def create_safe_filename(query: str, timestamp: str) -> str:
    """
    Build a filesystem-safe log file name from a query and timestamp.

    Drops every character that is not alphanumeric, space, dash, or
    underscore, then caps the query portion at 50 characters.
    """
    allowed_punct = (' ', '-', '_')
    cleaned = "".join(ch for ch in query if ch.isalnum() or ch in allowed_punct).strip()
    return f"{timestamp}_{cleaned[:50]}.log"