TENGWAR - Private RAG Assistant
A 100% private enterprise Retrieval-Augmented Generation (RAG) platform for teams that need AI-powered document search without sending data to the cloud.
Problem & Context
Many SMBs need internal knowledge assistants but cannot send proprietary documents to external LLM APIs for legal, GDPR, trade-secret, or IP-protection reasons. TENGWAR solves this: a plug-and-play hardware and software package that runs entirely on-premises, embeds company documents (PDF, Word, Excel, PowerPoint, Confluence exports, etc.) into a vector database, and provides a ChatGPT-like Q&A interface with precise source citations, without any data leaving the premises. It supports 100+ languages and multi-user access control, and integrates with internal authentication systems.
Responsibilities
- Full-stack development: UI, backend, AI pipeline & hardware selection
- RAG architecture: embedding models, vector databases (pgvector), retrieval strategies
- Multi-format ingestion: PDF, DOCX, XLSX, PPTX, TXT, MD, HTML (20+ formats)
- Real-time chat interface: Next.js UI with SignalR streaming for progressive responses
- Source citation overlay: every answer links to exact document & page number
- Role-based access control: departments & user permissions on document collections
- Hardware integration: Linux-based appliance configuration (Debian, systemd services)
- Multi-language translation layer: automatic query & response translation (100+ languages)
- Deployment & support: client onboarding, training & iterative improvements
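The embed-and-retrieve step behind the responsibilities above can be sketched as follows. The production pipeline uses multilingual sentence-transformers for embeddings and pgvector for similarity search; in this self-contained sketch a toy bag-of-words vector and an in-memory index stand in for both, and the `Chunk` fields are illustrative, chosen to show how the source-citation overlay gets its document name and page number.

```python
# Sketch of the retrieval step: chunk documents, embed them, and return the
# best-matching chunks together with the citation metadata (source file and
# page) that the UI overlay displays. The bag-of-words "embedding" below is a
# stand-in for the multilingual sentence-transformer used in production.
import math
import re
from collections import Counter
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str   # document filename, shown in the citation overlay
    page: int     # page number cited alongside every answer

def embed(text: str) -> Counter:
    # Stand-in embedding: lowercase term counts (real system: dense vectors).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[Chunk], top_k: int = 2) -> list[Chunk]:
    # In production this is a pgvector nearest-neighbour query.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:top_k]

index = [
    Chunk("Vacation requests must be filed two weeks in advance.", "hr-policy.pdf", 4),
    Chunk("The VPN client is rolled out via the IT self-service portal.", "it-guide.docx", 12),
]
hits = retrieve("How do I request vacation?", index, top_k=1)
print(hits[0].source, hits[0].page)
```

The same shape carries over to the real stack: the `Chunk` rows live in PostgreSQL with a pgvector column, and `retrieve` becomes an `ORDER BY embedding <=> query_vector LIMIT k` query.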
Architecture & Stack
- Frontend: Next.js 14 (React, TypeScript), Tailwind CSS, SignalR client
- Backend: ASP.NET Core 8 (C#), SignalR WebSocket streaming
- RAG Pipeline: Python (LangChain, sentence-transformers), pgvector (PostgreSQL extension)
- Embedding models: multilingual sentence-transformers (local inference)
- LLM inference: Ollama (local models: Llama, Mistral, etc.) or external API fallback
- Document ingestion: Apache Tika + custom parsers for metadata extraction
- Database: PostgreSQL 16 + pgvector for vector search
- Authentication: LDAP / Active Directory integration + JWT tokens
- Hardware: Intel NUC / custom Linux appliance with GPU (NVIDIA RTX for embeddings)
- Deployment: Docker Compose orchestration, systemd service management
- Monitoring: Prometheus + Grafana for usage analytics
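To show how the pieces above connect, here is a minimal sketch of the glue between pgvector retrieval and local Ollama inference: retrieved chunks and their source metadata are packed into a numbered prompt so the model's answer can carry citations. The template, function name, and chunk field names are assumptions for illustration, not the shipped TENGWAR code.

```python
# Sketch of prompt assembly between retrieval and LLM inference. Numbering
# each chunk lets the model cite [1], [2], ... which the UI maps back to the
# exact document and page. Template and field names are illustrative.

def build_prompt(question: str, chunks: list[dict]) -> str:
    # Render each retrieved chunk with its citation header.
    context = "\n\n".join(
        f"[{i}] ({c['source']}, p. {c['page']})\n{c['text']}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the numbered sources below and cite them "
        "as [n] after each claim.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

chunks = [
    {"text": "Vacation requests need two weeks' notice.",
     "source": "hr-policy.pdf", "page": 4},
]
prompt = build_prompt("How early must I file a vacation request?", chunks)
print(prompt)
```

In the running system, the assembled prompt is POSTed to Ollama's local `/api/generate` endpoint and the token stream is relayed to the Next.js client over SignalR for progressive rendering.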
Outcomes
- Enabled 100% private AI assistant for clients unable to use cloud LLMs
- 20+ file format support eliminated manual document conversion workflows
- Precise source citations increased user trust & compliance auditability
- Role-based access ensured departmental data isolation & security compliance
- Multi-language support expanded addressable market to non-English organizations
- Plug-and-play hardware model reduced deployment friction & IT burden
Learn More
Official product page: tengwar.net
