ETHERIA
A full-stack web-based health assistant that assesses symptoms through natural conversation and voice — grounding every answer in your own records, peer-reviewed literature, and a medical knowledge graph.

Overview
Etheria pairs a real-time chat and voice interface with a tri-layer retrieval system — personal medical records, PubMed literature, and a clinical knowledge graph — behind a multi-layer safety framework that keeps responses preliminary, cited, and non-diagnostic.
The Problem
Healthcare access is delayed by limited doctor availability, long waits, and no immediate guidance for early or unclear symptoms.
People fall back on unreliable online searches — fuelling misinformation, anxiety, and poor decisions.
General-purpose AI chatbots hallucinate clinical facts and carry no structured safety guardrails for medical use.
No production-grade open system combines longitudinal personal context, population-level evidence, and deterministic safety in one place.
Objectives
Talk like a human
A conversational assistant that takes symptoms through natural voice and text, and understands them in context.
Ground every answer
Analyse symptoms against medical knowledge bases and patient history for accurate, safe, context-aware guidance.
Surface urgency
Classify how urgent a condition is and steer the user toward self-care, diagnostics, or timely consultation.
Key Features
Real-time chat & voice
Token-by-token streaming responses with a low-latency voice interface for spoken assessment.
Tri-layer RAG
Retrieval across personal records, PubMed literature, and a medical knowledge graph for grounded answers.
Structured medical output
Triage levels, differential diagnosis, and ICD-10 codes with inline PubMed citations.
Multi-layer safety
A safety framework that keeps responses controlled, non-diagnostic, and compliant by design.
Document pipeline
Upload reports and records; they are parsed, chunked, and embedded into personalized context.
System Architecture
Frontend
- App Router with SSE token streaming and WebSocket voice.
- Clerk RS256 JWT auth across four protected routes.
- 60 fps Canvas voice orb with a 6-state voice FSM.
- Triage badges and a differential panel with ICD-10 codes and citations.
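A 6-state voice FSM like the one mentioned above can be sketched as a transition table. This is only an illustration, written in Python for consistency with the backend examples rather than in the frontend's own language. The state names and event names below are assumptions; the source states only that there are six states.

```python
from enum import Enum, auto


class VoiceState(Enum):
    # Hypothetical state names; the source only says the FSM has six states.
    IDLE = auto()
    CONNECTING = auto()
    LISTENING = auto()
    PROCESSING = auto()
    SPEAKING = auto()
    ERROR = auto()


# (current state, event) -> next state. Event names are illustrative.
TRANSITIONS = {
    (VoiceState.IDLE, "start"): VoiceState.CONNECTING,
    (VoiceState.CONNECTING, "socket_open"): VoiceState.LISTENING,
    (VoiceState.LISTENING, "speech_end"): VoiceState.PROCESSING,
    (VoiceState.PROCESSING, "audio_ready"): VoiceState.SPEAKING,
    (VoiceState.SPEAKING, "playback_done"): VoiceState.LISTENING,
    (VoiceState.ERROR, "reset"): VoiceState.IDLE,
}


def next_state(state: VoiceState, event: str) -> VoiceState:
    """Return the next state; events with no matching transition are ignored."""
    if event == "error":
        return VoiceState.ERROR  # any state can fail
    if event == "stop":
        return VoiceState.IDLE  # the user can always end the session
    return TRANSITIONS.get((state, event), state)
```

Keeping the transitions in one table means the rendering code only has to read the current state, and an unexpected event cannot push the interface into an undefined state.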
Backend
- Pure asyncio on Uvicorn; five routers, never blocking the loop.
- Streaming chat over SSE; secure WebSocket voice (Whisper → TTS).
- ASGI rate-limit middleware; buffering disabled for instant streams.
- Temporal ingestion: parse (pdfplumber / PyMuPDF / OCR) → chunk → embed.
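As a rough illustration of token-by-token streaming over SSE, the sketch below frames each token as a Server-Sent Events message from an async generator. It uses only the standard library; `fake_token_source`, the `token` and `done` event names, and the header values are assumptions and not taken from the Etheria codebase. In a real ASGI app the generator would be handed to the framework's streaming response.

```python
import asyncio
import json
from typing import AsyncIterator

# Headers commonly sent with SSE responses. "X-Accel-Buffering: no" asks an
# nginx proxy not to buffer the stream (assumes nginx sits in front).
SSE_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "X-Accel-Buffering": "no",
}


def sse_frame(data: dict, event: str = "token") -> str:
    """Format one SSE message: an event line, a data line, a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def fake_token_source(text: str) -> AsyncIterator[str]:
    """Stand-in for a model client that yields tokens as they arrive."""
    for token in text.split(" "):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield token


async def stream_chat(text: str) -> AsyncIterator[str]:
    """Yield one SSE frame per token, then a final 'done' frame."""
    async for token in fake_token_source(text):
        yield sse_frame({"token": token})
    yield sse_frame({}, event="done")


async def collect(text: str) -> list:
    """Helper for trying the generator outside a web server."""
    return [frame async for frame in stream_chat(text)]
```

Running `asyncio.run(collect("mild headache"))` returns two `token` frames followed by one `done` frame, which a browser `EventSource` or a fetch-based reader could consume incrementally.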
Data Layer
- PostgreSQL 16 + pgvector: 1024-dim chunk embeddings, HNSW index with cosine distance, per-user isolation.
- Neo4j 5 knowledge graph: ~5k nodes from UMLS, SNOMED CT, RxNorm.
- BGE-large embeddings over 512-token chunks with a 50-token overlap.
- Redis 7 cache cuts retrieval latency from 2.8 s to 0.4 s and is also used for rate limiting.
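The 512-token chunks with a 50-token overlap can be sketched as a sliding window with a stride of 512 − 50 = 462 tokens. This is a minimal sketch: `chunk_tokens` is a hypothetical helper, and it operates on an already tokenized sequence, since the tokenizer used by the project is not specified here.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence, size: int = 512, overlap: int = 50) -> List[list]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap  # 462 with the defaults
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks
```

For a 1,000-token document this produces three chunks starting at tokens 0, 462, and 924, so the last 50 tokens of one chunk reappear at the start of the next and a sentence that straddles a boundary is still retrievable in one piece.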
Results
- Colour-coded triage: RED (emergency), YELLOW (urgent), GREEN (non-urgent).
- Every response carries inline PubMed citations (PMIDs) for factual grounding.
- Low-latency streaming chat responses, token by token.
- A hardcoded disclaimer is appended at the application layer for compliance.
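One way to read the triage and disclaimer points above is that the model's output is wrapped by application code that it cannot override. The sketch below assumes a simple `Assessment` record and a `render` function; both names, the disclaimer wording, and the placeholder PMID are illustrative and not taken from the project.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Triage(Enum):
    RED = "emergency"
    YELLOW = "urgent"
    GREEN = "non-urgent"


# Hypothetical wording; the project's actual disclaimer text is not given.
DISCLAIMER = (
    "This is preliminary information, not a diagnosis. "
    "Consult a qualified clinician for medical advice."
)


@dataclass
class Assessment:
    triage: Triage
    summary: str
    pmids: List[str] = field(default_factory=list)


def render(assessment: Assessment) -> str:
    """Build the user-facing text; the disclaimer is always appended last."""
    body = f"[{assessment.triage.name}: {assessment.triage.value}] {assessment.summary}"
    if assessment.pmids:
        body += " " + " ".join(f"[PMID:{p}]" for p in assessment.pmids)
    return f"{body}\n\n{DISCLAIMER}"


# "00000000" is a placeholder, not a real PubMed identifier.
example = render(Assessment(Triage.YELLOW, "See a clinician within 24 hours.", ["00000000"]))
```

Because the disclaimer is concatenated in code rather than requested in the prompt, it appears on every response regardless of what the model generates.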
Tech Stack
Team
Department of Artificial Intelligence & Data Science, NMAM Institute of Technology, Nitte.