AI Virtual Physician
Created a virtual physician in 48 hours that conducts primary care consults, applies symptom triage logic, and uses structured prompts to escalate emergencies or provide empathetic self-care guidance. Built with Next.js, Vercel AI SDK, and Supabase.
Problem
People often feel uncertain about whether their symptoms need self-care or professional help. Traditional telehealth can be expensive or slow, and many users delay seeking support. I wanted to explore how AI could offer a low-friction, trustworthy way to triage symptoms at home.
Objective
In 48 hours, design an AI agent that simulates a primary care consultation. It should handle mild symptoms with self-care suggestions, recognize emergency symptoms, and escalate appropriately, all while following strict safety and empathy protocols.
User Flow (Discovery)
Interviewed medical professionals and supplemented those conversations with online research to understand the typical flow of a primary care consultation.
Patient Journey

Healthcare Professional Journey

Link to Figma for Full Resolution Version
Solution
AI Primary Care Triage Agent
Tech Stack Used: Next.js, the Vercel AI SDK, OpenAI, Supabase (Auth and Postgres DB), and Typeform.
Key Functions
Symptom Intake via Typeform: Collects structured inputs (symptom type, onset, severity, concerns) and sends them to the AI agent, which merges them with prior session data for context-aware triage (a minimal route sketch follows this list).
Triage & Classification: Distinguishes mild symptoms from emergency red flags using both user language and clinical context to choose the correct handling path.
Edge Case Handling: For mixed symptoms, escalates automatically if any single symptom meets emergency criteria and delivers clear, immediate escalation messaging with follow-up instructions.
Self-Care Recommendations: For mild cases, provides exactly three specific, numbered self-care steps with empathetic validation and asks, “How does this sound to you?” to confirm understanding.
Empathy Protocols: Uses supportive, plain-language responses (e.g., “It’s understandable you’re concerned about ___”) while avoiding dismissive tone or medical jargon.
Escalation Flows: For high-risk or unclear situations, the agent clearly states it cannot safely proceed, gives next steps, includes a disclaimer, and sets a follow-up time window.
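To make the intake-merging flow concrete, here is a minimal sketch of a Next.js route handler wired up with the Vercel AI SDK. The Supabase helpers, model choice, and prompt constant are illustrative assumptions, not the project's actual code:

```ts
// app/api/consult/route.ts: illustrative sketch, not the shipped implementation.
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";

// Hypothetical stand-ins for the real Supabase queries and the <500-word
// system prompt shown in the next section.
async function getIntake(sessionId: string): Promise<Record<string, string>> {
  return {}; // would fetch the Typeform submission for this session
}
async function getPriorSessions(sessionId: string): Promise<unknown[]> {
  return []; // would fetch earlier consult summaries
}
const TRIAGE_SYSTEM_PROMPT = "..."; // placeholder

export async function POST(req: Request) {
  const { messages, sessionId } = await req.json();

  // Merge structured intake and prior-session data into the system context
  // so triage is context-aware from the first turn.
  const intake = await getIntake(sessionId);
  const history = await getPriorSessions(sessionId);

  const result = streamText({
    model: openai("gpt-4o"), // assumed model
    system: [
      TRIAGE_SYSTEM_PROMPT,
      `<intake>${JSON.stringify(intake)}</intake>`,
      `<history>${JSON.stringify(history)}</history>`,
    ].join("\n"),
    messages,
  });

  return result.toDataStreamResponse();
}
```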
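The mixed-symptom escalation rule can also live in code rather than only in the prompt. The sketch below is one possible approach using the AI SDK's structured-output helper, not what shipped here: the model labels each symptom, and plain code applies the "escalate if any single symptom is an emergency" rule.

```ts
// Illustrative sketch: structured per-symptom verdicts, escalation rule in code.
import { openai } from "@ai-sdk/openai";
import { generateObject } from "ai";
import { z } from "zod";

const verdictSchema = z.object({
  symptoms: z.array(
    z.object({
      name: z.string(),
      emergency: z.boolean(), // does this symptom alone meet emergency criteria?
      rationale: z.string(), // surfaced to address reasoning opacity
    })
  ),
});

export async function classifyCase(intakeSummary: string) {
  const { object } = await generateObject({
    model: openai("gpt-4o"), // assumed model
    schema: verdictSchema,
    prompt: `Label each symptom in this intake as emergency or non-emergency:\n${intakeSummary}`,
  });
  // Edge-case rule: one emergency flag escalates the whole case.
  return object.symptoms.some((s) => s.emergency) ? "emergency" : "mild";
}
```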
Here is the system prompt I crafted for this project. I deliberately kept it short (under 500 words), since my next step would be implementing agent-to-agent tooling rather than relying on the system prompt to handle additional edge cases.
System Diagram

Link to Figma for Full Resolution Version
LLM Evaluation Plan
The AI agent was tested with over 100 synthetic patient inputs to verify prompt compliance, safety behavior, and structured empathy. Key checks included the following (a rule-based sketch of these checks appears after the list):
Prompt Adherence: Required language like “I understand” and the exact timeline prompt (“When did this first start, and has it been getting better, worse, or staying the same?”) were verified across all runs.
Required Context Logic: The model was blocked from making recommendations until all five required inputs (location, severity, onset, modifiers, timeline) had been gathered.
Emergency Escalation: For inputs with critical symptoms (e.g., chest pain), responses had to include all escalation language and disclaimers word-for-word.
Empathy Protocols: The model was evaluated on whether it responded empathetically to pain or fear using phrases like “That sounds really uncomfortable.”
Output Structure: Mild cases required exactly 3 self-care tips, plain language only, and had to end with “How does this sound to you?”
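As a rough illustration of how these checks can be automated, here is a minimal rule-based sketch; the `Transcript` shape and phrase constants are assumptions for illustration, not the actual harness:

```ts
// Illustrative per-run checks over a synthetic consult transcript.
type Transcript = {
  classification: "mild" | "emergency";
  fullText: string; // concatenated agent turns for the run
};

const TIMELINE_PROMPT =
  "When did this first start, and has it been getting better, worse, or staying the same?";
const CLOSER = "How does this sound to you?";

function checkRun(t: Transcript): string[] {
  const failures: string[] = [];

  // Prompt adherence: required empathy language and the exact timeline question.
  if (!t.fullText.includes("I understand")) failures.push("missing required empathy phrase");
  if (!t.fullText.includes(TIMELINE_PROMPT)) failures.push("missing exact timeline prompt");

  if (t.classification === "mild") {
    // Output structure: exactly three numbered self-care tips, ending with the closer.
    const numberedTips = t.fullText.match(/^\d\./gm) ?? [];
    if (numberedTips.length !== 3) failures.push(`expected 3 tips, found ${numberedTips.length}`);
    if (!t.fullText.trimEnd().endsWith(CLOSER)) failures.push("missing closing question");
  }
  return failures;
}
```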
Outcome
Mild Symptom Transcript



Emergency Symptom Transcript


My Learnings
Prompt design > product surface: Most of my time was spent refining reasoning logic, not UI. Effective prompt engineering (e.g., JSON/XML formatting) drove performance.
LLMs lack determinism: Inconsistent responses led me to implement Typeform pre-intake to ensure complete patient context and reduce hallucinations.
Reasoning opacity: GPT offered limited transparency into its decision paths. To mitigate this, I proposed chain-of-thought prompting and rationale layers in the output for user trust and clarity.
Adaptability gap: Static intake missed key context. I designed a smart prompt flow that dynamically requests missing info mid-chat to build richer patient profiles over time (a minimal sketch of this gating logic follows the list).
High-stakes use case sensitivity: Highlighted how clinical triage requires rigorous validation and empathetic tone design due to safety risks of incorrect advice.
Time-constrained execution: Building this in 48 hours forced me to prioritize core logic, focus on simplicity, and apply best practices in prompting and flow design.
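To make the adaptability fix concrete, here is a minimal sketch of the gating logic, reusing the five required fields from the evaluation plan; the field names and question phrasing are illustrative:

```ts
// Illustrative sketch: block recommendations until the profile is complete,
// and dynamically ask only for what is missing.
const REQUIRED_FIELDS = ["location", "severity", "onset", "modifiers", "timeline"] as const;

type Profile = Partial<Record<(typeof REQUIRED_FIELDS)[number], string>>;

function nextQuestion(profile: Profile): string | null {
  const missing = REQUIRED_FIELDS.filter((field) => !profile[field]);
  if (missing.length === 0) return null; // context complete; triage may proceed
  // In the real flow the agent phrases this empathetically; this is the gating logic only.
  return `Before I suggest anything, could you tell me more about: ${missing.join(", ")}?`;
}

console.log(nextQuestion({ location: "lower back", severity: "mild" }));
// -> "Before I suggest anything, could you tell me more about: onset, modifiers, timeline?"
```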
