AI Virtual Physician
Created a virtual physician in 48 hours that conducts primary care consults, applies symptom triage logic, and uses structured prompts to escalate emergencies or provide empathetic self-care guidance. Built with Next.js, Vercel AI SDK, and Supabase.
Problem
People often feel uncertain about whether their symptoms need self-care or professional help. Traditional telehealth can be expensive or slow, and many users delay seeking support. I wanted to explore how AI could offer a low-friction, trustworthy way to triage symptoms at home.
Objective
In 48 hours, design an AI agent that simulates a primary care consultation. It should handle mild symptoms with self-care suggestions, recognize emergency symptoms, and escalate appropriately, all while following strict safety and empathy protocols.
User Flow (Discovery)
Interviewed medical professionals and supplemented those conversations with online research to understand the typical flow of a primary care consultation.
Patient Journey

Healthcare Professional Journey

Link to Figma for Full Resolution Version
Solution
AI Primary Care Triage Agent
Tech Stack Used: Next.js, the Vercel AI SDK, OpenAI, Supabase (Auth and Postgres DB), and Typeform.
Key Functions
Symptom Intake via Typeform: Collects structured inputs (symptom type, onset, severity, concerns) and sends them to the AI agent, which merges them with prior session data for context-aware triage (a minimal route sketch follows this list).
Triage & Classification: Distinguishes mild symptoms from emergency red flags using both user language and clinical context to choose the correct handling path.
Edge Case Handling: For mixed symptoms, escalates automatically if any single symptom meets emergency criteria and delivers clear, immediate escalation messaging with follow-up instructions.
Self-Care Recommendations: For mild cases, provides exactly three specific, numbered self-care steps with empathetic validation and asks, “How does this sound to you?” to confirm understanding.
Empathy Protocols: Uses supportive, plain-language responses (e.g., “It’s understandable you’re concerned about ___”) while avoiding dismissive tone or medical jargon.
Escalation Flows: For high-risk or unclear situations, the agent clearly states it cannot safely proceed, gives next steps, includes a disclaimer, and sets a follow-up time window.
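To make the intake-merging flow concrete, here is a minimal sketch of a Next.js route handler wired up with the Vercel AI SDK. The Supabase helpers, model choice, and prompt constant are illustrative assumptions, not the project's actual code:

```ts
// app/api/consult/route.ts: illustrative sketch, not the shipped implementation.
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";

// Hypothetical stand-ins for the real Supabase queries and the <500-word
// system prompt shown in the next section.
async function getIntake(sessionId: string): Promise<Record<string, string>> {
  return {}; // would fetch the Typeform submission for this session
}
async function getPriorSessions(sessionId: string): Promise<unknown[]> {
  return []; // would fetch earlier consult summaries
}
const TRIAGE_SYSTEM_PROMPT = "..."; // placeholder

export async function POST(req: Request) {
  const { messages, sessionId } = await req.json();

  // Merge structured intake and prior-session data into the system context
  // so triage is context-aware from the first turn.
  const intake = await getIntake(sessionId);
  const history = await getPriorSessions(sessionId);

  const result = streamText({
    model: openai("gpt-4o"), // assumed model
    system: [
      TRIAGE_SYSTEM_PROMPT,
      `<intake>${JSON.stringify(intake)}</intake>`,
      `<history>${JSON.stringify(history)}</history>`,
    ].join("\n"),
    messages,
  });

  return result.toDataStreamResponse();
}
```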
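The mixed-symptom escalation rule can also live in code rather than only in the prompt. The sketch below is one possible approach using the AI SDK's structured-output helper, not what shipped here: the model labels each symptom, and plain code applies the "escalate if any single symptom is an emergency" rule.

```ts
// Illustrative sketch: structured per-symptom verdicts, escalation rule in code.
import { openai } from "@ai-sdk/openai";
import { generateObject } from "ai";
import { z } from "zod";

const verdictSchema = z.object({
  symptoms: z.array(
    z.object({
      name: z.string(),
      emergency: z.boolean(), // does this symptom alone meet emergency criteria?
      rationale: z.string(), // surfaced to address reasoning opacity
    })
  ),
});

export async function classifyCase(intakeSummary: string) {
  const { object } = await generateObject({
    model: openai("gpt-4o"), // assumed model
    schema: verdictSchema,
    prompt: `Label each symptom in this intake as emergency or non-emergency:\n${intakeSummary}`,
  });
  // Edge-case rule: one emergency flag escalates the whole case.
  return object.symptoms.some((s) => s.emergency) ? "emergency" : "mild";
}
```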
Here is the system prompt I crafted for this project. I deliberately kept it short (under 500 words), since my next step would be implementing agent-to-agent tooling rather than relying on the system prompt to handle additional edge cases.
System Diagram

Link to Figma for Full Resolution Version
LLM Evaluation Plan
The AI agent was tested with over 100 synthetic patient inputs to verify prompt compliance, safety behavior, and structured empathy. Key checks included the following (a rule-based sketch of these checks appears after the list):
Prompt Adherence: Required language like “I understand” and the exact timeline prompt (“When did this first start, and has it been getting better, worse, or staying the same?”) were verified across all runs.
Required Context Logic: The model was blocked from making recommendations until all five required inputs (location, severity, onset, modifiers, timeline) had been gathered.
Emergency Escalation: For inputs with critical symptoms (e.g., chest pain), responses had to include all escalation language and disclaimers word-for-word.
Empathy Protocols: The model was evaluated on whether it responded empathetically to pain or fear using phrases like “That sounds really uncomfortable.”
Output Structure: Mild cases required exactly 3 self-care tips, plain language only, and had to end with “How does this sound to you?”
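As a rough illustration of how these checks can be automated, here is a minimal rule-based sketch; the `Transcript` shape and phrase constants are assumptions for illustration, not the actual harness:

```ts
// Illustrative per-run checks over a synthetic consult transcript.
type Transcript = {
  classification: "mild" | "emergency";
  fullText: string; // concatenated agent turns for the run
};

const TIMELINE_PROMPT =
  "When did this first start, and has it been getting better, worse, or staying the same?";
const CLOSER = "How does this sound to you?";

function checkRun(t: Transcript): string[] {
  const failures: string[] = [];

  // Prompt adherence: required empathy language and the exact timeline question.
  if (!t.fullText.includes("I understand")) failures.push("missing required empathy phrase");
  if (!t.fullText.includes(TIMELINE_PROMPT)) failures.push("missing exact timeline prompt");

  if (t.classification === "mild") {
    // Output structure: exactly three numbered self-care tips, ending with the closer.
    const numberedTips = t.fullText.match(/^\d\./gm) ?? [];
    if (numberedTips.length !== 3) failures.push(`expected 3 tips, found ${numberedTips.length}`);
    if (!t.fullText.trimEnd().endsWith(CLOSER)) failures.push("missing closing question");
  }
  return failures;
}
```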
Outcome
Mild Symptom Transcript



Emergency Symptom Transcript


My Learnings
Prompt design > product surface: Most of my time was spent refining reasoning logic, not UI. Effective prompt engineering (e.g., JSON/XML formatting) drove performance.
LLMs lack determinism: Inconsistent responses led me to implement Typeform pre-intake to ensure complete patient context and reduce hallucinations.
Reasoning opacity: GPT offered limited transparency into its decision paths. To mitigate this, I proposed chain-of-thought prompting and rationale layers in the output for user trust and clarity.
Adaptability gap: Static intake missed key context. I designed a smart prompt flow that dynamically requests missing info mid-chat to build richer patient profiles over time (a minimal sketch of this gating logic follows the list).
High-stakes use case sensitivity: Highlighted how clinical triage requires rigorous validation and empathetic tone design due to safety risks of incorrect advice.
Time-constrained execution: Building this in 48 hours forced me to prioritize core logic, focus on simplicity, and apply best practices in prompting and flow design.
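To make the adaptability fix concrete, here is a minimal sketch of the gating logic, reusing the five required fields from the evaluation plan; the field names and question phrasing are illustrative:

```ts
// Illustrative sketch: block recommendations until the profile is complete,
// and dynamically ask only for what is missing.
const REQUIRED_FIELDS = ["location", "severity", "onset", "modifiers", "timeline"] as const;

type Profile = Partial<Record<(typeof REQUIRED_FIELDS)[number], string>>;

function nextQuestion(profile: Profile): string | null {
  const missing = REQUIRED_FIELDS.filter((field) => !profile[field]);
  if (missing.length === 0) return null; // context complete; triage may proceed
  // In the real flow the agent phrases this empathetically; this is the gating logic only.
  return `Before I suggest anything, could you tell me more about: ${missing.join(", ")}?`;
}

console.log(nextQuestion({ location: "lower back", severity: "mild" }));
// -> "Before I suggest anything, could you tell me more about: onset, modifiers, timeline?"
```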
