Sarah Martinez
MRN: 2847-9361
Demographics
Age / DOB 52 yrs (06/14/1973)
Gender Female
Blood Type A+
Clinical Information
BMI 28.4 (Overweight)
Primary Physician Dr. J. Chen
Last Visit Sep 28, 2025
Allergies: Penicillin, Sulfa drugs
Current Medications
Metformin 500mg - 2x daily
Lisinopril 10mg - 1x daily
Current Visit
Visit Type Follow-up
Admission Oct 30, 2025 9:15 AM
Department Endocrinology
Family history of Type 2 Diabetes
Sedentary lifestyle
Former smoker (quit 2019)
AI-Assisted Diagnosis
AI CAPABILITIES
Can Evaluate: Type 1 Diabetes, Type 2 Diabetes, Prediabetes
Cannot Evaluate: Other endocrine disorders, General symptoms, Cardiovascular conditions
Diagnosis
Impaired Glucose Tolerance vs Type 2 Diabetes
Principle 3: Contextual confidence scores. Show certainty for each claim. The visual meter and percentage clearly communicate the AI's confidence level.
Confidence Level: 65% (Moderate)
Why is confidence moderate?
HbA1c is borderline (6.7% vs 6.5% threshold) and fasting glucose is in prediabetic range. Clinical guidelines recommend repeat testing or OGTT before confirming diagnosis.
Evidence
HbA1c: 6.7% (Oct 15, 2025)
Fasting Glucose: 118 mg/dL (Oct 20, 2025) - Borderline
Family history of Type 2 Diabetes
No repeat HbA1c or OGTT to confirm diagnosis
Step-by-Step Analysis
1. HbA1c (6.7%) → Just above threshold of 6.5%
   ✓ Primary criterion marginally met
   ⚠ Close to threshold - repeat test recommended
2. Fasting Glucose (118 mg/dL) → Below threshold of 126 mg/dL
   ⚠ Prediabetic range - creates diagnostic uncertainty
3. Family History → Increases prior probability by 2.4x
   ✓ Genetic risk factor present
4. Missing Confirmatory Testing
   ⚠ No repeat HbA1c or OGTT available
Conclusion: Mixed evidence with borderline results. Confirmatory testing needed → Provisional Type 2 Diabetes (65% confidence)
⚠ CRITICAL: Physician review required - Additional testing recommended
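The threshold checks in the analysis above can be sketched in a few lines. This is a minimal illustration, not the prototype's actual logic: the function names and summary labels are hypothetical, and the lower bound of the fasting-glucose prediabetic range (100 mg/dL) is an assumption taken from the commonly cited ADA range, since the panel only states the 126 mg/dL cutoff.

```python
# Thresholds quoted in the panels: HbA1c 6.5% (prediabetic range 5.7-6.4%)
# and fasting glucose 126 mg/dL. The 100 mg/dL lower bound is an assumption.
HBA1C_PREDIABETES = 5.7   # percent
HBA1C_DIABETES = 6.5      # percent
FPG_PREDIABETES = 100     # mg/dL (assumed, not stated in the panel)
FPG_DIABETES = 126        # mg/dL


def classify(value: float, prediabetes_min: float, diabetes_min: float) -> str:
    """Place a single lab value into a range (hypothetical helper)."""
    if value >= diabetes_min:
        return "diabetic"
    if value >= prediabetes_min:
        return "prediabetic"
    return "normal"


def summarize(hba1c: float, fpg: float, confirmed_by_repeat: bool) -> str:
    """Combine both lab ranges into a coarse, non-clinical summary label."""
    a = classify(hba1c, HBA1C_PREDIABETES, HBA1C_DIABETES)
    g = classify(fpg, FPG_PREDIABETES, FPG_DIABETES)
    if a == "diabetic" and g == "diabetic" and confirmed_by_repeat:
        return "criteria met, confirmed"
    if "diabetic" in (a, g):
        return "mixed evidence, confirm with repeat test or OGTT"
    if "prediabetic" in (a, g):
        return "below diagnostic thresholds, prediabetic range"
    return "within normal range"
```

With the values from this panel, `summarize(6.7, 118, False)` lands in the mixed-evidence branch, which matches the provisional diagnosis and the request for confirmatory testing.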
Physician Actions
Principle 5: Human override options. The expert can always intervene. These action buttons ensure physicians maintain full control over final decisions.
Undo
Review & Modify Assessment
Confidence (Low to High): 65%
Data Lineage & Traceability
MedicalAI-DiabetesSpecialist v2.4
FDA 510(k) Cleared | Trained on 847K anonymized records
Input Sources
Lab Results via Epic EHR (HL7 FHIR API)
Patient questionnaire validated Oct 10, 2025
Compliance & Audit
HIPAA & GDPR Compliant | Analysis ID: DX-20251030-7A8B9C
Michael Johnson
MRN: 5621-4738
Demographics
Age / DOB 58 yrs (03/22/1967)
Gender Male
Blood Type O+
Clinical Information
BMI 32.1 (Obese)
Primary Physician Dr. A. Patel
Last Visit Oct 10, 2025
Allergies: None known
Current Medications
Atorvastatin 20mg - 1x daily
Aspirin 81mg - 1x daily
Current Visit
Visit Type New Patient
Admission Oct 30, 2025 10:30 AM
Department Endocrinology
Strong family history of Type 2 Diabetes
Sedentary lifestyle
Poor dietary habits
Hypertension
AI-Assisted Diagnosis
AI CAPABILITIES
Can Evaluate: Type 1 Diabetes, Type 2 Diabetes, Prediabetes
Cannot Evaluate: Other endocrine disorders, General symptoms, Cardiovascular conditions
Diagnosis
Type 2 Diabetes Mellitus, Uncontrolled
Confidence Level: 95% (High)
Why is confidence high?
Multiple lab values significantly exceed diagnostic thresholds, confirmed by repeat testing. All clinical criteria are clearly met with no conflicting evidence.
Evidence
HbA1c: 8.2% (Oct 15, 2025) - Well above threshold
Fasting Glucose: 156 mg/dL (Oct 20, 2025) - Elevated
Repeat HbA1c: 8.1% (Oct 28, 2025) - Confirms diagnosis
Family history of Type 2 Diabetes + BMI 32
Step-by-Step Analysis
1. HbA1c (8.2%) → Significantly above threshold of 6.5%
   ✓ Primary criterion clearly met
2. Fasting Glucose (156 mg/dL) → Well above threshold of 126 mg/dL
   ✓ Secondary criterion confirms diagnosis
3. Repeat HbA1c (8.1%) → Consistent results
   ✓ Confirmatory testing validates initial diagnosis
4. Risk Factors Present
   ✓ Family history + obesity strongly support diagnosis
Conclusion: All diagnostic criteria met with confirmatory testing. Clear Type 2 Diabetes diagnosis (95% confidence)
✓ Diagnosis meets all criteria - Ready for treatment planning
Physician Actions
Undo
Review & Modify Assessment
Confidence (Low to High): 95%
Data Lineage & Traceability
MedicalAI-DiabetesSpecialist v2.4
FDA 510(k) Cleared | Trained on 847K anonymized records
Input Sources
Lab Results via Epic EHR (HL7 FHIR API)
Patient questionnaire validated Oct 10, 2025
Compliance & Audit
HIPAA & GDPR Compliant | Analysis ID: DX-20251030-5F2A1D
Lisa Chen
MRN: 8942-1563
Demographics
Age / DOB 43 yrs (11/08/1981)
Gender Female
Blood Type B+
Clinical Information
BMI 24.8 (Normal)
Primary Physician Dr. R. Kim
Last Visit Aug 15, 2025
Allergies: Iodine contrast
Current Medications
Levothyroxine 50mcg - 1x daily
Vitamin D 2000 IU - 1x daily
Current Visit
Visit Type Screening
Admission Oct 30, 2025 2:00 PM
Department Endocrinology
Active lifestyle - exercises regularly
Healthy diet
No family history of diabetes
AI-Assisted Diagnosis
AI CAPABILITIES
Can Evaluate: Type 1 Diabetes, Type 2 Diabetes, Prediabetes
Cannot Evaluate: Other endocrine disorders, General symptoms, Cardiovascular conditions
Diagnosis
Prediabetes - Further Testing Required
Confidence Level: 35% (Low)
Why is confidence low?
All test results are below diagnostic thresholds for Type 2 Diabetes. Data suggests prediabetes, but OGTT and repeat testing are essential before making any diagnosis.
Evidence
HbA1c: 6.2% (Oct 15, 2025) - Below diagnostic threshold
Fasting Glucose: 108 mg/dL (Oct 20, 2025) - Prediabetic range
No OGTT or additional testing available
Incomplete symptom history and risk factor assessment
Step-by-Step Analysis
1. HbA1c (6.2%) → Below diagnostic threshold of 6.5%
   ⚠ In prediabetic range (5.7-6.4%) but not diagnostic
2. Fasting Glucose (108 mg/dL) → Below threshold of 126 mg/dL
   ⚠ Impaired fasting glucose but not diabetes
3. Missing Critical Data
   ⚠ No OGTT to clarify borderline results
   ⚠ Incomplete clinical context
Conclusion: Insufficient evidence for Type 2 Diabetes. Results suggest Prediabetes, but comprehensive testing required for definitive diagnosis (35% confidence)
⚠ URGENT: Insufficient data for diagnosis - Comprehensive testing required before treatment
Physician Actions
Undo
Review & Modify Assessment
Confidence (Low to High): 35%
Data Lineage & Traceability
MedicalAI-DiabetesSpecialist v2.4
FDA 510(k) Cleared | Trained on 847K anonymized records
Input Sources
Lab Results via Epic EHR (HL7 FHIR API)
Limited patient history data available
Compliance & Audit
HIPAA & GDPR Compliant | Analysis ID: DX-20251030-9D4E2F
Domain Expert AI — TL;DR — Ken Hung
Prototype 02 · Medical AI · High-Stakes Domain · Designing for Different AI Archetypes
Domain Expert AI
High-Stakes Decision Support

Showing confidence badly makes outcomes worse, not better.

1

Experts are being misled by the interfaces meant to help them.

Domain experts — clinicians, lawyers, financial analysts — use AI to augment decisions that carry real consequences. The instinct is to show a confidence score and call it transparency. Research says otherwise: displaying confidence badly causes expert accuracy to drop, not rise. A radiology study found accuracy fell from 82% to 46% when AI confidence was shown incorrectly.

The problem isn't the AI model — it's the interface. Experts who trust a high-confidence wrong answer more than a low-confidence right one aren't making bad decisions. They're responding rationally to a badly designed signal. In specialized AI, UX is not decoration — it is a safety layer.


2

Five principles of Domain Expert AI UX

01
Domain Guardrails
Specialized models have boundaries, and those boundaries must be visible. When the AI declines to evaluate something outside its training, users should understand why — not receive a hallucinated answer that exceeds its scope. An explicit AI Scope indicator prevents both over-reliance and false confidence in the model's range.
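A scope indicator of this kind can be backed by a simple allow-list check. The sketch below is illustrative only: `check_scope` and `ScopeDecision` are hypothetical names, and the allow-list is copied from the prototype's "Can Evaluate" panel.

```python
from dataclasses import dataclass

# Conditions the prototype's AI Capabilities panel lists under "Can Evaluate".
CAN_EVALUATE = {"type 1 diabetes", "type 2 diabetes", "prediabetes"}


@dataclass(frozen=True)
class ScopeDecision:
    in_scope: bool
    message: str  # text the interface would show next to the scope indicator


def check_scope(condition: str) -> ScopeDecision:
    """Decline anything outside the declared scope instead of answering it."""
    key = condition.strip().lower()
    if key in CAN_EVALUATE:
        return ScopeDecision(True, f"{condition} is within the model's scope.")
    return ScopeDecision(
        False,
        f"{condition} is outside the model's scope and was not evaluated.",
    )
```

The point of returning a message alongside the boolean is that the refusal explains itself: the user sees why no answer was given rather than an empty result.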
02
Transparent Reasoning Chains
Experts don't just need the conclusion — they need the logic. Expandable reasoning panels reveal which data points were evaluated, what thresholds were applied, and where uncertainty entered the chain. Experts can agree with the conclusion while disagreeing with a step — and act accordingly.
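One way to make a reasoning chain inspectable is to store each step with its own verdict and caveats, so the interface can point to where uncertainty entered. The structure below is a sketch under that assumption; `ReasoningStep` and `uncertain_steps` are hypothetical names, and the sample data is taken from the 65% scenario in the prototype.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReasoningStep:
    label: str
    finding: str
    supports: bool  # True if this step supports the conclusion
    caveats: List[str] = field(default_factory=list)


def uncertain_steps(chain: List[ReasoningStep]) -> List[str]:
    """Return labels of steps that carry a caveat or do not support the conclusion."""
    return [s.label for s in chain if s.caveats or not s.supports]


chain = [
    ReasoningStep("HbA1c", "6.7%, just above the 6.5% threshold", True,
                  ["Close to threshold, repeat test recommended"]),
    ReasoningStep("Fasting Glucose", "118 mg/dL, below the 126 mg/dL threshold", False,
                  ["Prediabetic range creates diagnostic uncertainty"]),
    ReasoningStep("Family History", "Genetic risk factor present", True),
    ReasoningStep("Confirmatory Testing", "No repeat HbA1c or OGTT available", False,
                  ["Missing data"]),
]
```

Calling `uncertain_steps(chain)` on this sample flags every step except the family history, which is the set of steps a physician would want expanded by default.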
03
Contextual Confidence Scores
A single percentage means very little without context. Confidence display must answer three questions: how certain is the model, what is that certainty based on, and what should the user do differently at this confidence level? The score should change the interface state — not just decorate it.
04
Verifiable Citations
Every output should be traceable to its source. Model version, training data, data lineage, and audit ID aren't compliance overhead — they're the foundation of institutional trust. In legal, medical, and financial AI, an unverifiable output is an unusable one.
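A traceability record can be modeled as a small immutable structure that travels with every output. The sketch below is an assumption-laden illustration: the `Provenance` type is hypothetical, and the ID pattern is inferred from the prototype's sample IDs (such as DX-20251030-7A8B9C), not from any documented format.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Pattern inferred from the prototype's sample audit IDs; an assumption.
ANALYSIS_ID = re.compile(r"^DX-\d{8}-[0-9A-F]{6}$")


@dataclass(frozen=True)
class Provenance:
    model_version: str
    training_data: str
    input_sources: Tuple[str, ...]
    analysis_id: str

    def is_traceable(self) -> bool:
        """True only if the model, the inputs, and a well-formed audit ID are all present."""
        return bool(
            self.model_version
            and self.input_sources
            and ANALYSIS_ID.match(self.analysis_id)
        )


record = Provenance(
    model_version="MedicalAI-DiabetesSpecialist v2.4",
    training_data="847K anonymized records",
    input_sources=("Lab Results via Epic EHR (HL7 FHIR API)",),
    analysis_id="DX-20251030-7A8B9C",
)
```

An interface could refuse to render a diagnosis card when `is_traceable()` is false, which turns "an unverifiable output is an unusable one" into an enforced rule rather than a guideline.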
05
Human Override Options
The expert must always have full authority over the AI's recommendation. Approve, modify, and reject actions should be equally prominent — not visually weighted toward acceptance. Every override should be documented with reasoning, creating an audit trail that improves the system and protects the human who made the call.
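The documented-override rule can be enforced at the point where a decision is recorded. This is a minimal sketch: `record_decision` and `OverrideRecord` are hypothetical names, and letting an approval through without written reasoning is an assumption, since the text only requires reasoning for overrides.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

ACTIONS = ("approve", "modify", "reject")


@dataclass(frozen=True)
class OverrideRecord:
    analysis_id: str
    physician: str
    action: str
    reasoning: str
    recorded_at: datetime


def record_decision(analysis_id: str, physician: str, action: str,
                    reasoning: str, audit_log: List[OverrideRecord]) -> OverrideRecord:
    """Append a decision to the audit log; overrides must carry reasoning."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Assumption: only modify and reject count as overrides needing reasoning.
    if action != "approve" and not reasoning.strip():
        raise ValueError("modify and reject require documented reasoning")
    entry = OverrideRecord(analysis_id, physician, action, reasoning.strip(),
                           datetime.now(timezone.utc))
    audit_log.append(entry)
    return entry
```

Because a rejected call raises before anything is appended, the log never contains an undocumented override.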

3

Confidence shapes the entire interface — not just one element

The prototype uses a medical diagnosis context to show how the same interface must respond differently at three confidence levels. The UX adapts: evidence framing, alert severity, action emphasis, and the reasoning chain all shift with certainty.

65%
Moderate Confidence
Impaired Glucose Tolerance vs Type 2 Diabetes
UX response: Borderline evidence with caution flags. Physician required to review before proceeding.
95%
High Confidence
Type 2 Diabetes Mellitus, Uncontrolled
UX response: All evidence confirmed. Clear path to treatment with streamlined approval flow.
35%
Low Confidence
Prediabetes — Further Testing Required
UX response: Urgent alert surfaced. Missing data made explicit. The interface steers strongly toward review before any other action.
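The three scenarios above amount to a mapping from confidence to interface state. The sketch below shows one way to express that mapping. The band cutoffs (50 and 80) are assumptions chosen so that 35%, 65%, and 95% fall into the Low, Moderate, and High bands; the prototype does not state where its boundaries lie. `interface_state` and `InterfaceState` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceState:
    band: str            # label shown on the confidence meter
    alert: str           # severity of the banner, if any
    primary_action: str  # what the interface emphasizes


def interface_state(confidence: float) -> InterfaceState:
    """Map a confidence percentage to the state the whole interface adopts."""
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if confidence >= 80:  # assumed cutoff
        return InterfaceState("High", "none", "Proceed to treatment planning")
    if confidence >= 50:  # assumed cutoff
        return InterfaceState("Moderate", "critical", "Physician review required")
    return InterfaceState("Low", "urgent", "Comprehensive testing required")
```

Keeping the mapping in one function means the evidence framing, alert banner, and action emphasis all change together when confidence crosses a band, instead of each component interpreting the percentage on its own.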

Beyond the three scenarios, the prototype includes four supporting elements: a sticky patient panel that keeps context visible at all times, an expandable reasoning chain that shows each diagnostic step with its evidence weight, a data lineage section with model version and audit ID, and a Review & Modify panel that requires documented reasoning before an override is accepted. That last requirement creates an audit trail without slowing the expert down.


4

"In specialized AI, UX is not decoration — it is a safety layer."