Small Animal Internal Medicine

In Person Only

OT09 - Using Artificial Intelligence for Scalable Data Extraction from Medical Records in the Dog Aging Project

Thursday, June 11, 2026

5:00 PM - 5:15 PM PT

Location: Summit SCC, Room 444

CE: 0.25 Medical Hours

Research Abstract - Oral Presenter(s)

Yumi Chang, DVM

PhD student
Texas A&M University
College Station, Texas, United States

Background:
The Dog Aging Project (DAP) collects longitudinal veterinary records from client-owned dogs. Dates with matched body weight (BW) are essential information for clinical studies, but manual extraction from heterogeneous PDF formats is labor-intensive.
Hypothesis/Objective:
To compare the weighted accuracy (WA) of rule-based versus large language model (LLM)-based automated extraction of dates with matched BW from veterinary records.
Animals:
59 PDF veterinary records containing 559 paired dates and BW measurements from 49 dogs enrolled in DAP.

Methods:
A pilot set of 9 medical records in 9 formats was used to develop the extraction rules for a rule-based algorithm and the prompts for VetRec (a commercial veterinary LLM), targeting ≥80% accuracy (p = 0.17). Next, 50 additional records were analyzed using both methods, including 14 formats not seen in the pilot set. Automated outputs were compared against ground truth verified by a veterinarian and the DAP project manager. WA was calculated as 3*(exact match) / [3*(exact match) + 3*(partial match) + 3*(hallucination) + 2*(omission) + 1*(non-compliance)].
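
The WA formula above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name, the dictionary keys, and the example counts are hypothetical, and only the weights (3, 3, 3, 2, 1) come from the formula as stated.

```python
# Weights per outcome category, taken from the WA formula in Methods.
# The key names are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "exact": 3,
    "partial": 3,
    "hallucination": 3,
    "omission": 2,
    "non_compliance": 1,
}


def weighted_accuracy(counts):
    """Return 3*exact divided by the weighted sum of all outcome counts.

    `counts` maps each outcome category to how many data points fell
    into it; categories that are absent are treated as zero.
    """
    denominator = sum(weight * counts.get(name, 0) for name, weight in WEIGHTS.items())
    if denominator == 0:
        raise ValueError("no outcomes were counted")
    return WEIGHTS["exact"] * counts.get("exact", 0) / denominator


# Illustrative counts only (not data from the study):
# 8 exact matches, 1 partial match, 1 hallucination.
example = {"exact": 8, "partial": 1, "hallucination": 1}
print(weighted_accuracy(example))  # 24 / (24 + 3 + 3) = 0.8
```

Because omissions and non-compliant outputs carry smaller weights than partial matches and hallucinations, they lower WA less per occurrence than an incorrect extracted value does.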

Results:
Across all 559 data points, VetRec achieved an 83% WA, compared with 51% for the rule-based method (p < .001). The rule-based model frequently misassigned non-measurement dates (e.g., scan dates, dates of birth) to nearby BW values based on spatial proximity. VetRec largely avoided this contextual error but generated 46 non-compliant outputs, an error inherently avoided by deterministic rules.
Conclusion and Clinical Importance:
LLM-based extraction significantly outperformed rule-based methods for automated veterinary data extraction, offering a scalable solution for heterogeneous medical record formatting across institutions.