Using Natural Language Processing on the Free Text of Clinical Documents to Screen for Evidence of Homelessness Among US Veterans

Information retrieval algorithms based on natural language processing (NLP) of the free text of medical records have been used to find documents of interest in databases. Homelessness is a high-priority non-medical diagnosis that is noted in the electronic medical records of Veterans treated at Veterans Affairs (VA) facilities. Using a human-reviewed reference standard corpus of clinical documents from Veterans with and without evidence of homelessness, an open-source NLP tool (Automated Retrieval Console v2.0, ARC) was trained to classify documents. The best-performing model, based on a document-level workflow, performed well on a test set (Precision 94%, Recall 97%, F-Measure 96%). Processing a naïve set of 10,000 randomly selected VA documents with this model yielded 463 documents flagged as positive, indicating a 4.7% prevalence of homelessness. Human review found a precision of 70% for these flags, resulting in an adjusted prevalence of homelessness of 3.3%, which matches current VA estimates. Further refinements are underway to improve performance. We demonstrate an effective and rapid lifecycle for using an off-the-shelf NLP tool to screen medical records for targets of interest.
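The reported figures can be related to one another with two short formulas. The sketch below is illustrative only and is not taken from the paper or from ARC: the function names `f_measure` and `adjusted_prevalence` are hypothetical, the F-measure is assumed to be the balanced (F1) harmonic mean of precision and recall, and the adjustment is assumed to be a simple scaling of the flagged rate by the reviewed precision.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def adjusted_prevalence(flagged_rate: float, precision: float) -> float:
    """Scale the flagged rate by the share of flags confirmed on human review.

    Assumption: this corrects only for false positives; documents the
    model missed (false negatives) are not accounted for.
    """
    return flagged_rate * precision


# Rounded values as reported in the abstract.
f1 = f_measure(precision=0.94, recall=0.97)
adj = adjusted_prevalence(flagged_rate=0.047, precision=0.70)

print(f"F-measure: {f1:.3f}")
print(f"Adjusted prevalence: {adj:.1%}")
```

With the rounded inputs, the F-measure comes out at about 0.955, close to the reported 96%; the small gap presumably reflects rounding of the published precision and recall. Scaling the reported 4.7% flagged rate by 70% precision gives about 3.3%, in line with the adjusted prevalence stated above.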

Publication Date: 2013

Pages: 537–546

Volume: 2013

Journal Name: AMIA Annual Symposium Proceedings

Location: USA
