A pioneering collaboration between UNLV's Department of Computer Science and School of Nursing that applies natural language processing, transformer models, and machine learning to nursing administrative data to improve care quality, equity, and efficiency at scale.
Nursing generates enormous volumes of structured and unstructured data — clinical notes, ICD-coded diagnoses, care plans, NCLEX performance records, and patient outcomes data. Yet most of it remains underutilized because extracting meaningful patterns requires specialized computational tools that most nursing researchers don't have access to.
This project directly addresses that gap. Dr. Fonseca brings deep expertise in natural language processing, health data pipelines, and transformer-based AI. Dr. Vanderlaan brings clinical nursing domain expertise, validated measurement tools, and population-level health data experience. Together, they are building the computational infrastructure and analytical frameworks nursing science has been waiting for.
"The most valuable insights in healthcare data are buried in complexity that only AI can unlock — if guided by clinical expertise."
In partnership with the National Council of State Boards of Nursing (NCSBN) — the organization that oversees nursing licensure and practice standards across the United States — this research has the potential to reshape how nursing quality is measured, monitored, and improved nationwide.
The National Council of State Boards of Nursing brings together the nursing regulatory bodies of all 50 states, which positions it to translate research into nationwide impact.
NCSBN administers the NCLEX examination — the gateway to nursing practice — and sets competency standards across all U.S. jurisdictions. AI tools developed through this partnership have the potential for national-scale adoption.
NCSBN maintains comprehensive longitudinal datasets on nursing education, licensure, and practice patterns across the U.S. — an unprecedented data asset for population-level AI analysis.
NCSBN findings directly inform state nursing boards, legislation, and workforce policy. Research conducted through this initiative can shape how nursing quality is regulated and improved nationwide.
Nursing documentation — care plans, shift notes, discharge summaries — contains rich clinical insights that structured codes cannot capture. We apply BERT-based NLP models trained on healthcare text to automatically extract, classify, and analyze these narratives at population scale, identifying quality signals invisible to manual review.
Nursing diagnoses represent a standardized taxonomy of patient conditions that guide care planning. We use transformer models — including BEHRT (BERT adapted for EHR data) — to automatically and accurately classify nursing diagnoses from ICD-coded administrative claims, enabling scalable quality measurement without manual coding.
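To make the idea of feeding coded claims to a BEHRT-style model more concrete, here is a minimal sketch of how a patient's visits might be flattened into model inputs. It loosely follows the input layout described for BEHRT (code tokens plus age, segment, and visit-position channels). The `Visit` class, the `encode_patient` function, and the example codes are hypothetical illustrations, not the project's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List

CLS, SEP = "[CLS]", "[SEP]"


@dataclass
class Visit:
    age_years: int        # patient age at the visit
    icd_codes: List[str]  # normalized diagnosis codes recorded at the visit


def encode_patient(visits: List[Visit]) -> Dict[str, list]:
    """Flatten visits into parallel token, age, segment, and position lists."""
    if not visits:
        raise ValueError("at least one visit is required")
    tokens = [CLS]
    ages = [visits[0].age_years]
    segments = [0]
    positions = [0]
    for index, visit in enumerate(visits):
        # Each visit contributes its codes followed by a separator token.
        for token in visit.icd_codes + [SEP]:
            tokens.append(token)
            ages.append(visit.age_years)
            segments.append(index % 2)  # alternate 0/1 between consecutive visits
            positions.append(index)     # all tokens in a visit share its index
    return {"tokens": tokens, "ages": ages, "segments": segments, "positions": positions}
```

A real model would map each token to an integer id through a vocabulary and sum the embeddings of the four channels; that step is omitted here.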
Using longitudinal patient and nurse data, we build predictive models that surface early warning signals for adverse outcomes, care gaps, and quality deficiencies — enabling proactive intervention before problems escalate. Models are designed to be clinically interpretable, not just accurate.
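One simple way to keep a risk score interpretable is a linear (logistic) model, where each feature's contribution to the log-odds can be read off directly. The sketch below assumes hypothetical feature names and made-up weights. It is not the project's model, and the numbers carry no clinical meaning.

```python
import math
from typing import Dict, Tuple

# Made-up coefficients for illustration only; not fitted to any data.
WEIGHTS = {"prior_admissions": 0.4, "chronic_conditions": 0.3, "missed_followups": 0.6}
INTERCEPT = -2.0


def risk_with_contributions(features: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return a probability plus each feature's additive share of the log-odds."""
    contributions = {
        name: weight * features.get(name, 0.0) for name, weight in WEIGHTS.items()
    }
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return probability, contributions
```

Because the contributions add up to the log-odds (minus the intercept), a reviewer can see which inputs pushed a given score up or down.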
Large nursing datasets — claims files, NCLEX records, state administrative data — require industrial-grade data engineering before any AI model can run. We design and build the ETL pipelines, data cleaning protocols, and storage architectures that make research possible at population scale.
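Two of the cleaning steps this kind of pipeline typically needs, code normalization and identifier masking, can be sketched in a few lines. The example assumes ICD-10-CM codes arrive with inconsistent casing and punctuation, and that patient identifiers are replaced with keyed hashes. The function names, the key, and the record layout are hypothetical. Keyed hashing is pseudonymization, which is only one part of de-identification.

```python
import hashlib
import hmac
import re
from typing import Optional

# Hypothetical key; a real pipeline would load this from a secure store.
SECRET_KEY = b"replace-with-a-managed-secret"

# Rough ICD-10-CM shape check: letter, digit, then 1 to 5 alphanumerics.
ICD10_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z]{1,5}$")


def normalize_icd10(raw: str) -> Optional[str]:
    """Strip whitespace and dots, uppercase; return None if the shape is implausible."""
    code = re.sub(r"[\s.]", "", raw).upper()
    return code if ICD10_SHAPE.match(code) else None


def pseudonymize(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed SHA-256 digest (stable for a given key)."""
    return hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def clean_claim(claim: dict) -> dict:
    """Keep only a pseudonymous patient key and the diagnosis codes that pass the check."""
    codes = [normalize_icd10(c) for c in claim["diagnoses"]]
    return {
        "patient": pseudonymize(claim["patient_id"]),
        "diagnoses": [c for c in codes if c is not None],
    }
```

The shape check only catches malformed strings; validating against an official code list would be a separate step.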
From raw administrative data to actionable nursing quality insights — a reproducible, scalable workflow.
Claims, NCLEX & state nursing records
ETL, ICD normalization, de-ID
Transformer model encoding on clinical sequences
Nursing diagnoses, quality flags, risk scores
Policy reports, quality dashboards, publications
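The five stages above can be sketched as a chain of small functions. Everything here is a hypothetical stand-in: the toy records, the field names, and the watch list are invented, de-identification is omitted, and the model stage is replaced by a simple rule rather than a transformer.

```python
from collections import defaultdict

# Toy stand-in records; real inputs would be claims, NCLEX, and state files.
RAW = [
    {"patient_id": "a", "date": "2023-01-05", "icd": "o80"},
    {"patient_id": "a", "date": "2023-03-01", "icd": "E11.9"},
    {"patient_id": "b", "date": "2023-02-10", "icd": "z37.0"},
]

WATCH_LIST = {"E119"}  # hypothetical codes of interest


def preprocess(records):
    """Normalize code formatting and keep only the fields later stages need."""
    return [
        {"pid": r["patient_id"], "date": r["date"], "icd": r["icd"].replace(".", "").upper()}
        for r in records
    ]


def to_sequences(records):
    """Group codes per patient in date order, the shape a sequence model consumes."""
    sequences = defaultdict(list)
    for r in sorted(records, key=lambda r: (r["pid"], r["date"])):
        sequences[r["pid"]].append(r["icd"])
    return dict(sequences)


def flag(sequences):
    """Placeholder for the model stage: a rule, not a transformer."""
    return {pid: any(code in WATCH_LIST for code in codes) for pid, codes in sequences.items()}


def summarize(flags):
    """Aggregate per-patient flags into counts suitable for a report."""
    return {"patients": len(flags), "flagged": sum(flags.values())}


def run_pipeline(raw):
    return summarize(flag(to_sequences(preprocess(raw))))
```

Keeping each stage a plain function with explicit inputs and outputs is one way to make a workflow like this reproducible and easy to test in isolation.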
All transformer model training runs on UNLV's 18-node NVIDIA RTX™ A4000 Data Analytics Lab and the $500K NSF-funded GPU cluster housed at the Switch Cloud Center in Las Vegas — providing the raw compute power required for large-scale BEHRT pre-training and fine-tuning on nursing data.
This project works because it combines exactly the right expertise — neither researcher could do this alone.
Dr. Fonseca specializes in NLP, health informatics, and AI systems design. He has developed AR indoor navigation systems (AIVR 2022), health applications with NSF and HEERF II funding, and data pipelines for the Walk2School2Day mobile app. His expertise in transformer models and scalable data engineering provides the computational backbone for this nursing AI initiative.
WHAT FONSECA BRINGS
NLP model development · Data pipeline architecture · GPU cluster management · AI engineering · Model interpretability
A certified nurse-midwife with a PhD in Nursing and a Master's in Public Health (Emory University), Dr. Vanderlaan is a national authority on maternal health data, nursing workforce policy, and clinical quality measurement. Her validated methods for identifying high-risk patients in administrative data are foundational to the AI models this project develops.
WHAT VANDERLAAN BRINGS
Clinical domain expertise · Validated nursing measures · NCSBN relationships · Research design · Health policy translation
The fundamental challenge in applying AI to nursing is the gap between computational capability and clinical validity. An AI model that classifies nursing diagnoses inaccurately — even subtly — can lead to wrong policy conclusions. Dr. Vanderlaan's clinical expertise serves as the essential validation layer, ensuring that every model we build is grounded in nursing science, not just statistical patterns. Dr. Fonseca ensures that the computational tools are production-grade, reproducible, and scalable. Neither could do this work effectively without the other.
The UNLV Department of Computer Science has a long history of funded research, product development, and interdisciplinary collaboration that makes this nursing AI initiative possible. From the Department of Energy's Licensing Support Network to NSF-funded educational technology, UNLV CS has been solving complex data challenges for decades.
18 workstations, each equipped with NVIDIA RTX™ A4000 GPUs (marketed by NVIDIA as the most powerful single-slot professional GPU), for concurrent model development and training.
A $500K GPU cluster funded through NSF, housed at the Switch Cloud Center in Las Vegas, providing on-demand high-performance computing for large-scale model training.
Graduate and undergraduate researchers from UNLV CS contribute to data engineering, model development, and testing — making this project a training ground for the next generation of health AI researchers.
Fonseca and Taghva previously received funding from the UNLV School of Public Health (Walk2School2Day app) and the Office of Economic Development — demonstrating the department's capacity for interdisciplinary health technology projects. Dr. Fonseca's AIVR 2022 paper on OCR-enhanced AR navigation further illustrates the team's applied AI capabilities.
Dr. Fonseca is currently Senior Personnel on a $999,815 NSF grant developing AI-powered educational games to improve STEM math skills — providing additional infrastructure and student researcher experience that directly supports nursing AI development.
Dr. Vanderlaan's parallel project with Dr. Taghva applies the same population-level data approach — using Virginia's All-Payer Claims Database — to study severe maternal morbidity and risk-appropriate care.
The published research paper emerging from this collaboration uses BEHRT and other transformer models to automatically classify nursing diagnoses from administrative claims data.
We're interested in collaborating with nursing schools, state boards of nursing, healthcare systems, and clinical researchers who want to harness AI for quality improvement.