AI

OLMo accuracy vs. Dolma estimated co-occurrence frequency on CASI dataset. Each dot shows a jargon-expansion pair.

Diagnosing our datasets: How does my language model learn clinical information?

Large language models (LLMs) have performed well across various clinical natural language processing tasks, despite not being directly trained on electronic health record (EHR) data. In this work, we examine how popular open-source LLMs learn clinical information from large mined corpora through two crucial but understudied lenses: (1) their interpretation of clinical jargon, a foundational…

Workflow diagram of offline modeling and online planning for blood flow.

Real-time virtual intervention for simple and serial coronary artery disease using the HarVI framework

Virtual planning tools that provide intuitive user interaction and immediate hemodynamic feedback are crucial for cardiologists to effectively treat coronary artery disease. Current FDA-approved tools for coronary intervention planning require days of preliminary processing and rely on conventional 2D displays for hemodynamic evaluation. Immersion offered by extended reality (XR) has been found to benefit intervention…

A method for intelligent allocation of diagnostic testing by leveraging data from commercial wearable devices: a case study on COVID-19

Mass surveillance testing can help control outbreaks of infectious diseases such as COVID-19. However, diagnostic test shortages are prevalent globally and continue to occur in the US with the onset of new COVID-19 variants and emerging diseases like monkeypox, demonstrating an unprecedented need for improving our current methods for mass surveillance testing. By targeting surveillance…

Non-invasive wearables for remote monitoring of HbA1c and glucose variability: proof of concept

Diabetes prevalence continues to grow, and a significant diagnostic gap remains among the one-third of the US population that has pre-diabetes. Innovative, practical strategies to improve monitoring of glycemic health are desperately needed. In this proof-of-concept study, we explore the relationship between non-invasive wearables and glycemic metrics and demonstrate the feasibility of using non-invasive wearables…

Engineering digital biomarkers of interstitial glucose from noninvasive smartwatches

Prediabetes affects one in three people and has a 10% annual conversion rate to type 2 diabetes without lifestyle or medical interventions. Management of glycemic health is essential to prevent progression to type 2 diabetes. However, there is currently no commercially available, noninvasive method for monitoring glycemic health to aid in self-management of prediabetes. There…

Primitives of PTM-Mamba

PTM-Mamba: A PTM-Aware Protein Language Model with Bidirectional Gated Mamba Blocks

Proteins serve as the workhorses of living organisms, orchestrating a wide array of vital functions. Post-translational modifications (PTMs) of their amino acids greatly influence the structural and functional diversity of different protein types and uphold proteostasis, allowing cells to swiftly respond to environmental changes and intricately regulate complex biological processes. To this point, efforts to…

Origin of fusion oncoproteins

FusOn-pLM: a fusion oncoprotein-specific language model via adjusted rate masking

Fusion oncoproteins, a class of chimeric proteins arising from chromosomal translocations, are major drivers of various pediatric cancers. These proteins are intrinsically disordered and lack druggable pockets, making them highly challenging therapeutic targets for both small molecule-based and structure-based approaches. Protein language models (pLMs) have recently emerged as powerful tools for capturing physicochemical and functional…