More stories

    Cellphone compass can measure tiny concentrations of compounds important for human health

    Nearly every modern cellphone has a built-in compass, or magnetometer, that detects the direction of Earth’s magnetic field, providing critical information for navigation. Now a team of researchers at the National Institute of Standards and Technology (NIST) has developed a technique that uses an ordinary cellphone magnetometer for an entirely different purpose — to measure the concentration of glucose, a marker for diabetes, to high accuracy.
    The same technique, which uses the magnetometer in conjunction with magnetic materials designed to change their shape in response to biological or environmental cues, could be used to rapidly and cheaply measure a host of other biomedical properties for monitoring or diagnosing human disease. The method also has the potential to detect environmental toxins, said NIST scientist Gary Zabow.
    In their proof-of-concept study, Zabow and fellow NIST researcher Mark Ferris clamped to a cellphone a tiny well containing the solution to be tested and a strip of hydrogel — a porous material that swells when immersed in water. The researchers embedded tiny magnetic particles within the hydrogel, which they had engineered to react either to the presence of glucose or to pH levels (a measure of acidity) by expanding or contracting. Changing pH levels can be associated with a variety of biological disorders.
    As the hydrogels enlarged or shrank, they moved the magnetic particles closer to or farther from the cellphone’s magnetometer, which detected the corresponding changes in the strength of the magnetic field. Employing this strategy, the researchers measured glucose concentrations as small as a few millionths of a mole (the mole is the scientific unit for a certain number of atoms or molecules in a substance). Although such high sensitivity is not required for at-home monitoring of glucose levels using a drop of blood, it might in the future enable routine testing for glucose in saliva, which contains a much smaller concentration of the sugar.
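    As a rough illustration of the readout step, the swelling-driven change in field strength can be turned into a concentration estimate by calibrating against known standards and then inverting the fitted curve. The sketch below is a toy calibration in Python; every number in it (field readings, concentrations, the linear-in-log model) is invented for illustration and is not NIST's data or procedure.

```python
import numpy as np

# Hypothetical calibration data: magnetometer field readings (microtesla)
# for hydrogel strips immersed in known glucose standards (micromolar).
known_conc = np.array([1.0, 2.0, 5.0, 10.0, 20.0])     # micromolar standards
field_uT   = np.array([48.1, 47.6, 46.6, 45.5, 44.3])  # field drops as the gel swells

# Fit a simple linear calibration of field vs. log10(concentration).
slope, intercept = np.polyfit(np.log10(known_conc), field_uT, 1)

def estimate_conc(field_reading):
    """Invert the fitted calibration curve to estimate glucose concentration."""
    return 10 ** ((field_reading - intercept) / slope)

# A new strip reads 46.0 microtesla; invert the curve for the unknown sample.
estimate = estimate_conc(46.0)  # lands in the mid-single-digit micromolar range
```

A real device would also need temperature compensation and per-batch strip calibration; the point here is only that a monotonic field-vs-concentration curve makes the magnetometer reading directly invertible.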
    The researchers reported their findings in the March 30, 2024 edition of Nature Communications.
    Engineered, or “smart,” hydrogels like the ones the NIST team employed are inexpensive and relatively easy to fabricate, Ferris said, and can be tailored to react to a host of different compounds that medical researchers may want to measure. In their experiments, he and Zabow stacked single layers of two different hydrogels, each of which contracted and expanded at different rates in response to pH or glucose. These bilayers amplified the motion of the hydrogels, making it easier for the magnetometer to track changes in magnetic field strength.
    Because the technique requires no electronics or power source beyond the cellphone itself, nor any special processing of the sample, it offers an inexpensive way to conduct testing — even in locations with relatively few resources.

    Future efforts to improve the accuracy of such measurements using cellphone magnetometers might allow detection of DNA strands, specific proteins and histamines — compounds involved in the body’s immune response — at concentrations as low as a few tens of nanomoles (billionths of a mole).
    That improvement could have substantial benefit. For instance, measuring histamines, which are typically detected in urine at concentrations ranging from about 45 to 190 nanomoles, would ordinarily require a 24-hour urine collection and a sophisticated laboratory analysis.
    “An at-home test using a cellphone magnetometer sensitive to nanomolar concentrations would allow measurements to be done with much less hassle,” said Ferris. More generally, enhanced sensitivity would be essential when only a small amount of a substance is available for testing in extremely dilute quantities, Zabow added.
    Similarly, the team’s study suggests that a cellphone magnetometer can measure pH levels with the same sensitivity as a thousand-dollar benchtop meter but at a fraction of the cost. A home-brewer or a baker could use the magnetometer to quickly test the pH of various liquids to perfect their craft, and an environmental scientist could measure the pH of ground water samples on-site with higher accuracy than a litmus test strip could provide.
    In order to make the cellphone measurements a commercial success, engineers will need to develop a method to mass produce the hydrogel test strips and ensure that they have a long shelf life, Zabow said. Ideally, he added, the hydrogel strips should be designed to react more quickly to environmental cues in order to speed up measurements.

    Physics-based predictive tool will speed up battery and superconductor research

    From lithium-ion batteries to next-generation superconductors, the functionality of many modern, advanced technologies depends on the physical property known as intercalation. Unfortunately, it’s difficult to identify in advance which of the many possible intercalated materials are stable, which necessitates a lot of trial-and-error lab work in product development.
    Now, in a study recently published in ACS Physical Chemistry Au, researchers from the Institute of Industrial Science, The University of Tokyo, and collaborating partners have devised a straightforward equation that correctly predicts the stability of intercalated materials. The systematic design guidelines enabled by this work will speed up the development of upcoming high-performance electronics and energy-storage devices.
    To appreciate the research team’s achievement, we need to understand the context of this research. Intercalation is the reversible insertion of guests (atoms or molecules) into hosts (for example, 2D-layered materials). The purpose of intercalation is commonly to modify the host’s properties or structure for improved device performance, as seen in, for example, commercial lithium-ion batteries. Although many synthetic methods are available for preparing intercalated materials, researchers have had no reliable means of predicting which host-guest combinations are stable. Therefore, much lab work has been needed to devise new intercalated materials for imparting next-generation device functionalities. Minimizing this lab work by proposing a straightforward predictive tool for host-guest stability was the goal of the research team’s study.
    “We are the first to develop accurate predictive tools for host-guest intercalation energies, and the stability of intercalated compounds,” explains Naoto Kawaguchi, lead author of the study. “Our analysis, based on a database of 9,000 compounds, uses straightforward principles from undergraduate first-year chemistry.”
    A particular highlight of the work is that only two guest properties and eight host-derived descriptors were necessary for the researchers’ energy and stability calculations. In other words, initial ‘best guesses’ weren’t necessary; only the underlying physics of the host-guest systems. Furthermore, the researchers validated their model against nearly 200 sets of regression coefficients.
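    As an illustration of this kind of descriptor-based model, the sketch below fits a linear regression of intercalation energy on a small set of host/guest descriptors and flags negative-energy compounds as stable. The data, descriptor count and coefficients are all synthetic; this is a generic least-squares sketch, not the authors' actual model, descriptors or dataset.

```python
import numpy as np

rng = np.random.default_rng(42)

# Invented example: each row is one host-guest pair, described by a handful
# of physically motivated descriptors (e.g., guest ionization energy, guest
# radius, host layer spacing, host electron affinity, ...).
n_pairs, n_descriptors = 200, 10
X = rng.standard_normal((n_pairs, n_descriptors))

# Synthetic ground truth: intercalation energy is a linear combination of
# the descriptors plus a small amount of noise.
true_coeffs = rng.standard_normal(n_descriptors)
y = X @ true_coeffs + 0.05 * rng.standard_normal(n_pairs)

# Fit the linear model by ordinary least squares.
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

# A compound is predicted stable if its intercalation energy is negative.
predicted_stable = (X @ coeffs) < 0.0
```

The appeal of such a formulation is exactly what Mizoguchi notes: each fitted coefficient attaches to a named physical descriptor, so the model can be inspected for physical reasonableness rather than treated as a black box.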
    “We’re excited because our regression model formulation is straightforward and physically reasonable,” says Teruyasu Mizoguchi, senior author. “Other computational models in the literature lack a physical basis or validation against unknown intercalated compounds.”
    This work is an important step forward in minimizing the laborious lab work typically required to prepare intercalated materials. Given that many current and upcoming energy storage and electronic devices depend on such materials, the time and expense of the corresponding research and development will be reduced. Consequently, products with advanced functionalities will reach the market faster than was previously possible.

    App may pave way to treatments for No. 1 dementia in under-60s

    UCSF-led research shows smartphone cognitive testing is comparable to gold-standard methods; may detect FTD in gene carriers before symptoms start.
    A smartphone app could enable greater participation in clinical trials for people with frontotemporal dementia (FTD), a devastating neurological disorder that often manifests in midlife.
    Research into the condition has been hampered by problems with early diagnosis and difficulty tracking how people are responding to treatments that are only likely to be effective at the early stages of disease.
    To address this, a research team led by UC San Francisco deployed cognitive tests through a mobile app and found it could detect early signs of FTD in people who were genetically predisposed to get the disease but had not yet developed symptoms. These tests were at least as sensitive as neuropsychological evaluations done in the clinic.
    The study appears in JAMA Network Open on April 1, 2024.
    More than 30 FTD clinical trials are underway or in the planning stages, including one that may become the first drug approved to slow progression in some gene carriers. Researchers hope the new mobile technology will hasten the work.
    “Eventually, the app may be used to monitor treatment effects, replacing many or most in-person visits to clinical trials’ sites,” said first author Adam Staffaroni, PhD, clinical neuropsychologist and associate professor in the UCSF Department of Neurology and the Weill Institute for Neurosciences.

    FTD is the No. 1 cause of dementia in patients under 60, with up to 30% of cases attributed to genetics. It has three main variants with symptoms that may overlap. The most common variant causes dramatic personality shifts, which may manifest as lack of empathy, apathy, impulsivity, compulsive eating, and socially and sexually inappropriate behavior. Another affects movement, and a third impacts speech, language and comprehension, which is the variant that Bruce Willis is reported to have. In rare cases, FTD triggers bursts of visual creativity.
    FTD is not easy to diagnose
    As with Alzheimer’s disease, patients with FTD are believed to be most responsive to treatment early on, ideally before their symptoms even emerge. “Most FTD patients are diagnosed relatively late in the disease, because they are young, and their symptoms are mistaken for psychiatric disorders,” said senior author Adam Boxer, MD, PhD, endowed professor in memory and aging at the UCSF Department of Neurology.
    “We’ve heard from families that they often suspect their loved one has FTD long before a physician agrees that is the diagnosis,” said Boxer, who is also director of the UCSF Alzheimer’s Disease and Frontotemporal Dementia Clinical Trials Program.
    The researchers tracked 360 participants with an average age of 54 enrolled in ongoing studies at ALLFTD centers and UCSF. About 90% had data on disease stage. These included 60% who did not have FTD or were gene carriers who had not yet developed symptoms, 20% with early signs of the disease and 21% with symptoms.
    Software that can detect a waning ability to plan
    Staffaroni and Boxer collaborated with software company Datacubed Health, which developed the platform, to include tests of executive function, such as planning and prioritizing, filtering distractions and controlling impulses. In FTD, the part of the brain responsible for executive functioning shrinks as the disease progresses.

    The rich data collected by the app, including voice recordings and body movements, enabled the researchers to develop new tests that eventually could help with early diagnosis and monitoring of symptoms.
    “We developed the capability to record speech while participants engaged with several different tests,” said Staffaroni. “We also created tests of walking, balance and slowed movements, as well as different aspects of language.”
    FTD researchers say they are closer to finding treatments that may eventually slow the progression of the disease, which is fatal. These include gene and other therapies, such as antisense oligonucleotides (ASOs), to increase or decrease the production of proteins that are abnormal in gene carriers.
    Although there are currently no plans to make the app available to the public, it could be a boon to research.
    “A major barrier has been a lack of outcome measures that can be easily collected and are sensitive to treatment effects at early stages of the disease,” said Staffaroni. “We hope that smartphone assessments will facilitate new trials of promising therapies.”

    Chatbot outperformed physicians in clinical reasoning in head-to-head study

    ChatGPT-4, an artificial intelligence program designed to understand and generate human-like text, outperformed internal medicine residents and attending physicians at two academic medical centers at processing medical data and demonstrating clinical reasoning. In a research letter published in JAMA Internal Medicine, physician-scientists at Beth Israel Deaconess Medical Center (BIDMC) compared a large language model’s (LLM) reasoning abilities directly against human performance using standards developed to assess physicians.
    “It became clear very early on that LLMs can make diagnoses, but anybody who practices medicine knows there’s a lot more to medicine than that,” said Adam Rodman, MD, an internal medicine physician and investigator in the department of medicine at BIDMC. “There are multiple steps behind a diagnosis, so we wanted to evaluate whether LLMs are as good as physicians at doing that kind of clinical reasoning. It’s a surprising finding that these things are capable of showing the equivalent or better reasoning than people throughout the evolution of a clinical case.”
    Rodman and colleagues used a previously validated tool developed to assess physicians’ clinical reasoning called the revised-IDEA (r-IDEA) score. The investigators recruited 21 attending physicians and 18 residents who each worked through one of 20 selected clinical cases, each comprising four sequential stages of diagnostic reasoning. The authors instructed the physicians to write out and justify their differential diagnoses at each stage. The chatbot GPT-4 was given a prompt with identical instructions and run on all 20 clinical cases. The answers were then scored for clinical reasoning (r-IDEA score) and several other measures of reasoning.
    “The first stage is the triage data, when the patient tells you what’s bothering them and you obtain vital signs,” said lead author Stephanie Cabral, MD, a third-year internal medicine resident at BIDMC. “The second stage is the system review, when you obtain additional information from the patient. The third stage is the physical exam, and the fourth is diagnostic testing and imaging.”
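    The staged design Cabral describes can be sketched in code: at each stage, the model (or physician) sees everything revealed so far plus identical instructions, and must commit to a differential diagnosis. In the sketch below, answer_fn is a placeholder for a real LLM call, and the stage names, case text and prompt wording are invented for illustration.

```python
# Staged clinical-reasoning protocol: each stage reveals more of the case,
# and the respondent answers with a justified differential at every stage.
STAGES = ["triage data", "system review", "physical exam", "diagnostic testing"]

INSTRUCTIONS = ("List your differential diagnosis and justify each item "
                "based on the information presented so far.")

def run_case(stage_texts, answer_fn):
    """Reveal a case stage by stage, collecting one answer per stage.

    answer_fn stands in for a call to an LLM (or a human physician's
    written response); each answer would then be scored with r-IDEA.
    """
    revealed, answers = [], []
    for name, text in zip(STAGES, stage_texts):
        revealed.append(f"[{name}] {text}")
        prompt = "\n".join(revealed) + "\n" + INSTRUCTIONS
        answers.append(answer_fn(prompt))
    return answers

# Toy usage with an invented vignette and a stub in place of a model call.
case = ["62M, chest pain, BP 150/90", "denies fever", "S4 gallop heard",
        "troponin elevated"]
answers = run_case(case, answer_fn=lambda prompt: f"({len(prompt)} chars seen)")
```

The key property mirrored here is cumulative disclosure: the stage-4 prompt contains all earlier stages, so reasoning can be scored as the case evolves rather than only at the final diagnosis.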
    Rodman, Cabral and their colleagues found that the chatbot earned the highest r-IDEA scores, with a median score of 10 out of 10 for the LLM, 9 for attending physicians and 8 for residents. It was more of a draw between the humans and the bot when it came to diagnostic accuracy — how high the correct diagnosis appeared on the list of diagnoses they provided — and correct clinical reasoning. But the bots were also “just plain wrong” — including incorrect reasoning in their answers — significantly more often than residents, the researchers found. The finding underscores the notion that AI will likely be most useful as a tool to augment, not replace, the human reasoning process.
    “Further studies are needed to determine how LLMs can best be integrated into clinical practice, but even now, they could be useful as a checkpoint, helping us make sure we don’t miss something,” Cabral said. “My ultimate hope is that AI will improve the patient-physician interaction by reducing some of the inefficiencies we currently have and allow us to focus more on the conversation we’re having with our patients.”
    “Early studies suggested AI could make diagnoses if all the information was handed to it,” Rodman said. “What our study shows is that AI demonstrates real reasoning — maybe better reasoning than people through multiple steps of the process. We have a unique chance to improve the quality and experience of healthcare for patients.”
    Co-authors included Zahir Kanjee, MD, Philip Wilson, MD, and Byron Crowe, MD, of BIDMC; Daniel Restrepo, MD, of Massachusetts General Hospital; and Raja-Elie Abdulnour, MD, of Brigham and Women’s Hospital.
    This work was conducted with support from Harvard Catalyst | The Harvard Clinical and Translational Science Center (National Center for Advancing Translational Sciences, National Institutes of Health) (award UM1TR004408) and financial contributions from Harvard University and its affiliated academic healthcare centers.
    Potential Conflicts of Interest: Rodman reports grant funding from the Gordon and Betty Moore Foundation. Crowe reports employment and equity in Solera Health. Kanjee reports receipt of royalties for books edited and membership on a paid advisory board for medical education products not related to AI from Wolters Kluwer, as well as honoraria for continuing medical education delivered from Oakstone Publishing. Abdulnour reports employment by the Massachusetts Medical Society (MMS), a not-for-profit organization that owns NEJM Healer. Abdulnour does not receive royalties from sales of NEJM Healer and does not have equity in NEJM Healer. No funding was provided by the MMS for this study. Abdulnour reports grant funding from the Gordon and Betty Moore Foundation via the National Academy of Medicine Scholars in Diagnostic Excellence.

    Research reveals language barriers limit effectiveness of cybersecurity resources

    The idea for Fawn Ngo’s latest research came from a television interview.
    Ngo, a University of South Florida criminologist, had spoken with a Vietnamese language network in California about her interest in better understanding how people become victims of cybercrime.
    Afterward, she began receiving phone calls from viewers recounting their own experiences of victimization.
    “Some of the stories were unfortunate and heartbreaking,” said Ngo, an associate professor in the USF College of Behavioral and Community Sciences. “They made me wonder about the availability and accessibility of cybersecurity information and resources for non-English speakers. Upon investigating further, I discovered that such information and resources were either limited or nonexistent.”
    The result is what’s believed to be the first study to explore the links among demographic characteristics, cyber hygiene practices and cyber victimization using a sample of limited English proficiency internet users.
    Ngo is the lead author of an article, “Cyber Hygiene and Cyber Victimization Among Limited English Proficiency (LEP) Internet Users: A Mixed-Method Study,” which just published in the journal Victims & Offenders. The article’s co-authors are Katherine Holman, a USF graduate student and former Georgia state prosecutor, and Anurag Agarwal, professor of information systems, analytics and supply chain at Florida Gulf Coast University.
    Their research, which focused on Spanish and Vietnamese speakers, led to two closely connected takeaways: LEP Internet users share the same concern about cyber threats and the same desire for online safety as other users, but they are constrained by a lack of culturally and linguistically appropriate resources, which also hampers accurate collection of cyber victimization data among vulnerable populations. Online guidance that provides the most effective educational tools and reporting forms is available only in English. The most notable example is the website for the Internet Crime Complaint Center, which serves as the FBI’s primary apparatus for combatting cybercrime.
    As a result, the study showed that many well-intentioned LEP users still engage in risky online behaviors like using unsecured networks and sharing passwords. For example, only 29 percent of the study’s focus group participants avoided using public Wi-Fi over the previous 12 months, and only 17 percent said they had antivirus software installed on their digital devices.

    Previous research cited in Ngo’s paper has shown that underserved populations exhibit poorer cybersecurity knowledge and outcomes, most commonly in the form of computer viruses and hacked accounts, including social media accounts. Often that is because they lack awareness and understanding, not because of disinterest, Ngo said.
    “According to cybersecurity experts, humans are the weakest link in the chain of cybersecurity,” Ngo said. “If we want to secure our digital borders, we must ensure that every member in society, regardless of their language skills, is well-informed about the risks inherent in the cyber world.”
    The study’s findings point to a need for providing cyber hygiene information and resources in multiple formats, including visual aids and audio guides, to accommodate diverse literacy levels within LEP communities, Ngo said. She added that further research is needed to address the current security gap and ensure equitable access to cybersecurity resources for all Internet users.
    In the meantime, Ngo is preparing to launch a website with cybersecurity information and resources in different languages and a link to report victimization.
    “It’s my hope that cybersecurity information and resources will become as readily accessible in other languages as other vital information, such as information related to health and safety,” Ngo said. “I also want LEP victims to be included in national data and statistics on cybercrime and their experiences accurately represented and addressed in cybersecurity initiatives.”

    Artificial intelligence boosts super-resolution microscopy

    Generative artificial intelligence (AI) might be best known from text or image-creating applications like ChatGPT or Stable Diffusion. But its usefulness is being demonstrated in more and more scientific fields. In their recent work, to be presented at the upcoming International Conference on Learning Representations (ICLR), researchers from the Center for Advanced Systems Understanding (CASUS) at the Helmholtz-Zentrum Dresden-Rossendorf (HZDR), in collaboration with colleagues from Imperial College London and University College London, have provided a new open-source algorithm called the Conditional Variational Diffusion Model (CVDM). Based on generative AI, this model improves the quality of images by reconstructing them from randomness. In addition, the CVDM is computationally less expensive than established diffusion models — and it can be easily adapted for a variety of applications.
    With the advent of big data and new mathematical and data science methods, researchers aim to decipher as-yet-unexplained phenomena in biology, medicine, or the environmental sciences using inverse problem approaches. Inverse problems deal with recovering the causal factors that lead to certain observations. Suppose, for example, you have a greyscale version of an image and want to recover the colors. There are usually several valid solutions, since light blue and light red look identical in greyscale. The solution to this inverse problem can therefore be the image with the light blue shirt or the one with the light red shirt.
    Analyzing microscopic images can also be a typical inverse problem. “You have an observation: your microscopic image. Applying some calculations, you then can learn more about your sample than first meets the eye,” says Gabriel della Maggiora, PhD student at CASUS and lead author of the ICLR paper. The results can be higher-resolution or better-quality images. However, the path from the observations, i.e., the microscopic images, to the “super images” is usually not obvious. Additionally, observational data is often noisy, incomplete, or uncertain. This all adds to the complexity of solving inverse problems, making them exciting mathematical challenges.
    The power of generative AI models like Sora
    One of the powerful tools to tackle inverse problems with is generative AI. Generative AI models in general learn the underlying distribution of the data in a given training dataset. A typical example is image generation. After the training phase, generative AI models generate completely new images that are, however, consistent with the training data.
    Among the different generative AI variations, a particular family named diffusion models has recently gained popularity among researchers. With diffusion models, an iterative data generation process starts from basic noise, a concept used in information theory to mimic the effect of many random processes that occur in nature. Concerning image generation, diffusion models have learned which pixel arrangements are common and uncommon in the training dataset images. They generate the new desired image bit by bit until a pixel arrangement coincides best with the underlying structure of the training data. A good example for the power of diffusion models is the US software company OpenAI’s text-to-video model Sora. An implemented diffusion component gives Sora the ability to generate videos that appear more realistic than anything AI models have created before.
    But there is one drawback. “Diffusion models have long been known as computationally expensive to train. Some researchers were recently giving up on them exactly for that reason,” says Dr. Artur Yakimovich, Leader of a CASUS Young Investigator Group and corresponding author of the ICLR paper. “But new developments like our Conditional Variational Diffusion Model allow minimizing ‘unproductive runs’, which do not lead to the final model. By lowering the computational effort and hence power consumption, this approach may also make diffusion models more eco-friendly to train.”
    Clever training does the trick — not just in sports

    These ‘unproductive runs’ are a major drawback of diffusion models. One reason is that the model is sensitive to the choice of the predefined schedule controlling the dynamics of the diffusion process. This schedule governs how the noise is added: too little or too much, in the wrong place or at the wrong time — many possible scenarios end with a failed training. Until now, this schedule has been set as a hyperparameter that must be tuned by trial and error for each and every new application. In the new paper presented at ICLR, the authors instead incorporated the schedule into the training phase itself, so that their CVDM finds the optimal schedule on its own. The model then yielded better results than other models relying on a predefined schedule.
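    A minimal sketch of the idea, assuming a simple two-parameter sigmoid schedule: the forward diffusion process mixes clean data with Gaussian noise according to a cumulative signal level alpha_bar(t) that falls from about 1 (clean data) to about 0 (pure noise). In the CVDM the schedule is learned jointly during training; here the parameters a and b are fixed by hand purely for illustration, so this is not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

def noise_schedule(t, a, b):
    # Cumulative signal level alpha_bar(t): near 1 at t=0 (clean data),
    # near 0 at t=1 (pure noise). a sets steepness, b the midpoint;
    # in the CVDM these would be learned, not hand-picked.
    return 1.0 / (1.0 + np.exp(a * (np.asarray(t, dtype=float) - b)))

def forward_diffuse(x0, t, a, b):
    # Forward process q(x_t | x_0): interpolate between clean data and
    # Gaussian noise according to the schedule at time t.
    abar = noise_schedule(t, a, b)
    eps = rng.standard_normal(np.shape(x0))
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps

x0 = rng.standard_normal(64)          # stand-in for a clean image
ts = np.linspace(0.0, 1.0, 11)
abars = noise_schedule(ts, a=10.0, b=0.5)  # signal decays monotonically
```

If a and b are badly chosen (noise added too fast or too slow), the reverse model has little useful signal to learn from at many timesteps — which is exactly the failure mode that making the schedule trainable is meant to avoid.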
    Among others, the authors demonstrated the applicability of the CVDM to a scientific problem: super-resolution microscopy, a typical inverse problem. Super-resolution microscopy aims to overcome the diffraction limit, a limit that restricts resolution due to the optical characteristics of the microscopic system. To surmount this limit algorithmically, data scientists reconstruct higher-resolution images by eliminating both blurring and noise from recorded, limited-resolution images. In this scenario, the CVDM yielded comparable or even superior results compared to commonly used methods.
    “Of course, there are several methods out there to increase the meaningfulness of microscopic images — some of them relying on generative AI models,” says Yakimovich. “But we believe that our approach has some new unique properties that will leave an impact in the imaging community, namely high flexibility and speed at a comparable or even better quality compared to other diffusion model approaches. In addition, our CVDM provides direct hints where it is not very sure about the reconstruction — a very helpful property that sets the path forward to address these uncertainties in new experiments and simulations.”

    Revolutionary biomimetic olfactory chips to enable advanced gas sensing and odor detection

    A research team led by the School of Engineering of the Hong Kong University of Science and Technology (HKUST) has addressed the long-standing challenge of creating artificial olfactory sensors with arrays of diverse high-performance gas sensors. Their newly developed biomimetic olfactory chips (BOC) are able to integrate nanotube sensor arrays on nanoporous substrates with up to 10,000 individually addressable gas sensors per chip, a configuration that is similar to how olfaction works for humans and other animals.
    For decades, researchers worldwide have been developing artificial olfaction and electronic noses (e-noses) with the aim of emulating the intricate mechanism of the biological olfactory system to effectively discern complex odorant mixtures. Yet major challenges in their development lie in the difficulty of miniaturizing the system and increasing its recognition capabilities in determining the exact gas species and their concentrations within complex odorant mixtures.
    To tackle these issues, the research team led by Prof. FAN Zhiyong, Chair Professor at HKUST’s Department of Electronic & Computer Engineering and Department of Chemical & Biological Engineering, used an engineered material composition gradient that allows for wide arrays of diverse sensors on one small nanostructured chip. Leveraging the power of artificial intelligence, their biomimetic olfactory chips exhibit exceptional sensitivity to various gases with excellent distinguishability for mixed gases and 24 distinct odors. With a vision to expand their olfactory chip’s applications, the team also integrated the chips with vision sensors on a robot dog, creating a combined olfactory and visual system that can accurately identify objects in blind boxes.
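    Conceptually, a large sensor array turns each odor into a high-dimensional response “fingerprint” that a machine-learning model can then classify. The toy sketch below trains a nearest-centroid classifier on synthetic fingerprints from a scaled-down array; it illustrates only the pattern-recognition step and is not the HKUST team's actual sensors, data or AI pipeline.

```python
import numpy as np

rng = np.random.default_rng(7)

N_SENSORS, N_ODORS, SAMPLES = 100, 5, 30  # scaled down from 10,000 sensors

# Each odor class has a characteristic mean response pattern across the
# diverse sensor array; individual "sniffs" add noise around that pattern.
prototypes = rng.uniform(0.0, 1.0, (N_ODORS, N_SENSORS))

def sniff(odor_id):
    """Simulate one noisy array readout for a given odor."""
    return prototypes[odor_id] + 0.05 * rng.standard_normal(N_SENSORS)

# "Train": average repeated sniffs of each odor into a class centroid.
centroids = np.array([
    np.mean([sniff(k) for _ in range(SAMPLES)], axis=0) for k in range(N_ODORS)
])

def classify(reading):
    """Assign a readout to the nearest class centroid (Euclidean distance)."""
    return int(np.argmin(np.linalg.norm(centroids - reading, axis=1)))
```

The benefit of sensor diversity shows up directly here: the more independently responding sensors the array has, the farther apart the class fingerprints sit relative to the noise, which is what lets the chip separate 24 distinct odors and mixed gases.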
    The development of the biomimetic olfactory chips will not only improve the existing broad applications of artificial olfaction and e-nose systems in food, environmental, medical and industrial process control, but also open up new possibilities in intelligent systems, such as advanced robots and portable smart devices, for applications in security patrols and rescue operations.
    For example, in their applications in real-time monitoring and quality control, the biomimetic olfactory chips can be used to detect and analyze specific odors or volatile compounds associated with different stages of industrial processes to ensure safety; detect any abnormal or hazardous gases in environmental monitoring; and identify leakage in pipes to facilitate timely repair.
    The technology presented in this study serves as a pivotal breakthrough in the realm of odor digitization. While the scientific community has witnessed the triumphant prevalence of visual information digitization, facilitated by modern and mature imaging sensing technologies, the realm of scent-based information has so far remained untapped due to the absence of advanced odor sensors. The work conducted by Prof. Fan’s team has paved the way for the development of biomimetic odor sensors that possess immense potential. With further advancements, these sensors could find widespread utilization, akin to the ubiquitous presence of miniaturized cameras in cell phones and portable electronics, thereby enriching and enhancing people’s quality of life.
    “In the future, with the development of suitable bio-compatible materials, we hope that the biomimetic olfactory chip can also be placed on the human body to allow us to smell odors that normally cannot be smelled. It can also monitor abnormalities in the volatile organic molecules in our breath and emitted by our skin, to warn us of potential diseases, reaching the further potential of biomimetic engineering,” said Prof. Fan.

  • in

    Could AI play a role in locating damage to the brain after stroke?

    Artificial intelligence (AI) may serve as a future tool for neurologists to help locate where in the brain a stroke occurred. In a new study, AI processed text from health histories and neurologic examinations to locate lesions in the brain. The study, which looked specifically at the large language model called generative pre-trained transformer 4 (GPT-4), is published in the March 27, 2024, online issue of Neurology® Clinical Practice, an official journal of the American Academy of Neurology.
    A stroke can cause long-term disability or even death. Knowing where a stroke has occurred in the brain helps predict long-term effects such as problems with speech and language or a person’s ability to move part of their body. It can also help determine the best treatment and a person’s overall prognosis.
    Damage to brain tissue from a stroke is called a lesion. A neurologic exam, paired with a review of a person’s health history, can help locate lesions. The exam involves evaluating symptoms and testing thinking and memory. People with stroke often have brain scans to locate lesions.
    “Not everyone with stroke has access to brain scans or neurologists, so we wanted to determine whether GPT-4 could accurately locate brain lesions after stroke based on a person’s health history and a neurologic exam,” said study author Jung-Hyun Lee, MD, of State University of New York (SUNY) Downstate Health Sciences University in Brooklyn and a member of the American Academy of Neurology.
    The study used 46 published cases of people who had stroke. Researchers gathered text from participants’ health histories and neurologic exams. The raw text was fed into GPT-4. Researchers asked it to answer three questions: whether a participant had one or more lesions; on which side of the brain lesions were located; and in which region of the brain the lesions were found. They repeated these questions for each participant three times. Results from GPT-4 were then compared to brain scans for each participant.
    Researchers found that GPT-4 processed the text from the health histories and neurologic exams to locate lesions in many participants’ brains, identifying which side of the brain the lesion was on, as well as the specific brain region, with the exception of lesions in the cerebellum and spinal cord.
    For the majority of people, GPT-4 was able to identify on which side of the brain lesions were found with a sensitivity of 74% and a specificity of 87%. Sensitivity is the percentage of actual positives that are correctly identified as positive. Specificity is the percentage of negatives that are correctly identified. It also identified the brain region with a sensitivity of 85% and a specificity of 94%.
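    The two metrics defined above follow directly from a confusion matrix. A minimal sketch, using illustrative counts chosen to match the reported percentages rather than the study’s actual data:

    ```python
    # Sensitivity and specificity as defined in the article, computed from a
    # hypothetical confusion matrix (counts below are illustrative only).

    def sensitivity(true_pos, false_neg):
        """Fraction of actual positives correctly identified as positive."""
        return true_pos / (true_pos + false_neg)

    def specificity(true_neg, false_pos):
        """Fraction of actual negatives correctly identified as negative."""
        return true_neg / (true_neg + false_pos)

    # Example: 37 of 50 true left-sided lesions flagged correctly (TP=37, FN=13),
    # and 87 of 100 cases without a left-sided lesion correctly ruled out
    # (TN=87, FP=13).
    print(sensitivity(37, 13))  # 0.74
    print(specificity(87, 13))  # 0.87
    ```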

    When looking at how often the three tests had the same result for each participant, GPT-4 was consistent for 76% of participants regarding the number of brain lesions. It was consistent for 83% of participants for the side of the brain, and for 87% of participants regarding the brain regions.
    However, when its responses to all three questions were required to be accurate across all three repetitions, GPT-4 achieved this for only 41% of participants.
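    The consistency figures above measure how often the three repeated runs agreed for a given participant. A minimal sketch of that calculation, over hypothetical per-participant answer triples (not the study’s data):

    ```python
    # Fraction of participants whose three repeated answers all agree,
    # mirroring the per-question consistency measure described in the article.

    def consistent_fraction(runs):
        """Return the share of participants with identical answers across runs."""
        agree = sum(1 for answers in runs if len(set(answers)) == 1)
        return agree / len(runs)

    # Four hypothetical participants, each asked "which side?" three times.
    side_answers = [
        ("left", "left", "left"),
        ("right", "right", "right"),
        ("left", "right", "left"),   # inconsistent across repetitions
        ("right", "right", "right"),
    ]
    print(consistent_fraction(side_answers))  # 0.75
    ```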
    “While not yet ready for use in the clinic, large language models such as generative pre-trained transformers have the potential not only to assist in locating lesions after stroke but also to reduce health care disparities, because they can function across different languages,” said Lee. “The potential for use is encouraging, especially given the great need for improved health care in underserved areas across multiple countries where access to neurologic care is limited.”
    A limitation of the study is that the accuracy of GPT-4 depends on the quality of the information it is provided. While researchers had detailed health histories and neurologic exam information for each participant, such information is not always available for everyone who has a stroke.