More stories

    Compact accelerator technology achieves major energy milestone

    Particle accelerators hold great potential for semiconductor applications, medical imaging and therapy, and research in materials, energy and medicine. But conventional accelerators require plenty of elbow room — kilometers — making them expensive and limiting their presence to a handful of national labs and universities.
    Researchers from The University of Texas at Austin, several national laboratories, European universities and the Texas-based company TAU Systems Inc. have demonstrated a compact particle accelerator less than 20 meters long that produces an electron beam with an energy of 10 billion electron volts (10 GeV). There are only two other accelerators currently operating in the U.S. that can reach such high electron energies, but both are approximately 3 kilometers long.
    “We can now reach those energies in 10 centimeters,” said Bjorn “Manuel” Hegelich, associate professor of physics at UT and CEO of TAU Systems, referring to the size of the chamber where the beam was produced. He is the senior author on a recent paper describing their achievement in the journal Matter and Radiation at Extremes.
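    The size comparison above can be restated as an average accelerating gradient (energy gained per meter). The sketch below uses only the round figures quoted in the article and assumes, as a simplification, that the energy is gained uniformly along the stated length; the function name is illustrative.

```python
def average_gradient_gev_per_m(energy_gev, length_m):
    """Mean energy gain per meter, assuming uniform acceleration (a simplification)."""
    return energy_gev / length_m

# Round figures from the article: 10 GeV in a 10 cm chamber vs. 10 GeV over about 3 km
compact = average_gradient_gev_per_m(10.0, 0.10)
conventional = average_gradient_gev_per_m(10.0, 3000.0)
ratio = compact / conventional

print(f"compact:      {compact:.0f} GeV/m")
print(f"conventional: {conventional * 1000:.1f} MeV/m")
print(f"ratio:        {ratio:,.0f}x")
```

    Under these assumptions the compact chamber sustains a gradient tens of thousands of times higher than a kilometer-scale machine, which is why the length can shrink so dramatically.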
    Hegelich and his team are currently exploring the use of their accelerator, called an advanced wakefield laser accelerator, for a variety of purposes. They hope to use it to test how well space-bound electronics can withstand radiation, to image the 3D internal structures of new semiconductor chip designs, and even to develop novel cancer therapies and advanced medical-imaging techniques.
    This kind of accelerator could also be used to drive another device called an X-ray free electron laser, which could take slow-motion movies of processes on the atomic or molecular scale. Examples of such processes include drug interactions with cells, changes inside batteries that might cause them to catch fire, chemical reactions inside solar panels, and viral proteins changing shape when infecting cells.
    The concept for wakefield laser accelerators was first described in 1979. An extremely powerful laser strikes helium gas, heats it into a plasma and creates waves that kick electrons from the gas out in a high-energy electron beam. During the past couple of decades, various research groups have developed more powerful versions. Hegelich and his team’s key advance relies on nanoparticles. An auxiliary laser strikes a metal plate inside the gas cell, which injects a stream of metal nanoparticles that boost the energy delivered to electrons from the waves.
    The laser is like a boat skimming across a lake, leaving behind a wake, and electrons ride this plasma wave like surfers.

    “It’s hard to get into a big wave without getting overpowered, so wake surfers get dragged in by Jet Skis,” Hegelich said. “In our accelerator, the equivalent of Jet Skis are nanoparticles that release electrons at just the right point and just the right time, so they are all sitting there in the wave. We get a lot more electrons into the wave when and where we want them to be, rather than statistically distributed over the whole interaction, and that’s our secret sauce.”
    For this experiment, the researchers used one of the world’s most powerful pulsed lasers, the Texas Petawatt Laser, which is housed at UT and fires one ultra-intense pulse of light every hour. A single petawatt laser pulse contains about 1,000 times the installed electrical power in the U.S. but lasts only 150 femtoseconds, less than a billionth as long as a lightning discharge. The team’s long-term goal is to drive their system with a laser they’re currently developing that fits on a tabletop and can fire repeatedly at thousands of times per second, making the whole accelerator far more compact and usable in much wider settings than conventional accelerators.
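    The pulse figures above imply a modest energy per shot, because power is energy per unit time. A rough check, assuming an idealized rectangular pulse at exactly 1 petawatt (both assumptions, not measured values):

```python
PEAK_POWER_W = 1e15          # 1 petawatt, taken as the nominal peak power (assumption)
PULSE_DURATION_S = 150e-15   # 150 femtoseconds, from the article

def pulse_energy_joules(power_w, duration_s):
    """Energy of an idealized rectangular pulse: E = P * t."""
    return power_w * duration_s

energy = pulse_energy_joules(PEAK_POWER_W, PULSE_DURATION_S)

# "About 1,000 times the installed electrical power in the U.S." implies roughly 1 terawatt
implied_us_capacity_w = PEAK_POWER_W / 1000

print(f"pulse energy: {energy:.0f} J")
print(f"implied installed capacity: {implied_us_capacity_w:.1e} W")
```

    The enormous peak power comes from compressing an ordinary amount of energy into an extremely short time, not from a large energy budget.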
    The study’s co-first authors are Constantin Aniculaesei, corresponding author now at Heinrich Heine University Düsseldorf, Germany; and Thanh Ha, doctoral student at UT and researcher at TAU Systems. Other UT faculty members are professors Todd Ditmire and Michael Downer.
    Hegelich and Aniculaesei have submitted a patent application describing the device and method to generate nanoparticles in a gas cell. TAU Systems, spun out of Hegelich’s lab, holds an exclusive license from the University for this foundational patent. As part of the agreement, UT has been issued shares in TAU Systems.
    Support for this research was provided by the U.S. Air Force Office of Scientific Research, the U.S. Department of Energy, the U.K. Engineering and Physical Sciences Research Council and the European Union’s Horizon 2020 research and innovation program.

    Defending your voice against deepfakes

    Recent advances in generative artificial intelligence have spurred developments in realistic speech synthesis. While this technology has the potential to improve lives through personalized voice assistants and accessibility-enhancing communication tools, it also has led to the emergence of deepfakes, in which synthesized speech can be misused to deceive humans and machines for nefarious purposes.
    In response to this evolving threat, Ning Zhang, an assistant professor of computer science and engineering at the McKelvey School of Engineering at Washington University in St. Louis, developed a tool called AntiFake, a novel defense mechanism designed to thwart unauthorized speech synthesis before it happens. Zhang presented AntiFake Nov. 27 at the Association for Computing Machinery’s Conference on Computer and Communications Security in Copenhagen, Denmark.
    Unlike traditional deepfake detection methods, which are used to evaluate and uncover synthetic audio as a post-attack mitigation tool, AntiFake takes a proactive stance. It employs adversarial techniques to prevent the synthesis of deceptive speech by making it more difficult for AI tools to read necessary characteristics from voice recordings. The code is freely available to users.
    “AntiFake makes sure that when we put voice data out there, it’s hard for criminals to use that information to synthesize our voices and impersonate us,” Zhang said. “The tool uses a technique of adversarial AI that was originally part of the cybercriminals’ toolbox, but now we’re using it to defend against them. We mess up the recorded audio signal just a little bit, distort or perturb it just enough that it still sounds right to human listeners, but it’s completely different to AI.”
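    The idea of a small, bounded perturbation that shifts what a model "hears" can be illustrated with a toy example. This is not AntiFake's algorithm: it is a generic single fast-gradient-sign step against a hypothetical linear "speaker encoder" `W`, with made-up audio samples and a made-up decoy voice embedding.

```python
def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def sign(value):
    return (value > 0) - (value < 0)

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical linear "speaker encoder": embedding = W @ audio
W = [[1.0, 0.0, -1.0, 0.0],
     [0.0, 1.0, 0.0, -1.0]]
audio = [0.5, -0.2, 0.1, 0.3]   # toy waveform samples
decoy = [-0.4, 0.5]             # embedding of a different, made-up voice
EPSILON = 0.01                  # largest allowed change per sample

embedding = matvec(W, audio)
residual = [e - d for e, d in zip(embedding, decoy)]
# The gradient of ||W x - decoy||^2 with respect to x is 2 * W^T (W x - decoy);
# only its sign is used, so the factor of 2 is dropped.
gradient = matvec(transpose(W), residual)
perturbation = [-EPSILON * sign(g) for g in gradient]
protected = [a + p for a, p in zip(audio, perturbation)]

before = squared_distance(embedding, decoy)
after = squared_distance(matvec(W, protected), decoy)
print(f"distance to decoy voice: {before:.4f} -> {after:.4f}")
```

    Each sample changes by at most `EPSILON`, yet the embedding moves toward the decoy voice. Real synthesizers are nonlinear, and a practical defense would iterate and add perceptual constraints.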
    To ensure AntiFake can stand up against an ever-changing landscape of potential attackers and unknown synthesis models, Zhang and first author Zhiyuan Yu, a graduate student in Zhang’s lab, built the tool to be generalizable and tested it against five state-of-the-art speech synthesizers. AntiFake achieved a protection rate of over 95%, even against unseen commercial synthesizers. They also tested AntiFake’s usability with 24 human participants to confirm the tool is accessible to diverse populations.
    Currently, AntiFake can protect short clips of speech, taking aim at the most common type of voice impersonation. But, Zhang said, there’s nothing to stop this tool from being expanded to protect longer recordings, or even music, in the ongoing fight against disinformation.
    “Eventually, we want to be able to fully protect voice recordings,” Zhang said. “While I don’t know what will be next in AI voice tech (new tools and features are being developed all the time), I do think our strategy of turning adversaries’ techniques against them will continue to be effective. AI remains vulnerable to adversarial perturbations, even if the engineering specifics may need to shift to maintain this as a winning strategy.”

    New framework for using AI in health care considers medical knowledge, practices, procedures, values

    Health care organizations are looking to artificial intelligence (AI) tools to improve patient care, but their translation into clinical settings has been inconsistent, in part because evaluating AI in health care remains challenging. In a new article, researchers propose a framework for using AI that includes practical guidance for applying values and that incorporates not just the tool’s properties but the systems surrounding its use.
    The article was written by researchers at Carnegie Mellon University, The Hospital for Sick Children, the Dalla Lana School of Public Health, Columbia University, and the University of Toronto. It is published in Patterns.
    “Regulatory guidelines and institutional approaches have focused narrowly on the performance of AI tools, neglecting knowledge, practices, and procedures necessary to integrate the model within the larger social systems of medical practice,” explains Alex John London, K&L Gates Professor of Ethics and Computational Technologies at Carnegie Mellon, who coauthored the article. “Tools are not neutral — they reflect our values — so how they work reflects the people, processes, and environments in which they are put to work.”
    London is also Director of Carnegie Mellon’s Center for Ethics and Policy and Chief Ethicist at Carnegie Mellon’s Block Center for Technology and Society as well as a faculty member in CMU’s Department of Philosophy.
    London and his coauthors advocate for a conceptual shift in which AI tools are viewed as parts of a larger “intervention ensemble,” a set of knowledge, practices, and procedures that are necessary to deliver care to patients. In previous work with other colleagues, London has applied this concept to pharmaceuticals and to autonomous vehicles. The approach treats AI tools as “sociotechnical systems,” and the authors’ proposed framework seeks to advance the responsible integration of AI systems into health care.
    Previous work in this area has been largely descriptive, explaining how AI systems interact with human systems. The framework proposed by London and his colleagues is proactive, providing guidance to designers, funders, and users about how to ensure that AI systems can be integrated into workflows with the greatest potential to help patients. Their approach can also be used for regulation and institutional insights, as well as for appraising, evaluating, and using AI tools responsibly and ethically. To illustrate their framework, the authors apply it to the development of AI systems for diagnosing more than mild diabetic retinopathy.
    “Only a small majority of models evaluated through clinical trials have shown a net benefit,” says Melissa McCradden, a Bioethicist at the Hospital for Sick Children and Assistant Professor of Clinical and Public Health at the Dalla Lana School of Public Health, who coauthored the article. “We hope our proposed framework lends precision to evaluation and interests regulatory bodies exploring the kinds of evidence needed to support the oversight of AI systems.”

    Measuring long-term heart stress dynamics with smartwatch data

    Biomedical engineers at Duke University have developed a method using data from wearable devices such as smartwatches to digitally mimic an entire week’s worth of an individual’s heartbeats. The previous record covered only a few minutes.
    Called the Longitudinal Hemodynamic Mapping Framework (LHMF), the approach creates “digital twins” of a specific patient’s blood flow to assess its 3D characteristics. The advance is an important step toward improving on the current gold standard in evaluating the risks of heart disease or heart attack, which uses snapshots of a single moment in time — a challenging approach for a disease that progresses over months to years.
    The research was conducted in collaboration with computational scientists at Lawrence Livermore National Laboratory and was published on November 15, 2023, at the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23). The conference is the leading global conference in the field of high-performance computing.
    “Modeling a patient’s 3D blood flow for even a single day would take a century’s worth of compute time on today’s best supercomputers,” said Cyrus Tanade, a PhD candidate in the laboratory of Amanda Randles, the Alfred Winborne and Victoria Stover Mordecai Associate Professor of Biomedical Sciences at Duke. “If we want to capture blood flow dynamics over long periods of time, we need a paradigm-shifting solution in how we approach 3D personalized simulations.”
    Over the past decade, researchers have steadily made progress toward accurately modeling the pressures and forces created by blood flowing through an individual’s specific vascular geometry. Randles, one of the leaders in the field, has developed a software package called HARVEY to tackle this challenge using the world’s fastest supercomputers.
    One of the most commonly accepted uses of such coronary digital twins is to determine whether or not a patient should receive a stent to treat a plaque or lesion. This computational method is much less invasive than the traditional approach of threading a probe on a guide wire into the artery itself.
    While this application requires only a handful of heartbeat simulations and works for a single snapshot in time, the field’s goal is to track pressure dynamics over weeks or months after a patient leaves a hospital. To get even 10 minutes of simulated data on the Duke group’s computer cluster, however, they had to lock it down for four months.

    “Obviously, that’s not a workable solution to help patients because of the computing costs and time requirements,” Randles said. “Think of it as taking three weeks to simulate what the weather will be like tomorrow. By the time you predict a rainstorm, the water would have already dried up.”
    To ever apply this technology to real-world people over the long term, researchers must find a way to reduce the computational load. The new paper introduces the Longitudinal Hemodynamic Mapping Framework, which cuts what used to take nearly a century of simulation time down to just 24 hours.
    “The solution is to simulate the heartbeats in parallel rather than sequentially by breaking the task up amongst many different nodes,” Tanade said. “Conventionally, the tasks are broken up spatially with parallel computing. But here, they’re broken up in time as well.”
    For example, one could reasonably assume that the specifics of a coronary flow at 10:00 am on a Monday will likely have little impact on the flow at 2:00 pm on a Wednesday. This allowed the team to develop a method to accurately simulate different chunks of time simultaneously and piece them back together. This breakdown made the pieces small enough to be simulated using cloud computing systems like Amazon Web Services rather than requiring large-scale supercomputers.
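    Splitting the work in time can be sketched as follows. This is not the LHMF code: `simulate_chunk` is a stand-in with an invented formula, the heart rates are toy data, and the sketch relies on the assumption stated above that distant chunks of time do not influence each other.

```python
from concurrent.futures import ThreadPoolExecutor

def simulate_chunk(heart_rates_bpm):
    """Stand-in for an expensive 3D flow simulation of consecutive heartbeats.

    Returns one toy 'wall stress' number per beat; the formula is invented.
    """
    return [0.02 * rate for rate in heart_rates_bpm]

def split_in_time(beats, chunk_size):
    """Cut the beat sequence into consecutive, independent chunks."""
    return [beats[i:i + chunk_size] for i in range(0, len(beats), chunk_size)]

def simulate_parallel(beats, chunk_size=3, workers=4):
    chunks = split_in_time(beats, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so stitching is a concatenation
        results = list(pool.map(simulate_chunk, chunks))
    return [value for chunk in results for value in chunk]

beats = [62, 64, 71, 75, 118, 121, 63, 74]  # toy smartwatch heart rates
stitched = simulate_parallel(beats)
print(stitched == simulate_chunk(beats))
```

    Threads are used here only to keep the example short. A real workload would send each chunk to a separate process or compute node, since the benefit comes from running the chunks on different hardware at the same time.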
    To put the mapping framework to the test, researchers used tried-and-true methods to simulate 750 heartbeats (about 10 minutes of biological time) with the lab’s allotment of computing time on Duke’s computer cluster. Using continuous data on heart rate and electrocardiography from a smartwatch, the simulation produced a complete set of 3D blood flow biomarkers that could correlate with disease progression and adverse events. It took four months to complete and exceeded the existing record by an order of magnitude.
    They then compared these results to those produced by LHMF running on Amazon Web Services and Summit, an Oak Ridge National Laboratory system, in just a few hours. The errors were negligible, proving that LHMF could work on a useful time scale.

    The team then further refined LHMF by introducing a clustering method, further reducing the computational costs and allowing them to track the frictional force of blood on vessel walls — a well-known biomarker of cardiovascular disease — for over 700,000 heartbeats, or one week of continuous activity. These results allowed the group to create a personalized, longitudinal hemodynamic map, showing how the forces vary over time and the percentage of time spent in various vulnerable states.
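    One way to picture the saving from clustering: group similar heartbeats, run the expensive simulation once per group, and weight each result by how often that group occurs. The sketch below uses simple 10-bpm heart-rate bins as a stand-in for the team's clustering method, which the article does not describe; the data and the stress formula are invented.

```python
from collections import Counter

call_log = []  # records every expensive call so the saving is visible

def expensive_simulation(heart_rate_bpm):
    """Stand-in for a full 3D simulation of one heartbeat (invented formula)."""
    call_log.append(heart_rate_bpm)
    return 0.02 * heart_rate_bpm

def bin_of(heart_rate_bpm, width=10):
    """Lower edge of the heart-rate bin that contains this beat."""
    return (heart_rate_bpm // width) * width

beats = [62, 64, 71, 75, 118, 121, 63, 74]  # toy smartwatch heart rates
counts = Counter(bin_of(b) for b in beats)

# Simulate one representative beat per bin (the bin midpoint) instead of every beat
representative_stress = {b: expensive_simulation(b + 5) for b in counts}

# Share of beats spent in each heart-rate state
time_fraction = {b: n / len(beats) for b, n in counts.items()}

print(f"{len(call_log)} simulations for {len(beats)} beats")
print(time_fraction)
```

    The map of time spent per state comes from the counts, which are cheap, while the costly simulation runs only once per group.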
    “The results significantly differed from those obtained over a single heartbeat,” Tanade said. “This demonstrates that capturing longitudinal blood flow metrics provides nuances and information that is otherwise not perceptible with the previous gold standard approach.”
    “If we can create a temporal map of wall stress in critical areas like the coronary artery, we could predict the risk of a patient developing atherosclerosis or the progression of such diseases,” Randles added. “This method could allow us to identify cases of heart disease much earlier than is currently possible.”
    This work was supported by the National Science Foundation (ACI-1548562, DGE164486), the Department of Energy (DE-AC52-07NA27344, DE-AC05-00OR22725), Amazon Web Services, the National Institutes of Health (DP1AG082343) and the Coulter Foundation.

    Immersive engagement in mixed reality can be measured with reaction time

    In the real world/digital world cross-over of mixed reality, a user’s immersive engagement with the program is called presence. Now, UMass Amherst researchers are the first to identify reaction time as a potential presence measurement tool. Their findings, published in IEEE Transactions on Visualization and Computer Graphics, have implications for calibrating mixed reality to the user in real time.
    “In virtual reality, the user is in the virtual world; they have no connection with their physical world around them,” explains Fatima Anwar, assistant professor of electrical and computer engineering, and an author on the paper. “Mixed reality is a combination of both: You can see your physical world, but then on top of that, you have that spatially related information that is virtual.” She gives attaching a virtual keyboard onto a physical table as an example. This is similar to augmented reality but takes it a step further by making the digital elements more interactive with the user and the environment.
    The uses for mixed reality are most obvious within gaming, but Anwar says that it’s rapidly expanding into other fields: academics, industry, construction and healthcare.
    However, mixed reality experiences vary in quality: “Does the user feel that they are present in that environment? How immersive do they feel? And how does that impact their interactions with the environment?” asks Anwar. This is what is defined as “presence.”
    Up to now, presence has been measured with subjective questionnaires after a user exits a mixed-reality program. Unfortunately, when presence is measured after the fact, it’s hard to capture a user’s feelings of the entire experience, especially during long exposure scenes. (Also, people are not very articulate in describing their feelings, making them an unreliable data source.) The ultimate goal is to have an instantaneous measure of presence so that the mixed reality simulation can be adjusted in the moment for optimal presence. “Oh, their presence is going down. Let’s do an intervention,” says Anwar.
    Yasra Chandio, doctoral candidate in computer engineering and lead study author, gives medical procedures as an example of the importance of this real-time presence calibration: If a surgeon needs millimeter-level precision, they may use mixed reality as a guide to tell them exactly where they need to operate.
    “If we just show the organ in front of them, and we don’t adjust for the height of the surgeon, for instance, that could be delaying the surgeon and could have inaccuracies for them,” she says. Low presence can also contribute to cybersickness, a feeling of dizziness or nausea that can occur in the body when a user’s bodily perception does not align with what they’re seeing. However, if the mixed reality system is internally monitoring presence, it can make adjustments in real-time, like moving the virtual organ rendering closer to eye level.

    One marker within mixed reality that can be measured continuously and passively is reaction time, or how quickly a user interacts with the virtual elements. Through a series of experiments, the researchers determined that reaction time is associated with presence: slow reaction times indicate low presence and fast reaction times indicate high presence, with 80% predictive accuracy even with the small dataset.
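    A continuous, passive signal like this lends itself to a simple running check. The sketch below is not the researchers' model: the window size, the tolerance factor and the sample values are hypothetical, chosen only to show how a system might flag a drop in presence while the session is still running.

```python
def needs_intervention(reaction_times_ms, baseline_ms, window=3, tolerance=1.25):
    """Flag low presence when recent reactions are much slower than baseline.

    window and tolerance are hypothetical tuning values, not from the study.
    """
    recent = reaction_times_ms[-window:]
    if len(recent) < window:
        return False  # not enough data yet
    return sum(recent) / window > baseline_ms * tolerance

steady = needs_intervention([400, 420, 410], baseline_ms=400)
slowing = needs_intervention([400, 520, 540, 560], baseline_ms=400)
print(steady, slowing)
```

    Averaging over a short window keeps a single slow reaction from triggering an adjustment.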
    To test this, the researchers put participants in modified “Fruit Ninja” mixed reality scenarios (without the scoring), adjusting how authentic the digital elements appeared to manipulate presence.
    Presence is a combination of two factors: place illusion and plausibility illusion. “First of all, virtual objects should look real,” says Anwar. That’s place illusion. “The objects should look how physical things look, and the second thing is: are they behaving in a real way? Do they follow the laws of physics while they’re behaving in the real world?” This is plausibility illusion.
    In one experiment, they adjusted place illusion and the fruit appeared either as lifelike fruit or abstract cartoons. In another experiment, they adjusted the plausibility illusion by showing mugs filling up with coffee either in the correct upright position or sideways.
    What they found: People reacted more quickly to the lifelike fruit than to the cartoonish-looking fruit. The same held for plausible versus implausible behavior of the coffee mugs.
    Reaction time is a good measure of presence because it highlights if the virtual elements are a tool or a distraction. “If a person is not feeling present, they would be looking into that environment and figuring out things,” explains Chandio. “Their cognition in perception is focused on something other than the task at hand, because they are figuring out what is going on.”
    “Some people are going to argue, ‘Why would you not create the best scene in the first place?’ but that’s because humans are very complex,” Chandio explains. “What works for me may not work for you may not work for Fatima, because we have different bodies, our hands move differently, we think of the world differently. We perceive differently.”

    AI may spare breast cancer patients unnecessary treatments

    A new AI (Artificial Intelligence) tool may make it possible to spare breast cancer patients unnecessary chemotherapy treatments by using a more precise method of predicting their outcomes, reports a new Northwestern Medicine study.
    AI evaluations of patient tissues were better at predicting the future course of a patient’s disease than evaluations performed by expert pathologists.
    The AI tool was able to identify breast cancer patients who are currently classified as high or intermediate risk but who become long-term survivors. That means the duration or intensity of their chemotherapy could be reduced. This is important since chemotherapy is associated with unpleasant and harmful side effects such as nausea, or more rarely, damage to the heart.
    Currently, pathologists evaluate cancerous cells in a patient’s tissue to determine treatment. But patterns of non-cancerous cells are very important in predicting outcomes, the study showed.
    This is the first study to use AI for comprehensive evaluation of both the cancerous and non-cancerous elements of invasive breast cancer.
    “Our study demonstrates the importance of non-cancer components in determining a patient’s outcome,” said corresponding study author Lee Cooper, associate professor of pathology at Northwestern University Feinberg School of Medicine. “The importance of these elements was known from biological studies, but this knowledge has not been effectively translated to clinical use.”
    The study will be published Nov. 27 in Nature Medicine.

    In 2023, about 300,000 U.S. women will receive a diagnosis of invasive breast cancer. About one in eight U.S. women will receive a breast cancer diagnosis in their lifetime.
    During diagnosis, a pathologist reviews the cancerous tissue to determine how abnormal the tissue appears. This process, known as grading, focuses on the appearance of cancer cells and has remained largely unchanged for decades. The grade, determined by the pathologist, is used to help determine what treatment a patient will receive.
    Many studies of breast cancer biology have shown that the non-cancerous cells, including cells from the immune system and cells that provide form and structure for the tissue, can play an important role in sustaining or inhibiting cancer growth.
    Cooper and colleagues built an AI model to evaluate breast cancer tissue from digital images that measures the appearance of both cancerous and non-cancerous cells, as well as interactions between them.
    “These patterns are challenging for a pathologist to evaluate as they can be difficult for the human eye to categorize reliably,” said Cooper, also a member of the Robert H. Lurie Comprehensive Cancer Center of Northwestern University. “The AI model measures these patterns and presents information to the pathologist in a way that makes the AI decision-making process clear to the pathologist.”
    The AI system analyzes 26 different properties of a patient’s breast tissue to generate an overall prognostic score. The system also generates individual scores for the cancer, immune and stromal cells to explain the overall score to the pathologist. For example, in some patients, a favorable prognosis score may be due to properties of their immune cells, where for others it may be due to properties of their cancer cells. This information could be used by a patient’s care team in creating an individualized treatment plan.
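    The relationship between an overall score and its per-cell-type explanations can be pictured with an additive toy model. This is not the study's model: the feature values and weights below are invented, and only the structure (component scores that sum to an overall score) follows the description above.

```python
# Invented feature values and weights, grouped by cell type (illustration only)
features = {"cancer": [0.5, 0.25], "immune": [0.75], "stromal": [0.5]}
weights = {"cancer": [-2.0, -4.0], "immune": [4.0], "stromal": [-1.0]}

def sub_scores(features, weights):
    """Weighted sum of each group's features, giving one score per cell type."""
    return {
        group: sum(w * x for w, x in zip(weights[group], values))
        for group, values in features.items()
    }

scores = sub_scores(features, weights)
overall = sum(scores.values())

# The component with the largest magnitude explains most of the overall score
main_driver = max(scores, key=lambda group: abs(scores[group]))

print(scores, overall, main_driver)
```

    In this toy case the favorable overall score is driven mainly by the immune component, mirroring the kind of explanation the article describes.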

    Adoption of the new model could provide patients diagnosed with breast cancers with a more accurate estimate of the risk associated with their disease, empowering them to make informed decisions about their clinical care, Cooper said.
    Additionally, this model may help in assessing therapeutic response, allowing treatment to be escalated or de-escalated depending on how the microscopic appearance of the tissue changes over time. For example, the tool may be able to recognize the effectiveness of a patient’s immune system in targeting the cancer during chemotherapy, which could be used to reduce the duration or intensity of chemotherapy.
    “We also hope that this model could reduce disparities for patients who are diagnosed in community settings,” Cooper said. “These patients may not have access to a pathologist who specializes in breast cancer, and our AI model could help a generalist pathologist when evaluating breast cancers.”
    How the study worked
    The study was conducted in collaboration with the American Cancer Society (ACS), which created a unique dataset of breast cancer patients through its Cancer Prevention Studies. This dataset has representation of patients from over 423 U.S. counties, many of whom received a diagnosis or care at community medical centers. This is important because most studies use data from large academic medical centers, which represent only a portion of the U.S. population. In this collaboration, Northwestern developed the AI software while scientists at the ACS and National Cancer Institute provided expertise on breast cancer epidemiology and clinical outcomes.
    To train the AI model, scientists required hundreds of thousands of human-generated annotations of cells and tissue structures within digital images of patient tissues. To achieve this, they created an international network of medical students and pathologists across several continents. These volunteers provided this data through a website over the course of several years to make it possible for the AI model to reliably interpret images of breast cancer tissue.
    Next the scientists will evaluate this model prospectively to validate it for clinical use. This coincides with the transition to using digital images for diagnosis at Northwestern Medicine, which will happen over the next three years.
    The scientists also are working to develop models for more specific types of breast cancers like triple-negative or HER2-positive. Invasive breast cancer encompasses several different categories, and the important tissue patterns may vary across these categories.
    “This will improve our ability to predict outcomes and will provide further insights into the biology of breast cancers,” Cooper said.
    Other Northwestern authors include Mohamed Amgad Tageldin, Kalliopi Siziopikou and Jeffery Goldstein.
    This research was supported by grants U01CA220401 and U24CA19436201 from the National Cancer Institute of the U.S. National Institutes of Health.

    How heat can be used in computing

    Physicists at Martin Luther University Halle-Wittenberg (MLU) and Central South University in China have demonstrated that, by combining specific materials, heat in technical devices can be used in computing. Their discovery is based on extensive calculations and simulations. The new approach demonstrates how heat signals can be steered and amplified for use in energy-efficient data processing. The team’s research findings have been published in the journal Advanced Electronic Materials.
    Electric current flowing through an electronic device heats it up. The generated heat is dissipated and energy is lost. “For decades, people have been looking for ways to re-use this lost energy in electronics,” explains Dr Jamal Berakdar, a professor of physics at MLU. This is extremely challenging, he says, because heat signals are difficult to direct and control accurately. However, both are necessary if heat signals are to be used to reliably process data.
    Berakdar carried out extensive calculations together with two colleagues from Central South University in China. The idea: instead of conventional electronic circuits, non-conductive magnetic strips are used in conjunction with normal metal spacers. “This unusual combination makes it possible to conduct and amplify heat signals in a controlled manner in order to power logical computing operations and heat diodes,” explains Berakdar.
    One disadvantage of the new method, however, is its speed. “This method does not produce the kind of computing speeds we see in modern smartphones,” says Berakdar. That is why the new method is currently probably less relevant for use in everyday electronics and is better suited for next generation computers which will be used to perform energy-saving calculations. “Our technology can contribute to saving energy in information technology by making good use of surplus heat,” Berakdar concludes.
    The study was funded by the German Research Foundation (DFG), the National Natural Science Foundation of China, the Natural Science Foundation of Hunan Province and as part of the Central South University Innovation-Driven Research Program.

    New method uses crowdsourced feedback to help train robots

    To teach an AI agent a new task, like how to open a kitchen cabinet, researchers often use reinforcement learning — a trial-and-error process where the agent is rewarded for taking actions that get it closer to the goal.
    In many instances, a human expert must carefully design a reward function, which is an incentive mechanism that gives the agent motivation to explore. The human expert must iteratively update that reward function as the agent explores and tries different actions. This can be time-consuming, inefficient, and difficult to scale up, especially when the task is complex and involves many steps.
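    For the cabinet example, a hand-designed reward function might look like the sketch below. The door-angle state, the 90-degree target and the bonus value are all hypothetical choices of the kind an expert would have to make and then keep tuning.

```python
def cabinet_reward(door_angle_deg, target_deg=90.0, tolerance_deg=2.0):
    """Hand-shaped reward: closer to fully open is better, with a bonus on success.

    All constants are hypothetical design choices, not from the paper.
    """
    distance = abs(target_deg - door_angle_deg)
    bonus = 10.0 if distance <= tolerance_deg else 0.0
    return bonus - distance / target_deg

for angle in (0.0, 45.0, 90.0):
    print(angle, cabinet_reward(angle))
```

    Even this one-step task needs three tuned constants. Tasks with many steps need many more, which is the scaling problem described above.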
    Researchers from MIT, Harvard University, and the University of Washington have developed a new reinforcement learning approach that doesn’t rely on an expertly designed reward function. Instead, it leverages crowdsourced feedback, gathered from many nonexpert users, to guide the agent as it learns to reach its goal.
    While some other methods also attempt to use nonexpert feedback, this new approach enables the AI agent to learn more quickly, even though data crowdsourced from users are often full of errors. These noisy data might cause other methods to fail.
    In addition, this new approach allows feedback to be gathered asynchronously, so nonexpert users around the world can contribute to teaching the agent.
    “One of the most time-consuming and challenging parts in designing a robotic agent today is engineering the reward function. Today reward functions are designed by expert researchers — a paradigm that is not scalable if we want to teach our robots many different tasks. Our work proposes a way to scale robot learning by crowdsourcing the design of reward function and by making it possible for nonexperts to provide useful feedback,” says Pulkit Agrawal, an assistant professor in the MIT Department of Electrical Engineering and Computer Science (EECS) who leads the Improbable AI Lab in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).
    In the future, this method could help a robot learn to perform specific tasks in a user’s home quickly, without the owner needing to show the robot physical examples of each task. The robot could explore on its own, with crowdsourced nonexpert feedback guiding its exploration.

    “In our method, the reward function guides the agent to what it should explore, instead of telling it exactly what it should do to complete the task. So, even if the human supervision is somewhat inaccurate and noisy, the agent is still able to explore, which helps it learn much better,” explains lead author Marcel Torne ’23, a research assistant in the Improbable AI Lab.
    Torne is joined on the paper by his MIT advisor, Agrawal; senior author Abhishek Gupta, assistant professor at the University of Washington; as well as others at the University of Washington and MIT. The research will be presented at the Conference on Neural Information Processing Systems next month.
    Noisy feedback
    One way to gather user feedback for reinforcement learning is to show a user two photos of states achieved by the agent, and then ask that user which state is closer to a goal. For instance, perhaps a robot’s goal is to open a kitchen cabinet. One image might show that the robot opened the cabinet, while the second might show that it opened the microwave. A user would pick the photo of the “better” state.
    Some previous approaches try to use this crowdsourced, binary feedback to optimize a reward function that the agent would use to learn the task. However, because nonexperts are likely to make mistakes, the reward function can become very noisy, so the agent might get stuck and never reach its goal.
    “Basically, the agent would take the reward function too seriously. It would try to match the reward function perfectly. So, instead of directly optimizing over the reward function, we just use it to tell the robot which areas it should be exploring,” Torne says.
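    The pairwise feedback described above can be made concrete with a small toy. The sketch below is not the researchers' code: it assumes states are positions 0 to 9 on a line, the goal is position 9, and a simulated annotator answers "which state is closer to the goal?" incorrectly 20 percent of the time. It then fits one score per state with a Bradley-Terry style logistic model. The names `noisy_preference` and `fit_scores`, the error rate, and the number of comparisons are all illustrative assumptions.

```python
import math
import random

def noisy_preference(a, b, goal, error_rate, rng):
    """Return 1 if the simulated annotator says state a is closer to the goal
    than state b, else 0. The answer is flipped with probability error_rate."""
    truth = 1 if abs(a - goal) < abs(b - goal) else 0
    return 1 - truth if rng.random() < error_rate else truth

def fit_scores(n_states, comparisons, epochs=200, lr=1.0):
    """Fit one score per state by gradient ascent on the log-likelihood of a
    Bradley-Terry style model: P(a preferred over b) = sigmoid(score_a - score_b)."""
    scores = [0.0] * n_states
    for _ in range(epochs):
        grads = [0.0] * n_states
        for a, b, label in comparisons:
            p_a = 1.0 / (1.0 + math.exp(scores[b] - scores[a]))
            grads[a] += label - p_a
            grads[b] -= label - p_a
        # Average the gradient over all comparisons before stepping.
        scores = [s + lr * g / len(comparisons) for s, g in zip(scores, grads)]
    return scores

# Hypothetical setup: 10 states on a line, goal at the far end, 20% wrong answers.
rng = random.Random(0)
N_STATES = 10
GOAL = 9
comparisons = []
for _ in range(2000):
    a, b = rng.sample(range(N_STATES), 2)
    comparisons.append((a, b, noisy_preference(a, b, GOAL, 0.2, rng)))

scores = fit_scores(N_STATES, comparisons)
ranking = sorted(range(N_STATES), key=lambda s: scores[s], reverse=True)
print("states ranked from most to least promising:", ranking)
```

    Even with one answer in five flipped, the fitted scores still separate states near the goal from states far from it, although neighboring states can swap places. That residual error is the kind of noise the article says can mislead an agent that optimizes such a signal directly as its reward.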

    He and his collaborators decoupled the process into two separate parts, each directed by its own algorithm. They call their new reinforcement learning method HuGE (Human Guided Exploration).
    On one side, a goal selector algorithm is continuously updated with crowdsourced human feedback. The feedback is not used as a reward function, but rather to guide the agent’s exploration. In a sense, the nonexpert users drop breadcrumbs that incrementally lead the agent toward its goal.
    On the other side, the agent explores on its own, in a self-supervised manner guided by the goal selector. It collects images or videos of actions that it tries, which are then sent to humans and used to update the goal selector.
    This narrows down the area for the agent to explore, leading it to more promising areas that are closer to its goal. But if there is no feedback, or if feedback takes a while to arrive, the agent will keep learning on its own, albeit more slowly. This enables feedback to be gathered infrequently and asynchronously.
    “The exploration loop can keep going autonomously, because it is just going to explore and learn new things. And then when you get some better signal, it is going to explore in more concrete ways. You can just keep them turning at their own pace,” adds Torne.
    And because the feedback is just gently guiding the agent’s behavior, it will eventually learn to complete the task even if users provide incorrect answers.
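    The decoupled structure described above, with a goal selector fed by occasional feedback and an agent that keeps exploring regardless, can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not the HuGE implementation: the world is a line of states 0 to `n`, the goal is state `n`, the agent is assumed to be able to return to any state it has already visited, and feedback is simulated with a fixed error rate. The function `run_guided_exploration` and every parameter in it are hypothetical.

```python
import random

def run_guided_exploration(n=10, error_rate=0.2, feedback_every=5,
                           pairs_per_batch=20, steps_per_round=5,
                           epsilon=0.2, max_rounds=10_000, seed=0):
    """Toy 1-D exploration loop: states 0..n, goal at n.
    Returns (reached_goal, rounds_used)."""
    rng = random.Random(seed)
    visited = {0}
    wins = {}   # state -> number of comparisons it won
    seen = {}   # state -> number of comparisons it took part in

    def score(s):
        # Smoothed win rate; states with no feedback yet score 0.5.
        return (wins.get(s, 0) + 1) / (seen.get(s, 0) + 2)

    for round_no in range(1, max_rounds + 1):
        # Goal selector side: a batch of noisy comparisons arrives only
        # every `feedback_every` rounds (simulated asynchronous feedback).
        if round_no % feedback_every == 0 and len(visited) >= 2:
            for _ in range(pairs_per_batch):
                a, b = rng.sample(sorted(visited), 2)
                closer, other = (a, b) if a > b else (b, a)
                winner = other if rng.random() < error_rate else closer
                wins[winner] = wins.get(winner, 0) + 1
                seen[a] = seen.get(a, 0) + 1
                seen[b] = seen.get(b, 0) + 1

        # Exploration side: usually start from the best-scored visited
        # state, occasionally from a random one.
        candidates = sorted(visited)
        if rng.random() < epsilon:
            start = rng.choice(candidates)
        else:
            start = max(candidates, key=score)

        # Take a few random steps from the chosen state and record them.
        pos = start
        for _ in range(steps_per_round):
            pos = min(n, max(0, pos + rng.choice((-1, 1))))
            visited.add(pos)
            if pos == n:
                return True, round_no
    return False, max_rounds

reached, rounds = run_guided_exploration()
print("with feedback:", reached, rounds)

# With feedback effectively switched off, the loop still runs on its own.
reached_alone, rounds_alone = run_guided_exploration(feedback_every=10**9)
print("without feedback:", reached_alone, rounds_alone)
```

    The feedback only changes where exploration starts, never what counts as success, so a wrong answer costs a few wasted rounds rather than derailing the run. The second call mirrors the article's point that exploration continues when no feedback arrives.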
    Faster learning
    The researchers tested this method on a number of simulated and real-world tasks. In simulation, they used HuGE to effectively learn tasks with long sequences of actions, such as stacking blocks in a particular order or navigating a large maze.
    In real-world tests, they used HuGE to train robotic arms to draw the letter “U” and to pick and place objects. For these tests, they crowdsourced data from 109 nonexpert users in 13 different countries spanning three continents.
    In real-world and simulated experiments, HuGE helped agents learn to achieve the goal faster than other methods.
    The researchers also found that data crowdsourced from nonexperts yielded better performance than synthetic data, which were produced and labeled by the researchers. For nonexpert users, labeling 30 images or videos took fewer than two minutes.
    “This makes it very promising in terms of being able to scale up this method,” Torne adds.
    In a related paper, which the researchers presented at the recent Conference on Robot Learning, they enhanced HuGE so an AI agent can learn to perform the task, and then autonomously reset the environment to continue learning. For instance, if the agent learns to open a cabinet, the method also guides the agent to close the cabinet.
    “Now we can have it learn completely autonomously without needing human resets,” he says.
    The researchers also emphasize that, in this and other learning approaches, it is critical to ensure that AI agents are aligned with human values.
    In the future, they want to continue refining HuGE so the agent can learn from other forms of communication, such as natural language and physical interactions with the robot. They are also interested in applying this method to teach multiple agents at once.
    This research is funded, in part, by the MIT-IBM Watson AI Lab.