More stories

  • AI systems are already skilled at deceiving and manipulating humans

    Many artificial intelligence (AI) systems have already learned how to deceive humans, even systems that have been trained to be helpful and honest. In a review article published in the journal Patterns on May 10, researchers describe the risks of deception by AI systems and call for governments to develop strong regulations to address this issue as soon as possible.
    “AI developers do not have a confident understanding of what causes undesirable AI behaviors like deception,” says first author Peter S. Park, an AI existential safety postdoctoral fellow at MIT. “But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals.”
    Park and colleagues analyzed literature focusing on ways in which AI systems spread false information — through learned deception, in which they systematically learn to manipulate others.
    The most striking example of AI deception the researchers uncovered in their analysis was Meta’s CICERO, an AI system designed to play the game Diplomacy, which is a world-conquest game that involves building alliances. Even though Meta claims it trained CICERO to be “largely honest and helpful” and to “never intentionally backstab” its human allies while playing the game, the data the company published along with its Science paper revealed that CICERO didn’t play fair.
    “We found that Meta’s AI had learned to be a master of deception,” says Park. “While Meta succeeded in training its AI to win in the game of Diplomacy — CICERO placed in the top 10% of human players who had played more than one game — Meta failed to train its AI to win honestly.”
    Other AI systems demonstrated the ability to bluff in a game of Texas hold ’em poker against professional human players, to fake attacks during the strategy game StarCraft II in order to defeat opponents, and to misrepresent their preferences in order to gain the upper hand in economic negotiations.
    While it may seem harmless for AI systems to cheat at games, such behavior can lead to “breakthroughs in deceptive AI capabilities” that can spiral into more advanced forms of AI deception in the future, Park added.

    Some AI systems have even learned to cheat tests designed to evaluate their safety, the researchers found. In one study, AI organisms in a digital simulator “played dead” in order to trick a test built to eliminate AI systems that rapidly replicate.
    “By systematically cheating the safety tests imposed on it by human developers and regulators, a deceptive AI can lead us humans into a false sense of security,” says Park.
    The major near-term risks of deceptive AI include making it easier for hostile actors to commit fraud and tamper with elections, warns Park. Eventually, if these systems can refine this unsettling skill set, humans could lose control of them, he says.
    “We as a society need as much time as we can get to prepare for the more advanced deception of future AI products and open-source models,” says Park. “As the deceptive capabilities of AI systems become more advanced, the dangers they pose to society will become increasingly serious.”
    While Park and his colleagues do not think society has the right measures in place yet to address AI deception, they are encouraged that policymakers have begun taking the issue seriously through measures such as the EU AI Act and President Biden’s AI Executive Order. But it remains to be seen, Park says, whether policies designed to mitigate AI deception can be strictly enforced, given that AI developers do not yet have the techniques to keep these systems in check.
    “If banning AI deception is politically infeasible at the current moment, we recommend that deceptive AI systems be classified as high risk,” says Park.
    This work was supported by the MIT Department of Physics and the Beneficial AI Foundation.

  • AI intervention mitigates tension among conflicting ethnic groups

    Prejudice and fear have always been at the core of intergroup hostilities.
    While intergroup interaction is a prerequisite for initiating peace and stability at the junction of clashing interests, values, and cultures, the risk that direct interaction will further escalate tensions cannot be ruled out. In particular, a shortage of impartial, nonpartisan personnel to properly manage an electronic contact — or E-contact — session may cause the process to backfire and become destabilized.
    Now, a research team including Kyoto University has shown that interactive AI programs may help reduce prejudice and anxiety among historically divided ethnic groups in Afghanistan during online interactions.
    “Compared to the control group, participants in the AI intervention group showed more engagement in our study and significantly less prejudice and anxiety toward other ethnic groups,” says Sofia Sahab of KyotoU’s Graduate School of Informatics.
    In collaboration with Nagoya University, Nagoya Institute of Technology, and Hokkaido University, Sahab’s team tested the effectiveness of using a CAI — or conversational AI — on the discussion platform D-Agree to facilitate unbiased and constructive conversations. The program provides participants with a safe, private space to talk freely, a setting that is commonly taken for granted in war-free countries.
    “Our over-decade-long work on AI agent-based consensus-building support has empirically demonstrated AI agents’ applicability in de-escalating confrontational situations,” remarks co-author Takayuki Ito, also of the informatics school.
    Sahab’s team used a randomized controlled experiment to determine whether conversational AI facilitation of online discussions causally reduces prejudice and anxiety.

    Participants from three ethnic backgrounds were divided into two groups — an AI group and a non-AI control group — to gauge the effects. As expected, the former expressed more empathy toward outside groups than participants in the control group.
    “The neutral AI agents aim to reduce risks by coordinating guided conversations as naturally as possible. By providing fair and cost-effective strategies to encourage positive interactions, we can promote lasting harmony among diverse ethnic groups,” adds Sahab.
    In the long term, the researchers are considering the potential for AI intervention beyond border conflicts to promote positive social change.
    “AI may have come at a pivotal time to aid humanity in enhancing social sustainability with CAI-mediated human interactions,” reflects Sahab.

  • Blockchain could offer a solution to the UK’s transport ticketing systems

    A new approach to transport ticketing offers a step towards an integrated, transparent system that works efficiently for both ticket providers and passengers across all modes of transport.
    Traditional ticketing systems are built on solutions that suffer from a lack of transferability across multi-modal transport networks and an inability to adapt to policy changes and new technologies.
    Experts at the University of Birmingham have outlined a system that offers a new foundation for all ticketing providers. In a new paper published in IET Blockchain, they describe STUB (System for Ticketing Ubiquity within Blockchains), which brings together the capabilities of two versatile technologies: blockchain and ontology.
    A blockchain is a distributed ledger that records transactions in a way that ensures security, transparency, and immutability. An ontology is a formal representation of the concepts within a domain and the relationships between them, used to model and manage complex information systems.
    The researchers showed how both technologies could be combined to create a robust, transparent, and interconnected data framework that ensures consistent and reliable shared knowledge.
    Utilising these data structures, ticket providers can sell and validate tokenised tickets on the blockchain, ensuring universal accessibility across all providers. The integration of ontology allows providers to capture and share contextual information about the transport network, enabling providers to offer comprehensive data about routes, schedules, and availability, thereby streamlining the ticketing process.
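    As a hypothetical illustration of those two ingredients (all identifiers and field names below are invented for illustration, not taken from the STUB paper), a tokenised ticket can be a hashed record committed to a ledger, while ontology-style triples carry the shared context about routes and providers:

    ```python
    # Hypothetical sketch: a tokenised ticket plus ontology-style context.
    # Field names and identifiers are illustrative, not the STUB schema.
    from dataclasses import dataclass
    import hashlib, json, time

    @dataclass(frozen=True)
    class TicketToken:
        ticket_id: str
        issuer: str          # any participating provider
        origin: str
        destination: str
        valid_from: float

        def digest(self) -> str:
            """Content hash that a blockchain entry would commit to."""
            payload = json.dumps(self.__dict__, sort_keys=True).encode()
            return hashlib.sha256(payload).hexdigest()

    # (subject, predicate, object) triples let providers share contextual
    # knowledge about the network alongside the tokens themselves.
    triples = [
        ("route:BHM-EUS", "operatedBy", "provider:ExampleRail"),
        ("route:BHM-EUS", "mode",       "rail"),
        ("ticket:42",     "validOn",    "route:BHM-EUS"),
    ]

    tkt = TicketToken("ticket:42", "provider:ExampleRail", "BHM", "EUS", time.time())
    print(tkt.digest()[:16], triples[-1])
    ```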
    Lead author, Dr Joe Preece, said: “Transport systems around the world are becoming increasingly interconnected. Ticketing systems are key to this and there is a growing interest in the use of smarter transport ticketing that harnesses emerging technologies to overcome the limitations of traditional systems.

    “The system we have devised enables ticket providers to operate in a more transparent, flexible environment, that will ultimately offer passengers a more user-friendly experience.
    “STUB is not intended to be a single central data platform with transport policy baked in, but a policy-agnostic approach that empowers existing ticket providers and technologies to share core ticketing data and to build new solutions on top of it.
    “In essence, this may provide a modernised approach to the Rail Settlement Plan that enables multi-modal ticketing, automated revenue and refund allocation, and dynamic fare pricing, whilst retaining the technologies in the sector that already work well.”
    The next step for the team will be to set up a pilot scheme for the technology in a regional transport network, to demonstrate its efficacy, and to get feedback from ticket operators and passengers.
    “A big challenge to implementation will be the integration with existing ticketing infrastructure to work alongside the current standardised approaches whilst we scale up the technology. Setting up a successful pilot will be key to breaking down these barriers.”

  • AI knowledge gets your foot in the door

    Employers are significantly more likely to offer job interviews and higher salaries to graduates with experience of artificial intelligence, according to new research published in the journal Oxford Economic Papers.
    Researchers from Anglia Ruskin University (ARU) conducted an experiment in which they submitted CVs from British 21-year-old applicants holding a 2:1 degree to job vacancies. Some of the applicants possessed AI capital — they had studied an ‘AI in business’ module — and this was mentioned in the cover letter for the application.
    A matched pair of male applicants, one with AI capital and the other without, submitted applications, resulting in a total of 1,360 applications from male applicants to 680 UK companies. A total of 1,316 similarly matched applications from female applicants were sent to 658 firms.
    Male applicants with AI capital received an interview invitation in 54% of cases, whereas male applicants without AI capital were invited to interview in 28% of cases.
    Female applicants with AI capital received an interview invitation in 50% of cases, whereas female applicants without AI capital received one in 32% of cases.
    Applicants with AI capital were 36 percentage points more likely to be invited to an interview by large firms than by small and medium-sized firms.
    Male applicants with AI qualifications were shortlisted for jobs offering wages that were, on average, 12% higher than those for male applicants without AI capital, while female applicants with AI qualifications were offered interviews for jobs with wages that were, on average, 13% higher than those for female applicants without AI capital.

    Lead author Professor Nick Drydakis, Professor of Economics at Anglia Ruskin University (ARU), said: “In the UK, AI is causing dramatic shifts in the workforce, and firms need to respond to these demands by upgrading their workforces through enhancing their AI skill levels.
    “Our study clearly indicates that employers value AI knowledge and skills among job applicants. Those applicants with AI capital were significantly more likely to be invited to interview and were also more likely to have access to better paid jobs.
    “Job applicants with AI capital might possess the knowledge, skills and capabilities related to data analysis, data-driven decision-making, creativity, innovation, and effective communication, among other factors. These skills can enhance business operations, making them more efficient and potentially contributing to increased productivity within a firm.
    “Larger firms particularly valued AI capital, possibly because they tend to undergo more AI-based structural technological transformations and have greater capacity for innovation.”

  • Learning the imperfections: New approach to using neural networks for low-power digital pre-distortion (DPD) in mmWave systems

    Engineers at Tokyo Institute of Technology (Tokyo Tech) have demonstrated a simple computational approach for improving the linearization of power amplifiers (PAs), such as those used in mmWave and other telecommunication systems. The proposed technique involves training small neural networks to directly estimate the coefficients of a digital pre-distortion (DPD) polynomial from the amplifier’s frequency response during calibration sweeps.
    In the world around us, a quiet but very important evolution has been taking place in engineering over the last decades. As technology evolves, it becomes increasingly clear that building devices that are physically as close as possible to perfect is not always the right approach, because it often leads to designs that are expensive, complex to build, and power-hungry. Engineers, especially electronic engineers, have become very skilled at using highly imperfect devices in ways that allow them to behave close enough to the ideal case to be useful. Historically, a well-known example is that of disk drives, where advances in control systems have made it possible to achieve incredible densities while using electromechanical hardware littered with imperfections, such as nonlinearities and instabilities of various kinds.
    A similar problem has been emerging for radio communication systems. As carrier frequencies keep increasing and channel packing becomes denser, the linearity requirements for the radio-frequency power amplifiers (RF-PAs) used in telecommunication systems have been getting increasingly stringent. Traditionally, the best linearity is provided by designs known as “Class A,” which sacrifice great amounts of power to maintain operation in a region where transistors respond in the most linear possible way. On the other hand, highly energy-efficient designs are affected by nonlinearities that render them unstable without suitable correction. The situation has been getting worse because the modulation schemes used by the latest cellular systems have a very high power ratio between the lowest- and highest-intensity symbols. Specific RF-PA types such as Doherty amplifiers are highly suitable and power-efficient, but their native non-linearity is not acceptable.
    Over the last two decades, high-speed digital signal processing has become widely available, economical, and power-efficient, leading to the emergence of algorithms that allow the real-time correction of amplifier non-linearities by intentionally “distorting” the signal in a way that compensates for the amplifier’s physical response. These algorithms have become collectively known as digital pre-distortion (DPD), and represent an evolution of earlier implementations of the same approach in the analog domain. Over the years, many types of DPD algorithms have been proposed, typically involving real-time feedback from the amplifier through a so-called “observation signal,” and fairly intense calculations. While this approach has been instrumental in the development of third- and fourth-generation cellular networks (3G, 4G), it falls short of the emerging requirements for fifth-generation (5G) networks for two reasons. First, dense antenna arrays are subject to significant disturbances between adjacent elements, known as cross-talk, making it difficult to obtain clean observation signals and causing instability. The situation is made considerably worse by the use of ever-increasing frequencies. Second, dense arrays of antennas require very low-power solutions, which is not compatible with complex processing taking place for each individual element.
    “We came up with a solution to this problem starting from two well-established mathematical facts. First, when a non-linearity is applied to a sinusoidal signal, it distorts it, leading to the appearance of new frequencies. Their intensity provides a sort of signature that, if the non-linearity is a polynomial, is almost uniquely associated with a set of coefficients. Second, multi-layer neural networks of the early kinds, introduced decades ago, are universal function approximators and are therefore capable of learning such an association, and inverting it,” explains Prof. Ludovico Minati, leading inventor of the patent on which the study is based and formerly a specially-appointed associate professor at Tokyo Tech.
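    A minimal sketch of that inversion in NumPy and scikit-learn; the test tone, polynomial order, network size, and training set below are illustrative assumptions, not the patented design:

    ```python
    # Sketch: learn the inverse map from harmonic "signatures" back to the
    # coefficients of a polynomial nonlinearity. Illustrative sizes only.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    N, FS, F0 = 1024, 1024.0, 16.0            # samples, sample rate, test tone (Hz)
    t = np.arange(N) / FS
    x = np.sin(2 * np.pi * F0 * t)            # calibration sinusoid

    def harmonic_signature(coeffs, x, n_harmonics=5):
        """Apply y = c1*x + c2*x^2 + c3*x^3 and return harmonic magnitudes."""
        y = sum(c * x ** (k + 1) for k, c in enumerate(coeffs))
        spec = np.abs(np.fft.rfft(y)) / len(x)
        bins = [int(F0 * (k + 1) * len(x) / FS) for k in range(n_harmonics)]
        return spec[bins]

    # Training set: random polynomial nonlinearities -> their signatures.
    C = rng.uniform(-1, 1, size=(5000, 3))    # third-order polynomials
    S = np.array([harmonic_signature(c, x) for c in C])

    # A small multi-layer network approximates signature -> coefficients.
    net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
    net.fit(S, C)

    c_true = np.array([1.0, 0.2, -0.1])
    c_est = net.predict(harmonic_signature(c_true, x).reshape(1, -1))[0]
    print("true:", c_true, "estimated:", np.round(c_est, 3))
    ```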
    The most recent types of RF-PAs based on CMOS technology, even when they are heavily nonlinear, tend to have a relatively simple response, free from memory effects. “This implies that the DPD problem can be reduced to finding the coefficients of a suitable polynomial, in a way that is quick and stable enough for real-world operation,” explains Dr. Aravind Tharayil Narayanan, lead author of the study. Through a dedicated hardware architecture, the engineers at the Nano Sensing Unit of Tokyo Tech were able to implement a system that automatically determines the polynomial coefficients for DPD, based on a limited amount of data that can be acquired within a few milliseconds. Performing calibration in the “foreground,” that is, one path at a time, reduces issues related to cross-talk and greatly simplifies the design. While no observation signal is needed, the calibration can adjust itself to varying conditions through the input of additional signals, such as die temperature, power supply voltage, and the settings of the phase shifters and couplers connecting the antenna. While standards compliance may pose some limitations, the approach is in principle widely applicable.
    “Because there is very limited processing happening in real-time, the hardware complexity is truly reduced to a minimum, and the power efficiency is maximized. Our results prove that this approach could in principle be sufficiently effective to support the most recent emerging standards. Another very convenient feature is that a considerable amount of hardware can be shared between elements, which is particularly convenient in dense array designs,” added Prof. Hiroyuki Ito, head of the Nano Sensing Unit of Tokyo Tech where the technology was developed. As part of an industry-academia collaboration effort funded by NEDO, the authors were able to test the concept on realistic, leading-edge hardware operating at 28 GHz provided by Fujitsu Limited, working in close collaboration with a team of engineers in the Product Planning Division of the Mobile System Business Unit. Future work will include large-scale implementation using dedicated ASIC designs, detailed standards compliance analysis, and realistic benchmarking in the field under a variety of settings.
    An international PCT application for the methodology and design has been filed.

  • Good vibrations: New tech may lead to smaller, more powerful wireless devices

    What if your earbuds could do everything your smartphone can do already, except better? What sounds a bit like science fiction may actually not be so far off. A new class of synthetic materials could herald the next revolution of wireless technologies, enabling devices to be smaller, require less signal strength and use less power.
    The key to these advances lies in what experts call phononics, which is similar to photonics. Both take advantage of similar physical laws and offer new ways to advance technology. While photonics takes advantage of photons — or light — phononics does the same with phonons, the quasiparticles that transmit mechanical vibrations through a material, akin to sound but at frequencies much too high to hear.
    In a paper published in Nature Materials, researchers at the University of Arizona Wyant College of Optical Sciences and Sandia National Laboratories report clearing a major milestone toward real-world applications based on phononics. By combining highly specialized semiconductor materials and piezoelectric materials not typically used together, the researchers were able to generate giant nonlinear interactions between phonons. Together with previous innovations demonstrating amplifiers for phonons using the same materials, this opens up the possibility of making wireless devices such as smartphones or other data transmitters smaller, more efficient and more powerful.
    “Most people would probably be surprised to hear that there are something like 30 filters inside their cell phone whose sole job it is to transform radio waves into sound waves and back,” said the study’s senior author, Matt Eichenfield, who holds a joint appointment at the UArizona College of Optical Sciences and Sandia National Laboratories in Albuquerque, New Mexico.
    Part of what are known as front-end processors, these piezoelectric filters, made on special microchips, must convert electronic waves into sound waves and back multiple times each time a smartphone receives or sends data, he said. Because they can’t be made from the same material, such as silicon, as the other critically important chips in the front-end processor, the physical size of the device is much bigger than it needs to be, and the repeated conversions between radio waves and sound waves introduce losses that add up and degrade performance, Eichenfield said.
    “Normally, phonons behave in a completely linear fashion, meaning they don’t interact with each other,” he said. “It’s a bit like shining one laser pointer beam through another; they just go through each other.”
    Nonlinear phononics refers to what happens in special materials when the phonons can and do interact with each other, Eichenfield said. In the paper, the researchers demonstrated what he calls “giant phononic nonlinearities.” The synthetic materials produced by the research team caused the phonons to interact with each other much more strongly than in any conventional material.

    “In the laser pointer analogy, this would be like changing the frequency of the photons in the first laser pointer when you turn on the second,” he said. “As a result, you’d see the beam from the first one changing color.”
    With the new phononics materials, the researchers demonstrated that one beam of phonons can, in fact, change the frequency of another beam. What’s more, they showed that phonons can be manipulated in ways that could only be realized with transistor-based electronics — until now.
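    In textbook terms, the effect described here is frequency mixing: a quadratic nonlinearity acting on two tones generates sum- and difference-frequency components. The generic relation below is the standard picture, not the paper’s device-specific model:

    ```latex
    % Two phonon tones entering a quadratically nonlinear medium:
    %   u(t) = A_1 cos(omega_1 t) + A_2 cos(omega_2 t)
    \[
      u(t)^2 \;\supset\; A_1 A_2 \cos\big((\omega_1 + \omega_2)t\big)
                       + A_1 A_2 \cos\big((\omega_1 - \omega_2)t\big)
    \]
    % Energy appears at the new frequencies omega_1 + omega_2 and
    % omega_1 - omega_2: the phononic analogue of one beam changing
    % the "color" of another.
    ```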
    The group has been working toward the goal of making all of the components needed for radio frequency signal processors using acoustic wave technologies instead of transistor-based electronics on a single chip, in a way that’s compatible with standard microprocessor manufacturing, and the latest publication proves that it can be done. Previously, the researchers succeeded in making acoustic components including amplifiers, switches and others. With the acoustic mixers described in the latest publication, they have added the last piece of the puzzle.
    “Now, you can point to every component in a diagram of a radiofrequency front-end processor and say, ‘Yeah, I can make all of these on one chip with acoustic waves,'” Eichenfield said. “We’re ready to move on to making the whole shebang in the acoustic domain.”
    Having all the components needed to make a radio frequency front end on a single chip could shrink devices such as cell phones and other wireless communication gadgets by as much as a factor of 100, according to Eichenfield.
    The team accomplished its proof of principle by combining highly specialized materials into microelectronics-sized devices through which they sent acoustic waves. Specifically, they took a silicon wafer with a thin layer of lithium niobate — a synthetic material used extensively in piezoelectronic devices and cell phones — and added an ultra-thin layer (fewer than 100 atoms thick) of a semiconductor containing indium gallium arsenide.

    “When we combined these materials in just the right way, we were able to experimentally access a new regime of phononic nonlinearity,” said Sandia engineer Lisa Hackett, the lead author on the paper. “This means we have a path forward to inventing high-performance tech for sending and receiving radio waves that’s smaller than has ever been possible.”
    In this setup, acoustic waves moving through the system behave in nonlinear ways when they travel through the materials. This effect can be used to change frequencies and encode information. A staple of photonics, nonlinear effects have long been used to convert invisible infrared laser light into the visible beams of laser pointers, but taking advantage of nonlinear effects in phononics has been hindered by limitations in technology and materials. For example, while lithium niobate is one of the most nonlinear phononic materials known, its usefulness for technical applications is limited by the fact that those nonlinearities are very weak when the material is used on its own.
    By adding the indium gallium arsenide semiconductor, Eichenfield’s group created an environment in which acoustic waves traveling through the material influence the distribution of electrical charges in the semiconductor film, causing the acoustic waves to mix in specific ways that can be controlled and opening up the system to various applications.
    “The effective nonlinearity you can generate with these materials is hundreds or even thousands of times larger than was possible before, which is crazy,” Eichenfield said. “If you could do the same for nonlinear optics, you would revolutionize the field.”
    With physical size being one of the fundamental limitations of current state-of-the-art radiofrequency processing hardware, the new technology could open the door to electronic devices that are even more capable than their current counterparts, according to the authors. Communication devices that take up virtually no space and offer better signal coverage and longer battery life are on the horizon.

  • New machine learning algorithm promises advances in computing

    Systems controlled by next-generation computing algorithms could give rise to better and more efficient machine learning products, a new study suggests.
    Using machine learning tools to create a digital twin, or a virtual copy, of an electronic circuit that exhibits chaotic behavior, researchers found they could predict how it would behave and use that information to control it.
    Many everyday devices, like thermostats and cruise control, utilize linear controllers — which use simple rules to direct a system to a desired value. Thermostats, for example, employ such rules to determine how much to heat or cool a space based on the difference between the current and desired temperatures.
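    In code, such a linear rule fits in a few lines. The sketch below is a generic proportional thermostat controller, offered for illustration rather than taken from the study:

    ```python
    # Generic proportional (linear) controller: the command is simply a
    # constant gain times the error. Illustrative only.
    def thermostat_step(current_temp, setpoint, gain=0.5):
        error = setpoint - current_temp   # distance from the target value
        return gain * error               # positive -> heat, negative -> cool

    print(thermostat_step(18.0, 21.0))    # 1.5 units of heating
    ```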
    Yet because of how straightforward these algorithms are, they struggle to control systems that display complex behavior, like chaos.
    As a result, advanced devices like self-driving cars and aircraft often rely on machine learning-based controllers, which use intricate networks to learn the optimal control algorithm needed to operate well. However, these algorithms have significant drawbacks, the most serious of which is that they can be extremely challenging and computationally expensive to implement.
    Now, having access to an efficient digital twin is likely to have a sweeping impact on how scientists develop future autonomous technologies, said Robert Kent, lead author of the study and a graduate student in physics at The Ohio State University.
    “The problem with most machine learning-based controllers is that they use a lot of energy or power and they take a long time to evaluate,” said Kent. “Developing traditional controllers for them has also been difficult because chaotic systems are extremely sensitive to small changes.”
    These issues, he said, are critical in situations where milliseconds can make a difference between life and death, such as when self-driving vehicles must decide to brake to prevent an accident.

    The study was published recently in Nature Communications.
    Compact enough to fit on an inexpensive computer chip capable of balancing on your fingertip, and able to run without an internet connection, the team’s digital twin was built to optimize a controller’s efficiency and performance, which the researchers found reduced power consumption. It achieves this quite easily, mainly because it was trained using a type of machine learning approach called reservoir computing.
    “The great thing about the machine learning architecture we used is that it’s very good at learning the behavior of systems that evolve in time,” Kent said. “It’s inspired by how connections spark in the human brain.”
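    The sketch below shows a minimal echo state network, one common form of reservoir computing, in NumPy. The reservoir size, demo signal, and ridge-regression readout are illustrative assumptions, not the published model; the point is that only the final linear readout is trained, which is why training is cheap:

    ```python
    # Minimal echo state network: a fixed random recurrent "reservoir"
    # plus a trained linear readout. Illustrative sizes and task only.
    import numpy as np

    rng = np.random.default_rng(1)
    n_in, n_res = 1, 200
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.normal(0, 1, (n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius < 1

    def run_reservoir(u):
        """Drive the fixed reservoir with input sequence u; collect states."""
        x, states = np.zeros(n_res), []
        for u_t in u:
            x = np.tanh(W @ x + W_in @ np.atleast_1d(u_t))
            states.append(x.copy())
        return np.array(states)

    # Teach the readout to predict the signal one step ahead.
    u = np.sin(np.linspace(0, 60, 3000)) * np.cos(np.linspace(0, 17, 3000))
    X, y = run_reservoir(u[:-1]), u[1:]

    # Only this linear readout is trained: a single ridge-regression solve.
    W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
    pred = X @ W_out
    print("one-step RMS error:", np.sqrt(np.mean((pred - y) ** 2)))
    ```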
    Although similarly sized computer chips have been used in devices like smart fridges, according to the study, this novel computing ability makes the new model especially well-equipped to handle dynamic systems such as self-driving vehicles as well as heart monitors, which must be able to quickly adapt to a patient’s heartbeat.
    “Big machine learning models have to consume lots of power to crunch data and come out with the right parameters, whereas our model and training is so extremely simple that you could have systems learning on the fly,” he said.
    To test this theory, researchers directed their model to complete complex control tasks and compared its results to those from previous control techniques. The study revealed that their approach achieved a higher accuracy at the tasks than its linear counterpart and is significantly less computationally complex than a previous machine learning-based controller.

    “The increase in accuracy was pretty significant in some cases,” said Kent. Though the outcome showed that their algorithm does require more energy than a linear controller to operate, this tradeoff means that when it is powered up, the team’s model lasts longer and is considerably more efficient than current machine learning-based controllers on the market.
    “People will find good use out of it just based on how efficient it is,” Kent said. “You can implement it on pretty much any platform and it’s very simple to understand.” The algorithm was recently made available to scientists.
    Outside of inspiring potential advances in engineering, there’s also an equally important economic and environmental incentive for creating more power-friendly algorithms, said Kent.
    As society becomes more dependent on computers and AI for nearly all aspects of daily life, demand for data centers is soaring, leading many experts to worry over digital systems’ enormous power appetite and what future industries will need to do to keep up with it.
    And because building these data centers as well as large-scale computing experiments can generate a large carbon footprint, scientists are looking for ways to curb carbon emissions from this technology.
    To advance their results, future work will likely be steered toward training the model to explore other applications like quantum information processing, Kent said. In the meantime, he expects that these new elements will reach far into the scientific community.
    “Not enough people know about these types of algorithms in the industry and engineering, and one of the big goals of this project is to get more people to learn about them,” said Kent. “This work is a great first step toward reaching that potential.”
    This study was supported by the U.S. Air Force’s Office of Scientific Research. Other Ohio State co-authors include Wendson A.S. Barbosa and Daniel J. Gauthier.

  • A better way to control shape-shifting soft robots

    Imagine a slime-like robot that can seamlessly change its shape to squeeze through narrow spaces, which could be deployed inside the human body to remove an unwanted item.
    While such a robot does not yet exist outside a laboratory, researchers are working to develop reconfigurable soft robots for applications in health care, wearable devices, and industrial systems.
    But how can one control a squishy robot that doesn’t have joints, limbs, or fingers that can be manipulated, and instead can drastically alter its entire shape at will? MIT researchers are working to answer that question.
    They developed a control algorithm that can autonomously learn how to move, stretch, and shape a reconfigurable robot to complete a specific task, even when that task requires the robot to change its morphology multiple times. The team also built a simulator to test control algorithms for deformable soft robots on a series of challenging, shape-changing tasks.
    Their method completed each of the eight tasks they evaluated while outperforming other algorithms. The technique worked especially well on multifaceted tasks. For instance, in one test, the robot had to reduce its height while growing two tiny legs to squeeze through a narrow pipe, and then un-grow those legs and extend its torso to open the pipe’s lid.
    While reconfigurable soft robots are still in their infancy, such a technique could someday enable general-purpose robots that can adapt their shapes to accomplish diverse tasks.
    “When people think about soft robots, they tend to think about robots that are elastic, but return to their original shape. Our robot is like slime and can actually change its morphology. It is very striking that our method worked so well because we are dealing with something very new,” says Boyuan Chen, an electrical engineering and computer science (EECS) graduate student and co-author of a paper on this approach.

    Chen’s co-authors include lead author Suning Huang, an undergraduate student at Tsinghua University in China who completed this work while a visiting student at MIT; Huazhe Xu, an assistant professor at Tsinghua University; and senior author Vincent Sitzmann, an assistant professor of EECS at MIT who leads the Scene Representation Group in the Computer Science and Artificial Intelligence Laboratory. The research will be presented at the International Conference on Learning Representations.
    Controlling dynamic motion
    Scientists often teach robots to complete tasks using a machine-learning approach known as reinforcement learning, which is a trial-and-error process in which the robot is rewarded for actions that move it closer to a goal.
    This can be effective when the robot’s moving parts are consistent and well-defined, like a gripper with three fingers. With a robotic gripper, a reinforcement learning algorithm might move one finger slightly, learning by trial and error whether that motion earns it a reward. Then it would move on to the next finger, and so on.
    But shape-shifting robots, which are controlled by magnetic fields, can dynamically squish, bend, or elongate their entire bodies.
    “Such a robot could have thousands of small pieces of muscle to control, so it is very hard to learn in a traditional way,” says Chen.

    To solve this problem, he and his collaborators had to think about it differently. Rather than moving each tiny muscle individually, their reinforcement learning algorithm begins by learning to control groups of adjacent muscles that work together.
    Then, after the algorithm has explored the space of possible actions by focusing on groups of muscles, it drills down into finer detail to optimize the policy, or action plan, it has learned. In this way, the control algorithm follows a coarse-to-fine methodology.
    “Coarse-to-fine means that when you take a random action, that random action is likely to make a difference. The change in the outcome is likely very significant because you coarsely control several muscles at the same time,” Sitzmann says.
    To enable this, the researchers treat a robot’s action space, or how it can move in a certain area, like an image.
    Their machine-learning model uses images of the robot’s environment to generate a 2D action space, which includes the robot and the area around it. They simulate robot motion using what is known as the material point method, where the action space is covered by points, like image pixels, and overlaid with a grid.
    In the same way that nearby pixels in an image are related (like the pixels that form a tree in a photo), they built their algorithm to understand that nearby action points have stronger correlations. Points around the robot’s “shoulder” will move similarly when it changes shape, while points on the robot’s “leg” will also move similarly, but in a different way than those on the “shoulder.”
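    The sketch below illustrates one way such a coarse-to-fine 2D action grid can work: a low-resolution action controls blocks of nearby muscle points together, and a full-resolution residual then refines it. The grid sizes and nearest-neighbour upsampling are assumptions for illustration, not the authors’ exact parameterization:

    ```python
    # Coarse-to-fine action grid sketch: one coarse cell drives a whole
    # block of nearby "muscle" points; a fine residual refines the result.
    import numpy as np

    H, W = 32, 32                     # full-resolution grid, one value per point
    rng = np.random.default_rng(0)

    def upsample(a, factor):
        """Nearest-neighbour upsample so one coarse cell drives a block."""
        return np.repeat(np.repeat(a, factor, axis=0), factor, axis=1)

    # Stage 1: a coarse 8x8 action; each cell moves a 4x4 group of points,
    # so even a random coarse action changes the outcome noticeably.
    coarse = rng.uniform(-1, 1, (8, 8))
    action = upsample(coarse, 4)                      # shape (32, 32)

    # Stage 2: a small full-resolution correction refines the fine detail.
    action += 0.1 * rng.uniform(-1, 1, (H, W))

    print(action.shape)   # (32, 32); nearby points remain strongly correlated
    ```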
    In addition, the researchers use the same machine-learning model to look at the environment and predict the actions the robot should take, which makes it more efficient.
    Building a simulator
    After developing this approach, the researchers needed a way to test it, so they created a simulation environment called DittoGym.
    DittoGym features eight tasks that evaluate a reconfigurable robot’s ability to dynamically change shape. In one, the robot must elongate and curve its body so it can weave around obstacles to reach a target point. In another, it must change its shape to mimic letters of the alphabet.
    “Our task selection in DittoGym follows both generic reinforcement learning benchmark design principles and the specific needs of reconfigurable robots. Each task is designed to represent certain properties that we deem important, such as the capability to navigate through long-horizon explorations, the ability to analyze the environment, and interact with external objects,” Huang says. “We believe they together can give users a comprehensive understanding of the flexibility of reconfigurable robots and the effectiveness of our reinforcement learning scheme.”
    Their algorithm outperformed baseline methods and was the only technique suitable for completing multistage tasks that required several shape changes.
    “We have a stronger correlation between action points that are closer to each other, and I think that is key to making this work so well,” says Chen.
    While it may be many years before shape-shifting robots are deployed in the real world, Chen and his collaborators hope their work inspires other scientists not only to study reconfigurable soft robots but also to think about leveraging 2D action spaces for other complex control problems.