More stories

  • New test reveals AI still lacks common sense

    Natural language processing (NLP) has taken great strides recently — but how much does AI understand of what it reads? Less than we thought, according to researchers at USC’s Department of Computer Science. In a recent paper, Assistant Professor Xiang Ren and PhD student Yuchen Lin found that, despite advances, AI still doesn’t have the common sense needed to generate plausible sentences.
    “Current machine text-generation models can write an article that may be convincing to many humans, but they’re basically mimicking what they have seen in the training phase,” said Lin. “Our goal in this paper is to study the problem of whether current state-of-the-art text-generation models can write sentences to describe natural scenarios in our everyday lives.”
    Understanding scenarios in daily life
    Specifically, Ren and Lin tested the models’ ability to reason and showed there is a large gap between current text generation models and human performance. Given a set of common nouns and verbs, state-of-the-art NLP computer models were tasked with creating believable sentences describing an everyday scenario. While the models generated grammatically correct sentences, they were often logically incoherent.
    For instance, here’s one example sentence generated by a state-of-the-art model using the words “dog, frisbee, throw, catch”:
    “Two dogs are throwing frisbees at each other.”
    The test is based on the assumption that coherent ideas (in this case, “a person throws a frisbee and a dog catches it”) can’t be generated without a deeper awareness of common-sense concepts. In other words, common sense is more than just the correct understanding of language — it means you don’t have to explain everything in a conversation. This is a fundamental challenge in the goal of developing generalizable AI — but beyond academia, it’s relevant for consumers, too.

    Without such an understanding, chatbots and voice assistants built on these state-of-the-art natural-language models are vulnerable to failure. The same understanding is also crucial if robots are to become more present in human environments. After all, if you ask a robot for hot milk, you expect it to know you want a cup of milk, not the whole carton.
    “We also show that if a generation model performs better on our test, it can also benefit other applications that need commonsense reasoning, such as robotic learning,” said Lin. “Robots need to understand natural scenarios in our daily life before they make reasonable actions to interact with people.”
    Joining Lin and Ren on the paper are USC’s Wangchunshu Zhou, Ming Shen and Pei Zhou; Chandra Bhagavatula from the Allen Institute for Artificial Intelligence; and Yejin Choi from the Allen Institute for Artificial Intelligence and the Paul G. Allen School of Computer Science & Engineering at the University of Washington.
    The common sense test
    Common-sense reasoning, or the ability to make inferences using basic knowledge about the world — like the fact that dogs cannot throw frisbees to each other — has resisted AI researchers’ efforts for decades. State-of-the-art deep-learning models can now reach around 90% accuracy on existing benchmarks, so it would seem that NLP has gotten closer to its goal.

    But Ren, an expert in natural language processing, and Lin, his student, needed more convincing about this statistic’s accuracy. In their paper, published in the Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) on Nov. 16, they challenge the effectiveness of the benchmark and, therefore, the level of progress the field has actually made.
    “Humans acquire the ability to compose sentences by learning to understand and use common concepts that they recognize in their surrounding environment,” said Lin.
    “Acquiring this ability is regarded as a major milestone in human development. But we wanted to test if machines can really acquire such generative commonsense reasoning ability.”
    To evaluate different machine models, the pair developed a constrained text-generation task called CommonGen, which can be used as a benchmark to test the generative common sense of machines. The researchers presented a dataset consisting of 35,141 concepts associated with 77,449 sentences. They found that even the best-performing model achieved an accuracy rate of only 31.6%, versus 63.5% for humans.
    “We were surprised that the models cannot recall the simple commonsense knowledge that ‘a human throwing a frisbee’ should be much more reasonable than a dog doing it,” said Lin. “We find even the strongest model, called the T5, after training with a large dataset, can still make silly mistakes.”
    It seems, said the researchers, that previous tests have not sufficiently challenged the models on their common sense abilities, instead mimicking what they have seen in the training phase.
    “Previous studies have primarily focused on discriminative common sense,” said Ren. “They test machines with multi-choice questions, where the search space for the machine is small — usually four or five candidates.”
    A typical setting for discriminative common-sense testing is a multiple-choice question-answering task, for example: “Where do adults use glue sticks?” A: classroom B: office C: desk drawer.
    The answer here, of course, is “B: office.” Even computers can figure this out without much trouble. In contrast, a generative setting is more open-ended, such as the CommonGen task, where a model is asked to generate a natural sentence from given concepts.
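    To make the generative setting concrete, here is a minimal sketch of how a sequence-to-sequence model such as T5 might be prompted with a CommonGen-style concept set; the checkpoint name and prompt format are illustrative assumptions rather than the authors’ exact training setup.
    ```python
    # Minimal sketch of querying a seq2seq model with a CommonGen-style concept set.
    # Assumptions: "t5-base" is an off-the-shelf stand-in (the paper fine-tunes T5 on
    # CommonGen), and the prompt format below is illustrative, not the authors' setup.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    model_name = "t5-base"
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    concepts = ["dog", "frisbee", "throw", "catch"]
    prompt = "generate a sentence with: " + " ".join(concepts)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_length=32, num_beams=5)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    # A model with generative common sense should prefer something like
    # "A boy throws a frisbee and the dog catches it." over
    # "Two dogs are throwing frisbees at each other."
    ```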
    Ren explains: “With extensive model training, it is very easy to have a good performance on those tasks. Unlike those discriminative commonsense reasoning tasks, our proposed test focuses on the generative aspect of machine common sense.”
    Ren and Lin hope the data set will serve as a new benchmark to benefit future research about introducing common sense to natural language generation. In fact, they even have a leaderboard depicting scores achieved by the various popular models to help other researchers determine their viability for future projects.
    “Robots need to understand natural scenarios in our daily life before they make reasonable actions to interact with people,” said Lin.
    “By introducing common sense and other domain-specific knowledge to machines, I believe that one day we can see AI agents such as Samantha in the movie Her that generate natural responses and interact with our lives.”

  • Researcher aids in the development of a pathway to solve cybersickness

    Bas Rokers, Associate Professor of Psychology and Director of the Neuroimaging Center at NYU Abu Dhabi, and a team of researchers have evaluated the state of research on cybersickness and formulated a research and development agenda to eliminate cybersickness, allowing for broader adoption of immersive technologies.
    In the paper titled Identifying Causes of and Solutions for Cybersickness in Immersive Technology: Reformulation of a Research and Development Agenda, published in the International Journal of Human-Computer Interaction, Rokers and his team discuss the process of creating a research and development agenda based on participant feedback from a workshop titled Cybersickness: Causes and Solutions and analysis of related research. The new agenda recommends prioritizing the creation of powerful, lightweight, and untethered head-worn displays, reducing visual latencies, standardizing symptom and aftereffect measurement, developing improved countermeasures, and improving the understanding of the magnitude of the problem and its implications for job performance.
    The results of this study identify a clear path towards solving cybersickness and allowing for the widespread use of immersive technologies. In addition to their use in entertainment and gaming, VR and AR have significant applications in education, manufacturing, training, health care, retail, and tourism. For example, they can enable educators to introduce students to distant locations and immerse them in a way that textbooks cannot. They can also allow healthcare workers to reach patients in remote and underserved areas, where they can provide diagnostics, surgical planning and image-guided treatment.
    “As there are possible applications across many industries, understanding how to identify and evaluate the opportunities for mass adoption and the collaborative use of AR and VR is critical,” said Rokers. “Achieving the goal of resolving cybersickness will allow the world to embrace the potential of immersive technology to enhance training, performance, and recreation.”

    Story Source:
    Materials provided by New York University. Note: Content may be edited for style and length.

  • New electronic chip delivers smarter, light-powered AI

    Researchers have developed artificial intelligence technology that brings together imaging, processing, machine learning and memory in one electronic chip, powered by light.
    The prototype shrinks artificial intelligence technology by imitating the way that the human brain processes visual information.
    The nanoscale advance combines the core software needed to drive artificial intelligence with image-capturing hardware in a single electronic device.
    With further development, the light-driven prototype could enable smarter and smaller autonomous technologies like drones and robotics, plus smart wearables and bionic implants like artificial retinas.
    The study, from an international team of Australian, American and Chinese researchers led by RMIT University, is published in the journal Advanced Materials.
    Lead researcher Associate Professor Sumeet Walia, from RMIT, said the prototype delivered brain-like functionality in one powerful device.

    “Our new technology radically boosts efficiency and accuracy by bringing multiple components and functionalities into a single platform,” said Walia, who also co-leads the Functional Materials and Microsystems Research Group.
    “It’s getting us closer to an all-in-one AI device inspired by nature’s greatest computing innovation — the human brain.
    “Our aim is to replicate a core feature of how the brain learns, through imprinting vision as memory.
    “The prototype we’ve developed is a major leap forward towards neurorobotics, better technologies for human-machine interaction and scalable bionic systems.”
    Total package: advancing AI
    Typically, artificial intelligence relies heavily on software and off-site data processing.

    The new prototype aims to integrate electronic hardware and intelligence together, for fast on-site decisions.
    “Imagine a dash cam in a car that’s integrated with such neuro-inspired hardware — it can recognise lights, signs, objects and make instant decisions, without having to connect to the internet,” Walia said.
    “By bringing it all together into one chip, we can deliver unprecedented levels of efficiency and speed in autonomous and AI-driven decision-making.”
    The technology builds on an earlier prototype chip from the RMIT team, which used light to create and modify memories.
    New built-in features mean the chip can now capture and automatically enhance images, classify numbers, and be trained to recognise patterns and images with an accuracy rate of over 90%.
    The device is also readily compatible with existing electronics and silicon technologies, for effortless future integration.
    Seeing the light: how the tech works
    The prototype is inspired by optogenetics, an emerging tool in biotechnology that allows scientists to delve into the body’s electrical system with great precision and use light to manipulate neurons.
    The AI chip is based on an ultra-thin material — black phosphorus — that changes electrical resistance in response to different wavelengths of light.
    The different functionalities such as imaging or memory storage are achieved by shining different colours of light on the chip.
    Study lead author Dr Taimur Ahmed, from RMIT, said light-based computing was faster, more accurate and required far less energy than existing technologies.
    “By packing so much core functionality into one compact nanoscale device, we can broaden the horizons for machine learning and AI to be integrated into smaller applications,” Ahmed said.
    “Using our chip with artificial retinas, for example, would enable scientists to miniaturise that emerging technology and improve accuracy of the bionic eye.
    “Our prototype is a significant advance towards the ultimate in electronics: a brain-on-a-chip that can learn from its environment just like we do.”

  • Machine learning innovation to develop chemical library

    Machine learning has been used widely in the chemical sciences for drug design and other processes.
    However, models that are prospectively tested on new reaction outcomes, and that help humans interpret the chemical reactivity decisions such models make, remain extremely limited.
    Purdue University innovators have introduced chemical reactivity flowcharts to help chemists interpret reaction outcomes using statistically robust machine learning models trained on a small number of reactions. The work is published in Organic Letters.
    “Developing new and fast reactions is essential for chemical library design in drug discovery,” said Gaurav Chopra, an assistant professor of analytical and physical chemistry in Purdue’s College of Science. “We have developed a new, fast and one-pot multicomponent reaction (MCR) of N-sulfonylimines that was used as a representative case for generating training data for machine learning models, predicting reaction outcomes and testing new reactions in a blind prospective manner.
    “We expect this work to pave the way in changing the current paradigm by developing accurate, human understandable machine learning models to interpret reaction outcomes that will augment the creativity and efficiency of human chemists to discover new chemical reactions and enhance organic and process chemistry pipelines.”
    Chopra said the Purdue team’s human-interpretable machine learning approach, introduced as chemical reactivity flowcharts, can be extended to explore the reactivity of any MCR or any chemical reaction. It does not require large-scale robotics, since these methods can be used by chemists while doing reaction screening in their laboratories.
    “We provide the first report of a framework to combine fast synthetic chemistry experiments and quantum chemical calculations for understanding reaction mechanism and human-interpretable statistically robust machine learning models to identify chemical patterns for predicting and experimentally testing heterogeneous reactivity of N-sulfonylimines,” Chopra said.
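    As a rough illustration of how a small, human-interpretable model can be read off as a reactivity flowchart, the sketch below fits a shallow decision tree to hypothetical reaction descriptors; the feature names, data and labels are invented for illustration and are not the Purdue team’s descriptors or dataset.
    ```python
    # Illustrative only: a shallow decision tree over invented reaction descriptors.
    # Its learned splits print as a flowchart a chemist can follow by hand; it is not
    # the Purdue team's model, descriptors or data.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    # Hypothetical descriptors: [electrophilicity, steric bulk, electron withdrawal]
    X = rng.uniform(0.0, 1.0, size=(60, 3))
    # Toy rule standing in for experimentally observed outcomes (1 = reacts)
    y = ((X[:, 0] > 0.5) & (X[:, 1] < 0.6)).astype(int)

    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=["electrophilicity",
                                           "steric_bulk",
                                           "electron_withdrawal"]))
    ```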
    This work aligns with other innovations and research from Chopra’s labs, whose team members work with the Purdue Research Foundation Office of Technology Commercialization to patent numerous technologies.
    “The unprecedented use of a machine learning model in generating chemical reactivity flowcharts helped us to understand the reactivity of traditionally used different N-sulfonylimines in MCRs,” said Krupal Jethava, a postdoctoral fellow in Chopra’s laboratory, who co-authored the work. “We believe that working hand-in-hand with organic and computational chemists will open up a new avenue for solving complex chemical reactivity problems for other reactions in the future.”
    Chopra said the Purdue researchers hope their work will become one of many examples showcasing the power of machine learning for new synthetic methodology development, for drug design and beyond.
    “In this work, we strived to ensure that our machine learning model can be easily understood by chemists not well versed in this field,” said Jonathan Fine, a former Purdue graduate student, who co-authored the work. “We believe that these models can be used not only to predict reactions but also to better understand when a given reaction will occur. To demonstrate this, we used our model to guide additional substrates to test whether a reaction will occur.”

    Story Source:
    Materials provided by Purdue University. Original written by Chris Adam. Note: Content may be edited for style and length.

  • Could your vacuum be listening to you?

    A team of researchers demonstrated that popular robotic household vacuum cleaners can be remotely hacked to act as microphones.
    The researchers — including Nirupam Roy, an assistant professor in the University of Maryland’s Department of Computer Science — collected information from the laser-based navigation system in a popular vacuum robot and applied signal processing and deep learning techniques to recover speech and identify television programs playing in the same room as the device.
    The research demonstrates the potential for any device that uses light detection and ranging (Lidar) technology to be manipulated for collecting sound, despite not having a microphone. The work, a collaboration with Assistant Professor Jun Han at the National University of Singapore, was presented at the Association for Computing Machinery’s Conference on Embedded Networked Sensor Systems (SenSys 2020) on November 18, 2020.
    “We welcome these devices into our homes, and we don’t think anything about it,” said Roy, who holds a joint appointment in the University of Maryland Institute for Advanced Computer Studies (UMIACS). “But we have shown that even though these devices don’t have microphones, we can repurpose the systems they use for navigation to spy on conversations and potentially reveal private information.”
    The Lidar navigation systems in household vacuum bots shine a laser beam around a room and sense the reflection of the laser as it bounces off nearby objects. The robot uses the reflected signals to map the room and avoid collisions as it moves through the house.
    Privacy experts have suggested that the maps made by vacuum bots, which are often stored in the cloud, pose potential privacy breaches that could give advertisers access to information about such things as home size, which suggests income level, and other lifestyle-related information. Roy and his team wondered if the Lidar in these robots could also pose potential security risks as sound recording devices in users’ homes or businesses.

    Sound waves cause objects to vibrate, and these vibrations cause slight variations in the light bouncing off an object. Laser microphones, used in espionage since the 1940s, are capable of converting those variations back into sound waves. But laser microphones rely on a targeted laser beam reflecting off very smooth surfaces, such as glass windows.
    A vacuum Lidar, on the other hand, scans the environment with a laser and senses the light scattered back by objects that are irregular in shape and density. The scattered signal received by the vacuum’s sensor provides only a fraction of the information needed to recover sound waves. The researchers were unsure if a vacuum bot’s Lidar system could be manipulated to function as a microphone and if the signal could be interpreted into meaningful sound signals.
    First, the researchers hacked a robot vacuum to show they could control the position of the laser beam and send the sensed data to their laptops through Wi-Fi without interfering with the device’s navigation.
    Next, they conducted experiments with two sound sources. One source was a human voice reciting numbers played over computer speakers and the other was audio from a variety of television shows played through a TV sound bar. Roy and his colleagues then captured the laser signal sensed by the vacuum’s navigation system as it bounced off a variety of objects placed near the sound source. Objects included a trash can, cardboard box, takeout container and polypropylene bag — items that might normally be found on a typical floor.
    The researchers passed the signals they received through deep learning algorithms that were trained to either match human voices or to identify musical sequences from television shows. Their computer system, which they call LidarPhone, identified and matched spoken numbers with 90% accuracy. It also identified television shows from a minute’s worth of recording with more than 90% accuracy.
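    The recovery pipeline described above, filtering the weak reflected-intensity trace and handing it to a trained classifier, might look roughly like the sketch below; the filter band, frame parameters and classifier are placeholder assumptions, not the published LidarPhone implementation.
    ```python
    # Rough sketch: band-pass filter a lidar reflected-intensity trace, convert it to
    # a spectrogram, and pass the features to a trained classifier. The filter band,
    # frame parameters and classifier are placeholders, not the LidarPhone code.
    import numpy as np
    from scipy.signal import butter, sosfiltfilt, spectrogram

    def intensity_to_features(intensity: np.ndarray, fs: float) -> np.ndarray:
        """Turn a 1-D reflected-intensity trace into a log-spectrogram feature map."""
        # Keep roughly the speech band (assumed here to be 85 Hz up to Nyquist).
        sos = butter(4, [85.0, min(4000.0, fs / 2 - 1.0)],
                     btype="bandpass", fs=fs, output="sos")
        filtered = sosfiltfilt(sos, intensity - intensity.mean())
        _, _, sxx = spectrogram(filtered, fs=fs, nperseg=256, noverlap=128)
        return np.log1p(sxx)

    # features = intensity_to_features(trace, fs=sample_rate)  # from the hacked sensor
    # digit = trained_digit_classifier(features)                # hypothetical model
    ```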

    “This type of threat may be more important now than ever, when you consider that we are all ordering food over the phone and having meetings over the computer, and we are often speaking our credit card or bank information,” Roy said. “But what is even more concerning for me is that it can reveal much more personal information. This kind of information can tell you about my living style, how many hours I’m working, other things that I am doing. And what we watch on TV can reveal our political orientations. That is crucial for someone who might want to manipulate the political elections or target very specific messages to me.”
    The researchers emphasize that vacuum cleaners are just one example of potential vulnerability to Lidar-based spying. Many other devices could be open to similar attacks, such as smartphone infrared sensors used for face recognition or passive infrared sensors used for motion detection.
    “I believe this is significant work that will make the manufacturers aware of these possibilities and trigger the security and privacy community to come up with solutions to prevent these kinds of attacks,” Roy said.
    This research was partially supported by a grant from the Singapore Ministry of Education Academic Research Fund Tier 1 (Award No. R-252-000-A26-133).
    The research paper, Spying with Your Robot Vacuum Cleaner: Eavesdropping via Lidar Sensors, by Sriram Sami, Yimin Dai, Sean Rui Xiang Tan, Nirupam Roy and Jun Han, was presented on November 18, 2020, at the Association for Computing Machinery’s SenSys 2020 conference.

  • Upgraded radar can enable self-driving cars to see clearly no matter the weather

    A new kind of radar could make it possible for self-driving cars to navigate safely in bad weather. Electrical engineers at the University of California San Diego developed a clever way to improve the imaging capability of existing radar sensors so that they accurately predict the shape and size of objects in the scene. The system worked well when tested at night and in foggy conditions.
    The team will present their work at the SenSys conference, held Nov. 16 to 19.
    Inclement weather conditions pose a challenge for self-driving cars. These vehicles rely on technology like LiDAR and radar to “see” and navigate, but each has its shortcomings. LiDAR, which works by bouncing laser beams off surrounding objects, can paint a high-resolution 3D picture on a clear day, but it cannot see in fog, dust, rain or snow. On the other hand, radar, which transmits radio waves, can see in all weather, but it only captures a partial picture of the road scene.
    Enter a new UC San Diego technology that improves how radar sees.
    “It’s a LiDAR-like radar,” said Dinesh Bharadia, a professor of electrical and computer engineering at the UC San Diego Jacobs School of Engineering. It’s an inexpensive approach to achieving bad weather perception in self-driving cars, he noted. “Fusing LiDAR and radar can also be done with our techniques, but radars are cheap. This way, we don’t need to use expensive LiDARs.”
    The system consists of two radar sensors placed on the hood and spaced an average car’s width apart (1.5 meters). Having two radar sensors arranged this way is key — they enable the system to see more space and detail than a single radar sensor.

    During test drives on clear days and nights, the system performed as well as a LiDAR sensor at determining the dimensions of cars moving in traffic. Its performance did not change in tests simulating foggy weather. The team “hid” another vehicle using a fog machine and their system accurately predicted its 3D geometry. The LiDAR sensor essentially failed the test.
    Two eyes are better than one
    The reason radar traditionally suffers from poor imaging quality is that when radio waves are transmitted and bounced off objects, only a small fraction of the signal ever gets reflected back to the sensor. As a result, vehicles, pedestrians and other objects appear as a sparse set of points.
    “This is the problem with using a single radar for imaging. It receives just a few points to represent the scene, so the perception is poor. There can be other cars in the environment that you don’t see,” said Kshitiz Bansal, a computer science and engineering Ph.D. student at UC San Diego. “So if a single radar is causing this blindness, a multi-radar setup will improve perception by increasing the number of points that are reflected back.”
    The team found that spacing two radar sensors 1.5 meters apart on the hood of the car was the optimal arrangement. “By having two radars at different vantage points with an overlapping field of view, we create a region of high-resolution, with a high probability of detecting the objects that are present,” Bansal said.

    A tale of two radars
    The system overcomes another problem with radar: noise. It is common to see random points, which do not belong to any objects, appear in radar images. The sensor can also pick up what are called echo signals, which are reflections of radio waves that are not directly from the objects that are being detected.
    More radars mean more noise, Bharadia noted. So the team developed new algorithms that can fuse the information from two different radar sensors together and produce a new image free of noise.
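    One simple way to exploit two overlapping radar views for denoising, in the spirit of the description above, is to keep only the points each sensor can corroborate in the other’s view; the baseline, threshold and point-cloud format below are assumptions for illustration, not the UC San Diego algorithm.
    ```python
    # Illustrative cross-checking of two radar point clouds: shift radar B's points
    # into radar A's frame (sensors assumed 1.5 m apart on the hood), then keep only
    # points corroborated by the other sensor. A simplified stand-in, not the paper's method.
    import numpy as np
    from scipy.spatial import cKDTree

    BASELINE = np.array([1.5, 0.0, 0.0])  # assumed offset between the two radars (m)

    def fuse_radars(points_a: np.ndarray, points_b: np.ndarray, tol: float = 0.5) -> np.ndarray:
        """Return points (in radar A's frame) that both sensors agree on."""
        points_b_in_a = points_b + BASELINE
        tree_a, tree_b = cKDTree(points_a), cKDTree(points_b_in_a)
        keep_a = tree_b.query(points_a)[0] < tol       # A-points confirmed by B
        keep_b = tree_a.query(points_b_in_a)[0] < tol  # B-points confirmed by A
        return np.vstack([points_a[keep_a], points_b_in_a[keep_b]])
    ```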
    Another innovation of this work is that the team constructed the first dataset combining data from two radars.
    “There are currently no publicly available datasets with this kind of data, from multiple radars with an overlapping field of view,” Bharadia said. “We collected our own data and built our own dataset for training our algorithms and for testing.”
    The dataset consists of 54,000 radar frames of driving scenes during the day and night in live traffic, and in simulated fog conditions. Future work will include collecting more data in the rain. To do this, the team will first need to build better protective covers for their hardware.
    The team is now working with Toyota to fuse the new radar technology with cameras. The researchers say this could potentially replace LiDAR. “Radar alone cannot tell us the color, make or model of a car. These features are also important for improving perception in self-driving cars,” Bharadia said.
    Video: https://www.youtube.com/watch?v=5BrC0Jt4xUc&feature=emb_logo

  • Machine learning guarantees robots' performance in unknown territory

    A small drone takes a test flight through a space filled with randomly placed cardboard cylinders acting as stand-ins for trees, people or structures. The algorithm controlling the drone has been trained on a thousand simulated obstacle-laden courses, but it’s never seen one like this. Still, nine times out of 10, the pint-sized plane dodges all the obstacles in its path.
    This experiment is a proving ground for a pivotal challenge in modern robotics: the ability to guarantee the safety and success of automated robots operating in novel environments. As engineers increasingly turn to machine learning methods to develop adaptable robots, new work by Princeton University researchers makes progress on such guarantees for robots in contexts with diverse types of obstacles and constraints.
    “Over the last decade or so, there’s been a tremendous amount of excitement and progress around machine learning in the context of robotics, primarily because it allows you to handle rich sensory inputs,” like those from a robot’s camera, and map these complex inputs to actions, said Anirudha Majumdar, an assistant professor of mechanical and aerospace engineering at Princeton.
    However, robot control algorithms based on machine learning run the risk of overfitting to their training data, which can make algorithms less effective when they encounter inputs that differ from those they were trained on. Majumdar’s Intelligent Robot Motion Lab addressed this challenge by expanding the suite of available tools for training robot control policies, and quantifying the likely success and safety of robots performing in novel environments.
    In three new papers, the researchers adapted machine learning frameworks from other arenas to the field of robot locomotion and manipulation. They turned to generalization theory, which is typically used in contexts that map a single input onto a single output, such as automated image tagging. The new methods are among the first to apply generalization theory to the more complex task of making guarantees on robots’ performance in unfamiliar settings. While other approaches have provided such guarantees under more restrictive assumptions, the team’s methods offer more broadly applicable guarantees on performance in novel environments, said Majumdar.
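    As a loose illustration of the flavor of such guarantees (not the specific bounds derived in these papers), an empirical success rate measured over sampled training environments can be turned into a probabilistic lower bound on performance in new environments drawn from the same distribution, for example with a simple Hoeffding-style bound:
    ```python
    # Toy illustration: with probability at least 1 - delta over the sampled
    # environments, the true success rate is at least the empirical rate minus a
    # confidence term. A textbook Hoeffding-style bound, not the papers' bounds.
    import math

    def success_rate_lower_bound(successes: int, trials: int, delta: float = 0.05) -> float:
        """Lower-bound the expected success rate on unseen environments."""
        empirical = successes / trials
        slack = math.sqrt(math.log(1.0 / delta) / (2.0 * trials))
        return max(0.0, empirical - slack)

    # Example: a policy that clears 940 of 1000 simulated obstacle courses.
    print(success_rate_lower_bound(940, 1000))  # about 0.90 at 95% confidence
    ```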
    In the first paper, a proof of principle for applying the machine learning frameworks, the team tested their approach in simulations that included a wheeled vehicle driving through a space filled with obstacles, and a robotic arm grasping objects on a table. They also validated the technique by assessing the obstacle avoidance of a small drone called a Parrot Swing (a combination quadcopter and fixed-wing airplane) as it flew down a 60-foot-long corridor dotted with cardboard cylinders. The guaranteed success rate of the drone’s control policy was 88.4%, and it avoided obstacles in 18 of 20 trials (90%).

    The work, published Oct. 3 in the International Journal of Robotics Research, was coauthored by Majumdar; Alec Farid, a graduate student in mechanical and aerospace engineering; and Anoopkumar Sonar, a computer science concentrator from Princeton’s Class of 2021.
    When applying machine learning techniques from other areas to robotics, said Farid, “there are a lot of special assumptions you need to satisfy, and one of them is saying how similar the environments you’re expecting to see are to the environments your policy was trained on. In addition to showing that we can do this in the robotic setting, we also focused on trying to expand the types of environments that we could provide a guarantee for.”
    “The kinds of guarantees we’re able to give range from about 80% to 95% success rates on new environments, depending on the specific task, but if you’re deploying [an unmanned aerial vehicle] in a real environment, then 95% probably isn’t good enough,” said Majumdar. “I see that as one of the biggest challenges, and one that we are actively working on.”
    Still, the team’s approaches represent much-needed progress on generalization guarantees for robots operating in unseen environments, said Hongkai Dai, a senior research scientist at the Toyota Research Institute in Los Altos, California.
    “These guarantees are paramount to many safety-critical applications, such as self-driving cars and autonomous drones, where the training set cannot cover every possible scenario,” said Dai, who was not involved in the research. “The guarantee tells us how likely it is that a policy can still perform reasonably well on unseen cases, and hence establishes confidence on the policy, where the stake of failure is too high.”
    In two other papers, to be presented Nov. 18 at the virtual Conference on Robot Learning, the researchers examined additional refinements to bring robot control policies closer to the guarantees that would be needed for real-world deployment. One paper used imitation learning, in which a human “expert” provides training data by manually guiding a simulated robot to pick up various objects or move through different spaces with obstacles. This approach can improve the success of machine learning-based control policies.

    To provide the training data, lead author Allen Ren, a graduate student in mechanical and aerospace engineering, used a 3D computer mouse to control a simulated robotic arm tasked with grasping and lifting drinking mugs of various sizes, shapes and materials. Other imitation learning experiments involved the arm pushing a box across a table, and a simulation of a wheeled robot navigating around furniture in a home-like environment.
    The researchers deployed the policies learned from the mug-grasping and box-pushing tasks on a robotic arm in the laboratory, which was able to pick up 25 different mugs by grasping their rims between its two finger-like grippers — not holding the handle as a human would. In the box-pushing example, the policy achieved 93% success on easier tasks and 80% on harder tasks.
    “We have a camera on top of the table that sees the environment and takes a picture five times per second,” said Ren. “Our policy training simulation takes this image and outputs what kind of action the robot should take, and then we have a controller that moves the arm to the desired locations based on the output of the model.”
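    The loop Ren describes, an image in and an action out a few times per second, can be pictured as a small behavior-cloning setup like the hypothetical sketch below; the network, action space and camera/controller calls are placeholders, not the Princeton implementation.
    ```python
    # Schematic behavior-cloning policy: a camera image goes in, an action (e.g. a
    # target end-effector displacement) comes out. Network size, action dimension and
    # the camera/controller calls are placeholders, not the Princeton implementation.
    import torch
    import torch.nn as nn

    class ImitationPolicy(nn.Module):
        def __init__(self, action_dim: int = 3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(32, action_dim),
            )

        def forward(self, image: torch.Tensor) -> torch.Tensor:
            return self.net(image)

    policy = ImitationPolicy()
    loss_fn = nn.MSELoss()  # training: match actions demonstrated via the 3D mouse
    # loss = loss_fn(policy(batch_images), expert_actions)
    # Deployment (~5 Hz): action = policy(capture_image()); send to the arm controller.
    ```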
    A third paper demonstrated the development of vision-based planners that provide guarantees for flying or walking robots to carry out planned sequences of movements through diverse environments. Generating control policies for planned movements brought a new problem of scale — a need to optimize vision-based policies with thousands, rather than hundreds, of dimensions.
    “That required coming up with some new algorithmic tools for being able to tackle that dimensionality and still be able to give strong generalization guarantees,” said lead author Sushant Veer, a postdoctoral research associate in mechanical and aerospace engineering.
    A key aspect of Veer’s strategy was the use of motion primitives, in which a policy directs a robot to go straight or turn, for example, rather than specifying a torque or velocity for each movement. Narrowing the space of possible actions makes the planning process more computationally tractable, said Majumdar.
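    A minimal way to picture planning over motion primitives: rather than optimizing torques or velocities directly, the planner scores a handful of short, pre-defined maneuvers and executes one, as in the hypothetical sketch below (the primitive names and cost function are invented).
    ```python
    # Hypothetical sketch of primitive-based planning: score a few canned maneuvers
    # and execute the best collision-free one, instead of searching over raw torques.
    from typing import Callable, Sequence

    PRIMITIVES = ["go_straight", "turn_left", "turn_right", "step_up"]

    def select_primitive(primitives: Sequence[str],
                         cost: Callable[[str], float],
                         collides: Callable[[str], bool]) -> str:
        """Pick the lowest-cost primitive whose simulated rollout is collision-free."""
        safe = [p for p in primitives if not collides(p)]
        if not safe:
            return "stop"  # fall back to a safe default action
        return min(safe, key=cost)
    ```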
    Veer and Majumdar evaluated the vision-based planners on simulations of a drone navigating around obstacles and a four-legged robot traversing rough terrain with slopes as high as 35 degrees — “a very challenging problem that a lot of people in robotics are still trying to solve,” said Veer.
    In the study, the legged robot achieved an 80% success rate on unseen test environments. The researchers are working to further improve their policies’ guarantees, as well as assessing the policies’ performance on real robots in the laboratory.
    The work was supported in part by the U.S. Office of Naval Research, the National Science Foundation, a Google Faculty Research Award and an Amazon Research Award.

  • AI tool may predict movies' future ratings

    Movie ratings can determine a movie’s appeal to consumers and the size of its potential audience. Thus, they have an impact on a film’s bottom line. Typically, humans do the tedious task of manually rating a movie based on viewing the movie and making decisions on the presence of violence, drug abuse and sexual content.
    Now, researchers at the USC Viterbi School of Engineering, armed with artificial intelligence tools, can rate a movie’s content in a matter of seconds, based on the movie script and before a single scene is shot. Such an approach could give movie executives the ability to design for a desired rating in advance, by making the appropriate edits to a script before shooting begins. Beyond the potential financial impact, such instantaneous feedback would allow storytellers and decision-makers to reflect on the content they are creating for the public and the impact such content might have on viewers.
    Using artificial intelligence applied to scripts, Shrikanth Narayanan, University Professor and Niki & C. L. Max Nikias Chair in Engineering, and a team of researchers from the Signal Analysis and Interpretation Lab (SAIL) at USC Viterbi have demonstrated that linguistic cues can effectively signal when a film’s characters are about to engage in violent acts, drug abuse or sexual content (actions that are often the basis for a film’s rating).
    Method:
    Using 992 movie scripts that included violent, substance-abuse and sexual content, as determined by Common Sense Media, a non-profit organization that rates and makes recommendations for families and schools, the SAIL research team trained artificial intelligence to recognize corresponding risk behaviors, patterns and language.
    The AI tool receives the entire script as input, processes it through a neural network and scans it for the semantics and sentiment expressed. In the process, it classifies sentences and phrases as positive, negative, aggressive and other descriptors, and automatically sorts words and phrases into three categories: violence, drug abuse and sexual content.
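    A stripped-down stand-in for this kind of classifier is sketched below: featurize each script sentence and predict the three content labels with a multi-label model. The sentences and annotations are invented for illustration; SAIL’s actual system is a neural network trained on the 992 annotated scripts.
    ```python
    # Stripped-down stand-in for a script content classifier: TF-IDF features per
    # sentence plus a multi-label logistic regression over the three categories.
    # The sentences and annotations below are invented; they are not SAIL's data.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.pipeline import make_pipeline

    LABELS = ["violence", "drug_abuse", "sexual_content"]
    sentences = [
        "He pulls the trigger and the window shatters.",
        "She pours another glass and reaches for the pills.",
        "They kiss and fall onto the bed.",
        "The two friends walk to school talking about homework.",
    ]
    targets = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]])  # per-sentence labels

    clf = make_pipeline(TfidfVectorizer(),
                        OneVsRestClassifier(LogisticRegression(max_iter=1000)))
    clf.fit(sentences, targets)
    print(clf.predict(["A shot rings out across the parking lot."]))  # indicator over LABELS
    ```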

    Victor Martinez, a doctoral candidate in computer science at USC Viterbi and the lead researcher on the study, which will appear in The Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, said, “Our model looks at the movie script, rather than the actual scenes, including e.g. sounds like a gunshot or explosion that occur later in the production pipeline. This has the benefit of providing a rating long before production to help filmmakers decide e.g. the degree of violence and whether it needs to be toned down.”
    The research team also includes Narayanan, a professor of electrical and computer engineering, computer science and linguistics; Krishna Somandepalli, a Ph.D. candidate in electrical and computer engineering at USC Viterbi; and Professor Yalda T. Uhls of UCLA’s Department of Psychology. They discovered many interesting connections between the portrayals of risky behaviors.
    “There seems to be a correlation in the amount of content in a typical film focused on substance abuse and the amount of sexual content. Whether intentionally or not, filmmakers seem to match the level of substance abuse-related content with sexually explicit content,” said Martinez.
    Another interesting pattern also emerged. “We found that filmmakers compensate for low levels of violence with joint portrayals of substance abuse and sexual content,” Martinez said.
    Moreover, while many movies contain depictions of rampant drug-abuse and sexual content, the researchers found it highly unlikely for a film to have high levels of all three risky behaviors, perhaps because of Motion Picture Association (MPA) standards.

    They also found an interesting connection between risk behaviors and MPA ratings. As sexual content increases, the MPA appears to put less emphasis on violence/substance-abuse content. Thus, regardless of violent and substance abuse content, a movie with a lot of sexual content will likely receive an R rating.
    Narayanan, whose SAIL lab has pioneered the field of media informatics and applied natural language processing to bring awareness in the creative community about the nuances of storytelling, calls media “a rich avenue for studying human communication, interaction and behavior, since it provides a window into society.”
    “At SAIL, we are designing technologies and tools, based on AI, for all stakeholders in this creative business — the writers, film-makers and producers — to raise awareness about the varied important details associated in telling their story on film,” Narayanan said.
    “Not only are we interested in the perspective of the storytellers of the narratives they weave,” Narayanan said, “but also in understanding the impact on the audience and the ‘take-away’ from the whole experience. Tools like these will help raise societally-meaningful awareness, for example, through identifying negative stereotypes.”
    Added Martinez: “In the future, I’m interested in studying minorities and how they are represented, particularly in cases of violence, sex and drugs.”