More stories

  • A language learning system that pays attention — more efficiently than ever before

    Human language can be inefficient. Some words are vital. Others, expendable.
    Reread the first sentence of this story. Just two words, “language” and “inefficient,” convey almost the entire meaning of the sentence. The importance of key words underlies a popular new tool for natural language processing (NLP) by computers: the attention mechanism. When coded into a broader NLP algorithm, the attention mechanism homes in on key words rather than treating every word with equal importance. That yields better results in NLP tasks like detecting positive or negative sentiment or predicting which words should come next in a sentence.
    The attention mechanism’s accuracy often comes at the expense of speed and computing power, however. It runs slowly on general-purpose processors like you might find in consumer-grade computers. So, MIT researchers have designed a combined software-hardware system, dubbed SpAtten, specialized to run the attention mechanism. SpAtten enables more streamlined NLP with less computing power.
    “Our system is similar to how the human brain processes language,” says Hanrui Wang. “We read very fast and just focus on key words. That’s the idea with SpAtten.”
    The research will be presented this month at the IEEE International Symposium on High-Performance Computer Architecture. Wang is the paper’s lead author and a PhD student in the Department of Electrical Engineering and Computer Science. Co-authors include Zhekai Zhang and their advisor, Assistant Professor Song Han.
    Since its introduction in 2015, the attention mechanism has been a boon for NLP. It’s built into state-of-the-art NLP models like Google’s BERT and OpenAI’s GPT-3. The attention mechanism’s key innovation is selectivity — it can infer which words or phrases in a sentence are most important, based on comparisons with word patterns the algorithm has previously encountered in a training phase. Despite the attention mechanism’s rapid adoption into NLP models, it’s not without cost.
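The mechanism's core computation can be sketched in a few lines. The following is a generic scaled dot-product self-attention toy in NumPy — the names and sizes are illustrative, not taken from BERT, GPT-3, or SpAtten. Summing the attention each token receives from every other token gives exactly the kind of per-token importance signal that this selectivity builds on:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)          # (seq, seq) token-pair similarities
    weights = softmax(scores, axis=-1)     # each row is a probability distribution
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 6, 8                          # 6 tokens, 8-dim embeddings (toy sizes)
X = rng.normal(size=(seq_len, d))          # stand-in token embeddings
out, weights = attention(X, X, X)          # self-attention: Q = K = V = X

# A token's "importance" is the total attention it receives from all queries.
importance = weights.sum(axis=0)
print(importance)
```

Each row of `weights` sums to one, so `importance` sums to the number of tokens; tokens well above the uniform share of attention are the "key words" in the sense described above.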
    NLP models require a hefty load of computer power, thanks in part to the high memory demands of the attention mechanism. “This part is actually the bottleneck for NLP models,” says Wang. One challenge he points to is the lack of specialized hardware to run NLP models with the attention mechanism. General-purpose processors, like CPUs and GPUs, have trouble with the attention mechanism’s complicated sequence of data movement and arithmetic. And the problem will get worse as NLP models grow more complex, especially for long sentences. “We need algorithmic optimizations and dedicated hardware to process the ever-increasing computational demand,” says Wang.
    The researchers developed a system called SpAtten to run the attention mechanism more efficiently. Their design encompasses both specialized software and hardware. One key software advance is SpAtten’s use of “cascade pruning,” or eliminating unnecessary data from the calculations. Once the attention mechanism helps pick a sentence’s key words (called tokens), SpAtten prunes away unimportant tokens and eliminates the corresponding computations and data movements. The attention mechanism also includes multiple computation branches (called heads). Similar to tokens, the unimportant heads are identified and pruned away. Once dispatched, the extraneous tokens and heads don’t factor into the algorithm’s downstream calculations, reducing both computational load and memory access.
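As a rough illustration of the idea — not SpAtten's actual implementation; the importance metrics, thresholds, and keep-counts below are simplified stand-ins — cascade pruning can be sketched as: score tokens by the attention they receive, score heads by how far their attention is from uniform, then drop the lowest-scoring of each before any downstream computation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
heads, seq_len, d = 4, 10, 16
Q = rng.normal(size=(heads, seq_len, d))
K = rng.normal(size=(heads, seq_len, d))
V = rng.normal(size=(heads, seq_len, d))

# Attention probabilities per head: (heads, seq, seq)
probs = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(d))

# Token importance: total attention each token *receives*, summed over heads
# and queries.  Low-importance tokens are pruned, and all downstream
# computation and data movement on them is discarded.
token_importance = probs.sum(axis=(0, 1))            # (seq,)
keep_tokens = np.argsort(token_importance)[-6:]      # keep top 6 of 10 tokens

# Head importance (illustrative proxy): how far each head's attention is
# from uniform.  Near-uniform heads contribute little and are pruned.
uniform = 1.0 / seq_len
head_importance = np.abs(probs - uniform).sum(axis=(1, 2))   # (heads,)
keep_heads = np.argsort(head_importance)[-3:]                # keep top 3 of 4

# Downstream attention now runs on the reduced tensors only.
Qp = Q[np.ix_(keep_heads, keep_tokens)]
Kp = K[np.ix_(keep_heads, keep_tokens)]
Vp = V[np.ix_(keep_heads, keep_tokens)]
out = softmax(Qp @ Kp.transpose(0, 2, 1) / np.sqrt(d)) @ Vp
print(out.shape)   # (3, 6, 16): fewer heads, fewer tokens
```

The point of the cascade is that once a token or head is dropped, every later layer skips it too, so the savings compound through the network.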
    To further trim memory use, the researchers also developed a technique called “progressive quantization.” The method allows the algorithm to wield data in smaller bitwidth chunks and fetch as few as possible from memory. Lower data precision, corresponding to smaller bitwidth, is used for simple sentences, and higher precision is used for complicated ones. Intuitively it’s like fetching the phrase “cmptr progm” as the low-precision version of “computer program.”
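A minimal sketch of the idea, assuming a uniform quantizer and using the peak softmax probability as the "complicated sentence" signal — both are illustrative choices, not SpAtten's exact criteria:

```python
import numpy as np

def quantize(x, bits):
    """Uniform quantization of x to the given bitwidth."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(2)
scores = rng.normal(size=(8, 8))            # toy attention scores for 8 tokens

bits = 4                                     # first pass: fetch low-precision data
probs = softmax(quantize(scores, bits))

# If attention is ambiguous (no clearly dominant token), the sentence is
# treated as "complicated" and the data is re-fetched at higher precision.
confidence = probs.max(axis=-1).mean()
if confidence < 0.5:                         # threshold is an illustrative choice
    bits = 8
    probs = softmax(quantize(scores, bits))

# A higher bitwidth reduces quantization error on the raw scores.
err4 = np.abs(quantize(scores, 4) - scores).mean()
err8 = np.abs(quantize(scores, 8) - scores).mean()
print(bits, err4 > err8)
```

The memory win comes from the first pass: simple inputs never pay for the high-precision fetch at all.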
    Alongside these software advances, the researchers also developed a hardware architecture specialized to run SpAtten and the attention mechanism while minimizing memory access. Their architecture design employs a high degree of “parallelism,” meaning multiple operations are processed simultaneously on multiple processing elements, which is useful because the attention mechanism analyzes every word of a sentence at once. The design enables SpAtten to rank the importance of tokens and heads (for potential pruning) in a small number of computer clock cycles. Overall, the software and hardware components of SpAtten combine to eliminate unnecessary or inefficient data manipulation, focusing only on the tasks needed to complete the user’s goal.
    The philosophy behind the system is captured in its name. SpAtten is a portmanteau of “sparse attention,” and the researchers note in the paper that SpAtten is “homophonic with ‘spartan,’ meaning simple and frugal.” Wang says, “That’s just like our technique here: making the sentence more concise.” That concision was borne out in testing.
    The researchers coded a simulation of SpAtten’s hardware design — they haven’t fabricated a physical chip yet — and tested it against competing general-purpose processors. SpAtten ran more than 100 times faster than the next best competitor (a TITAN Xp GPU). Further, SpAtten was more than 1,000 times more energy efficient than competitors, indicating that SpAtten could help trim NLP’s substantial electricity demands.
    The researchers also integrated SpAtten into their previous work to help validate their philosophy that hardware and software are best designed in tandem. They built a specialized NLP model architecture for SpAtten, using their Hardware-Aware Transformer (HAT) framework, and achieved roughly a twofold speedup over a more general model.
    The researchers think SpAtten could be useful to companies that employ NLP models for the majority of their artificial intelligence workloads. “Our vision for the future is that new algorithms and hardware that remove the redundancy in languages will reduce cost and save on the power budget for data center NLP workloads,” says Wang.
    On the opposite end of the spectrum, SpAtten could bring NLP to smaller, personal devices. “We can improve the battery life for mobile phones or IoT devices,” says Wang, referring to internet-connected “things” — televisions, smart speakers, and the like. “That’s especially important because in the future, numerous IoT devices will interact with humans by voice and natural language, so NLP will be the first application we want to employ.”
    Han says SpAtten’s focus on efficiency and redundancy removal is the way forward in NLP research. “Human brains are sparsely activated [by key words]. NLP models that are sparsely activated will be promising in the future,” he says. “Not all words are equal — pay attention only to the important ones.”

  • New mathematical method for generating random connected networks

    Many natural and human-made networks, such as computer, biological, or social networks, have a connectivity structure that critically shapes their behavior. The academic field of network science is concerned with analyzing such real-world complex networks and understanding how their structure influences their function or behavior. Examples are the vascular network of our bodies, the network of neurons in our brain, or the network of how an epidemic is spreading through a society.
    The need for reliable null models
    The analysis of such networks often focuses on finding interesting properties and features. For example, does the structure of a particular contact network help diseases spread especially quickly? In order to find out, we need a baseline — a set of random networks, a so-called “null model” — to compare to. Furthermore, since more connections obviously create more opportunities for infection, the number of connections of each node in the baseline should be matched to the network we analyze. Then, if our network appears to facilitate spreading more than the baseline, we know it must be due to its specific network structure. However, creating truly random, unbiased null models that are matched in some property is difficult — and usually requires a different approach for each property of interest. Existing algorithms that create connected networks with a specific number of connections for each node all suffer from uncontrolled bias, which means that some networks are generated more often than others, potentially compromising the conclusions of the study.
    A new method that eliminates bias
    Szabolcs Horvát and Carl Modes at the Center for Systems Biology Dresden (CSBD) and the Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG) developed such a model, which makes it possible to eliminate bias and reach solid conclusions. Szabolcs Horvát explains: “We developed a null model for connected networks where the bias is under control and can be factored out. Specifically, we created an algorithm which can generate random connected networks with a prescribed number of connections for each node. With our method, we demonstrated that more naïve but commonly used approaches may lead to invalid conclusions.” Carl Modes, the coordinating author of the study, concludes: “This finding illustrates the need for mathematically well-founded methods. We hope that our work will be useful to the broader network science community. In order to make it as easy as possible for other researchers to use it, we also developed software and made it publicly available.”

    Story Source:
    Materials provided by Max-Planck-Gesellschaft. Note: Content may be edited for style and length.

  • Placing cosmological constraints on quantum gravity phenomenology

    A description of gravity compatible with the principles of quantum mechanics has long been a widely pursued goal in physics. Existing theories of this ‘quantum gravity’ often involve mathematical corrections to Heisenberg’s Uncertainty Principle (HUP), which quantifies the inherent limits in the accuracy of any quantum measurement. These corrections arise when gravitational interactions are considered, leading to a ‘Generalized Uncertainty Principle’ (GUP). Two specific GUP models are often used: the first modifies the HUP with a linear correction, while the second introduces a quadratic one. Through new research published in EPJ C, Serena Giardino and Vincenzo Salzano at the University of Szczecin in Poland have used well-established cosmological observations to place tighter constraints on the quadratic model, while discrediting the linear model.
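Schematically, the two GUP models modify the HUP as follows. The parametrization below is one common convention from the GUP literature, with corrections expressed in units of the Planck momentum; exact signs and coefficients vary between papers, so treat it as illustrative rather than the specific forms used in this study:

```latex
% Heisenberg Uncertainty Principle (HUP)
\Delta x \,\Delta p \;\ge\; \frac{\hbar}{2}

% Linear GUP model: leading correction linear in \Delta p
\Delta x \,\Delta p \;\ge\; \frac{\hbar}{2}
  \left(1 - \alpha\,\frac{\Delta p}{M_{\mathrm{P}}\,c}\right)

% Quadratic GUP model: leading correction quadratic in \Delta p
\Delta x \,\Delta p \;\ge\; \frac{\hbar}{2}
  \left(1 + \beta\,\frac{\Delta p^{2}}{(M_{\mathrm{P}}\,c)^{2}}\right)
```

Because the quadratic correction grows with $\Delta p$, it enforces a minimum achievable $\Delta x$ of order the Planck length — consistent with the lower limit on probeable length scales described here.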
    The GUP can influence the black hole evaporation process first described by Stephen Hawking, and may also lead to better understanding of the relationship between thermodynamics and gravity. Intriguingly, the GUP also places a lower limit on length scales that are possible to probe — below the so-called ‘Planck length,’ any concentration of energy would collapse under gravity to form a black hole. Previously, both the linear and quadratic GUP models were rigorously tested by comparing their predictions with data gathered in quantum experiments, placing stringent limits on their parameters.
    In their study, Giardino and Salzano instead compared the predictions of GUP-influenced models of the universe with observations of cosmological phenomena, including supernovae and cosmic microwave background radiation. These comparisons were not widely made in the past, since the constraints they imposed on the GUP parameters were believed to be far weaker than those possible in quantum experiments. However, the researchers’ analysis revealed that stricter bounds could be imposed on the quadratic model, comparable to those placed by some quantum experiments. In addition, they showed that the linear correction to the HUP generally could not account for the observed data. Ultimately, these results highlight the promising role of cosmological observations in constraining the phenomenology of quantum gravity.

    Story Source:
    Materials provided by Springer.

  • Quantum effects help minimize communication flaws

    Among the most active fields of research in modern physics, both at an academic level and beyond, are quantum computation and communication, which apply quantum phenomena such as superposition and entanglement to perform calculations or to exchange information. A number of research groups around the world have built quantum devices that are able to perform calculations faster than any classical computer. Yet there is still a long way to go before these devices can be turned into marketable quantum computers. One reason for this is that both quantum computation and quantum communication are severely hampered by how easily a quantum superposition state can be destroyed, or entanglement between two or more quantum particles lost.
    The primary approach to overcoming these limitations is the application of so-called quantum error-correcting codes. This, however, requires more resources than can currently be managed in a controlled way. While error correction is likely to become an integral part of future quantum devices in the long run, a complementary approach is to mitigate the noise — that is, the cumulative effect of uncorrected errors — without relying on so many additional resources. Such approaches are referred to as noise reduction schemes.
    Noise mitigation without additional resources through simple quantum schemes
    A new approach along this research line was recently proposed to reduce noise in a communication scheme between two parties. Imagine two parties who want to communicate by exchanging a quantum particle, yet the particle has to be sent over some faulty transmission lines.
    Recently, a team of researchers at the University of Hong Kong proposed that an overall reduction in noise could be achieved by directing the particle along a quantum superposition of paths that traverse the noisy regions in opposite orders. While a classical particle can only travel along one path, in quantum mechanics it can move along multiple paths at once. Using this property to send the particle along two quantum paths allows it, for instance, to cross the noisy regions in both orders simultaneously. This effect has since been demonstrated experimentally by two independent research groups.
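The effect can be reproduced in a few lines of linear algebra. The sketch below is an illustrative reconstruction (not the groups' actual code): it sends a qubit through two completely depolarizing channels in a superposition of the two orders — often called the "quantum switch" — with a control qubit in the state |+>, then conditions on measuring the control in |+>. Although either channel alone erases all information, the conditional output still depends on the input:

```python
import numpy as np

# Pauli matrices; K_i = sigma_i / 2 are Kraus operators of the completely
# depolarizing qubit channel (it erases all information: rho -> I/2).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
kraus = [s / 2 for s in (I2, X, Y, Z)]

def switch_plus_branch(rho):
    """Send rho through two depolarizing channels in a superposition of the
    two orders (control in |+>), then measure the control in |+>.  The
    effective Kraus operators of this branch are the anticommutators
    (K_i K_j + K_j K_i) / 2."""
    out = np.zeros((2, 2), dtype=complex)
    for Ki in kraus:
        for Kj in kraus:
            A = (Ki @ Kj + Kj @ Ki) / 2
            out += A @ rho @ A.conj().T
    p_plus = np.trace(out).real            # probability of the + outcome
    return out / p_plus, p_plus

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # input |0><0|
rho1 = np.array([[0, 0], [0, 1]], dtype=complex)   # input |1><1|

out0, p = switch_plus_branch(rho0)
out1, _ = switch_plus_branch(rho1)

# Each channel alone outputs I/2 regardless of input, yet here the
# conditional output works out to (rho + 2*I)/5, with p_plus = 5/8.
print(np.round(out0.real, 3), round(p, 3))
```

Distinct inputs yield distinct conditional outputs, so some information survives two channels that individually destroy everything — the signature of the superposition-of-orders advantage.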
    These results suggested that, to achieve this noise reduction, it is necessary to place the noisy transmission lines in a quantum superposition of opposite orders. Shortly after this, research groups in Vienna and in Grenoble realized that this effect can also be achieved via simpler configurations, which can even completely eliminate the noise between the two parties.
    All of these schemes have now been implemented experimentally and compared with each other by a research team led by Philip Walther at the University of Vienna. In this work, different ways of passing through two noisy regions in quantum superposition are compared for a variety of noise types. The experimental results are also supported with numerical simulations to extend the study to more generic types of noise. Surprisingly, it is found that the simplest schemes for quantum superposition of noisy channels also offer the best reduction of the noise affecting communication.
    “Error correction in modern quantum technologies is among the most pressing needs of current quantum computation and communication schemes. Our work shows that, at least in the case of quantum communication, already with the technologies currently in use it may be possible to mitigate this issue with no need for additional resources,” says Giulia Rubino, first author of the publication in Physical Review Research. The ease of the demonstrated technique allows immediate use in current long-distance communications, and promises potential further applications in quantum computation and quantum thermodynamics.

    Story Source:
    Materials provided by University of Vienna.

  • Virtual reality helping to treat fear of heights

    Researchers from the University of Basel have developed a virtual reality app for smartphones to reduce fear of heights. Now, they have conducted a clinical trial to study its efficacy. Trial participants who spent a total of four hours training with the app at home showed an improvement in their ability to handle real height situations.
    Fear of heights is a widespread phenomenon. Approximately 5% of the general population experiences a debilitating level of discomfort in height situations. However, the people affected rarely take advantage of the available treatment options, such as exposure therapy, which involves putting the person in the anxiety-causing situation under the guidance of a professional. On the one hand, people are reluctant to confront their fear of heights. On the other hand, it can be difficult to reproduce the right kinds of height situations in a therapy setting.
    This motivated the interdisciplinary research team led by Professor Dominique de Quervain to develop a smartphone-based virtual reality exposure therapy app called Easyheights. The app uses 360° images of real locations, which the researchers captured using a drone. People can use the app on their own smartphones together with a special virtual reality headset.
    Gradually increasing the height
    During the virtual experience, the user stands on a platform that is initially one meter above the ground. After allowing acclimatization to the situation for a certain interval, the platform automatically rises. In this way, the perceived distance above the ground increases slowly but steadily without an increase in the person’s level of fear.
    The research team studied the efficacy of this approach in a randomized, controlled trial and published the results in the journal NPJ Digital Medicine. Fifty trial participants with a fear of heights either completed a four-hour height training program (one 60-minute session and six 30-minute sessions over the course of two weeks) using virtual reality, or were assigned to the control group, which did not complete these training sessions.
    Before and after the training phase — or the same period of time without training — the trial participants ascended the Uetliberg lookout tower near Zurich as far as their fear of heights allowed. The researchers recorded how high each participant climbed, along with their subjective fear rating at each stage of the tower. At the end of the trial, the researchers evaluated the results from 22 subjects who completed the Easyheights training and 25 from the control group.
    The group that completed the training with the app exhibited less fear on the tower and was able to ascend further towards the top than they could before completing the training. The control group exhibited no positive changes. The efficacy of the Easyheights training proved comparable to that of conventional exposure therapy.
    Therapy in your own living room
    Researchers have already been studying the use of virtual reality for treating fear of heights for more than two decades. “What is new, however, is that smartphones can be used to produce the virtual scenarios that previously required a technically complicated type of treatment, and this makes it much more accessible,” explains Dr. Dorothée Bentz, lead author of the study.
    The results from the study suggest that the repeated use of a smartphone-based virtual reality exposure therapy can greatly improve the behavior and subjective state of well-being in height situations. People who suffer from a mild fear of heights will soon be able to download the free app from major app stores and complete training sessions on their own. However, the researchers recommend that people who suffer from a serious fear of heights only use the app with the supervision of a professional.
    The current study is one of several projects in progress at the Transfaculty Research Platform for Molecular and Cognitive Neurosciences, led by Professor Andreas Papassotiropoulos and Professor Dominique de Quervain. Their goal is to improve the treatment of mental disorders through the use of new technologies and to make these treatments widely available.

    Story Source:
    Materials provided by University of Basel.

  • Emerging robotics technology may lead to better buildings in less time

    Emerging robotics technology may soon help construction companies and contractors create buildings in less time at higher quality and at lower costs.
    Purdue University innovators developed and are testing a novel construction robotic system that uses an innovative mechanical design with advances in computer vision sensing technology to work in a construction setting.
    The technology was developed with support from the National Science Foundation.
    “Our work helps to address workforce shortages in the construction industry by automating key construction operations,” said Jiansong Zhang, an assistant professor of construction management technology in the Purdue Polytechnic Institute. “On a construction site, there are many unknown factors that a construction robot must be able to account for effectively. This requires much more advanced sensing and reasoning technologies than those commonly used in a manufacturing environment.”
    The Purdue team’s custom end effector design allows for material to be both placed and fastened in the same operation using the same arm, limiting the amount of equipment that is required to complete a given task.
    Computer vision algorithms developed for the project allow the robotic system to sense building elements and match them to building information modeling (BIM) data in a variety of environments, and keep track of obstacles or safety hazards in the system’s operational context.
    “By basing the sensing for our robotic arm around computer vision technology, rather than more limited-scope and expensive sensing systems, we have the capability to complete many sensing tasks with a single affordable sensor,” Zhang said. “This allows us to implement a more robust and versatile system at a lower cost.”
    Undergraduate researchers in Zhang’s Automation and Intelligent Construction (AutoIC) Lab helped create this robotic technology.
    The innovators worked with the Purdue Research Foundation Office of Technology Commercialization to patent the technology.
    This work will be featured at OTC’s 2021 Technology Showcase: The State of Innovation. The annual showcase, held virtually this year on Feb. 10-11, will feature novel innovations from inventors at Purdue and across the state of Indiana.

    Story Source:
    Materials provided by Purdue University. Original written by Chris Adam.

  • AI can predict early death risk

    Researchers at Geisinger have found that a computer algorithm developed using echocardiogram videos of the heart can predict mortality within a year.
    The algorithm — an example of what is known as machine learning, or artificial intelligence (AI) — outperformed other clinically used predictors, including pooled cohort equations and the Seattle Heart Failure score. The results of the study were published in Nature Biomedical Engineering.
    “We were excited to find that machine learning can leverage unstructured datasets such as medical images and videos to improve on a wide range of clinical prediction models,” said Chris Haggerty, Ph.D., co-senior author and assistant professor in the Department of Translational Data Science and Informatics at Geisinger.
    Imaging is critical to treatment decisions in most medical specialties and has become one of the most data-rich components of the electronic health record (EHR). For example, a single ultrasound of the heart yields approximately 3,000 images, and cardiologists have limited time to interpret these images within the context of numerous other diagnostic data. This creates a substantial opportunity to leverage technology, such as machine learning, to manage and analyze this data and ultimately provide intelligent computer assistance to physicians.
    For their study, the research team used specialized computational hardware to train the machine learning model on 812,278 echocardiogram videos collected from 34,362 Geisinger patients over the last ten years. The study compared the results of the model to cardiologists’ predictions based on multiple surveys. A subsequent survey showed that when assisted by the model, cardiologists’ prediction accuracy improved by 13 percent. With nearly 50 million images, this is one of the largest medical image datasets ever used in a published study.
    “Our goal is to develop computer algorithms to improve patient care,” said Alvaro Ulloa Cerna, Ph.D., author and senior data scientist in the Department of Translational Data Science and Informatics at Geisinger. “In this case, we’re excited that our algorithm was able to help cardiologists improve their predictions about patients, since decisions about treatment and interventions are based on these types of clinical predictions.”

    Story Source:
    Materials provided by Geisinger Health System.

  • School closures may not reduce coronavirus deaths as much as expected

    School closures, the loss of public spaces, and having to work remotely due to the coronavirus pandemic have caused major disruptions in people’s social lives all over the world.
    Researchers from City University of Hong Kong, the Chinese Academy of Sciences, and Rensselaer Polytechnic Institute suggest a reduction in fatal coronavirus cases can be achieved without the need for so much social disruption. They discuss the impacts of the closures of various types of facilities in the journal Chaos, from AIP Publishing.
    The researchers ran thousands of simulations of the pandemic response in New York City, varying social distancing behavior at home, in schools, at public facilities, and in the workplace while accounting for differences in interactions between age groups. The results were stunning: school closures do little to prevent serious cases of COVID-19. Less surprisingly, social distancing in public places, particularly among elderly populations, matters most.
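The flavor of such simulations can be conveyed with a deliberately tiny toy model — all structure and numbers below are illustrative inventions, not the authors' model or parameters. It has two age groups (young, elderly), with school contacts confined to the young and public contacts mixing everyone; under that structure, closing schools barely dents elderly infections, while distancing in public cuts them sharply:

```python
import numpy as np

def run_sir(school_scale, public_scale, days=300):
    """Toy two-group (young, elderly) SIR model.  Contacts split into a
    'school' setting (young-young only) and a 'public' setting where all
    groups mix.  Simple Euler integration; all numbers are illustrative."""
    # Rows = susceptible group, cols = infectious group.
    school = np.array([[2.0, 0.0],
                       [0.0, 0.0]]) * school_scale
    public = np.array([[8.0, 2.0],
                       [2.0, 3.0]]) * public_scale
    C = school + public

    beta, gamma, dt = 0.08, 0.2, 0.1
    S = np.array([0.7, 0.3])               # population fractions by age group
    I = np.array([1e-4, 1e-4])             # initial infectious fractions
    cum_elderly = 0.0
    for _ in range(int(days / dt)):
        force = beta * (C @ I)             # force of infection per group
        new_inf = force * S * dt
        S = S - new_inf
        I = I + new_inf - gamma * I * dt
        cum_elderly += new_inf[1]          # track cumulative elderly infections
    return cum_elderly

baseline  = run_sir(1.0, 1.0)
no_school = run_sir(0.0, 1.0)              # close schools only
distanced = run_sir(1.0, 0.5)              # halve public contacts instead
print(no_school < baseline, distanced < no_school)
```

Because school contacts are a small share of total contact and never touch the elderly directly, removing them changes little; halving public contacts both shrinks the epidemic and directly shields the vulnerable group — the qualitative pattern the study reports for New York City.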
    “School only represents a small proportion of social contact. … It is more likely that people get exposure to viruses in public facilities, like restaurants and shopping malls,” said Qingpeng Zhang, one of the authors. “Since we focus here on the severe infections and deceased cases, closing schools contributes little if the elderly citizens are not protected in public facilities and other places.”
    Because New York City is so densely populated, the effect of schools is significantly smaller than that of general day-to-day interactions in public, and students are generally the least vulnerable to severe infections. Keeping public spaces open, however, allows the virus to spread from less-vulnerable young people to the more-vulnerable older population.
    “Students may bridge the connection between vulnerable people, but these people are already highly exposed in public facilities,” Zhang said. “In other cities where people are much more distanced, the results may change.”
    Though the present findings are specific to New York, replacing the age and location parameters in the model can extend its results to any city. This will help determine the ideal local control measures to contain the pandemic with minimal social disruptions.
    “These patterns are unique for different cities, and good practice in one city may not translate to another city,” said Zhang.
    The authors emphasized that while these findings have promising implications, the model is still just a model: it cannot capture every intricacy and subtle detail of real-life interactions. Incorporating mobile phone, census, transportation, or other big data in the future could help inform more realistic decisions.
    “Given the age and location mixing patterns, there are so many variables to be considered, so the optimization is challenging,” said Zhang. “Our model is an attempt.”

    Story Source:
    Materials provided by American Institute of Physics.