More stories

  • The Ramanujan Machine: Researchers develop 'conjecture generator'

    Using AI and computer automation, Technion researchers have developed a “conjecture generator” that creates mathematical conjectures, which are considered to be the starting point for developing mathematical theorems. They have already used it to generate a number of previously unknown formulas. The study, which was published in the journal Nature, was carried out by undergraduates from different faculties under the tutelage of Assistant Professor Ido Kaminer of the Andrew and Erna Viterbi Faculty of Electrical Engineering at the Technion.
    The project deals with one of the most fundamental elements of mathematics — mathematical constants. A mathematical constant is a number with a fixed value that emerges naturally from different mathematical calculations and structures in different fields. Many mathematical constants are of great importance not only in mathematics but also in other disciplines, including biology, physics, and ecology. The golden ratio and Euler’s number are examples of such fundamental constants. Perhaps the most famous constant is pi, which was studied in ancient times in the context of the circumference of a circle. Today, pi appears in numerous formulas in all branches of science, with many math aficionados competing over who can recall more digits after the decimal point: 3.14159…
    The Technion researchers proposed and examined a new idea: The use of computer algorithms to automatically generate mathematical conjectures that appear in the form of formulas for mathematical constants.
    A conjecture is a mathematical conclusion or proposition that has not been proved; once the conjecture is proved, it becomes a theorem. Discovery of a mathematical conjecture on fundamental constants is relatively rare, and its source often lies in mathematical genius and exceptional human intuition. Newton, Riemann, Goldbach, Gauss, Euler, and Ramanujan are examples of such genius, and the new approach presented in the paper is named after Srinivasa Ramanujan.
    Ramanujan, an Indian mathematician born in 1887, grew up in a poor family, yet managed to arrive in Cambridge at the age of 26 at the initiative of British mathematicians Godfrey Hardy and John Littlewood. Within a few years he fell ill and returned to India, where he died at the age of 32. During his brief life he accomplished great achievements in the world of mathematics. One of Ramanujan’s rare capabilities was the intuitive formulation of unproven mathematical formulas. The Technion research team therefore decided to name their algorithm “the Ramanujan Machine,” as it generates conjectures without proving them, by “imitating” intuition using AI and considerable computer automation.
    According to Prof. Kaminer, “Our results are impressive because the computer doesn’t care if proving the formula is easy or difficult, and doesn’t base the new results on any prior mathematical knowledge, but only on the numbers in mathematical constants. To a large degree, our algorithms work in the same way as Ramanujan himself, who presented results without proof. It’s important to point out that the algorithm itself is incapable of proving the conjectures it found — at this point, the task is left to be resolved by human mathematicians.”
    The conjectures generated by the Technion’s Ramanujan Machine have delivered new formulas for well-known mathematical constants such as pi, Euler’s number (e), Apéry’s constant (which is related to the Riemann zeta function), and the Catalan constant. Surprisingly, the algorithms developed by the Technion researchers succeeded not only in creating known formulas for these famous constants, but in discovering several conjectures that were heretofore unknown. The researchers estimate this algorithm will be able to significantly expedite the generation of mathematical conjectures on fundamental constants and help to identify new relationships between these constants.
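    To make the idea concrete, here is a minimal, hedged sketch of the kind of numerical search described above: enumerate simple continued fractions whose terms follow small integer polynomials, and flag any whose value matches a known constant to many digits. The search space, helper names, and precision threshold are illustrative assumptions, not the team's actual algorithms, and any match is only a conjecture until a human proves it.

    ```python
    # Illustrative sketch (not the Ramanujan Machine's code): look for continued
    # fractions a_0 + b_1/(a_1 + b_2/(a_2 + ...)) whose terms follow small integer
    # polynomials in n and whose value matches a known constant numerically.
    from mpmath import mp, mpf, pi, e, almosteq

    mp.dps = 50  # working precision in decimal digits

    def cf_value(a_poly, b_poly, depth=200):
        """Evaluate the continued fraction with terms a_n = a_poly(n), b_n = b_poly(n)."""
        value = mpf(0)
        for n in range(depth, 0, -1):
            denom = mpf(a_poly(n)) + value
            if denom == 0:
                return None  # degenerate continued fraction; skip it
            value = mpf(b_poly(n)) / denom
        return mpf(a_poly(0)) + value

    CONSTANTS = {"pi": +pi, "e": +e}  # unary + evaluates each constant at current precision

    # Tiny brute-force search over linear polynomial terms.
    for c1 in range(0, 4):
        for c0 in range(1, 4):
            for d1 in range(0, 3):
                for d0 in range(-2, 3):
                    if d1 == 0 and d0 == 0:
                        continue
                    val = cf_value(lambda n: c1 * n + c0, lambda n: d1 * n + d0)
                    if val is None:
                        continue
                    for name, const in CONSTANTS.items():
                        if almosteq(val, const, rel_eps=mpf(10) ** -30):
                            print(f"conjecture: {name} = CF with a_n = {c1}n+{c0}, b_n = {d1}n+{d0}")
    ```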
    As mentioned, such conjectures were until now the product of rare genius. This is why, in hundreds of years of research, only a few dozen formulas were found. It took the Technion’s Ramanujan Machine just a few hours to discover all the formulas for pi discovered by Gauss, the “Prince of Mathematics,” during a lifetime of work, along with dozens of new formulas that were unknown to Gauss.
    According to the researchers, “Similar ideas can in the future lead to the development of mathematical conjectures in all areas of mathematics, and in this way provide a meaningful tool for mathematical research.”
    The research team has launched a website, RamanujanMachine.com, which is intended to inspire the public to be more involved in the advancement of mathematical research by providing algorithmic tools that will be available to mathematicians and the public at large. Even before the article was published, hundreds of students, experts, and amateur mathematicians had signed up to the website.
    The research study started out as an undergraduate project in the Rothschild Scholars Technion Program for Excellence with the participation of Gal Raayoni and George Pisha, and continued as part of the research projects conducted in the Andrew and Erna Viterbi Faculty of Electrical Engineering with the participation of Shahar Gottlieb, Yoav Harris, and Doron Haviv. This is also where the most significant breakthrough was made — by an algorithm developed by Shahar Gottlieb — which led to the article’s publication in Nature. Prof. Kaminer adds that the most interesting mathematical discovery made by the Ramanujan Machine’s algorithms to date relates to a new algebraic structure concealed within the Catalan constant. The structure was discovered by high school student Yahel Manor, who participated in the project as part of the Alpha Program for science-oriented youth. Prof. Kaminer added: “Industry colleagues Uri Mendlovic and Yaron Hadad also participated in the study, and contributed greatly to the mathematical and algorithmic concepts that form the foundation for the Ramanujan Machine. It is important to emphasize that the entire project was executed on a voluntary basis, received no funding, and participants joined the team out of pure scientific curiosity.”
    Prof. Ido Kaminer is the head of the Robert and Ruth Magid Electron Beam Quantum Dynamics Laboratory. He is a faculty member in the Andrew and Erna Viterbi Faculty of Electrical Engineering and the Solid State Institute. Kaminer is affiliated with the Helen Diller Quantum Center and the Russell Berrie Nanotechnology Institute.

  • Artificial intelligence yields new ways to combat the coronavirus

    USC researchers have developed a new method to counter emergent mutations of the coronavirus and hasten vaccine development to stop the pathogen responsible for killing thousands of people and ruining the economy.
    Using artificial intelligence (AI), the research team at the USC Viterbi School of Engineering developed a method to speed the analysis of vaccines and zero in on the best potential preventive medical therapy.
    The method is easily adaptable to analyze potential mutations of the virus, ensuring the best possible vaccines are quickly identified — solutions that give humans a big advantage over the evolving contagion. Their machine-learning model can accomplish vaccine design cycles that once took months or years in a matter of seconds to minutes, the study says.
    “This AI framework, applied to the specifics of this virus, can provide vaccine candidates within seconds and move them to clinical trials quickly to achieve preventive medical therapies without compromising safety,” said Paul Bogdan, associate professor of electrical and computer engineering at USC Viterbi and corresponding author of the study. “Moreover, this can be adapted to help us stay ahead of the coronavirus as it mutates around the world.”
    The findings appear today in Nature Research’s Scientific Reports.
    When applied to SARS-CoV-2 — the virus that causes COVID-19 — the computer model quickly eliminated 95% of the compounds that could possibly have treated the pathogen and pinpointed the best options, the study says.

    The AI-assisted method predicted 26 potential vaccines that would work against the coronavirus. From those, the scientists identified the best 11 from which to construct a multi-epitope vaccine, which can attack the spike proteins that the coronavirus uses to bind and penetrate a host cell. Vaccines target the region — or epitope — of the contagion to disrupt the spike protein, neutralizing the ability of the virus to replicate.
    Moreover, the engineers can construct a new multi-epitope vaccine for a new virus in less than a minute and validate its quality within an hour. By contrast, conventional processes require growing the pathogen in the lab, deactivating it, and injecting the weakened virus into patients. The process is time-consuming, taking more than a year; meanwhile, the disease spreads.
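    As a rough illustration of this kind of pipeline, the hedged Python sketch below ranks candidate epitopes with a stand-in scoring function and joins the top B-cell and T-cell epitopes into a single multi-epitope construct. The scoring function, example sequences, and linker strings are placeholders for illustration, not the USC framework's actual models or choices.

    ```python
    # Hedged sketch (not the USC framework): rank candidate epitopes with a scoring
    # model and join the top B-cell and T-cell epitopes into one multi-epitope
    # construct. The scoring function and linker choices here are placeholders.
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Epitope:
        sequence: str       # amino-acid sequence, e.g. "YLQPRTFLL"
        kind: str           # "B" or "T"
        score: float = 0.0  # predicted immunogenicity / binding score

    def rank_epitopes(candidates: List[Epitope],
                      score_fn: Callable[[str], float]) -> List[Epitope]:
        """Score every candidate and return them sorted best-first."""
        for ep in candidates:
            ep.score = score_fn(ep.sequence)
        return sorted(candidates, key=lambda ep: ep.score, reverse=True)

    def assemble_construct(ranked: List[Epitope], n_b: int, n_t: int) -> str:
        """Concatenate the top B- and T-cell epitopes with simple linkers."""
        b_top = [ep.sequence for ep in ranked if ep.kind == "B"][:n_b]
        t_top = [ep.sequence for ep in ranked if ep.kind == "T"][:n_t]
        # "GPGPG" and "AAY" are linkers often seen in multi-epitope designs;
        # treat them as illustrative defaults here.
        return "GPGPG".join(b_top) + "AAY" + "AAY".join(t_top)

    # Toy usage with a dummy scoring function standing in for a trained model.
    dummy_score = lambda seq: sum(ord(c) for c in seq) % 100 / 100.0
    candidates = [Epitope("YLQPRTFLL", "T"), Epitope("NLDSKVGGN", "B"),
                  Epitope("GVYFASTEK", "T"), Epitope("KIADYNYKL", "B")]
    ranked = rank_epitopes(candidates, dummy_score)
    print(assemble_construct(ranked, n_b=2, n_t=2))
    ```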
    USC method could help counter COVID-19 mutations
    The method is especially useful during this stage of the pandemic as the coronavirus begins to mutate in populations around the world. Some scientists are concerned that the mutations may minimize the effectiveness of vaccines by Pfizer and Moderna, which are now being distributed. Recent variants of the virus that have emerged in the United Kingdom, South Africa and Brazil seem to spread more easily, which scientists say will rapidly lead to many more cases, deaths and hospitalizations.
    But Bogdan said that if SARS-CoV-2 becomes uncontrollable by current vaccines, or if new vaccines are needed to deal with other emerging viruses, then USC’s AI-assisted method can be used to design other preventive mechanisms quickly.

    For example, the study explains that the USC scientists used only one B-cell epitope and one T-cell epitope, whereas a bigger dataset and more possible combinations could yield a more comprehensive and faster vaccine design tool. The study estimates the method can perform accurate predictions with over 700,000 different proteins in the dataset.
    “The proposed vaccine design framework can tackle the three most frequently observed mutations and be extended to deal with other potentially unknown mutations,” Bogdan said.
    The raw data for the research comes from a giant bioinformatics database called the Immune Epitope Database (IEDB), in which scientists around the world have been compiling data about the coronavirus, among other diseases; IEDB contains over 600,000 known epitopes from some 3,600 different species. The researchers also drew on the Virus Pathogen Resource, a complementary repository of information about pathogenic viruses. The genome and spike protein sequence of SARS-CoV-2 come from the National Center for Biotechnology Information.
    COVID-19 has led to 87 million cases and more than 1.88 million deaths worldwide, including more than 400,000 fatalities in the United States. It has devastated the social, financial and political fabric of many countries.
    The study authors are Bogdan, Zikun Yang and Shahin Nazarian of the Ming Hsieh Department of Electrical and Computer Engineering at USC Viterbi.
    Support for the study comes from the National Science Foundation (NSF) under a CAREER Award (CPS/CNS-1453860) and NSF grants (CCF-1837131, MCB-1936775 and CNS-1932620); a U.S. Army Research Office grant (W911NF-17-1-0076); a Defense Advanced Research Projects Agency (DARPA) Young Faculty Award and Director Award grant (N66001-17-1-4044); and a Northrop Grumman grant.

  • Engineers develop programming technology to transform 2D materials into 3D shapes

    University of Texas at Arlington researchers have developed a technique that programs 2D materials to transform into complex 3D shapes.
    The goal of the work is to create synthetic materials that can mimic how living organisms expand and contract soft tissues and thus achieve complex 3D movements and functions. Programming thin sheets, or 2D materials, to morph into 3D shapes can enable new technologies for soft robotics, deployable systems, and biomimetic manufacturing, which produces synthetic products that mimic biological processes.
    Kyungsuk Yum, an associate professor in the Materials Science and Engineering Department, and his team have developed the 2D material programming technique for 3D shaping. It allows the team to print 2D materials encoded with spatially controlled in-plane growth or contraction that can transform into programmed 3D structures.
    Their research, supported by a National Science Foundation Early Career Development Award that Yum received in 2019, was published in January in Nature Communications.
    “There are a variety of 3D-shaped 2D materials in biological systems, and they play diverse functions,” Yum said. “Biological organisms often achieve complex 3D morphologies and motions of soft slender tissues by spatially controlling their expansion and contraction. Such biological processes have inspired us to develop a method that programs 2D materials with spatially controlled in-plane growth to produce 3D shapes and motions.”
    With this inspiration, the researchers developed an approach that can uniquely create 3D structures with doubly curved morphologies and motions, commonly seen in living organisms but difficult to replicate with human-made materials.
    They were able to form 3D structures shaped like automobiles, stingrays, and human faces. To physically realize the concept of 2D material programming, they used a digital light 4D printing method developed by Yum and shared in Nature Communications in 2018.
    “Our 2D-printing process can simultaneously print multiple 2D materials encoded with individually customized designs and transform them on demand and in parallel to programmed 3D structures,” said Amirali Nojoomi, Yum’s former graduate student and first author of the paper. “From a technological point of view, our approach is scalable, customizable, and deployable, and it can potentially complement existing 3D-printing methods.”
    The researchers also introduced the concept of cone flattening, where they program 2D materials using a cone surface to increase the accessible space of 3D shapes. To solve a shape selection problem, they devised shape-guiding modules in 2D material programming that steer the direction of shape morphing toward targeted 3D shapes. Their flexible 2D-printing process can also enable multimaterial 3D structures.
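    As a toy illustration of "programming" a flat sheet, the sketch below computes a spatially varying in-plane contraction field that would, in principle, turn a flat disk into a spherical cap: a point at radius r keeps its radial lengths while its circumferential lengths must shrink by R*sin(r/R)/r. This is a generic metric-programming example under assumed parameters, not the authors' printing process or design rules.

    ```python
    # Illustrative sketch (not the authors' pipeline): compute the spatially varying
    # in-plane growth/contraction field that maps a flat disk onto a spherical cap.
    # Sending a material point at radius r to polar angle theta = r / R on a sphere
    # of radius R preserves radial lengths but requires circumferential lengths to
    # shrink by R*sin(r/R)/r; that ratio is the quantity to "encode" into the sheet.
    import numpy as np

    R = 10.0           # target sphere radius (arbitrary units)
    disk_radius = 8.0  # radius of the flat 2D sheet

    # Sample the sheet on a grid.
    x = np.linspace(-disk_radius, disk_radius, 201)
    X, Y = np.meshgrid(x, x)
    r = np.sqrt(X**2 + Y**2)
    inside = r <= disk_radius

    # Circumferential growth ratio (values < 1 mean programmed contraction).
    ratio = np.ones_like(r)
    nonzero = inside & (r > 1e-9)
    ratio[nonzero] = R * np.sin(r[nonzero] / R) / r[nonzero]
    ratio[~inside] = np.nan  # outside the printed sheet

    print("circumferential growth ratio: min %.3f at the rim, 1.000 at the center"
          % np.nanmin(ratio))
    ```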
    “Dr. Yum’s innovative research has many potential applications that could change the way we look at soft engineering systems,” said Stathis Meletis, chair of the Materials Science and Engineering Department. “His pioneering work is truly groundbreaking.”

    Story Source:
    Materials provided by University of Texas at Arlington. Original written by Jeremy Agor. Note: Content may be edited for style and length.

  • Pushed to the limit: A CMOS-based transceiver for beyond 5G applications at 300 GHz

    Scientists at Tokyo Institute of Technology (Tokyo Tech) and NTT Corporation (NTT) have developed a novel CMOS-based transceiver for wireless communications in the 300 GHz band, enabling future beyond-5G applications. Their design addresses the challenges of operating CMOS technology at its practical limit and represents the first wideband CMOS phased-array system to operate at such elevated frequencies.
    Communication at higher frequencies is a perpetually sought-after goal in electronics because it enables greater data rates and takes advantage of underutilized portions of the electromagnetic spectrum. Many applications beyond 5G, as well as the IEEE 802.15.3d standard for wireless communications, call for transmitters and receivers capable of operating close to or above 300 GHz.
    Unfortunately, our trusty CMOS technology is not entirely suitable for such elevated frequencies. Near 300 GHz, amplification becomes considerably more difficult. Although a few CMOS-based transceivers for 300 GHz have been proposed, they either lack sufficient output power, can only operate in direct line-of-sight conditions, or require a large circuit area to implement.
    To address these issues, a team of scientists from Tokyo Tech, in collaboration with NTT, proposed an innovative design for a 300 GHz CMOS-based transceiver. Their work will be presented in the Digests of Technical Papers of the 2021 IEEE International Solid-State Circuits Conference (ISSCC), a conference showcasing the latest advances in solid-state and integrated circuits.
    One of the key features of the proposed design is that it is bidirectional; a great portion of the circuit, including the mixer, antennas, and local oscillator, is shared between the receiver and the transmitter. This means the overall circuit complexity and the total circuit area required are much lower than in unidirectional implementations.
    Another important aspect is the use of four antennas in a phased array configuration. Existing solutions for 300 GHz CMOS transmitters use a single radiating element, which limits the antenna gain and the system’s output power. An additional advantage is the beamforming capability of phased arrays, which allows the device to adjust the relative phases of the antenna signals to create a combined radiation pattern with custom directionality. The antennas used are stacked “Vivaldi antennas,” which can be etched directly onto PCBs, making them easy to fabricate.
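    The following hedged sketch shows the generic beam-steering arithmetic behind a four-element phased array at 300 GHz: choose per-element phase shifts so the four signals add coherently in a desired direction. The element spacing, steering angle, and geometry are illustrative assumptions, not the actual antenna layout used in this work.

    ```python
    # Generic phased-array beam steering at 300 GHz (illustrative parameters only).
    import numpy as np

    c = 3e8           # speed of light, m/s
    f = 300e9         # carrier frequency, Hz
    lam = c / f       # wavelength, about 1 mm
    d = lam / 2       # assumed half-wavelength element spacing
    n_elem = 4
    steer_deg = 20.0  # desired beam direction from broadside

    # Per-element phase shifts that align the wavefront toward steer_deg.
    k = 2 * np.pi / lam
    phases = -k * d * np.arange(n_elem) * np.sin(np.radians(steer_deg))

    # Array factor versus angle: the combined pattern of the four elements.
    angles = np.radians(np.linspace(-90, 90, 721))
    af = np.abs(np.exp(1j * (k * d * np.arange(n_elem)[:, None] * np.sin(angles)
                             + phases[:, None])).sum(axis=0)) / n_elem

    peak = np.degrees(angles[np.argmax(af)])
    print(f"element phases (deg): {np.degrees(phases).round(1)}")
    print(f"array-factor peak at ~{peak:.1f} deg (target {steer_deg} deg)")
    ```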
    The proposed transceiver uses a subharmonic mixer, which is compatible with a bidirectional operation and requires a local oscillator with a comparatively lower frequency. However, this type of mixing results in low output power, which led the team to resort to an old yet functional technique to boost it. Professor Kenichi Okada from Tokyo Tech, who led the study, explains: “Outphasing is a method generally used to improve the efficiency of power amplifiers by enabling their operation at output powers close to the point where they no longer behave linearly — that is, without distortion. In our work, we used this approach to increase the transmitted output power by operating the mixers at their saturated output power.” Another notable feature of the new transceiver is its excellent cancellation of local oscillator feedthrough (a “leakage” from the local oscillator through the mixer and onto the output) and image frequency (a common type of interference for the method of reception used).
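    Outphasing itself can be summarized in a few lines: an amplitude-varying signal is split into two constant-envelope signals whose sum reproduces the original, so each branch can be driven near saturation. The sketch below is the textbook decomposition with made-up signal parameters, not the chip's circuit implementation.

    ```python
    # Textbook outphasing (LINC) decomposition with illustrative signal parameters.
    import numpy as np

    t = np.linspace(0, 1e-9, 1000)                    # 1 ns of baseband samples
    a = 0.5 + 0.4 * np.sin(2 * np.pi * 2e9 * t)       # time-varying envelope
    phi = 2 * np.pi * 1e9 * t                         # baseband phase
    s = a * np.exp(1j * phi)                          # original signal

    a_max = a.max()
    theta = np.arccos(np.clip(a / a_max, -1.0, 1.0))  # outphasing angle
    s1 = 0.5 * a_max * np.exp(1j * (phi + theta))     # constant-envelope branch 1
    s2 = 0.5 * a_max * np.exp(1j * (phi - theta))     # constant-envelope branch 2

    print("max reconstruction error:", np.abs(s - (s1 + s2)).max())
    print("branch envelopes are constant:",
          np.allclose(np.abs(s1), 0.5 * a_max), np.allclose(np.abs(s2), 0.5 * a_max))
    ```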
    The entire transceiver was implemented in an area as small as 4.17 mm². It achieved maximum rates of 26 Gbaud for transmission and 18 Gbaud for reception, outclassing most state-of-the-art solutions. Excited about the results, Okada remarks: “Our work demonstrates the first implementation of a wideband CMOS phased-array system that operates at frequencies higher than 200 GHz.” Let us hope this study helps us squeeze more juice out of CMOS technology for upcoming applications in wireless communications!

    Story Source:
    Materials provided by Tokyo Institute of Technology. Note: Content may be edited for style and length.

  • 'Audeo' teaches artificial intelligence to play the piano

    Anyone who’s been to a concert knows that something magical happens between the performers and their instruments. It transforms music from being just “notes on a page” to a satisfying experience.
    A University of Washington team wondered if artificial intelligence could recreate that delight using only visual cues — a silent, top-down video of someone playing the piano. The researchers used machine learning to create a system, called Audeo, that creates audio from silent piano performances. When the group tested the music Audeo created with music-recognition apps, such as SoundHound, the apps correctly identified the piece Audeo played about 86% of the time. For comparison, these apps identified the piece in the audio tracks from the source videos 93% of the time.
    The researchers presented Audeo Dec. 8 at the NeurIPS 2020 conference.
    “To create music that sounds like it could be played in a musical performance was previously believed to be impossible,” said senior author Eli Shlizerman, an assistant professor in both the applied mathematics and the electrical and computer engineering departments. “An algorithm needs to figure out the cues, or ‘features,’ in the video frames that are related to generating music, and it needs to ‘imagine’ the sound that’s happening in between the video frames. It requires a system that is both precise and imaginative. The fact that we achieved music that sounded pretty good was a surprise.”
    Audeo uses a series of steps to decode what’s happening in the video and then translate it into music. First, it has to detect which keys are pressed in each video frame to create a diagram over time. Then it needs to translate that diagram into something that a music synthesizer would actually recognize as a sound a piano would make. This second step cleans up the data and adds in more information, such as how strongly each key is pressed and for how long.
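    A minimal sketch of that intermediate representation is shown below: a per-frame matrix of pressed keys is converted into note events with pitch, onset, and duration, which a synthesizer could then render. The frame rate, key indexing, and helper names are illustrative assumptions, not Audeo's actual models.

    ```python
    # Hedged sketch: turn a per-frame key-press matrix (rows = video frames,
    # columns = 88 piano keys) into note events a synthesizer could play.
    import numpy as np

    FPS = 25          # video frame rate assumed for this example
    MIDI_OFFSET = 21  # MIDI pitch of the lowest piano key (A0)

    def roll_to_notes(key_roll: np.ndarray, fps: int = FPS):
        """key_roll[t, k] == 1 if key k is pressed in frame t."""
        notes = []
        n_frames, n_keys = key_roll.shape
        for k in range(n_keys):
            pressed = np.flatnonzero(key_roll[:, k])
            if pressed.size == 0:
                continue
            # Split consecutive pressed frames into separate note events.
            runs = np.split(pressed, np.where(np.diff(pressed) > 1)[0] + 1)
            for run in runs:
                notes.append({"pitch": MIDI_OFFSET + k,
                              "onset_s": run[0] / fps,
                              "duration_s": (run[-1] - run[0] + 1) / fps})
        return notes

    # Toy example: middle C (key index 39, MIDI 60) held for frames 2-6.
    roll = np.zeros((10, 88), dtype=int)
    roll[2:7, 39] = 1
    print(roll_to_notes(roll))
    ```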
    “If we attempt to synthesize music from the first step alone, we would find the quality of the music to be unsatisfactory,” Shlizerman said. “The second step is like how a teacher goes over a student composer’s music and helps enhance it.”
    The researchers trained and tested the system using YouTube videos of the pianist Paul Barton. The training consisted of about 172,000 video frames of Barton playing music from well-known classical composers, such as Bach and Mozart. Then they tested Audeo with almost 19,000 frames of Barton playing different music from these composers and others, such as Scott Joplin.
    Once Audeo has generated a transcript of the music, it’s time to give it to a synthesizer that can translate it into sound. Every synthesizer will make the music sound a little different — this is similar to changing the “instrument” setting on an electric keyboard. For this study, the researchers used two different synthesizers.
    “Fluidsynth makes synthesizer piano sounds that we are familiar with. These are somewhat mechanical-sounding but pretty accurate,” Shlizerman said. “We also used PerfNet, a new AI synthesizer that generates richer and more expressive music. But it also generates more noise.”
    Audeo was trained and tested only on Paul Barton’s piano videos. Future research is needed to see how well it could transcribe music for any musician or piano, Shlizerman said.
    “The goal of this study was to see if artificial intelligence could generate music that was played by a pianist in a video recording — though we were not aiming to replicate Paul Barton because he is such a virtuoso,” Shlizerman said. “We hope that our study enables novel ways to interact with music. For example, one future application is that Audeo can be extended to a virtual piano with a camera recording just a person’s hands. Also, by placing a camera on top of a real piano, Audeo could potentially assist in new ways of teaching students how to play.”
    Kun Su and Xiulong Liu, both doctoral students in electrical and computer engineering, are co-authors on this paper. This research was funded by the Washington Research Foundation Innovation Fund as well as the applied mathematics and electrical and computer engineering departments.

    Story Source:
    Materials provided by University of Washington. Original written by Sarah McQuate. Note: Content may be edited for style and length.

  • Shopping online? Here's what you should know about user reviews

    If you’re about to buy something online and its only customer review is negative, you’d probably reconsider the purchase, right? It turns out a product’s first review can have an outsized effect on the item’s future — it can even cause the product to fail.
    Shoppers, retailers and manufacturers alike feel the effects of customer reviews. Researchers at the University of Florida’s Warrington College of Business looked at the influence of the first review after noticing the exact same products getting positive reviews on one retailer’s website but negative reviews on others, said Sungsik Park, Ph.D., who studied the phenomenon as a doctoral student at UF.
    “Why would a product receive a 4.7-star rating with 100 reviews on Amazon, but only four or five reviews with a two-star rating on Walmart or Best Buy?” Park wondered.
    To find out, Park — now an assistant professor at the Darla Moore School of Business at the University of South Carolina — teamed up with UF professors Jinhong Xie, Ph.D., and Woochoel Shin, Ph.D., to analyze what might cause the variation. By comparing identical vacuum cleaners, toasters and digital cameras on Amazon and Best Buy, they were able to isolate the first review as the variable in how the product fared. They showed that the first review can affect a product’s overall reviews for up to three years, influencing both the number and the tone of later reviews.
    “The first review has the potential to sway the entire evolution path of online consumer reviews,” Shin said.
    How could one review have such a lasting impact? When the first review on a retailer’s site was positive, the product went on to garner a larger number of reviews overall, and they were more likely to be positive. When a product got a negative first review, fewer people were willing to take a chance on buying it, so it had fewer opportunities to receive positive reviews, creating a lingering impact from the first unhappy customer.
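    A toy simulation makes the mechanism easy to see: if the running average rating influences whether the next shopper buys, and therefore whether another review arrives, an early negative review slows both the accumulation of reviews and the recovery of the average. The probabilities and rating distribution below are invented purely for illustration and are not the authors' model or data.

    ```python
    # Toy path-dependence simulation (illustrative only, not the study's model).
    import random

    def simulate(first_review: int, shoppers: int = 500, seed: int = 1):
        random.seed(seed)
        ratings = [first_review]                    # star ratings, 1-5
        for _ in range(shoppers):
            avg = sum(ratings) / len(ratings)
            buy_prob = 0.05 + 0.10 * (avg - 1) / 4  # better average -> more buyers
            if random.random() < buy_prob:
                # Assume the product is genuinely decent: later ratings center near 4.
                ratings.append(random.choices([3, 4, 5], weights=[2, 5, 3])[0])
        return len(ratings) - 1, sum(ratings) / len(ratings)

    for first in (5, 1):
        n_reviews, avg = simulate(first)
        print(f"first review = {first} stars -> {n_reviews} later reviews, average {avg:.2f}")
    ```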
    “Once you think about how user reviews are generated, it makes sense,” Park said.
    The findings, published in the journal Marketing Science, suggest that retailers and manufacturers should take steps to detect negative first reviews and mitigate their impact.
    Firms generally monitor their online reviews and evaluate their strategies accordingly, Xie explained. “However, they do so by focusing on average rating rather than a single rating, and after the product has sufficient time to be evaluated by consumers. Our research suggests that firms need to pay attention to a special single review (i.e., the first one) as soon as it is posted.”
    Consumers, on the other hand, might want to check multiple sites’ reviews before they rule out a product. If you’re looking at several sites to compare prices, Park suggests comparison-shopping the reviews, too. (For big-ticket items, Park also checks third-party reviews such as Consumer Reports.)
    Because shoppers consider user reviews more trustworthy than information from advertising, it’s important to understand the factors that could skew those ratings.
    “We want consumers to know that this information can be easily distorted,” Park said.

    Story Source:
    Materials provided by University of Florida. Original written by Alisson Clark. Note: Content may be edited for style and length.

  • Using Artificial Intelligence to prevent harm caused by immunotherapy

    Researchers at Case Western Reserve University, using artificial intelligence (AI) to analyze simple tissue scans, say they have discovered biomarkers that could tell doctors which lung cancer patients might actually get worse from immunotherapy.
    Until recently, researchers and oncologists had placed these lung cancer patients into two broad categories: those who would benefit from immunotherapy, and those who likely would not.
    But a third category — patients called hyper-progressors who would actually be harmed by immunotherapy, including a shortened lifespan after treatment — has begun to emerge, said Pranjal Vaidya, a PhD student in biomedical engineering and researcher at the university’s Center for Computational Imaging and Personalized Diagnostics (CCIPD).
    “This is a significant subset of patients who should potentially avoid immunotherapy entirely,” said Vaidya, first author on a 2020 paper announcing the findings in the Journal for Immunotherapy of Cancer. “Eventually, we would want this to be integrated into clinical settings, so that the doctors would have all the information needed to make the call for each individual patient.”
    Ongoing research into immunotherapy
    Currently, only about 20% of all cancer patients will actually benefit from immunotherapy, a treatment that differs from chemotherapy in that it uses drugs to help the immune system fight cancer, while chemotherapy uses drugs to directly kill cancer cells, according to the National Cancer Institute.

    The CCIPD, led by Anant Madabhushi, Donnell Institute Professor of Biomedical Engineering, has become a global leader in the detection, diagnosis and characterization of various cancers and other diseases by meshing medical imaging, machine learning and AI.
    This new work follows other recent research by CCIPD scientists which has demonstrated that AI and machine learning can be used to predict which lung cancer patients will benefit from immunotherapy.
    In this and previous research, scientists from Case Western Reserve and Cleveland Clinic essentially teach computers to seek and identify patterns in CT scans taken when lung cancer is first diagnosed to reveal information that could have been useful if known before treatment.
    And while many cancer patients have benefitted from immunotherapy, researchers are seeking a better way to identify who would most likely respond to those treatments.
    “This is an important finding because it shows that radiomic patterns from routine CT scans are able to discern three kinds of response in lung cancer patients undergoing immunotherapy treatment — responders, non-responders and the hyper-progressors,” said Madabhushi, senior author of the study.

    “There are currently no validated biomarkers to distinguish this subset of high-risk patients, who not only don’t benefit from immunotherapy but may in fact develop rapid acceleration of disease on treatment,” said Pradnya Patil, MD, FACP, associate staff at Taussig Cancer Institute, Cleveland Clinic, and study author.
    “Analysis of radiomic features on routinely performed pre-treatment scans could provide a non-invasive means to identify these patients,” Patil said. “This could prove to be an invaluable tool for treating clinicians while determining optimal systemic therapy for their patients with advanced non-small cell lung cancer.”
    Information outside the tumor
    As with other previous cancer research at the CCIPD, scientists again found some of the most significant clues to which patients would be harmed by immunotherapy outside the tumor.
    “We noticed the radiomic features outside the tumor were more predictive than those inside the tumor, and changes in the blood vessels surrounding the nodule were also more predictive,” Vaidya said.
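    The general recipe, sketched below in hedged form, is to compute features not only inside the segmented nodule but also in a peritumoral ring around it, and to train a classifier on both feature sets to separate responders, non-responders, and hyper-progressors. The feature choices, ring width, random data, and classifier are stand-ins for illustration, not the CCIPD pipeline.

    ```python
    # Hedged sketch (not the CCIPD pipeline): simple intensity statistics from the
    # tumor and a surrounding peritumoral ring, fed to a three-class classifier.
    import numpy as np
    from scipy import ndimage
    from sklearn.ensemble import RandomForestClassifier

    def region_features(ct: np.ndarray, mask: np.ndarray) -> list:
        """A few first-order radiomic-style statistics for the voxels in `mask`."""
        vals = ct[mask]
        return [vals.mean(), vals.std(), np.percentile(vals, 10), np.percentile(vals, 90)]

    def tumor_and_ring_features(ct: np.ndarray, tumor_mask: np.ndarray, ring_vox: int = 5):
        """Features from the nodule plus a peritumoral ring around it."""
        dilated = ndimage.binary_dilation(tumor_mask, iterations=ring_vox)
        ring = dilated & ~tumor_mask
        return region_features(ct, tumor_mask) + region_features(ct, ring)

    # Toy usage with random "CT" volumes; real inputs would be segmented scans.
    rng = np.random.default_rng(0)
    X, y = [], []
    for label in (0, 1, 2):  # responder, non-responder, hyper-progressor
        for _ in range(10):
            ct = rng.normal(label * 5, 20, size=(32, 32, 32))
            mask = np.zeros_like(ct, dtype=bool)
            mask[12:20, 12:20, 12:20] = True
            X.append(tumor_and_ring_features(ct, mask))
            y.append(label)

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print("training accuracy on toy data:", clf.score(X, y))
    ```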
    This most recent research was conducted with data collected from 109 patients with non-small cell lung cancer being treated with immunotherapy, she said.

  • Machine-learning model helps determine protein structures

    Cryo-electron microscopy (cryo-EM) allows scientists to produce high-resolution, three-dimensional images of tiny molecules such as proteins. This technique works best for imaging proteins that exist in only one conformation, but MIT researchers have now developed a machine-learning algorithm that helps them identify multiple possible structures that a protein can take.
    While some AI techniques aim to predict protein structure from sequence data alone, protein structure can also be determined experimentally using cryo-EM, which produces hundreds of thousands, or even millions, of two-dimensional images of protein samples frozen in a thin layer of ice. Computer algorithms then piece together these images, taken from different angles, into a three-dimensional representation of the protein in a process termed reconstruction.
    In a Nature Methods paper, the MIT researchers report a new AI-based software for reconstructing multiple structures and motions of the imaged protein — a major goal in the protein science community. Instead of using the traditional representation of protein structure as electron-scattering intensities on a 3D lattice, which is impractical for modeling multiple structures, the researchers introduced a new neural network architecture that can efficiently generate the full ensemble of structures in a single model.
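    In rough outline, a coordinate-based network of this flavor maps a 3D position together with a latent "conformation" code to a density value, so one set of weights can represent a continuum of structures instead of a single voxel grid. The hedged PyTorch sketch below shows only that architectural idea; the layer sizes, latent dimension, and names are assumptions, and the published method's actual architecture and training against the 2D images are considerably more involved.

    ```python
    # Minimal sketch of a coordinate-based density network with a latent
    # conformation code (illustrative architecture, not the published model).
    import torch
    import torch.nn as nn

    class EnsembleDensityNet(nn.Module):
        def __init__(self, latent_dim: int = 8, hidden: int = 128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )

        def forward(self, xyz: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
            # xyz: (N, 3) query coordinates; z: (latent_dim,) conformation code.
            z_rep = z.expand(xyz.shape[0], -1)
            return self.net(torch.cat([xyz, z_rep], dim=-1)).squeeze(-1)

    # Query the same weights at two latent codes to get two (untrained) "conformations".
    model = EnsembleDensityNet()
    coords = torch.rand(1000, 3) * 2 - 1  # points in a normalized box
    density_state_a = model(coords, torch.zeros(8))
    density_state_b = model(coords, torch.ones(8))
    print(density_state_a.shape, density_state_b.shape)
    ```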
    “With the broad representation power of neural networks, we can extract structural information from noisy images and visualize detailed movements of macromolecular machines,” says Ellen Zhong, an MIT graduate student and the lead author of the paper.
    With their software, they discovered protein motions from imaging datasets where only a single static 3D structure was originally identified. They also visualized large-scale flexible motions of the spliceosome — a protein complex that coordinates the splicing of the protein coding sequences of transcribed RNA.
    “Our idea was to try to use machine-learning techniques to better capture the underlying structural heterogeneity, and to allow us to inspect the variety of structural states that are present in a sample,” says Joseph Davis, the Whitehead Career Development Assistant Professor in MIT’s Department of Biology.

    Davis and Bonnie Berger, the Simons Professor of Mathematics at MIT and head of the Computation and Biology group at the Computer Science and Artificial Intelligence Laboratory, are the senior authors of the study, which appears today in Nature Methods. MIT postdoc Tristan Bepler is also an author of the paper.
    Visualizing a multistep process
    The researchers demonstrated the utility of their new approach by analyzing structures that form during the process of assembling ribosomes — the cell organelles responsible for reading messenger RNA and translating it into proteins. Davis began studying the structure of ribosomes while a postdoc at the Scripps Research Institute. Ribosomes have two major subunits, each of which contains many individual proteins that are assembled in a multistep process.
    To study the steps of ribosome assembly in detail, Davis stalled the process at different points and then took electron microscope images of the resulting structures. At some points, blocking assembly resulted in accumulation of just a single structure, suggesting that there is only one way for that step to occur. However, blocking other points resulted in many different structures, suggesting that the assembly could occur in a variety of ways.
    Because some of these experiments generated so many different protein structures, traditional cryo-EM reconstruction tools did not work well to determine what those structures were.

    “In general, it’s an extremely challenging problem to try to figure out how many states you have when you have a mixture of particles,” Davis says.
    After starting his lab at MIT in 2017, he teamed up with Berger to use machine learning to develop a model that can use the two-dimensional images produced by cryo-EM to generate all of the three-dimensional structures found in the original sample.
    In the new Nature Methods study, the researchers demonstrated the power of the technique by using it to identify a new ribosomal state that hadn’t been seen before. Previous studies had suggested that as a ribosome is assembled, large structural elements, which are akin to the foundation for a building, form first. Only after this foundation is formed are the “active sites” of the ribosome, which read messenger RNA and synthesize proteins, added to the structure.
    In the new study, however, the researchers found that in a very small subset of ribosomes, about 1 percent, a structure that is normally added at the end actually appears before assembly of the foundation. To account for that, Davis hypothesizes that it might be too energetically expensive for cells to ensure that every single ribosome is assembled in the correct order.
    “The cells likely evolved to find a balance between what they can tolerate, which is maybe a small percentage of these types of potentially deleterious structures, and what it would cost to completely remove them from the assembly pathway,” he says.
    Viral proteins
    The researchers are now using this technique to study the coronavirus spike protein, which is the viral protein that binds to receptors on human cells and allows them to enter cells. The receptor binding domain (RBD) of the spike protein has three subunits, each of which can point either up or down.
    “For me, watching the pandemic unfold over the past year has emphasized how important front-line antiviral drugs will be in battling similar viruses, which are likely to emerge in the future. As we start to think about how one might develop small molecule compounds to force all of the RBDs into the ‘down’ state so that they can’t interact with human cells, understanding exactly what the ‘up’ state looks like and how much conformational flexibility there is will be informative for drug design. We hope our new technique can reveal these sorts of structural details,” Davis says.
    The research was funded by the National Science Foundation Graduate Research Fellowship Program, the National Institutes of Health, and the MIT Jameel Clinic for Machine Learning and Health. This work was supported by the MIT Satori computation cluster hosted at the MGHPCC.