More stories

  • in

    First direct imaging of small noble gas clusters at room temperature

    For the first time, scientists have succeeded in the stabilisation and direct imaging of small clusters of noble gas atoms at room temperature. This achievement opens up exciting possibilities for fundamental research in condensed matter physics and applications in quantum information technology. The key to this breakthrough, achieved by scientists at the University of Vienna in collaboration with colleagues at the University of Helsinki, was the confinement of noble gas atoms between two layers of graphene.
    This method overcomes the difficulty that noble gases do not form stable structures under experimental conditions at ambient temperatures. Details of the method and the first ever electron microscopy images of noble gas structures (krypton and xenon) have now been published in Nature Materials.
    A Noble Trap
    Jani Kotakoski’s group at the University of Vienna was investigating the use of ion irradiation to modify the properties of graphene and other two-dimensional materials when they noticed something unusual: when noble gases are used to irradiate, they can get trapped between two sheets of graphene. This happens when noble gas ions are fast enough to pass through the first but not the second graphene layer. Once trapped between the layers, the noble gases are free to move. This is because they do not form chemical bonds. However, in order to accommodate the noble gas atoms, the graphene bends to form tiny pockets. Here, two or more noble gas atoms can meet and form regular, densely packed, two-dimensional noble gas nanoclusters.
    Fun with Microscope
    “We used scanning transmission electron microscopy to observe these clusters, and they are really fascinating and a lot of fun to watch. They rotate, jump, grow and shrink as we image them,” says Manuel Längle, lead author of the study. “Getting the atoms between the layers was the hardest part of the work. Now that we have achieved this, we have a simple system for studying fundamental processes related to material growth and behavior,” he adds. Commenting on the group’s future work, Jani Kotakoski says: “The next steps are to study the properties of clusters with different noble gases and how they behave at low and high temperatures. Due to the use of noble gases in light sources and lasers, these new structures may in future enable applications, for example in quantum information technology.”

  • in

    New study pinpoints the weaknesses in AI

    ChatGPT and other solutions built on Machine Learning are surging. But even the most successful algorithms have limitations. Researchers from the University of Copenhagen are the first in the world to prove mathematically that, apart from simple problems, it is not possible to create algorithms for AI that will always be stable. The study may lead to guidelines on how to better test algorithms and reminds us that machines do not have human intelligence after all.
    Machines interpret medical scanning images more accurately than doctors, they translate foreign languages, and may soon be able to drive cars more safely than humans. However, even the best algorithms do have weaknesses. A research team at the Department of Computer Science, University of Copenhagen, is trying to reveal them.
    Take an automated vehicle reading a road sign as an example. If someone has placed a sticker on the sign, this will not distract a human driver. But a machine may easily be put off because the sign is now different from the ones it was trained on.
    “We would like algorithms to be stable in the sense that if the input is changed slightly, the output will remain almost the same. Real life involves all kinds of noise which humans are used to ignoring, while machines can get confused,” says Professor Amir Yehudayoff, who heads the group.
    A language for discussing weaknesses
    As the first in the world, the group together with researchers from other countries has proven mathematically that apart from simple problems it is not possible to create algorithms for Machine Learning that will always be stable. The scientific article describing the result was approved for publication at one of the leading international conferences on theoretical computer science, Foundations of Computer Science (FOCS).
    “I would like to note that we have not worked directly on automated car applications. Still, this seems like a problem too complex for algorithms to always be stable,” says Amir Yehudayoff, adding that this does not necessarily imply major consequences in relation to development of automated cars:
    “If the algorithm only errs under a few very rare circumstances this may well be acceptable. But if it does so under a large collection of circumstances, it is bad news.”

    The scientific article cannot be applied by industry for identifying bugs in its algorithms. This wasn’t the intention, the professor explains:
    “We are developing a language for discussing the weaknesses in Machine Learning algorithms. This may lead to development of guidelines that describe how algorithms should be tested. And in the long run this may again lead to development of better and more stable algorithms.”
    From intuition to mathematics
    A possible application could be for testing algorithms for protection of digital privacy.
    “Some company might claim to have developed an absolutely secure solution for privacy protection. Firstly, our methodology might help to establish that the solution cannot be absolutely secure. Secondly, it will be able to pinpoint points of weakness,” says Amir Yehudayoff.
    First and foremost, though, the scientific article contributes to theory. The mathematical content in particular is groundbreaking, he adds:
    “We understand intuitively, that a stable algorithm should work almost as well as before when exposed to a small amount of input noise. Just like the road sign with a sticker on it. But as theoretical computer scientists we need a firm definition. We must be able to describe the problem in the language of mathematics. Exactly how much noise must the algorithm be able to withstand, and how close to the original output should the output be if we are to accept the algorithm to be stable? This is what we have suggested an answer to.”
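    One way to make the two questions in the quote concrete is an empirical check: perturb the input by at most some amount and see how far the output moves. The sketch below is only an illustration and is not the definition from the FOCS paper; the names `is_empirically_stable`, `eps`, `delta`, `smooth_score` and `threshold_classifier` are assumptions made for this example.

```python
import random


def is_empirically_stable(f, x, eps, delta, trials=1000, seed=0):
    """Return False as soon as a sampled perturbation of size <= eps
    moves the output of f by more than delta; otherwise return True."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    base = f(x)
    for _ in range(trials):
        noise = rng.uniform(-eps, eps)
        if abs(f(x + noise) - base) > delta:
            return False
    return True


def smooth_score(x):
    # Hypothetical model whose output changes gradually with the input.
    return 0.5 * x


def threshold_classifier(x):
    # Hypothetical model with a hard decision boundary at x = 1.0.
    return 1.0 if x >= 1.0 else 0.0


print(is_empirically_stable(smooth_score, 1.0, eps=0.1, delta=0.1))          # True
print(is_empirically_stable(threshold_classifier, 1.0, eps=0.1, delta=0.1))  # False
```

    Sampling of this kind can only find instability, not prove its absence, which is one reason a firm mathematical definition is needed. The hard-threshold model is unstable exactly at its decision boundary: a tiny change in the input flips the output, much like the sticker on the road sign.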

    Important to keep limitations in mind
    The scientific article has attracted great interest from colleagues in the theoretical computer science world, but not from the tech industry. Not yet at least.
    “You should always expect some delay between a new theoretical development and interest from people working in applications,” says Amir Yehudayoff while adding smilingly:
    “And some theoretical developments will remain unnoticed forever.”
    However, he does not see that happening in this case:
    “Machine Learning continues to progress rapidly, and it is important to remember that even solutions which are very successful in the real world still do have limitations. The machines may sometimes seem to be able to think but after all they do not possess human intelligence. This is important to keep in mind.”

  • in

    Artificial intelligence helps unlock advances in wireless communications

    A new wave of communication technology is quickly approaching and researchers at UBC Okanagan are investigating ways to configure next-generation mobile networks.
    Dr. Anas Chaaban works in the UBCO Communication Theory Lab where researchers are busy analyzing a theoretical wireless communication architecture that will be optimized to handle increasing data loads while sending and receiving data faster.
    Next-generation mobile networks are expected to outperform 5G on many fronts such as reliability, coverage and intelligence, explains Dr. Chaaban, an Assistant Professor in UBCO’s School of Engineering.
    And the benefits go far beyond speed. The next generation of technology is expected to be a fully integrated system that allows for instantaneous communications between devices, consumers and the surrounding environment, he says.
    These new networks will call for intelligent architectures that support massive connectivity, ultra-low latency, ultra-high reliability, high-quality experience, energy efficiency and lower deployment costs.
    “One way to meet these stringent requirements is to rethink traditional communication techniques by exploiting recent advances in artificial intelligence,” he says. “Traditionally, functions such as waveform design, channel estimation, interference mitigation and error detection and correction are developed based on theoretical models and assumptions. This traditional approach is not capable of adapting to new challenges introduced by emerging technologies.”
    Using a technology called transformer masked autoencoders, the researchers are developing techniques that enhance efficiency, adaptability and robustness. Dr. Chaaban says while there are many challenges in this research, it is expected it will play an important role in next-generation communication networks.
    “We are working on ways to take content like images or video files and break them down into smaller packets in order to transport them to a recipient,” he says. “The interesting thing is that we can throw away a number of packets and rely on AI to recover them at the recipient, which then links them back together to recreate the image or video.”
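    The split, drop and recover idea in the quote can be sketched in a few lines. This is a toy illustration only: the recovery step below simply copies the nearest received packet and stands in for the transformer masked autoencoder the researchers describe. All function names, the packet size and the drop fraction are assumptions made for this example.

```python
import random


def to_packets(signal, size):
    """Split a flat list of values into consecutive packets of at most `size` values."""
    return [signal[i:i + size] for i in range(0, len(signal), size)]


def drop_packets(packets, drop_fraction, seed=0):
    """Replace a random subset of packets with None to simulate discarding them."""
    rng = random.Random(seed)
    n_drop = int(len(packets) * drop_fraction)
    dropped = set(rng.sample(range(len(packets)), n_drop))
    return [None if i in dropped else p for i, p in enumerate(packets)]


def recover(packets):
    """Placeholder for learned recovery: fill each gap with a copy of the nearest received packet."""
    received = [i for i, p in enumerate(packets) if p is not None]
    if not received:
        raise ValueError("no packets received")
    restored = []
    for i, p in enumerate(packets):
        if p is None:
            nearest = min(received, key=lambda j: abs(j - i))
            p = list(packets[nearest])
        restored.append(p)
    return restored


def reassemble(packets):
    """Link the packets back together into one flat list."""
    return [value for p in packets for value in p]


signal = list(range(16))
packets = to_packets(signal, 4)
received = drop_packets(packets, 0.25)
restored = reassemble(recover(received))
print(len(restored) == len(signal))
```

    In a real system the `recover` step is where a trained model would go. The surrounding plumbing (packetising, masking, reassembly) stays the same regardless of how the missing packets are predicted.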
    The experience, even today, is something users take for granted but next-generation technology — where virtual reality will be a part of everyday communications including cell phone calls — is positioned to improve wireless systems substantially, he adds. The potential is unparalleled.
    “AI provides us with the power to develop complex architectures that propel communications technologies forward to cope with the proliferation of advanced technologies such as virtual reality,” says Chaaban. “By collectively tackling these intricacies, the next generation of wireless technology can usher in a new era of adaptive, efficient and secure communication networks.”

  • in

    Toward efficient spintronic materials

    A research team from Osaka University, The University of Tokyo, and Tokyo Institute of Technology revealed the microscopic origin of the large magnetoelectric effect in interfacial multiferroics composed of the ferromagnetic Co2FeSi Heusler alloy and a piezoelectric material. They observed element-specific changes in the orbital magnetic moments in the interfacial multiferroic material using X-ray Magnetic Circular Dichroism (XMCD) measurements under an applied electric field, and they showed that these changes contribute to the large magnetoelectric effect.
    The findings provide guidelines for designing materials with a large magnetoelectric effect, and they will be useful in developing new information-writing technology that consumes less power in spintronic memory devices.
    The research results appear in the article “Strain-induced specific orbital control in a Heusler alloy-based interfacial multiferroics,” published in NPG Asia Materials.
    Controlling the direction of magnetization using a low electric field is necessary for developing efficient spintronic devices. In spintronics, properties of an electron’s spin or magnetic moment are used to store information. The electron spins can be manipulated by straining orbital magnetic moments to create a high-performance magnetoelectric effect.
    Japanese researchers, including Jun Okabayashi from the University of Tokyo, revealed a strain-induced orbital control mechanism in interfacial multiferroics. In a multiferroic material, the magnetic property can be controlled using an electric field, potentially leading to efficient spintronic devices. The interfacial multiferroics that Okabayashi and his colleagues studied consist of a junction between a ferromagnetic material and a piezoelectric material. The direction of magnetization in the material could be controlled by applying a voltage.
    The team showed the microscopic origin of the large magnetoelectric effect in the material. The strain generated from the piezoelectric material could change the orbital magnetic moment of the ferromagnetic material. They revealed element-specific orbital control in the interfacial multiferroic material using reversible strain and provided guidelines for designing materials with a large magnetoelectric effect. The findings will be useful in developing new information writing technology that consumes less power.

  • in

    Integrating dimensions to get more out of Moore’s Law and advance electronics

    Moore’s Law, a fundamental scaling principle for electronic devices, forecasts that the number of transistors on a chip will double every two years, ensuring more computing power — but a limit exists.
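    Read as a formula, the doubling rule is simple exponential growth. As a rough illustration (the symbols N_0, for the starting transistor count, and t, for elapsed time in years, are introduced here and do not come from the study):

```latex
N(t) = N_0 \cdot 2^{t/2}, \qquad \text{so after } t = 10 \text{ years: } N(10) = N_0 \cdot 2^{5} = 32\,N_0 .
```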
    Today’s most advanced chips house nearly 50 billion transistors within a space no larger than your thumbnail. The task of cramming even more transistors into that confined area has become more and more difficult, according to Penn State researchers.
    In a study published today (Jan. 10) in the journal Nature, Saptarshi Das, an associate professor of engineering science and mechanics and co-corresponding author of the study, and his team suggest a remedy: seamlessly implementing 3D integration with 2D materials.
    In the semiconductor world, 3D integration means vertically stacking multiple layers of semiconductor devices. This approach not only facilitates the packing of more silicon-based transistors onto a computer chip, commonly referred to as “More Moore,” but also permits the use of transistors made from 2D materials to incorporate diverse functionalities within various layers of the stack, a concept known as “More than Moore.”
    With the work outlined in the study, Das and the team demonstrate feasible paths beyond scaling current tech to achieve both More Moore and More than Moore through monolithic 3D integration. Monolithic 3D integration is a fabrication process in which researchers build each layer of devices directly on top of the one below, in contrast to the traditional process of stacking independently fabricated layers.
    “Monolithic 3D integration offers the highest density of vertical connections as it does not rely on bonding of two pre-patterned chips — which would require microbumps where two chips are bonded together — so you have more space to make connections,” said Najam Sakib, graduate research assistant in engineering science and mechanics and co-author of the study.
    Monolithic 3D integration faces significant challenges, though, according to Darsith Jayachandran, graduate research assistant in engineering science and mechanics and co-corresponding author of the study, since conventional silicon components would melt under the processing temperatures.

    “One challenge is the process temperature ceiling of 450 degrees Celsius (C) for back-end integration for silicon-based chips; our monolithic 3D integration approach drops that temperature significantly to less than 200 C,” Jayachandran said, explaining that the process temperature ceiling is the maximum temperature allowed before damaging the prefabricated structures. “Incompatible process temperature budgets make monolithic 3D integration challenging with silicon chips, but 2D materials can withstand temperatures needed for the process.”
    The researchers used existing techniques for their approach, but they are the first to successfully achieve monolithic 3D integration at this scale using 2D transistors made with 2D semiconductors called transition metal dichalcogenides.
    The ability to vertically stack the devices in 3D integration also enabled more energy-efficient computing because it solved a surprising problem for such tiny things as transistors on a computer chip: distance.
    “By stacking devices vertically on top of each other, you’re decreasing the distance between devices, and therefore, you’re decreasing the lag and also the power consumption,” said Rahul Pendurthi, graduate research assistant in engineering science and mechanics and co-corresponding author of the study.
    By decreasing the distance between devices, the researchers achieved “More Moore.” By incorporating transistors made with 2D materials, the researchers met the “More than Moore” criterion as well. The 2D materials are known for their unique electronic and optical properties, including sensitivity to light, which makes these materials ideal as sensors. This is useful, the researchers said, as the number of connected devices and edge devices (things like smartphones or wireless home weather stations that gather data on the ‘edge’ of a network) continues to increase.
    “‘More Than Moore’ refers to a concept in the tech world where we are not just making computer chips smaller and faster, but also with more functionalities,” said Muhtasim Ul Karim Sadaf, graduate research assistant in engineering science and mechanics and co-author of the study. “It is about adding new and useful features to our electronic devices, like better sensors, improved battery management or other special functions, to make our gadgets smarter and more versatile.”
    Using 2D devices for 3D integration has several other advantages, the researchers said. One is superior carrier mobility, which refers to how readily electrical charge moves through a semiconductor material. Another is that 2D materials are ultra-thin, which lets the researchers fit more transistors on each tier of the 3D integration and so deliver more computing power.

    While most academic research involves small-scale prototypes, this study demonstrated 3D integration at a massive scale, characterizing tens of thousands of devices. According to Das, this achievement bridges the gap between academia and industry and could lead to future partnerships where industry leverages Penn State’s 2D materials expertise and facilities. The advance in scaling was enabled by the availability of high-quality, wafer-scale transition metal dichalcogenides developed by researchers at Penn State’s Two-Dimensional Crystal Consortium (2DCC-MIP), a U.S. National Science Foundation (NSF) Materials Innovation Platform and national user facility.
    “This breakthrough demonstrates yet again the essential role of materials research as the foundation of the semiconductor industry and U.S. competitiveness,” said Charles Ying, program director for NSF’s Materials Innovation Platforms. “Years of effort by Penn State’s Two-Dimensional Crystal Consortium to improve the quality and size of 2D materials have made it possible to achieve 3D integration of semiconductors at a size that can be transformative for electronics.”
    According to Das, this technological advancement is only the first step.
    “Our ability to demonstrate, at wafer scale, a huge number of devices shows that we have been able to translate this research to a scale which can be appreciated by the semiconductor industry,” Das said. “We have put 30,000 transistors in each tier, which may be a record number. This puts Penn State in a very unique position to lead some of the work and partner with the U.S. semiconductor industry in advancing this research.”
    Along with Das, Jayachandran, Pendurthi, Sadaf and Sakib, other authors include Andrew Pannone, doctoral student in engineering science and mechanics; Chen Chen, assistant research professor in 2DCC-MIP; Ying Han, postdoctoral researcher in mechanical engineering; Nicholas Trainor, doctoral student in materials science and engineering; Shalini Kumari, postdoctoral scholar; Thomas McKnight, doctoral student in materials science and engineering; Joan Redwing, director of the 2DCC-MIP and distinguished professor of materials science and engineering and of electrical engineering; and Yang Yang, assistant professor of engineering science and mechanics.
    The U.S. National Science Foundation and Army Research Office supported this research.

  • in

    Researchers use spinning metasurfaces to craft compact thermal imaging system

    Researchers have developed a new technology that uses meta-optical devices to perform thermal imaging. The approach provides richer information about imaged objects, which could broaden the use of thermal imaging in fields such as autonomous navigation, security, thermography, medical imaging and remote sensing.
    “Our method overcomes the challenges of traditional spectral thermal imagers, which are often bulky and delicate due to their reliance on large filter wheels or interferometers,” said research team leader Zubin Jacob from Purdue University. “We combined meta-optical devices and cutting-edge computational imaging algorithms to create a system that is both compact and robust while also having a large field of view.”
    In Optica, Optica Publishing Group’s journal for high-impact research, the authors describe their new spectro-polarimetric decomposition system, which uses a stack of spinning metasurfaces to break down thermal light into its spectral and polarimetric components. This allows the imaging system to capture the spectral and polarization details of thermal radiation in addition to the intensity information that is acquired with traditional thermal imaging.
    The researchers showed that the new system can be used with a commercial thermal camera to successfully classify various materials, a task that is typically challenging for conventional thermal cameras. The method’s ability to distinguish temperature variations and identify materials based on spectro-polarimetric signatures could help boost safety and efficiency for a variety of applications, including autonomous navigation.
    “Traditional autonomous navigation approaches rely heavily on RGB cameras, which struggle in challenging conditions like low light or bad weather,” said the paper’s first author Xueji Wang, a postdoctoral researcher at Purdue University. “When integrated with heat-assisted detection and ranging technology, our spectro-polarimetric thermal camera can provide vital information in these difficult scenarios, offering clearer images than RGB or conventional thermal cameras. Once we achieve real-time video capture, the technology could significantly enhance scene perception and overall safety.”
    Doing more with a smaller imager
    Spectro-polarimetric imaging in the long-wave infrared is crucial for applications such as night vision, machine vision, trace gas sensing and thermography. However, today’s spectro-polarimetric long-wave infrared imagers are bulky and limited in spectral resolution and field of view.

    To overcome these limitations the researchers turned to large-area metasurfaces — ultra-thin structured surfaces that can manipulate light in complex ways. After engineering spinning dispersive metasurfaces with tailored infrared responses, they developed a fabrication process that allowed these metasurfaces to be used to create large-area (2.5-cm diameter) spinning devices suitable for imaging applications. The resulting spinning stack measures less than 10 x 10 x 10 cm and can be used with a traditional infrared camera.
    “Integrating these large-area meta-optical devices with computational imaging algorithms facilitated the efficient reconstruction of the thermal radiation spectrum,” said Wang. “This enabled a more compact, robust and effective spectro-polarimetric thermal imaging system than was previously achievable.”
    Classifying materials with thermal imaging
    To evaluate their new system, the researchers spelled out “Purdue” using various materials and microstructures, each with unique spectro-polarimetric properties. Using the spectro-polarimetric information acquired with the system, they accurately distinguished the different materials and objects. They also demonstrated a three-fold increase in material classification accuracy compared to traditional thermal imaging methods, highlighting the system’s effectiveness and versatility.
    The researchers say that the new method could be especially useful for applications that require detailed thermal imaging. “In security, for example, it could revolutionize airport systems by detecting concealed items or substances on people,” said Wang. “Moreover, its compact and robust design enhances its suitability for diverse environmental conditions, making it particularly beneficial for applications such as autonomous navigation.”
    In addition to working to achieve video capture with the system, the researchers are trying to enhance the technique’s spectral resolution, transmission efficiency and speed of image capture and processing. They also plan to improve the metasurface design to enable more complex light manipulation for higher spectral resolution. Additionally, they want to extend the method to room-temperature imaging since the use of metasurface stacks restricted the method to high-temperature objects. They plan to do this using improved materials, metasurface designs and techniques like anti-reflection coatings.

  • in

    AI discovers that not every fingerprint is unique

    From “Law and Order” to “CSI,” not to mention real life, investigators have used fingerprints as the gold standard for linking criminals to a crime. But if a perpetrator leaves prints from different fingers in two different crime scenes, these scenes are very difficult to link, and the trace can go cold.
    It’s a well-accepted fact in the forensics community that fingerprints of different fingers of the same person — “intra-person fingerprints” — are unique, and therefore unmatchable.
    Research led by Columbia Engineering undergraduate
    A team led by Columbia Engineering undergraduate senior Gabe Guo challenged this widely held presumption. Guo, who had no prior knowledge of forensics, found a public U.S. government database of some 60,000 fingerprints and fed them in pairs into an artificial intelligence-based system known as a deep contrastive network. Sometimes the pairs belonged to the same person (but different fingers), and sometimes they belonged to different people.
    AI has potential to greatly improve forensic accuracy
    Over time, the AI system, which the team designed by modifying a state-of-the-art framework, got better at telling when seemingly unique fingerprints belonged to the same person and when they didn’t. The accuracy for a single pair reached 77%. When multiple pairs were presented, the accuracy shot significantly higher, potentially increasing current forensic efficiency by more than tenfold. The project, a collaboration between Hod Lipson’s Creative Machines lab at Columbia Engineering and Wenyao Xu’s Embedded Sensors and Computing lab at University at Buffalo, SUNY, was published today in Science Advances.
    Study findings challenge, and surprise, the forensics community
    Once the team verified their results, they quickly sent the findings to a well-established forensics journal, only to receive a rejection a few months later. The anonymous expert reviewer and editor concluded that “It is well known that every fingerprint is unique,” and therefore it would not be possible to detect similarities even if the fingerprints came from the same person.

    The team did not give up. They doubled down on the lead, fed their AI system even more data, and the system kept improving. Aware of the forensics community’s skepticism, the team opted to submit their manuscript to a more general audience. The paper was rejected again, but Lipson, who is the James and Sally Scapa Professor of Innovation in the Department of Mechanical Engineering and co-director of the Makerspace Facility, appealed. “I don’t normally argue editorial decisions, but this finding was too important to ignore,” he said. “If this information tips the balance, then I imagine that cold cases could be revived, and even that innocent people could be acquitted.”
    While the system’s accuracy is not sufficient to officially decide a case, it can help prioritize leads in ambiguous situations. After more back and forth, the paper was finally accepted for publication by Science Advances.
    Unveiled: a new kind of forensic marker to precisely capture fingerprints
    One of the sticking points was the following question: What alternative information was the AI actually using that has evaded decades of forensic analysis? After careful visualizations of the AI system’s decision process, the team concluded that the AI was using a new kind of forensic marker.
    “The AI was not using ‘minutiae,’ which are the branchings and endpoints in fingerprint ridges — the patterns used in traditional fingerprint comparison,” said Guo, who began the study as a first-year student at Columbia Engineering in 2021. “Instead, it was using something else, related to the angles and curvatures of the swirls and loops in the center of the fingerprint.”
    Columbia Engineering senior Aniv Ray and PhD student Judah Goldfeder, who helped analyze the data, noted that their results are just the beginning. “Just imagine how well this will perform once it’s trained on millions, instead of thousands of fingerprints,” said Ray.

    VIDEO: https://youtu.be/s5esfRbBc18
    A need for broader datasets
    The team is aware of potential biases in the data. The authors present evidence that indicates that the AI performs similarly across genders and races, where samples were available. However, they note, more careful validation needs to be done using datasets with broader coverage if this technique is to be used in practice.
    Transformative potential of AI in a well-established field
    This discovery is an example of more surprising things to come from AI, notes Lipson. “Many people think that AI cannot really make new discoveries, that it just regurgitates knowledge,” he said. “But this research is an example of how even a fairly simple AI, given a fairly plain dataset that the research community has had lying around for years, can provide insights that have eluded experts for decades.”
    He added, “Even more exciting is the fact that an undergraduate student, with no background in forensics whatsoever, can use AI to successfully challenge a widely held belief of an entire field. We are about to experience an explosion of AI-led scientific discovery by non-experts, and the expert community, including academia, needs to get ready.”

  • in

    Researchers developing AI to make the internet more accessible

    In an effort to make the internet more accessible for people with disabilities, researchers at The Ohio State University have begun developing an artificial intelligence agent that could complete complex tasks on any website using simple language commands.
    In the three decades since it was first released into the public domain, the world wide web has become an incredibly intricate, dynamic system. Yet because internet function is now so integral to society’s well-being, its complexity also makes it considerably harder to navigate.
    Today there are billions of websites available to help access information or communicate with others, and many tasks on the internet can take more than a dozen steps to complete. That’s why Yu Su, co-author of the study and an assistant professor of computer science and engineering at Ohio State, said their work, which uses information taken from live sites to create web agents — online AI helpers — is a step toward making the digital world a less confusing place.
    “For some people, especially those with disabilities, it’s not easy for them to browse the internet,” said Su. “We rely more and more on the computing world in our daily life and work, but there are increasingly a lot of barriers to that access, which, to some degree, widens the disparity.”
    The study was presented in December at the Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), a flagship conference for AI and machine learning research.
    By taking advantage of the power of large language models, the agent works similarly to how humans behave when browsing the web, said Su. The Ohio State team showed that their model was able to understand the layout and functionality of different websites using only its ability to process and predict language.
    Researchers started the process by creating Mind2Web, the first dataset for generalist web agents. Though previous efforts to build web agents focused on toy simulated websites, Mind2Web fully embraces the complex and dynamic nature of real-world websites and emphasizes an agent’s ability to generalize to entirely new websites it has never seen before. Su said that much of their success is due to their agent’s ability to handle the internet’s ever-evolving learning curve. The team collected over 2,000 open-ended tasks from 137 different real-world websites, which they then used to train the agent.

    Some of the tasks included booking one-way and round-trip international flights, following celebrity accounts on Twitter, browsing comedy films from 1992 to 2017 streaming on Netflix, and even scheduling car knowledge tests at the DMV. Many of the tasks were very complex — for example, booking one of the international flights used in the model would take 14 actions. Such effortless versatility allows for diverse coverage on a number of websites, and opens up a new landscape for future models to explore and learn in an autonomous fashion, said Su.
    “It’s only become possible to do something like this because of the recent development of large language models like ChatGPT,” said Su. Since the chatbot became public in November 2022, millions of users have used it to automatically generate content, from poetry and jokes to cooking advice and medical diagnoses.
    Still, because one website could contain thousands of raw HTML elements, it would be too costly to feed so much information to a single large language model. To address this gap, the study also introduces a framework called MindAct, a two-pronged agent that uses both small and large language models to carry out these tasks. The team found that by using this strategy, MindAct significantly outperforms other common modeling strategies and is able to understand various concepts at a decent level.
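    The article does not spell out how MindAct divides the work between its small and large models. One plausible reading, sketched below under stated assumptions, is a two-stage filter: a cheap scorer ranks every element on the page, and only a short list of candidates is passed to the expensive model. The function names, the word-overlap scorer (a toy stand-in for a small language model) and the `large_model` callable are all hypothetical and are not taken from the study.

```python
import re
from typing import Callable, List, Sequence


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens; HTML punctuation is discarded."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(task: str, element: str) -> float:
    """Toy stand-in for a small ranking model (assumption, not from the study):
    the fraction of task words that also appear in the element."""
    task_words = tokens(task)
    if not task_words:
        return 0.0
    return len(task_words & tokens(element)) / len(task_words)


def shortlist(task: str, elements: Sequence[str], k: int,
              score: Callable[[str, str], float] = overlap_score) -> List[str]:
    """Stage 1: score every element cheaply and keep only the k best."""
    ranked = sorted(elements, key=lambda e: score(task, e), reverse=True)
    return ranked[:k]


def choose_element(task: str, elements: Sequence[str], k: int,
                   large_model: Callable[[str, List[str]], str]) -> str:
    """Stage 2: hand only the shortlist to the expensive model.
    `large_model` is a hypothetical callable that returns one candidate."""
    return large_model(task, shortlist(task, elements, k))


# Hypothetical page fragment and task, for illustration only.
page_elements = [
    '<a href="/help">Help center</a>',
    '<button id="search">Search flights</button>',
    '<input name="from" placeholder="From airport">',
    '<footer>Copyright</footer>',
]
task = "search flights from Columbus"

# A placeholder "large model" that simply takes the top-ranked candidate.
picked = choose_element(task, page_elements, k=2,
                        large_model=lambda t, candidates: candidates[0])
print(picked)
```

    The point of the design is cost: the expensive model sees `k` candidates rather than the thousands of raw HTML elements a real page may contain.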
    With more fine-tuning, the study points out, the model could likely be used in tandem with both open- and closed-source large language models such as Flan-T5 or GPT-4. However, their work does highlight an increasingly relevant ethical problem in creating flexible artificial intelligence, said Su. While it could certainly serve as a helpful agent to humans surfing the web, the model could also be used to enhance systems like ChatGPT and turn the entire internet into an unprecedentedly powerful tool, he added.
    “On the one hand, we have great potential to improve our efficiency and to allow us to focus on the most creative part of our work,” he said. “But on the other hand, there’s tremendous potential for harm.” For instance, autonomous agents able to translate online steps into the real world could influence society by taking potentially dangerous actions, such as misusing financial information or spreading misinformation.
    “We should be extremely cautious about these factors and make a concerted effort to try to mitigate them,” said Su. But as AI research continues to evolve, he notes that it’s likely society will experience major growth in the commercial use and performance of generalist web agents in the years to come, especially as the technology has already gained so much popularity in the public eye.
    “Throughout my career, my goal has always been trying to bridge the gap between human users and the computing world,” said Su. “That said, the real value of this tool is that it will really save people time and make the impossible possible.”
    The research was supported by the National Science Foundation, the U.S. Army Research Lab and the Ohio Supercomputer Center. Other co-authors were Xiang Deng, Yu Gu, Boyuan Zheng, Shijie Chen, Samuel Stevens, Boshi Wang and Huan Sun, all of Ohio State.