How to make skin invisible—major breakthrough

Researchers have developed a new way to see organs within a body by rendering overlying tissues transparent to visible light.

The counterintuitive process—a topical application of food-safe dye—was reversible in tests with animal subjects, and may ultimately apply to a wide range of medical diagnostics, from locating injuries to monitoring digestive disorders and identifying cancers.

Stanford University researchers published the findings in the Sept. 6, 2024, issue of Science.

Many uses

“Looking forward, this technology could make veins more visible for the drawing of blood, make laser-based tattoo removal more straightforward, or assist in the early detection and treatment of cancers,” said Stanford University assistant professor of materials science and engineering Guosong Hong, a U.S. National Science Foundation CAREER grantee who helped lead this work, in a statement.

“For example, certain therapies use lasers to eliminate cancerous and precancerous cells, but are limited to areas near the skin’s surface. This technique may be able to improve that light penetration.”

How it works

To master the new technique, the researchers developed a way to predict how light interacts with dyed biological tissues. This involves both light scattering and refraction, where light changes speed and bends as it travels from one material into another.

Light scattering is the reason we cannot see through our bodies: fats, fluids within cells, proteins, and other materials each have a different refractive index, a property that dictates how much an incoming light wave will bend.

In most tissues, those materials are packed closely together, so their varied refractive indices cause light to scatter as it passes through. Our eyes interpret that scattered light as opaque, colored biological material.

The researchers realized that if they wanted to make biological material transparent, they had to find a way to match the different refractive indices so light could travel through unimpeded.
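
To get a feel for why index matching matters, here is a rough, back-of-the-envelope sketch (not from the paper) using the textbook Fresnel formula for reflection at normal incidence; the tissue values are illustrative only:

```python
# Minimal sketch (not from the study): reflection at a flat boundary between
# two media at normal incidence, using the textbook Fresnel formula
# R = ((n1 - n2) / (n1 + n2))**2. When the refractive indices match, reflection
# and scattering at that boundary vanish and light passes through unimpeded.

def reflectance(n1: float, n2: float) -> float:
    """Fraction of light reflected at a flat interface at normal incidence."""
    return ((n1 - n2) / (n1 + n2)) ** 2

# Illustrative values only: a water-like fluid (~1.33) next to lipid-rich tissue (~1.46).
print(reflectance(1.33, 1.46))  # ~0.002 per boundary; countless boundaries add up to strong scattering
print(reflectance(1.46, 1.46))  # 0.0 -- matched indices, nothing scatters at this boundary
```

Real tissue is vastly more complicated than a single flat boundary, but the principle is the same: matched indices mean less scattering.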

They also realized that dyes, the most effective absorbers of light, can be equally effective at directing light uniformly: dissolving a strong absorber in a fluid raises that fluid’s refractive index, making it possible to match the indices of the surrounding materials.

They found that molecules of tartrazine (a food dye more commonly known as FD&C Yellow 5), when dissolved in water and absorbed into tissues, are perfectly structured to match refractive indices and prevent light from scattering, resulting in transparency.

The researchers gently rubbed a temporary tartrazine solution onto mice. On the scalp, the solution rendered the skin transparent, revealing blood vessels crisscrossing the brain; on the abdomen, the skin faded to transparency within minutes, revealing contractions of the intestine and movements caused by heartbeats and breathing.

The technique resolved features at the scale of microns (millionth of a meter) and enhanced microscope observations. The tartrazine did not appear to have long-term effects, and any excess was excreted in waste within 48 hours.

The researchers suspect that injecting the dye should lead to even deeper views within organisms, with implications for both biology and medicine.

Old formulas yield new window into medicine

Supported by a range of federal and private grants, the project began as an investigation into how microwave radiation interacts with biological tissues.

In exploring optics textbooks from the 1970s and 1980s, the researchers found two key concepts: mathematical equations called Kramers-Kronig relations and a phenomenon called Lorentz oscillation, where electrons and atoms resonate within molecules as photons pass through.

Well studied for more than a century, yet not applied to medicine in this way, the tools proved ideal for predicting how a given dye can raise the refractive index of biological fluids to perfectly match surrounding fats and proteins.
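
For readers who want the underlying math, this is the standard textbook form of the Kramers-Kronig relation for the refractive index (not a formula quoted from the paper): the index at one frequency is determined by the absorption at all others, which is why a dye that absorbs strongly in the blue can raise the index at red and near-infrared wavelengths where it is transparent.

```latex
% Standard Kramers-Kronig relation for the refractive index (textbook form,
% not reproduced from the paper). \mathcal{P} denotes the Cauchy principal
% value: the index n at frequency \omega is fixed by the absorption
% coefficient \alpha integrated over all frequencies \Omega.
\[
  n(\omega) \;=\; 1 + \frac{c}{\pi}\,
  \mathcal{P}\!\int_{0}^{\infty} \frac{\alpha(\Omega)}{\Omega^{2} - \omega^{2}}\, d\Omega
\]
```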

They realized that the same modifications that make materials transparent to microwaves could be tailored to impact the visible spectrum, with potential applications in medicine.

One instrument that proved critical was a decades-old ellipsometer, familiar in semiconductor manufacturing but not in biology. In a possible first for medicine, the researchers realized it was also perfectly suited to predicting the optical properties of their target dyes.

With methods grounded in fundamental physics, the researchers hope their approach will launch a new field of study: matching dyes to biological tissues based on optical properties, potentially leading to a wide range of medical applications.

This research was supported by the U.S. National Science Foundation, the U.S. National Institutes of Health, the U.S. Air Force Office of Scientific Research, the U.S. Army Long Term Health Education and Training program, and a range of private foundations and institutions.

Warning from the researchers: The technique described above has not been tested on humans. Dyes may be harmful. Always exercise caution with dyes and do not consume them directly, apply them to people or animals, or otherwise misuse them.

Citation: Ou, Z., Duh, S., Rommelfanger, N. J., Keck, C. H. C., Jiang, S., Brinson Jr., K., Zhao, S., Schmidt, E. L., Wu, X., Yang, F., Cai, B., Cui, H., Qi, W., Wu, S., Tantry, A., Roth, R., Ding, J., Chen, X., Kaltschmidt, J. A., . . . Hong, G. (2024). Achieving optical transparency in live animals with absorbing molecules. Science. https://doi.org/10.1126/science.adm6869 (open access)


How stressed are you right now?

Current cortisol detectors have poor stability, caused by fluctuating conditions such as changing pH and temperature at the silver layer, and by interference from other hormones such as progesterone, testosterone, and corticosterone.

A way-better cortisol detector

So researchers in China and the UK have now developed an improved detector that can accurately measure levels of cortisol, a key stress biomarker, in the blood. The solution: iridium oxide nanoparticles covering the detector’s conventional silver layer. This improves the stability, sensitivity and reproducibility of the device and allows cortisol detection at concentrations 3,000 times lower than the normal range.

The study, by researchers at Xi’an Jiaotong-Liverpool University in China and Abertay University in the United Kingdom, was published August 30, 2024, in the journal Talanta, published by Elsevier.

For another recent stress-detection approach, using machine learning with wearable sensors, see “A machine-learning approach for stress detection using wearable sensors in free-living environments” (cited below).

So what are your stress-detector (and fixer) methods? Please share below!

Citations:

Ji, T., Ye, W., Xiao, W., Dawson, G., Dong, Q., & Gwenin, C. (2024). Iridium oxide-modified reference screen-printed electrodes for point-of-care portable electrochemical cortisol detection. Talanta, 280, 126776. https://www.sciencedirect.com/science/article/abs/pii/S003991402401155X (open-access)

Abd Al-Alim, M., Mubarak, R., Salem, N. M., & Sadek, I. (2024). A machine-learning approach for stress detection using wearable sensors in free-living environments. Computers in Biology and Medicine, 179, 108918. https://doi.org/10.1016/j.compbiomed.2024.108918 (open-access)


Ultrasound device stimulates deep brain regions to treat chronic pain

University of Utah engineers have developed Diadem, a biomedical device that uses ultrasound to noninvasively stimulate deep brain regions, potentially disrupting the faulty signals that lead to chronic pain.

The researchers have published promising findings about an experimental therapy that has given many participants relief after a single treatment session. They are now recruiting participants for a final round of trials.


Neuromodulation-based therapy

Diadem’s approach is based on neuromodulation, a therapeutic technique that seeks to directly regulate the activity of specific brain circuits. Other neuromodulation approaches rely on electric currents and magnetic fields, but those methods cannot selectively reach the brain structure investigated in the recent trial, the anterior cingulate cortex, the researchers say.

The team is now preparing for a Phase 3 clinical trial, the final step before FDA approval of Diadem as a treatment for the general public. The researchers also see the therapy as a potential non-opioid option for chronic pain that could help address the opioid crisis.

Funding came from the National Institutes of Health and the University of Utah.

Citation: Riis, T., et al. Noninvasive targeted modulation of pain circuits with focused ultrasonic waves. Pain (2024). https://doi.org/10.1097/j.pain.0000000000003322 (open-access)



An entire brain-machine interface on a chip

EPFL researchers have developed a next-generation, miniaturized brain-machine interface (BMI) that is capable of direct brain-to-text communication on tiny silicon chips—no external computation required.

According to the researchers, the “MiBMI” (miniaturized BMI) chip enhances the efficiency and scalability of brain-machine interfaces and paves the way for practical, fully implantable devices. The MiBMI’s small area (8 square millimeters) and low power requirement make the system suitable for clinical and real-life applications.

Converting thoughts into readable text on a screen simply by thinking about writing

Brain-to-text conversion involves decoding neural signals generated when a person imagines writing letters or words. With current BMI systems, an external computer is required to process this data.

Instead, the researchers discovered that when a patient imagines “writing” characters by hand, it generates specific markers, or “distinctive neural codes” (DNCs), each in about 100 bytes. Using electrodes implanted in the brain, the MiBMI chipset processes these signals in real-time, translating the brain’s intended hand movements into corresponding digital text—no need to process thousands of bytes of data for each letter.

This makes the system fast and accurate, with low power consumption. It also allows for faster training times, making learning how to use the BMI easier and more accessible—especially for those with “locked-in” (unable to communicate) syndrome and other severe motor impairments.
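
As a rough illustration of the decoding idea, compact per-character codes can be matched against stored templates, as in the conceptual sketch below (this is an illustration of the general idea only, not the MiBMI’s actual on-chip algorithm, code length, or data format):

```python
# Conceptual sketch of decoding compact per-character "neural codes" by
# nearest-template matching. This illustrates the general idea only; it is
# NOT the MiBMI's actual on-chip algorithm, code length, or data format.
import numpy as np

rng = np.random.default_rng(1)
CHARS = list("abcdefghijklmnopqrstuvwxyz")   # placeholder alphabet (the chip decodes 31 classes)
CODE_LEN = 100                               # ~100-byte code per character, per the article

# Placeholder "distinctive neural codes": one stored template vector per character.
templates = {c: rng.normal(size=CODE_LEN) for c in CHARS}

def decode(observed: np.ndarray) -> str:
    """Return the character whose stored template is closest to the observation."""
    return min(CHARS, key=lambda c: float(np.linalg.norm(observed - templates[c])))

# Simulate a noisy observation of the code for 'h' and decode it.
noisy = templates["h"] + rng.normal(scale=0.3, size=CODE_LEN)
print(decode(noisy))                         # 'h' (with high probability at this noise level)
```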

Research collaborations

“While the chip has not yet been integrated into a working BMI, it has processed data from previous live recordings, such as those from the Shenoy lab at Stanford, converting [mental] handwriting activity into text with an impressive 91% accuracy,” said lead author Mohammed Ali Shaeri in a statement. The chip can currently decode up to 31 different characters, an achievement unmatched by any other integrated system, he notes. 

The researchers say this neurotechnological breakthrough is a feat of extreme miniaturization that combines expertise in integrated circuits, neural engineering, and artificial intelligence. They are collaborating with other research groups to test the system in different contexts, such as speech decoding and movement control. “Our goal is to develop a versatile BMI that can be tailored to various neurological disorders, providing a broader range of solutions for patients,” says senior author Mahsa Shoaran.

This research is published in the current issue of IEEE Journal of Solid-State Circuits.

Citation: M. A. Shaeri, U. Shin, A. Yadav, R. Caramellino, G. Rainer, and M. Shoaran, “A 2.46-mm² Miniaturized Brain-Machine Interface (MiBMI) Enabling 31-Class Brain-to-Text Decoding,” IEEE Journal of Solid-State Circuits (JSSC), 2024. https://doi.org/10.1109/JSSC.2024.3443254


Repurposed Alzheimer’s drug could someday help save your life

Researchers at the Harvard Wyss Institute for Biologically Inspired Engineering have discovered that donepezil (DNP), an FDA-approved drug, can put tadpoles into a protective, reversible, hibernation-like torpor state.

DNP is already being used in the clinic to treat Alzheimer’s, so it potentially could be rapidly repurposed for use in emergency situations to prevent irreversible organ injury while a person is being transported to a hospital, the researchers suggest.

Could save millions of lives every year

“Cooling a patient’s body down to slow its metabolic processes has long been used in medical settings to reduce injuries and long-term problems from severe conditions, but currently, it can only be done in a well-resourced hospital,” said co-author Michael Super, Ph.D., Director of Immuno-Materials at the Wyss Institute, in a statement.

“Achieving a similar state of biostasis [the ability of an organism to tolerate environmental changes without having to actively adapt to them, such as hibernation] with an easily administered drug like DNP could potentially save millions of lives every year.”

Lipid nanocarriers to reduce toxicity

The team used Xenopus laevis tadpoles to evaluate DNP’s effects on a whole living organism and found that it did successfully induce a torpor-like state that could be reversed when the drug was removed.

But the drug seemed to cause some toxicity, and accumulated in all of the animals’ tissues. To solve that problem, the researchers encapsulated DNP inside lipid nanocarriers, and found that this both reduced toxicity and caused the drug to accumulate in the animals’ brain tissue. This is a promising result, as the central nervous system is known to also mediate hibernation and torpor in other animals.

Although DNP has been shown to protect neurons from metabolic stress in models of Alzheimer’s disease, the team cautions that more work is needed to understand exactly how it leads to torpor, as well as to scale up production of the encapsulated DNP for use in larger animals and, potentially, humans.

“Donepezil has been used worldwide by patients for decades, so its properties and manufacturing methods are well-established. Lipid nanocarriers similar to the ones we used are also now approved for clinical use in other applications,” said senior author Donald Ingber, Founding Director of the Wyss Institute for Biologically Inspired Engineering, in a statement.

Buying patients critical time

“This study demonstrates that an encapsulated version of the drug could potentially be used in the future to buy patients critical time to survive devastating injuries and diseases, and it could be easily formulated and produced at scale on a much shorter time scale than a new drug,” Ingber said.

This research, published August 22 in ACS Nano, was supported as part of the DARPA Biostasis Program, which funds projects that aim to extend the time for lifesaving medical treatment (the “Golden Hour” following traumatic injury or acute infection). The Wyss Institute has been a participant in the DARPA Biostasis Program since 2018 and has achieved several important milestones over the last few years.

Citation: Plaza Oliver M et al. Donepezil Nanoemulsion Induces a Torpor-like State with Reduced Toxicity in Nonhibernating Xenopus laevis Tadpoles. ACS Nano. 2024 Aug 21. https://pubmed.ncbi.nlm.nih.gov/39167921/ (open-access)



Massive water production possible on the Moon

Chinese Academy of Sciences (CAS) researchers have developed a new method of massive water production on the Moon.

Previous lunar explorations have found the water content in lunar minerals to be extremely low, ranging from 0.0001% to 0.02%, the researchers note. Now Prof. Wang Junqiang’s team at the Ningbo Institute of Materials Technology and Engineering (NIMTE) of CAS has developed a method based on the reaction between lunar regolith and endogenous hydrogen, which is implanted in the regolith by the solar wind.

Concentrated heating of regolith releases water

“We used lunar regolith samples brought back by the Chang’E-5 mission in our study,” said Wang. The study revealed that when the lunar regolith is heated above 1,200 K with concave mirrors, one gram of molten lunar regolith can generate 51 to 76 mg of water.

“So one ton of lunar regolith could produce more than 50 kg of water, which is equal to about a hundred 500-ml bottles of drinking water. This would be enough drinking water for 50 people for one day.”
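
A quick back-of-the-envelope check of those figures, using only the numbers quoted above:

```python
# Back-of-the-envelope check of the figures quoted above (all numbers come
# from the article; only standard unit conversions are applied).
grams_per_tonne = 1_000_000

for yield_mg_per_g in (51, 76):
    water_kg = yield_mg_per_g * grams_per_tonne / 1_000_000   # mg/g -> kg per tonne
    print(water_kg)                                           # 51.0 and 76.0 kg of water per tonne

bottles = 50 / 0.5      # 50 kg of water is roughly 50 liters, in 500-ml bottles
print(bottles)          # 100.0 bottles, i.e. about 1 liter per person per day for 50 people
```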

Lunar ilmenite (FeTiO3) was found to contain the highest amount of solar wind-implanted hydrogen among the five primary minerals in the lunar regolith, owing to its unique lattice structure with sub-nanometer tunnels.

Other options: irrigating and breathing

The researchers suggest that this water could be used both for drinking and irrigating plants. Or it could be electrochemically decomposed into hydrogen and oxygen, with hydrogen used for energy and oxygen essential for breathing.

The researchers say these discoveries provide pioneering insights into water exploration on the Moon and shed light on the future construction of lunar research stations.

The results of the study were published in the Cell Press journal The Innovation.

Citation: Chen, X., et al. (2024, August 22). Massive Water Production from Lunar Ilmenite through Reaction with Endogenous Hydrogen. The Innovation. https://www.cell.com/the-innovation/fulltext/S2666-6758(24)00128-0 (open access)


World’s fastest electron microscope

Imagine a camera so powerful it can take freeze-frame photographs of a moving electron, an object traveling so fast it could circle the Earth many times in a single second.

Researchers at the University of Arizona, led by Mohammed Hassan, associate professor of physics and optical sciences, have now developed just that: the world’s fastest electron microscope. They believe this development will lead to groundbreaking advancements in physics, chemistry, bioengineering, materials sciences and more.

A transmission electron microscope (TEM) magnifies objects up to millions of times their actual size by directing beams of electrons through a sample, revealing details too small for a traditional light microscope to detect. The interaction between the electrons and the sample is captured by lenses and detected by a camera sensor to generate detailed images of the sample.

The new speed: “attomicroscopy”

To see an electron frozen in place, U of A researchers generated a single attosecond electron pulse, matching the timescale on which electrons move, thereby enhancing the microscope’s temporal resolution, like a high-speed camera capturing movements that would otherwise be invisible.

“The improvement of the temporal resolution inside of electron microscopes has been long anticipated and the focus of many research groups, because we all want to see the electron motion,” Hassan said. “These movements happen in attoseconds. But now, for the first time, we are able to attain attosecond temporal resolution with our electron transmission microscope, and we coined it ‘attomicroscopy.’”

“For the first time, we can now see pieces of the electron in motion.”

Citation: Hui, D., Alqattan, H., Sennary, M., Golubev, N. V., & Hassan, M. T. (2024). Attosecond electron microscopy and diffraction. Science Advances. https://www.science.org/doi/10.1126/sciadv.adp5805 (open access)


AI links heat-wave events to global warming

Researchers at Stanford University and Colorado State University have developed a rapid, low-cost machine-learning approach for studying how individual extreme weather events have been affected by global warming.

They found that if global temperatures reach 2.0 C above pre-industrial levels, events equivalent to some of the worst heat waves in Europe, Russia, and India over the past 45 years could happen multiple times per decade. According to the Copernicus Climate Change Service, global warming is currently approaching 1.3 C above pre-industrial levels.

Their method, detailed in an August 21 study in the journal Science Advances, uses machine learning to determine how much global warming has contributed to heat waves in the U.S. and elsewhere in recent years.

The approach proved highly accurate and could change how scientists study and predict the impact of climate change on a range of extreme weather events, according to the researchers. The results can also help to guide climate adaptation strategies and are relevant for lawsuits that seek to collect compensation for damages caused by climate change.

Training AI models to predict daily maximum temperatures

“We’ve seen the impacts that extreme weather events can have on human health, infrastructure, and ecosystems,” said study lead author Jared Trok, a PhD student in Earth system science at the Stanford Doerr School of Sustainability, in a statement. “To design effective solutions, we need to better understand the extent to which global warming drives changes in these extreme events.”

Trok and his co-authors trained AI models to predict daily maximum temperatures, based on the regional weather conditions and the global mean temperature, using data from a large database of climate model simulations extending from 1850 to 2100.

The researchers then used the actual weather conditions from specific real-world heat waves to predict how hot the heat waves would have been if the same weather conditions occurred, but at different levels of global warming. They then compared these predictions at different global warming levels to estimate how climate change influenced the frequency and severity of historical weather events.
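
In rough pseudocode terms, that counterfactual workflow looks like the sketch below (this is not the authors’ code; the regression model, feature layout, and data are placeholders standing in for climate-model simulations and real events):

```python
# Highly simplified sketch of the counterfactual idea described above. This is
# NOT the authors' code: the regressor, the feature layout, and the data are
# all placeholders standing in for climate-model simulations and real events.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic training data: first columns = regional weather predictors,
# last column = global mean temperature anomaly (degrees C).
X = rng.normal(size=(5000, 6))
y = 30 + 2.0 * X[:, -1] + X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=5000)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# "Observed" weather conditions during a real heat wave (placeholder values).
event = rng.normal(size=(1, 6))

# Replay the same weather at different global warming levels and compare.
for warming in (0.0, 1.3, 2.0):
    counterfactual = event.copy()
    counterfactual[0, -1] = warming
    print(warming, round(float(model.predict(counterfactual)[0]), 2))
```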

Case studies and beyond

The researchers first put their AI method to work analyzing the 2023 Texas heat wave, which contributed to a record number of heat-related deaths in the state that year. The team found that global warming made the historic heat wave 1.18 to 1.42 degrees Celsius (2.12 to 2.56 F) hotter than it would have been without climate change. The researchers also found that their new technique was able to accurately predict the magnitude of record-setting heat waves in other parts of the world. The results were consistent with previously published studies of those events.

Based on these findings, the researchers used AI to predict how severe heat waves could become if the same weather patterns that caused previous record-breaking heat waves instead occurred under higher levels of global warming. They found that events equal to some of the worst heat waves in Europe, Russia, and India over the past 45 years could happen multiple times per decade if global temperatures reach 2.0 C above pre-industrial levels.

Accurate, low-cost analyses of extreme events in more parts of the world

The new AI method addresses some limitations of existing approaches, including those previously developed at Stanford, by using actual historical weather data when predicting the effect of global warming on extreme events. It does not require expensive new climate model simulations because the AI can be trained using existing simulations.

Together, these innovations will enable accurate, low-cost analyses of extreme events in more parts of the world, which is crucial for developing effective climate adaptation strategies. The approach also opens up new possibilities for fast, real-time analysis of the contribution of global warming to extreme weather, according to the researchers. The study was funded by Stanford University and the U.S. Department of Energy.

Citation: Trok, J. T., Barnes, E. A., Davenport, F. V., & Diffenbaugh, N. S. (2024). Machine learning–based extreme event attribution. Science Advances. https://www.science.org/doi/10.1126/sciadv.adl3242 (open access)


‘Metacrimes’ in the metaverse

Australian researchers are investigating “metacrime”—attacks, crimes or inappropriate activities that occur within the metaverse, the virtual world where users of VR headsets can choose an avatar to represent themselves as they interact with other users’ avatars or move through other 3D digital spaces.

Griffith University’s Ausma Bernot, who has teamed up with researchers from Monash University, Charles Sturt University and the University of Technology Sydney, said in a statement: “Unfortunately, while those new environments are very exciting, they also have the potential to enable new crimes. We’ve seen reports of sexual harassment or assault against both adults and kids.”

The Australian eSafety Commissioner has estimated that around 680,000 adults in Australia are engaged in the metaverse. And the UK’s Center for Countering Digital Hate recorded 11 hours and 30 minutes of user interactions in the popular VRChat app (which hosts more than 25,000 community-created worlds, and climbing), using Meta’s Oculus headset.

Impacts on mental or emotional well-being

The researchers found that most users had faced at least one negative experience in that virtual environment. Those experiences included being called offensive names, receiving repeated unwanted messages or contact, being challenged about their cultural identity, being sent unwanted inappropriate content, or being provoked into responding to something or starting an argument.

Eleven percent of the participants had been exposed to a sexually graphic virtual space and nine percent had been touched (virtually) in a way they didn’t like. Of these respondents, 49 percent said the experience had a moderate to extreme impact on their mental or emotional well-being.

Monitor children’s activity

The two largest user groups are minors and men, so Bernot said it was “important for parents to monitor their children’s activity or consider limiting their access to multi-player games. Minors are more vulnerable to grooming and other abuse.

“They may not know how to deal with these situations, and while there are some features like a ‘safety bubble’ within some games (or just take the headset off), once immersed in these environments it does feel very real. It’s somewhere in between a physical attack and a social media harassment message—you’ll still feel that distress and it can take a significant toll on a user’s well-being. It is a real and palpable risk.”

Virtual rape

Monash University’s You Zhou said there had already been many reports of virtual rape, including one in the United Kingdom, where police opened a case involving a 16-year-old girl whose avatar was attacked, causing psychological and emotional trauma similar to that of an attack in the physical world.

“When immersed in this world of virtual reality, and particularly when using higher quality VR headsets, users will not necessarily stop to consider whether the experience is reality or virtuality,” Zhou said in a statement.

“While there may not be physical contact, victims (mostly young girls) strongly claim the feeling of victimization was real. Without physical signs on a body, and unless the interaction was recorded, it can be almost impossible to show evidence of these experiences.”

The metaverse is expected to grow exponentially in coming years, so the research team’s findings highlight a need for metaverse companies to instill clear regulatory frameworks for their virtual environments, making them safe for everyone to inhabit, the researchers advised.

Citation: Zhou, Y., Tiwari, M., Bernot, A. et al. Metacrime and Cybercrime: Exploring the Convergence and Divergence in Digital Criminality. Asian J Criminol (2024). https://doi.org/10.1007/s11417-024-09436-y (open access)



How to detect covert AI-generated text

Machine-generated text has been fooling humans since the release of GPT-2 in 2019. Large language model (LLM) tools have gotten progressively better at crafting stories, news articles, student essays and more. So humans are often unable to recognize when they are reading text produced by an algorithm.

Misuse and harmful outcomes

These LLMs are being used to save time and even boost creativity in ideating and writing, but their power can lead to misuse and harmful outcomes that are already showing up where we consume information.

One way both academics and companies are trying to improve detection is by employing machine-learning models that can identify subtle patterns of word choice and grammatical construction, recognizing LLM-generated text in ways that human intuition cannot.

Today, many commercial detectors are claiming to be highly successful at detecting machine-generated text, with up to 99% accuracy—but are these claims verified? Chris Callison-Burch, Professor in Computer and Information Science, and Liam Dugan, a doctoral student in Callison-Burch’s group, aimed to find out in their recent paper, published at the 62nd Annual Meeting of the Association for Computational Linguistics.

“As the technology to detect machine-generated text advances, so does the technology used to evade detectors,” said Callison-Burch in a statement. “It’s an arms race, and while the goal to develop robust detectors is one we should strive to achieve, there are many limitations and vulnerabilities in detectors that are available now.”   

Testing AI detector ability

To investigate those limitations and provide a path forward for developing robust detectors, the research team created Robust AI Detector (RAID), a data set of more than 10 million documents across recipes, news articles, blog posts and more, including AI-generated text and human-generated text.

RAID is the first standardized benchmark for testing detection ability in current and future detectors. The team also created a leaderboard, which publicly and impartially ranks the performance of all detectors that have been evaluated using RAID.

“The concept of a leaderboard has been key to success in many aspects of machine learning like computer vision,” said Dugan in a statement. “The RAID benchmark is the first leaderboard for robust detection of AI-generated text. We hope that our leaderboard will encourage transparency and high-quality research in this quickly evolving field.”

Dugan has already seen the influence this recent paper is having on companies that develop detectors. “Originality.ai is a prominent company that develops detectors for AI-generated text,” he says. “They shared our work in a blog post, ranked their detector on our leaderboard, and are using RAID to identify previously hidden vulnerabilities and improve their detection tool.”

How detectors fail

So do the current detectors hold up to the task? RAID shows that not many do as well as they claim.

“Detectors trained on ChatGPT were mostly useless in detecting machine-generated text outputs from other LLMs such as Llama and vice versa,” says Callison-Burch. “Detectors trained on news stories don’t hold up when reviewing machine-generated recipes or creative writing. What we found is that there are a myriad of detectors that only work well when applied to very specific use cases and when reviewing text similar to the text they were trained on.” 

Detectors are able to detect AI-generated text when it contains no edits or “disguises,” but when manipulated, current detectors are not reliably able to detect AI-generated text. Faulty detectors are not only an issue because they don’t work well; they can be as dangerous as the AI tool used to produce the text in the first place. 

“If universities or schools were relying on a narrowly trained detector to catch students’ use of ChatGPT to write assignments, they could be falsely accusing students of cheating when they are not,” says Callison-Burch. “They could also miss students who were cheating by using other LLMs to generate their homework.”   

Adversarial attacks and tricks

The team also looked into how adversarial attacks (such as replacing letters with look-alike symbols) can easily derail a detector and allow machine-generated text to fly under the radar.

“It turns out, there are a variety of edits a user can make to evade detection by the detectors we evaluated in this study,” says Dugan. “Something as simple as inserting extra spaces, swapping letters for symbols, or using alternative spelling or synonyms for a few words can cause a detector to be rendered useless.”
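
Here is a toy illustration of that kind of evasion (a sketch only; the “detector” below is a trivial stand-in written for this example, not any system evaluated in the study):

```python
# Toy illustration of the evasion edits described above. The "detector" is a
# trivial stand-in written for this example, not any system from the study.
import re

HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}  # Cyrillic look-alikes for a, e, o

def perturb(text: str) -> str:
    """Swap some Latin letters for visually identical Cyrillic ones and insert
    one extra space; the text looks unchanged to a human reader."""
    swapped = "".join(HOMOGLYPHS.get(ch, ch) for ch in text)
    return re.sub(" ", "  ", swapped, count=1)

def naive_detector(text: str) -> float:
    """Stand-in 'detector' that flags a few stock phrases often seen in LLM output."""
    cues = ["in conclusion", "it is important to note", "delve into"]
    return float(any(cue in text.lower() for cue in cues))

sample = "In conclusion, it is important to note the benefits of daily exercise."
print(naive_detector(sample))           # 1.0 -- flagged as machine-generated
print(naive_detector(perturb(sample)))  # 0.0 -- reads the same to a human, but the cues no longer match
```

Real detectors are far more sophisticated than this keyword stand-in, but the study found that analogous character- and spacing-level edits can degrade them in much the same way.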

While current detectors are not yet robust enough to be of significant use in society, openly evaluating detectors on large, diverse, shared resources is critical to accelerating progress and building trust in detection. That transparency, the study concludes, will lead to the development of detectors that hold up across a variety of use cases.

Reducing harms

“My work is focused on reducing the harms that LLMs can inadvertently cause, and, at the very least, making people aware of the harms so that they can be better informed when interacting with information,” Dugan continues. “In the realm of information distribution and consumption, it will become increasingly important to understand where and how text is generated, and this paper is just one way I am working towards bridging those gaps in both the scientific and public communities.”

This study was funded by the Intelligence Advanced Research Projects Activity (IARPA), part of the Office of the Director of National Intelligence, under the Human Interpretable Attribution of Text Using Underlying Structure (HIATUS) program.

Citation: Dugan, L., Hwang, A., Trhlik, F., Ludan, J. M., Zhu, A., Xu, H., & Ippolito, D. (2024). RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors. arXiv. https://arxiv.org/abs/2405.07940 (open access)

Let us know your thoughts! Sign up for a Mindplex account now, join our Telegram, or follow us on Twitter