
How fake security reports are swamping open-source projects, thanks to AI

Feb. 11, 2025 Hi-network.com
AI is being used to fake security reports, patches, and feature requests aimed at open-source projects

You'd think artificial intelligence (AI) would be a boon for developers. After all, a recent Google survey found that 75% of programmers rely on AI. On the other hand, almost 40% report having "little or no trust" in AI. Open-source project maintainers -- the people who keep these projects running -- can certainly understand that distrust.

Many AI LLMs cannot deliver usable code

First, many AI large language models (LLMs) cannot deliver usable code for even simple projects. Far more troubling, however, is that open-source maintainers are finding that hackers are weaponizing AI to undermine open-source projects' foundations.

Also: Dumping open source for proprietary rarely pays off: Better to stick a fork in it

As Greg Kroah-Hartman, the Linux stable kernel maintainer, observed in early 2024, Common Vulnerabilities and Exposures (CVEs), the master list of security holes, are "abused by security developers looking to pad their resumes." They submit many "stupid things." With AI scanning tools, numerous CVEs are now being granted for bugs that don't exist. These security holes are rated for severity using the Common Vulnerability Scoring System (CVSS).
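For readers unfamiliar with how that scoring works, here is a minimal Python sketch that maps a CVSS v3.1 base score to its qualitative severity band. The bands are the standard CVSS v3.1 ratings; the cvss_severity function name and the example scores are illustrative, not taken from any project's tooling.

def cvss_severity(base_score: float) -> str:
    """Return the standard CVSS v3.1 qualitative severity for a 0.0-10.0 base score."""
    if not 0.0 <= base_score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if base_score == 0.0:
        return "None"      # no impact
    if base_score <= 3.9:
        return "Low"
    if base_score <= 6.9:
        return "Medium"
    if base_score <= 8.9:
        return "High"
    return "Critical"

print(cvss_severity(9.8))  # "Critical": the kind of rating maintainers must drop everything for
print(cvss_severity(2.2))  # "Low"

The trouble is that the number alone says nothing about whether the bug behind it is real, which is exactly why a bogus report that lands a high score can eat so much of a maintainer's time.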

Worse still, as Dan Lorenc, CEO of security company Chainguard, observed, the National Vulnerability Database (NVD), which oversees CVEs, has been underfinanced and overwhelmed, so we can "expect a massive backlog of entries and false negatives."

Wasting valuable time on fake security issues

With government staffing cuts expected at the NVD's parent organization, the flood of bogus AI-generated security reports making it into the CVE lists will only increase. This, in turn, means programmers, maintainers, and users will all have to waste valuable time on fake security issues.

Some open-source projects, such as Curl, have given up on CVEs entirely. As Daniel Stenberg, the leader of the Curl project, said, "CVSS is dead to us."

Also: Why Mark Zuckerberg wants to redefine open source so badly

He's far from the only one to see this problem.

Seth Larson, Python Software Foundation security developer-in-residence, wrote: "Recently, I've noticed an uptick in extremely low-quality, spammy, and LLM-hallucinated security reports to open-source projects. The issue is that in the age of LLMs, these reports appear at first glance to be potentially legitimate and thus require time to refute." Larson believes these slop reports "should be treated as if they are malicious."

Patches introducing new vulnerabilities or backdoors

Why? Because these patches, while appearing legitimate at first glance, often contain code that is entirely wrong and nonfunctional. In the worst case, these patches will, the Open Source Security Foundation (OpenSSF) predicts, introduce new vulnerabilities or backdoors.

Alongside fake patches and security reports, AI is being employed to generate a deluge of feature requests across various open-source repositories. These requests, while sometimes seeming innovative or helpful, are often impractical, unnecessary, or simply impossible to implement. The sheer volume of these AI-generated requests overwhelms maintainers, making it hard to distinguish genuine user needs from artificial noise.

Also: We have an official open-source AI definition now, but the fight is far from over

Jarek Potiuk, a maintainer of Apache Airflow, an open-source workflow management platform, reported that the Outlier AI company had encouraged its members to post issues to the project "that make no sense and are either copies of other issues or completely useless and make no sense. This takes valuable time of maintainers who have to evaluate and close the issues. My investigation tracked to you as the source of problems -- where your instructional videos are tricking people into creating those issues to -- apparently train your AI."

These AI-driven issues have also been reported in Curl and React. To quote Potiuk: "This is wrong on so many levels. Please STOP. You are giving the community a disservice."

Fake contributions

The mechanics of deception behind these fake contributions are becoming increasingly sophisticated. AI models can now produce code snippets that, while nonfunctional, appear syntactically correct and contextually relevant. In addition, AI generates detailed explanations that mimic the language and style of a genuine contributor. Adding insult to injury, according to OpenSSF, some attackers use AI to create fake online identities, complete with GitHub histories containing thousands of minor but seemingly legitimate contributions.
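OpenSSF's point about manufactured contributor histories suggests one practical counter-check: look at the account itself before weighing its contribution. The sketch below uses the public GitHub REST API's GET /users/{username} endpoint to pull basic account metadata; the 30-day threshold and the looks_suspicious helper are assumptions for illustration, not a rule any project is known to apply.

import json
import urllib.request
from datetime import datetime, timezone

def fetch_user(username: str) -> dict:
    # Public, unauthenticated GitHub endpoint; rate limits apply.
    url = f"https://api.github.com/users/{username}"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def account_age_days(user: dict) -> int:
    # created_at is ISO 8601 with a trailing "Z" (UTC).
    created = datetime.fromisoformat(user["created_at"].replace("Z", "+00:00"))
    return (datetime.now(timezone.utc) - created).days

def looks_suspicious(user: dict) -> bool:
    # Assumed heuristic: a brand-new account with no visible following.
    return account_age_days(user) < 30 and user.get("followers", 0) == 0

user = fetch_user("octocat")
print(user["login"], account_age_days(user), "days old; flagged:", looks_suspicious(user))

A check like this only raises a flag for a human to review; as the OpenSSF warning implies, a patient attacker can age accounts and pad them with small commits, so no single signal is decisive.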

The consequences of this AI-driven open-source code spam campaign are far-reaching. Besides maintainers wasting time sifting through and debunking fake contributions, this influx of AI-generated spam undermines the trust that forms the bedrock of open-source collaboration.

Stricter guidelines and verification processes

The open-source community is not standing idly by in the face of this threat. Projects are implementing stricter contribution guidelines and verification processes to weed out AI-generated content. In addition, maintainers share experiences and best practices for identifying and dealing with AI-generated code spam.
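What those verification processes look like varies by project, but much of it comes down to simple triage scripting. Here is a minimal sketch, using only Python's standard library, of one such filter: flagging incoming issues whose text is a near-copy of another, the "copies of other issues" pattern Potiuk described. The example issue texts, the 0.8 threshold, and the flag_duplicates helper are assumptions for illustration, not any project's actual process.

from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough 0-to-1 similarity between two issue bodies."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def flag_duplicates(issues: list[str], threshold: float = 0.8) -> list[tuple[int, int]]:
    """Return index pairs of issues whose bodies look like near-copies."""
    pairs = []
    for i in range(len(issues)):
        for j in range(i + 1, len(issues)):
            if similarity(issues[i], issues[j]) >= threshold:
                pairs.append((i, j))
    return pairs

incoming = [
    "Please add an option to export workflow runs as CSV files.",
    "Please add an option to export the workflow runs as a CSV file.",
    "Scheduler crashes when a DAG file contains a syntax error.",
]
print(flag_duplicates(incoming))  # [(0, 1)]: the two near-identical requests get flagged

Heuristics like this are cheap and easy to tune, which is partly why maintainers trade them among themselves rather than relying on any one tool.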

Also: Red Hat's take on open-source AI: Pragmatism over utopian dreams

As the battle against AI-generated deception in open-source projects continues, the community faces a critical challenge: preserving the collaborative spirit of open-source development while defending against increasingly sophisticated and automated attempts at manipulation.

As open-source programmer Navendu Pottekkat wrote: "Please don't turn this into a 'let's spam open-source projects' fest." Please, please don't. If you value open source, don't play AI games with it.

