Senators Express Concerns Over Meta’s LLaMA Language Model Leak


Two U.S. senators have sent a letter to Meta CEO Mark Zuckerberg raising concerns about the recent leak of Meta's large language model, LLaMA. The senators, Richard Blumenthal (D-CT) and Josh Hawley (R-MO), expressed apprehension about the potential misuse of LLaMA for spam, fraud, malware, privacy violations, harassment, and other forms of wrongdoing and harm.

Senator Blumenthal, who chairs the Senate’s Subcommittee on Privacy, Technology, & the Law, along with Senator Hawley, the ranking member, requested information from Meta regarding the company’s assessment of the risks associated with releasing LLaMA. They also inquired about the measures taken to prevent the model’s abuse and how Meta plans to update its policies and practices in response to its unrestricted availability.

It is worth noting that this subcommittee had previously questioned Sam Altman, the CEO of OpenAI, as well as AI critic Gary Marcus and Christina Montgomery, IBM’s chief privacy and trust officer, during a Senate hearing on AI regulations and rules that took place on May 16.

Letter points to Meta’s LLaMA release in February

The letter scrutinizes Meta’s release of the LLaMA language model, highlighting Meta’s decision to make LLaMA available for download by approved researchers rather than centralizing and restricting access to the model and its underlying data and software.

According to the letter, LLaMA stands out from previous publicly available models due to its size and sophistication. However, shortly after the announcement, the complete model surfaced on BitTorrent, allowing unrestricted access to users worldwide without any form of monitoring or oversight. This widespread dissemination of LLaMA raises significant concerns regarding its potential for misuse and abuse.

The letter’s focus on the LLaMA leak appears to take aim at the open-source community, which has been engaged in a passionate debate over model access. That debate follows a series of recent releases of large language models (LLMs) and an ongoing effort by startups, collectives, and academics to challenge the trend toward closed, proprietary LLMs and to promote democratized access to such models.

Upon its release, LLaMA received immediate acclaim for its exceptional performance, surpassing models like GPT-3 despite having significantly fewer parameters. Several open-source models were developed in connection with LLaMA. For instance, Databricks introduced Dolly, a ChatGPT-like model inspired by Alpaca, an open-source LLM released by Stanford in mid-March. Notably, Alpaca utilized the weights from Meta’s LLaMA model. Vicuna, a fine-tuned version of LLaMA, is reported by its creators to approach ChatGPT-level quality in evaluations that used GPT-4 as a judge.

The ongoing discussion surrounding LLaMA’s leak, and the subsequent developments in the open-source community, continues to shape the landscape of AI models and access to them.

Senators criticize Meta’s use of the word ‘leak’

The Senators had harsh words for Zuckerberg regarding LLaMA’s distribution and the use of the word “leak.”

“The choice to distribute LLaMA in such an unrestrained and permissive manner raises important and complicated questions about when and how it is appropriate to openly release sophisticated AI models,” the letter says.

“Given the seemingly minimal protections built into LLaMA’s release, Meta should have known that LLaMA would be broadly disseminated, and must have anticipated the potential for abuse,” it continues. “While Meta has described the release as a leak, its chief AI scientist has stated that open models are key to its commercial success. Unfortunately, Meta appears to have failed to conduct any meaningful risk assessment in advance of release, despite the realistic potential for broad distribution, even if unauthorized.”

Meta known as a particularly ‘open’ Big Tech company

Meta is known as a particularly “open” Big Tech company (thanks to FAIR, the Fundamental AI Research team founded by Meta’s chief AI scientist Yann LeCun in 2013). It had made LLaMA’s model weights available to academics and researchers on a case-by-case basis — including Stanford for the Alpaca project — but those weights were subsequently leaked on 4chan. This allowed developers around the world to fully access a GPT-level LLM for the first time.

It’s important to note, however, that none of these open-source LLMs are available yet for commercial use, because the LLaMA model is not released for commercial use, and the OpenAI GPT-3.5 terms of use prohibit using the model to develop AI models that compete with OpenAI.

But those building models from the leaked model weights may not abide by those rules.

Meta VP of AI research cited need to ‘lean into transparency’

“The pivots in AI are huge, and we are asking society to come along for the ride,” Joelle Pineau, Meta’s VP of AI research, said in an April interview. “That’s why, more than ever, we need to invite people to see the technology more transparently and lean into transparency.”

However, Pineau doesn’t fully align herself with statements from OpenAI that cite safety concerns as a reason to keep models closed. “I think these are valid concerns, but the only way to have conversations in a way that really helps us progress is by affording some level of transparency,” she told VentureBeat. 

She pointed to Stanford’s Alpaca project as an example of “gated access” — where Meta made the LLaMA weights available for academic researchers, who fine-tuned the weights to create a model with slightly different characteristics.

“We welcome this kind of investment from the ecosystem to help with our progress,” she said. But while she did not comment to VentureBeat on the 4chan leak that led to the wave of other LLaMA models, she told the Verge in a press statement, “While the [LLaMA] model is not accessible to all … some have tried to circumvent the approval process.”

Pineau did emphasize that Meta received complaints on both sides of the debate regarding its decision to partially open LLaMA. “On the one hand, we have many people who are complaining it’s not nearly open enough, they wish we would have enabled commercial use for these models,” she said. “But the data we train on doesn’t allow commercial usage of this data. We are respecting the data.”

However, there are also concerns that Meta was too open and that these models are fundamentally dangerous. “If people are equally complaining on both sides, maybe we didn’t do too bad in terms of making it a reasonable model,” she said. “I will say this is something we always monitor and with each of our releases, we carefully look at the trade-offs in terms of benefits and potential harm.”