Galactica large language model


Galactica is a large language model developed by Meta/Facebook that 'can store, combine and reason about scientific knowledge' in order to help scientists 'summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more'.

Released in November 2022, the system was trained on 106 billion tokens of open-access scientific text and data, including papers, textbooks, scientific websites, encyclopedias, reference material, and knowledge bases.

System

Documents

Operator: Meta/Facebook
Developer: Meta/Facebook
Country: USA; Global
Sector: Technology
Purpose: Assist scientists
Technology: Large language model (LLM); NLP/text analysis; Neural network; Deep learning
Issue: Accuracy/reliability; Bias/discrimination - race, ethnicity, gender, religion; Mis/disinformation; Safety
Transparency: Black box; Marketing

Risks and harms

Galactica has been criticised for generating authoritative-sounding but often subtly incorrect or biased information, and for reproducing the problems of bias and toxicity seen in other language models. Its confident tone makes such errors harder to spot, which raises the risk that it could enable the spread of misinformation.

Transparency and accountability

Acknowledging the actual and potential limitations of its system, Meta notes that 'there are no guarantees for truthful or reliable output from language models, even large ones on high-quality data like Galactica,' adding that the generated text might appear 'very authentic and highly confident,' but could still be wrong.

For MIT Technology Review's Will Douglas Heaven, suggesting that 'the human-like text such models generate will always contain trustworthy information, as Meta appeared to do in its promotion of Galactica, is reckless and irresponsible.'

The marketing of the system demonstrates 'the all-too-common tendency of AI researchers to exaggerate the abilities of the systems they build', according to AI commentator Alberto Romero.

Incidents and issues

Page info
Type: System
Published: November 2022
Last updated: May 2024