Galactica large language model
Report incident ๐ฅ | Improve page ๐ | Access database ๐ข
Galatica is a large language model developed by Facebook that 'can store, combine and reason about scientific knowledge' in order to assist scientists 'summarize academic papers, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more'.ย
Released in November 2022, the system was trained on 106 billion tokens of open-access scientific text and data, including papers, textbooks, scientific websites, encyclopedias, reference material, and knowledge bases.
Operator: Meta/Facebook
Developer: Meta/Facebook
Country: Global
Sector: Technology
Purpose: Assist scientists
Technology: Large language model (LLM); NLP/text analysis; Neural network; Deep learning
Issue: Accuracy/reliability; Bias/discrimination - race, ethnicity, gender, religion; Mis/disinformation; Safety
Transparency: Black box; Marketing
Risks and harms ๐
Galactica has been criticised for generating authoritative-sounding but often subtly incorrect or biased information, reproducing problems of bias and toxicity seen in other language models, and potentially enabling the spread of misinformation - which is considered particularly dangerous given its authoritative tone.
Transparency and accountability ๐
Galactica exhibited several key transparency and accountability limitations:
Data sources. The model produced information without clearly indicating its sources or reliability, making it challenging for users to verify claims.
Explainability. Users, including subject matter experts, found it difficult to understand how Galactica arrived at its outputs or to check their accuracy.
Model access. Meta took down the public demo after just three days, limiting broader scrutiny and evaluation of the model's capabilities and flaws.
Usage guidelines. There was insufficient guidance on how to interpret or use Galactica's outputs responsibly, especially given its potential to generate misinformation.