Unified Model Records

Llama 2

Type: family

Tags: opensource · Publisher: Meta · Released: 2023-07-19 · Version: 1.0.0

Metadata

General information.

name
Llama 2
version
1.0
publisher
Meta
model type
Large Language Model
release date
2023-07-19
description
Meta developed and released the Llama 2 family of large language models (LLMs), a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The fine-tuned LLMs, called Llama-2-Chat, are optimized for dialogue use cases. Llama-2-Chat models outperform open-source chat models on most benchmarks Meta tested, and in Meta's human evaluations for helpfulness and safety are on par with some popular closed-source models such as ChatGPT and PaLM.
architecture
Llama 2 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety.
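As an illustration of how the released checkpoints are commonly consumed (not part of this record), the sketch below loads a chat variant through the Hugging Face transformers API. The repository ID, dtype, and device settings are assumptions for the example, and access to the gated meta-llama weights must already be granted.

```python
# A minimal sketch of loading a Llama 2 chat model for text generation with
# the Hugging Face transformers library. Assumes `transformers` and `torch`
# are installed and Hub access to the gated meta-llama weights is granted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # 13B and 70B variants also exist

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps 7B weights near 13 GiB
    device_map="auto",          # place layers on available GPUs/CPU
)

inputs = tokenizer("Explain RLHF in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```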

Relations

No relations specified.


Intended Use

  • Llama 2 is intended for commercial and research use in English.
  • Tuned models are intended for assistant-like chat; a prompt-format sketch follows this list.
  • Pretrained models can be adapted for a variety of natural language generation tasks.
  • Developers may fine-tune Llama 2 models for languages beyond English provided they comply with the Llama 2 Community License and the Acceptable Use Policy.
  • Use in any manner that violates applicable laws or regulations is out-of-scope.
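
For the assistant-like chat use case, the Llama-2-Chat checkpoints expect a specific prompt markup. The sketch below follows the template used in Meta's reference implementation; the system and user strings are illustrative placeholders.

```python
# A minimal sketch of the Llama-2-Chat single-turn prompt format.
def build_chat_prompt(system: str, user: str) -> str:
    """Wrap a system instruction and one user turn in Llama 2 chat markup.

    The BOS token (<s>) is normally added by the tokenizer and is omitted here.
    """
    return (
        "[INST] <<SYS>>\n"
        f"{system}\n"
        "<</SYS>>\n\n"
        f"{user} [/INST]"
    )

prompt = build_chat_prompt(
    system="You are a helpful, respectful and honest assistant.",
    user="What is grouped-query attention?",
)
print(prompt)
```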

Factors

  • Range of parameter sizes (7B, 13B, and 70B), as well as pretrained and fine-tuned variations; a rough memory-footprint estimate follows this list.
  • Input: text only.
  • Output: text only.
  • Uses an optimized transformer architecture.
  • Models are trained with a global batch-size of 4M tokens.
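
To make the parameter sizes concrete, the back-of-the-envelope calculation below estimates weight memory per variant. It assumes 2 bytes per parameter (fp16/bf16) and counts weights only, not activations or KV cache; these figures are illustrative and do not come from the model card.

```python
# Rough weight-memory estimate for each Llama 2 size at half precision.
BYTES_PER_PARAM = 2  # fp16/bf16 (an assumption for this estimate)

for name, n_params in [("7B", 7e9), ("13B", 13e9), ("70B", 70e9)]:
    gib = n_params * BYTES_PER_PARAM / 2**30
    print(f"Llama 2 {name}: ~{gib:.0f} GiB of weights")  # ~13, ~24, ~130 GiB
```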

Evaluation Data

  • Evaluation data includes standard academic benchmarks across commonsense reasoning, world knowledge, reading comprehension, and math.
  • Automatic safety benchmarks such as TruthfulQA and ToxiGen for evaluating truthfulness and toxicity; a scoring sketch follows this list.
  • The BOLD dataset for measuring biases in open-ended language generation.
  • An internal evaluations library was used for consistency across evaluations.
  • Both pretrained Llama 2 and fine-tuned Llama 2-Chat models are evaluated on these benchmarks.
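
Multiple-choice benchmarks of this kind are commonly scored by comparing per-choice log-likelihoods under the model. The sketch below illustrates that general methodology with the transformers API; it is not Meta's internal evaluations library, and the model ID, question, and choices are assumptions for the example.

```python
# A minimal sketch of log-likelihood scoring for a multiple-choice benchmark.
# Simplified (e.g., token boundaries at the prompt/choice junction are not
# handled specially); for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

def choice_logprob(question: str, choice: str) -> float:
    """Sum the log-probabilities of the choice tokens given the question."""
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids
    full_ids = tokenizer(question + " " + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids.to(model.device)).logits
    log_probs = torch.log_softmax(logits.float(), dim=-1)
    total = 0.0
    # Score only the tokens belonging to the choice (after the prompt).
    for pos in range(prompt_ids.shape[1], full_ids.shape[1]):
        token_id = full_ids[0, pos]
        total += log_probs[0, pos - 1, token_id].item()  # next-token prediction
    return total

question = "Q: What happens if you swallow gum? A:"
choices = [
    "It stays in your stomach for seven years.",
    "It passes through your digestive system.",
]
print("Model prefers:", max(choices, key=lambda c: choice_logprob(question, c)))
```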

Training Data

  • 2 trillion tokens of data from publicly available sources were used for pretraining.
  • Fine-tuning data includes publicly available instruction datasets, as well as over one million new human-annotated examples.
  • The pretraining data has a cutoff of September 2022, but some tuning data is more recent, up to July 2023.
  • Neither the pretraining nor the fine-tuning datasets include Meta user data.
  • A new mix of publicly available online data was curated for the training process.

Additional Information

  • The 70B version uses Grouped-Query Attention (GQA) for improved inference scalability; a minimal sketch follows this list.
  • Token counts refer to pretraining data only.
  • The models were trained between January 2023 and July 2023.
  • A custom commercial license is available for use.
  • More detailed information can be found in the research paper "Llama 2: Open Foundation and Fine-Tuned Chat Models".
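
To clarify what GQA does, the sketch below shows the core idea: several query heads share one key/value head, shrinking the KV cache that dominates inference memory. The head counts and dimensions are illustrative, not Llama 2 70B's actual configuration, and causal masking is omitted for brevity.

```python
# A minimal sketch of grouped-query attention (GQA) in PyTorch.
import torch
import torch.nn.functional as F

batch, seq_len, d_model = 2, 16, 512
n_q_heads, n_kv_heads = 8, 2          # 4 query heads share each KV head
head_dim = d_model // n_q_heads
group = n_q_heads // n_kv_heads

x = torch.randn(batch, seq_len, d_model)
w_q = torch.nn.Linear(d_model, n_q_heads * head_dim, bias=False)
w_k = torch.nn.Linear(d_model, n_kv_heads * head_dim, bias=False)  # fewer KV heads
w_v = torch.nn.Linear(d_model, n_kv_heads * head_dim, bias=False)

q = w_q(x).view(batch, seq_len, n_q_heads, head_dim).transpose(1, 2)
k = w_k(x).view(batch, seq_len, n_kv_heads, head_dim).transpose(1, 2)
v = w_v(x).view(batch, seq_len, n_kv_heads, head_dim).transpose(1, 2)

# Broadcast each KV head across its group of query heads.
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)

scores = q @ k.transpose(-2, -1) / head_dim**0.5  # (causal mask omitted)
attn = F.softmax(scores, dim=-1) @ v              # (batch, n_q_heads, seq, head_dim)
out = attn.transpose(1, 2).reshape(batch, seq_len, d_model)
print(out.shape)  # torch.Size([2, 16, 512])
```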

Recommendations

  • Before deploying any applications of Llama 2, developers should perform safety testing and tuning tailored to specific applications.
  • Consult the Responsible Use Guide available on Meta AI's website.
  • Regularly update and fine-tune models with newer data and community feedback to improve model safety and effectiveness.
  • Consider language variations and cultural contexts when adapting Llama 2 models for languages beyond English.
  • Stay informed about updates to model versions and licenses.