Unified Model Records

LLaVA-1.6 Vicuna 13B

Type: model

Tags: opensource
Publisher: LMSYS Org
Released: 2023-03-30
Version: 1.0.0

Metadata

General information.

name: LLaVA-1.6 Vicuna 13B
description: Science QA is a dataset designed for evaluating and training models in the task of answering science-related questions.
version: 1.0.0
publisher: LMSYS Org
model type: Chatbot
release date: 2023-03-30

Relations

Relationship Graph

Figure: Relationship graph for LLaVA-1.6-Vicuna-13B.

Intended Use

  • Research and academic work that advances natural language processing and chatbot design.
  • Exploration of chatbot technologies by developers and enthusiasts, including use in non-commercial projects.
  • Teaching of AI and chatbot concepts in educational institutions.
  • Customer support and engagement chatbots for non-profit organizations.
  • Contribution and improvement by the open-source community.

Factors

  • Performance comparison with other models like ChatGPT and Google Bard.
  • Cost-effectiveness of training the model.
  • Accessibility of the model for non-commercial use.
  • Potential for the community to contribute to model improvement.
  • Ease of integration into existing systems for developers.

Evaluation Data

  • 70K user-shared ChatGPT conversations were used for fine-tuning.
  • GPT-4 was used for a preliminary evaluation that benchmarked the model against ChatGPT and Google Bard.
  • Responses were compared on how detailed and well structured they were.
  • Eight question categories were devised to assess different aspects of chatbot performance.
  • Comparisons were based on the helpfulness, relevance, accuracy, and detail of the responses.
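The judge-based comparison described above can be sketched as follows. This is a minimal illustration, not the evaluation code used by the authors: the prompt wording, the function names `build_judge_prompt` and `tally`, and the numeric score format are all assumptions made for the example. No model is called; the sketch only shows how a comparison prompt could be assembled and how per-category judge scores could be tallied.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# The four criteria named in the evaluation description.
CRITERIA = ("helpfulness", "relevance", "accuracy", "detail")


def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    """Assemble a pairwise comparison prompt (hypothetical wording)."""
    criteria = ", ".join(CRITERIA)
    return (
        f"Question: {question}\n\n"
        f"Assistant A: {answer_a}\n\n"
        f"Assistant B: {answer_b}\n\n"
        f"Rate each assistant on {criteria}."
    )


def tally(
    judgements: Iterable[Tuple[str, float, float]]
) -> Dict[str, Dict[str, int]]:
    """Count wins and ties per question category.

    Each judgement is (category, score_a, score_b), where the scores are
    whatever numeric ratings the judge returned (format assumed here).
    """
    results: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"a_wins": 0, "b_wins": 0, "ties": 0}
    )
    for category, score_a, score_b in judgements:
        if score_a > score_b:
            results[category]["a_wins"] += 1
        elif score_b > score_a:
            results[category]["b_wins"] += 1
        else:
            results[category]["ties"] += 1
    return dict(results)
```

Grouping the tallies by category keeps the per-category view that the eight question categories were designed to provide, rather than collapsing everything into one overall score.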

Training Data

  • Approximately 70K user-shared conversations were gathered from ShareGPT.com.
  • HTML content was converted back to markdown and filtered to maintain data quality.
  • Lengthy conversations were divided into smaller segments that fit the model's maximum context length.
  • The dataset was expanded so that Vicuna better handles long context.
  • Gradient checkpointing and flash attention were applied to keep GPU memory requirements manageable.
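The segmentation step above, dividing long conversations to fit the maximum context length, might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's preprocessing code: `count_tokens` is a whitespace-based stand-in for a real tokenizer, and the `Turn` layout and the function name `split_conversation` are hypothetical.

```python
from typing import Dict, List

Turn = Dict[str, str]  # assumed layout, e.g. {"role": "human", "text": "..."}


def count_tokens(text: str) -> int:
    """Stand-in token counter: whitespace-separated words, not a real tokenizer."""
    return len(text.split())


def split_conversation(turns: List[Turn], max_tokens: int) -> List[List[Turn]]:
    """Greedily pack whole turns into segments of at most max_tokens tokens.

    A single turn longer than max_tokens becomes its own segment;
    truncating it is left to the caller.
    """
    segments: List[List[Turn]] = []
    current: List[Turn] = []
    used = 0
    for turn in turns:
        n = count_tokens(turn["text"])
        # Start a new segment when this turn would overflow the budget.
        if current and used + n > max_tokens:
            segments.append(current)
            current, used = [], 0
        current.append(turn)
        used += n
    if current:
        segments.append(current)
    return segments
```

Splitting on turn boundaries rather than mid-turn keeps each segment a readable exchange, at the cost of some unused context in each segment.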

Additional Information

  • Vicuna is an open-source project with its code and weights available on GitHub.
  • The model is designed for non-commercial use, adhering to LLaMA model license and OpenAI's data usage policies.
  • The cost of training Vicuna-13B was around $300.
  • The team acknowledges the need for further rigorous evaluation of the model.
  • Community engagement and contributions are encouraged for continued improvement of Vicuna.

Recommendations

  • Use Vicuna-13B for experimental and research purposes to explore limitations and potential improvements.
  • Consider the ethical implications of deploying chatbots and adhere to fair use policies.
  • Because Vicuna is open-source, contributions such as bug fixes, feature enhancements, and training on diverse datasets are recommended.
  • Engage with the development community through discussions, sharing use cases, and providing feedback.
  • Stay informed about future updates and versions by following the project's official communication channels.