- name
- LLaVA-1.6 Vicuna 13B
- description
- Science QA is a dataset designed for evaluating and training models in the task of answering science-related questions.
- version
- 1.0.0
- publisher
- LMSYS Org
- model type
- Chatbot
- release date
- 2023-03-30
LLaVA-1.6 Vicuna 13B
Type: model
Tags: opensource
Publisher: LMSYS Org
Released: 2023-03-30
v.1.0.0
Metadata
General information.
Relations
Relationship Graph
Intended Use
- Research and academic purposes to advance the field of natural language processing and chatbot design.
- Developers and enthusiasts to explore chatbot technologies and implement in non-commercial projects.
- Educational institutions for teaching concepts related to AI and chatbots.
- Non-profit organizations to enhance their customer support or engagement through chatbots.
- Open-source community contribution and improvement.
Factors
- Performance comparison with other models like ChatGPT and Google Bard.
- Cost-effectiveness of training the model.
- Accessibility of the model for non-commercial use.
- Potential for the community to contribute to model improvement.
- Ease of integration into existing systems for developers.
Evaluation Data
- 70K user-shared ChatGPT conversations were used for fine-tuning.
- GPT-4 used for preliminary evaluation to benchmark against ChatGPT and Google Bard.
- Responses compared for detailed and well-structured answers.
- Eight question categories devised to assess various aspects of chatbot performance.
- Comparison based on helpfulness, relevance, accuracy, and detail of responses.
Training Data
- Approximately 70K user-shared conversations gathered from ShareGPT.com.
- HTML content converted back to markdown to filter and maintain data quality.
- Lengthy conversations divided into smaller segments to fit model's maximum context length.
- Dataset expanded to ensure Vicuna understands long context.
- Innovations such as gradient checkpointing and flash attention applied to manage GPU memory requirements efficiently.
Additional Information
- Vicuna is an open-source project with its code and weights available on GitHub.
- The model is designed for non-commercial use, adhering to LLaMA model license and OpenAI's data usage policies.
- The cost of training Vicuna-13B was around \$300.
- The team acknowledges the need for further rigorous evaluation of the model.
- Community engagement and contributions are encouraged for continued improvement of Vicuna.
Recommendations
- Use Vicuna-13B for experimental and research purposes to explore limitations and potential improvements.
- Consider the ethical implications of deploying chatbots and adhere to fair use policies.
- Because Vicuna is open-source, contributions such as bug fixes, feature enhancements, and training on diverse datasets are recommended.
- Engage with the development community through discussions, sharing use cases, and providing feedback.
- Stay informed about future updates and versions by following the project's official communication channels.