Generative AI

[Databricks] The Big Book of Generative AI


Databricks-Big-Book-Of-GenAI-FINAL.pdf
3.79MB


5 ways to leverage your data to build production-quality Gen AI applications

 

Contents

  • Introduction
  • The Path to Deploying Production-Quality GenAI Applications
    • Stage 0: Foundation Models
      • Use Case: Introducing DBRX: A New State-of-the-Art Open LLM
    • Stage 1: Prompt Engineering
      • Use Case: Automated Analysis of Product Reviews Using Large Language Models
    • Stage 2: Retrieval Augmented Generation (RAG)
      • Use Case: Improve Your RAG Application Response Quality With Real-Time Structured Data
    • Stage 3: Fine-Tuning a Foundation Model
      • Use Case: Creating a Bespoke LLM for AI-Generated Documentation
      • Use Case: Efficient Fine-Tuning With LoRA: A Guide to Optimal Parameter Selection for Large Language Models
    • Stage 4: Pretraining
      • Use Case: Training Stable Diffusion From Scratch for <$50K With MosaicML
    • Stage 5: LLM Evaluation
      • Use Case: Best Practices for LLM Evaluation of RAG Applications
      • Use Case: Offline LLM Evaluation: Step-by-Step GenAI Application Assessment on Databricks
  • Summary
    • GenAI Training
    • Additional Resources

 

 

Introduction

Achieving Production-Quality GenAI Requires New Tools and Skills

Generative AI has opened new worlds of possibilities for businesses and is being emphatically embraced across organizations. According to a recent MIT Tech Review report, The Great Acceleration: CIO perspectives on generative AI | Databricks, all 600 CIOs surveyed stated they are increasing their investment in AI, and 71% are planning to build their own custom large language models (LLMs) or other GenAI models. However, many organizations have found it challenging to deploy these applications at production quality. To meet the standard of quality required for customer-facing applications, AI output must be accurate, governed and safe.

 

Data Infrastructure Must Evolve to Support GenAI-Powered Applications

Making the leap to generative AI is not just deploying a chatbot; it requires a reshaping of the foundational aspects of data management. Central to this transformation is the emergence of data lakehouses as the new "modern data stack."

Data Lakehouse Architecture | Databricks

 

These advanced data architectures are essential to harnessing the full potential of GenAI, driving faster, more cost-effective and wider democratization of data and AI technologies. As businesses increasingly rely on GenAI-powered tools and applications for competitive advantage, the underlying data infrastructure must evolve to support these advanced technologies effectively and securely. 

 

No matter where you are on your path to deploying GenAI applications, the quality of your data matters

Businesses need to achieve production quality with their GenAI applications. Developers need rich tools for understanding the quality of their data and model outputs, along with an underlying platform that lets them combine and optimize all aspects of the GenAI process. GenAI has many components, such as data preparation, retrieval models, language models (either SaaS or open source), ranking and post-processing pipelines, prompt engineering, and training models on custom enterprise data.

 

To help you overcome common enterprise challenges with building GenAI, we've compiled a collection of technical content and code samples. We'll start each section with a brief overview and then provide use cases and example code for reference. 

 

In this eBook, you'll learn: 

- How to plan a path from basic to advanced GenAI applications, leveraging your organization's data

- How to use retrieval augmented generation (RAG) to make an off-the-shelf AI system smarter

- How to evaluate LLMs and determine where to invest in more powerful AI tools and systems that drive more significant operational gains

- How to build a custom LLM that may be better, faster and cheaper for your organization

- When it might be worth it to pretrain your own model - and more

 

Use cases for GenAI covered: 

- How to use LLMs to gain actionable insights from product reviews

- How to use RAG for a chatbot to improve the quality of output

- How to train your own generative AI model in a cost-effective manner

- How to monitor and evaluate your deployed LLMs and GenAI applications

 

GenAI journey 

Plan an iterative path from basic to advanced GenAI, leveraging your data

- Prompt engineering: Crafting specialized prompts and pipelines to guide GenAI behavior

- Retrieval augmented generation (RAG): Combining an LLM with custom enterprise data

- Fine-tuning: Adapting a pre-trained GenAI model to specific data sets or domains

- Pre-training: Training a GenAI model from scratch

 

 

 

The Path to Deploying Production-Quality GenAI Applications

Stage 0: Foundation Models

Before setting off to create production-quality GenAI applications, we need to cover the base language models that serve as the foundation for layers of increasingly complex techniques. Foundation models commonly refer to large language models that have been trained over extensive datasets to be generally good at some task (chat, instruction following, code generation, etc.).

 

We won't cover many models, as it is a constantly shifting landscape, but it is important to note that while underlying architectures may differ drastically, foundation models generally fall under two categories: 

1) Proprietary (such as GPT-3.5 and Gemini) and

2) open source (such as Llama2-70B and DBRX) 

The main difference between the two is that while proprietary models historically have an edge in outright performance, users have to send their data out to a third party and don't have control over the underlying model, as these models are often updated and changed.

 

Open source models, on the other hand, offer users full control over the model and the ability to run it on their own terms with their own governance and data privacy. Here's a current list of many open source GenAI models across different domains that are all free for commercial use. Open Source Large Language Models | Databricks   

 

Databricks has also created their own state-of-the-art open source foundation model so users can build the highest-quality production GenAI applications. 

 

 

Foundation model use case

1) Introducing DBRX: A New State-of-the-Art Open LLM

We are excited to introduce DBRX, an open, general-purpose LLM created by Databricks. Across a range of standard benchmarks, DBRX sets a new state-of-the-art for established open LLMs. Moreover, it provides the open community and enterprises building their own LLMs with capabilities that were previously limited to closed model APIs; according to our measurements, it surpasses GPT-3.5, and it is competitive with Gemini 1.0 Pro. It is an especially capable code model, surpassing specialized models like CodeLLaMA-70B on programming, in addition to its strength as a general-purpose LLM.

 

This state-of-the-art quality comes with marked improvements in training and inference performance. DBRX advances the state-of-the-art in efficiency among open models thanks to its fine-grained mixture-of-experts (MoE) architecture. Inference is up to 2x faster than LLaMA2-70B, and DBRX is about 40% of the size of Grok-1 in terms of both total and active parameter-counts. When hosted on Mosaic AI Model Serving, DBRX can generate text at up to 150 tok/s/user. Our customers will find that training MoEs is also about 2x more FLOP-efficient than training dense models for the same final model quality. End-to-end, our overall recipe for DBRX (including the pretraining data, model architecture, and optimization strategy) can match the quality of our previous-generation MPT models with nearly 4x less compute. 

→ Quality was measured across language understanding, programming and math benchmarks

 

 

The weights of the base model (DBRX Base) and the fine-tuned model (DBRX Instruct) are available on Hugging Face under an open license. Starting today, DBRX is available for Databricks customers to use via APIs, and Databricks customers can pretrain their own DBRX-class models from scratch or continue training on top of one of our checkpoints using the same tools and science we used to build it. DBRX is already being integrated into our GenAI-powered products, where - in applications like SQL - early rollouts have surpassed GPT-3.5 Turbo and are challenging GPT-4 Turbo. It is also a leading model among open models and GPT-3.5 Turbo on RAG tasks.

 

Training mixture-of-experts models is hard. We had to overcome a variety of scientific and performance challenges to build a pipeline robust enough to repeatedly train DBRX-class models in an efficient manner. Now that we have done so, we have a one-of-a-kind training stack that allows any enterprise to train world-class MoE (Mixture-of-experts) foundation models from scratch. We look forward to sharing that capability with our customers and sharing our lessons learned with the community. 

 

Download DBRX today from Hugging Face

- DBRX Base : databricks/dbrx-base · Hugging Face

- DBRX Instruct : databricks/dbrx-instruct · Hugging Face

 

or try out DBRX Instruct in our HF Space, 

- HF Space : DBRX Instruct - a Hugging Face Space by databricks

 

or see our model repository on github

- databricks/dbrx : GitHub - databricks/dbrx: Code examples and resources for DBRX, a large language model developed by Databricks

 

 

 

What is DBRX?   Databricks launches general-purpose large language model 'DBRX' - ZDNet Korea

DBRX is a transformer-based decoder-only large language model (LLM) that was trained using next-token prediction. It uses a fine-grained mixture-of-experts (MoE) architecture with 132B total parameters, of which 36B parameters are active on any input. It was pretrained on 12T tokens of text and code data. Compared to other open MoE models like Mixtral and Grok-1, DBRX is fine-grained, meaning it uses a larger number of smaller experts. DBRX has 16 experts and chooses 4, while Mixtral and Grok-1 have 8 experts and choose 2. This provides 65x more possible combinations of experts, and we found that this improves model quality. DBRX uses rotary position encodings (RoPE), gated linear units (GLU) and grouped query attention (GQA). It uses the GPT-4 tokenizer as provided in the tiktoken repository. We made these choices based on exhaustive evaluation and scaling experiments.
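The expert-routing arithmetic above can be checked directly: with 16 experts choosing 4, versus 8 experts choosing 2, the number of possible expert subsets per token is a binomial coefficient.

```python
from math import comb

# DBRX routes each token to 4 of its 16 experts; Mixtral and Grok-1
# route to 2 of 8. Count the possible expert subsets in each case.
dbrx_combos = comb(16, 4)
mixtral_combos = comb(8, 2)
print(dbrx_combos, mixtral_combos, dbrx_combos // mixtral_combos)  # 1820 28 65
```

1820 subsets versus 28 gives exactly the 65x figure quoted above.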

 

DBRX was pretrained on 12T tokens of carefully curated data and a maximum context length of 32k tokens. We estimate that this data is at least 2x better token-for-token than the data we used to pretrain the MPT family of models. This new dataset was developed using the full suite of Databricks tools, including Apache Spark and Databricks notebooks for data processing, Unity Catalog for data management and governance, and MLflow for experiment tracking. We used curriculum learning for pretraining, changing the data mix during training in ways we found to substantially improve model quality. 
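DBRX's actual data mix is not public, but the curriculum-learning idea mentioned above can be sketched as a step-dependent set of sampling weights. The sources and numbers below are purely illustrative assumptions, not the DBRX recipe.

```python
def data_mix(step, total_steps):
    # Hypothetical curriculum: linearly shift sampling weight away from raw
    # web text toward code and curated data as training progresses.
    frac = step / total_steps
    return {
        "web": 0.8 - 0.4 * frac,
        "code": 0.1 + 0.2 * frac,
        "curated": 0.1 + 0.2 * frac,
    }

print(data_mix(0, 100))  # start of training: {'web': 0.8, 'code': 0.1, 'curated': 0.1}
```

The weights always sum to 1, so each training batch can be sampled from the sources in these proportions at any step.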

 

Quality

[Benchmark charts comparing DBRX Instruct with other open and closed models are omitted here; see the attached eBook PDF.]

Stage 1: Prompt Engineering

Many companies still remain in the foundational stages of adopting generative AI technology. They have no overarching AI strategy in place, no clear use cases to pursue and no access to a team of data scientists and other professionals who can help guide the company's AI adoption journey.

 

If this sounds like your business, a good starting point is an off-the-shelf LLM. While these LLMs lack the domain-specific expertise of custom AI models, experimentation can help you plot your next steps. Your employees can craft specialized prompts and workflows to guide their usage. Introducing MLflow 2.7 with new LLMOps capabilities | Databricks Blog

Your leaders can get a better understanding of the strengths and weaknesses of these tools as well as a clearer vision of what early success in AI might look like. Your organization can use things like the Databricks AI Playground to figure out where to invest in more powerful AI tools and systems that drive more significant operational gain and even use LLMs as a judge to help evaluate responses. 


Build GenAI Apps Faster with New Foundation Model Capabilities | Databricks Blog

Announcing MLflow 2.8 LLM-as-a-judge metrics and Best Practices for LLM Evaluation of RAG Applications, Part 2 | Databricks Blog

 

1) Practical applications of GenAI technology

Let's delve into a compelling use case that illustrates the power of prompt engineering with off-the-shelf LLMs. Consider the challenge many businesses face: sifting through vast amounts of product reviews to glean actionable insights. Without a dedicated team of data scientists or a clear AI strategy, this task might seem daunting. However, leveraging the flexibility of LLMs through prompt engineering offers a straightforward solution.
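As a minimal sketch of what such a prompt might look like (the wording and the `build_review_prompt` helper are illustrative, not taken from the eBook):

```python
def build_review_prompt(reviews):
    # Assemble one instruction-style prompt asking an off-the-shelf LLM
    # to return structured fields for each product review.
    joined = "\n".join(f"- {r}" for r in reviews)
    return (
        "For each product review below, extract the sentiment "
        "(positive/negative/neutral), the product aspect mentioned, "
        "and a suggested follow-up action. Answer in JSON.\n\n"
        f"Reviews:\n{joined}"
    )

prompt = build_review_prompt([
    "Battery drains within two hours.",
    "Love the screen, but shipping was slow.",
])
```

The resulting string would then be sent to whichever chat-completion API your organization uses; asking for JSON output makes the responses easy to aggregate downstream.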

 

 

Stage 2: Retrieval Augmented Generation (RAG)

Retrieval augmented generation (RAG) lets you bring in supplemental knowledge resources to make an off-the-shelf AI system smarter. RAG won't change the underlying behavior of the model, but it will improve the quality and accuracy of the responses. Building High Quality RAG Applications with Databricks | Databricks Blog
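The core RAG pattern fits in a few lines: retrieve the most relevant passages, then prepend them to the prompt. In this sketch a naive word-overlap scorer stands in for the vector-embedding search a real system would use; all names and documents here are illustrative.

```python
def retrieve(query, docs, k=2):
    # Score each document by word overlap with the query.
    # Production systems use embedding similarity and a vector index instead.
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)[:k]

def augment_prompt(query, docs):
    # Prepend retrieved passages so the LLM answers from enterprise data.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Returns are accepted within 30 days of purchase.",
    "Our headquarters are in San Francisco.",
    "Shipping is free on orders over $50.",
]
prompt = augment_prompt("What is the returns policy?", docs)
```

Because the model is told to answer only from the supplied context, its behavior is unchanged while its answers are grounded in your data.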

 

 

Stage 3: Fine-Tuning a Foundation Model

Moving beyond RAG to model fine-tuning lets you start building models that are much more deeply personalized to the business. If you have already been experimenting with commercial models across your operations, you are likely ready to advance to this stage. There's a clear understanding at the executive level of the value of generative AI, as well as an understanding of the limitations of publicly available LLMs. Specific use cases have been established. And now, you and your enterprise are ready to go deeper. 

With fine-tuning, you can take a general-purpose model and train it on your own specific data. For example, data management provider Stardog relies on the Mosaic AI tools from Databricks to fine-tune the off-the-shelf LLMs they use as a foundation for their Knowledge Graph Platform. This enables Stardog's customers to query their own data across different silos simply by using natural language. Building High Quality RAG Applications with Databricks | Databricks Blog
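The contents also list a LoRA fine-tuning guide; as a toy numeric sketch of the LoRA idea (not Databricks code), a frozen weight matrix receives a trainable low-rank update, so only a small fraction of the parameters are actually trained:

```python
import numpy as np

d, r = 64, 4
rng = np.random.default_rng(0)
W = rng.standard_normal((d, d))  # frozen pretrained weight
A = rng.standard_normal((d, r))  # trainable low-rank factor
B = np.zeros((r, d))             # trainable, zero-init so training starts at W
W_eff = W + A @ B                # effective weight used in the forward pass

trainable = A.size + B.size      # 2*d*r = 512 parameters
full = W.size                    # d*d = 4096 parameters
print(f"training {trainable}/{full} = {trainable/full:.1%} of the parameters")
```

Zero-initializing `B` means the adapted model is identical to the base model at the start of fine-tuning; gradient updates then move only `A` and `B`, here about 12.5% of the full matrix (far less at the ranks and dimensions used in practice).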

 

 

Stage 4: Pretraining

Pretraining a model from scratch refers to the process of training a language model on a large corpus of data (e.g., text, code) without using any prior knowledge or weights from an existing model. This is in contrast to fine-tuning, where an already pretrained model is further adapted to a specific task or dataset. The output of full pretraining is a base model that can be directly used or further fine-tuned for downstream tasks.
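The next-token-prediction objective behind this pretraining is simple to state: every fixed-length window of the token stream is an input, and the token that follows it is the target. A minimal sketch of how such training pairs are formed:

```python
def next_token_pairs(tokens, context_len=3):
    # Slide a window over the corpus; each window predicts the next token.
    return [
        (tokens[i : i + context_len], tokens[i + context_len])
        for i in range(len(tokens) - context_len)
    ]

pairs = next_token_pairs([10, 11, 12, 13, 14], context_len=3)
# -> [([10, 11, 12], 13), ([11, 12, 13], 14)]
```

At pretraining scale the same construction is applied to trillions of tokens (12T for DBRX), with the model trained to assign high probability to each target token given its context.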

 

 

Stage 5: LLM Evaluation

Constant evaluation and monitoring of deployed large language models (LLMs) and generative AI applications are crucial due to the dynamic nature of both the data they interact with and the environments in which they operate. These systems learn from vast datasets and can evolve over time, potentially leading to shifts in performance, accuracy or even the emergence of biases. Continuous monitoring ensures that any deviation from expected behavior can be detected and corrected promptly, maintaining the integrity and reliability of the AI application. As user needs and societal norms change, ongoing evaluation allows these models to adapt, ensuring their outputs remain relevant, appropriate and effective. This vigilance not only mitigates risks associated with AI deployments, such as ethical concerns and regulatory compliance, but also maximizes the value and utility these technologies bring to organizations and end users.

Evaluating LLMs is a challenging and evolving domain (see LLM2 Module 3 - Deployment and Hardware | 3.8 MosaicML Guest Lecture on Training LLMs from Scratch (youtube.com)), primarily because LLMs often demonstrate uneven capabilities across different tasks. An LLM might excel in one benchmark, but slight variations in the prompt or problem can drastically affect its performance. The dynamic nature of LLMs and their vast potential applications only amplify the challenge of establishing comprehensive evaluation standards.
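As one deliberately simple, illustrative offline check (real evaluation suites combine many metrics, often with an LLM acting as judge), an answer can be scored on whether it covers the facts a reference answer is known to contain:

```python
def keyword_recall(answer, required_facts):
    # Fraction of expected facts that appear verbatim in the model's answer.
    hits = sum(1 for fact in required_facts if fact.lower() in answer.lower())
    return hits / len(required_facts)

score = keyword_recall(
    "Returns are accepted within 30 days.",
    ["30 days", "returns"],
)
print(score)  # 1.0
```

Tracking a metric like this over time on a fixed question set is one way to detect the behavioral drift described above after a model or prompt change.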

 

 

 

GenAI Training

Generative AI Engineer Learning Pathway : Take self-paced, on-demand and instructor-led courses on generative AI 

Databricks Announces the Industry’s First Generative AI Engineer Learning Pathway and Certification | Databricks Blog

 

Free LLM Course (edX) 

 

GenAI Webinar

 

 

 
