

[FoundationModel][Report] On the Opportunities and Risks of Foundation Models, Ongoing 24.04.15~


 

 

 

 

 

Abstract

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) trained on broad data (generally using self-supervision at scale) that can be adapted to a wide range of downstream tasks.

 

We call these models foundation models to underscore their critically central yet incomplete character.

 

This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotic manipulation, reasoning, human interaction) and technical principles (e.g., model architecture, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations).

 

Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization.

 

Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. 

 

Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. 

 

To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

 

 

Contents

1. Introduction

1.1. Emergence and homogenization

1.2. Social impact and the foundation models ecosystem

1.3. The future of foundation models

1.4. Overview of this report

 

2. Capabilities

2.1. Language

2.2. Vision

2.3. Robotics

2.4. Reasoning and search

2.5. Interaction

2.6. Philosophy of understanding

 

3. Applications

3.1. Healthcare and biomedicine

3.2. Law

3.3. Education

 

4. Technology 

4.1. Modeling

4.2. Training

4.3. Adaptation

4.4. Evaluation

4.5. Systems

4.6. Data

4.7. Security and privacy

4.8. Robustness to distribution shifts

4.9. AI safety and alignment

4.10. Theory

4.11. Interpretability

 

5. Society

5.1. Inequity and fairness

5.2. Misuse

5.3. Environment

5.4. Legality

5.5. Economics

5.6. Ethics of scale

 

6. Conclusion

 

Acknowledgements

 

References

 

 

 

 

 

1. Introduction

This report investigates an emerging paradigm for building artificial intelligence (AI) systems based on a general class of models which we term foundation models.

 

 

A foundation model is any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks; current examples include

- BERT : https://arxiv.org/abs/1810.04805 BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

- GPT-3 : https://arxiv.org/abs/2005.14165 Language Models are Few-Shot Learners

- CLIP : https://arxiv.org/abs/2103.00020 Learning Transferable Visual Models From Natural Language Supervision
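
To make "adapted (e.g., fine-tuned)" concrete, here is a minimal fine-tuning sketch: a pretrained BERT checkpoint gets a fresh classification head and its weights are updated on a labeled downstream dataset. It assumes the Hugging Face transformers and datasets libraries; the IMDB dataset, subset sizes, and hyperparameters are illustrative choices, not taken from the report.

```python
# Minimal fine-tuning sketch (assumed libraries: transformers, datasets).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # new task head on top of the pretrained body

dataset = load_dataset("imdb")  # any labeled downstream dataset would do

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=256)

encoded = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-imdb", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=encoded["train"].shuffle(seed=0).select(range(2000)),  # small subset for speed
    eval_dataset=encoded["test"].select(range(500)),
)
trainer.train()  # the foundation model's weights are updated for this one downstream task
```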

 

 

From a technological point of view, foundation models are not new - they are based on deep neural networks and self-supervised learning, both of which have existed for decades. 

 

 

However, the sheer scale and scope of foundation models from the last few years have stretched our imagination of what is possible; for example, GPT-3 has 175 billion parameters and can be adapted via natural language prompts to do a passable job on a wide range of tasks despite not being trained explicitly to do many of those tasks.
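
This "adapted via natural language prompts" behavior is in-context learning: a few demonstrations are written directly into the prompt and the model continues the pattern with no weight update at all. The sketch below shows the few-shot prompt format popularized by the GPT-3 paper; since GPT-3 itself is not openly downloadable, GPT-2 through the Hugging Face pipeline stands in as a placeholder (it will translate far less reliably), so the model choice is an assumption, not the report's setup.

```python
# Sketch of adaptation via a natural-language prompt (in-context learning).
# GPT-2 via the `transformers` pipeline stands in for GPT-3; the prompt
# format is the point, not the particular model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# The "training data" is just a few demonstrations written into the prompt;
# no gradient update touches the model's parameters.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])
```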

 

 

At the same time, existing foundation models have the potential to accentuate harms, and their characteristics are in general poorly understood. 

 

 

Given their impending widespread deployment, they have become a topic of intense scrutiny.

 

 

 

 

1.1. Emergence and homogenization

The significance of foundation models can be summarized by two words: emergence and homogenization.

 

- Emergence means that the behavior of a system is implicitly induced rather than explicitly constructed; it is both the source of scientific excitement and anxiety about unanticipated consequences. 

 

- Homogenization indicates the consolidation of methodologies for building machine learning systems across a wide range of applications; it provides strong leverage towards many tasks but also creates single points of failure. 

 

To better appreciate emergence and homogenization, let us reflect on their rise in AI research over the last 30 years. 

 

Fig. 1. The story of AI has been one of increasing emergence and homogenization. With the introduction of machine learning, how a task is performed emerges (is inferred automatically) from examples; with deep learning, the high-level features used for prediction emerge; and with foundation models, even advanced functionalities such as in-context learning emerge. At the same time, machine learning homogenizes learning algorithms (e.g., logistic regression), deep learning homogenizes model architectures (e.g., Convolutional Neural Networks), and foundation models homogenize the model itself (e.g., GPT-3).

 

 

 

Machine Learning.

Most AI systems today are powered by machine learning, where predictive models are trained on historical data and used to make future predictions. The rise of machine learning within AI started in the 1990s, representing a marked shift from the way AI systems were built previously: rather than specifying how to solve a task, a learning algorithm would induce it based on data - i.e., the how emerges from the dynamics of learning. Machine learning also represented a step towards homogenization: a wide range of applications could now be powered by a single generic learning algorithm such as logistic regression.
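
That first step of homogenization fits in a few lines. The sketch below, assuming scikit-learn (the toy datasets are standard sklearn examples, not from the report), applies the exact same logistic-regression recipe to two unrelated prediction problems; only the data changes.

```python
# One generic learning algorithm, two unrelated tasks (assumed library: scikit-learn).
from sklearn.datasets import load_breast_cancer, load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

for name, loader in [("tumor diagnosis", load_breast_cancer),
                     ("digit recognition", load_digits)]:
    X, y = loader(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = LogisticRegression(max_iter=5000).fit(X_train, y_train)  # same recipe both times
    print(f"{name}: test accuracy = {clf.score(X_test, y_test):.2f}")
```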

Despite the ubiquity of machine learning within AI, semantically complex tasks in natural language processing (NLP) and computer vision such as question answering or object recognition, where the inputs are sentences or images, still required domain experts to perform "feature engineering" - that is, writing domain-specific logic to convert raw data into higher-level features (e.g., SIFT in computer vision) that were more suitable for popular machine learning methods.
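
For contrast, here is a sketch of what that feature-engineering step looked like: hand-written, domain-specific code (SIFT descriptors in this hypothetical example) turns raw pixels into a fixed-length vector that a generic classifier can then consume. It assumes opencv-python (4.4+, where SIFT is available) and scikit-learn; the mean-pooling scheme is deliberately crude and only meant to show the division of labor between engineered features and the learning algorithm.

```python
# Sketch of classical feature engineering (assumed libraries: opencv-python >= 4.4, scikit-learn).
import cv2
import numpy as np
from sklearn.linear_model import LogisticRegression

def sift_features(image_path: str) -> np.ndarray:
    """Hand-engineered step: convert a raw image into a fixed-length feature vector."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    _, descriptors = sift.detectAndCompute(gray, None)  # N x 128 descriptor matrix
    if descriptors is None:                              # no keypoints found
        return np.zeros(128)
    return descriptors.mean(axis=0)                      # crudely pool into one 128-dim vector

# Downstream, the engineered features feed an ordinary learning algorithm, e.g.:
# X = np.stack([sift_features(p) for p in image_paths])
# clf = LogisticRegression(max_iter=1000).fit(X, labels)
```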

 

 

 

Deep Learning.

Around 2010, a revival of deep neural networks under the moniker of deep learning started gaining traction in the field of machine learning.

 

 

 

 
