Chapter 1 Summary: Introduction to Building AI Applications with Foundation Models

1. The Scaling Up of AI Models

  • Foundation models like ChatGPT and Gemini are highly powerful but require enormous compute resources and vast amounts of data.
  • Only a few organizations have the capability to train these models, leading to the “model as a service” approach, where others can access AI via APIs.

2. Rise of AI Engineering

  • Foundation models have roots in decades of language modeling research, from the 1950s onward.
  • Today’s LLMs evolved through advances such as self-supervised learning and scaling up data and compute.

3. Language Models and Tokenization

  • Language models predict the likelihood of words/tokens in context and can operate over multiple languages.
  • Text is broken into “tokens” (pieces of words, words, or characters), making models efficient and capable of understanding new words by decomposition.
  • The vocabulary size and tokenization method are determined by model developers (e.g., GPT-4’s vocabulary is 100,256 tokens).

Why Tokens?

  • Tokens balance meaningful representation, efficient model size, and the ability to handle unknown/new words.

4. Types of Language Models

  • Masked Language Models (e.g., BERT): Predict missing tokens using context from both sides. Good for non-generative tasks and understanding context.
  • Autoregressive Language Models: Predict the next token using only previous tokens; core to generative AI applications like text generation.

5. Self-Supervision

  • LLMs use self-supervised learning, where models learn to predict parts of their own input data, vastly reducing the need for manual labeling.
  • This allows training at massive scale compared to traditional supervised approaches.
  • The number of model parameters determines what is considered "large"; this threshold has increased over time as models have grown.

6. From LLMs to Foundation and Multimodal Models

  • Foundation models can work with multiple data modalities (text, image, video, etc.).
  • Multimodal models use self-supervision across modalities, such as pairing images and text (e.g., CLIP model).
  • These models mark a shift from task-specific to general-purpose AI, capable of many tasks out of the box and adaptable via prompt engineering, retrieval-augmented generation (RAG), or fine-tuning.

7. From Foundation Models to AI Engineering

  • AI engineering is the discipline of building applications on top of foundation models.
  • “Model as a service” enables rapid development, democratizes access, and accelerates AI adoption.
  • Coding, image/video production, writing, education, conversational bots, information aggregation, data organization, and workflow automation are prominent use cases.
  • Enterprises often adopt lower-risk internal use cases first; exposure of occupations to AI varies widely.

8. Planning AI Applications

  • Evaluating the business need and impact of AI is crucial: existential risk, productivity gains, or keeping pace with innovation.
  • Define success metrics early (automation rates, labor savings, response times).
  • Initial progress is often rapid, but perfecting applications to production quality is much more difficult and time-consuming.

9. Maintenance

  • AI products require ongoing maintenance due to the fast pace of model and infrastructure evolution.
  • Regulatory changes (GDPR, export controls) and market shifts (cost, IP) can introduce new risks and costs.

10. The AI Engineering Stack

  • Three layers:

    • Application Development: Prompts, context, evaluation, and user interface.
    • Model Development: Training, fine-tuning, dataset engineering, and inference optimization.
    • Infrastructure: Serving models, managing compute/data, monitoring.
  • Growth has been fastest in application development, with infrastructure needs remaining stable.

11. AI Engineering vs. ML Engineering

  • Traditional ML engineering focused on training proprietary models; AI engineering emphasizes model adaptation and evaluation using externally-provided models.
  • AI engineering must handle larger models, higher compute, and open-ended outputs, making efficient inference and robust evaluation more critical.

Model Adaptation

  • Prompt-based adaptation: Uses instructions and context without changing model weights.
  • Fine-tuning: Updates model weights for higher performance but requires more expertise and data.

Model Development Responsibilities

  • Modeling/training, dataset engineering, and inference optimization are all important, with increased focus on inference and data quality in foundation model era.

Application Development Responsibilities

  • Differentiation now comes from the application layer: evaluation, prompt engineering, and user interface design are key.
  • Open-ended tasks and various adaptation techniques make robust evaluation both more important and more challenging.

0 Comments

What do you think?