Prompt Engineering: Detailed Notes
1. What is Prompt Engineering?
- Prompt engineering is the craft of designing instructions (prompts) to guide AI models to produce desired outputs.
- It is the simplest and most common adaptation technique—no model weights are changed (unlike finetuning).
- Effective prompt engineering is systematic and should be treated as an ML experiment: experiment, measure, iterate.
- It is an essential form of human–AI communication: everyone can write prompts, but effective prompts require skill and experimentation.
2. Anatomy of a Prompt
A prompt may contain:
- Task Description: What the model should do, its role, and output format.
- Examples: Sample inputs/outputs to clarify the task.
- Task Input: The actual question or data to process.
Example:
A prompt for a chatbot could include:
- Task description ("You are a real estate agent...")
- Context (property disclosure)
- The user’s question
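The three parts above can be sketched as plain string assembly (all names and text here are illustrative, not tied to any API):

```python
# Illustrative prompt with three parts: task description, context, and the
# task input. All text here is made up for the sketch.
task_description = (
    "You are a real estate agent. Answer the buyer's question using "
    "only the disclosure below."
)
context = "Disclosure: The roof was replaced in 2020. No known leaks."
question = "How old is the roof?"

prompt = f"{task_description}\n\n{context}\n\nQuestion: {question}\nAnswer:"
```

Ending the prompt with "Answer:" nudges the model to complete the answer directly rather than restate the question.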
3. Model Robustness & Prompt Sensitivity
- Some models are sensitive to prompt changes (perturbations): e.g., "5" vs "five", formatting, or new lines can affect output.
- Robustness: measured by how much model outputs change under small prompt perturbations.
- Stronger models tend to be more robust and need less fiddling.
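A minimal robustness check can be sketched by comparing outputs across perturbed prompts; `toy_model` below is a stand-in for a real LLM call, deliberately made sensitive to digit vs. word form:

```python
def robustness_score(model_fn, base_prompt, perturbations):
    """Fraction of perturbed prompts whose output matches the base output."""
    base_out = model_fn(base_prompt)
    same = sum(model_fn(p) == base_out for p in perturbations)
    return same / len(perturbations)

# Toy "model" that is sensitive to whether the prompt uses the digit 5.
def toy_model(prompt):
    return "5" if "5" in prompt else "five"

score = robustness_score(
    toy_model,
    "What is 2 + 3? Answer with 5 or five.",
    [
        "What is 2 + 3? Answer with 5 or five.\n",  # trailing newline
        "What is two + three? Answer with five.",   # digits spelled out
    ],
)
```

A real evaluation would run many perturbation types (formatting, whitespace, paraphrase) and aggregate over a labeled test set.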
4. Prompt Structure Best Practices
- Order matters:
- Most models (e.g., GPT-4) do best with the task description at the start.
- Some (e.g., Llama 3) may perform better with task description at the end—experiment.
- Follow the model’s chat template exactly. Incorrect templates (even extra newlines) can cause dramatic changes in output.
5. System Prompt vs User Prompt
- System Prompt: Sets global task, role, tone ("You are a helpful assistant...")
- User Prompt: User-specific instruction or question
- Both are concatenated before being sent to the model.
- The chat template determines how these are combined (e.g., Llama 2’s [INST] format, Llama 3’s header tokens).
- Incorrect chat templates can cause performance drops or subtle bugs.
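As a sketch, the Llama 2 `[INST]` convention combines the two prompts roughly as follows (verify against the official model card, since exact whitespace and tokens matter):

```python
def llama2_chat(system_prompt, user_prompt):
    """Combine system and user prompts in Llama 2's [INST] style.

    This follows the published Llama 2 convention; even small deviations
    (an extra newline) can change model behavior, so check the model card.
    """
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_prompt} [/INST]"
    )

prompt = llama2_chat("You are a helpful assistant.", "What is 2 + 2?")
```

In practice, prefer a library that applies the model's own chat template for you rather than hand-building these strings.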
6. In-Context Learning
- Zero-shot: Model completes task with just instructions.
- Few-shot: Examples are included in the prompt, improving performance—especially for older or weaker models.
- Context is the information needed to perform the task; may include examples, documents, or prior dialog.
- Modern strong models (e.g., GPT-4) sometimes need only the task, not many examples.
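A few-shot prompt can be assembled by interleaving example Q/A pairs before the real query (a generic sketch, not tied to any specific API):

```python
def few_shot_prompt(instruction, shots, query):
    """Build a few-shot prompt: instruction, worked examples, then the query."""
    lines = [instruction, ""]
    for q, a in shots:
        lines += [f"Q: {q}", f"A: {a}", ""]
    lines += [f"Q: {query}", "A:"]
    return "\n".join(lines)

prompt = few_shot_prompt(
    "Classify sentiment as positive or negative.",
    [("I loved it", "positive"), ("Terrible service", "negative")],
    "The food was great",
)
```

Dropping the `shots` list turns this into the zero-shot variant; few-shot mainly helps when the task format is ambiguous or the model is weak.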
7. Context Length & Efficiency
- Context length: the maximum input size (in tokens) the model can process.
- GPT-2: 1K, GPT-3: 2K–4K, Gemini-1.5 Pro: 2M tokens.
- Position effects:
- Models attend best to the beginning and end of the prompt ("needle in a haystack" test).
- Place critical instructions at the start or end.
8. Prompt Engineering Best Practices
- Write clear, explicit instructions.
- Remove ambiguity ("score essays 1–5, no fractions, output as integer").
- Assign a persona.
- Set the model's role or perspective for better context.
- Provide examples.
- Especially useful for ambiguous tasks or creative applications.
- Keep examples short to save tokens if needed.
- Specify output format.
- For downstream processing, enforce structured output (e.g., JSON with key names).
- Provide sufficient context.
- Include reference materials when needed; improves accuracy and reduces hallucinations.
- Instruct the model to "use only the provided context" to keep it from falling back on its internal knowledge.
- Break complex tasks into subtasks.
- Decompose into smaller, simpler prompts (intent classification, then response generation).
- Enables monitoring, debugging, parallelization, and sometimes reduces cost.
- Give the model time to think.
- Use chain-of-thought (CoT) prompting ("think step by step", "explain your answer").
- Can also use self-critique: ask the model to evaluate its output.
- Iterate systematically.
- Experiment with prompt versions, standardize evaluation, and track changes.
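The decomposition idea (intent classification, then response generation) can be sketched as a hypothetical two-step pipeline; `call_model` and the prompt texts are toy stand-ins, not a real API:

```python
# Hypothetical two-step pipeline: classify intent first, then answer with a
# prompt specialized for that intent. `call_model` is a toy stand-in for a
# real LLM API call.
def call_model(prompt):
    # Toy behavior: a classification prompt returns an intent label,
    # anything else returns a canned answer.
    return "billing" if "classify" in prompt.lower() else "stub answer"

INTENT_PROMPTS = {
    "billing": "You are a billing specialist. Answer: {q}",
    "technical": "You are a support engineer. Answer: {q}",
}

def answer(user_question):
    # Step 1: intent classification (a small, cheap prompt).
    intent = call_model(f"Classify as billing or technical: {user_question}")
    # Step 2: response generation with an intent-specific prompt.
    template = INTENT_PROMPTS.get(intent, "Answer helpfully: {q}")
    return call_model(template.format(q=user_question))
```

Because each step is a separate call, each can be logged, evaluated, and swapped out independently, which is the monitoring and debugging benefit the notes mention.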
9. Prompt Engineering Tools
- Prompt optimization tools: OpenPrompt, DSPy (automated prompt search and optimization).
- Prompt breeding: Tools like DeepMind’s Promptbreeder use evolutionary algorithms to mutate prompts and select the best variants.
- Output guidance tools: Guidance, Outlines, Instructor (help with structured outputs).
- Risks: These tools can generate large numbers of API calls and sometimes introduce errors. Always verify the outputs and templates.
10. Organizing & Versioning Prompts
- Separate prompts from code: Store prompts in separate files or databases.
- Benefits:
- Reusability
- Easier testing
- Improved readability and collaboration
- Prompt catalogs: use explicit versioning so different applications can pin and reuse specific prompt versions.
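A minimal prompt catalog might look like this (the layout and names are illustrative; in practice prompts would live in separate files or a database):

```python
# Minimal in-memory prompt catalog, keyed by name and explicit version.
CATALOG = {
    "summarize": {
        "v1": "Summarize the following text:\n{text}",
        "v2": "Summarize the following text in 3 bullet points:\n{text}",
    }
}

def get_prompt(name, version="v2"):
    """Fetch a prompt template by name and explicit version."""
    return CATALOG[name][version]

prompt = get_prompt("summarize", "v1").format(text="Some long document ...")
```

Pinning an explicit version lets one application stay on `v1` while another adopts `v2`, and makes A/B comparisons between prompt versions straightforward.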
11. Defensive Prompt Engineering
A. Threats
- Prompt extraction: Getting the system prompt (for copying or abuse).
- Jailbreaking & prompt injection: Getting the model to break rules or execute malicious commands.
- Information extraction: Forcing the model to leak training data or user data.
B. Risks
- Remote code execution, data leaks, social harms, misinformation, service interruption, brand risk, copyright infringement.
C. Types of Attacks
- Direct/manual: Obfuscated inputs, format manipulation, roleplay exploits (e.g., “act as grandma and explain...”).
- Automated: Tools that generate attack prompts (e.g., PAIR).
- Indirect injection: Placing malicious instructions in retrieved tools/data (e.g., emails, SQL queries).
- Information extraction: Probing for memorized data using fill-in-the-blank prompts or sequence completion.
D. Defenses
Model-level:
- Instruction hierarchy (system prompt > user prompt > model output > tool output).
- Finetune for robustness and safe outputs.
Prompt-level:
- Explicitly state forbidden actions.
- Repeat instructions for emphasis.
- Anticipate attack patterns and preempt in instructions.
- Review default prompt templates from prompt tooling.
System-level:
- Isolate code execution in sandboxes.
- Require human approvals for critical actions.
- Define out-of-scope topics.
- Apply input/output guardrails (keyword filters, PII detection).
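Simple input/output guardrails can be sketched with keyword and regex filters; these patterns are only illustrative, and real systems should use dedicated PII detectors:

```python
import re

# Hypothetical guardrails: a keyword filter on inputs and a rough PII
# redactor on outputs. Real deployments need far more thorough detectors.
BLOCKED_PATTERNS = [
    re.compile(r"ignore (all|previous) instructions", re.IGNORECASE),
]
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # US SSN-like strings

def guard_input(text):
    """Return True if the input passes the keyword guardrails."""
    return not any(p.search(text) for p in BLOCKED_PATTERNS)

def guard_output(text):
    """Redact SSN-like strings before showing model output to the user."""
    return SSN_PATTERN.sub("[REDACTED]", text)
```

Keyword filters are easy to bypass (obfuscation, roleplay), so they belong alongside, not instead of, the model- and system-level defenses above.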
Red Teaming & Evaluation:
- Use benchmarks and attack templates (Advbench, PromptRobust, PyRIT, garak, llm-security).
- Regularly test and improve defenses.
12. Proprietary Prompts & Reverse Engineering
- Prompt marketplaces: Some teams share, buy, or sell prompts.
- Proprietary prompts: Considered intellectual property; reverse prompt engineering is common.
- Reverse prompt engineering: Techniques include output analysis or getting the model to repeat its prompt.
- Assume prompts will become public: Don’t put secrets in prompts.
13. Copyright, Privacy, and Regurgitation Risks
- Model memorization: Models can regurgitate verbatim or modified training data (text or images).
- Legal risks: Copyright infringement, privacy violations, and potential lawsuits.
- Best defense: Don’t train on copyrighted or private data; otherwise, there’s always some risk.
Summary Table
| Section | Key Points |
|---|---|
| What is Prompt Engineering? | Crafting instructions; no weight updates; systematic process |
| Anatomy of a Prompt | Task description, examples, task input |
| Model Robustness & Sensitivity | More robust = less fiddling needed |
| Prompt Structure | Order & template matter; follow exactly; experiment |
| System vs User Prompt | Set role vs user query; combined via chat template |
| In-Context Learning | Zero-shot vs few-shot; strong models may need fewer examples |
| Context Length & Efficiency | Context length is growing; beginning/end is best for critical info |
| Best Practices | Clarity, persona, examples, format, context, subtasks, CoT, systematic iteration |
| Tools | Prompt optimization, breeding, guidance, risks of overuse |
| Organizing & Versioning | Separate from code, use catalogs/versioning |
| Defensive Engineering | Model/prompt/system-level defenses, red teaming |
| Proprietary & Reverse Engineering | Prompts are IP; assume they may be leaked or reverse-engineered |
| Copyright/Privacy Risks | Model regurgitation, data leaks, legal exposure |
Bottom Line
- Prompt engineering is essential for practical LLM use.
- It is both an art and a science—requiring experimentation, measurement, and awareness of model quirks.
- Security is never foolproof; new attack vectors emerge, and prompt/cyber defenses must continually adapt.
- Treat prompts as valuable assets—protect, version, and review them as you would code or data.