1. When should you escalate from zero-shot to few-shot?
A Always start with few-shotB Only for mathC When zero-shot output is wrong or inconsistent — add examples incrementally, using the minimum neededD Never, few-shot is deprecated
2. What is "Prompt Injection" and how do you defend against it?
A Injecting code into the prompt for faster executionB A user crafts input that overrides your system prompt — defend by separating system instructions from user input and adding explicit refusal rulesC Adding more examples to improve output qualityD A technique for compressing long prompts
3. What is "Meta-Prompting"?
A Prompting about metadata in your databaseB Asking the AI to help you write a better prompt by having it ask you the right questionsC Using meta tags in HTML for SEOD A deprecated prompting technique
4. In the ReAct paradigm, what is the correct loop sequence?
A Action → Observation → ThoughtB Thought → Action → ObservationC Observation → Thought → ActionD Thought → Observation → Action
5. What is "Constraint Stacking"?
A Adding error handling to your codeB Layering multiple constraints (format, length, tone, audience, data requirements) to narrow the output space preciselyC Stacking multiple API calls in sequenceD Using multiple models simultaneously
6. What problem does "Structured Outputs" solve vs. asking for JSON in the prompt?
A Structured Outputs are fasterB Without it, the model may add markdown fences, extra commentary, or malformed JSON — Structured Outputs constrain the model to only output valid JSON matching your schemaC Structured Outputs only work with XMLD There is no difference
7. The course recommends changing only one thing at a time when iterating prompts. Why?
A To reduce API costsB If you change role, format, and examples simultaneously, you won't know which change improved or worsened the outputC Models can only process one instruction at a timeD It is an API requirement
8. What is "Data Leakage" in prompt security?
A Data being lost during API transmissionB Users trying to extract your system prompt or internal data by asking the model to repeat its instructionsC Memory leaks in your application codeD Accidentally logging user data
9. What temperature should you use for code generation and data extraction?
A 1.0+ for creativityB 0.7 as balanced defaultC 0.0 for deterministic, reproducible outputD Temperature is irrelevant for code
10. What is "Self-Consistency" and how does it build on Chain-of-Thought?
A It verifies grammar before respondingB It generates multiple reasoning chains at higher temperature, then takes the majority-vote answerC It forces the same answer regardless of phrasingD It checks answers against a pre-defined database
11. In a RAG pipeline, what is the correct order of steps?
A Generate → Retrieve → Embed → ChunkB Chunk → Embed → Store → Retrieve → Augment → GenerateC Store → Chunk → Generate → RetrieveD Embed → Store → Chunk → Generate
12. What is an "embedding" in the context of RAG?
A A way to embed images in HTMLB A numerical vector that captures the semantic meaning of text, enabling similarity search by meaning rather than keywordsC A method for compressing large documentsD A database index type
13. What does Top-P (Nucleus Sampling) control?
A Response length; always combine with temperatureB Diversity of token selection by limiting to top P% of probable tokens; tune either temperature OR Top-P, not bothC Number of paragraphs; use for creative tasksD Memory of previous conversations; always 1.0
14. What is the "cost-quality ladder" for model selection?
A Always use the most powerful modelB Start with the cheapest model that gives acceptable results; only upgrade when quality is insufficientC Alternate between cheap and expensive modelsD Use open-source exclusively
15. What is "Function Calling" (tool use) in LLMs?
A Calling JavaScript functions inside the prompt textB The model decides to invoke functions you define, returning structured JSON with function name and arguments, which your code executesC A way to call the OpenAI API from your codeD Automatically generating function documentation