Context Engineering vs. Prompt Engineering: Smarter AI with RAG & Agents | YouTube Summarizer

Category: AI Techniques

Tags: AI Context Engineering Prompt Techniques

Entities: Best Western Paris Inn DevOps conference Graeme Paris, France Paris, Kentucky Ritz

Summary

Introduction to Prompt and Context Engineering

Prompt engineering involves crafting input text to guide a large language model's behavior and output.
Context engineering encompasses assembling everything the model sees, including prompts, documents, memory, and tools.

Prompt Engineering Techniques

Role assignment helps the model adopt specific expertise and vocabulary.
Few shot examples demonstrate input/output pairs to guide format and style.
Chain of thought prompting encourages the model to show reasoning steps.
Constraint setting defines explicit boundaries for model responses.

Context Engineering Components

Agentic AI requires memory management, both short-term and long-term.
State management ensures agents maintain context in multi-step processes.
Retrieval augmented generation (RAG) connects agents to dynamic knowledge sources.
Tools enable LLMs to interact with external databases, APIs, and execute code.

Takeaways

Combine prompt and context engineering for optimal system performance.
Use role assignment to tailor model responses.
Incorporate few shot examples for precise output formats.
Apply constraint settings to manage response length and content.
Utilize RAG for contextually relevant information retrieval.

Transcript

00:00

I think by now most of us are familiar with the term prompt engineering. It's the process of crafting the input text used to prompt a large language model, including instructions and examples and formatting cues.

It's what steers the LLM's behavior and output. Now,

00:15

context engineering, on the other hand, is the broader discipline of programmatically assembling everything the LLM sees during inference. Now that includes prompts, but also retrieve documents and memory and tools—everything needed to deliver accurate responses.

So,

00:34

to demonstrate the difference, let me introduce you to an agentic AI model that I like to call Graeme. Secret Agent Graeme.

Agent Graeme specializes in travel booking. So, if I send Graeme this prompt, "Book me a hotel in Paris

00:51

for the DevOps conference next month," well, the agent responds with "Sure thing. The Best Western Paris Inn has great wifi and free parking.

It's booked." Cool. But the only trouble is the Best Western is located in Paris, Kentucky, and

01:06

that DevOps conference is in Paris, France. Now, you could argue that that's a failing of prompt engineering.

I wasn't specific on the location. But it could also be seen as a failing of context engineering because if Agent Graeme here was just a little smarter, well,

01:25

they could have used a tool to check my calendar or look up the conference online to find the right location. So ...

so let's try again with a follow-up prompt. My conference is in Paris, France.

€900 a night. Ritz booked.

01:40

Champagne. Breakfast included.

Well, uh, wish me luck getting that one approved through my company expense reimbursement system. But Graeme here can't really be blamed for that one because I didn't provide sufficient context.

I should have made my company's travel policy available to the agent.

01:58

Perhaps there's a JSON file specifying things like maximum permissible hotel rates for the area. So, prompt engineering—that's the craft of wording the instruction itself.

And context engineering is the system-level discipline of providing the model with what it needs to plausibly accomplish the task.

02:16

So let's take a look at these two terms a bit closer. And we'll start with the key techniques that make prompt engineering effective.

Now this is part art, part science. But there are several prompt engineering techniques that are now widely adopted.

So,

02:32

take for example, role assignment. This tells the LLM who it should be.

So, you are a senior Python developer reviewing code for security vulnerabilities. Well, that produces vastly different outputs

02:48

than a more generic code review request. The model adopts the expertise, the vocabulary and the concerns of that persona that we asked for.

Uh ... another good technique comes down to few shot examples.

So,

03:04

this is show, don't just tell. So, providing 2 or 3 examples of input/output pairs ...

that helps the model understand your exact format and style requirements. So, if you want JSON output with specific field names, well,

03:19

show it. Show it in the examples.

Now, before we had reasoning models trained on reinforcement learning, a pretty popular prompt engineering technique was called COT or chain of thought prompting.

03:36

Now this forces the model to show its work. Adding "let's think step by step" or "explain your reasoning"—that prevents the LLM from jumping to conclusions.

And it's particularly powerful for complex reasoning tasks. And then another technique is called

03:53

constraint setting. Here you define boundaries explicitly.

So, "limit your response to only 100 words" or "only use information from the provided context". And that helps prevent the model from going off on tangents.

04:08

Context engineering—that helps build dynamic, agentic systems to orchestrate the entire agentic environment. And let's take a look at some of the components of that.

Well, agentic AI. First of all, it needs memory.

04:24

And memory management can be thought of in two forms. So there's short-term memory ...

that might involve summarizing long conversations to stay within context windows so that past conversations are not forgotten. And then there's also long-term memory, and

04:42

that uses vector databases to retrieve things like user preferences and past trips and learned patterns. Then there is state management.

Now, this says where are we in a multi-step process? So, if an agent is booking a complete trip—the flight,

04:59

the hotel, the ground transportation, all of it— well, the agent needs to maintain state across these operations. Did the flight booking succeed?

What's the arrival time for scheduling the airport transfer? Stuff like that.

So, state ensures that the agent doesn't lose context mid task. Now,

05:18

another important component is retrieval augmented generation or RAG, that connects an agent to dynamic knowledge sources. So, RAG uses hybrid search which combines semantic and keyword matching based on context.

05:36

So, when retrieving your company's travel policy, RAG isn't returning the entire travel policy document. There's a lot of stuff that's just kind of irrelevant to the context in there.

So instead, it's picking out the relevant sections and the relevant exceptions

05:52

and returning only those contextually relevant parts back to the agent. And agents also need access to tools so they can actually go out and do stuff.

So LLMs by themselves,

06:08

they can't check real databases or call APIs or execute code. It's tools that bridge that gap, and a tool might query a SQL database, or it might fetch live pricing data, it might deploy infrastructure.

And where context engineering comes in

06:24

is in defining the interfaces that guide the LLM toward the correct usage. And tool descriptions— they specify what the tool does, when to use it, and what constraints apply.

And prompt engineering? Well, actually we should include that as well

06:39

because that is also part of context engineering. You can take a base written prompt like "analyze security logs for anomalies".

You can take that as your prompt and then at runtime, inject the prompt with current context,

06:55

like recent alerts and known false positives. And all of those variables in the prompt, they get populated from the states and the memory and the RAG retrievals.

So, that final prompt might be 80% dynamic content from there and 20% static instructions. So,

07:12

prompt engineering ... it gives you better questions.

Context engineering—that gives you better systems when you combine them properly. Hotel booked.

Paris, France. Under budget.

Near the venue. Excellent.

Thank you, Agent Graeme.

07:27

Pending approval from your manager, HR and finance. Estimated approval time: 6 to 8 weeks.

Uh, the conference is in two weeks. Have you tried prompt engineering your manager?