00:00
If you're doing anything with generative AI, you've probably heard of MCP. So what is MCP and what is it good for?
MCP stands for model context protocol. That protocol is a communication layer through which your large language model can
00:15
communicate with other software programs to extend its capabilities and make it better at things it's not normally good at. For example, large language models are not known for being the world's best mathematicians.
And while they continue to get better thanks to things like reasoning, it's still not
00:32
what they're best at. Luckily, traditional computer programs are really good at math.
It's kind of their thing. So, we can help our LLM by giving it access to traditional computer programs through the model context protocol.
The protocol is simply the agreed-upon way,
00:48
the agreed-upon standard by which our LLM communicates with our other programs. So, let me show you what that looks like.
So, I've created an MCP server by writing some code. My server is written in JavaScript.
You can write yours in Python as long as it uses the
01:04
model context protocol, which is the agreed-upon way that we can interact with our LLM-based AI agent. So, let's walk through this code.
So, we start by creating the MCP server and then we register a few tools. We have one for addition and one for subtraction.
In
01:20
order for our LLM to know when to use a particular tool, it's important to give it a description. And that way, when the user asks for something, the AI agent can check to see if it has a tool that can help with that.
For the addition tool, it adds two numbers together. And when the agent wants to use this tool,
01:37
the input schema says that what it needs to provide is two numbers. So when it provides two numbers, our code will add those two numbers together.
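The pieces described here can be sketched in plain Python, without the real MCP SDK. The registry, tool names, and helper functions below are all hypothetical stand-ins; an actual server would use the official SDK and speak the protocol over JSON-RPC:

```python
# A toy registry that mimics what an MCP server exposes for each tool:
# a name, a description the agent reads to decide when to use the tool,
# an input schema, and the function that does the actual work.
tools = {}

def register_tool(name, description, input_schema, fn):
    """Register a tool the way an MCP server would advertise it."""
    tools[name] = {
        "description": description,
        "inputSchema": input_schema,
        "fn": fn,
    }

register_tool(
    name="add",
    description="Adds two numbers together.",
    input_schema={"a": "number", "b": "number"},
    fn=lambda a, b: a + b,
)

register_tool(
    name="subtract",
    description="Subtracts the second number from the first.",
    input_schema={"a": "number", "b": "number"},
    fn=lambda a, b: a - b,
)

def call_tool(name, arguments):
    """Dispatch a tool call, like handling an incoming tool request."""
    tool = tools[name]
    # A real server would validate `arguments` against inputSchema here.
    return tool["fn"](**arguments)

print(call_tool("add", {"a": 2, "b": 3}))       # 5
print(call_tool("subtract", {"a": 9, "b": 4}))  # 5
```

The point is the shape: the description and input schema exist for the agent's benefit, and the function body is ordinary code the LLM never sees.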
So now the agent knows when I need to add two numbers together, I have this tool available to me. Now this looks
01:52
different in different languages. Here's an example written in Python.
But you still see the same things. You've got a title, you've got a description, and you have the expected input schema.
They show up in different places, but you still need to communicate the same information because again, even if it's
02:09
a different language, it still needs to adhere to the model context protocol. As long as it does that, we have a way to communicate.
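Under the hood, those messages are JSON-RPC 2.0, which is what makes the protocol language-agnostic. As a rough sketch (the tool name and arguments here match the toy addition tool, not any particular server), an agent asking a server to run a tool sends something like:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "add",
    "arguments": { "a": 1234, "b": 1234 }
  }
}
```

Whether the server that receives this was written in JavaScript or Python doesn't matter, as long as it understands the message and replies in kind.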
So, now that I've created my MCP server, I want to use it. So, I'm going to tell Gemini CLI how to interact with it.
I'm going to do that by
02:25
updating my settings.json file for Gemini CLI. And now, when I use Gemini CLI, I can run the /mcp command to see the list of tools available.
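That settings.json entry looks roughly like the following. The server name and file path are placeholders for wherever your server lives; I'm assuming the standard mcpServers block from the Gemini CLI documentation:

```json
{
  "mcpServers": {
    "calculator": {
      "command": "node",
      "args": ["/path/to/my-mcp-server.js"]
    }
  }
}
```

Gemini CLI launches the command as a subprocess and talks to it over the model context protocol.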
And there's the tool that I just created. Now, I can ask Gemini CLI to add 1234 plus
02:42
1234, and I get 2468.
Now, granted, addition is something that LLMs are getting much better at, and you might not need a separate server for this, but it doesn't just apply to math. We can use MCP servers for a whole host of different problems.
You can use an MCP
02:59
server to get search results, to get the latest weather, or to get any other information that wasn't in the LLM's original training data. Beyond retrieving information, MCP can also be used to take actions on your behalf.
So that could be adding something to a
03:14
database, pushing some code to GitHub, or even updating your to-do list. In these cases, you don't want anyone to just be authorized to make changes as you.
So MCP uses OAuth authentication to validate that the request is coming from
03:30
you. For this example, I'm going to use Linear, which is a tool for planning and building products.
They provide a remote MCP server that uses OAuth to verify that I am authenticated. Quick side note: for most of these examples, I'm using Cloud Shell, which is a cloud-based
03:45
terminal. I couldn't do that for this example because it doesn't have permission to open up a new tab in my browser, which is necessary in order to complete the OAuth flow.
Just like before, I need to add the MCP server to my Gemini settings.json file. Now I can
04:00
start up Gemini and run /mcp to see that Linear is a configured MCP server, but it's disconnected, which means I can't use it. To connect to the Linear MCP server, I need to run /mcp
04:16
auth linear. A browser tab opens up and I follow the login flow for Linear.
It's successful. So I can head back to the Gemini CLI.
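For reference, the remote server entry in settings.json points at a URL instead of a local command. A rough sketch follows; the exact key name and Linear's endpoint URL are assumptions here, so check the Gemini CLI and Linear documentation for the current values:

```json
{
  "mcpServers": {
    "linear": {
      "url": "https://mcp.linear.app/sse"
    }
  }
}
```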
Now, this time when I type /mcp, I can see that Linear is
04:31
enabled and I see a full list of all of the different tools available. So, what can we do now that the Linear MCP server is enabled?
Well, the first thing I want to do is see what my issues are in Linear. Now, interestingly, it saw the word "my" and that actually made a big
04:48
difference here because it has two tools. So, this is something MCP-related that I want you to realize.
One of the tools is called "list issues" and one of the tools is called "list my issues." And so here it told me that I don't have any
05:04
issues in Linear. And so what I really want to know is what are all of the issues.
So that was my bad. Get it?
My bad. Because "my." So I rewrote my prompt, and the Gemini CLI was able to figure out which tool to use this time.
So,
05:20
since I'm the only one using this Linear project, I want all of the tasks to be assigned to me. So, I'm going to ask the Gemini CLI to do that.
Assign all of the tasks to me. And voila, there we go.
And I'm just going to check back on the
05:35
Linear dashboard to make sure that that looks correct. And it does.
There's my picture. It looks like every single one of these tasks is now assigned to me.
Now, for the first time, Gemini CLI has taken an action on my behalf. It's not just retrieving information, but it's
05:50
also making changes. It assigned all of these tasks to me.
And now I want to keep going in that flow. I feel pretty good about how I've integrated Linear and about using Linear in general.
So I'm going to say that yes, I have gotten familiar with linear. So let's go ahead
06:07
and mark that task complete. So Gemini handled marking it as complete and it is no longer in my list of active tasks.
I'll add one more task which is to create a video about MCP and the Gemini CLI, the one I'm making right now. So, the task has been created and it has
06:24
been added to the backlog. But I don't want it in the backlog.
I'm actually doing that right now. It's in progress.
So, let's go ahead and mark it as active and in progress. And there we go.
Gemini CLI and the Linear MCP server are able to work together to do that. I can see
06:41
that it is in progress. The magic here is that I can update issues in Linear without ever leaving my terminal.
And if you're someone who uses the terminal for everything, it's one less tab you need to have open in your browser. So, you know that MCP stands for model context protocol.
But
06:57
you may have originally thought that MCP stood for monkeys cooking pancakes or mayonnaise can popper or manatees choosing pineapples. And those are just normal things that anybody could have thought of.
Anybody could have thought those things. Anybody.
But now you know
07:14
that MCP stands for model context protocol, and we can use it to extend the capabilities of our LLM-based AI agents, including to create custom generative AI videos with Veo 3 like the ones you just saw. Now, this MCP server
07:31
is a little bit more involved than the other ones that I've demoed. So, I'm going to include a link to how to do this in the description of the video below.
So to get this MCP server running on my computer, I had to clone the repository. I had to navigate to this MCP gen media directory and then I had
07:49
to install it by running this shell script. Once I installed the MCP server locally, I updated the Gemini CLI settings to include this MCP server.
Then I asked the Gemini CLI to create an animation for MCP and the Gemini CLI. So
08:06
rendering text in video is still one of the tougher tasks for video generation, and it didn't get it right 100% of the time, but I was still pretty impressed by what it came back with. So once again, we were able to use MCP to extend the capabilities of the Gemini
08:22
CLI to create some magnificent cartoon parrots. So all of this is to say that MCP isn't limited to solving math problems or keeping track of your to-do list or even creating videos.
The power of MCP is that you can flexibly
08:38
extend your AI agents to do a bunch of different things. And this will only become more powerful as more MCP servers become available.
And now I can use MCP one last time to mark this video complete.