Category: Technology
Tags: AI, automation, data, infrastructure, storage
Entities: AI, AIOps, Black Friday, block storage, disaster recovery, high availability, machine learning, replication, self-driving cars, self-driving storage, snapshots
00:00
Many of us know all about self-driving cars. Some of you have self-driving cars that are roaming around your city.
Some of you have vehicles that have this capability built in. I'm here to explain to you a concept called self-driving storage.
00:18
Now, I know what you're thinking: how is this guy going to explain a connection between a self-driving car and the storage infrastructure in my data center? Well, I'm here to connect the dots and explain how we can accomplish this.
So,
00:36
to have a self-driving car, you obviously need a car. And on the self-driving storage side, you also need a car that moves.
We're going to use block storage as our example here. Typically, block storage is not very mobile.
00:53
You place your storage, you allocate your storage, you place your data on that storage, and it doesn't really move anywhere. But in order for us to take full advantage of self-driving storage, we need to provide block storage that is mobile,
that can move around our storage infrastructure.
01:10
So let's first draw out how we typically provision resources for block storage. First, we have volumes.
Volumes store the data that we have from our servers or compute.
01:30
Here, our hosts access all of our data through these volumes. Now, optionally, you can put containers around these: volume groups
01:45
and host clusters. These are the basics of block storage.
All of us who know block storage use this paradigm to allocate and provision all of our resources. Now, the new thing we're going to add is to organize
02:04
all of these resources, all of these objects, together into one simple container that makes them mobile. So we're going to draw a box around this whole thing, and we're going to call this a storage partition.
02:24
We use the word partition in many of our server applications.
We call them LPARs, or logical partitions. But in this case we're using a storage partition to describe a part of a storage array:
a subset.
02:39
Just like we do for LPARs in our servers, we do the same here in our storage. So, each of our storage arrays can contain a multitude of these storage partitions.
Providing that gives us the opportunity to be able
02:56
to move the storage around in our storage infrastructure. It's the basis for all that we're going to do in self-driving storage.
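To make the idea of a storage partition concrete, here is a minimal sketch of the objects described so far. The class and field names (`Volume`, `StoragePartition`, `StorageArray`, `move_partition`) are assumptions made for illustration; they are not the API of any real storage product.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    size_gib: int


@dataclass
class StoragePartition:
    """A subset of a storage array: volumes plus the hosts that use them."""
    name: str
    volumes: list[Volume] = field(default_factory=list)
    hosts: list[str] = field(default_factory=list)
    # Optional containers: group name -> member names.
    volume_groups: dict[str, list[str]] = field(default_factory=dict)
    host_clusters: dict[str, list[str]] = field(default_factory=dict)

    def provisioned_gib(self) -> int:
        return sum(v.size_gib for v in self.volumes)


@dataclass
class StorageArray:
    name: str
    partitions: list[StoragePartition] = field(default_factory=list)

    def move_partition(self, partition_name: str, target: "StorageArray") -> None:
        """Hand the whole partition, with everything inside it, to another array."""
        for partition in self.partitions:
            if partition.name == partition_name:
                self.partitions.remove(partition)
                target.partitions.append(partition)
                return
        raise KeyError(partition_name)
```

Because the partition is the unit that moves, the volumes, hosts and optional groupings travel together rather than being re-created one by one.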
So now that we have a mobile car and a mobile storage partition,
03:13
the next thing we need to do is feed our AIOps brain with information. And the way we do that is to describe the storage partition in multiple ways.
That helps the machine learning in our AI platform make good determinations, good decisions,
03:31
and eventually be able to move the actual data from place to place. So we have some things to describe a storage partition.
We have metrics that describe capacity and performance, and we have protection: the ways that we can protect the data,
03:47
depending on the importance of the data you're trying to store on the devices. So, in the metrics section we have some things like capacity.
How much data are we storing on this device, and how much data is going to increase or decrease
04:04
as we write data from the hosts? We also have performance metrics, such as IOPS, or I/Os per second.
We have bandwidth. And we have latency.
04:20
Bandwidth describes how many cars are on the road. How congested is the pipe that's going between the server and the storage?
And latency is how fast the round trip is from point A to point B. Some applications require very,
04:36
very low latency, and some applications are more lenient in terms of the type of latency that is required. These are all metrics that the AIOps machine learning will go look at as the storage partitions are in place.
04:51
The second section talks about protection, and there are different types of protection depending on the importance of your data. We have things called snapshots, which are local to the storage array,
05:06
which protect the data from cyberattacks, ransomware, logical corruption. They're located on the storage array itself.
We have replication technologies:
ways to replicate the data between two different systems in the data center
05:24
and then also outside of the data center, at a different site. We have a capability called disaster recovery, which is the ability to replicate data to a different region.
So, if you have a regional disaster, power outage, or the data center goes down,
05:42
you can still access your data at a different site, which could be hundreds of miles away. We also have this concept called high availability, which is within the data center.
You have two storage arrays that are synchronously replicating between each other.
05:58
If either one of those storage arrays goes down, you are still able to maintain access to all of your applications. And then, also, some businesses
want to have a local, replicated
06:15
copy and an offsite DR copy. We call that HA plus DR.
So, all of these things can describe one of these storage partitions. Again, parts of the storage array that are in our storage interconnect.
06:33
These describe attributes of the storage partitions. Now, the other important thing in this whole concept is that we have to take this information and apply time to it.
06:49
A point-in-time copy of all of this information is interesting, but not that interesting, because you don't have a historical reference for it. So, as we're feeding this information up to our AIOps platform, we have to apply time to all of these items.
07:06
And that will help us make decisions. And it'll help the AI make correct decisions as to which data to move, when, and where.
So we take all this information. We're attaching it to the storage partitions, and then we're sending it up to our AIOps platform.
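One way to picture "applying time" to a partition's description is a timestamped sample that carries both the metrics and the protection scheme. This is only a sketch: the record layout, the units and the names `PartitionSample`, `Protection` and `record` are assumptions, not a real AIOps schema.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Protection(Enum):
    SNAPSHOT = "snapshot"
    REPLICATION = "replication"
    DR = "disaster recovery"
    HA = "high availability"
    HA_PLUS_DR = "HA plus DR"


@dataclass(frozen=True)
class PartitionSample:
    """One point-in-time description of a storage partition."""
    partition: str
    timestamp: datetime
    used_gib: float
    iops: float
    bandwidth_mib_s: float
    latency_ms: float
    protection: frozenset


def record(history: list, sample: PartitionSample) -> None:
    """Append a sample and keep the history ordered by time."""
    history.append(sample)
    history.sort(key=lambda s: s.timestamp)
```

A single sample is the point-in-time copy; the ordered history is the historical reference that trends and forecasts are built from.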
07:25
And by doing so, AIOps is able to do machine learning and determine and gather all this information for later use. So when we first think about jumping in a self-driving car, we have to give trust to these
07:42
AI engines that are running in the car. Sometimes we don't really trust the AI quite yet to give it complete control.
So we want to take baby steps. And a good example of that is navigation software in your car.
07:57
This navigation software is actually AI. It's giving you the best route from point A to point B.
It's actually connecting you to your calendar to tell you where you're going to be going. Same thing goes with self-driving storage.
We want to take baby steps.
08:12
We don't want to give AI immediate control of your whole IT infrastructure. So let's talk about a use case in which we can take this first step into getting to self-driving storage.
So capacity is a major factor in storage. When you run out of capacity,
08:29
your storage goes down. So, as time goes on, we obviously are using more storage.
We're writing more data to our storage arrays. And at some point, we hit a point where we're gonna run out of storage really quickly.
08:47
The AIOps engine is able to, at that point, tell you you're gonna run out of storage and give you an alert. So when you run out of storage, it's going to alert you:
a reactive thing. Now, AIOps, with all the machine learning that we just talked about,
09:05
is also able to determine, as a trend, how much data you're actually writing and give you, 30 to 60 days ahead, a predictive analysis
09:20
and a forecast that you're going to run out in 30 to 60 days. So what we want to do here is use this technology, and the movement of that data, to better use AIOps
09:36
to get you out of the situation before it becomes too late. So again, we have our storage partitions in our storage arrays.
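The predictive side of this can be sketched with a simple trend line. The talk does not say which model the AIOps engine uses, so the straight-line least-squares fit below, the one-sample-per-day input and the names `days_until_full` and `capacity_alert` are all assumptions for illustration.

```python
from typing import Optional


def days_until_full(used_gib: list, capacity_gib: float) -> Optional[float]:
    """Fit a least-squares line to daily usage samples and extrapolate
    to the point where usage reaches capacity.

    Returns None when there is too little data or usage is not growing.
    """
    n = len(used_gib)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(used_gib) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(used_gib))
    slope = sxy / sxx  # growth in GiB per day
    if slope <= 0:
        return None
    remaining = capacity_gib - used_gib[-1]
    return max(remaining / slope, 0.0)


def capacity_alert(days_left: Optional[float], horizon_days: float = 60) -> bool:
    """Predictive alert: fire when the forecast falls inside the horizon."""
    return days_left is not None and days_left <= horizon_days
```

The reactive alert only fires once the array is full; this kind of forecast fires while there is still time to move a partition elsewhere.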
And this time, AIOps is gonna say: in 30 to 60 days, you're
09:51
gonna run out of space. You need to do something about this.
So it sends you an alert. And at that point, it will give you some options.
And this is using generative AI to make a determination and a recommendation of where you want it to go.
10:08
So to the user, the alert will say I can go to system A. I can go to system B.
I can also give a percentage compatibility score. Using all of the metrics that we had talked about before, I can determine which of the storage arrays
10:25
is best suited for this. So I can give this a 90% and I can give this an 80%.
At this point, the user is able to then choose which one of these storage arrays to move the data to. So that's a key difference.
10:42
We're giving the user the ability to make a choice, and we're also letting the user actually physically move the data themselves. So, we take that and then we can make that decision.
Let's say I chose B for whatever reason.
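The talk does not say how the compatibility percentage is computed. As a purely hypothetical sketch, the score below is the share of headroom a target array would have left on its tighter resource after taking the partition; `ArrayHeadroom`, `compatibility` and `recommend` are made-up names.

```python
from dataclasses import dataclass


@dataclass
class ArrayHeadroom:
    name: str
    free_gib: float
    free_iops: float


def compatibility(array: ArrayHeadroom, need_gib: float, need_iops: float) -> int:
    """Hypothetical score from 0 to 100: 0 if the partition does not fit,
    otherwise the percentage of headroom left on the tighter resource."""
    if array.free_gib <= 0 or array.free_iops <= 0:
        return 0
    if need_gib > array.free_gib or need_iops > array.free_iops:
        return 0
    capacity_left = 1 - need_gib / array.free_gib
    iops_left = 1 - need_iops / array.free_iops
    return round(100 * min(capacity_left, iops_left))


def recommend(arrays: list, need_gib: float, need_iops: float) -> list:
    """Return (array name, score) pairs, best first. The user still chooses."""
    scored = [(a.name, compatibility(a, need_gib, need_iops)) for a in arrays]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Note that `recommend` only ranks the options. In this first use case the choice, and the move itself, stay with the user.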
10:58
So I can then manually move my data from A to B. The next step is to give AI a little bit more leeway.
For our next use case, we can let the AIOps engine do a little bit more than just give a recommendation or a suggestion.
11:16
We're going to call this one workload placement. And the idea here is I have a new application, or a new set of applications, that I want to add to my storage.
But where do I put it? So, the first thing we wanna do is we wanna ask the AI,
11:31
where is the best place to put my storage? So, I have my application.
I know it has certain requirements, very similar metrics to what we had before. So let's say I want to have 30 K IOPS.
Let's say I need four terabytes of storage.
11:50
And let's say that I want snapshots. And let's say that I wanna have DR for this particular application.
This describes the application that I want. I can take this application and I can feed it into our AIOps engine.
12:06
And our AIOps engine can then make determinations on which of the storage arrays to put it on. Now, the difference here is we are still giving you a recommendation.
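The application description from this example (30K IOPS, four terabytes, snapshots, DR) can be written down as a request and matched against what each array has to offer. This is a sketch under assumed names (`WorkloadRequest`, `ArrayProfile`, `candidates`) and made-up array figures, not the engine's actual matching logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadRequest:
    iops: int
    capacity_tb: float
    protection: frozenset


@dataclass(frozen=True)
class ArrayProfile:
    name: str
    free_iops: int
    free_tb: float
    supports: frozenset


def candidates(request: WorkloadRequest, arrays: list) -> list:
    """Names of the arrays that can satisfy every stated requirement."""
    return [
        a.name
        for a in arrays
        if a.free_iops >= request.iops
        and a.free_tb >= request.capacity_tb
        and request.protection <= a.supports  # every requested scheme is available
    ]


# The application described in the talk.
app = WorkloadRequest(iops=30_000, capacity_tb=4, protection=frozenset({"snapshot", "DR"}))

# Illustrative arrays; the numbers are invented for the example.
systems = [
    ArrayProfile("A", 50_000, 10, frozenset({"snapshot"})),
    ArrayProfile("B", 40_000, 8, frozenset({"snapshot", "DR"})),
    ArrayProfile("C", 90_000, 20, frozenset({"snapshot", "DR", "HA"})),
]
print(candidates(app, systems))
```

With these invented figures, system A drops out because it offers no DR, which leaves B and C as the choices presented to the user.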
So again, we're going to draw out for the user
12:22
the different choices that the user can make. Let's say we wanna go to system B again and system C this time.
And we give them the opportunity to make that selection. So let's say this time I'm going to select system C.
Now,
12:37
the difference between this and what we just talked about is what happens once we get to this point. What is the next thing the AIOps engine is going to do?
This time, the AIOps engine is actually going to perform the operation of provisioning all of that storage.
12:53
So, what it's gonna do is it's gonna pick storage C, it's gonna take this storage partition. It's either gonna create it or it's gonna add to the existing storage partition.
And what it's going to do is it's gonna provision
13:09
all of that same stuff we just talked about at the very beginning. It's gonna create a storage partition, provision all of the information.
And at the very end of it, it's going to give you a storage data store or a storage volume
13:25
or a set of volumes with all of the same attributes. And it's going to give it to you to use in your application or operating system.
So now is the pivotal moment. It's time to let go of the wheel and let the self-driving car
13:41
drive itself to your destination. On the self-driving storage front, it's time to let go of the wheel and let the AIOps engine and its agentic AI take you to new levels automatically, autonomously, without user intervention.
13:58
The example we're gonna use for self-driving storage in the end is one we call on-demand performance. And what that means is there are times of the year that the AIOps engine knows that it needs to
14:14
provide the greatest amount of system resources, storage resources, to achieve that. One of our favorite times of the year where this occurs is Christmastime.
So, during Christmastime,
14:34
we know that from around Black Friday all the way up through Christmas, the retail industry has the highest amount of demand, the highest amount of input and the biggest amount of data
14:51
to be written to our storage systems. The AIOps platform knows this.
And how does it know this? It's been ingesting all of this information, all of the metrics, all of the protection schemes, everything in time
15:06
series throughout the year. So it knows that come end of November, it needs to be able to do something to give us the best opportunity to handle the onslaught of data.
So let's draw out our storage partitions again.
15:21
We're gonna recall our three systems. And we're gonna use system A as the system that we want to empty, basically, in order for it to be able to handle
15:37
the Black Friday onslaught. Now, Black Friday is one example; you could also use tax time.
You can do any of the other things that would require you to have this extra information. Now previously, when we were using the AIOps platform, we gave the user
15:55
the ability to choose which storage system, which platform to put it on, and then the user would do it. Now, we gave the AIOps platform some leeway to create those resources in the previous example.
In this case,
16:12
we are giving the AIOps platform full control of not only deciding where to place the data, but it's actually going to perform actions and move the data itself using agentic AI.
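The three use cases differ mainly in who decides and who acts. A small sketch makes that progression explicit; the `Autonomy` enum and its level names are assumptions introduced here, not terms from the talk.

```python
from enum import Enum


class Autonomy(Enum):
    RECOMMEND = 1   # capacity use case: AI suggests; the user chooses and moves the data
    PROVISION = 2   # workload placement: AI suggests, the user chooses, AI provisions
    AUTONOMOUS = 3  # on-demand performance: AI decides and moves the data itself


def needs_user_choice(level: Autonomy) -> bool:
    """True while a person still picks the target system."""
    return level is not Autonomy.AUTONOMOUS


def ai_executes(level: Autonomy) -> bool:
    """True once the AIOps engine performs the operation itself."""
    return level is not Autonomy.RECOMMEND
```

Each step up hands one more responsibility to the engine, which is the "baby steps" approach described earlier.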
16:28
So, when we're thinking about this, the AIOps platform is going to think about when is the right time to do this. So, based on time series of information, it chooses this day,
16:44
some day in October to move that data. And then what it does is it looks at system A and it determines, using all of those metrics, which system it wants to move it to.
Well, let's say that the AI platform tells this partition to
17:01
move to system C. And then, over the days from the end of October to the end of November, it's actually gonna take the time to move that data.
And moving that data doesn't cause any access problems with your applications.
17:19
Your applications keep on running. Everything keeps on running.
You don't have to lose any ability to run your business. Also, at the same time, the AI platform says, you know what?
I'm gonna take this one and I'm gonna move it down to system B.
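The evacuation of system A can be sketched as a small planning step: keep one partition in place and send each of the others to whichever target currently has the most free space. The talk says the real decision uses all of the collected metrics, so this capacity-only greedy rule, the function name `plan_evacuation` and the sizes in the example are simplifying assumptions.

```python
def plan_evacuation(partitions: dict, keep: str, free_gib: dict) -> list:
    """Plan moves off a source array.

    partitions: partition name -> size in GiB on the source array
    keep:       the partition that stays on the source array
    free_gib:   target array name -> free capacity in GiB
    Returns a list of (partition, target) moves, largest partition first.
    """
    free = dict(free_gib)  # work on a copy so the caller's data is untouched
    moves = []
    for name, size in sorted(partitions.items(), key=lambda kv: kv[1], reverse=True):
        if name == keep:
            continue
        target = max(free, key=free.get)  # target with the most free space
        if free[target] < size:
            raise ValueError(f"no target has room for {name}")
        free[target] -= size
        moves.append((name, target))
    return moves


# Invented sizes: three partitions on system A, two possible targets.
print(plan_evacuation({"p1": 500, "p2": 300, "p3": 200}, keep="p1",
                      free_gib={"B": 400, "C": 600}))
```

With these invented sizes, one partition goes to system C and the other to system B, leaving system A with a single partition.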
17:36
And at the end of the day, system A now only has one partition and a fully loaded storage array ready to take the onslaught of your Black Friday and Christmas demands. So this is only the tip of the iceberg, but
17:52
you can see where self-driving storage can take us. It is wholly autonomous, using agentic AI, making decisions, moving the data without any user intervention.
It's just the tip of the iceberg in terms of where we're going to go with this.
18:07
There are tons of other examples where self-driving storage can further optimize all of your business needs, your entire storage infrastructure going forward.