Category: Data Processing
Tags: components, data, execution, parameters, sorting
Entities: allow unsorted, major key, max-core, minor key, sort component, sort within group component
00:00
[Music] Welcome to my YouTube channel. In this video, we'll discuss two Ab Initio components: sort and sort within group.
00:19
We know that the sort component is required when you use a Rollup, Join, Dedup Sorted, or Scan component with its sorted-input parameter set to true, because that component then requires sorted data.
00:37
Sometimes the input data does not arrive sorted; it is out of order, and then we need the sort component. If you don't use the sort component and, for example, Rollup has sorted input set to true, your graph will fail.
00:55
So when do we need sort within group? Suppose the data is already sorted on a primary key by a sort component, and we also need sorting on a second key.
01:14
Sort within group works on that second key. Let's think of it as having two keys: the first is the primary key and the second is the secondary key.
01:31
The primary key is the major key: the data already arrives sorted on it, and you specify it in sort within group. The secondary key is the minor key, and the component sorts your data based on the minor key.
01:48
We'll discuss these two components in detail in this video.
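To illustrate why a component that expects sorted input needs a sort in front of it, here is a minimal Python analogy. It is not Ab Initio code: `count_per_key` and the `dept` field are hypothetical names made up for this sketch. Python's `itertools.groupby` only groups adjacent equal keys, so unsorted input silently produces fragmented groups, whereas the video says the graph itself fails.

```python
from itertools import groupby

def count_per_key(records):
    """Count records per key, assuming equal keys arrive next to each other."""
    return [
        (key, sum(1 for _ in group))
        for key, group in groupby(records, key=lambda r: r["dept"])
    ]

# Hypothetical input that is NOT sorted on "dept"
unsorted_records = [{"dept": "B"}, {"dept": "A"}, {"dept": "B"}]

# Without a sort, "B" is split into two separate groups
fragmented = count_per_key(unsorted_records)

# Sorting first brings equal keys together, so each key is counted once
correct = count_per_key(sorted(unsorted_records, key=lambda r: r["dept"]))
```

Here `fragmented` contains two entries for `"B"`, while `correct` contains one entry per key.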
So let's start.
02:06
As the name itself suggests, the sort component produces data in sorted order based on the key you specify. The component provides two parameters: key and max-core.
We know that the component orders your records based on the key.
02:24
Within the key you can provide a direction, ascending or descending, and the component produces your output records in that direction.
02:39
There is also an option for records that contain nulls: you choose where they fall in the order, with null treated as low or as high.
02:54
The third option is the sequence. Machine is the default; the others are phone book, index, and custom.
03:10
Custom works on an order you design yourself, covering digits, lowercase, uppercase, alphanumeric characters, white space, nulls, and user-defined types.
03:26
So that is the key parameter for the sort component, and it provides these options: direction, null order, and sequence.
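The direction and null-order options described above can be sketched in Python as follows. This is only an analogy under stated assumptions: `sort_records`, `descending`, and `nulls_high` are hypothetical names, `None` stands in for a null field, and the sequence option is not modeled at all.

```python
def sort_records(records, field, descending=False, nulls_high=False):
    """Sort dicts by one field with a chosen direction and null placement."""
    def sort_key(record):
        value = record[field]
        is_null = value is None
        # The first tuple element ranks nulls below or above all real values
        if nulls_high:
            null_rank = 1 if is_null else 0
        else:
            null_rank = 0 if is_null else 1
        # Nulls get a placeholder so they are never compared with real values
        return (null_rank, 0 if is_null else value)
    return sorted(records, key=sort_key, reverse=descending)

rows = [{"amt": 3}, {"amt": None}, {"amt": 1}]
ascending_nulls_low = [r["amt"] for r in sort_records(rows, "amt")]
ascending_nulls_high = [r["amt"] for r in sort_records(rows, "amt", nulls_high=True)]
descending_nulls_low = [r["amt"] for r in sort_records(rows, "amt", descending=True)]
```

With null treated as low, the null record comes first in ascending order and last in descending order; with null treated as high, the positions flip.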
03:41
The next parameter is max-core. The default max-core value for the sort component is 96 MB.
03:57
You can set it based on your requirement. You can raise it above 96 MB, and you can also specify zero, but the default is 96 MB.
04:12
With a huge volume of data, though, you may find that performance is reduced.
04:27
That is because the component writes temporary files to disk when the data exceeds the value defined in max-core.
Now we'll quickly discuss the runtime behavior of the sort component. First, it reads all the records from every input flow connected to it.
04:43
It splits them into temporary files when the data is larger than the size defined in the max-core parameter.
05:00
After reading all the input records, it reads each temporary file generated in your default directory and orders the records based on the sort key.
05:19
Then it merges all the temporary files while maintaining the sort order, and finally it writes the result to the output port. Those are the steps of the sort component's runtime behavior.
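The read, spill, and merge behavior described above resembles a classic external merge sort. The following is a minimal Python sketch of that general technique, not Ab Initio's actual implementation. `external_sort` and `max_records_in_memory` are hypothetical names; the limit is counted in records rather than bytes, and each chunk is sorted before it is written out, which may differ from the order of operations in the video.

```python
import heapq
import pickle
import tempfile

def external_sort(records, key, max_records_in_memory):
    """Sort an iterable with bounded memory by spilling sorted runs to temp files."""
    run_files = []
    buffer = []

    def spill():
        # Sort the in-memory chunk and write it out as one temporary run
        buffer.sort(key=key)
        run = tempfile.TemporaryFile()
        for record in buffer:
            pickle.dump(record, run)
        run.seek(0)
        run_files.append(run)
        buffer.clear()

    for record in records:
        buffer.append(record)
        if len(buffer) >= max_records_in_memory:
            spill()
    if buffer:
        spill()

    def read_run(run):
        # Stream records back from one temporary file
        while True:
            try:
                yield pickle.load(run)
            except EOFError:
                return

    try:
        # Merge the sorted runs while preserving the overall sort order
        yield from heapq.merge(*(read_run(run) for run in run_files), key=key)
    finally:
        for run in run_files:
            run.close()

sorted_values = list(external_sort([5, 3, 8, 1, 9, 2], key=lambda x: x, max_records_in_memory=2))
```

A smaller in-memory limit means more temporary files and more disk traffic, which is one way to picture why performance drops when the data greatly exceeds max-core.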
05:37
Now we'll discuss the sort within group component. In Ab Initio, this component is the option for producing data sorted on a secondary key.
05:53
We know the input arrives sorted on one key only. If you also want to order by a secondary key, you can use the sort within group component.
06:08
The description I have written here says that sort within group refines the sorting of records already sorted according to one key specifier, meaning one key on which the data is already sorted.
06:23
It then sorts the records within the groups formed by the first sort according to a second key specifier. So it works on your second key and produces your order.
The sort within group component provides the following parameters.
06:39
First is the major key. Here you specify the key on which the input data already arrives sorted.
06:56
Then the minor key: this is the key you want to sort on using this component, so you specify it in minor key.
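The relationship between the major key and the minor key can be sketched in Python like this. It is an analogy, not Ab Initio code: `sort_within_group` is a hypothetical function, and `region` and `amount` are made-up fields. The input is assumed to arrive already sorted on the major key.

```python
from itertools import groupby
from operator import itemgetter

def sort_within_group(records, major_key, minor_key):
    """Reorder records by minor_key inside each run of equal major_key values."""
    output = []
    for _, group in groupby(records, key=itemgetter(major_key)):
        # Only the records of one major-key group are sorted at a time
        output.extend(sorted(group, key=itemgetter(minor_key)))
    return output

# Hypothetical input, already sorted on "region" but not on "amount"
orders = [
    {"region": "east", "amount": 30},
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 20},
    {"region": "west", "amount": 5},
]
result = sort_within_group(orders, "region", "amount")
```

The order of the regions is untouched; only the amounts inside each region are rearranged.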
We also have the allow unsorted parameter. By default this parameter is false.
07:13
If you set it to true, the component accepts data that is not sorted on the major key, that is, the primary or first key.
07:30
Next is max-core. For this component the default value is 10 MB. It behaves the same as in the sort component: it specifies the maximum size the component works with, and if the data exceeds that size, the component writes temporary files to disk.
07:46
Now we can discuss the runtime behavior of the sort within group component. First, the component reads records from all the flows connected to its input port, up to the size you have specified in the max-core parameter.
08:02
If the data exceeds what you have defined, it writes temporary files to disk space in your working directory.
08:20
It arranges the records in those temporary files based on your key. In the second step, sort within group assumes that the input is sorted according to the major key parameter.
08:36
If the input is not sorted on the major key, the allow unsorted parameter must be set to true. If you have not set it to true, execution fails and reports that an out-of-order record arrived.
08:52
If that step is fine, the third step follows.
09:07
The component reads the records from the temporary files it created in your working directory, merges them, produces the output ordered by the minor key parameter, and writes it to the output port. That is how the sort within group component works overall.
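The major-key check in the second step can be sketched in Python as below. This is a hedged illustration, not the component's real logic: `sort_within_group_checked` and its `allow_unsorted` argument are hypothetical, the check assumes an ascending major key, and treating each run of equal major-key values as its own group when `allow_unsorted` is true is an assumption of this sketch. Temporary files are left out entirely.

```python
from itertools import groupby
from operator import itemgetter

def sort_within_group_checked(records, major_key, minor_key, allow_unsorted=False):
    """Sort by minor_key within major_key groups, rejecting out-of-order input."""
    output = []
    previous_major = None
    first_group = True
    for major_value, group in groupby(records, key=itemgetter(major_key)):
        # A major key lower than the previous one means the input is out of order
        if not first_group and not allow_unsorted and major_value < previous_major:
            raise ValueError(
                f"out-of-order record: {major_value!r} arrived after {previous_major!r}"
            )
        output.extend(sorted(group, key=itemgetter(minor_key)))
        previous_major, first_group = major_value, False
    return output

# Hypothetical input whose major key "k" is out of order
rows = [{"k": "b", "v": 2}, {"k": "a", "v": 1}]
try:
    sort_within_group_checked(rows, "k", "v")
    failed = False
except ValueError:
    failed = True
accepted = sort_within_group_checked(rows, "k", "v", allow_unsorted=True)
```

With the default setting the call raises an error on the out-of-order record; with `allow_unsorted=True` the same input is accepted.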
09:23
If you have any questions on these two components, sort and sort within group, you can post them in the comment section, and you can also reach me at the number I have given in the description.
09:40
Thank you so much for watching this video.