So now let's look at a non-trivial, actually very important example: the analysis of mergesort, which is the prototype for the study of divide-and-conquer algorithms. As a warm-up, I want to talk first about binary search, which is everybody's first divide-and-conquer algorithm. We have a sorted array, and we look to see whether a particular value is in the array by looking at the middle. If the value we're looking for is less than the item in the middle, go left; if it's greater, go right; and in either case the size of the array under consideration is cut about in half. The code for binary search is in the algorithms book and on the book site, and its behavior is described by this recurrence: the number of compares in the worst case to discover that an item is missing from an array of size N satisfies B(N) = B(floor(N/2)) + 1, because after one compare we're left with a subarray of size floor(N/2) — that's N/2 rounded down to the nearest integer, which is exact when N is odd and one less than N/2 when N is even. Check for yourself that this is the worst-case compare count. So now we have a mathematical model, and that's what we want to study. Usually we're after an approximate solution, or just an idea of about how many compares are taken. The easy case for binary search is when N is exactly a power of 2: then each half is also a power of 2, and the recurrence telescopes. Take a(n) = B(2^n); substituting into the recurrence gives a(n) = a(n-1) + 1, which telescopes — you pick up a 1 each time, n times in all — to a(n) = n plus the initial value, and that little n is lg N, log base 2 of big N, by definition.
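The telescoping argument above can be checked directly. Here's a small Python sketch (not the lecture's actual code) that computes the binary search recurrence B(N) = B(floor(N/2)) + 1 and, assuming the initial value B(1) = 1, confirms the telescoped solution at powers of 2:

```python
from functools import lru_cache

# Worst-case compares for binary search, from the recurrence in the
# lecture: B(N) = B(floor(N/2)) + 1.  The initial value B(1) = 1 is an
# assumption (one final compare on a subarray of size 1).
@lru_cache(maxsize=None)
def B(N):
    if N == 1:
        return 1
    return B(N // 2) + 1

# At powers of 2 the recurrence telescopes: a(n) = B(2^n) = n + a(0).
for n in range(0, 11):
    assert B(2 ** n) == n + 1
```

With B(1) = 1 the answer at N = 2^n comes out to n + 1, i.e. lg N plus the initial value — consistent with the bit-counting formula floor(lg N) + 1 derived next.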
So the number of compares taken by binary search in the worst case is about lg N when N is a power of 2, and you can check that by plugging back into the recurrence. But what about the general case, when N is not necessarily a power of 2? It turns out there's an easy way to study that case, and that's to develop a correspondence with binary numbers. Define b(N) to be the number of bits in the binary representation of N. For example, 107 has seven bits in its binary representation. Now, you can check pretty easily that if you remove the rightmost bit of N, you get floor(N/2), and you've removed exactly one bit. What that means is that the number of bits in the binary representation of N is the number of bits in the binary representation of floor(N/2), plus 1: b(N) = b(floor(N/2)) + 1. That's the same recurrence as binary search, and maybe it's a little easier to think about counting the bits in the binary representation of N than to think about the binary search algorithm. So that's an easy model, and from it it's pretty straightforward to prove that b(N) = floor(lg N) + 1. The table and formulas here are for you to check that math for yourself, to be sure you understand how it might be the case. It's actually a pretty simple calculation: floor(lg N) changes exactly at the powers of 2, and adding 1 accounts for the leading bit. So we had an algorithm, binary search, and we had an idea, the number of bits in the binary representation of the numbers, and the two are matched up through the recurrence.
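The bit-counting correspondence is easy to verify mechanically. A minimal Python sketch, checking both the recurrence and the closed form b(N) = floor(lg N) + 1, including the 107 example from the lecture:

```python
import math

# b(N): number of bits in the binary representation of N.
def bits(N):
    return N.bit_length()

# The recurrence from the lecture: removing the rightmost bit of N
# yields floor(N/2), so b(N) = b(floor(N/2)) + 1 for N > 1 ...
for N in range(2, 1000):
    assert bits(N) == bits(N // 2) + 1
    # ... and the closed-form solution is floor(lg N) + 1.
    assert bits(N) == math.floor(math.log2(N)) + 1

assert bits(107) == 7   # 107 = 1101011 in binary: seven bits
```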
And that's a warm-up, because now we're going to do the same thing for mergesort. Mergesort everybody learns, and I talked about it in an earlier lecture: divide the array into two halves, sort the two halves, and then merge to put them back together. For simplicity, we'll assume that the merge implementation always uses N compares, and if you do that, you get this recurrence for the number of compares used by the sort: C(N) = C(ceil(N/2)) + C(floor(N/2)) + N. This kind of recurrence is more complicated than the ones we've looked at because of the floor and ceiling functions on the right-hand side. They're not immediately easy to deal with, and they lead to some interesting effects when we try to come up with a formula for C(N). We could go ahead and do our computation, but let's first talk about the easy case, which is the same one we already did for binary search: if N is a power of 2, the floors and ceilings go away, since floor(N/2) and ceil(N/2) are both equal to N/2. Doing the same kind of thing, with a(n) = C(2^n), we get the recurrence a(n) = 2a(n-1) + 2^n. That's one of the first ones we solved with a telescoping sum: the summation factor is 2^n, so divide through by 2^n and telescope to get a(n) = n·2^n. In terms of the original problem, that means C(N) = N lg N when N is a power of 2. But what if N is not a power of 2? That's the next question to address, and a natural question arises. For quicksort we analyzed the number of compares and got a very precise answer that we could check against the performance of the algorithm: asymptotically 2N ln N minus a constant times N. So one natural question is whether we can get a more accurate estimate than N lg N for mergesort: is the number of compares for mergesort asymptotic to N lg N + alpha·N for some constant alpha?
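Before chasing alpha, it's worth confirming the power-of-2 solution. A small Python sketch of the mergesort compare recurrence, assuming the initial value C(1) = 0 (a one-element array needs no compares):

```python
from functools import lru_cache

# Mergesort compare count from the lecture, assuming the merge always
# uses N compares: C(N) = C(ceil(N/2)) + C(floor(N/2)) + N, C(1) = 0.
@lru_cache(maxsize=None)
def C(N):
    if N <= 1:
        return 0
    return C((N + 1) // 2) + C(N // 2) + N

# When N = 2^n the floors and ceilings vanish and C(N) = N lg N.
for n in range(1, 16):
    N = 2 ** n
    assert C(N) == N * n
```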
And we would want to find out what alpha is in order to get the job done. But the answer is no: there is no such constant. That's important. First of all, it invalidates a hypothesis like that one, and it's going to get in the way of our accurately approximating the performance of mergesort. It's a prototype of what can happen for the many computer algorithms that share this characteristic. How do I know there's no constant? Here's a demonstration of what goes on, and this is really why I wanted to spend the time at the beginning motivating you to do plots. Using the same program structure that we used for Fibonacci and quicksort, we compute values of the mergesort recurrence; then a second for loop goes through and scales them, trying to determine whether there's such an alpha: it subtracts off the leading term N lg N and then adds N to make the scale good. And this is the plot that you get: absolutely not a constant. It's something that oscillates in a strange fashion. Computing the values and actually plotting them matters; if you just plot the raw values without scaling like that, you might not notice this, because the oscillation is actually kind of small initially. But it's important, because it means you're not going to be able to prove that a constant exists, or find its value. So plotting the values is very helpful. And if we want to characterize the performance of mergesort mathematically, whatever formula we come up with is going to have to capture this behavior. This is a fundamental example, and a relatively simple one, of something that comes up over and over again: the oscillation is intrinsic to the analysis, because we have to come down to the discrete level, and that means we get stuck with this kind of oscillation all the time.
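The experiment described above can be sketched in a few lines of Python. This is not the lecture's actual plotting code; as an assumption, the sketch scales by dividing by N (so the oscillation is order 1 and vanishes exactly at powers of 2), which makes the non-constant behavior easy to assert on:

```python
from functools import lru_cache
import math

# The mergesort compare recurrence from the lecture (merge uses N
# compares; C(1) = 0 assumed).
@lru_cache(maxsize=None)
def C(N):
    if N <= 1:
        return 0
    return C((N + 1) // 2) + C(N // 2) + N

# If C(N) were N lg N + alpha*N, these scaled values would settle to a
# constant.  Instead they oscillate between consecutive powers of 2.
scaled = [(C(N) - N * math.log2(N)) / N + 1 for N in range(2, 1025)]

# At an exact power of 2 (here N = 512) the scaled value is exactly 1 ...
assert abs(scaled[512 - 2] - 1.0) < 1e-9
# ... but in between it rises noticeably, so no single alpha fits.
assert max(scaled) - min(scaled) > 0.05
```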
But I do want to work through mergesort in the general case, because it's so important. There's a little trick involved to simplify it, which I'm not going to completely motivate, but you'll see what I mean. If we write down the same formula for N + 1, we're going to have to do some math with floors and ceilings: floor((N+1)/2) is ceil(N/2), and ceil((N+1)/2) is floor(N/2) + 1. You can do a little math to convince yourself of that; that's what this table does. Once we have that, we can subtract the two formulas, and that gives us something that telescopes — a little magic trickery based on the relationships between floors and ceilings with the plus-ones, yielding a simpler formula to work with. Define the difference to be D(N) = C(N+1) − C(N); subtracting gives D(N) = D(floor(N/2)) + 1. That equation is a familiar one: it's the binary search recurrence. It has a different initial value, so the solution is D(N) = floor(lg N) + 2. Then, since C(N+1) = C(N) + D(N), this immediately telescopes to give the sum of (floor(lg k) + 2) over all k less than N. What's interesting about that sum is that floor(lg k) + 1 is exactly the number of bits in the binary representation of k. So this is a proof that C(N) equals N − 1 plus the number of bits in the binary representations of all the numbers less than N. That's an interesting fact, and here's a combinatorial way to prove the same thing. If S(N) is the number of bits in the binary representations of all the numbers less than N, then — over on the left are all the numbers less than 15 — what we do is carve off the rightmost bit. If you carve off the rightmost bit, then taking every other number you get each number less than about N/2 once: it's just counting 1, 2, 3 up to about N/2, which accounts for S(floor(N/2)) of the bits.
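Both identities in this derivation — the difference D(N) = floor(lg N) + 2 and the telescoped bit-counting formula — can be checked numerically. A Python sketch (initial value C(1) = 0 assumed, as before):

```python
from functools import lru_cache

# Mergesort compare recurrence from the lecture.
@lru_cache(maxsize=None)
def C(N):
    if N <= 1:
        return 0
    return C((N + 1) // 2) + C(N // 2) + N

# S(N): total number of bits in the binary representations of all the
# numbers less than N (k.bit_length() is floor(lg k) + 1).
def S(N):
    return sum(k.bit_length() for k in range(1, N))

for N in range(1, 500):
    # The difference D(N) = C(N+1) - C(N) solves the binary search
    # recurrence with initial value 2, so D(N) = floor(lg N) + 2.
    assert C(N + 1) - C(N) == (N.bit_length() - 1) + 2
    # Telescoping: C(N) = N - 1 + (bits in all numbers less than N).
    assert C(N) == N - 1 + S(N)
```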
And then the alternate numbers account for S(ceil(N/2)), and the rightmost bits themselves contribute N − 1. So that's a combinatorial proof that the number of bits in the binary representations of all the numbers less than N satisfies S(N) = S(ceil(N/2)) + S(floor(N/2)) + N − 1 — essentially the same recurrence that mergesort satisfies. So that's a proof that the number of compares taken by mergesort is N − 1 plus the number of bits in the binary representations of all the numbers less than N. And when you think about it, that bit count is going to have some kind of oscillation, because things change when you come to powers of 2. Here's another way of looking at the number of bits in all the numbers less than N. In one dimension you have N, and for the width you have floor(lg N) + 1, so the bits all fit in a big N by floor(lg N) + 1 box. That's a pretty good thing, except that the actual numbers don't have the leading zeros. But the leading zeros have a very simple pattern, 1 + 2 + 4 + 8 and so on — a geometric sum — and that gives the solution: the number of bits in all the numbers less than N, without the leading zeros, is the full box minus the leading zeros, S(N) = N(floor(lg N) + 1) − 2^(floor(lg N) + 1) + 1. And now we have an explicit solution for the number of compares taken by mergesort: it's that plus N − 1. Again: we started with an algorithm; we had an idea, the binary representations of all the numbers; we showed that the number of compares taken by the algorithm equals the number of bits in the binary representations of the numbers less than N, plus N − 1; and we had an alternate way to count those bits, and that gives a complete solution. So that's a fine, relatively simple formula for the number of compares used by mergesort, one we can use to compute the value.
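Combining the box-counting formula with C(N) = N − 1 + S(N) and simplifying gives a closed form, C(N) = N·floor(lg N) + 2N − 2^(floor(lg N)+1), which a Python sketch can check against the recurrence:

```python
from functools import lru_cache

# Mergesort compare recurrence from the lecture (C(1) = 0 assumed).
@lru_cache(maxsize=None)
def C(N):
    if N <= 1:
        return 0
    return C((N + 1) // 2) + C(N // 2) + N

# Explicit solution: the N-by-(floor(lg N)+1) box of bits, minus the
# geometric sum of leading zeros, plus the N - 1 from telescoping:
#   C(N) = N*floor(lg N) + 2N - 2^(floor(lg N) + 1).
def C_closed(N):
    n = N.bit_length() - 1          # n = floor(lg N)
    return N * n + 2 * N - 2 ** (n + 1)

for N in range(1, 2000):
    assert C(N) == C_closed(N)
```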
So that's just a summary of what we've done so far — but what about that oscillating term? Well, there's another way to deal with floor(lg N): floor(lg N) equals lg N itself, minus the fractional part {lg N}; that's what the braces denote. If you substitute that into the solution for the number of compares used by mergesort, you can split it into pieces, and again it's just algebra. The leading term is N lg N, and the coefficient of N that remains involves the fractional part: a 2 minus {lg N}, and a 2 raised to the 1 minus {lg N}. Those two pieces look kind of anti-symmetric, but they don't quite cancel: what's left is a little oscillating function. Multiply it by N and you get exactly the function that we plotted before. You can check this math, and it's also described in the book. It's an indication of the kind of challenge we're going to face when analyzing algorithms: it looks like things are going fine, but we may be stuck with oscillation, where things get complicated because on the one hand we want to work with familiar functions like lg N, and on the other hand we need to work with functions that are forced to be integer-valued, like floor(lg N), the largest integer not exceeding lg N. That's the analysis of mergesort.
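The decomposition above can be written as C(N) = N lg N + N·f({lg N}) with f(t) = 2 − t − 2^(1−t), an identity (under the assumed initial value C(1) = 0) that a final Python sketch can verify; note f(0) = f(1) = 0, which is why the oscillation vanishes at powers of 2:

```python
import math
from functools import lru_cache

# Mergesort compare recurrence from the lecture (C(1) = 0 assumed).
@lru_cache(maxsize=None)
def C(N):
    if N <= 1:
        return 0
    return C((N + 1) // 2) + C(N // 2) + N

# The oscillating function left over after substituting
# floor(lg N) = lg N - {lg N} into the closed form.
def f(t):
    return 2 - t - 2 ** (1 - t)

for N in range(2, 1025):
    t = math.log2(N) - (N.bit_length() - 1)   # fractional part {lg N}
    assert abs(C(N) - (N * math.log2(N) + N * f(t))) < 1e-6
```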