So this is the last lecture from chapter six, and it discusses a very important idea: adding center points to the 2^k design. Now, why would you do this? Well, if you have an unreplicated design, you don't really have an estimate of pure error, and experimenters would like to have some replication in the design. If all of your factors are continuous, a logical way to do this is to add a run at the center of the 2^k design and then replicate that run a few times, so that you get an estimate of pure error, a model-independent estimate of error. But it turns out that adding a run at the center and replicating it a few times also allows you to distinguish between two possible models for the data. The first of these is, of course, the standard first-order model plus interaction, and I've illustrated a first-order model with all of the two-factor interactions. But with the center point you now actually have a third level on all of the variables, so you could consider a second-order model that has the quadratic terms you see here, and that could incorporate curvature in the response. A lot of people like this idea, because many people, when they run a two-level design, instinctively worry about curvature; they worry about something like this going on. And of course, if you only have runs at two levels, you might miss that curvature entirely. So the idea of adding runs at the center is very attractive to a lot of experimenters.

How does this play out? Well, take a look at the figure on the left. This is a 2^2 design with runs at the corners of the square, of course, and then a run at the center. The figure on the right shows you another view of this: you have a total of n_F factorial runs (n_F would be four here, but it could be eight, or it could be sixteen), and then some runs at the center, and I'm going to let n_C be the number of runs at the center. We'll talk about how many runs at the center shortly. Now go back and look at the figure on the left. These are the runs at the corners of the square, and I've illustrated a fitted plane between those runs; these are the runs that you obtained at the center, and that's the average of those runs. It turns out that at the center of the design the fitted plane always passes through ȳ_F, the average of the factorial runs. So if we want to investigate the possibility of curvature, a simple way to do that is to compare ȳ_F to ȳ_C. If ȳ_F and ȳ_C are very similar, then there's very little likelihood of this kind of curvature; more than likely the response is linear over that range.

So what we need to do, if we run these center points, is compare ȳ_F to ȳ_C. We could actually do that with a two-sample t-test; that would be very easy to do. It's also very easy to incorporate into the analysis of variance. The null hypothesis you're actually testing here is that the sum of all of the pure quadratic regression coefficients is zero, against the alternative that it's not zero. To incorporate this test into the analysis of variance, you calculate what we call a pure quadratic sum of squares, and the equation for that is shown in the boxed part of the display. Notice that it uses the difference between ȳ_F and ȳ_C as an integral part of that calculation.
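For reference, here are the two candidate models and the curvature test written out, following equation 6.30 in the book, with n_F factorial runs and n_C center runs:

```latex
% First-order model with two-factor interactions (no curvature):
y = \beta_0 + \sum_{j=1}^{k}\beta_j x_j + \sum_{i<j}\beta_{ij} x_i x_j + \varepsilon

% Second-order model, adding pure quadratic terms that can capture curvature:
y = \beta_0 + \sum_{j=1}^{k}\beta_j x_j + \sum_{i<j}\beta_{ij} x_i x_j
    + \sum_{j=1}^{k}\beta_{jj} x_j^2 + \varepsilon

% Curvature test:  H_0: \sum_{j=1}^{k}\beta_{jj} = 0
%            vs.   H_1: \sum_{j=1}^{k}\beta_{jj} \neq 0
% Single-degree-of-freedom pure quadratic sum of squares (equation 6.30):
SS_{\text{Pure quadratic}} = \frac{n_F\, n_C\, (\bar{y}_F - \bar{y}_C)^2}{n_F + n_C}
```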
And of course that difference is the numerator you would use in the two-sample t-test. In fact, if we use the n_C observations at the center of the design to get an estimate of pure error, a mean square for pure error with n_C − 1 degrees of freedom, you can show that the resulting two-sample t-statistic, when you square it, is exactly the F statistic for pure quadratic curvature that you get using this sum of squares in the analysis of variance. This sum of squares has a single degree of freedom, just like the square of any t-statistic would.

So let's go back and see how this works. Let's take our resin plant filtration-rate experiment again. When that experiment was run, they did not include center points, but let's put in some center points, and I'm just going to make up values. The center points are at x1, x2, x3, x4 all equal to zero, and I'm going to make up filtration rates of 73, 75, 66, and 69 and assume those were the values observed at the center. Well, the average of those four center runs is 70.75, and the average of the 16 factorial runs is 70.06. These are very similar, and we would suspect that there really isn't any curvature, but we're going to actually do the statistical test and see in just a minute.

Notice that I used n_C = 4 center runs. I think about the minimum number you should use is three, because that gives you two degrees of freedom for pure error, and I don't think you gain a lot by using more than five or six center points. So between three and six is a very reasonable number of center points; it doesn't require a whole lot of additional resources, and it does give you an estimate of pure error based on at least a moderately reasonable number of degrees of freedom. Software will typically do this for you. You will have to tell it how many center runs you want, but it will perform the statistical analysis, and all of the packages that I'm familiar with use the F-test approach instead of the t-test approach.

So here's how we would do this in our problem. Let's calculate the mean square for pure error. That is simply the sum of squares of the runs at the center divided by the number of degrees of freedom for pure error, which is the number of center points minus one. If we do the arithmetic, the mean square for pure error comes out to 16.25. Now, the difference ȳ_F − ȳ_C is 70.06 − 70.75 = −0.69, so we plug that into the equation I showed you earlier, equation 6.30 from the book, with n_F = 16 and n_C = 4, and the sum of squares for pure quadratic curvature turns out to be 1.51.

And here is an ANOVA table with those quantities displayed. Notice that we get a pure quadratic curvature sum of squares of 1.51 and a pure error mean square of 16.25. The F ratio for curvature is much less than one, and the P-value is 0.78, so no, there's no indication of curvature.

So what if curvature is significant? What do you do? Well, if curvature is significant, there's a pretty good indication that your straight-line model, your first-order model even with interaction, is not going to work very well. The typical approach is to augment the design with additional runs so that you can actually fit the quadratic model, and the typical augmentation strategy is to add what we call axial runs.
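Here is a minimal Python sketch of that arithmetic, using the made-up center values above and the quoted average of the 16 factorial runs; the small gap from 1.51 is just rounding in that average.

```python
import numpy as np
from scipy import stats

center = np.array([73.0, 75.0, 66.0, 69.0])   # made-up responses at the n_C = 4 center runs
y_bar_F = 70.06                                # quoted average of the n_F = 16 factorial runs
n_F, n_C = 16, len(center)

y_bar_C = center.mean()                        # 70.75
ms_pure_error = center.var(ddof=1)             # SS of center runs / (n_C - 1) = 16.25

# Pure quadratic curvature sum of squares, equation 6.30
ss_pq = (n_F * n_C * (y_bar_F - y_bar_C) ** 2) / (n_F + n_C)   # ~1.52 here; 1.51 with the unrounded average

f_curvature = ss_pq / ms_pure_error            # 1 and n_C - 1 degrees of freedom
p_value = stats.f.sf(f_curvature, 1, n_C - 1)  # about 0.78: no indication of curvature

print(y_bar_C, ms_pure_error, ss_pq, f_curvature, p_value)
```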
For the 2^2, the axial runs are these one-factor-at-a-time runs along the x1 and x2 axes, and they produce a design that we call a central composite design. The figure on the right shows you what the central composite design looks like in three dimensions. These additional runs give you enough runs to be able to fit the complete quadratic model; with just the center points you don't have enough. For example, in two factors the complete quadratic model has two main effects, two squared terms, and an interaction term, along with an intercept, so it has more parameters to estimate than you have design points: you only have five independent design points and six parameters to estimate. In the three-variable case you have nine distinct design points if you only add the center runs, and the complete second-order model in three variables has ten parameters. So you need more runs, and the axial runs build up enough runs to fit the complete quadratic. Central composite designs are widely used for fitting quadratics, and they play a very important role in an area of experimental design that we call response surface methods; we'll talk about those methods later in the course.

Some comments about the practical use of center points. In many cases, when a 2^k design is run around your current operating conditions, it makes a lot of sense to use the current operating conditions as the center point, because that enables you to check for abnormal conditions during the time the experiment was being conducted. All you have to do is compare the runs at the center to your historical performance. In fact, if you're running a control chart on your process, you might be able to take the runs at the center and plot them on the control chart to see how they compare to historical, in-control performance. This is a very good idea, because sometimes when experiments are run in processes, people take extra care; they're really careful, and the results you get may not be consistent with what you get during normal operating conditions.

Sometimes I think you can use center points to check for time trends. The way I would do that is to not randomize the location of the center points: randomize everything else, but put some of the center points at the very beginning of the experiment, some in the middle, and some at the end. When you finish, you can take the responses for the center points and look at how they change through time. If you see a trend, that is, the responses are consistently getting larger or smaller as time goes on, that could be an indication that there's a time trend in your overall data, and you would need to do something to account for that, to compensate for it, in the analysis.

I also think center points can sometimes be used as the first few runs in your design when you have little or no information about how big the experimental error is, so you can get an idea of whether the experimental error is reasonable. There's nothing worse than running an experiment and finding out that the experimental error, the background noise, was so big that you didn't learn very much.
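As a hypothetical illustration of that trend check, here is a small Python sketch; the run-order positions and center-point responses are invented purely for illustration.

```python
import numpy as np

run_order = np.array([1, 9, 17, 25, 32])              # where the center runs fell in the run sequence
y_center  = np.array([70.0, 71.5, 73.0, 74.0, 76.0])  # invented center-point responses

# Crude trend check: slope of center-point response versus run order
slope, intercept = np.polyfit(run_order, y_center, 1)
print(f"trend: {slope:.3f} response units per run")    # a clearly nonzero slope suggests drift over time

# The spread of the center runs around the fitted line gives a rough read on background noise
residuals = y_center - (slope * run_order + intercept)
print(f"rough noise estimate (std): {residuals.std(ddof=2):.2f}")
```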
So when you have little or no idea about how much variability might be out there, doing a few of the center runs first, just to get a sense of how much noise there is in the data, can be a good idea.

Now, we've been talking about center points for experiments that have quantitative factors. What if you have a qualitative factor in the experiment? Well, if the qualitative factor only has two levels, you're certainly not going to have a center point on that factor, but you might be able to do something like this. Here's a hypothetical chemical process where the experiment includes three factors: temperature and time are continuous, but catalyst type is categorical with only two levels. One way to include center runs here is to put them in the subspaces of the design that involve only the quantitative factors. So here's a group of center points in the time-temperature subspace for one catalyst type, and here's a group of center points in the time-temperature subspace for the other catalyst type. You can then check for curvature in each of those two subspaces. This can be extremely useful. Sometimes when you run an experiment like this you find that there is some curvature in both subspaces, but that one of the catalyst types is much better than the other, and so you end up only having to augment the design in one of those subspaces, because you no longer have any interest in the other catalyst type. So sometimes this strategy really works to your advantage in terms of the amount of testing you need to do.

Okay, so this is the end of the chapter six material, and we've covered a lot of ground. This is very important, fundamental material in experimental design, because two-level designs are very widely used in engineering, science, marketing, and e-commerce, all sorts of applications. A good, solid understanding of 2^k designs is really essential to having a good grasp of the basic fundamentals of experimental design. Okay, that's all for this time.