Hi everyone, my name is …. I'm an assistant professor of methods and statistics at Erasmus University Rotterdam, and I specialize in longitudinal modeling, specifically using Bayesian and multilevel analysis. Now, let's start with the question: why learn about multilevel analysis in the first place? Well, it turns out that in the social sciences, observations are often clustered together in some way. For example, researchers in the educational sciences often study pupils that are clustered, or grouped, into several separate schools. Similarly, sociologists might study people that are nested within different neighborhoods or cities, and these types of nested observations are not independent of each other. To see this, let's use an example with students in different schools. Pupils going to the same school usually come from more or less the same neighborhoods. This already means that pupils from the same school will be quite alike, for example when it comes to SES and IQ. This is a simple selection effect: houses in the same neighborhood cost more or less the same, which means that the people who buy a house there earn more or less the same amount of money. People who earn more buy a more expensive house in a more expensive neighborhood, while people who earn less have to buy a less expensive house in a less expensive neighborhood. In addition, the fact that people from the same neighborhood earn more or less the same amount of money implies that they are quite homogeneous, compared to the entire population, when it comes to level of education. Since IQ is linked to level of education, and since IQ is partially genetic, this implies that we can assume that pupils from the same school will be more alike when it comes to IQ than two students chosen randomly from all students in the country. This dependence of observations is a clear violation of the independence assumption made by most statistical analyses, and that's where multilevel analysis comes in.
In this book, you will learn about the basic theory behind multilevel analysis, and you'll learn how to run a simple multilevel regression in R. However, before we get into multilevel analysis, and before you see how multilevel analysis corrects for dependent observations, we will briefly revisit multiple regression in this video. After all, the multiple regression model is the basic building block of a multilevel regression model, and one needs to understand it before moving on to the analysis of nested data. In ordinary regression, we try to predict a continuous dependent variable using one or more independent variables. These independent variables have to be either continuous or dichotomous, so scored as zero and one. If we have a variable with more than two groups, these groups can still be compared in a regression analysis, but you need dummy variables. Importantly, in regression analysis we assume that the relation between the dependent variable and the independent variable, or set of independent variables, looks like a straight line. So whenever you're doing regression, you assume that the world looks like a straight line, and this is reflected in your model. Now, such a straight line can be summarized by two values: the intercept b0, which indicates where the line intersects the y axis, and the slope b1, which indicates how steep the line is, or in other words, how much we expect y to increase if x increases by one. Now, obviously, there is an unlimited number of possible values for b0 and b1. So how do we determine which ones are best? Well, the idea is that we want the line which has the lowest total distance to all observed scores, so the one for which the total of all the vertical distances between the data points and the regression line is smallest. Fortunately, we don't have to find this line through trial and error. Instead, we can use the method of least squares, in which there is a closed-form solution for both the intercept and the slope.
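To make the least-squares idea concrete, here is a minimal sketch in Python (the function name and data are illustrative, not from the original) of the closed-form solution for the intercept and slope of a simple regression line:

```python
# Closed-form least-squares estimates for simple regression:
#   b1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   b0 = mean_y - b1 * mean_x
def fit_simple_regression(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx              # slope: expected change in y per one-unit increase in x
    b0 = mean_y - b1 * mean_x   # intercept: predicted y when x = 0
    return b0, b1

# Illustrative data lying exactly on the line y = 2 + 3x
x = [0, 1, 2, 3, 4]
y = [2, 5, 8, 11, 14]
b0, b1 = fit_simple_regression(x, y)
print(b0, b1)  # 2.0 3.0
```

Because these formulas have a closed form, no trial-and-error search over candidate lines is needed; the minimizing intercept and slope follow directly from the data.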
Once the best-fitting regression line is known, we test our assumption about the relation between the dependent variable and the independent variable or variables by determining whether the regression coefficient b1 is significantly different from zero. If the answer is yes, as it is in this case, since the p value is smaller than 0.05, we subsequently look at the effect size, which for regression analysis is indicated by R squared. The R squared gives us an idea about the relevance of the relationship between the dependent variable and the independent variables. Significance only tests whether the effect is zero or not; R squared tells us how far away from zero the effect actually is, and therefore it also tells us whether it is large enough to be meaningful in real life. That's it for now. We've discussed that multilevel analysis is important to social scientists because social science data are usually grouped together somehow, for example because we are dealing with pupils from the same school or patients treated by the same therapist, and in addition we briefly revisited the multiple regression model. Eager for more? Take a look at the video titled My First Multilevel Model, which shows how the multiple regression model can be extended for multilevel data.
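As a small sketch of the R squared idea discussed above (function name and data are illustrative, not from the original): R squared is the proportion of the variance in y that the regression line accounts for, computed as one minus the ratio of residual to total sums of squares.

```python
# R^2 = 1 - SS_residual / SS_total,
# i.e. the proportion of variance in y explained by the fitted line.
def r_squared(x, y, b0, b1):
    mean_y = sum(y) / len(y)
    # Total variation of y around its mean
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    # Variation left over after using the line b0 + b1 * x to predict y
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_res / ss_tot

# Illustrative data lying exactly on y = 2 + 3x
x = [0, 1, 2, 3, 4]
y = [2, 5, 8, 11, 14]
print(r_squared(x, y, 2.0, 3.0))  # 1.0  (perfect fit)
print(r_squared(x, y, 8.0, 0.0))  # 0.0  (flat line at the mean of y explains nothing)
```

A significant b1 only tells us the slope is unlikely to be exactly zero; an R squared near zero would still mean the line explains very little of the variation in y.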