Welcome back. In this video I'm going to discuss log analysis.

What is log analysis? Logs are the traces of human behavior seen through the lenses of whatever sensors we have. For example, web applications often keep records of visitors' information on the server end. These web log data often include information about which users visited the website, at what time, using which browsers, etc. Log analysis is the activity of seeking to make sense of the log data.

Why log analysis? In previous videos we have talked about many user research methods: interviews, focus groups, observations, contextual inquiries. However, user data collected using these methods is often small in quantity. Now, through web-based applications, we can collect lots of data very quickly and very easily. Analyzing large amounts of log data allows us to see phenomena that were previously unobservable.

There are many types of information that could potentially be logged. One type is user interaction information, including queries, clicks, URL visits, and system interactions. In addition to user interaction information, one can also log context information, including web pages shown, ads, etc.

There are some commercial log data visualization and analysis tools available on the market. Here is a screenshot of a data visualization provided by Google Analytics.

One common practice in log analysis is to partition the data to view interesting slices. For example, we can look for changes in behavior across different times, user types, locations, languages, entry points, devices, and systems. Here is one example illustrating how to partition log data to view interesting slices. This figure shows the bounce rate on the GroupLens lab front page in two different months, April and August. This figure tells us that the bounce rate is much higher in April compared to August.
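To make the idea of slicing concrete, here is a minimal sketch in Python of partitioning page-view logs by month and computing a bounce rate for each slice. Everything in it is an assumption for illustration: the records, the field names (`session_id`, `month`, `page`), and the helper `bounce_rate_by_slice` are hypothetical and not taken from the GroupLens data or from any analytics tool. It treats a session with exactly one page view as a bounce, and it assumes each session falls within a single slice.

```python
from collections import defaultdict

# Hypothetical page-view records; the field names and values are made up
# for illustration and are not real log data.
page_views = [
    {"session_id": "a1", "month": "April", "page": "/"},
    {"session_id": "a2", "month": "April", "page": "/"},
    {"session_id": "a2", "month": "April", "page": "/people"},
    {"session_id": "a3", "month": "April", "page": "/"},
    {"session_id": "b1", "month": "August", "page": "/"},
    {"session_id": "b1", "month": "August", "page": "/courses"},
    {"session_id": "b2", "month": "August", "page": "/"},
    {"session_id": "b2", "month": "August", "page": "/people"},
    {"session_id": "b3", "month": "August", "page": "/"},
]


def bounce_rate_by_slice(records, slice_key):
    """Return {slice value: fraction of sessions with exactly one page view}."""
    # Count page views per session within each slice.
    views = defaultdict(lambda: defaultdict(int))
    for record in records:
        views[record[slice_key]][record["session_id"]] += 1

    rates = {}
    for slice_value, sessions in views.items():
        # A session that viewed only one page counts as a bounce.
        bounces = sum(1 for count in sessions.values() if count == 1)
        rates[slice_value] = bounces / len(sessions)
    return rates


rates = bounce_rate_by_slice(page_views, "month")
for month, rate in rates.items():
    print(f"{month}: {rate:.0%} bounce rate")
```

Passing a different key, such as a browser or location field if the records carried one, would slice the same data along another dimension.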
Here the bounce rate represents the percentage of visitors who enter the site and immediately bounce, that is, leave the site rather than continue viewing other pages within the same site. So this figure tells us that the bounce rate is higher in April compared to August, which means that visitors in April are much more likely to leave the site, while visitors in August are much more likely to stay on the site. One possible reason is that many visitors in August are current and incoming students of the GroupLens lab. They often stay on the site to look for information to prepare for the new semester. However, many April visitors might be random visitors, so they tend to leave the site very quickly.

The first strength of log analysis is that it can provide a complete and accurate picture of real user behaviors, including the ones people don't want to talk about. Second, log analysis allows for statistical analysis on large amounts of data. Regarding how to perform statistical analysis, please refer to the video on quantitative analysis in Module 3.

Although log data analysis is often very powerful, we need to keep in mind its weaknesses. What logs cannot tell you is people's intent, people's experience, people's attention, and people's beliefs about what's happening. Log analysis is also limited to existing interactions. And log analysis cannot establish causal relationships.

Takeaways: log analysis gives a rich picture of real-world behaviors. There are many types of log data, and we can partition the data to view interesting slices. Finally, we need to recognize what the data can and cannot tell us.

Thank you for watching this video. Hope to see you in the next one.