In this video, we will introduce association rule mining and discuss its application to urban systems. Let's take the example of a market basket, that is, the set of items a customer buys in one trip to a store or supermarket. With market basket data, we can answer questions like: which items are frequently bought together by customers? The answers reveal interesting purchasing patterns. Where should detergent be placed in the store to maximize its sales? Are window cleaning products purchased when detergent and orange juice are bought together? Is soda typically purchased with bananas? Does the brand of soda make a difference? And how do the demographics of the neighborhood affect what customers are buying?
So, a frequent pattern is a pattern that occurs frequently in the data. It can be a set of items, a subsequence, a substructure, and so on. What is the motivation to find frequent itemsets? First, it finds inherent regularities in the data. What products are often purchased together, and what are the subsequent purchases after buying a PC? What kind of DNA is sensitive to a new drug? Can we automatically classify web documents? It has applications in several kinds of analysis, such as association, correlation and causality analysis, sequential patterns, and classification and clustering.
Association rules discover association relationships among attributes. The method assumes all data to be categorical, such as yes/no, male/female, or 0/1. The general rule has this form: a body, which implies a consequent. Two numbers describe this relationship: the support and the confidence. The body is the examined data, and the consequent is a property discovered for the examined data. Support is the percentage of records that satisfy both the body and the consequent. Confidence is the number of records that satisfy both the body and the consequent, as a percentage of the records that satisfy the body.
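Written out as formulas, the two measures look roughly like this. The notation is introduced here only for illustration and is not taken from the slides: X is the body, Y is the consequent, N is the total number of records, and count(S) is the number of records that contain every item in S.

```latex
\[
\mathrm{support}(X \Rightarrow Y) = \frac{\mathrm{count}(X \cup Y)}{N},
\qquad
\mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{count}(X \cup Y)}{\mathrm{count}(X)}
\]
```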
If you look at this small example, let's say we have five market baskets where the products purchased are: A, B, C; A, C, D; B, C, D; A, D, E; and B, C, E, respectively. When I say the association rule is X implies Y, the support is the number of baskets in which X and Y appear together divided by the total number of baskets, and the confidence is the number of baskets in which X and Y appear together divided by the number of baskets in which X appears. Look at the rule A implies D. The support is two by five: out of five baskets, there are two baskets that contain both A and D. The confidence is two by three: out of the three baskets that contain A, D is in two of them. Similarly, we can find the support and confidence of the remaining rules.
In association rule mining, two more parameters are used to obtain meaningful results. The first is the minimum support, or MinSup: the itemset in the rule should be frequent in the dataset. That is, it should be present in at least MinSup times N instances, where N is the total number of instances. The second is the minimum confidence: the rule must hold for a sufficiently large fraction of the instances that contain its body.
An example is the maintenance system of an urban bus network, which was discussed in the paper titled Clustering and Association Rules in analyzing the Efficiency of Maintenance Systems of an Urban Bus Network. As we all know, maintenance is an important part of any manufacturing or service system, and irregular maintenance can prove costly. The success of an urban bus network depends on the condition of the buses and on their maintenance system. The paper analyzes the efficiency of the bus maintenance system, which comprises independent components. The Apriori algorithm, a popular algorithm for frequent itemset mining which we will discuss next, was used to assess the efficiency of the maintenance system, and then the necessary steps were proposed to eliminate bad conditions.
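Before going into the bus data, it may help to see MinSup at work on the five-basket example from this segment. The following is a minimal sketch of a level-wise frequent itemset search in the spirit of Apriori, not the implementation used in the paper. The function names (`count`, `support`, `confidence`, `frequent_itemsets`) are hypothetical and chosen only for this illustration, and MinSup is assumed to be given as a fraction between 0 and 1.

```python
from itertools import combinations

# The five example baskets from the walkthrough.
BASKETS = [
    {"A", "B", "C"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "D", "E"},
    {"B", "C", "E"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= basket for basket in baskets)


def support(itemset, baskets):
    """Fraction of all baskets that contain `itemset`."""
    return count(itemset, baskets) / len(baskets)


def confidence(body, consequent, baskets):
    """count(body and consequent together) / count(body)."""
    both = frozenset(body) | frozenset(consequent)
    return count(both, baskets) / count(body, baskets)


def frequent_itemsets(baskets, min_sup):
    """Grow itemsets one item at a time, keeping only those whose
    support is at least `min_sup` (assumed to be a fraction in [0, 1])."""
    items = sorted(set().union(*baskets))
    level = [frozenset([i]) for i in items if support([i], baskets) >= min_sup]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset, baskets)
        k = len(level[0]) + 1
        # Candidates: unions of two frequent itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        previous = set(level)
        # Keep a candidate only if all of its (k-1)-item subsets are frequent
        # and the candidate itself reaches the minimum support.
        level = [
            c for c in candidates
            if all(frozenset(s) in previous for s in combinations(c, k - 1))
            and support(c, baskets) >= min_sup
        ]
    return frequent


if __name__ == "__main__":
    print("support(A, D):", support({"A", "D"}, BASKETS))
    print("confidence(A -> D):", confidence({"A"}, {"D"}, BASKETS))
    for itemset, sup in frequent_itemsets(BASKETS, 0.4).items():
        print(sorted(itemset), sup)
```

With MinSup set to 0.4, that is, at least two of the five baskets, the sketch keeps all five single items and the pairs A, C; A, D; B, C; and C, D. No three-item set survives, because each one either contains an infrequent pair or appears in only one basket.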
The data consists of the following features: the bus model and number of buses; a description of the maintenance activity, which could be reactive, corrective, or preventive; and a description of the maintenance workshops, such as engine, suspension, turnery, and so on. Other information available about each bus includes: average mileage, number of failures, number of activities carried out in each workshop, total corrective, emergency, and preventive activities, and mean time between failures. To apply frequent itemset mining, what pre-processing steps need to be done? The first is data transformation. Since frequent itemset mining can only be done on categorical data, any numerical data needs to be converted to categorical data using binning. These are the results that were obtained after frequent itemset mining. They consist of the data points, then the rules, and their support and confidence levels. The discovered rules were then used to modify the current policies in the maintenance system.
Another example is association rules for customer shopping behavior. They were described in the paper titled “Urban Association Rules: Uncovering Linked Trips for Shopping Behavior”. This paper uses the Apriori algorithm to extract frequently appearing combinations of stores that are visited together, and uses them to characterize shoppers' behavior. One application is to help understand people's shopping behavior and linked trips for shopping all over the city. You can also predict the shops most likely to be visited by a customer, given a shop they have already visited.
In this video, we introduced frequent itemset mining and saw two applications in urban systems that can be addressed with it.