Determination of Temporal Association Rules Pattern Using Apriori Algorithm

A supermarket must have a good business plan in order to meet customer desires. One way to do so is to discover shopping patterns by processing sales transaction data. Such processing yields information about the temporal associations between items. Association rule mining is a data mining technique used to find patterns in combinations of items in transaction data. The Apriori algorithm can be used to find association rules: it finds candidate frequent itemsets that meet the support count. Frequent itemsets that meet the support count are then processed using the temporal association rules method, which acts as a time constraint when presenting the frequent itemsets and association rules. This study aims to produce rules from transaction data, using the Apriori algorithm to form temporal association rules. The final results are strong rules, that is, rules that appear in all 3 years at certain time intervals under given support and confidence thresholds, so that the rules can be used as layout recommendations for the business plan of Maharani Supermarket Demak.


INTRODUCTION
Competition in the business world and advances in information technology are closely linked, as companies compete intensely to meet increasingly high customer demands. A company requires strategy and business intelligence to be able to meet customer desires [1]. Technological advances are therefore needed to develop a business, and one type of business that makes use of information technology is retail.
Competition in the business world, especially in the retail goods industry, requires businesses to find an appropriate strategy in order to meet customer desires. One way to do this is to discover the patterns in consumer purchases so that appropriate measures can be applied [2]. These purchase patterns need to be linked to the time dimension in order to obtain more complete information for preparing business plans.
Transaction data show that most consumers at Maharani Supermarket buy two or more items. According to an interview with the manager, consumers often have difficulty finding the other items they want to buy because related items are placed far apart. This makes it harder for them to choose the products they are looking for, so they often ask employees to find the items for them. To overcome this, business planning is needed so that the supermarket can place goods according to the wishes of consumers. The supermarket manager is currently unable to make such plans because two things are unknown: which goods are sold together, and how the goods sold relate to time. At present the goods are simply arranged on the shelves by type and brand.
To find the pattern of goods sold together in the sales data, the association method is used. The association method is a data mining technique that finds patterns of relationships between items that are associated with each other. The Apriori algorithm is used to find the association rules. In this study, a time aspect is added to the resulting association rules; it serves as a time constraint on the frequent itemsets and association rules that are output [3].
Determining temporal association rule patterns with the Apriori algorithm is expected to solve this problem by revealing which items are often bought together at certain time intervals, so that recommendations can be made for the product layout at the supermarket.

METHODS
The research was carried out in several stages, as follows:
1. This stage explains how the product layout is analyzed with respect to how easily consumers can find the products they are looking for at the supermarket.
4. Discussion of the analysis based on the method used. Based on the analysis, the author helps Maharani Supermarket discover the patterns of goods purchased together using the chosen method [3].

Data Mining
Data mining is an automated process of searching for useful information in large data stores. It is also understood as an integral part of knowledge discovery in databases [4]. This research uses the association function of data mining, which finds attributes that appear together. In the business world it is more commonly called market basket analysis. The association task seeks to uncover rules that measure the relationship between two or more attributes [5].
The research problem is solved in the following five stages:
1. Preprocess the sales data. There are three steps, namely data selection, data cleaning, and data transformation.
2. Generate candidate itemsets to obtain association rules. This process uses the Apriori algorithm.
3. Obtain the rules.
4. Extend the rules that have been obtained by adding date information to the frequent itemsets, then search for the rules again based on the time aspect.
5. Determine the minimum support and minimum confidence, and generate rules from the sales transaction data.
The stages of the temporal association rules process are shown as a scheme in Figure 1.

2 Data Preprocessing
Preprocessing takes up almost 60% of the overall data mining process, in terms of both time and activity. This section covers three steps, namely data selection, data cleaning, and data transformation.

Data Selection
The sales transaction data were obtained in XLS format, with attributes such as date, item code, transaction name, and selling price. From these, the attributes that matter for association rule mining with the Apriori algorithm and for forming temporal association rules are selected: transaction date, item code, and item name.

Data Cleaning
At this stage, noisy data are removed. This stage is very important, because the results of the data mining process depend on the quality of the data chosen. In this case, the transaction data contained unclear characters in the item name column. The data are cleaned so that they are ready to be used in the data mining process. Examples of data cleaning are removing characters such as ".", "^", "[", and "/", and changing capital letters into lowercase letters. Table 1 shows examples of cleaned item names; for instance, "Noodles AYM BWNG 2'S /24" becomes "noodles aym bwng 2s".
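The cleaning step can be illustrated with a minimal Python sketch. The function name `clean_item_name` is hypothetical, and the rule that a trailing "/<digits>" token is dropped is an assumption inferred from the single example above, not a rule stated by the author.

```python
import re

def clean_item_name(raw_name: str) -> str:
    """Normalize an item name: lowercase it and strip noise characters."""
    name = raw_name.lower()
    # Assumption: a trailing "/<digits>" token (e.g. "/24") is noise.
    name = re.sub(r"/\s*\d+\s*$", "", name)
    # Drop every character that is not a lowercase letter, digit, or space.
    name = re.sub(r"[^a-z0-9 ]", "", name)
    # Collapse the repeated spaces left behind by the removals.
    return re.sub(r"\s+", " ", name).strip()

print(clean_item_name("Noodles AYM BWNG 2'S /24"))  # noodles aym bwng 2s
```

A whitelist of allowed characters is used here instead of a list of characters to delete, so that noise characters not seen in the sample are also removed.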

Data Transformation
Data transformation restructures the transaction data into a form that is easy for data mining to process. Some sales data still need to be transformed, for example by grouping types of goods into categories.

3 Association Rules
Association rules describe items or attributes that frequently appear together. The idea is to examine all if-then relationships between items and keep only those most likely to indicate a dependency between items. The term antecedent usually represents the "if" part and consequent the "then" part. In this analysis, the antecedent and the consequent are groups of items that have no items in common [6]. Association rules are thus if-then statements that show the probability of relationships between data items within large data sets in various types of databases. Association rule mining has a number of applications and is widely used to help discover sales correlations in transactional data or in medical data sets [7].
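As a small illustration of how an if-then rule is measured, the following Python sketch computes support and confidence for a rule over a handful of baskets. The baskets and the function names `support` and `confidence` are hypothetical and are not taken from the supermarket data.

```python
# Hypothetical baskets, used only to illustrate the two measures.
transactions = [
    {"snack", "milk"},
    {"snack", "milk", "bread"},
    {"bread"},
    {"snack", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Share of the baskets containing the antecedent that also contain the consequent."""
    antecedent_support = support(antecedent, baskets)
    if antecedent_support == 0:
        return 0.0  # the rule never fires, so report zero confidence
    both = support(set(antecedent) | set(consequent), baskets)
    return both / antecedent_support

print(support({"snack", "milk"}, transactions))       # 0.5
print(confidence({"snack"}, {"milk"}, transactions))  # 2 of the 3 snack baskets
```

Here the rule "if snack then milk" holds in 2 of the 3 baskets that contain snack, so its confidence is about 0.67.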

4 Process of Finding Frequent Itemset
There are two stages in finding association rules, namely frequent itemset analysis and the formation of association rules. Frequent itemset analysis looks for combinations of items that meet the minimum support value in the database, where the support of an item is obtained from the number of transactions that contain it. The confidence of a rule A → B is obtained by the following formula.

$$\mathrm{Confidence}(A \rightarrow B) = P(B \mid A) = \frac{\text{number of transactions containing } A \text{ and } B}{\text{number of transactions containing } A} \quad (3)$$

To give a better understanding of this technique, the calculation steps are explained using sample data of 8 sales transactions, with the minimum support count set to 3 (three). Table 1 shows the transaction set D with the time attribute T, which represents the date. The 8 transactions occur within a lifetime in which T = 2 is the start time and T = 8 is the end time.

The Apriori algorithm first processes C1, the candidate 1-itemsets, by scanning Table 2. The items that meet the minimum support form L1, and the items in L1 are combined to obtain C2, the candidate 2-itemsets. Because the minimum support count is set to 3, candidate 2-itemsets that appear fewer than 3 times are deleted, which leaves L2. L2 is then used to search for candidate 3-itemsets, and so on until the items can no longer be combined.

The frequent itemsets that have been obtained are then processed further by adding time information in the form of the transaction date, so that the resulting association rules carry a time interval. The transaction dates are determined as follows: the date of the first purchase, [Ts], is taken as the smallest value and the date of the last purchase, [Te], as the largest value, and together they span the lifetime [2]. Based on L2 in Table 4, the next step is to add transaction date information to each itemset; the mapping of each itemset to its transaction dates and number of occurrences is shown in Table 5.
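The level-wise search from C1 to L1, C2 to L2, and so on can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function name `apriori` and the 8 sample baskets are hypothetical (the paper's own sample tables are not reproduced here), and only the minimum support count of 3 is taken from the text.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    baskets = [frozenset(t) for t in transactions]

    def count(candidates):
        # One scan of the data per level: count the baskets containing each candidate.
        return {c: sum(1 for b in baskets if c <= b) for c in candidates}

    # C1 -> L1: keep single items that meet the minimum support count.
    items = {item for basket in baskets for item in basket}
    current = {c: n for c, n in count({frozenset([i]) for i in items}).items()
               if n >= min_count}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {c: n for c, n in count(candidates).items() if n >= min_count}
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical sample of 8 transactions.
sample = [
    {"snack", "milk"}, {"snack", "milk", "bread"}, {"bread", "soap"},
    {"snack", "milk"}, {"bread", "milk"}, {"snack", "bread"},
    {"soap"}, {"snack", "milk", "bread"},
]
result = apriori(sample, min_count=3)
print(result[frozenset({"snack", "milk"})])  # 4
```

In this sample, "soap" appears only twice and is removed at L1, the three remaining pairs survive as L2, and the single candidate 3-itemset appears only twice, so the search stops there.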
Once the number of occurrences and the transaction dates of each itemset are known, the next step is to determine the minimum temporal support and confidence for the rules formed from the itemsets in Table 4. The final temporal rules are listed in Table 6. Following the definition of an association rule, a temporal association rule is defined by adding temporal information in the form of the transaction date: adding the transaction date attribute as the time dimension makes the resulting rules temporal rules.
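One way to read the temporal measures is sketched below in Python. It assumes that the lifetime of an itemset runs from its first to its last transaction date, [Ts, Te], and that temporal support and confidence are computed only over the transactions inside that lifetime. This definition, the function names, and the dated baskets are assumptions made for illustration; the source does not spell out its formulas.

```python
from datetime import date

# Hypothetical dated baskets: (transaction date, items bought).
dated = [
    (date(2018, 1, 2), {"bread"}),
    (date(2018, 1, 3), {"snack", "milk"}),
    (date(2018, 1, 4), {"snack", "bread"}),
    (date(2018, 1, 5), {"snack", "milk", "bread"}),
    (date(2018, 1, 6), {"milk"}),
    (date(2018, 1, 7), {"snack", "milk"}),
    (date(2018, 1, 8), {"bread"}),
    (date(2018, 1, 9), {"soap"}),
]

def lifespan(itemset, dated_baskets):
    """Return (Ts, Te): first and last date on which the itemset was bought."""
    itemset = frozenset(itemset)
    days = [day for day, basket in dated_baskets if itemset <= basket]
    return (min(days), max(days)) if days else None

def temporal_support(itemset, dated_baskets):
    """Support counted only over transactions inside the itemset's lifespan."""
    span = lifespan(itemset, dated_baskets)
    if span is None:
        return 0.0
    ts, te = span
    itemset = frozenset(itemset)
    in_span = [basket for day, basket in dated_baskets if ts <= day <= te]
    return sum(1 for basket in in_span if itemset <= basket) / len(in_span)

def temporal_confidence(antecedent, consequent, dated_baskets):
    """Confidence of the rule restricted to the lifespan of the combined itemset."""
    both = frozenset(antecedent) | frozenset(consequent)
    span = lifespan(both, dated_baskets)
    if span is None:
        return 0.0
    ts, te = span
    in_span = [basket for day, basket in dated_baskets if ts <= day <= te]
    antecedent_hits = sum(1 for b in in_span if frozenset(antecedent) <= b)
    return sum(1 for b in in_span if both <= b) / antecedent_hits

print(lifespan({"snack", "milk"}, dated))          # 3 January to 7 January 2018
print(temporal_support({"snack", "milk"}, dated))  # 0.6
```

In this sample the pair is bought on 3 of the 5 transactions inside its lifetime, giving a temporal support of 0.6, whereas its ordinary support over all 8 transactions would be 0.375.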

RESULTS AND DISCUSSION
To implement the data modeling, the author used Python tools. The data are sales transactions from Maharani Supermarket. The sales data for 3 years that are ready for the data mining process total 429,832 records: 126,590 records for 2016, 149,918 records for 2017, and 153,324 records for 2018. The experiment was conducted to find strong rules, that is, rules that always appear every year at certain time intervals.
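The notion of a strong rule, one that appears in every year, can be expressed as a set intersection over the yearly rule sets. The following Python sketch assumes each year's rules have already been mined; the function name `strong_rules` and the rules shown are hypothetical placeholders, not results from the study.

```python
def strong_rules(rules_by_year):
    """Return the rules that are present in every year's rule set."""
    yearly_sets = [set(rules) for rules in rules_by_year.values()]
    return set.intersection(*yearly_sets) if yearly_sets else set()

# Hypothetical rules as (antecedent, consequent) pairs, not the paper's results.
rules_by_year = {
    2016: {("snack", "milk"), ("bread", "milk")},
    2017: {("snack", "milk")},
    2018: {("snack", "milk"), ("bread", "milk")},
}
print(strong_rules(rules_by_year))  # {('snack', 'milk')}
```

A rule that is missing from even one year, such as the second rule in 2017 here, is not counted as strong.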
The experiment was carried out with 2 schemes: the first with minimum support 0.07 and minimum confidence 0.3, and the second with minimum support 0.02 and minimum confidence 0.2. Rules were sought at time intervals of 12 months, 6 months, 3 months, the fasting month, and Christmas and New Year. The results of the tests are as follows:
1. The results of the rules for 12 months with minimum support 0.07 and minimum confidence ≥ 0.3. Rules were obtained that always appear in 12 months in all data with minimum support 0.07 and minimum confidence ≥ 0.3.
2. The results of the rules for 6 months with minimum support 0.07 and minimum confidence ≥ 0.3. Rules that always appear at the beginning of 6 months in all data are not generated, because no rules are generated from the 2017 data. Rules do appear in the 2016 and 2018 data with minimum support ≥ 0.07 and minimum confidence ≥ 0.3. The resulting rules are the same, but they differ in when they appear in the 2016 data and the 2018 data.
3. The results of the rules for 3 months with minimum support 0.07 and minimum confidence ≥ 0.3. Rules were obtained that always appear in all data with minimum support 0.07 and minimum confidence ≥ 0.3.
4. The results of the rules for the fasting month with minimum support 0.07 and minimum confidence ≥ 0.3. In the fasting month, the consumption needs of the Muslim community might be expected to decrease, since people eat only twice a day (at the breaking of the fast and before dawn). The need for food around the fasting month, however, is arguably not the same as on an ordinary day: household spending that should go down because of fasting actually increases [8]. Rules were obtained that always appear in fasting months in all data with minimum support ≥ 0.02 and minimum confidence ≥ 0.2.
5. The results of the rules for Christmas and New Year. Demand for consumer goods increases around the Christmas and New Year's Eve celebrations. This can be seen from the 3-month Price Expectation Index in the BI survey, which showed an increase to 177.1 from 172.6 in the previous month [9]. Rules were obtained that always appear in fasting months in all data with minimum support ≥ 0.07 and minimum confidence ≥ 0.3.
6. The results of the rules for 12 months with minimum support 0.02 and minimum confidence ≥ 0.2. Rules were obtained that always appear in 12 months in all data with minimum support ≥ 0.02 and minimum confidence ≥ 0.2.
7. The results of the rules for 6 months with minimum support 0.02 and minimum confidence ≥ 0.2. Rules were obtained that always appear in 6 months in all data with minimum support ≥ 0.02 and minimum confidence ≥ 0.2.

CONCLUSION
1. Temporal association rule patterns can be determined using the Apriori algorithm, which yields strong rules, that is, rules that always appear over periods of 12 months and 6 months.
2. Rules that appear at the 12-month and 6-month intervals in the 2016, 2017, and 2018 data are strong rules, so they can be used as recommendations for the business plan of Maharani Supermarket.
3. Product placement in the supermarket business plan can use the strong rules according to how often the layout is changed: if the layout is changed once every 12 months, use the results of the 12-month interval; if it is changed every 6 months, use the results of the 6-month interval; and if it is changed every 3 months, use the results of the 3-month interval. Items that are not covered by the generated rules can keep the layout pattern already used by the supermarket.