Business Process Mining in Warehouses: a Case Study

DATABASES AND INFORMATION SYSTEMS H.-M. Haav, A. Kalja and T. Robal (Eds.) Proc. of the 11th International Baltic Conference, Baltic DB&IS 2014 TUT Press, 2014


Innar LIIV, Ott LEPIK
Department of Informatics, Tallinn University of Technology, Akadeemia tee 15A, 12618 Tallinn, Estonia

Abstract. The goal of this paper is to present the results of an industry project as a practical example of automatic process mining from the business information system logs of a warehousing company, without any interruption to the actual processes of workers and using only secondary data. First, an overview is given of the methodology for extracting, transforming and loading the data from the Enterprise Resource Planning (ERP) system, Microsoft Dynamics AX. Second, the logged business processes are filtered and analyzed using the ProM software. A general warehouse-wide process and the most frequent process are found automatically from the company logs, allowing the top management to adjust the strategy and operations according to the actual evolution of the logistics enterprise. The main contribution of this case study paper is to map a specific business problem to the relevant process mining problem definition, perform the analysis and translate the technical results into business insights for the evaluation.

Keywords. business intelligence, warehousing, business process mining, data mining

Introduction

Over the last twenty years, organizations have been actively engaged in optimizing business processes. The main purpose has been to increase the efficiency of a company by making more effective use of the resources available within it. Developing business processes presupposes that those processes are described and understood. It is common practice to describe business processes with the help of empirical methods. Such a description does not necessarily take into account all the events or functions. In addition, it may reflect the analyst's own view of how the process functions and is therefore not necessarily objective. Process mining, in contrast, builds a business process model from the events that actually took place [2],[3]. Processes in organizations typically evolve over time, and therefore bottom-up process mining can provide interesting business insights.

This case study is based on the data of a logistics company rendering warehousing services. The main process of the company consists of the handling processes of different stock keeping units (SKUs). Three types of warehouses are used: excise, regular and customs warehouses, all of which have both a delivery and a consolidating function. The excise warehouse is used to keep, temporarily, customs-free products whose excise duty has not yet been paid. Performing those handling processes optimally would allow for better use of the company's resources and thus provide a basis for efficiency growth, which in turn would give the company a potential competitive advantage.

Two types of goals were set for this research: business and technical ones. The business goals were, among others, to reengineer processes in order to make them more efficient; to understand the differences between de facto and de jure processes; and to identify and avoid waste of time (at the process / client / product level). These goals, set by the top management, warehouse managers and sales department, were converted into the following technical challenges and tasks: to establish a linked record of the workflow from an ERP system, which originally did not directly support or record this kind of information; to find and analyze the general warehousing process (bottom-up from the events); and to find and analyze the most frequent process. The technical results are evaluated both technically, for correctness, and from a domain-specific business perspective by the management.

The logistics company under discussion uses the software Microsoft Dynamics AX (Axapta), and the duration of logging was 1 month. A sample of 13 products was chosen for the log from more than 4000 handled by the company. This was considered sufficient for finding the general warehousing process model. The process mining algorithms Alpha [4] and HeuristicMiner [16], available in the process mining tool ProM [13], were used throughout this case study.

The novelty of this case study lies in reaching the business goals by using the information system logs without any interruption to the actual processes of workers and using only secondary data. The main contribution of this case study paper is to map a specific business problem to the relevant process mining problem definition, perform the analysis and translate the technical results into business insights for the evaluation.
The authors are not aware of other academic publications on process mining in warehouses. The outline of the paper is as follows: first, the ETL process of the data sources is presented; second, filtering and process mining with the ProM software are presented, followed by the discussion and conclusions. The main goal of this paper is to give practical insights to all researchers and practitioners looking into process mining topics in logistics.

1. Extracting, Transforming and Loading (ETL) the Event Logs

The database system for Microsoft Dynamics AX is Microsoft SQL Server 2005 (MSSQL). All the data tables necessary for generating the event log can be found in the documentation [7]. In addition, Axapta makes it even simpler to find the necessary data tables, because the screen forms show which tables the displayed data come from. Axapta has been interfaced with the database system through object-relational mapping [11].

All the events connected to SKU handling are reflected in the central warehouse transactions table (INVENTTRANS). The events can be distinguished with the help of the transaction type (TRANSTYPE). In Axapta it is possible to make 22 different types of warehouse transactions [8]. Seven of them are considered in this case analysis. In certain cases, in addition to the transaction type, the movement direction (incoming or outgoing) of the SKU can also be decisive. By default, the movement direction is determined by the transaction type, but in certain cases the movement direction of the SKU may be reversed, for example when SKUs are returned to the warehouse during the delivery process. This transaction type is the return of the SKU, where the primary receiver returns the SKUs to the warehouse. The transaction is registered in Axapta as outgoing goods, but the transaction type is 'Return' and the movement direction is incoming. Consequently, another transaction type is added to this case analysis. Thus, there are a total of 8 different transaction types under observation. They are all shown in Table 1.

Table 1. Types of Axapta inventory transactions.

Transaction type   Name                                          Movement of SKUs
0                  Sales order                                   Out
0                  Return order                                  In
3                  Purchase order                                In
5                  Inventory valuation                           In-out
6                  In-warehouse adjustment                       In-out
14                 In-warehouse displacement of pallets          In-out
21                 Movement between the warehouses – outgoing    Out
22                 Movement between the warehouses – incoming    In
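The way Table 1 combines the transaction type with the movement direction can be sketched as a small lookup. This is only an illustration in Python, not the SQL used in the project; the direction labels "in" and "out" and the function name classify are assumptions made for this sketch.

```python
# Rows of Table 1: (TRANSTYPE code, allowed movement directions, name).
# The "in"/"out" labels are illustrative; Axapta's own encoding is not shown in the paper.
TABLE_1 = [
    (0, {"out"}, "Sales order"),
    (0, {"in"}, "Return order"),
    (3, {"in"}, "Purchase order"),
    (5, {"in", "out"}, "Inventory valuation"),
    (6, {"in", "out"}, "In-warehouse adjustment"),
    (14, {"in", "out"}, "In-warehouse displacement of pallets"),
    (21, {"out"}, "Movement between the warehouses - outgoing"),
    (22, {"in"}, "Movement between the warehouses - incoming"),
]


def classify(trans_type, direction):
    """Return the Table 1 name for a transaction, or None if it is not observed."""
    for code, directions, name in TABLE_1:
        if code == trans_type and direction in directions:
            return name
    return None


# Type 0 is a sales order when outgoing and a return order when incoming.
print(classify(0, "out"))
print(classify(0, "in"))
```

Keeping the direction as part of the key is what turns the seven Axapta transaction types into the eight observed ones.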


The additional information about the SKU items processed in the warehouse is kept in the product dimensions table (INVENTDIM), which is connected to the warehouse transactions table through an additional information identifier (INVENTDIMID). The products are entered into the warehouse account in the units in which they are handled. Every handled unit has additional characteristics, e.g. the warehouse location where the SKUs are kept and the pallet connected to that SKU. Information of this kind is very important when drawing up the event log. In each observed case, certain SKUs are handled. These SKUs are either already in the warehouse or in the process of being placed into it, and the handled quantity may be up to an entire pallet.

Axapta supports monitoring the storing processes by saving the start and the end time of each event, and the person who carried out that event. A separate set of tables has been put together for each monitored process. In Axapta, it is possible to register the time when an entry is created and the time when it is altered. Axapta can also register the name of the user who created the entry or last changed it. This is important for events that are observed as belonging to the process but for which Axapta keeps no separate record. The start time and the end time of an event are taken as the time that was actually spent on the event. This means that they do not take into account the time that has been established as working hours in the logistics company. Only the time really spent on the process is taken into account. It is important to remember this fact when analyzing the process results.

Data extraction from the database has been implemented with SQL queries. A separate query has been created for each transaction type and movement direction. In order to avoid excessive complexity of the query and an excessively large event log, half-finished processes are filtered out when the query is made.
The process mining software ProM uses a data file in the MXML (Mining XML) format [1] as an input. The MXML file is an ordinary data file in the XML format described by an XSD schema [10]. The processes are separated by Process tags. Several different processes may be described in the file. Each process consists of cases, which are separated by ProcessInstance tags. A case consists of the events that have happened or been carried out in the course of the process. Each such event is delimited in the file by AuditTrailEntry tags. In this case study, an MXML file is compiled from the result returned by an MSSQL server procedure. The procedure returns the data as well-formed XML. This data is written to a file with the help of the bcp utility [6]. As the query for each process is implemented with a separate procedure, 8 files are formed. These files are merged manually into a single XML log file covering all processes. The preprocessing phase took about 200 hours of technical work, which accounted for 90% of the project time (cost).

2. Filtering the Event Logs

The event log contains a fair amount of information that might be unnecessary for process mining. For example, the event log may contain unfinished cases for which it is unknown what events need to be carried out to finish them. The event log may also contain information about several processes or even false information. Therefore, it is necessary to extract from the log the process of interest and the events performed within it. The interactive log filtering provided by ProM may yield the necessary answers even without using process mining algorithms. In this research, the most performed process is found with the help of log filtering.

In order to filter the event log, one has to open a data file in the MXML format in the ProM software. The dashboard (Figure 1) provides a lot of important information about the data in the log (8 processes, 7828 different process cases and 37917 different events). In addition, there are 34 different event classes in the log and 2 different event types. There are a total of 24 originators in all the processes. Attention must be paid to the fact that a non-filtered event log may contain false information, e.g. unfinished processes, noise or faulty data about the events. Therefore, tenable conclusions cannot be drawn on the basis of a non-filtered event log.
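The MXML layout described in Section 1 (Process elements holding ProcessInstance cases, each made of AuditTrailEntry events) and the kind of counts shown on the dashboard can be illustrated with a minimal Python sketch. It is not the MSSQL procedure used in the project. The sample events, timestamps and originator names are invented, and the child elements WorkflowModelElement, EventType, Timestamp and Originator are our reading of the MXML schema rather than something taken from the project files.

```python
import xml.etree.ElementTree as ET


def build_mxml(processes):
    """Build a WorkflowLog tree from
    {process_id: {case_id: [(event, event_type, timestamp, originator), ...]}}."""
    root = ET.Element("WorkflowLog")
    for process_id, cases in processes.items():
        process = ET.SubElement(root, "Process", id=process_id)
        for case_id, events in cases.items():
            instance = ET.SubElement(process, "ProcessInstance", id=case_id)
            for name, event_type, timestamp, originator in events:
                entry = ET.SubElement(instance, "AuditTrailEntry")
                ET.SubElement(entry, "WorkflowModelElement").text = name
                ET.SubElement(entry, "EventType").text = event_type
                ET.SubElement(entry, "Timestamp").text = timestamp
                ET.SubElement(entry, "Originator").text = originator
    return root


def summarize(root):
    """Count processes, cases, events and originators, as a dashboard would."""
    entries = root.findall("./Process/ProcessInstance/AuditTrailEntry")
    return {
        "processes": len(root.findall("./Process")),
        "cases": len(root.findall("./Process/ProcessInstance")),
        "events": len(entries),
        "originators": len({e.findtext("Originator") for e in entries}),
    }


# Invented sample data: two processes with one case each.
sample = {
    "SHIPMENT": {
        "case-1": [
            ("Storing - the regular storage", "complete", "2013-01-07T08:00:00", "worker-a"),
            ("Sending out a pallet", "complete", "2013-01-07T09:30:00", "worker-b"),
        ]
    },
    "RETURN": {
        "case-2": [("Return order", "start", "2013-01-08T10:00:00", "worker-a")]
    },
}

log = build_mxml(sample)
print(ET.tostring(log, encoding="unicode"))
print(summarize(log))
```

Building all processes into one tree avoids the manual merging of per-process files, although whether that would fit the bcp-based export used in the project is not something the sketch addresses.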

Figure 1. ProM Dashboard

The filter is applied on the Filter page (Figure 2). There are several different possibilities for filtering the event log. A filter may be applied to one or several processes. It is possible to determine not only the start and the end event of the process but also the events that are of interest. All those settings can be combined with each other. This makes it possible to extract the required process together with the observed events. Filter settings depend on the purpose, which is different in every case.
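The combination of start-event, end-event and event-of-interest settings can be sketched in a few lines of Python. This is a simplified stand-in for the idea, not ProM's filter implementation; the function name filter_cases, the shortened event names and the sample cases are hypothetical.

```python
def filter_cases(cases, start_events, end_events, keep_events=None):
    """Keep cases that begin with an allowed start event and finish with an
    allowed end event; optionally project them onto the events of interest.

    cases: {case_id: [event_name, ...]} with events in the order they occurred.
    """
    kept = {}
    for case_id, trace in cases.items():
        if not trace:
            continue
        # Unfinished or wrongly started cases are dropped first.
        if trace[0] not in start_events or trace[-1] not in end_events:
            continue
        if keep_events is not None:
            trace = [event for event in trace if event in keep_events]
        kept[case_id] = trace
    return kept


# Invented cases: c2 is unfinished, c3 lacks the expected start event.
cases = {
    "c1": ["Storing", "Sending out a pallet", "SHIPMENT"],
    "c2": ["Storing", "Sending out a pallet"],
    "c3": ["SHIPMENT"],
}

print(filter_cases(cases, {"Storing"}, {"SHIPMENT"}))
```

In this sketch the start and end checks run on the unprojected trace, so that hiding uninteresting events cannot make an unfinished case look complete.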

Figure 2. ProM Filter

3. Finding the General Warehousing Model

The general process is a collection of all the processes under observation. In this case study, the 8 separate processes under observation form the general process. Altogether, they form the business process of the logistics enterprise. The sample of 13 products used as the basis for compiling the event log has been chosen with the purpose of covering all the events occurring in the logistics company.

Generally, processes are presented as a process model. The graphic presentation form of the process model is a diagram. The most well-known diagram type is the Petri net, but several other diagram types are also used. The presentation form of a process model depends on the specific purpose and needs. The process model has to satisfy four basic conditions: suitability, precision, generalization, and structure [14],[15]. In our experiments, we used the Alpha algorithm [4] (whose output is a process model in Petri net notation) and the HeuristicMiner algorithm [16]. In our case study, the HeuristicMiner process mining algorithm was able to ignore most of the anomalies and noise in the unfiltered event logs and present the most understandable general process for the business users.

4. Finding the Most Frequent Process

The most performed process is characterized by the greatest number of process instances. This means that the process and the events relating to it are carried out more often than those of other processes. Therefore, comparatively more resources are also used to accomplish the most frequent process. Any kind of resource signifies expenses for the enterprise, and thus the most performed process is presumably the greatest expenditure. The most frequent process greatly influences the general efficiency of the enterprise.

In order to find the most performed process, one has to apply a process filter to the event log. Of the 7828 process cases in the log, the most performed process is SHIPMENT, which was performed 6688 times. The performance count of all the processes is given in Table 2. It can be seen from the diagram of the most frequent process (Figure 3) that the majority of instances (6687) start from the storage in the regular warehouse. The process ends with the event SHIPMENT. In addition, there are 48 instances where the start event Sending out a pallet returns to the event Storing – the regular storage. Considering that sending products out of the warehouse is the observed process, there should not be any opposite movement.

Table 2. Process instances in the logistics company.

Process                   Total number of process instances
SHIPMENT                  6688
RETURN                    70
INCOME                    73
REFILL                    250
TRANSFERORDER_RECEIVE     524
TRANSFERORDER_SHIPMENT    92
STOCKTAKING               109
ADJUSTMENT                22

Thus, those 48 instances would qualify as noise and should be filtered out in further analysis of the SHIPMENT process. These 48 instances still provide a good starting point for seeking possible errors in the initial data, or for treating them as hidden or separate processes that have so far remained unnoticed with the methods used. Furthermore, from the initial data of the further analysis one may filter out the only output that starts from Warehousing - Excise WH (storage in the excise warehouse), as its share of all the process instances is very small and would therefore not change the general picture. When assessing the process SHIPMENT, one should take into account the fact that only 13 products are observed out of more than 4000 handled by the logistics enterprise. The number is sufficient to cover all the handling events of stock keeping units, but it does not necessarily give an objective overview of the relative frequency of the sequences of these events.
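Why a small number of reverse movements can be treated as noise may be illustrated with the dependency measure described for HeuristicMiner [16], (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), where |a>b| counts how often b directly follows a. The Python sketch below is an illustration only: the trace counts are invented toy values rather than the case study data, and the paper does not report which thresholds were configured in ProM.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often event b directly follows event a over all traces."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency measure: values near 1 suggest that a is really followed by b,
    values near -1 suggest the opposite direction."""
    ab = counts[(a, b)]
    ba = counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


# Toy log (invented): 95 regular shipments and 5 cases with a reverse movement.
STORE, SEND, SHIP = "Storing", "Sending out a pallet", "SHIPMENT"
traces = [[STORE, SEND, SHIP]] * 95 + [[STORE, SEND, STORE, SEND, SHIP]] * 5

counts = directly_follows(traces)
print(dependency(counts, STORE, SEND))  # strongly positive
print(dependency(counts, SEND, STORE))  # strongly negative: the reverse movement
```

A miner that keeps only arcs with a high positive dependency value would drop the reverse arc in this toy log, which matches the intuition that the opposite movement is an exception rather than part of the process.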

Figure 3. The most frequent process in the warehouse



It is known from warehousing theory [5] that the most performed process is sending out products, regardless of the aim of the storage or the function of the warehouse. The search for the most performed process has established that sending out products is indeed the most performed process here as well. As a by-product of process mining on the same data, ProM enables analyzing the duration of frequent processes (Figure 4). The quickest event was assembly (average 6.57 hours). However, 16% of the assembling cases (1055) were handled for longer than the average time spent on this process.
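The dominance of sending out products can be checked directly against the counts in Table 2. The short Python sketch below only restates those published counts; the variable names are our own.

```python
# Process instance counts copied from Table 2.
TABLE_2 = {
    "SHIPMENT": 6688,
    "RETURN": 70,
    "INCOME": 73,
    "REFILL": 250,
    "TRANSFERORDER_RECEIVE": 524,
    "TRANSFERORDER_SHIPMENT": 92,
    "STOCKTAKING": 109,
    "ADJUSTMENT": 22,
}

total = sum(TABLE_2.values())
most_frequent = max(TABLE_2, key=TABLE_2.get)

# Print each process with its count and its share of all instances.
for name, count in sorted(TABLE_2.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<24}{count:>6}{count / total:>8.1%}")

print("total:", total, "most frequent:", most_frequent)
```

The counts add up to the 7828 process cases reported by the ProM dashboard, and SHIPMENT accounts for roughly 85% of them.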

Figure 4. Diagram of events’ duration

5. Conclusions

The aim of process mining is to design the process model on the basis of events that actually occurred. Enterprise resource planning (ERP) systems like Microsoft Dynamics AX have their own reporting subsystems, but the results obtained in this case study are not available via standard reporting, because the reporting subsystems measure and control individual tasks. A process mining algorithm is required to discover sequential patterns of those events, not to mention providing the user interface for a drill-down to the details: counts and durations.

Companies often manage their processes top-down, but as the business environment changes, the information system supporting the business begins to evolve, reflecting the changes in individual tasks. Business process mining allows detecting such evolution in the information system in order to inform the company about the changes, as opposed to the top-down planned process model.

To the best of our knowledge, no other academic publications about business process mining case studies in warehouses exist. In addition, the strength of our approach was to use standard MS Dynamics AX software to capture the data needed for our case study without using additional data logging software. All necessary additions to tag and capture data were presented in the paper, which can also be helpful in other industries.



The results were evaluated both from the technical and the business perspective. Several "test processes" were performed and recorded in the actual warehouse under ideal circumstances and with experienced labor. Afterwards, it was checked whether the data produced by the preprocessing and analysis pipeline matched the original test setup. Business evaluation was performed by the top management, warehouse managers and sales department to analyze whether the actual processes reflected the initial plans and intents. Several insights were found, especially the general process of the warehouse and the most frequent process in the warehouse, a process chain which was carried out 6688 times during the period under observation and which can be summarized by the label "sending out goods" (SHIPMENT).

This case study creates a good starting point for further activities and studies. The real-world dataset containing all the warehousing events has been anonymized and is available upon request for research purposes.

References

[1] van der Aalst, W.M.P., van Dongen, B., Herbst, J., Maruster, L., Schimm, G., Weijters, T. Workflow Mining: A Survey of Issues and Approaches. Data and Knowledge Engineering, 2003, 47(2), 237-267.
[2] van der Aalst, W.M.P., Reijers, H., Weijters, T., van Dongen, B., Alves de Medeiros, A., Song, M., Verbeek, H. Business process mining: An industrial application. Information Systems, 2007, 32(5), 713-732.
[3] van der Aalst, W.M.P., Weijters, T. Process mining: a research agenda. Computers in Industry, 2004, 53(3), 231-244.
[4] van der Aalst, W.M.P., Weijters, T., Maruster, L. Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering, 2004, 16(9), 1128-1142.
[5] Bartholdi, J., Hackman, S. Warehouse & Distribution Science. [WWW] (accessed 20.02.2013)
[6] bcp Utility. [WWW] (accessed 20.02.2013)
[7] Database for Microsoft Dynamics AX. [WWW] (accessed 20.02.2013)
[8] InventTransType Enumeration. [WWW] (accessed 20.02.2013)
[9] MXMLib. [WWW] (accessed 20.02.2013)
[10] MXML Schema. [WWW] (accessed 20.02.2013)
[11] Object-relational mapping. [WWW] (accessed 20.02.2013)
[12] Petri Nets World. [WWW] (accessed 20.02.2013)
[13] ProM. (accessed 20.02.2013)
[14] Rozinat, A., de Medeiros, A.K.A., Günther, C.W., Weijters, A.J.M.M., van der Aalst, W.M.P. The Need for a Process Mining Evaluation Framework in Research and Practice. In: Proceedings of the Business Process Management Workshops, BPM 2007 International Workshops. Brisbane, Australia, 2007, 84-89.
[15] Rozinat, A., Veloso, M., van der Aalst, W.M.P. Evaluating the Quality of Discovered Process Models. [WWW] (accessed 20.02.2013)
[16] Weijters, A.J.M.M., van der Aalst, W.M.P., de Medeiros, A.K.A. HeuristicMiner algorithm. [WWW] (accessed 20.02.2013)
