ISSN : 1598-7248 (Print)
ISSN : 2234-6473 (Online)
Industrial Engineering & Management Systems Vol.17 No.2 pp.193-208
DOI : https://doi.org/10.7232/iems.2018.17.2.193

An Adaptive Time-Based Kiln Schedule using Forward Chaining Approach

Liem Yenny Bendatu, Bernardo Nugroho Yahya*
Department of Industrial Engineering, Petra Christian University, Surabaya, Indonesia
Department of Industrial & Management Engineering, Hankuk University of Foreign Studies, Gyeonggi, Republic of Korea
Corresponding Author, bernardo@hufs.ac.kr
January 31, 2017 July 26, 2017 March 19, 2018

ABSTRACT


Sequential decision problems use search techniques to generate a sequence of actions that leads to good states. Existing formulations of sequential decision problems require a given set of actions, whereas the actions in the kiln dry wood process rely on various factors that may vary from time to time. Predefined kiln schedules are available to assist the kiln operator’s decision making. However, routine decisions are highly affected by drying problems or customer complaints. Although many works attempt to apply mathematical models and experiments to predict the kiln dry process, relatively little attention has been paid to the problem of discovering sequential patterns from kiln dry wood process data. This study proposes a framework as a decision support for sequential decision problems, specifically for the kiln dry wood process. The framework starts with preprocessing the kiln process data, continues with mapping the preprocessed data into a log, and ends with analyzing as well as verifying the process using quality-of-prediction performance measures. A case study on real datasets in the field of kiln dry wood process indicates that historical data contain patterns which can generate a kiln schedule and are associated with possible future behavior.





    1. INTRODUCTION

    Decision support systems in manufacturing companies play an important role for managers and experts in analyzing and evaluating collected information and data for planning and decision making. Rather than supporting a solitary decision, complex applications with vast data can deal with a sequence of decisions. Previously concerned with only a single decision, where the utility of each action’s outcome is known, the field has been extended to sequential decision problems that include utility, uncertainty, and sensing. A sequential decision problem is a strategy that relies on a sequence of actions that leads to good states. Most of the tools apply search algorithms and techniques originating from control theory, operations research, and decision analysis in a suitable space, using basic principles from data mining and artificial intelligence.

    A wood manufacturing company is a good example of the usage of sequential decision problems. The characteristics and quality of wood highly depend on many aspects, such as the species and the market of the plantation-grown area. In wood manufacturing, the drying process is recognized as a vital activity since it highly affects the performance of finished goods. This process, also referred to as equilibration, can cause a range of issues, most commonly the wood shrinking unequally or becoming damaged if the process occurs too quickly (Simpson, 1991; Simpson and Wang, 2001). To ensure timber is dried in the most economical way, i.e., in the shortest time, without causing drying defects, schedules are created based on the species, thickness, consistency, and intended use of the wood (Ofori and Brentuo, 2010). Since processing timber on an industrial scale incurs high costs, numerous studies have developed effective schedules (Gan et al., 2015; Jara et al., 2008; Wang et al., 2002; Smith and Torgeson, 1951; Bramball, 1975). Although the Forest Products Laboratory (FPL) has suggested kiln schedules for various wood species as general guidelines (Boone et al., 1988), the operator’s judgment is still necessary to determine whether lumber will go through the kiln in a minimum time or emerge uniformly dried to the desired moisture contents.

    The operator’s judgment, which constitutes a sequential decision problem, is mainly determined by attributes based on sensors such as temperature, humidity, and moisture contents (MC). Predefined kiln schedules provide information such as the steps in which the temperature and humidity should be changed when the MC reaches particular states. Although predefined kiln schedules exist, most kiln operators rely on their knowledge and experience according to the actual situation and condition of the wood. For example, operators need to decide whether the temperature should be increased or decreased according to the output reading of the kiln dry machine, e.g., MC. This decision making takes place sequentially at every particular time interval, e.g., 2 hours, until the product reaches particular states. Often, operators place more trust in their own experience than in theoretical kiln schedules. Consequently, a log repository to store the output readings of the sensors in the kiln is beneficial for future reference. However, intensive use of the historical data, for example extracting patterns of the kiln dry process, has never been explored. As a matter of fact, operators would rather do tedious jobs such as manually evaluating the patterns of historical data than apply an emerging approach such as process mining.

    Process mining (van der Aalst, 2011), which deals with the analysis of process execution logs, can derive knowledge in several dimensions, including the extraction of flow patterns from a log. Deriving patterns from historical data according to state transitions remains a challenge in the application of process mining. In particular, the context of kiln drying illustrates the use of sensor data as time series. Time series data, also called low-level event logs, are not immediately suitable for process mining. Some early works on mapping low-level data into activity clusters have been done (Günther and van der Aalst, 2006). However, the sequential data here is different in two respects: uncertainty and discreteness. In terms of uncertainty, the data plots of temperature and humidity are influenced not only by MC but also by environmental conditions (e.g., weather) and wood specification (e.g., type of finished goods, wood species, wood plantation area, etc.). In regard to discreteness, most previous works have dealt with continuous time series data, while real-world wood manufacturing companies might apply technologies with semi-automatic intermittent time series data, i.e., discrete data. Discrete data implies user involvement, where the time interval of the sensor data storing process might differ. Thus, there are at least two challenges in this problem: (1) mapping sensor measurements to sequential profiles, and (2) recognizing patterns from sequential profiles. These challenges need to be addressed to close the gap between sensor data and the traditional event logs which are assumed as input for process mining techniques.

    This study aims to develop a framework for a sequential decision problem, in particular in wood manufacturing. The study addresses the following question: Can previously observed (historical) wood drying process data be reused to estimate the drying process? Some approaches to improving the drying process rely on experiments and observations with the inclusion of existing kiln schedules (Boone et al., 1988). Others attempt to model the relationship among existing factors (Wang et al., 2002; Bramball, 1975). Meanwhile, historical drying process data has been used only for quality assurance and process tracking. This study utilizes the log data to construct a kiln schedule model as a reference for the sequential decision problem in the kiln dry wood process. Specifically, the contributions of this study are as follows. First, by utilizing the historical data, we propose a decision support framework to gain insights and construct a kiln schedule. Second, we reuse a previous technique in process mining, the transition system, and verify the usability of the forward chaining approach to predict the completion time of the drying process. It remains challenging to apply this approach to the kiln dry wood process since most process mining approaches use activity as a nominal attribute, whereas the kiln dry wood process employs uncertain numerical attributes, i.e., the changes of temperature from time to time. Therefore, a new approach to map numerical values into an activity profile is proposed in this study. Subsequently, patterns about the actual situation can be discovered and compared to the predefined schedule generated by the operators as part of the organizational guidelines. Discovered discrepancies between the kiln schedule model and the real situation can be used to better align process and schedule, to remedy an ineffective predefined schedule, and to improve lead time.

    This paper is organized as follows. Section 2 describes related work on kiln schedule mining and process mining. Section 3 addresses the methodology used in this study. Section 4 shows the implementation and discusses issues of the proposed framework. Finally, Section 5 concludes the study.

    2. RELATED WORK

    This section addresses related work on the kiln dry process with respect to developing kiln schedules. Subsequently, work on mapping low-level events into an event log for process mining is explored.

    2.1. Developing Kiln Schedules

    The kiln dry wood process is one of the recommended processes in wood manufacturing to improve wood product quality by producing dried lumber. In kiln drying, hereafter called the kiln process, higher temperatures and faster air circulation are used to increase the drying rate considerably. The combination of temperatures and air circulation enforces the necessity of a kiln schedule. A kiln schedule is a carefully developed compromise between the need to dry lumber as fast as possible for economic efficiency and the need to avoid severe drying conditions that will lead to drying defects.

    There are numerous works on developing kiln schedules. Most existing work focuses on conventional kiln schedules (Boone et al., 1988; Simpson, 1991). A kiln schedule includes drying steps to achieve both economic efficiency and high-quality requirements; however, it is used only as a guideline, with many modifications when the operators apply it (Jara et al., 2008; Gattani et al., 2005; Ofori and Brentuo, 2010). Due to various factors such as wood species, lumber thickness, initial wood moisture contents, temperature, and environmental conditions, skilled kiln operators are required to manage the schedules based on the current conditions in a timely manner. As a consequence, the completion time is unpredictable and uncertain, which raises a high possibility of tardiness in other processes. Hence, improving the performance of the wood drying process by analyzing and discovering the kiln schedule from the (successful) historical data is necessary.

    Previous works have attempted to improve kiln schedules for various types of wood and to estimate the completion time (Simpson, 1991; Campbell, 1975; Nassif, 1983; Vermaas and Neville, 1989; Neumann, 1989). The reason for building an individual kiln schedule for each species is to match the characteristics, anatomy, and nature of the wood. Numerous kiln schedules have been generated; however, unconsidered aspects such as region and weather would affect their implementation. Rather than developing individual kiln schedules, best practices derived from historical process data might assist operators in deciding the steps of the process. In addition, certain processes could be grouped when their features share the same properties. Therefore, it is a challenge to retrieve a kiln schedule from thousands of combinations of process sequences.

    Some organizations around the world have dedicated their work to the development of schedules for various woods. For example, the Commonwealth Scientific and Industrial Research Organisation (CSIRO), the Australian Furnishing Research & Development Institute (AFRDI), and Conservation and Land Management (CALM) in Australia (Mills, 1991) are some of the organizations that focus on analyzing wood characteristics as well as generating individual schedules for particular woods. The developed kiln schedules can be categorized into several types: normal MC-based schedules, constant temperature or constant relative humidity schedules, time-based schedules, continuously varying schedules, intermittent schedules, and smoothed schedules.

    Normal MC-based schedules are schedules that normally begin at green moisture content, followed by normal kiln drying after the start (Mills, 1991; Campbell, 1975; Neumann, 1989; Pandey et al., 1984; Stöhr and Mackay, 1984; Stöhr et al., 1985). These schedules often suggest that the kiln drying is preceded by an initial air drying (or pre-drying in a pre-dryer) period to a particular MC of 25-30%. Constant temperature or constant relative humidity (R.H.) schedules have been applied to pine squares, which are 4/4, 5/4, and 6/4 in cross section and 24 to 36 in. long (Simpson, 1991). Moreover, they have been applied in the Netherlands for the complete drying of hardwoods in a climatic chamber.

    Time-based schedules have been developed out of MC-based schedules after suitable experience (Campbell, 1975; Simpson and Wang, 2001). Meanwhile, continuously varying schedules (CVS) require a continuous increment of the Dry-Bulb Temperature (DBT) and Wet-Bulb Temperature (WBT) after an initial heating-up stage to 35-45°C (Bramball, 1975; Nassif, 1983; Vermaas and Neville, 1989). There are some variations of CVS schedules; for example, DBT and WBT may increase at a fixed rate per percent of MC loss. An example of a continuously varying schedule can be seen in Table 1.

    Intermittent schedules include a drying phase which alternates between heating and ventilation. For example, while the heating runs, the vents are closed, and while the heating is switched off, the vents are opened. It is based on the theory that MC and stress gradients will be reduced during the non-drying phase and thus result in a better quality of dried wood. An example of good results was obtained by Vermaas using such a schedule to dry E. grandis (Vermaas and Neville, 1989). However, the consequence is a longer drying time. The other kind of schedule is the smoothed schedule. This kind of schedule was developed from the hypothesis that the relatively large abrupt changes in EMC conditions in a conventional kiln schedule subject the wood surface to abrupt stress (Simpson, 1980; Little and Toenisson, 1989). In addition, a steep moisture gradient tends to aggravate or cause surface checking. If the changes were smaller and more numerous, the lumber should be less susceptible to surface checking.

    Specific kiln schedules have been developed to control temperature and relative humidity in accordance with the moisture content and stress situation within the wood, thus minimizing shrinkage-caused defects. Since most previous works rely on theoretical settings, most kiln schedules work only for a very specific region and are not applicable in other regions due to factor variability, e.g., weather, air condition, etc. It is common that kiln operators apply trial and error at the first process and subsequently depend on their experience. In other words, historical (log) data could express the knowledge and experience of the operators. To the best of our knowledge, no previous work considers the log data for constructing a kiln schedule. This study applies a forward chaining approach to mine the steps of the kiln schedule based on log data.

    2.2. Process Mining and Event Abstractions Approaches

    There are several works on generating a general kiln schedule using approaches such as experimental approaches (Smith and Torgeson, 1951; Wang et al., 2002), time series analysis (Gattani et al., 2005), statistical analysis (Wen et al., 2012), and mathematical programming (Marier et al., 2016). Process mining, as a derivation of data mining, has not been considered. Process mining aims to use event logs to discover knowledge and analyze the behavior of traces or sequences (van der Aalst, 2011; Weijters et al., 2006; de Medeiros et al., 2007; Günther and van der Aalst, 2007). The aspect that differentiates process mining from data mining is the sequential data that represents a process instead of atomic data. The sequential data, called a trace, can be further analyzed with various techniques to discover knowledge for business or economic purposes, e.g., bottleneck and time analysis.

    Although process mining has been successfully applied to various fields such as business processes (van der Aalst, 2011; Günther and van der Aalst, 2006), logistics (Jeon et al., 2013; Yahya et al., 2016), municipalities (van der Aalst et al., 2011), and manufacturing (Yahya, 2014), there has been little work that focuses on its application to wood manufacturing. Some process mining approaches would be suitable for finding potential patterns for monitoring and prediction purposes (van der Aalst et al., 2011). For example, process mining has been used to discover models in operational settings and transition systems of low-level events (Günther and van der Aalst, 2006; Ferreira et al., 2013; Mannhardt et al., 2016). These approaches apply abstraction mechanisms to aggregate events into meaningful labels. While the mentioned low-level events are similar to signal processing, the existing work remains challenging when we utilize sequential uncertain attribute values of event data. In addition, since some information systems have no explicit notion of activities/tasks, the analyst is forced to induce activities from the low-level event log. Through some modifications, the transition system approach could be the basis for discovering a model from a low-level event log.

    For predicting the processing time, a process mining approach called the finite state machine (FSM) miner has been proposed (van der Aalst et al., 2011). The approach has been mostly used to discover a model in an operational setting based on an event log. The discovered model can be extended with information to predict the completion time of running instances. Using this approach, we can learn from earlier instances and annotate the model with time information. This study reuses the transition system approach with some modifications, including a verification procedure, to show the applicability of the aforementioned approach to the sequential event data available in the kiln dry wood process.

    3. METHODOLOGY

    This section describes the methodology used in this study. The overall framework to discover the kiln schedule is described in Figure 1. It consists of four steps: data preparation, transition system model, kiln schedule generator, and performance analysis. The framework starts with a data retrieval mechanism from the manufacturing database. The retrieved data is preprocessed in the data preparation step to remove noise and exclude incomplete data. The preprocessed data are then stored in an independent repository, called an event log. The transition system model aims to discover best practices of the kiln drying operation from the historical data. The kiln schedule generator aims to find a schedule according to the given queries. Finally, the performance analysis aims to measure the effectiveness of the approach. Each of these steps is explained in the following sub-sections.

    3.1. Data Preparation and Transition System Model

    There are a lot of wood processing data in wood manufacturing companies. This study focuses on the kiln dry wood process, whose data include the kiln drying machine number, Setting Dry Bulb Temperature (SDB), Setting Wet Bulb Temperature (SWB), Actual Dry Bulb Temperature (ADB), Actual Wet Bulb Temperature (AWB), relative humidity (R.H.), moisture contents (MC), and timestamp. Among the data, the relevant attributes such as a unique identifier (Case ID), time, and the temperature values are selected. It should be noted that a unique identifier, in some cases, does not exist. Hence, aggregating some attributes (room number of the kiln dry machine, date, and time) from the database is beneficial for the analysis.

    Each case consists of a sequential data of events with various attributes such as dry-bulb setting temperature (SDB), wet-bulb setting temperature (SWB), moisture contents (MC), and Time. Table 2 shows a fragment of historical data with three cases of the same wood type and several stages of temperatures of SDB and SWB.

    Before analyzing the data, the data quality must be improved as a requirement for improving the analysis result. The data preparation phase aims to filter out, for example, noise and incomplete data. The filtered data is subsequently converted into a standard format, e.g., MXML, for further processing.
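
    As an illustration of this step, the following sketch filters incomplete sensor records and groups the readings into time-ordered cases before export to an event log format. It assumes a pandas DataFrame with hypothetical column names (CaseID, SDB, SWB, ADB, AWB, MC, Timestamp); it is not the tool used in the study.

```python
import pandas as pd

def prepare_log(df: pd.DataFrame) -> dict:
    """Filter out incomplete records and group readings into time-ordered cases."""
    cols = ["SDB", "SWB", "ADB", "AWB", "MC", "Timestamp"]
    df = df.dropna(subset=["CaseID"] + cols).copy()    # remove noisy / incomplete rows
    df["Timestamp"] = pd.to_datetime(df["Timestamp"])  # normalize timestamps
    df = df.sort_values(["CaseID", "Timestamp"])       # events must be time-ordered
    # One trace (ordered list of events) per case, ready for conversion to MXML/XES.
    return {cid: g[cols].to_dict("records") for cid, g in df.groupby("CaseID")}
```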

    To generalize, kiln process log sequential data is defined as follows.

    Definition 1. Kiln process log sequential data

    A kiln process log sequential data, denoted as L, consists of a set of cases. Each case consists of a set of events ordered based on the records made during the monitoring process, referred to as sequential data. Sequential data (D) is denoted as a finite non-empty sequence of attribute values intermittently stored in the repository. Thus, a kiln process log data is a tuple <E, C>, defined as follows.

    Event. E = A × TT is a set of events, where A is a collection of attribute values used for modeling purposes and TT is a set of timestamps. The monitoring process records A into a data repository to indicate the progress of the kiln process from beginning to end. It is denoted as A = AD × AW × AD’ × AW’ × AM, where AD, AW, AD’, AW’, and AM represent the SDB temperature, SWB temperature, ADB temperature, AWB temperature, and MC, respectively.

    For example, according to Table 2, e1 ∈ E and e1 = {52, 40, 39, 31, 28, 14-11-05 16:00}. The first event contains the values 52, 40, 39, 31, 28, and 14-11-05 16:00, which represent the SDB temperature, SWB temperature, ADB temperature, AWB temperature, MC, and timestamp, respectively. (Note that the date format follows year, month, day, hour, and minute. In addition, AM is only used for the target state, not for modeling purposes.)

    Trace. A trace σ is a finite non-empty sequence of events such that each event appears only once and time is non-decreasing.

    Case. C is a set of ordered events, where a collection of all cases is a kiln process log (L). A case can be regarded as a trace, denoted as σ ∈ E*. C = E* is the set of possible event sequences (traces describing a case). Each element of L denotes a case. Hence, an event log is a set of traces, L ⊆ C.

    Approaches for reasoning over the historical data can be categorized into two types: forward chaining and backward chaining. Forward chaining, sometimes called the data-driven approach, starts with the actual data and uses the steps in kiln schedule mining to extract more data until a goal is reached. In this study, the goal refers to the user requirement on the MC, known in later sections as the target MC. For example, suppose that the goal is to achieve a target MC of 10; the inference engine will then iterate the steps until the target MC is reached. Unlike forward chaining, backward chaining, sometimes called the goal-driven approach, usually employs a depth-first search strategy to describe the steps moving backward from the goal. For example, suppose the target MC is given; the inference engine will then iterate the steps moving backward until the actual attributes are reached.
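
    A minimal sketch of the forward-chaining idea is given below. It assumes a hypothetical rule table that maps the current temperature setting to a recommended next setting, and a callable `read_mc` that returns the latest moisture content reading; both names are illustrative only.

```python
def forward_chain(current_state, rules, target_mc, read_mc, max_steps=50):
    """Data-driven loop: apply the recommended next step until the target MC is reached."""
    schedule = [current_state]
    for _ in range(max_steps):
        if read_mc() <= target_mc:              # goal reached: target moisture content
            break
        next_state = rules.get(current_state)   # recommended next temperature setting
        if next_state is None:                  # no known step from this state
            break
        schedule.append(next_state)
        current_state = next_state
    return schedule
```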

    This study emphasizes forward chaining to reason over the available data and attempts to verify the model with kiln process data. One of the goals of assessing sequential attribute values is to build a general model for schedule generation and future prediction. For this purpose, we utilize a transition system. A transition system is used to describe the potential behavior of discrete systems. A transition system is a triplet (S, E, T), where S is the state space (i.e., possible states of the process), E is the set of event labels (i.e., transition labels), and T ⊆ S × E × S is the transition relation that describes how the system can move from one state to another. For example, (s1, e, s2) ∈ T describes that there is a movement from state s1 to s2 by an event labeled e. To represent the transition system, definitions of state and event representations are necessary. Hence, it is required to map the kiln process log sequential data into a state representation (e.g., eliminate redundant values in a trace and develop a way to distinguish the values as states) and an event representation.

    Definition 2. State representation

    A state representation is a function that projects a trace onto some representation. Formally, Lstate: C → ℛ, where C is the set of possible traces and ℛ is the set of possible (state) representations (e.g., sequences, sets, multisets).

    Definition 3. Event representation

    An event representation is a function that projects an event onto a set of possible representations. Formally, Levent: E → ℰ, where E is the set of possible events and ℰ is the set of possible (event) representations (e.g., corresponding to temperature settings). It should be noted that the same label may appear on more than one transition.

    To optionally remove the order or frequency from the resulting trace, an abstraction mechanism is necessary. The abstractions of sequential data used in this study are defined as follows.

    Definition 4. Sequence, Multiset, or Set

    To obtain the order of the kiln schedule, the abstraction is applied in three forms that determine the state representation (see Definition 2). The sequence abstraction preserves both order and frequency, the multiset preserves frequency but not order, and the set removes both order and frequency from the trace. Specifically, the three abstractions used for discovering the kiln schedule are as follows.

    • Sequence: the time order of temperature steps is recorded in the state,

    • Multiset of steps: the number of times each step is executed, ignoring their order,

    • Set of steps: the mere presence of steps.

    Consider the sequence <52, 52, 55, 55, 57, 57, 60, 60>. The three possibilities result in <52, 52, 55, 55, 57, 57, 60, 60> (sequence), [52², 55², 57², 60²] (multiset), or {52, 55, 57, 60} (set). Using two attributes (SDB and SWB), the three abstractions for case #1 are <{52, 40}, {55, 40}, {57, 40}, {57, 42}, {60, 42}> (sequence), [{52, 40}², {55, 40}², {57, 40}, {57, 42}, {60, 42}²] (multiset), and {{52, 40}, {55, 40}, {57, 40}, {57, 42}, {60, 42}} (set).
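
    The three abstractions can be illustrated with a short sketch (a plain Python illustration of Definition 4, not part of the original tooling):

```python
from collections import Counter

trace = [52, 52, 55, 55, 57, 57, 60, 60]

sequence = tuple(trace)    # keeps order and frequency: (52, 52, 55, 55, 57, 57, 60, 60)
multiset = Counter(trace)  # keeps frequency only: {52: 2, 55: 2, 57: 2, 60: 2}
step_set = set(trace)      # keeps mere presence:  {52, 55, 57, 60}

# With two attributes (SDB, SWB), each event becomes a pair such as (52, 40):
trace2 = [(52, 40), (52, 40), (55, 40), (55, 40), (57, 40), (57, 42), (60, 42), (60, 42)]
print(tuple(trace2), Counter(trace2), set(trace2))
```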

    3.2. Kiln Schedule Generator

    Two things could be generated from the kiln process data: a general kiln schedule and a specific kiln schedule. The former refers to all wood species, and the latter emphasizes specific properties in the kiln process (e.g., wood species, time, query) with particular techniques. The occurrence and frequency of the current state may be less interesting than the fact that it occurs within the time span used to determine the kiln schedule. Hence, the set abstraction result is projected as a labeled transition system to generate the general model.

    Definition 5. Labeled Transition system

    Let L be an event log. Given Lstate and Levent as the state representation and event representation functions, respectively, a labeled transition system is formally defined as TS = (S, E, T), where

    $S = \{ L_{state}(\sigma^{k}) \mid \sigma \in L \wedge 0 \le k \le |\sigma| \}$ is the state space,

    $E = \{ L_{event}(\sigma(k)) \mid \sigma \in L \wedge 1 \le k \le |\sigma| \}$ is the set of event labels, and

    $T = \{ (L_{state}(\sigma^{k}), L_{event}(\sigma(k+1)), L_{state}(\sigma^{k+1})) \mid \sigma \in L \wedge 0 \le k < |\sigma| \}$ is the transition relation,

    where $\sigma^{k}$ denotes the prefix of $\sigma$ consisting of its first $k$ events and $\sigma(k)$ denotes its $k$-th event.
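
    A compact sketch of Definition 5 is shown below: it builds the labeled transition system from a log of traces, with the state representation passed in as a function (here the set abstraction via `frozenset`). The function and variable names are illustrative and are not the FSM miner plug-in itself.

```python
def build_transition_system(log, state_rep=frozenset):
    """Construct TS = (S, E, T) from a log of traces (cf. Definition 5)."""
    S, E, T = set(), set(), set()
    for trace in log:                        # a trace is a list of event labels, e.g. (SDB, SWB) pairs
        for k in range(len(trace)):
            s_from = state_rep(trace[:k])    # state after the first k events
            s_to = state_rep(trace[:k + 1])  # state after the first k+1 events
            e = trace[k]                     # event label: the temperature step taken
            S.update([s_from, s_to])
            E.add(e)
            T.add((s_from, e, s_to))
    return S, E, T

# Case #1 from Table 2, using the (SDB, SWB) pairs as event labels:
log = [[(52, 40), (55, 40), (57, 40), (57, 42), (60, 42)]]
S, E, T = build_transition_system(log)
```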

    In regard to the specific kiln schedule, a horizon-based schedule is proposed. A horizon-based schedule aims to generate a schedule for a specific purpose (e.g., particular wood species, actual queries, or maximal horizon). To calculate the state, we can use either the complete trace or only a subset of the trace. Two concepts are introduced: the time-based and the query-based horizon. The former is a technique to generate a model according to the time span of the sequences, while the latter is a technique to find the best references in accordance with the actual conditions. The definitions of the two concepts are as follows.

    Definition 6. Time-based horizon

    The time-based horizon is denoted as the number of steps of a trace that are considered. For example, if the horizon is limited to 2 and the trace <52, 52, 55, 55, 57, 57, 60, 60> is used, then only the 2 most recent events are considered in the state; if the complete trace is considered, the horizon corresponds to ∞. Figure 2 shows the transition system of the fragment log from Table 2. Using the time horizons h = 1 and h = 2, the resulting transition systems are illustrated on the left and right of Figure 2, respectively.

    Accordingly, the resulting transition system could be used to obtain a state which considers only partial steps, i.e., certain events are simply not considered in the calculation. For example, filtering could be used to project the horizon onto a set of steps X, i.e., only events that correspond to some steps in X are considered in the state. If X = {57, 60}, then the trace <52, 52, 55, 55, 57, 57, 60, 60> is reduced to <57, 57, 60, 60>.
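
    The sketch below illustrates how a state can be derived under a time-based horizon and an optional step filter (Definition 6 and the filtering described above); the helper name is illustrative.

```python
def horizon_state(prefix, h=None, allowed=None):
    """Derive the state of a trace prefix using a time-based horizon and an optional step filter."""
    if allowed is not None:                    # project onto the set of steps X
        prefix = [e for e in prefix if e in allowed]
    if h is not None:                          # keep only the h most recent events
        prefix = prefix[-h:]
    return tuple(prefix)

trace = [52, 52, 55, 55, 57, 57, 60, 60]
print(horizon_state(trace, h=2))               # (60, 60)
print(horizon_state(trace, allowed={57, 60}))  # (57, 57, 60, 60)
```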

    Definition 7. Query-based horizon

    The query-based horizon is a way to verify the proposed approach by measuring the difference between the model and the actual data. Two types of query-based horizon are proposed. Type I evaluates the difference between the setting temperature and the actual temperature. Type II utilizes two additional attributes (ADB and AWB) from the log to find the closest setting temperature. Using the Euclidean distance (see Eq. 1), we select potential instances (cases) with the selected variables to be the reference.

    $b^{*} = \arg\min_{b} \sum_{i} (b_{i} - a_{i})^{2}$
    (1)

    where a and b are two states, i.e., the actual state and a state in the model, respectively, and the index i refers to the i-th variable in the respective state. Suppose the actual state is 40 and 37 for ADB and AWB, respectively. For type I, case 2 and case 3 would be the best references. For type II, case 2 would be the best reference since the ADB and AWB of its second event (40 and 35 for ADB and AWB, respectively) are the closest.
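
    A small sketch of the query-based selection (Eq. 1) is given below; the threshold value and case labels are illustrative.

```python
import math

def closest_references(actual, candidates, threshold=10.0):
    """Rank candidate states (e.g. (ADB, AWB) pairs from the log) by Euclidean distance
    to the actual reading and keep those within the threshold."""
    def dist(state):
        return math.sqrt(sum((b - a) ** 2 for a, b in zip(actual, state)))
    scored = sorted((dist(state), case) for case, state in candidates)
    return [(d, case) for d, case in scored if d <= threshold]

# Actual reading ADB = 40, AWB = 37; candidate events taken from the historical log.
print(closest_references((40, 37), [("case2", (40, 35)), ("case3", (44, 39))]))
```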

    Generally, decisions are made after observing the actual temperature, including the value of MC. Three types of decisions are possible: changing either the SDB or SWB value, keeping the SDB and SWB values, and changing both the SDB and SWB values. Changing either the SDB or SWB value refers to a change of one attribute. For example, in case 1 the value of SDB is increased to 57 at 2014-11-06 16:00 from 55 at 2014-11-06 10:00 while the SWB is kept at 40. Keeping the values of both attributes refers to maintaining the conditions until the MC reaches a particular state. For example, both the SDB and SWB values remain the same until a goal state, MC, is satisfied. Changing both the SDB and SWB values can happen, as in case 3, where there is a change from <42, 37> to <45, 38>. Hence, any change of temperature is denoted as an event representation that bridges one state and another. In addition, the duration of a particular change from one state to another represents the occurrence of the subsequent process. Time information is therefore required to annotate the transition system with time aspects, as explained below.

    Definition 8. Time information

    The state is annotated with time information such as the elapsed time, sojourn time, and remaining time. The elapsed time is the average time to reach a particular state. Meanwhile, the sojourn time and remaining time are the average time spent in a particular state and the average time to reach the end from this state, respectively.

    Let σ1 and σ2 be the prefix trace and postfix trace, respectively, such that σ = σ1 · σ2. The full trace σ = <(52, 40), (55, 40), (57, 40), (57, 42), (60, 42)> is split into σ1 = <(52, 40), (55, 40)> and σ2 = <(57, 40), (57, 42), (60, 42)>. For this particular situation, 16, 8, and 30 hours are the elapsed time, sojourn time, and remaining time, respectively. The first state, (52, 40), has an elapsed time of 0 hours since it is the starting state. However, its sojourn time, which refers to the time spent in that state, is 16 hours, and its remaining time, which is the time to reach the end state, is 46 hours. When the process moves to the subsequent state (55, 40), the elapsed time increases by the sojourn time of the previous state, which is 16 hours. Meanwhile, the sojourn time is 8 hours and the remaining time is 30 hours. The sojourn time is derived from the beginning time of state (57, 40) minus the beginning time of state (55, 40), and the remaining time is the duration of the case minus the elapsed time. For example, the remaining time in state (52, 40) is 46 hours and then becomes 30 hours (46 - 16). When a state occurs multiple times in different cases, statistical measures such as the mean, median, and minimum value are applied. For example, state <(42, 37)> has a mean elapsed time of 46 hours (61 hours from case 2 and 31 hours from case 3). When we consider the minimum elapsed time, the value of 31 hours (the minimum among the cases) is selected. An example of the kiln schedule from all cases is illustrated in Table 3. Using this annotation, any input from users can be inferred from the logs to predict the remaining flow time of all or some of the running cases.
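
    A sketch of the time annotation (Definition 8) is shown below. It assumes each trace is a list of (event_label, timestamp) pairs with datetime timestamps and aggregates the elapsed and remaining times per state with the mean; the median or minimum could be substituted. The names are illustrative.

```python
from collections import defaultdict
from statistics import mean

def annotate_with_time(log, state_rep=frozenset):
    """Collect elapsed and remaining times (in hours) per state from timestamped traces."""
    elapsed, remaining = defaultdict(list), defaultdict(list)
    for trace in log:
        start, end = trace[0][1], trace[-1][1]
        for k in range(len(trace)):
            state = state_rep(label for label, _ in trace[:k + 1])
            t = trace[k][1]
            elapsed[state].append((t - start).total_seconds() / 3600)   # time to reach the state
            remaining[state].append((end - t).total_seconds() / 3600)   # time left to completion
    # Aggregate per state (mean here); sojourn times follow as differences of elapsed times.
    return ({s: mean(v) for s, v in elapsed.items()},
            {s: mean(v) for s, v in remaining.items()})
```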

    3.3. Performance Analysis

    To measure the quality of the analysis, we simply take the log, derive the annotated transition system and the predictions, and then, in a second run, compare the predicted values with the real values. The annotated transition system is a transition system with time information. The mean squared error (MSE) (Eq. 2) is used to quantify the difference between the predicted and real values.

    $MSE = \frac{1}{n}\sum_{i=1}^{n}(x_{i} - \bar{x})^{2}$
    (2)

    Suppose that a training set contains a particular state <40, 37> that has two elements: 21 and 31 hours. A test set that contains the state <40, 37> would be handled by taking the mean of the time information of that state as the predicted value. As a result of the performance analysis, the predicted value (x̄), computed from the data elements of the training set, is then evaluated against the real values using the MSE measure.

    The prediction of completion merely depends on the abstraction chosen during schedule generation. In addition, to address the accuracy problem, the quality of the predictions needs to be elaborated. For this purpose, there are two performance analysis measures: first, a confidence interval for the true average (mean), and second, a confidence interval for the real value assuming that we know the true average (aggregated mean). In addition, three other measures of the difference between the predicted and real values are used in this study: the root mean squared error (RMSE) (Eq. 3), the mean absolute error (MAE) (Eq. 4), and the mean absolute percentage error (MAPE) (Eq. 5).

    $RMSE = \sqrt{MSE}$
    (3)

    $MAE = \frac{1}{n}\sum_{i=1}^{n}\left|x_{i} - \bar{x}\right|$
    (4)

    $MAPE = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|x_{i} - \bar{x}\right|}{x_{i}}$
    (5)

    Cross validation is the statistical practice of partitioning a data set into two subsets such that the analysis is initially performed on one subset (i.e., the training set) while the other subset is used for validation (i.e., the test set). The basic form of cross validation, namely holdout validation, randomly selects samples from a dataset as a test set, and the remaining data are retained as the training data. A more elaborate form is K-fold cross validation, where a dataset is partitioned into K sets of equal size. The process is repeated K times, called folds, and the results from the folds are averaged to produce a single estimate. A traditional validation approach which utilizes a fixed ratio can also be considered; this study applies an 80-20 ratio as one of the verification approaches. For this purpose, the FSM evaluator, one of the plug-ins in the process mining tool, is used to (a) evaluate a transition system with predictions based on one log using cross validation and (b) validate with another event log using the 80-20 ratio.
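
    The sketch below shows the kind of computation behind these measures: an 80-20 holdout split of the cases and the MAE, RMSE, and MAPE of remaining-time predictions. It is a plain re-implementation for illustration, not the FSM evaluator plug-in.

```python
import math
import random

def holdout_split(cases, ratio=0.8, seed=42):
    """Traditional 80-20 holdout: the first part trains the annotated TS, the second tests it."""
    cases = list(cases)
    random.Random(seed).shuffle(cases)
    cut = int(len(cases) * ratio)
    return cases[:cut], cases[cut:]

def error_measures(pairs):
    """pairs: list of (real, predicted) remaining times in hours."""
    n = len(pairs)
    mae = sum(abs(x - p) for x, p in pairs) / n
    rmse = math.sqrt(sum((x - p) ** 2 for x, p in pairs) / n)
    nonzero = [(x, p) for x, p in pairs if x != 0]
    mape = 100.0 * sum(abs(x - p) / x for x, p in nonzero) / len(nonzero)
    return mae, rmse, mape
```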

    4. ANALYSIS RESULT – A CASE STUDY

    This section describes the analysis results using real datasets. First, the analysis result aims to show the discovered kiln operation based on historical data. Furthermore, additional analysis is necessary to verify and display the pattern of the kiln schedule according to the given queries. In the later part of this section, we discuss the limitations of our current work.

    4.1. Analysis Result

    For the analysis, we use a real dataset. The dataset (DS1) used in this study consists of 31 cases with 6,018 events which were stored between March 2, 2015 and May 15, 2015 (see Table 4). Moreover, the longest and shortest durations as well as the mean and median durations of the traces in the dataset have been measured. There are five wood types, namely W1, W2, W3, W4, and W5.¹ For each wood type, we evaluate the number of cases (see Table 5). Note that a chamber in a kiln process can contain multiple wood types such as W1 and W2. Additionally, we perform an additional analysis on a specific wood type (W6), as described in Table 6. Since the data were manually stored by operators, they contain noise such as null values and incomplete records. For the analysis, this noise was pruned during the data preparation phase. Afterward, the filtered data were converted into a standard format, i.e., an MXML file, for the analysis in the process mining tool.

    The general kiln schedule model is derived from the event log by assuming that operators apply similar knowledge to all woods. The model is obtained under the assumption that the state of a case is determined by the set of setting temperatures that have taken place, i.e., Lstate(σ^k) equals the set of steps of the event log. This model can be used to reproduce the event log, i.e., all observed traces can be reproduced and the model does not allow for any trace that is not present in the original event log. The analysis in this study utilizes a process mining tool.² The transition system, applied as the finite state machine (FSM) miner, was used to discover the kiln schedule (van der Aalst et al., 2011). The implementation of the FSM miner using forward chaining results in the diagram presented in Figure 4. As shown thus far, the approach allows for the prediction of various quantities such as the elapsed time, sojourn time, and also the remaining time until completion.

    The specific kiln model is derived from the event log by considering particular properties such as wood type and horizons. Because the numbers of cases of some wood types are not representative, the analysis focuses on the horizons: the time-based and query-based horizon. The time-based horizon aims to derive a model according to the set of partial steps of the temperature setting. Figure 3 and Figure 4 display the result of the FSM miner with the set abstraction and a time-based horizon equal to 1 for DS1 and DS2, respectively. The query-based horizon results in the same model; however, the extracted data differ from the general kiln schedule since the event log used is based on the derived reference set.

    To evaluate the effectiveness of the proposed approach, we generate the kiln schedule using the transition system based on the whole dataset (DS1) and run the tool with various parameters, with the time horizon (H) ranging from 1 to 5. In addition, there are two statistical properties that we need to cope with. First, the mean, median, and minimum are the measures used to at least prove the concept with regard to the time information. Second, the mean and aggregated mean of the datasets are also considered to see the deviations. Cross validation, specifically 10-fold validation, was applied to verify the deviation between the model and the log, i.e., to check the “correctness” of the kiln schedule with respect to the log. The experiment results in terms of the various performance measures can be seen in Table 7 and Table 8 for DS1 and DS2, respectively. Accordingly, the experiment with time horizon 5 (H = 5) shows the lowest error rates for all measures among the range of time horizons (MAE with mean (29.87 hours) and aggregated mean (100.29 hours)), with RMSE as the highest value (36.3 hours) and MAPE as the lowest value among the others in both the mean (1975.19%) and aggregated mean (5787.34%). For DS2, the experiment with time horizon 5 (H = 5) also shows the lowest error rates for all measures among the range of time horizons (MAE with mean (25.2551 hours) and aggregated mean (67.9259 hours)), with RMSE as the highest value (31.5799 hours) and MAPE as the lowest value among the others in both the mean (2933.72%) and aggregated mean (7539.93%). It should be noted that the experiment took place from H = 1 to H = 5 due to the possibility that the schedule could be determined early, with a maximum of five successive steps. In addition, the FSM Evaluator tool reports MAPE as a percentage, i.e., 1975.19 percent equals 19.7519.

    As mentioned above, the query-based horizon approach aims to derive a model by considering the actual properties, i.e., the ADB and AWB temperatures. Using query-based horizon type I, the experiment measuring the deviation between the best references and the log can be seen in Table 9.

    The best references refer to the set of cases that give the closest distance measures within a specific threshold, e.g., 10. That is, cases with distances less than the threshold are considered as the cases of the filtered log for the analysis. Note that a value close to zero means the query and the historical data are more similar. Running the experiment with given queries of ADB, AWB, and MC equal to 30, 29, and 48, respectively, and time horizon values ranging from 1 to 5, the results show that a higher time horizon gives a lower error for all measures (refer to Table 9).

    4.2. Discussions

    Having presented the experiments of the proposed approach with several parameter settings on the given dataset, we now discuss potential limitations and data requirements. The previous section showed that this approach can generate a kiln schedule and validate the result using several measures. However, these quality considerations are based on experimentation and do not provide clear rules for determining better accuracy. In addition, in building a transition system, there can be a balance problem between “overfitting” and “underfitting.”

    The transition system (TS) is considered “overfitting” if it does not generalize and is sensitive to particular traces in L. This means the discovered TS model would be very different if a small percentage of cases in L were removed or added. When the amount of data increases, there would be many possible paths with low occurrences, causing most cases to follow a path not taken by other cases in the same period. Hence, to avoid overfitting, it is necessary to generalize and have a TS model that allows for more behavior than the historical data in L.

    On the other hand, the transition system TS is considered “underfitting” when it allows for too much behavior that is not supported by L. We can detect a model that allows for the behavior seen in the log but also for completely different behavior. For example, suppose a log L consists of 100 cases; in many cases the temperature 37 degrees is followed by 39 degrees, and in no case is 37 degrees followed by 40 degrees. Since this transition does not exist in the TS model, a given query in which a temperature of 37 is followed by 40 would be considered “wrong” based on the model, and this TS is underfitting L.
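
    For illustration, a query step can be checked against the discovered transition relation as in the sketch below (the helper name is illustrative); a query that uses a transition absent from T would be flagged in this way.

```python
def step_supported(T, state, step):
    """Return True if the model contains a transition that takes `step` from `state`."""
    return any(s == state and e == step for (s, e, _) in T)
```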

    To test for overfitting and underfitting, we run experiments by splitting the data into two sets: training and testing. The training set is assumed to be the historical data, and the testing set is designated as the new cases in the kiln drying process. For this experiment, an 80-20 ratio was assigned to the training set and testing set, respectively. Using the FSM evaluator with the 80-20 ratio, the results of the performance analysis are shown in Table 10 for DS1 and Table 11 for DS2.

    The overall analysis of DS1 shows that a higher time horizon value decreases the error. In addition, the median data has the lowest error in comparison to the other two. Although increasing the time horizon for various wood types could give a better result, in real-world applications a high time horizon value increases the complexity of the system. One way to achieve the equilibrium point is to analyze data from the same wood type. DS2 exhibits an equilibrium point when the time horizon is either 2 or 3. Categorizing the data and running the analysis on a particular wood type is shown to produce a low error rate. Time horizon H = 2 gives the best results for the median (MAE) and mean (RMSE). Meanwhile, the MAPE shows that time horizon H = 3 provides the best result for the minimum. Hence, it is necessary to find the equilibrium point when setting the time horizon for a particular demand according to a specific wood type.

    The results display a high error rate due to various factors such as data limitations and data heterogeneity. The kiln dry wood process execution time is mostly about 30 days, while the minimum and maximum are about 2 weeks and 2 months, respectively. For better analysis results, acquiring a proper amount of data is necessary. Data heterogeneity is another aspect of getting better analysis results. As a matter of fact, the kiln process highly depends on the wood type. Data homogeneity, for example a proper amount of data of the same wood type, would help to improve the quality of the analysis. In addition, environmental data are among the factors that might influence the construction of the kiln schedule model. Environmental data such as weather, temperature, and humidity have a strong relationship with the development of the kiln schedule. Hence, a machine learning approach would be one of the alternatives for future work.

    5. CONCLUSION

    This study aims to develop a decision support framework to discover a kiln schedule based on historical data. The complex relationships among the factors of the kiln dry wood process create many challenges. The existing approaches resolve sequential decision problems using techniques based on experiments and mathematical models, which neglect the potential use of historical data, also called event logs. This study attempts to contribute to the field of sequential decision problems by utilizing event logs to discover a time-based kiln schedule. The logs, which contain temperature and moisture content data, were used. By utilizing a process mining approach, i.e., a transition system, this study shows how to generate kiln schedules. The advantages of the proposed method are threefold. First, the mined kiln schedule could be a reference for the kiln operator to manage the steps of the drying process. Second, the time-based kiln schedule is useful for operators and managers to evaluate past processes. Third, it can predict the duration of the kiln drying process under a specific horizon, e.g., time-based or query-based, until completion.

    The kiln schedule mined from the event logs was referred to as the general kiln schedule. The experiment on the case study dataset showed a representative kiln schedule. Although the error rate is relatively high, this study has shown that the proposed framework can represent a kiln schedule to be used as a reference for the kiln operator. Specific kiln schedules according to the time-based and query-based horizons were also generated. Both approaches imply that the mined kiln schedule can be used as a reference for the kiln operator. As with the general kiln schedule, which also results in a relatively high error rate, the mined kiln schedule could serve as a model for kiln process prediction. Additionally, an experiment with traditional validation, i.e., an 80-20 ratio, was also used to explain the potential of the mined kiln schedule.

    Some issues remain open for further work. First, additional features are required for a comprehensive analysis; for example, time-based features, instead of states, may result in a more reliable kiln schedule. Second, more complete historical data will help analysts elaborate with correlation analysis. Third, the problem of overfitting and underfitting occurs when the test case is a new kiln drying process; the analysis results show that the error rate is high due to the low amount of historical data. Finally, uncontrollable factors such as the local weather can be additional features for determining a better kiln schedule.

    Figure

    Figure 1. Decision support framework for generating kiln schedule and predicting completion time.

    Figure 2. Transition system using time horizon h = 1 (left) and h = 2 (right).

    Figure 3. Kiln schedule discovery of DS1 using FSM miner (using set abstraction, H = 1).

    Figure 4. Kiln schedule discovery of DS2 using FSM miner (using set abstraction, H = 1).

    Table

    Table 1. Example of continuously varying schedules (t: the drying time in hours)

    Table 2. Example fragment of kiln process log sequential data

    Table 3. An example result of the kiln schedule mining model using the data in Table 2 (total duration = 46 hours)

    Table 4. Descriptive statistics of the dataset DS1

    Table 5. The number of cases based on wood type

    ¹ For privacy purposes, wood types are anonymized to represent the wood species names.

    Table 6. Descriptive statistics of the dataset DS2 for a specific wood type (W6)

    Table 7. Analysis result of the general kiln schedule using dataset DS1

    Table 8. Analysis result of a kiln schedule of a particular wood type using dataset DS2

    Table 9. Analysis result with query data ADB = 30, AWB = 29, Initial MC = 48, Threshold = 10

    Table 10. Analysis result using conventional validation with 80-20 ratio (DS1)

    Table 11. Analysis result using conventional validation with 80-20 ratio (DS2)

    REFERENCES

    1. R.S. Boone , C.J. Kozlik , P.J. Bois , E.M. Wengert (1988) Dry kiln schedules for commercial woods: Temperate and tropical, General Technical Report FPL-GTR-57, Forest Products Laboratory, US. ,
    2. G. Bramball (1975) Calculating kiln schedule changes, Forest Products Laboratory, British Columbia.,
    3. G. Bramball (1975) The seasoning of regrowth eucalyptus,, Australian For. Ind. J.,, Vol.41 (11) ; pp.31-33
    4. A.K.A. De Medeiros , A.J.M.M. Weijters , W.M.P. van der Aalst (2007) Genetic process mining: An experimental evaluation., Data Min. Knowl. Discov., Vol.14 (2) ; pp.245-304
    5. D.R. Ferreira , F. Szimanski , C.G. Ralha (2013) Mining the low-level behavior of agents in high-level business processes., International Journal of Business Process Integration and Management, Vol.6 (2) ; pp.146-166
    6. K.S. Gan , A.R. Zairul , J.L. Tan (2015) Effectiveness of pretreatments on acacia mangium for conventional steam-heated kiln drying., J. Trop. For. Sci., Vol.27 (1) ; pp.127-135
    7. N. Gattani , E. del Castillo , C. D. Ray (2005) Time series analysis and control of a dry kiln., Wood and Fiber Science, Vol.3 ; pp.472-483
    8. C.W. Günther , W.M.P. van der Aalst (2006) Mining activity clusters from low-level event logs, BETA Working Paper Series, Eindhoven: Eindhoven University of Technology.,
    9. C.W. Günther , W.M.P. van der Aalst (2007) Fuzzy mining-adaptive process simplification based on multi-perspective metrics, Proceedings of the International Conference on Business Process Management, ; pp.328-343
    10. A.A. Jara , E.D. Bello , S.V.A. Castillo , V.A. Fernandez , P.S. Madamba (2008) Use of relative density-based schedules in kiln-drying big-leafed mahogany (Swietenia macrophylla King) lumber., Philipp. J. Sci., Vol.137 (2) ; pp.159-167
    11. D. Jeon , B.N. Yahya , H. Bae , M. Song , S. Sul , R.A. Sutrisnowati (2013) Conceptual framework for container-handling process analytics., ICIC Express Lett., Vol.7 (6) ; pp.1919-1924
    12. R.L. Little , R.L. Toenisson (1989) Drying hardwood lumber using computer controlled mini-step schedules, Proceedings of the I.U.F.R.O. International Wood Drying Symposium,
    13. F. Mannhardt , M. de Leoni , H.A. Reijers , W.M.P. van der Aalst , P.J. Toussaint (2016) From low-level events to activities: A pattern-based approach, Proceedings of the Business Process Management, ; pp.125-141
    14. P. Marier , J. Gaudreault , T. Noguer (2016) Kiln drying operation scheduling with dynamic composition of loading patterns, Proceedings of the 6th International Conference on Information Systems Logistics and Supply Chain,
    15. R. Mills (1991) Australian Timber Seasoning Manual., Australian Furniture Research and Development Institute Ltd.,
    16. N.M. Nassif (1983) Continuously varying schedule (CVS): A new technique in wood drying., Wood Sci. Technol., Vol.17 (2) ; pp.139-141
    17. R.J. Neumann (1989) Kiln drying young eucalyptus globulus boards from green, Proceedings of the I.U.F.R.O. International Wood Drying Symposium, ; pp.107-115
    18. J. Ofori , B. Brentuo (2010) Drying characteristics and development of kiln schedules., Ghana J. For., Vol.26 ; pp.50-60
    19. C.N. Pandey , B.K. Gaur , H.C. Kanojia , A. Chandra (1984) A new approach to seasoning of eucalyptus hybrid (Eucalyptus tereticornis)., Indian For., Vol.110 (2) ; pp.117-121
    20. W.T. Simpson , X. Wang (2001) Time-based kiln drying schedule for sugar maple for structural uses, Research Note FPL-RN-0279, Forest Products Laboratory, US.,
    21. W.T. Simpson (1980) Accelerating the kiln drying of oak, Forest Service Research Paper FPL 378, Forest Products Laboratory, Wisconsin, U.S.A.,
    22. W.T. Simpson (1991) Dry Kiln Operators Manual., Forest Service, Forest Products Laboratory, Department of Agriculture, U.S.A.,
    23. H.H. Smith , O.W. Torgeson (1951) Kiln schedule for black walnut gunstock blanks, Report R1433, University of Wisconsin, USA.,
    24. H.P. Stöhr , D. Mackay (1984) Drying schedule development for young eucalyptus grandis timber, C.S.I.R Special Report Hout 354, Project no. TP/43495, Pretoria, South Africa.,
    25. H.P. Stöhr , D. Mackay , H.C. Davies (1985) Industrial implementation of high humidity orientated schedules in the drying of young 25 mm E. grandis boards., South African Forestry Journal, Vol.134 (1) ; pp.16-21
    26. W.M.P. van der Aalst (2011) Process mining: Discovery, Conformance and Enhancement of Business Processes., Springer-Verlag Berlin Heidelberg,
    27. W.M.P. van der Aalst , M.H. Schonenberg , M. Song (2011) Time prediction based on process mining., Inf. Syst., Vol.36 (2) ; pp.450-475
    28. H.F. Vermaas , C.J. Neville (1989) Evaluation of low temperature and accelerated low temperature drying schedules for Eucalyptus grandis., Holzforschung, Vol.43 (3) ; pp.207-212
    29. X. Wang , W.T. Simpson , B.K. Brashaw , R.J. Ross (2002) Kiln drying maple for structural uses, Proceedings of the 30th Hardwood Symposium, ; pp.63-68
    30. A.J.M.M. Weijters , W.M.P. van der Aalst , A.K.A. de Medeiros (2006) Process mining with heuristic miner algorithm, BETA Working Paper Series, Eindhoven: Eindhoven University of Technology.,
    31. S. Wen , M. Deng , A. Inoue (2012) Moisture content prediction of wood drying process using SVM-based model., Int. J. Innov. Comput., Inf. Control, Vol.8 (6) ; pp.4083-4093
    32. B.N. Yahya (2014) The development of manufacturing process analysis: Lesson learned from process mining., Jurnal Teknik Industri, Vol.16 (2) ; pp.95-106
    33. B.N. Yahya , M. Song , H. Bae , S. Sul , J.Z. Wu (2016) Domain-driven actionable process model discovery., Comput. Ind. Eng., Vol.99 ; pp.382-400