Conformance Checker

Most information systems log events (e.g., in transaction logs or audit trails) to audit and monitor the processes they support. While process mining can be used to discover a process model from a given event log, explicit process models describing how a business process should (or is expected to) be executed are frequently available. Together with the data recorded in the log, this raises the interesting question “Do the model and the log conform to each other?” [1,3,4]. Analyzing the gap between a model and the real world helps both to detect violations (i.e., the real world does not “behave properly”) and to ensure transparency (as the model may, e.g., be outdated). The Conformance Checker has been applied to, for example, administrative processes of a municipality in The Netherlands. In [2], the question of conformance has been investigated in the context of web services.

Prerequisites

Note that there are example files you can use to play with the Conformance Checker. They are contained in your ProM folder at /examples/conformanceChecking/. Alternatively, they can be downloaded here.

Analysis Settings

Before the actual analysis is started, one can choose which kinds of analysis should be performed. Whole categories or specific metrics can be selected and deselected, and a brief description explains each of these options. The Conformance Checker supports analysis of the (1) Fitness, (2) Precision (or Behavioral Appropriateness), and (3) Structure (or Structural Appropriateness) dimensions.

The following analysis methods are used in order to calculate the conformance metrics:

Conformance Checker Analysis Settings

Figure 1. The analysis settings allow selecting the metrics to be calculated

Typically, the default settings should be just fine for analysis (you can abort each type of analysis if it takes too long). However, here is some advice if you experience performance problems analyzing your log:

Conformance Analysis Results

Generally, the evaluation of conformance can take place in different, orthogonal dimensions. First of all, the behavior allowed by the process model can be assessed. It may be both “too much” and “too little” compared to the behavior recorded in the event log. Second, we can evaluate the structure of the process model.

Selecting Process Instances

By selecting a subset of the process instances in the log, one can restart the analysis for any subset of the event log. If the log has been preprocessed, the cardinality column # indicates the number of process instances summarized by each trace. By default, the diagnostic results are given for all process instances contained in the log, and therefore all log traces are selected. However, it can be interesting to evaluate only, e.g., the 80% most frequent instances to exclude rare behavior and analyze the normal flow of the process; decreasing the percentage of covered process instances keeps the most frequent traces selected. Alternatively, any subset of log traces can be selected manually, and after updating the results one can see what percentage of the whole log is covered by that selection.

Note that, in combination with the button Select Fitting and Invert Selection, one can automatically select the subset of fitting or non-fitting traces in the log in order to, for example, further analyze them separately.
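
As an illustration of the frequency-based selection described above, the following sketch (illustrative only, not the plugin's actual code) keeps the most frequent trace variants until the chosen percentage of process instances is covered; the traceCounts map is assumed to hold one entry per grouped trace, analogous to the cardinality column #.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: select the most frequent trace variants until the chosen
// percentage of all process instances is covered (cf. the cardinality column #).
public class FrequentTraceSelection {
    public static List<String> select(Map<String, Integer> traceCounts, double coverage) {
        int total = traceCounts.values().stream().mapToInt(Integer::intValue).sum();
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(traceCounts.entrySet());
        sorted.sort((a, b) -> b.getValue() - a.getValue()); // most frequent first
        List<String> selected = new ArrayList<>();
        int covered = 0;
        for (Map.Entry<String, Integer> entry : sorted) {
            if (total > 0 && (double) covered / total >= coverage) break;
            selected.add(entry.getKey());
            covered += entry.getValue();
        }
        return selected;
    }
}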

(1) Fitness Analysis

Fitness analysis investigates whether a process model is able to reproduce all execution sequences contained in the log or, viewed from the other angle, whether the log traces comply with the description in the model. We call this the fitness dimension, i.e., the fitness is 100% if every trace in the log “fits” the model description. Fitness analysis thus aims at detecting mismatches between the process specification and the execution of particular process instances.
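
To give an impression of how such a fitness value can be computed, the sketch below implements the token-based fitness formula described in [1]. It assumes that the numbers of missing, consumed, remaining, and produced tokens have already been summed over all replayed traces; it is not the plugin's actual implementation.

// Illustrative only: token-based fitness as described in [1]. During log replay
// the produced (p), consumed (c), missing (m), and remaining (r) tokens are
// counted; missing tokens had to be created artificially to fire a logged task,
// remaining tokens are left behind after replaying a trace.
public class TokenFitness {
    public static double fitness(long missing, long consumed, long remaining, long produced) {
        return 0.5 * (1.0 - (double) missing / consumed)
             + 0.5 * (1.0 - (double) remaining / produced);
    }
}

For a perfectly fitting log no tokens are missing or remaining, so the value is 1.0; every missing or remaining token lowers it.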

There are two perspectives of the fitness analysis results:

Model Perspective

The following metric is calculated from a model perspective in order to measure the degree of fitness:

There are a number of options, which can be used to enhance the visualization of the process model by the indication of:

Conformance Checker Fitness Analysis - Model View

Figure 2. Model view shows places in the model where problems occurred during the log replay (screenshot of analysis of the example files Log_L2.xml and M1_nonFitting.tpn)

Log Perspective

The diagnostic perspective can be changed to visualize the log file or a selected subset of log traces. The following metrics are calculated from a log perspective in order to measure the degree of fitness:

There is one option, which can be used to enhance the visualization of the event log by the indication of:

Conformance Checker Fitness Analysis - Log View

Figure 3. Log view shows where replay problems occurred in the log (screenshot of analysis of the example files Log_L2.xml and M1_nonFitting.tpn)

(2) Behavioral Appropriateness

On the other hand, the process model may allow for more behavior than that recorded in the log. We call the analysis and detection of such “extra behavior” behavioral appropriateness, or the precision dimension, i.e., the precision is 100% if the model “precisely” allows for the behavior observed in the log. This way one can, for example, discover alternative branches that were never used when executing the process.
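
To illustrate the underlying idea only (this is not one of the plugin's behavioral appropriateness metrics), the following sketch estimates precision as the fraction of “directly follows” task pairs allowed by the model that were actually observed in the log; both sets are assumed to contain task pairs encoded as strings.

import java.util.HashSet;
import java.util.Set;

// Illustrative only: the fraction of directly-follows pairs allowed by the
// model that were actually observed in the log. Unused alternatives lower it.
public class NaivePrecision {
    public static double precision(Set<String> modelFollows, Set<String> logFollows) {
        if (modelFollows.isEmpty()) {
            return 1.0; // nothing allowed, so nothing is over-general
        }
        Set<String> used = new HashSet<>(modelFollows);
        used.retainAll(logFollows); // model behavior that was actually exercised
        return (double) used.size() / modelFollows.size();
    }
}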

The following metrics are available in order to measure the degree of behavioral appropriateness:

Note that pressing the buttons Model Relations and Log Relations will output the relations on which the metric is based to the Message console at the bottom of the ProM framework.

Furthermore, there are a number of options, which can be used to enhance the visualization of the process model by the indication of:

Conformance Checker Precision Analysis

Figure 4. Precision analysis of a model allows detecting over-general parts (screenshot of analysis of the example files Log_L2.xml and M6_behaviouralInappropriate.tpn)

(3) Structural Appropriateness

In a process model, structure is the syntactic means by which behavior (i.e., the semantics) can be specified, using the vocabulary of the modeling language (for example, routing nodes such as AND or XOR). However, often there are several syntactic ways to express the same behavior, and there may be “preferred” (for example, easier to understand) and “less suitable” representations. Clearly, this evaluation dimension highly depends on the process modeling formalism and is difficult to assess in an objective way (after all, there may be personal, or even corporate preferences). However, it is possible to formulate and evaluate certain “design guidelines”, such as calling for a minimal number of duplicate tasks in the model.
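
The duplicate-tasks guideline, for instance, can be checked mechanically. The following sketch (illustrative only; the plugin's structural appropriateness metrics relate such counts to the overall model size) simply counts the task nodes that share their label with another task node.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: count task nodes whose label occurs on more than one node.
public class DuplicateTasks {
    public static int countDuplicateTaskNodes(List<String> taskLabels) {
        Map<String, Integer> counts = new HashMap<>();
        for (String label : taskLabels) {
            counts.merge(label, 1, Integer::sum);
        }
        int duplicates = 0;
        for (int c : counts.values()) {
            if (c > 1) {
                duplicates += c; // every node sharing a label counts as a duplicate
            }
        }
        return duplicates;
    }
}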

The following metrics are available in order to measure the degree of structural appropriateness:

Furthermore, there are a number of options, which can be used to enhance the visualization of the process model by the indication of:

Conformance Checker Structure Analysis

Figure 5. Structural analysis detects duplicate tasks that list alternative behavior, as well as redundant tasks (screenshot of analysis of the example files Log_L2.xml and M5_structuralInappropriate.tpn)

Further Steps

The Conformance Checker provides a number of result objects to the ProM framework for further processing, i.e., they can be used as input for other analysis methods or exported to a file.

Limitations

Note that in the presence of invisible and duplicate tasks the log replay is not always guaranteed to find the optimal solution. For example, where possible we choose the shortest sequence of invisible tasks that enables the currently replayed task. From a global viewpoint, however, firing some longer sequence might produce exactly those tokens that are needed at a later stage of the replay. Dealing with this issue in a global manner (i.e., minimizing the number of missing and remaining tokens over the whole log replay) seems intractable for complexity reasons; see [1] for further details. In practice, however, the algorithms work well in most cases while keeping the metrics computable for practical situations.

In short: if the fitness metric yields 1.0, then you can be sure that the trace is 100% fitting. However, if the metric is smaller than 1.0 (and your Petri net contains invisible or duplicate tasks), this does not necessarily mean that the trace really does not fit the model: there may be some other way of replaying it without errors that the log replay did not find.

Using the Conformance Metrics from your Own Code

If you plan to use one of the metrics calculated by the Conformance Checker in your own code, note that in the context of the Control Flow Benchmark Plugin (see its help page) we implemented the conformance measures as so-called 'benchmark metrics'. For example, you can find the fitness metric f in the class org.processmining.analysis.benchmark.metric.TokenFitnessMetric. The advantage is that only the particular metric you want to use is calculated (calculating all metrics at once can take too much time), and that it is easy to use.

From your code you could then simply call it like this:

// inputModel and inputLog are the Petri net and event log objects obtained from the framework
TokenFitnessMetric fitness = new TokenFitnessMetric();
double fitnessResult = fitness.measure(inputModel, inputLog, null, new Progress(""));
System.out.println("Fitness: " + fitnessResult);

Publications

Conformance Checking of Processes Based on Monitoring Real Behavior

A. Rozinat and W.M.P. van der Aalst
Information Systems, Volume 33, Issue 1, Pages 64-95, 2008


Choreography conformance checking: an approach based on BPEL and Petri nets (extended version)

W.M.P. van der Aalst, M. Dumas, C. Ouyang, A. Rozinat, and H.M.W. Verbeek
In F. Leymann, W. Reisig, S.R. Thatte, and W.M.P. van der Aalst, editors, Dagstuhl Seminar Proceedings (The Role of Business Processes in Service Oriented Architectures, Vol. 6291). Dagstuhl: Internationales Begegnungs- und Forschungszentrum für Informatik, 2006

W.M.P. van der Aalst, M. Dumas, C. Ouyang, A. Rozinat, and H.M.W. Verbeek
BPM Center Report BPM-05-25, BPMcenter.org, 2005


Conformance Testing: Measuring the Fit and Appropriateness of Event Logs and Process Models

A. Rozinat and W.M.P. van der Aalst
Workshop on Business Process Intelligence (BPI), Nancy, 2005

A. Rozinat and W.M.P. van der Aalst
In C. Bussler et al., editors, Business Process Management 2005 Workshops, volume 3812 of Lecture Notes in Computer Science, pages 163–176. Springer-Verlag, Berlin, 2006


Conformance Testing: Measuring the Alignment Between Event Logs and Process Models

A. Rozinat and W.M.P. van der Aalst
BETA Working Paper Series, WP 144, Eindhoven University of Technology, Eindhoven, 2005

A. Rozinat and W.M.P. van der Aalst
BPM Center Report BPM-05-17, BPMcenter.org, 2005
