New Log Visualizer: Log Skeleton Filter and Browser

Today, I have released the first version of the Log Skeleton Filter and Browser. This plug-in is a new visualizer for event logs. See below for some example visualizations.

To install the plug-in, install the LogSkeleton package using the ProM Package Manager of the ProM Nightly Build. Once it is installed, you can use it to visualize an event log. At the moment, the visualizer does not support log classifiers: it simply uses the concept:name attribute. If you want this plug-in to visualize your log, make sure that every event and every trace has a value for this attribute.
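If you want to check this before loading a log, the following is a minimal sketch in Python. It assumes the log is stored as an XES file, in which an attribute is written as a string element with a key and a value; the function names and the sample log are made up for illustration and are not part of the plug-in.

```python
import xml.etree.ElementTree as ET

# Assumption: the log is an XES document in which an attribute looks like
# <string key="concept:name" value="..."/> inside a <trace> or an <event>.

def local_name(tag):
    """Strip an optional XML namespace such as '{http://...}trace'."""
    return tag.rsplit("}", 1)[-1]

def has_concept_name(element):
    """True if the element has a non-empty concept:name string attribute."""
    return any(
        local_name(child.tag) == "string"
        and child.get("key") == "concept:name"
        and bool(child.get("value"))
        for child in element
    )

def missing_concept_names(xes_text):
    """List the traces and events that lack a concept:name value."""
    root = ET.fromstring(xes_text)
    missing = []
    traces = [el for el in root if local_name(el.tag) == "trace"]
    for t_index, trace in enumerate(traces):
        if not has_concept_name(trace):
            missing.append(("trace", t_index))
        events = [el for el in trace if local_name(el.tag) == "event"]
        for e_index, event in enumerate(events):
            if not has_concept_name(event):
                missing.append(("event", t_index, e_index))
    return missing

# Hypothetical two-trace log: one event and one trace lack a name.
SAMPLE = """<log>
  <trace>
    <string key="concept:name" value="case 1"/>
    <event><string key="concept:name" value="Register"/></event>
    <event><date key="time:timestamp" value="2018-01-01T00:00:00"/></event>
  </trace>
  <trace>
    <event><string key="concept:name" value="Register"/></event>
  </trace>
</log>"""

problems = missing_concept_names(SAMPLE)
print(problems)
```

An empty result means that every trace and every event carries a name. Note that this sketch treats an empty value as missing, which is a choice made here and not necessarily what the plug-in does.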

Log Skeletons

This shows a typical log skeleton. A log skeleton is a graph, where every node corresponds to an activity and every edge corresponds to a constraint between two (not necessarily different) nodes.

Activities

This shows a typical node, which corresponds to an activity. In this case, the activity is Register+complete, which has occurred 1000 times in the log, and has AnalyzeDefect+complete as equivalence class. This equivalence class will be explained later on.

Apart from the regular activities, a log skeleton contains two artificial activities: |> (start of a trace) and [] (end of a trace).

This shows that the log contains 1000 traces.

Constraints

A log skeleton may contain six different edge types: One edge type for every possible constraint. Possible constraints are Always Together, Always Before, Always After, Never Together, Next (One Way), and Next (Both Ways).

Always Together

The Always Together constraint is visualized by an open box on each end of the edge. In a log skeleton, an open box roughly translates to Always, and the end at which it is placed determines the viewpoint for the constraint. As this constraint is symmetrical, it is placed at both edge ends.

This shows an Always Together constraint between Register+complete and AnalyzeDefect+start. This constraint indicates that both activities occur equally often in every trace. As a result, if Register+complete occurs n times in some trace, then so does AnalyzeDefect+start.

This shows that both activities are Always Together with the artificial start activity |>. From this, we may conclude that both activities occur exactly once in every trace.

This constraint determines the equivalence class of an activity: Two activities related by the Always Together constraint are considered to be equivalent. As a result, they share the same color in the visualization of the log skeleton.
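To make the Always Together constraint concrete, here is a minimal sketch in Python. It assumes that a log is simply a list of traces and that a trace is a list of activity names; the function names and the tiny log are hypothetical, and this is not the code of the plug-in. Two activities end up in the same class if they occur equally often in every trace.

```python
from collections import Counter, defaultdict

START, END = "|>", "[]"  # the two artificial activities

def with_artificial(traces):
    """Wrap every trace in the artificial start and end activities."""
    return [[START, *trace, END] for trace in traces]

def always_together_classes(traces):
    """Group activities whose occurrence counts agree in every trace."""
    activities = sorted({a for trace in traces for a in trace})
    counts = [Counter(trace) for trace in traces]
    classes = defaultdict(list)
    for activity in activities:
        # Activities with the same per-trace counts share a signature.
        signature = tuple(c[activity] for c in counts)
        classes[signature].append(activity)
    return sorted(classes.values())

# Hypothetical log with two traces.
log = with_artificial([
    ["Register", "Analyze", "Repair", "Repair", "Archive"],
    ["Register", "Analyze", "Archive"],
])
classes = always_together_classes(log)
for members in classes:
    print(members)
```

In this example, Register, Analyze and Archive share a class with the artificial activities, so each of them occurs exactly once in every trace. Repair occurs twice in the first trace and not at all in the second, so it ends up in a class of its own.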

Always Before

The Always Before constraint is visualized by an arrow with the open box (Always) at the head.

This indicates that every occurrence of AnalyzeDefect+start is preceded in the trace by an occurrence of Register+complete: If you stand on an AnalyzeDefect+start, and look towards the start of that trace, you will see a Register+complete somewhere.

Always After

The Always After constraint is visualized by an arrow with the open box (Always) at the tail.

This indicates that every occurrence of Register+complete is followed in the trace by an occurrence of AnalyzeDefect+start: If you stand on a Register+complete, and look towards the end of that trace, you will see an AnalyzeDefect+start somewhere.

Both the Always Before and the Always After constraints provide a sense of direction in the log skeleton. In the example, it is clear that Register+complete always has to occur first, after which AnalyzeDefect+start has to occur.
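The "stand on an activity and look in one direction" reading of these two constraints can be sketched in Python as follows. This assumes a log is a list of traces and a trace is a list of activity names; the function names and the example log are made up for illustration.

```python
def always_preceded_by(traces, activity, other):
    """Every occurrence of `activity` has some `other` earlier in its trace."""
    return all(
        other in trace[:i]
        for trace in traces
        for i, event in enumerate(trace)
        if event == activity
    )

def always_followed_by(traces, activity, other):
    """Every occurrence of `activity` has some `other` later in its trace."""
    return all(
        other in trace[i + 1:]
        for trace in traces
        for i, event in enumerate(trace)
        if event == activity
    )

# Hypothetical log with two traces.
log = [
    ["Register", "Analyze", "Repair", "Analyze"],
    ["Register", "Archive"],
]

print(always_preceded_by(log, "Analyze", "Register"))
print(always_followed_by(log, "Register", "Analyze"))
```

In this log, every Analyze has a Register before it, so the first check holds. The second check fails, because the Register in the second trace is never followed by an Analyze. Note that in this sketch a check holds trivially for an activity that never occurs.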

Never Together

The Never Together constraint is visualized by a closed box (Never) on each end of the edge.

This indicates that Register+complete never occurs together with itself: If you stand on a Register+complete, and look at both the start and end of the trace, you will see no Register+complete (note that we assume you cannot see the node you’re standing on). As a result, we may conclude that Register+complete occurs at most once in every trace.
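The same way of looking works for Never Together, as the following Python sketch shows. It assumes a log is a list of traces and a trace is a list of activity names; the function name and the example log are hypothetical. As in the text, the position you are standing on is excluded from the view.

```python
def never_together(traces, activity, other):
    """No occurrence of `activity` shares a trace with an `other` elsewhere."""
    return all(
        # Look at everything before and after position i, but not at i itself.
        other not in trace[:i] + trace[i + 1:]
        for trace in traces
        for i, event in enumerate(trace)
        if event == activity
    )

# Hypothetical log with two traces.
log = [
    ["Register", "Simple", "Archive"],
    ["Register", "Complex", "Complex", "Archive"],
]

print(never_together(log, "Register", "Register"))
print(never_together(log, "Complex", "Complex"))
print(never_together(log, "Simple", "Complex"))
```

Here Register is never together with itself, because it occurs at most once in every trace. Complex is not, because it occurs twice in the second trace. Simple and Complex are never together, because no trace contains both.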

The two remaining constraints correspond to the directly-follows graph, which is well known in the area of process mining. However, as we think the one-way edges in this graph are more informative than the two-way edges, we have split its edges over the two remaining types.

Next (One Way)

The Next (One Way) constraint is visualized by an open dot (Next) on the tail of the edge.

This indicates that in the log Register+complete was 1000 times directly followed by AnalyzeDefect+start, and never the other way around.

Next (Both Ways)

The Next (Both Ways) constraint is visualized by an open dot (Next) on each end of the edge. Furthermore, the arrow at the tail is drawn differently, so that the source can be distinguished from the target.

This shows that in the log Repair (Complex)+start was 284 times directly followed by Inform User+complete, and 83 times the other way around.
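The split of the directly-follows graph into one-way and two-way edges can be sketched in Python as follows. This assumes a log is a list of traces and a trace is a list of activity names; the function names and the example log are made up, and how the plug-in itself treats an activity that directly follows itself is not covered by the text above.

```python
from collections import Counter

def directly_follows(traces):
    """Count how often one activity is directly followed by another."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts

def split_next_edges(counts):
    """Split directly-follows pairs into one-way and both-ways edges."""
    one_way, both_ways = {}, {}
    for (a, b), n in counts.items():
        reverse = counts[(b, a)]  # a Counter yields 0 for an unseen pair
        if reverse == 0:
            one_way[(a, b)] = n
        elif (b, a) not in both_ways:
            # Keep one entry per pair: (count a -> b, count b -> a).
            # Assumption: a self-loop (a == b) also ends up here.
            both_ways[(a, b)] = (n, reverse)
    return one_way, both_ways

# Hypothetical log with three traces.
log = [
    ["Register", "Analyze", "Repair", "Inform"],
    ["Register", "Analyze", "Repair", "Inform"],
    ["Register", "Analyze", "Inform", "Repair"],
]

one_way, both_ways = split_next_edges(directly_follows(log))
print(one_way)
print(both_ways)
```

In this log, Register is directly followed by Analyze three times and never the other way around, which gives a Next (One Way) edge. Repair is directly followed by Inform twice, and Inform by Repair once, which gives a Next (Both Ways) edge carrying both counts.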

Example Log Skeletons

This section shows the log skeletons for the noise-free logs from the data sets listed below.

DMKD Data Set

From: Maruster, L., Weijters, A. J. M. M., Aalst, W. M. P. v. d., and Bosch, A. v. d. (2006) A rule-based approach for process discovery: Dealing with noise and imbalance in process logs. Data Min. Knowl. Disc., 13, 67-87.

20 synthetic event logs generated from 4 Petri nets, containing 12, 22, 32, and 42 activities, 1000 traces, and different noise levels. This data set uses case labels like a32f0n10, where a32 indicates that this case contains 32 activities, and n10 indicates that noise was introduced in 10% of the traces.

a12
a22
a32
a42

IS 2014

From: Munoz-Gama, J., Carmona, J., and Aalst, W. M. P. v. d. (2014) Single-entry single-exit decomposed conformance checking. Inf. Syst., 46, 102-122.

32 synthetic event logs generated from 4 highly structured Petri nets, containing 59, 48, 32, and 57 activities, 2000 traces, 4 different average trace lengths (approx. 15-55), with and without noise. This data set uses case labels like 59-55, where 59 indicates the reported number of activities and 55 indicates the average trace length.

59-17
59-29
59-41
59-55
48-12
48-23
48-37
48-50
32-18
32-27
32-34
32-41
57-15
57-26
57-39
57-52

Visualizer plug-in

The figure above shows the Log Skeleton Filter and Browser on the a32f0n00 event log.

In the middle, the log skeleton is visualized.

On the right, the user can select which activities to show and which constraints to show. As soon as the user changes the selection, the visualization in the middle will be updated. Furthermore, the user can decide to open a new window visualizing the current log skeleton. This allows for easy comparison between different log skeletons, if needed.

On the left, the user can select activity filters and activity splitters, which are applied to the log before a log skeleton is built for it. The user can select required activities (only traces in which all of these activities occur are kept) and forbidden activities (only traces in which none of these activities occur are kept). Furthermore, the user can enter activity splitters by providing the name of the activity to split and the name of the activity it should be split on.

The user needs to apply these filters and splitters using the button below. The log is then first filtered and then split, after which a log skeleton is built for it and visualized to the right. This new visualization always starts with all activities selected and with the Always Before and Always After constraints selected (note that the Always Together constraint is always visualized using the colors).

If both the Always Before and Always After constraints are selected and the Next (One Way) constraint is not, then the visualizer adds a Next (One Way) constraint from an activity to another activity if there is no Always constraint from that activity to that other activity and if the Next (One Way) constraint occurs for at least 20% of the occurrences of either the activity or the other activity.
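The required and forbidden activity filters, and the 20% rule, can be sketched in Python as follows. This assumes a log is a list of traces and a trace is a list of activity names. The function names are hypothetical, and the threshold check is one possible reading of the rule (the edge count is compared with the number of occurrences of the source and of the target activity); it is not taken from the code of the plug-in. Activity splitters are left out.

```python
def filter_traces(traces, required=(), forbidden=()):
    """Keep traces that contain all required and no forbidden activities."""
    required, forbidden = set(required), set(forbidden)
    return [
        trace for trace in traces
        if required <= set(trace) and forbidden.isdisjoint(trace)
    ]

def passes_next_threshold(next_count, source_count, target_count,
                          threshold=0.2):
    """Assumed reading of the 20% rule: the edge covers at least
    `threshold` of the occurrences of the source or of the target."""
    return (next_count >= threshold * source_count
            or next_count >= threshold * target_count)

# Hypothetical log with three traces.
log = [
    ["Register", "Simple", "Archive"],
    ["Register", "Complex", "Archive"],
    ["Register", "Archive"],
]

kept = filter_traces(log, required=["Register", "Archive"],
                     forbidden=["Complex"])
print(kept)
print(passes_next_threshold(284, 1000, 1000))
print(passes_next_threshold(83, 1000, 1000))
```

The filter keeps the first and the third trace, as the second one contains the forbidden activity Complex. With 1000 occurrences of both activities, an edge that occurs 284 times passes the threshold, whereas an edge that occurs 83 times does not.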