Log Skeleton Graph

Version 6.11.158 of the LogSkeleton package introduces a new model called the Log Skeleton Graph. A Log Skeleton Graph is constructed from the Log Skeleton, and the Dot visualization is in turn constructed from the Log Skeleton Graph.

A Log Skeleton Graph contains the following:

  • A set of Log Skeleton Nodes
  • A set of Log Skeleton Edges
  • A label (or title)
  • A list of legend lines

Log Skeleton Node

A Log Skeleton Node represents an activity in the Log Skeleton Graph. It contains the following:

  • A set of outgoing Log Skeleton Edges
  • A set of incoming Log Skeleton Edges
  • The label of the activity
  • The label of the representative activity of the corresponding equivalence class
  • The number of times the activity occurs in the event log
  • The lowest number of times the activity occurs in any trace in the event log
  • The highest number of times the activity occurs in any trace in the event log
  • Whether the activity is selected by the user

Log Skeleton Edge

A Log Skeleton Edge represents the most important relations between two Log Skeleton Nodes. From one Log Skeleton Node (the tail node) to another Log Skeleton Node (the head node) there is at most one Log Skeleton Edge. A Log Skeleton Edge contains the following:

  • The tail Log Skeleton Node
  • The head Log Skeleton Node
  • The type of the tail, which can be ALWAYS, NEVER, or EXCLUSIVE:
    • ALWAYS indicates a Response relation from the tail node to the head node: The tail node is always followed by the head node.
    • NEVER indicates a Not Precedence relation from the tail node to the head node: The tail node is never preceded by the head node.
    • EXCLUSIVE indicates a Not Co-Existence relation from the tail node to the head node: The tail node never co-occurs with the head node.
  • The type of the head, which can also be ALWAYS, NEVER, or EXCLUSIVE:
    • ALWAYS indicates a Precedence relation from the head node to the tail node: The head node is always preceded by the tail node.
    • NEVER indicates a Not Response relation from the head node to the tail node: The head node is never followed by the tail node.
    • EXCLUSIVE indicates a Not Co-Existence relation from the head node to the tail node: The head node never co-occurs with the tail node.
  • The percentage of the tail
  • The percentage of the head
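
The three structures above can be sketched as plain data classes. This is only an illustration in Python, not the package's actual code: all class, field, and method names (Node, Edge, Graph, add_edge, and so on) are assumptions, as is the choice to allow a missing tail or head type.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class EdgeType(Enum):
    ALWAYS = "ALWAYS"
    NEVER = "NEVER"
    EXCLUSIVE = "EXCLUSIVE"


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so nodes fit in sets
class Node:
    label: str            # label of the activity
    representative: str   # label of the representative of the equivalence class
    count: int            # occurrences in the event log
    low: int              # lowest number of occurrences in any trace
    high: int             # highest number of occurrences in any trace
    selected: bool = True
    outgoing: set = field(default_factory=set)  # set of Edge
    incoming: set = field(default_factory=set)  # set of Edge


@dataclass(eq=False)
class Edge:
    tail: Node
    head: Node
    tail_type: Optional[EdgeType] = None  # None is an assumption: no relation at this end
    head_type: Optional[EdgeType] = None
    tail_percentage: int = 100
    head_percentage: int = 100


@dataclass
class Graph:
    title: str
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)
    legend: list = field(default_factory=list)

    def add_edge(self, edge: Edge) -> Edge:
        # Enforce at most one edge from a given tail node to a given head node.
        if any(e.head is edge.head for e in edge.tail.outgoing):
            raise ValueError("an edge from this tail to this head already exists")
        self.edges.add(edge)
        edge.tail.outgoing.add(edge)
        edge.head.incoming.add(edge)
        return edge
```

Keeping the incoming and outgoing edge sets on the nodes in sync inside add_edge is one possible design; it makes the "at most one edge per tail and head" rule cheap to check.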

Constructing the Log Skeleton Graph from a Log Skeleton

This construction takes the configuration as selected by the user into account.

Log Skeleton Graph

The set of Log Skeleton Nodes contains two types of activities:

  1. The activities as selected by the user (selected activities).
  2. If selected by the user, the activities that have a user-selected non-redundant relation to a user-selected activity (neighbor activities).

The set of Log Skeleton Edges contains an edge from a tail Log Skeleton Node to a head Log Skeleton Node if there exists a user-selected non-redundant relation from the tail activity to the head activity.
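
As a rough sketch of how the node set could be collected, assume the user-selected non-redundant relations are given as (tail, head) pairs of activity labels. The function name and parameters are hypothetical, and treating a relation in either direction as making an activity a neighbor is an assumption.

```python
def graph_activities(selected, relations, include_neighbors):
    """Return the activities that get a node in the graph.

    selected: set of activity labels selected by the user
    relations: set of (tail, head) pairs of user-selected non-redundant relations
    include_neighbors: whether the user asked for neighbor activities
    """
    activities = set(selected)
    if include_neighbors:
        for tail, head in relations:
            # An activity related to a selected activity becomes a neighbor.
            if tail in selected:
                activities.add(head)
            if head in selected:
                activities.add(tail)
    return activities


relations = {("a", "b"), ("c", "a"), ("d", "e")}
print(sorted(graph_activities({"a"}, relations, True)))   # ['a', 'b', 'c']
print(sorted(graph_activities({"a"}, relations, False)))  # ['a']
```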

Redundant relations

  • A Response relation from a first activity a to a second activity b is called redundant if and only if there is a third activity c such that a has c as Response and c has b as Response: If a is always followed by c and c is always followed by b, then a is always followed by b.
  • A Precedence relation from a first activity a to a second activity b is called redundant if and only if there is a third activity c such that a has c as Precedence and c has b as Precedence: If a is always preceded by c and c is always preceded by b, then a is always preceded by b.
  • A Not Response relation from a first activity a to a second activity b is called redundant if and only if there is a third activity c such that a has c as Not Response and c has b as Not Response: If a, b, and c all occur in the same trace, then c has to precede a and b has to precede c, hence b has to precede a. However, if the three activities do not occur in the same trace, the Not Response relation from a to b may not be redundant. Consider, for example, a log with three traces: <a,b>, <b,c>, and <c,a>. As a result, we may call some relation redundant while in fact it is not redundant. The edges in the Log Skeleton Graph may therefore be incomplete, but they are all still correct.
  • A Not Precedence relation from a first activity a to a second activity b is called redundant if and only if there is a third activity c such that a has c as Not Precedence and c has b as Not Precedence: If a, b, and c all occur in the same trace, then c has to follow a and b has to follow c, hence b has to follow a. However, if the three activities do not occur in the same trace, the Not Precedence relation from a to b may not be redundant. Consider, for example, a log with three traces: <b,a>, <c,b>, and <a,c>. As a result, we may call some relation redundant while in fact it is not redundant. The edges in the Log Skeleton Graph may therefore be incomplete, but they are all still correct.
  • Whether a Not Co-Existence relation is called redundant depends on the selection as made by the user. The user can choose whether or not to use the Equivalence relation and the Precedence relation for this redundancy:
    • If neither Equivalence nor Precedence is selected, then no relation is called redundant.
    • If only Equivalence is selected, then a Not Co-Existence relation from activity a to activity b is called redundant if a or b is not the representative for its Equivalence class.
    • If only Precedence is selected, then a Not Co-Existence relation from activity a to activity b is called redundant if there is an activity c such that either a has c as Precedence and c has b as Not Co-Existence, or b has c as Precedence and c has a as Not Co-Existence. As a motivation for the first case: If a is always preceded by c and c never occurs together with b, then a and b cannot occur together either. Note that we could also have used Response here instead of Precedence. The reason for using Precedence is that this tends to keep only the Not Co-Existence relation that is ‘first in the flow’, that is, where the decision between a and b has to be made for the first time.
    • If both are selected, then a Not Co-Existence relation from activity a to activity b is called redundant if it would be redundant with only Equivalence selected and with only Precedence selected.
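
The "third activity" rules above can be sketched as follows. The sketch assumes that each relation is stored as a set of (a, b) pairs meaning "a has b as <relation>", and that the third activity c differs from both a and b in the transitive rule; the function names are hypothetical. The first function fits any of the four transitive rules (Response, Precedence, Not Response, Not Precedence), and the second covers the Precedence-only case for Not Co-Existence.

```python
def non_redundant(relation):
    """Drop every pair (a, b) for which some c has (a, c) and (c, b) in the relation."""
    activities = {x for pair in relation for x in pair}
    kept = set()
    for a, b in relation:
        redundant = any(
            c not in (a, b) and (a, c) in relation and (c, b) in relation
            for c in activities
        )
        if not redundant:
            kept.add((a, b))
    return kept


def nce_redundant_by_precedence(a, b, precedence, nce):
    """Not Co-Existence from a to b is redundant if a (or b) is always preceded
    by some c that never co-occurs with b (or a)."""
    activities = {x for pair in precedence | nce for x in pair}
    return any(
        ((a, c) in precedence and (c, b) in nce)
        or ((b, c) in precedence and (c, a) in nce)
        for c in activities
    )


# a is always followed by c, c by b, and hence a by b: the last pair is redundant.
response = {("a", "c"), ("c", "b"), ("a", "b")}
print(sorted(non_redundant(response)))  # [('a', 'c'), ('c', 'b')]
```

For Not Response and Not Precedence, this check can discard a relation that is not truly redundant, as noted in the text; the sketch does nothing to repair that.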

The label of the event log is used as the label of the Log Skeleton Graph. Useful (non-default) information on the user selection (like which activities and relations are selected) is stored in the legend lines.

Log Skeleton Node

The activity label is used as the label for the Log Skeleton Node. The label of the representative of the equivalence class is set by taking the lexicographically smallest activity label from the activities in the equivalence class. The numbers of times are copied from the Log Skeleton. If the user has selected the activity to be shown (selected activity), this is set accordingly. Otherwise, the activity is a neighbor activity.
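
Choosing the representative can be illustrated in a few lines. The sketch assumes the equivalence classes are available as collections of activity labels; the function name is hypothetical, and Python's default string ordering (by code point) stands in for "lexicographically smallest", which may differ from the ordering the package uses.

```python
def representatives(equivalence_classes):
    """Map every activity label to the smallest label in its equivalence class."""
    return {
        activity: min(cls)
        for cls in equivalence_classes
        for activity in cls
    }


print(representatives([["b", "a"], ["c"]]))  # {'b': 'a', 'a': 'a', 'c': 'c'}
```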

Log Skeleton Edge

The tail and head type are set based on the relations. Here, Response and Precedence take precedence over Not Response and Not Precedence, as we consider the former two to be stronger relations than the latter two. By definition, Not Co-Existence cannot be in conflict with the other four relations.
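
A minimal sketch of this priority rule follows. It assumes the relations that hold at one end of the edge are given as a set of relation names, and it uses plain strings for the types; the function names are hypothetical. The text only fixes that Response and Precedence win over Not Precedence and Not Response, so checking Not Co-Existence last is an assumption.

```python
from typing import Optional


def tail_type(relations_from_tail: set) -> Optional[str]:
    """Type of the tail, given the relations from the tail activity to the head activity."""
    if "Response" in relations_from_tail:
        return "ALWAYS"
    if "Not Precedence" in relations_from_tail:
        return "NEVER"
    if "Not Co-Existence" in relations_from_tail:
        return "EXCLUSIVE"
    return None  # no relation at this end of the edge


def head_type(relations_from_head: set) -> Optional[str]:
    """Type of the head, given the relations from the head activity to the tail activity."""
    if "Precedence" in relations_from_head:
        return "ALWAYS"
    if "Not Response" in relations_from_head:
        return "NEVER"
    if "Not Co-Existence" in relations_from_head:
        return "EXCLUSIVE"
    return None


print(tail_type({"Response", "Not Precedence"}))  # ALWAYS
print(head_type({"Not Response"}))                # NEVER
```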

The tail and head percentages are set to the maximal percentage of activity occurrences for which the relation holds. As an example, if activity a has activity b as Response with a percentage of 98%, then at least 98% of all occurrences of a are indeed followed by some b, but fewer than 99% of them are.
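
Read this way, the percentage is the fraction of occurrences for which the relation holds, rounded down to a whole percent. A small sketch, assuming the two counts are already known (the function and parameter names are hypothetical):

```python
def relation_percentage(holding: int, total: int) -> int:
    """Largest whole percentage p such that at least p% of the occurrences satisfy the relation."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (100 * holding) // total  # integer division rounds down


# 985 of 1000 occurrences of a are followed by some b:
# at least 98% holds, 99% does not.
print(relation_percentage(985, 1000))  # 98
```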

Visualizing the Log Skeleton Graph

The Log Skeleton Graph is visualized by constructing a Dot graph from it.

Log Skeleton Graph

A separate, disconnected node is created to hold all legend lines. This node has a yellow background color and is positioned at the bottom of the Dot graph.

Log Skeleton Node

A Dot node is constructed for every Log Skeleton Node. On the first row, the activity label is shown in a large font. On the second row, the name of the representative, the number of times the activity occurs in the log, and the low and high numbers of times the activity occurs in any trace are shown (if the low number equals the high number, only the low number is shown, otherwise an interval low..high is shown).
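
The content of the second row could be put together as sketched below. Only the rule for the low and high numbers comes from the text; the function names and the use of single spaces as separators are assumptions.

```python
def occurrence_range(low: int, high: int) -> str:
    """Show a single number if low equals high, otherwise the interval low..high."""
    return str(low) if low == high else f"{low}..{high}"


def second_row(representative: str, count: int, low: int, high: int) -> str:
    # The separator between the three parts is an assumption.
    return f"{representative} {count} {occurrence_range(low, high)}"


print(second_row("a", 120, 1, 1))  # a 120 1
print(second_row("a", 120, 0, 3))  # a 120 0..3
```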

The background color depends on the representative activity: All activities in the same Equivalence class will get the same background color. This allows the user to quickly see which activities occur equally often in the event log.

The Dot node has a border if the activity is a selected activity. As a result, neighbor activities are shown without a border.

Log Skeleton Edge

A Dot edge is constructed for every Log Skeleton Edge. The Dot edge has a tail color, a tail symbol, a tail fraction, a head color, a head symbol, and a head fraction.

  • If the tail type is ALWAYS, then the tail color is blue and the tail symbol is a filled arrow pointing to the head node. This indicates that the tail activity is always followed by the head activity.
  • If the head type is ALWAYS, then the head color is blue and the head symbol is a filled arrow pointing to the head node. This indicates that the head activity is always preceded by the tail activity.
  • If the tail type is NEVER, then the tail color is red and the tail symbol is an open arrow pointing to the head node. This indicates that the tail activity is never preceded by the head activity.
  • If the head type is NEVER, then the head color is red and the head symbol is an open arrow pointing to the head node. This indicates that the head activity is never followed by the tail activity.
  • If the tail type is EXCLUSIVE, then the tail color is yellow (ochre) and the tail symbol is an orthogonal bar. This indicates that the tail activity does not co-occur with the head activity.
  • If the head type is EXCLUSIVE, then the head color is yellow (ochre) and the head symbol is an orthogonal bar. This indicates that the head activity does not co-occur with the tail activity.
  • Otherwise, the color is gray.

If the head (tail) percentage is lower than 100%, then this percentage is added as a head (tail) label in the form of a fraction: for example, 98% becomes “.98”. Furthermore, the color of the head (tail) becomes lighter to emphasize that the relation does not always hold.
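
The styling rules for one end of an edge can be summarized in a small lookup, sketched below. The result is an abstract description rather than actual Dot attributes; the names are hypothetical, and zero-padding percentages below 10 (5% becoming ".05") is an assumption, since the text only gives the 98% example.

```python
# Color and symbol per type, as listed above; gray with no symbol otherwise.
STYLE = {
    "ALWAYS": ("blue", "filled arrow"),
    "NEVER": ("red", "open arrow"),
    "EXCLUSIVE": ("yellow", "orthogonal bar"),
}


def end_style(end_type, percentage: int) -> dict:
    """Describe how one end (tail or head) of an edge is drawn."""
    color, symbol = STYLE.get(end_type, ("gray", None))
    below_100 = percentage < 100
    return {
        "color": color,
        "symbol": symbol,
        # The label is shown only if the relation does not always hold.
        "label": f".{percentage:02d}" if below_100 else None,
        # A lighter shade of the color signals the same thing.
        "lighter": below_100,
    }


print(end_style("ALWAYS", 98))
# {'color': 'blue', 'symbol': 'filled arrow', 'label': '.98', 'lighter': True}
```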

Example

The figure below shows an example Log Skeleton visualization for the “a12f0n05.mxml” event log.

Among other things, this shows that:

  • 99% of all occurrences of “|>” (the artificial start activity) are followed by “S”.
  • “S” is always preceded by “|>”.
  • 99% of all occurrences of “S” are not preceded by “f”.
  • 98% of all occurrences of “f” are preceded by “S”.
  • “b” and “f” exclude each other.