Using the processcheckR package, procedural rules can be checked in an event log. Checking a rule adds a logical case attribute, which can be used for filtering or in further analysis.
Rules can be checked using the check_rule function (see example below). It creates a new logical variable that indicates for which cases the rule holds. The name of the variable can be configured using the label argument of check_rule.
In the following example, the first rule checks the starting activity, while the second rule checks whether CRP and LacticAcid occur together.
library(bupaR)
library(processcheckR)
sepsis %>%
  # check if cases start with "ER Registration"
  check_rule(starts("ER Registration"), label = "r1") %>%
  # check if activities "CRP" and "LacticAcid" occur together
  check_rule(and("CRP", "LacticAcid"), label = "r2") %>%
  group_by(r1, r2) %>%
  n_cases()
## # A tibble: 4 x 3
## # Groups: r1 [2]
## r1 r2 n_cases
## <lgl> <lgl> <int>
## 1 FALSE FALSE 10
## 2 FALSE TRUE 45
## 3 TRUE FALSE 137
## 4 TRUE TRUE 858
Using the function check_rules, multiple rules can be checked with one function call by providing them as named arguments. The following code is equivalent to the code above.
sepsis %>%
  check_rules(
    r1 = starts("ER Registration"),
    r2 = and("CRP", "LacticAcid")) %>%
  group_by(r1, r2) %>%
  n_cases()
## # A tibble: 4 x 3
## # Groups: r1 [2]
## r1 r2 n_cases
## <lgl> <lgl> <int>
## 1 FALSE FALSE 10
## 2 FALSE TRUE 45
## 3 TRUE FALSE 137
## 4 TRUE TRUE 858
Instead of adding logical values for each rule, you can also immediately filter the cases that adhere to one or more rules, using the filter_rules function.
sepsis %>%
  filter_rules(
    r1 = starts("ER Registration"),
    r2 = and("CRP", "LacticAcid")) %>%
  n_cases()
## [1] 858
Currently the following declarative rules can be checked:
Cardinality rules:
* contains: activity occurs n times or more
* contains_exactly: activity occurs exactly n times
* contains_between: activity occurs between min and max number of times
* absent: activity does not occur more than n - 1 times
Ordering rules:
* starts: case starts with activity
* ends: case ends with activity
* succession: if activity A happens, B should happen after. If B happens, A should have happened before.
* response: if activity A happens, B should happen after
* precedence: if activity B happens, A should have happened before
* responded_existence: if activity A happens, B should also (have) happen(ed) (i.e. before or after A)
Exclusiveness:
* and: two activities always exist together
* xor: two activities are not allowed to exist together
The available rules are explained in more detail below.
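The difference between response, precedence and succession in the list above is easiest to see on a few toy traces. The following base R sketch does not call processcheckR; the helpers a_before_b, toy_response, toy_precedence and toy_succession are hypothetical names introduced only for illustration. It assumes that "A is followed by B" means that some instance of A occurs before some instance of B, which may differ from how the package evaluates the rules.

```r
# Toy traces: each case is an ordered character vector of activity names.
traces <- list(
  case_1 = c("A", "C", "B"),  # A is eventually followed by B
  case_2 = c("B", "A"),       # B occurs, but only before A
  case_3 = c("C", "B"),       # B occurs without any A
  case_4 = c("A", "C"),       # A occurs without any B
  case_5 = c("C")             # neither A nor B occurs
)

# TRUE if some instance of `a` occurs before some instance of `b`
# (assumed reading of "followed by" / "preceded by").
a_before_b <- function(trace, a, b) {
  pos_a <- which(trace == a)
  pos_b <- which(trace == b)
  length(pos_a) > 0 && length(pos_b) > 0 && min(pos_a) < max(pos_b)
}

# response: only constrains cases in which `a` occurs.
toy_response <- function(trace, a, b) {
  !(a %in% trace) || a_before_b(trace, a, b)
}

# precedence: only constrains cases in which `b` occurs.
toy_precedence <- function(trace, a, b) {
  !(b %in% trace) || a_before_b(trace, a, b)
}

# succession: both response and precedence must hold.
toy_succession <- function(trace, a, b) {
  toy_response(trace, a, b) && toy_precedence(trace, a, b)
}

result <- data.frame(
  response   = sapply(traces, toy_response,   a = "A", b = "B"),
  precedence = sapply(traces, toy_precedence, a = "A", b = "B"),
  succession = sapply(traces, toy_succession, a = "A", b = "B")
)
print(result)
```

Under these assumptions, case_3 satisfies response but not precedence, case_4 satisfies precedence but not response, and only case_1 and case_5 satisfy succession.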
contains
Arguments:
* activity: a single activity name.
* n (default = 1): the minimum number of times the activity should be present
Returns: cases where activity occurs n times or more.
[Example] How many cases have three or more occurrences of Leucocytes?
sepsis %>%
  check_rule(contains("Leucocytes", n = 3)) %>%
  group_by(contains_Leucocytes_3) %>%
  n_cases()
## # A tibble: 2 x 2
## contains_Leucocytes_3 n_cases
## <lgl> <int>
## 1 FALSE 590
## 2 TRUE 460
contains_exactly
Arguments:
* activity: a single activity name.
* n (default = 1): the exact number of times the activity should be present
[Example] How many cases have exactly four occurrences of Leucocytes?
sepsis %>%
  check_rule(contains_exactly("Leucocytes", n = 4), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 960
## 2 TRUE 90
Returns: cases where activity occurs exactly n times.
contains_between
Arguments:
* activity: a single activity name.
* min (default = 1): the minimum number of times the activity should be present
* max (default = 1): the maximum number of times the activity should be present
Returns: cases where activity occurs between min and max times.
[Example] How many cases have between 0 and 10 occurrences of Leucocytes?
sepsis %>%
  check_rule(contains_between("Leucocytes", min = 0, max = 10), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## Joining, by = "case_id"
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 38
## 2 TRUE 1012
absent
Arguments:
* activity: a single activity name.
* n (default = 0): the maximum number of times the activity is allowed to happen
Returns: cases where activity occurs at most n times.
Note that absent(n = x) is equivalent to contains_between(min = 0, max = x).
[Example] How many cases have between 0 and 10 occurrences of Leucocytes?
sepsis %>%
  check_rule(absent("Leucocytes", n = 10), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 38
## 2 TRUE 1012
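The equivalence between absent(n = x) and contains_between(min = 0, max = x) can be checked on toy traces with a small base R sketch. It does not call processcheckR: toy_absent and toy_contains_between are hypothetical helpers, and toy_absent assumes the "at most n times" reading given in the argument description above.

```r
# Hypothetical helper: TRUE if `activity` occurs between `min` and `max` times.
toy_contains_between <- function(trace, activity, min = 1, max = 1) {
  n_occurrences <- sum(trace == activity)
  n_occurrences >= min && n_occurrences <= max
}

# Hypothetical helper: TRUE if `activity` occurs at most `n` times
# (assumed reading of the absent rule).
toy_absent <- function(trace, activity, n = 0) {
  sum(trace == activity) <= n
}

# Toy traces with zero, one and three occurrences of "Leucocytes".
traces <- list(
  none  = c("CRP"),
  once  = c("Leucocytes", "CRP"),
  three = c("Leucocytes", "CRP", "Leucocytes", "Leucocytes")
)

# For each threshold x, compare both rules across all traces.
equivalent <- sapply(0:3, function(x) {
  via_absent  <- sapply(traces, toy_absent,
                        activity = "Leucocytes", n = x)
  via_between <- sapply(traces, toy_contains_between,
                        activity = "Leucocytes", min = 0, max = x)
  identical(via_absent, via_between)
})
print(equivalent)
```

Since an occurrence count can never be negative, the lower bound min = 0 is always satisfied, which is why only the upper bound matters in both rules.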
starts
Arguments:
* activity: a single activity name
Returns: cases that start with activity.
[Example] How many cases start with “ER Registration”?
sepsis %>%
  check_rule(starts("ER Registration"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 55
## 2 TRUE 995
ends
Arguments:
* activity: a single activity name
Returns: cases that end with activity.
[Example] How many cases end with “Release A”?
sepsis %>%
  check_rule(ends("Release A"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 657
## 2 TRUE 393
succession
Arguments:
* activity_a: a single activity name
* activity_b: a single activity name
Returns: cases where (an instance of) activity_a is eventually followed by (an instance of) activity_b, if either activity_a or activity_b occurs.
[Example] In how many cases is “ER Sepsis Triage” succeeded by “CRP”?
sepsis %>%
  check_rule(succession("ER Sepsis Triage", "CRP"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 1 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 1050
response
Arguments:
* activity_a: a single activity name
* activity_b: a single activity name
Returns: cases where (an instance of) activity_a is eventually followed by (an instance of) activity_b, if activity_a occurs.
[Example] In how many cases is “ER Sepsis Triage” followed by “CRP”, if “ER Sepsis Triage” occurs?
sepsis %>%
  check_rule(response("ER Sepsis Triage", "CRP"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 1049
## 2 TRUE 1
precedence
Arguments:
* activity_a: a single activity name
* activity_b: a single activity name
Returns: cases where (an instance of) activity_b is preceded by (an instance of) activity_a, if activity_b occurs.
[Example] In how many cases is “CRP” preceded by “ER Sepsis Triage”, if “CRP” occurs?
sepsis %>%
  check_rule(precedence("ER Sepsis Triage", "CRP"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 1007
## 2 TRUE 43
responded_existence
Arguments:
* activity_a: a single activity name
* activity_b: a single activity name
Returns: cases where, if activity_a occurs, activity_b also occurs (the reverse is not required).
[Example] How many cases contain both “CRP” and “ER Sepsis Triage”, if “CRP” occurs?
sepsis %>%
  check_rule(responded_existence("CRP", "ER Sepsis Triage"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 1
## 2 TRUE 1049
and
Arguments:
* activity_a: a single activity name
* activity_b: a single activity name
Returns: cases where both activity_a and activity_b occur, or both are absent.
[Example] How many cases contain both “CRP” and “ER Sepsis Triage”?
sepsis %>%
  check_rule(and("CRP", "ER Sepsis Triage"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 44
## 2 TRUE 1006
xor
Arguments:
* activity_a: a single activity name
* activity_b: a single activity name
Returns: cases where either activity_a or activity_b occurs, but not both.
[Example] How many cases contain either “CRP” or “ER Sepsis Triage”, but not both?
sepsis %>%
  check_rule(xor("CRP", "ER Sepsis Triage"), label = "r1") %>%
  group_by(r1) %>%
  n_cases()
## # A tibble: 2 x 2
## r1 n_cases
## <lgl> <int>
## 1 FALSE 1006
## 2 TRUE 44
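The counts for and (1006 TRUE, 44 FALSE) and xor (44 TRUE, 1006 FALSE) mirror each other, which follows from the two Returns descriptions: and holds when both activities occur or both are absent, while xor holds when exactly one of them occurs. A minimal base R sketch on toy traces illustrates this; toy_and and toy_xor are hypothetical helpers that follow those descriptions and are not the package's implementation.

```r
# Toy traces covering all four presence combinations of the two activities.
traces <- list(
  both    = c("CRP", "ER Sepsis Triage", "Leucocytes"),
  only_a  = c("CRP", "Leucocytes"),
  only_b  = c("ER Sepsis Triage"),
  neither = c("Leucocytes")
)

# Hypothetical helper: both activities occur, or both are absent.
toy_and <- function(trace, a, b) {
  (a %in% trace) == (b %in% trace)
}

# Hypothetical helper: exactly one of the two activities occurs.
toy_xor <- function(trace, a, b) {
  xor(a %in% trace, b %in% trace)
}

and_holds <- sapply(traces, toy_and, a = "CRP", b = "ER Sepsis Triage")
xor_holds <- sapply(traces, toy_xor, a = "CRP", b = "ER Sepsis Triage")

# Each column is one toy trace; the two rows never agree.
print(rbind(and = and_holds, xor = xor_holds))
```

Under this reading, every case satisfies exactly one of the two rules, so the TRUE count of one equals the FALSE count of the other.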