After preparing the data and creating an eventlog object, we can use bupaR functions to retrieve basic information and metadata from the log. In this example, we will use the patients event log provided by eventdataR.

library(bupaR)
library(eventdataR)  # attaches the patients event log used in the examples below
patients
## Event log consisting of:
## 5442 events
## 7 traces
## 500 cases
## 7 activities
## 2721 activity instances
## 
## # A tibble: 5,442 x 7
##    handling     patient employee handling_id registration_type
##    <fct>        <chr>   <fct>    <chr>       <fct>            
##  1 Registration 1       r1       1           start            
##  2 Registration 2       r1       2           start            
##  3 Registration 3       r1       3           start            
##  4 Registration 4       r1       4           start            
##  5 Registration 5       r1       5           start            
##  6 Registration 6       r1       6           start            
##  7 Registration 7       r1       7           start            
##  8 Registration 8       r1       8           start            
##  9 Registration 9       r1       9           start            
## 10 Registration 10      r1       10          start            
## # ... with 5,432 more rows, and 2 more variables: time <dttm>,
## #   .order <int>

Getting metadata

The mapping function retrieves all the metadata from an event log object, i.e. the relation between the event log identifiers and the data fields.

patients %>% mapping
## Case identifier:     patient 
## Activity identifier:     handling 
## Resource identifier:     employee 
## Activity instance identifier:    handling_id 
## Timestamp:           time 
## Lifecycle transition:        registration_type

In this case, we see that the handling field is the activity identifier in the event log, while the patient field serves as the case identifier. We can also obtain each of these identifiers individually.

patients %>% activity_id
patients %>% case_id
patients %>% resource_id
## [1] "handling"
## [1] "patient"
## [1] "employee"
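
The remaining identifiers listed in the mapping can be retrieved in the same way. The following is a minimal sketch, assuming the installed version of bupaR exports activity_instance_id, timestamp and lifecycle_id alongside the three functions used above. Each call should return the corresponding field name from the mapping, which can then be used to access the underlying column.

```r
library(bupaR)
library(eventdataR)  # provides the patients event log

# Each accessor returns the name of the mapped column as a character string.
instance_col  <- patients %>% activity_instance_id
timestamp_col <- patients %>% timestamp
lifecycle_col <- patients %>% lifecycle_id

# The returned name can be used to pull the underlying column from the log.
head(patients[[timestamp_col]])
```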

Getting basic information

We can look at a general summary of the event log by calling the summary function.

patients %>% summary
## Number of events:  5442
## Number of cases:  500
## Number of traces:  7
## Number of distinct activities:  7
## Average trace length:  10.884
## 
## Start eventlog:  2017-01-02 11:41:53
## End eventlog:  2018-05-05 07:16:02
##                   handling      patient          employee 
##  Blood test           : 474   Length:5442        r1:1000  
##  Check-out            : 984   Class :character   r2:1000  
##  Discuss Results      : 990   Mode  :character   r3: 474  
##  MRI SCAN             : 472                      r4: 472  
##  Registration         :1000                      r5: 522  
##  Triage and Assessment:1000                      r6: 990  
##  X-Ray                : 522                      r7: 984  
##  handling_id        registration_type      time                    
##  Length:5442        complete:2721     Min.   :2017-01-02 11:41:53  
##  Class :character   start   :2721     1st Qu.:2017-05-06 17:15:18  
##  Mode  :character                     Median :2017-09-08 04:16:50  
##                                       Mean   :2017-09-02 20:52:34  
##                                       3rd Qu.:2017-12-22 15:44:11  
##                                       Max.   :2018-05-05 07:16:02  
##                                                                    
##      .order    
##  Min.   :   1  
##  1st Qu.:1361  
##  Median :2722  
##  Mean   :2722  
##  3rd Qu.:4082  
##  Max.   :5442  
## 

The basic counts shown in the summary, along with a few others, can also be retrieved individually, each as a numeric vector of length one.

patients %>% n_activities
patients %>% n_activity_instances
patients %>% n_cases
patients %>% n_events
patients %>% n_traces
patients %>% n_resources
## [1] 7
## [1] 2721
## [1] 500
## [1] 5442
## [1] 7
## [1] 7
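
These counts relate to one another and to the summary above. Going by the numbers shown, the patients log holds two events (one start and one complete) per activity instance, since 5442 / 2721 = 2, and dividing the events by the cases gives 5442 / 500 = 10.884, the figure reported as average trace length. The sketch below recomputes both ratios; it applies to this log only, as other logs may record different lifecycle transitions.

```r
library(bupaR)
library(eventdataR)  # provides the patients event log

events    <- patients %>% n_events
instances <- patients %>% n_activity_instances
cases_n   <- patients %>% n_cases

events / instances  # events per activity instance
events / cases_n    # events per case
```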

More detailed information about activities, cases, resources, and traces can be obtained using the functions of the same name. For example, the overview of the cases in the patients event log is shown below.

patients %>% cases
## # A tibble: 500 x 10
##    patient trace_length number_of_activities start_timestamp    
##    <chr>          <int>                <int> <dttm>             
##  1 1                  6                    6 2017-01-02 11:41:53
##  2 10                 5                    5 2017-01-06 05:58:54
##  3 100                5                    5 2017-04-11 16:34:31
##  4 101                5                    5 2017-04-16 06:38:58
##  5 102                5                    5 2017-04-16 06:38:58
##  6 103                6                    6 2017-04-19 20:22:01
##  7 104                6                    6 2017-04-19 20:22:01
##  8 105                6                    6 2017-04-21 02:19:09
##  9 106                6                    6 2017-04-21 02:19:09
## 10 107                5                    5 2017-04-22 18:32:16
## # ... with 490 more rows, and 6 more variables: complete_timestamp <dttm>,
## #   trace <chr>, trace_id <dbl>, duration_in_days <dbl>,
## #   first_activity <fct>, last_activity <fct>
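
The cases overview is returned as a tibble, so it can be processed further with dplyr. The sketch below assumes that dplyr is installed and that the output contains the trace_length column shown above (column names may differ between bupaR versions). It also calls activities, resources and traces, whose outputs are not shown here.

```r
library(bupaR)
library(eventdataR)  # provides the patients event log
library(dplyr)

# Overviews for the other entity types, analogous to cases.
patients %>% activities
patients %>% resources
patients %>% traces

# Count how many cases there are for each trace length.
length_counts <- patients %>%
  cases %>%
  count(trace_length)
length_counts
```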