
Event Mapping

datacorner edited this page Jun 27, 2023 · 1 revision

What and why?

The aim of this activity is to clean up the list of events (EVENT_ID). Most of the time this information comes from different data sources, so the same event can appear under different names. There may also simply be too many events, in which case those not needed for the analysis have to be filtered out. For these and other reasons, you may have to run an Event Dictionary workshop (see the ExYPro Methodology for details).

At the end of this workshop you should have a simple list of events with:

  • The old event name
  • The new event name (if nothing is specified, the event must be removed from the log)

Note: The resulting file must be a CSV file (with a comma as separator).

Event management (filtering or renaming)

pyBPPIBridge can automatically rename events and filter some of them out by name, according to the rule described above. To make that possible, you only have to provide the event map file (CSV) and set a few options in the configuration file.

This is how the file must be constructed:

  • It's a CSV file
  • Use the comma as separator
  • The first column contains the event ID name (as it appears in the source data)
  • The second column contains:
    • The new name (the one that will appear in BPPI), if a value is filled in. In this case the old (source) event ID is simply replaced by this one.
    • Nothing, in which case this event (the entire row in the source data) is simply removed

Note: the column header names do not matter; however, the solution assumes that a header row is present.
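
As an illustration of the file layout described above, here is a minimal sketch that reads such a file with Python's standard csv module. It is not taken from pyBPPIBridge: the sample content, the header names and the helper load_event_map are all hypothetical.

```python
import csv
import io

# Hypothetical event map content; the header names are arbitrary.
SAMPLE_MAP = """old_event,new_event
Create Order,Order Created
order_created,Order Created
Debug Ping,
"""

def load_event_map(stream):
    """Return {source name: new name or None} from a two-column CSV with a header."""
    reader = csv.reader(stream)
    next(reader, None)  # skip the header row, whatever its column names are
    mapping = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        old = row[0].strip()
        new = row[1].strip() if len(row) > 1 else ""
        mapping[old] = new or None  # an empty second column means "remove"
    return mapping

event_map = load_event_map(io.StringIO(SAMPLE_MAP))
print(event_map)
```

In this sample, two source names are merged into a single new name, and "Debug Ping" is marked for removal because its second column is empty.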

Matching rules:

  • If the value in the source column matches an entry in the event map file (column 1)
    • If there is a value (not null) in the event map file (column 2)
      • Then the source value is replaced by the value from the event map file (column 2)
    • Else
      • The Source Row is removed
  • Else
    • The Source Row is removed
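
The matching rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual pyBPPIBridge implementation: the function apply_event_map, the dictionary-based rows and the sample data are hypothetical.

```python
def apply_event_map(rows, event_column, event_map):
    """Rename or drop rows according to the matching rules.

    rows:         list of dicts, one per source row (hypothetical representation)
    event_column: name of the column that holds the event name
    event_map:    {source name: new name, or None/"" when column 2 is empty}
    """
    kept = []
    for row in rows:
        new_name = event_map.get(row[event_column])
        if new_name:
            # Match found and column 2 is filled in: replace the event name.
            kept.append({**row, event_column: new_name})
        # Otherwise (no match, or an empty column 2) the row is removed.
    return kept

# Hypothetical data, for illustration only.
event_map = {"Create Order": "Order Created", "Debug Ping": None}
rows = [
    {"case": "1", "EVENT_ID": "Create Order"},
    {"case": "1", "EVENT_ID": "Debug Ping"},     # mapped to nothing: removed
    {"case": "2", "EVENT_ID": "Unknown Event"},  # not in the map: removed
]
result = apply_event_map(rows, "EVENT_ID", event_map)
print(result)
```

Note that both Else branches lead to the same outcome, so only events listed in the map with a non-empty new name survive.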

Configuration

In the [events] section, you must specify:

  • map (yes|no): whether this mapping/filtering is applied or not
  • maptable: name of the file containing the table described above
  • eventcolumn: name of the column to map in the data source
[events]
# Map the event names in the source dataset with an event map file (CSV file with two columns)
map=no
maptable=./test/evtmap.csv
eventcolumn={column to map in the data source}
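
For illustration, a section like this one can be read with Python's standard configparser module. This is a sketch only: how pyBPPIBridge actually loads its configuration is not described here, and the eventcolumn value EVENT_ID is an assumed example.

```python
import configparser

# Same keys as above; the eventcolumn value is a hypothetical example.
CONFIG_TEXT = """
[events]
# Map the event names in the source dataset with an event map file
map=yes
maptable=./test/evtmap.csv
eventcolumn=EVENT_ID
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

map_enabled = config.getboolean("events", "map")  # accepts yes/no
map_table = config.get("events", "maptable")
event_column = config.get("events", "eventcolumn")

print(map_enabled, map_table, event_column)
```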

Template creation

If map=yes in the configuration file but the file referenced by the maptable parameter does not exist, the solution creates a template event map file:

  • In CSV format
  • Named after the value of the maptable parameter
  • Containing all the distinct event values found in the source data (same value in both columns).

It is then up to the user to reuse that file and update it according to their needs.
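
The template creation described above could look roughly like the following sketch. It is an assumption-based illustration, not the pyBPPIBridge code: the function create_template_if_missing and the header names source_event and new_event are hypothetical.

```python
import csv
import os
import tempfile

def create_template_if_missing(maptable, events):
    """Write a template event map when `maptable` does not exist yet.

    Returns True when a template was created, False when the file already exists.
    """
    if os.path.exists(maptable):
        return False
    distinct = list(dict.fromkeys(events))  # distinct values, first-seen order
    with open(maptable, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["source_event", "new_event"])  # hypothetical headers
        # Same value in both columns: by default every event keeps its name.
        writer.writerows((name, name) for name in distinct)
    return True

# Demonstration in a temporary directory, with hypothetical event names.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "evtmap.csv")
    created = create_template_if_missing(path, ["A", "B", "A"])
    created_again = create_template_if_missing(path, ["A", "B", "A"])
    with open(path, newline="", encoding="utf-8") as f:
        template_rows = list(csv.reader(f))

print(created, created_again, template_rows)
```

Starting from identical values in both columns means an unedited template changes nothing. The user then only has to rename entries or blank out the second column for events to drop.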
