Dataset for Bootstrap Generalization Estimations and Ground Truth Comparison for Discovered Process Models in Process Mining
This dataset contains generalization values estimated with the bootstrap generalization framework, the corresponding ground truth values for the models, and characteristics of the event logs used for the estimations. The estimated generalization values are compared with the ground truth values to assess how effective system estimation techniques such as bootstrapping are. The analysis also explores the coverage of various experimental parameters. Detailed information about the experiment is available at ieeexplore.ieee.org/document/10680679.
- Version 1 includes the initial analysis described in ieeexplore.ieee.org/document/10680679.
- Version 2 extends this analysis with additional data.
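As a rough illustration of the comparison between estimated and ground truth generalization values, the following sketch computes the mean signed error and the mean absolute error over paired values. It is not the analysis from the paper: the function name `compare_to_ground_truth` and the toy numbers are hypothetical, and the sketch assumes the values have already been read from the dataset into two equal-length lists.

```python
from statistics import mean


def compare_to_ground_truth(estimated, ground_truth):
    """Summarize how far estimated values deviate from ground truth values.

    Hypothetical helper for illustration only; not part of the published code.
    """
    if len(estimated) != len(ground_truth):
        raise ValueError("estimated and ground_truth must have the same length")
    if not estimated:
        raise ValueError("at least one pair of values is required")

    # Signed error per model: positive means the estimate is too high.
    errors = [e - g for e, g in zip(estimated, ground_truth)]

    return {
        "mean_error": mean(errors),                         # overall bias
        "mean_absolute_error": mean(abs(x) for x in errors),  # typical deviation
    }


# Toy values, NOT taken from the dataset.
toy_estimated = [0.5, 0.75, 1.0]
toy_ground_truth = [0.25, 0.75, 0.5]
summary = compare_to_ground_truth(toy_estimated, toy_ground_truth)
print(summary)
```

A mean signed error near zero alongside a large mean absolute error would suggest estimates that are unbiased on average but individually imprecise, which is why the sketch reports both.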
The dataset used in the experiment, as well as the experimental code, can be accessed at our GitHub repository.
We kindly request that you cite our work if you use this dataset in your research:
A. Karunaratne, A. Polyvyanyy, and A. Moffat, “The role of log representativeness in estimating generalization in process mining,” in Int. Conf. Process Mining. IEEE, 2024, pp. 33-40.