This dataset contains correlation data between the species-coverage-based log representativeness measure and the Trace-based Log Representativeness Approximation (TLRA), across event logs from 60 generative systems with varying log sizes and noise levels.