Learning the latent structure of collider events


Springer

Received: June 17, 2020
Revised: September 14, 2020
Accepted: September 21, 2020
Published: October 30, 2020


a Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
b Physik-Institut, Universität Zürich, CH-8057, Switzerland
c Faculty of Mathematics and Physics, University of Ljubljana, Jadranska 19, 1000 Ljubljana, Slovenia
d International Center for Advanced Studies (ICAS) and CONICET, UNSAM, Campus Miguelete, 25 de Mayo y Francia, CP1650, San Martín, Buenos Aires, Argentina

E-mail: [email protected], [email protected], [email protected], [email protected]

Abstract: We describe a technique to learn the underlying structure of collider events directly from the data, without having a particular theoretical model in mind. It allows one to infer aspects of the theoretical model that may have given rise to this structure, and can be used to cluster or classify the events for analysis purposes. The unsupervised machine-learning technique is based on the probabilistic (Bayesian) generative model of Latent Dirichlet Allocation. We pair the model with an approximate inference algorithm called Variational Inference, which we then use to extract the latent probability distributions describing the learned underlying structure of collider events. We provide a detailed systematic study of the technique using two example scenarios to learn the latent structure of di-jet event samples made up of QCD background events and either tt̄ or hypothetical W′ → (φ → WW)W signal events.

Keywords: Jet substructure, Beyond Standard Model, Hadron-Hadron scattering (experiments), Jets, Particle and resonance production

ArXiv ePrint: 2005.12319

© The Authors. Open Access, Article funded by SCOAP3.

https://doi.org/10.1007/JHEP10(2020)206

JHEP10(2020)206

B.M. Dillon (a), D.A. Faroughy (b), J.F. Kamenik (a,c) and M. Szewc (d)

Contents

1 Introduction

2 Probabilistic generative modelling for collider experiments
  2.1 Probabilistic generative models
    2.1.1 Mixture models
    2.1.2 Mixed-membership models
  2.2 Latent Dirichlet Allocation
  2.3 Variational inference
  2.4 The LDA landscape
    2.4.1 A landscape of classifiers
    2.4.2 Evaluating the classifier performance
    2.4.3 Model selection with perplexity

3 Learning latent jet substructure
  3.1 Jet de-clustering and substructure observables
  3.2 Probabilistic models of jet substructure
  3.3 Choosing a data representation for the jet substructure

4 Set-up and benchmarks
  4.1 Algorithm set-up
  4.2 Benchmark di-jet events
    4.2.1 Boosted top quark pair-production
    4.2.2 A 3 TeV W′ model with a 400 GeV scalar
  4.3 Comparing classification power of different observables
  4.4 Measurement co-occurrences

5 Unsupervised learning with LDA
  5.1 Systematics
    5.1.1 Offset
    5.1.2 Chunk size

6 Conclusions

A Supplementary scans

1 Introduction


With the discovery of the Higgs boson [1, 2], all the degrees of freedom that form