**STAT1-4**:

**Introduction to Statistical Inference and Machine Learning – Gibran Fuentes Pineda**

This is an introductory course on machine learning that covers the basic concepts and the general methodology for model training, selection, and evaluation. The learning problem is formulated from a statistical perspective, so the basic concepts of statistical inference will also be presented.
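As a rough illustration of the train / select / evaluate workflow mentioned above, the sketch below fits polynomials of increasing degree to synthetic data, picks the degree with the lowest validation error, and reports the error on a held-out test set. The data, the 60/20/20 split, and the polynomial model family are assumptions chosen for illustration; they are not taken from the course material.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)

# Synthetic data (assumption): a noisy sine curve.
x = rng.uniform(0.0, 1.0, 120)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.2, x.size)

# Random 60/20/20 split into training, validation, and test rows.
idx = rng.permutation(x.size)
train, val, test = idx[:72], idx[72:96], idx[96:]


def mse(model, rows):
    """Mean squared error of a fitted model on the given rows."""
    return float(np.mean((model(x[rows]) - y[rows]) ** 2))


# Training: least-squares fit of each candidate model on the training rows.
degrees = range(1, 10)
models = {d: Polynomial.fit(x[train], y[train], d) for d in degrees}

# Selection: compare candidates on the validation rows only.
train_mse = {d: mse(models[d], train) for d in degrees}
val_mse = {d: mse(models[d], val) for d in degrees}
best_degree = min(val_mse, key=val_mse.get)

# Evaluation: the test rows are touched once, after the model is chosen.
test_mse = mse(models[best_degree], test)
print(f"selected degree: {best_degree}, test MSE: {test_mse:.3f}")
```

The training error keeps decreasing as the degree grows, which is why the choice of model is made on data not used for fitting.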

References:

- Probabilistic Machine Learning: An Introduction by Kevin Murphy: https://probml.github.io/pml-book/book1.html
- Pattern Recognition and Machine Learning by Christopher M. Bishop: https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf

Requirements: the Python libraries NumPy, pandas, and Matplotlib

**Projects A**

**P1. Lyman-alpha tomography and one-dimensional power spectrum – Corentin Ravoux**

The Lyman-alpha forest is a region of a high-redshift quasar spectrum that probes the amount of neutral hydrogen in the intergalactic medium. It provides a field of Lyman-alpha absorption along the line of sight. The tomography project aims to create a three-dimensional map of Lyman-alpha absorption from a dense sample of quasars. We will start with a tutorial to create and understand a small 3D map on a specific DESI field. This project can be extended to work on a large map obtained with eBOSS, with several applications: the search for cosmic voids and protoclusters in the map, or the stacking of galaxies to obtain a cross-correlation signal.

One-dimensional power spectrum (P1D): This observable measures the correlations of the Lyman-alpha forest along the line of sight. P1D can be used to constrain the structure of the cosmic web at small scales, as well as the sum of neutrino masses and the properties of dark matter. The first objective of this project is to manipulate and understand the newly published DESI spectroscopic data through a short tutorial. The second is to reproduce the one-dimensional power spectrum measurement performed on DESI data by fitting the quasar continuum and measuring the Fourier transform of the Lyman-alpha absorption.
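The last step, measuring the Fourier transform of the absorption, can be sketched as follows. This is a minimal FFT estimator applied to synthetic white-noise skewers, not the DESI pipeline: the function name `p1d_fft`, the pixel size, and the input data are assumptions for illustration. A real measurement also requires continuum fitting, masking, noise-power subtraction, and a correction for the spectrograph resolution, all of which are omitted here.

```python
import numpy as np


def p1d_fft(delta, pixel_size):
    """Raw 1D power spectrum of one skewer of absorption contrast.

    delta      : flux contrast F / <F> - 1 on a regular grid (hypothetical input)
    pixel_size : grid spacing (e.g. in km/s or Angstrom)
    Returns (k, P(k)), with P(k) = (pixel_size / N) * |FFT(delta)|^2.
    """
    n = delta.size
    delta_k = np.fft.rfft(delta)
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=pixel_size)
    power = (pixel_size / n) * np.abs(delta_k) ** 2
    return k, power


# Synthetic test case (assumption): uncorrelated Gaussian pixels, for which
# the expected power is flat and equal to sigma^2 * pixel_size.
rng = np.random.default_rng(0)
n_skewers, n_pix, pixel_size, sigma = 200, 512, 1.0, 0.2
skewers = rng.normal(0.0, sigma, size=(n_skewers, n_pix))

k = p1d_fft(skewers[0], pixel_size)[0]
# Average the per-skewer power spectra to reduce the scatter.
mean_power = np.mean([p1d_fft(s, pixel_size)[1] for s in skewers], axis=0)
print(mean_power[1:].mean(), sigma**2 * pixel_size)
```

Averaging over many lines of sight is what turns the noisy per-skewer spectra into a usable estimate.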

References:

P1D: https://arxiv.org/abs/2306.06311, https://arxiv.org/abs/1812.03554

Tomography: https://arxiv.org/abs/2004.01448, https://arxiv.org/abs/2203.11045, https://arxiv.org/abs/1504.03290, https://arxiv.org/abs/1412.1507

Check this link for more details on the P1D project, and this one for the tomography project.

Requirements:

Two tutorials are provided on Google Colab (a fully online Jupyter notebook service that runs on Google CPUs and GPUs). You only need a Google account (Gmail address). You can also run the notebooks directly on your laptop, but you will need a suitable Python environment.

**P2. Bayesian Statistics – Gabriela García Arroyo**

Recording (second part of the video)

The main goal of this project is for students to understand the crucial role Bayesian statistics has played in advancing precision cosmology. During the project, students will learn the basic theory and algorithms behind the results presented in scientific articles that use statistical inference.

To achieve this, they will develop an MCMC algorithm to constrain the parameters of the wCDM cosmological model using data from type Ia supernovae and cosmic chronometers. They will then run the emcee sampler and compare the results. In this way, students will be able to understand more complex methods on their own and be more critical when interpreting the tables of results and contour plots that are usually presented in articles.
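The kind of sampler described above can be sketched with a random-walk Metropolis algorithm. The example below is only an illustration under stated assumptions: it uses synthetic H(z) points instead of the real cosmic chronometer or supernova data, a flat wCDM model with radiation neglected, flat priors with arbitrary bounds, and hand-picked proposal step sizes. All function names are hypothetical.

```python
import numpy as np


def hubble_wcdm(z, h0, omega_m, w):
    """H(z) in km/s/Mpc for a flat wCDM model (radiation neglected)."""
    z = np.asarray(z, dtype=float)
    omega_de = 1.0 - omega_m
    return h0 * np.sqrt(omega_m * (1 + z) ** 3 + omega_de * (1 + z) ** (3 * (1 + w)))


def log_posterior(theta, z, hz, sigma):
    """Gaussian log-likelihood plus flat priors (bounds are illustrative)."""
    h0, omega_m, w = theta
    if not (50.0 < h0 < 100.0 and 0.0 < omega_m < 1.0 and -3.0 < w < 0.0):
        return -np.inf
    resid = (hz - hubble_wcdm(z, h0, omega_m, w)) / sigma
    return -0.5 * np.sum(resid**2)


def metropolis(log_prob, start, step, n_steps, rng):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    current = np.array(start, dtype=float)
    current_lp = log_prob(current)
    chain = np.empty((n_steps, current.size))
    n_accept = 0
    for i in range(n_steps):
        proposal = current + step * rng.normal(size=current.size)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            n_accept += 1
        chain[i] = current
    return chain, n_accept / n_steps


# Synthetic H(z) data (assumption): 30 points with 5 km/s/Mpc errors.
rng = np.random.default_rng(42)
z = np.linspace(0.1, 2.0, 30)
sigma = np.full(z.size, 5.0)
hz = hubble_wcdm(z, 70.0, 0.3, -1.0) + sigma * rng.normal(size=z.size)

chain, acceptance = metropolis(
    lambda theta: log_posterior(theta, z, hz, sigma),
    start=(68.0, 0.35, -1.1),
    step=np.array([1.0, 0.02, 0.1]),
    n_steps=20000,
    rng=rng,
)
burned = chain[5000:]  # discard the first steps as burn-in
print("mean:", burned.mean(axis=0), "std:", burned.std(axis=0))
print("acceptance fraction:", acceptance)
```

The samples in `burned` can then be compared with an emcee run on the same posterior, for instance by overplotting the two sets of contours with corner.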

References: Moresco et al. 1601.01701

Requirements: Python with numpy, scipy, matplotlib, seaborn, pyfits, emcee, corner.

Data: Supernovae data: JLA and JLA_simple. The data can be downloaded from https://supernovae.in2p3.fr/sdss_snls_jla/ReadMe.html#sec-1-6. We will use data version v6: jla_likelihood_v6.tgz

Cosmic clocks: The data can be downloaded from the MontePython repository, https://github.com/brinckmann/montepython_public/blob/3.6/data/cosmic_clocks/Hz_2016.dat, or taken from Table 4 of Moresco et al. 1601.01701

Drive directory

**Projects B**

**P3. Weak Lensing: Theory and Estimators – Alejandro Avilés Cervantes**

In this course, we will approach the cosmological theory of weak gravitational lensing from its foundations. Weak lensing studies how the light from distant galaxies is distorted by the gravitational influence of the matter located between those galaxies and us. We will focus on building the statistics most used in cosmology to study this effect: the two-point correlation functions of the galaxy shear. To this end, we will develop simple Python codes that estimate these statistics from synthetic catalogs. Later, we will review TreeCorr, the code most widely used today for the same purpose. The main objective of this course/project is to provide theoretical and computational tools for analyzing observational data from galaxy catalogs that are currently being built, and will continue to be built over the next 10 years, by photometric experiments such as the Dark Energy Survey (DES) and the Rubin Observatory Legacy Survey of Space and Time (LSST); the latter has a large Mexican participation.
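A simple estimator of the shear two-point correlation functions might look like the brute-force sketch below. It is not the course code: it assumes a flat sky, unit weights for all galaxies, and a synthetic catalog of uncorrelated ellipticities, and the function name `shear_2pt_brute_force` is hypothetical. It loops over all pairs in O(N²), which is only feasible for small catalogs; tree-based codes such as TreeCorr exist precisely to avoid this cost.

```python
import numpy as np


def shear_2pt_brute_force(x, y, g1, g2, bin_edges):
    """Estimate xi_plus and xi_minus by summing over all galaxy pairs.

    x, y      : flat-sky positions (same units as bin_edges)
    g1, g2    : shear components of each galaxy
    bin_edges : edges of the separation bins
    """
    gamma = g1 + 1j * g2
    i, j = np.triu_indices(x.size, k=1)  # each pair counted once
    dx, dy = x[j] - x[i], y[j] - y[i]
    sep = np.hypot(dx, dy)
    phi = np.arctan2(dy, dx)  # position angle of the separation vector

    # xi_plus  = <gt gt> + <gx gx> = Re <gamma_i conj(gamma_j)>
    # xi_minus = <gt gt> - <gx gx> = Re <gamma_i gamma_j exp(-4 i phi)>
    prod_plus = (gamma[i] * np.conj(gamma[j])).real
    prod_minus = (gamma[i] * gamma[j] * np.exp(-4j * phi)).real

    n_bins = len(bin_edges) - 1
    which = np.digitize(sep, bin_edges) - 1
    valid = (which >= 0) & (which < n_bins)
    npairs = np.bincount(which[valid], minlength=n_bins)
    sum_plus = np.bincount(which[valid], weights=prod_plus[valid], minlength=n_bins)
    sum_minus = np.bincount(which[valid], weights=prod_minus[valid], minlength=n_bins)

    # Empty bins are returned as NaN.
    xi_plus = np.divide(sum_plus, npairs, out=np.full(n_bins, np.nan), where=npairs > 0)
    xi_minus = np.divide(sum_minus, npairs, out=np.full(n_bins, np.nan), where=npairs > 0)
    return xi_plus, xi_minus, npairs


# Synthetic catalog (assumption): random positions in a unit square and
# uncorrelated shapes, so both correlation functions should scatter around zero.
rng = np.random.default_rng(3)
n_gal = 500
x = rng.uniform(0.0, 1.0, n_gal)
y = rng.uniform(0.0, 1.0, n_gal)
g1 = rng.normal(0.0, 0.3, n_gal)
g2 = rng.normal(0.0, 0.3, n_gal)
bin_edges = np.geomspace(0.05, 0.5, 6)

xi_plus, xi_minus, npairs = shear_2pt_brute_force(x, y, g1, g2, bin_edges)
print(xi_plus, xi_minus, npairs)
```

A useful sanity check is a catalog with constant shear, for which xi_plus equals g1² + g2² in every populated bin.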

References:

Recommended introductory reviews: arxiv.org/abs/1411.0115, arxiv.org/abs/1710.03235. Chapter 13 of Modern Cosmology, 2nd edition, by Dodelson & Schmidt.

Notes for the course (still under construction) are available at https://drive.google.com/file/d/14skR3BjoK0JqdU4snd_o16O9BHXoS0Tm/view?usp=sharing

Requirements:

We will use Google Colab, so you only need a Google account. If you prefer, you can run the projects on your own computer; for this you will need Python 3 with the modules numpy, astropy, and treecorr installed.

**P4. The galaxy-halo connection – Aldo Rodríguez Puebla**

In the current paradigm of structure formation, galaxies form and evolve within massive dark matter halos, where multiple physical mechanisms self-regulate star formation and thus set their observed properties. As a result, a relationship between galaxy and dark matter halo properties emerges, which is often referred to as the galaxy-halo connection. In this project, we will study the galaxy-halo connection using high-resolution cosmological N-body simulations. The student will learn how to use simulations and their assembly histories to infer how galaxies evolved across cosmic time. In addition, the student will learn to build mock flux-limited surveys (such as SDSS, GAMA, and DES) for statistical studies of the galaxy population.
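The idea of a flux-limited mock can be sketched as an apparent-magnitude cut applied to a simulated galaxy catalog. The example below is a toy under stated assumptions: absolute magnitudes and distances are drawn at random instead of being taken from an N-body simulation, K-corrections and extinction are ignored, and the magnitude limit of 17.77 is only an illustrative value. The function names are hypothetical.

```python
import numpy as np


def apparent_magnitude(abs_mag, lum_dist_mpc):
    """m = M + 5 log10(d_L / 10 pc), with d_L given in Mpc.

    K-corrections and dust extinction are ignored in this sketch.
    """
    return abs_mag + 5.0 * np.log10(lum_dist_mpc) + 25.0


def flux_limited_mask(abs_mag, lum_dist_mpc, mag_limit):
    """Boolean mask of the galaxies brighter than the survey limit."""
    return apparent_magnitude(abs_mag, lum_dist_mpc) <= mag_limit


# Toy parent catalog (assumption): galaxies spread uniformly in volume out to
# 500 Mpc, with absolute magnitudes drawn uniformly between -23 and -16.
rng = np.random.default_rng(7)
n_gal = 10000
lum_dist = 500.0 * rng.uniform(0.0, 1.0, n_gal) ** (1.0 / 3.0)
abs_mag = rng.uniform(-23.0, -16.0, n_gal)

mag_limit = 17.77  # illustrative limit
selected = flux_limited_mask(abs_mag, lum_dist, mag_limit)

print("fraction observed:", selected.mean())
print("mean M, parent vs. mock:", abs_mag.mean(), abs_mag[selected].mean())
```

The selected sample is brighter on average than the parent catalog, because faint galaxies are only observed nearby; this selection effect is what the mock lets you quantify.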

References:

Requirements: