Titles and Abstracts

  • Dr. Joseph Salmon

Title: «Safe screening rules to speed-up sparse regression solvers»

Abstract: In high-dimensional regression scenarios, sparsity-enforcing penalties have proved useful for regularizing the data-fitting term. A recently introduced technique, called screening rules, leverages the expected sparsity of the solutions by ignoring some variables in the optimization, which speeds up the solver. When the procedure is guaranteed not to discard features wrongly, the rules are said to be «safe». The proposed framework can cope with generalized linear models and various sparsity-enforcing penalty functions, though the talk will focus mainly on least squares with l1 regularization for simplicity.
Our Gap Safe rules (so called because they rely on duality gap computation) make it possible to safely discard more variables than previously considered safe rules, particularly for low regularization parameters. They can be used with any iterative solver, but our contribution is particularly well suited to (block) coordinate descent techniques. We report significant speed-ups compared to previously proposed safe rules on both simulated and real data.
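
As a rough illustration of the idea for least squares with l1 regularization, the sketch below applies a duality-gap sphere test to a current iterate. It assumes the objective 0.5 * ||y - X beta||^2 + lam * ||beta||_1 and a dual point obtained by rescaling the residual; the function name `gap_safe_screen` and the toy data are hypothetical and are not taken from the talk.

```python
import numpy as np

def gap_safe_screen(X, y, beta, lam):
    """Boolean mask of features discarded by a duality-gap sphere test (Lasso case)."""
    residual = y - X @ beta
    # Rescale the residual so the dual point is feasible: ||X^T theta||_inf <= 1.
    theta = residual / max(lam, np.max(np.abs(X.T @ residual)))
    primal = 0.5 * residual @ residual + lam * np.sum(np.abs(beta))
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)          # guard against tiny negative round-off
    radius = np.sqrt(2.0 * gap) / lam      # radius of the safe sphere around theta
    # Feature j is discarded when |x_j^T theta| + radius * ||x_j|| < 1.
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0

# Toy example (assumed data): orthogonal design, starting from beta = 0.
X = np.eye(3)
y = np.array([3.0, 1.0, 0.1])
screened = gap_safe_screen(X, y, beta=np.zeros(3), lam=2.0)
print(screened)  # [False  True  True]
```

In this toy case the exact solution is the soft-thresholded vector (1, 0, 0), so the two discarded features are indeed inactive. In an iterative solver the test would be re-run as the duality gap shrinks, which tightens the sphere and lets more features be discarded.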

  • Dr. Gibran Etcheverry Doger

Title: «Newborn cry features extraction for affections detection and classification.»

Newborn cry feature extraction for affections detection and classification has been developed intensively over the last ten to fifteen years. In this talk, methods from the system identification area are used to obtain Linear Predictive Coefficients (LPCs) together with nonlinear ones. To show the contribution of the nonlinear features, an Expectation-Maximization (EM) algorithm over a Mixture of Experts (ME) is used to classify this subgroup of speech signals.

  • Dra. Zobeida Jezabel Guzmán

Title: «Data Science in Digital Video Applications.»

Abstract: We live in an era of data explosion. Consequently, we need processes, systems and programs that analyse this data and help us obtain information automatically. It is known that the larger the volume of data, the greater the challenges of processing and analysing it. Such is the case for digital video applications, because video is a kind of data with a massive load of visual and acoustic information. This talk therefore centres on the importance of data science for the study of the most common type of data transmitted globally, namely digital video.

  • Dr. Roberto Rosas Romero

Title: «Pattern Recognition & Applications»

This talk describes Pattern Recognition as the application of Statistics to the analysis of data in order to uncover relevant and useful information, the features that make objects unique, for recognition, prediction, identification and classification in applications such as self-driving for autonomous vehicles, tele-monitoring of epileptic patients, diagnosis of diseases such as Parkinson's disease, and analysis of stock markets for investment decisions. Pattern recognition systems have to learn before they are applied, using large amounts of data (Big Data) as training sets. Thus, there is a need for computer systems to analyze this information.

  • Dr. Erwan Scornet

Title: «A walk on random forests»

The recent and ongoing expansion of the digital world now gives anyone access to a tremendous amount of information. However, collecting data is not an end in itself, and techniques must therefore be designed to gain in-depth knowledge from these large databases. This has led to a growing interest in statistics as a tool for finding patterns in complex data structures, and particularly in turnkey algorithms that do not require specific skills from the user. Such algorithms are quite often designed on a hunch, without any theoretical guarantee. Indeed, the overlay of several simple steps (as in random forests or neural networks) makes the analysis more arduous. Nonetheless, theory is vital to give assurance about how algorithms operate, thus preventing their outputs from being misunderstood.

In this mini-course, I will present recent theoretical results on random forests to give insight into how the algorithm works and how to tune its main parameters. Special attention will be given to the consistency and rate of consistency of several random forest models. In particular, I will present a first result on the consistency of Breiman’s forests and show how it sheds some light on their good performance in a sparse regression setting.

  • Dr. Mario Alberto Díaz Torres

Title: «Theoretical guarantees of some machine learning methodologies»

Abstract: Over the years, the relevance of some machine learning methodologies in solving a wide variety of tasks has become evident. However, the reasons behind their good performance are sometimes less evident. Given the massive deployment of these methodologies, establishing theoretical results that guarantee their proper functioning is a problem of growing interest. This mini-course aims to present some classical results on the statistical, optimization and expressivity guarantees of some machine learning methodologies, with special emphasis on neural networks. Throughout the mini-course we will take a look at some contemporary lines of research on the topic.

  • Dr. Oscar Dalmau

Title: «Parametric machine learning in Keras»

Abstract: Machine learning can be seen as the process of learning a function that maps input variables to output variables.

The learning process is typically carried out by an optimization algorithm. In this way, the algorithm learns the function from a training dataset. To simplify the learning process, we can restrict the problem to parametric functions (parametric machine learning). In this mini-course we will give an introduction to Keras for parametric machine learning. The mini-course is divided into three sessions: session 1 covers regression problems, session 2 the multilayer perceptron (a neural network), and session 3 convolutional neural networks for image classification.
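
To make the idea of learning a parametric function by optimization concrete, here is a minimal sketch that fits f(x) = w * x + b by gradient descent on the mean squared error. It uses plain NumPy rather than Keras, and the function name `fit_linear`, the learning rate, the step count and the synthetic data are all assumptions chosen for illustration, not material from the mini-course.

```python
import numpy as np

def fit_linear(x, y, lr=0.1, steps=2000):
    """Fit f(x) = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0                              # parameters of the function family
    for _ in range(steps):
        error = (w * x + b) - y                  # prediction error on the training set
        w -= lr * 2.0 * np.mean(error * x)       # gradient of the MSE with respect to w
        b -= lr * 2.0 * np.mean(error)           # gradient of the MSE with respect to b
    return w, b

# Noise-free synthetic training data generated from w = 2, b = 1 (assumed).
x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x + 1.0
w, b = fit_linear(x, y)
```

A library such as Keras supplies the same ingredients (a parametric model, a loss and an optimizer) for much larger function families, for example multilayer perceptrons and convolutional networks.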

  • Dr. Arno Siri-Jégousse

Title: «Random models in population genetics and phase-type distributions»


Genetic data pose complex problems, mainly because of their size and their correlation. Samples can be very large, both in the number of sites to explore and in the size of the population studied. Moreover, the data are strongly shaped by the evolutionary history of the population and therefore cannot be considered independent. In this talk I will introduce classical models that help to obtain estimators of the evolutionary parameters. I will also present newer, finer models that make it possible to take into account evolutionary hypotheses such as selection or bottlenecks (drastic drops in population size). Finally, I will propose a simple method, based on phase-type distributions, to determine which of the different models best matches reality.
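
As a small illustration of a phase-type distribution in this setting, the sketch below treats the height of a Kingman coalescent tree (a classical model, used here as an assumed example and not necessarily the one from the talk) as the absorption time of a continuous-time Markov chain, and computes its mean as -alpha S^{-1} 1. The function names are hypothetical.

```python
import numpy as np

def kingman_subintensity(n):
    """Sub-intensity matrix of the lineage-count chain, states n, n-1, ..., 2."""
    ks = np.arange(n, 1, -1)                 # transient states: number of lineages
    rates = ks * (ks - 1) / 2.0              # with k lineages, coalescence rate is k choose 2
    S = np.diag(-rates)                      # total exit rate on the diagonal
    S += np.diag(rates[:-1], k=1)            # from k lineages the chain moves to k - 1
    return S

def phase_type_mean(alpha, S):
    """Mean absorption time of a phase-type distribution: -alpha S^{-1} 1."""
    expected_times = np.linalg.solve(-S, np.ones(S.shape[0]))
    return float(alpha @ expected_times)

# Sample of 4 individuals: the chain starts with 4 lineages with probability 1.
S = kingman_subintensity(4)
alpha = np.array([1.0, 0.0, 0.0])
mean_height = phase_type_mean(alpha, S)      # equals 1/6 + 1/3 + 1 = 1.5
```

Other models change only the matrix S and the initial vector alpha, so the same formula yields comparable summary quantities for each candidate model.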