Time series data mining and its applications in real-world problems

October 4, 2021
12:00 pm

Title: Time series data mining and its applications in real-world problems

Speaker: Antonio Manuel Durán (Universidad de Córdoba)

Organizer: Jesús Javier Rodríguez Sala

Date: Monday, October 4, 2021, at 12:00.

Venue: Online.

To attend the talk, register here

Registration closes 20 minutes before the seminar begins.

Abstract: Information systems such as sensors currently produce large amounts of data, a volume that is expected to grow exponentially in the coming years. These data are often treated as time series, that is, chronologically collected data representing a time-varying function. Time series appear in a wide range of scientific fields, such as hydrology, paleoclimatology and air traffic, among others. This seminar presents the preprocessing, segmentation and prediction of time series, which are considered among the main tasks of time series data mining. Machine learning (ML) techniques and bio-inspired algorithms are illustrated through a set of real-world applications.

Bayesian Decision Tree Ensembling Strategies for Nonparametric Problems

September 17, 2021
5:00 pm

Title: Bayesian Decision Tree Ensembling Strategies for Nonparametric Problems

Speaker: Antonio Linero (University of Texas at Austin, USA)

Organizer: Xavier Barber

Date: Friday, September 17, 2021, at 17:00.

Venue: Online.

To attend the talk, register here

Registration closes 20 minutes before the seminar begins.

Abstract: In this talk we will make the case for using Bayesian decision tree ensembles, such as Bayesian additive regression trees (BART), for addressing some fully nonparametric problems. We present models for density regression and survival analysis, and argue that our approaches are both easier to use and more effective than more standard Bayesian nonparametric solutions (such as those based on mixture models). On the applied side, we show how to use our models to extract interesting features across several datasets. On the theoretical side, we also show that our models attain minimax-optimal rates of convergence of the posterior in high-dimensional settings. Throughout the talk we will emphasize the flexibility and ease of use of our approach: we obtain excellent results on all simulated and real data analyses using heuristically chosen "default" priors, and our software tools make it quite straightforward for researchers (both in principle and in practice) to embed our ensembles in larger models.

The concept of hyperbolicity of equilibria for a nonlocal and quasilinear problem

July 26, 2021
12:00 pm

Title: The concept of hyperbolicity of equilibria for a nonlocal and quasilinear problem

Speaker: Estefani Moraes Moreira (Universidade de São Paulo)

Organizer: José Valero

Date: Monday, July 26, 2021, at 12:00.

Venue: Online.

CLICK HERE TO WATCH THE SEMINAR

Abstract: In this talk, we study a dissipative one-dimensional parabolic problem with a nonlocal diffusivity. We address the hyperbolicity of equilibria in this context, for which linearization procedures are not directly applicable.

Bayes and the search for the perfect model

July 19, 2021
12:00 pm

Title: Bayes and the search for the perfect model

Speaker: Anabel Forte Deltell (Universidad de Valencia)

Organizer: Xavier Barber

Date: Monday, July 19, 2021, at 12:00.

Venue: Online.

CLICK HERE TO WATCH THE SEMINAR

Abstract: Understanding how the world around us works, and how the processes that take place in it relate to one another, has always been a primary goal for human beings. Mathematical and statistical models are a great tool for this purpose. However, in a world where access to data and information is ever easier and computational capacity keeps growing, choosing a suitable model becomes a major challenge. On the one hand, we need to determine which variables are involved in a process of interest. Is ambient temperature related to the performance of a solar panel? Can a country's growth be influenced by its energy consumption? Which components of DNA are related to resistance to a disease? On the other hand, we need to choose the form of the model: the probability distributions or equations appropriate to the process. These are two far from simple tasks, for which a multitude of approaches exist. Specifically, in this talk we will take a walk through the world of Bayesian model selection and the great challenges it has posed for the development of Statistics.

Complexity-based permutation entropy

July 12, 2021
12:00 pm

Title: Complexity-based permutation entropy

Speaker: José María Amigó (CIO)

Organizer: CIO

Date: Monday, July 12, 2021, at 12:00.

Venue: Online.

CLICK HERE TO WATCH THE SEMINAR

Abstract: Entropy is a useful concept in many areas of physics and applied mathematics, primarily thermodynamics (where it originated), statistical mechanics, information theory, dynamical systems and data analysis. In the first part of our talk, we will review (i) the Shannon entropy, (ii) the permutation entropy, which is the Shannon entropy in the symbolic representation of real-valued time series via permutations, and (iii) the axiomatic definition of the entropy of probability distributions, which leads to the weaker concepts of generalized entropy, composable entropy and group entropy.

In the second part, we will focus on the permutation entropy. Although this entropy has proven to be useful in data analysis, its theoretical aspects have remained limited to noiseless deterministic series (i.e., generated by dynamical systems), the main obstacle being the super-exponential growth of visible permutations with length when randomness (also in the form of observational noise) is present in the data. To overcome this shortcoming, we take a new approach through complexity classes, which are defined by the asymptotic growth of visible permutations with length. Thus, deterministic processes belong to the exponential class, while typical noisy processes belong to the factorial class. For the processes of each possible class, we will construct a group entropy that is finite and coincides with the conventional permutation entropy on the exponential class. This construction is completely general and can be applied to other situations.
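
As a rough illustration of the quantities mentioned in this abstract (and not the speaker's own construction), the Python sketch below computes the conventional permutation entropy, that is, the Shannon entropy -sum(p log p) of the empirical distribution of ordinal patterns, and counts the visible permutations of a given length. The function names, the logistic map as the deterministic example and uniform random numbers as the noisy example are all assumptions made for illustration. The group entropy described in the talk is not implemented here.

```python
import math
import random
from collections import Counter

def ordinal_patterns(series, order):
    """Return the ordinal pattern of every sliding window of length `order`."""
    patterns = []
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        # Indices that sort the window; the stable sort breaks ties by position.
        patterns.append(tuple(sorted(range(order), key=lambda k: window[k])))
    return patterns

def permutation_entropy(series, order):
    """Shannon entropy (in nats) of the empirical ordinal-pattern distribution."""
    counts = Counter(ordinal_patterns(series, order))
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def visible_patterns(series, order):
    """Number of distinct ordinal patterns that actually occur in the series."""
    return len(set(ordinal_patterns(series, order)))

def logistic_series(n, x0=0.3):
    """Illustrative deterministic series: iterates of x -> 4x(1 - x)."""
    xs, x = [], x0
    for _ in range(n):
        xs.append(x)
        x = 4.0 * x * (1.0 - x)
    return xs

rng = random.Random(0)                       # fixed seed, illustrative only
noise = [rng.random() for _ in range(20000)]
chaos = logistic_series(20000)

for order in (3, 4, 5):
    print(
        f"order={order}",
        f"visible(deterministic)={visible_patterns(chaos, order)}",
        f"visible(noise)={visible_patterns(noise, order)}",
        f"order!={math.factorial(order)}",
    )
```

For the noisy series one would expect every pattern to become visible once the series is long enough, which is the factorial growth the abstract refers to, whereas the deterministic series leaves some patterns unrealized.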

Areal sampling strategies for estimating totals and averages on a grid of quadrats: applications to forest surveys

July 5, 2021
12:00 pm

Title: Areal sampling strategies for estimating totals and averages on a grid of quadrats: applications to forest surveys

Speaker: Maria Chiara Pagliarella (INAPP of Rome and University of Cassino and Southern Lazio)

Organizer: Domingo Morales

Date: Monday, July 5, 2021, at 12:00.

Venue: Online.

CLICK HERE TO WATCH THE SEMINAR

Abstract: The Reduction of Emissions from Deforestation and forest Degradation (REDD) project was proposed and initiated in 2005. Monitoring forest cover with statistical methodologies is a key prerequisite. Forest cover is usually estimated at large scale by spatial sampling strategies, in which the study region is partitioned into polygons of equal size (e.g. quadrats). Then, a sample of N units is selected, and aerial photos are provided and visually interpreted to determine the forest cover. To incorporate the spatial aspect into the design and to account for spatial autocorrelation among the units, this paper compares familiar sampling schemes such as simple random sampling without replacement (SRSWR), one-per-stratum stratified sampling (STR) and systematic sampling (SIS) against ad hoc spatial schemes such as the Local Pivotal Method of the first type (LPM1) by Grafström et al. (2012), Generalized Random-Tessellation Stratified Sampling (GRTSS) by Stevens and Olsen (2004), Spatially Correlated Poisson Sampling (SCPS) by Grafström (2012), the draw-by-draw scheme avoiding the selection of contiguous units by Fattorini (2006) and the Doubly Balanced Spatial Sampling by Grafström and Tillé (2013). A simulation study is performed to compare and check the validity of each strategy.
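
To give a feel for the kind of simulation study the abstract describes, the sketch below compares only the three familiar designs (simple random sampling without replacement, one-per-stratum stratified sampling and aligned systematic sampling) on a synthetic grid of quadrats. Everything here is an assumption made for illustration: the grid size, the synthetic cover surface (a smooth gradient plus noise, standing in for spatial autocorrelation), the function names and the number of Monte Carlo runs. The spatial schemes cited in the abstract (LPM1, GRTSS, SCPS and the others) are not implemented.

```python
import random
import statistics

def make_grid(size, rng):
    """Synthetic forest cover per quadrat: gradient plus noise, clipped to [0, 1]."""
    return [[min(1.0, max(0.0, (r + c) / (2 * (size - 1)) + rng.gauss(0, 0.1)))
             for c in range(size)] for r in range(size)]

def srs(size, block, rng):
    """Simple random sample without replacement, same n as the other designs."""
    cells = [(r, c) for r in range(size) for c in range(size)]
    return rng.sample(cells, (size // block) ** 2)

def one_per_stratum(size, block, rng):
    """One random quadrat from each block x block stratum."""
    return [(r0 + rng.randrange(block), c0 + rng.randrange(block))
            for r0 in range(0, size, block) for c0 in range(0, size, block)]

def systematic(size, block, rng):
    """Aligned systematic sample: one random offset repeated in every block."""
    dr, dc = rng.randrange(block), rng.randrange(block)
    return [(r0 + dr, c0 + dc)
            for r0 in range(0, size, block) for c0 in range(0, size, block)]

def simulate(design, grid, block, runs, rng):
    """Monte Carlo mean and variance of the sample-mean estimator for a design."""
    size = len(grid)
    estimates = []
    for _ in range(runs):
        sample = design(size, block, rng)
        estimates.append(statistics.fmean(grid[r][c] for r, c in sample))
    return statistics.fmean(estimates), statistics.pvariance(estimates)

rng = random.Random(42)            # fixed seed, illustrative only
SIZE, BLOCK, RUNS = 20, 5, 2000    # SIZE must be a multiple of BLOCK
grid = make_grid(SIZE, rng)
true_mean = statistics.fmean(v for row in grid for v in row)

designs = {"srs": srs, "stratified": one_per_stratum, "systematic": systematic}
results = {name: simulate(fn, grid, BLOCK, RUNS, rng) for name, fn in designs.items()}

print(f"true mean cover: {true_mean:.4f}")
for name, (mean_est, var_est) in results.items():
    print(f"{name:>10}: mean of estimates={mean_est:.4f}, variance={var_est:.6f}")
```

All three designs give every quadrat the same inclusion probability, so the plain sample mean is an unbiased estimator in each case and the designs can be compared by the variance of their estimates. On a surface with a strong spatial trend such as this one, stratification would be expected to reduce that variance relative to simple random sampling, though the ranking on real forest data may differ.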