Small area estimation under a measurement error bivariate Fay–Herriot model (2020), Statistical Methods & Applications

Jan Pablo Burgard  (Trier University), María Dolores Esteban (University Miguel Hernández of Elche), Domingo Morales (University Miguel Hernández of Elche) and Agustín Pérez (University Miguel Hernández of Elche).

Abstract: The bivariate Fay–Herriot model is an area-level linear mixed model that can be used for estimating the domain means of two correlated target variables. Under this model, the dependent variables are direct estimators calculated from survey data and the auxiliary variables are true domain means obtained from external data sources. Administrative registers do not always provide good auxiliary variables, so statisticians sometimes take them from alternative surveys, in which case they are measured with error. We introduce a variant of the bivariate Fay–Herriot model that takes into account the measurement error of the auxiliary variables, and we give fitting algorithms to estimate the model parameters. Based on the new model, we introduce empirical best predictors of domain means and propose a parametric bootstrap procedure for estimating the mean squared error. Finally, we give an application estimating poverty proportions and gaps in the Spanish Living Conditions Survey, with auxiliary information from the Spanish Labour Force Survey.

Keywords: Multivariate models; Fay–Herriot model; small area estimation; measurement error; Monte Carlo simulation; poverty proportion; poverty gap
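The area-level modelling framework behind this paper can be illustrated with a deliberately simplified sketch: a univariate Fay–Herriot model with known sampling variances, fitted by maximum likelihood, followed by the usual shrinkage predictor. This is only a toy version; the paper's bivariate measurement-error model, its fitting algorithms, and the bootstrap MSE estimator go well beyond it.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)

# Simulate D areas under a univariate Fay-Herriot model:
#   y_d = x_d' beta + u_d + e_d,  u_d ~ N(0, s2u),  e_d ~ N(0, psi_d),
# where the sampling variances psi_d are treated as known.
D = 50
X = np.column_stack([np.ones(D), rng.normal(size=D)])
beta_true, s2u_true = np.array([1.0, 2.0]), 0.5
psi = rng.uniform(0.2, 0.8, size=D)              # known sampling variances
y = (X @ beta_true
     + rng.normal(0.0, np.sqrt(s2u_true), D)     # area random effects
     + rng.normal(0.0, np.sqrt(psi)))            # sampling errors

def profile_neg_loglik(s2u):
    """Negative log-likelihood with beta profiled out by weighted LS."""
    V = s2u + psi                                # diagonal of Var(y)
    W = 1.0 / V
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * y))
    r = y - X @ beta
    return 0.5 * (np.log(V).sum() + (W * r**2).sum())

s2u_hat = minimize_scalar(profile_neg_loglik, bounds=(1e-6, 10.0),
                          method="bounded").x
V = s2u_hat + psi
W = 1.0 / V
beta_hat = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * y))
gamma = s2u_hat / V                              # shrinkage factors in (0, 1)
eblup = X @ beta_hat + gamma * (y - X @ beta_hat)
```

The predictor `eblup` shrinks each direct estimate toward the regression synthetic part, with more shrinkage in areas whose sampling variance `psi_d` is large relative to the between-area variance.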

Online Seminar: Jan-J. Rückmann

8 June 2020
12:00 pm – 12:45 pm

Title: Stability of C-stationary Points for Mathematical Programs with Complementarity Constraints.

Speaker: Jan-J. Rückmann (University of Bergen)

Organizer: Juan Parra

Date: Monday, 8 June at 12:00.




We consider the class of mathematical programs with complementarity constraints (MPCC). Under an appropriate constraint qualification of Mangasarian-Fromovitz type we present a topological and an equivalent algebraic characterization of a strongly stable C-stationary point of MPCC. Strong stability refers to the local uniqueness, existence and continuous dependence of a solution for each sufficiently small perturbed problem, where perturbations up to second order are allowed. This concept of strong stability was originally introduced by Kojima for standard nonlinear optimization; here, its generalization to MPCC demands a sophisticated technique which takes the combinatorial properties of the solution set of MPCC into account.
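For readers unfamiliar with this problem class, the standard MPCC formulation and the usual definition of C-stationarity (textbook definitions, not taken verbatim from the talk) read:

```latex
\min_{x \in \mathbb{R}^n} f(x)
\quad \text{s.t.} \quad
g(x) \le 0,\; h(x) = 0,\; 0 \le G(x) \perp H(x) \ge 0 .
\]
A feasible point $\bar{x}$ is called \emph{C-stationary} if there exist
multipliers $\lambda \ge 0$, $\mu$, $\gamma$, $\nu$ such that
\[
\nabla f(\bar{x}) + \sum_i \lambda_i \nabla g_i(\bar{x})
 + \sum_j \mu_j \nabla h_j(\bar{x})
 - \sum_k \gamma_k \nabla G_k(\bar{x})
 - \sum_k \nu_k \nabla H_k(\bar{x}) = 0,
\]
with $\lambda_i g_i(\bar{x}) = 0$, $\gamma_k = 0$ if $G_k(\bar{x}) > 0$,
$\nu_k = 0$ if $H_k(\bar{x}) > 0$, and the sign condition
$\gamma_k \nu_k \ge 0$ on the biactive set
$\{k : G_k(\bar{x}) = H_k(\bar{x}) = 0\}$.
```

The combinatorial difficulty mentioned in the abstract stems precisely from this biactive set, on which the feasible region locally looks like a union of branches.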

Some matheuristic algorithms for multistage stochastic optimization models with endogenous uncertainty and risk management (2020), European Journal of Operational Research 285, 988–1001

Laureano F. Escudero (Rey Juan Carlos University of Madrid), M. Araceli Garín (University of the Basque Country), Juan F. Monge (Miguel Hernández University of Elche) and Aitziber Unzueta (University of the Basque Country).

Abstract: Two matheuristic decomposition algorithms are introduced. The first is of Progressive Hedging (PH) type, named the Regularized scenario Cluster Progressive Algorithm. The second is of Frank–Wolfe PH type, named the Regularized scenario Cluster Simplicial Decomposition Progressive Algorithm. An extension of endogenous Type III uncertainty is considered for representing the decision-dependent scenario probability and outlook. Their performance is tested in the time-consistent Expected Conditional Stochastic Dominance risk-averse environment. As a result of the modeling, the typical risk-neutral multistage mixed 0–1 linear stochastic problem under uncertainty is replaced with an enlarged model that is equivalent to the required mixed 0–1 bilinear model. Given the special features of the problem, it is unrealistic to seek the optimal solution for large-scale instances. Feasible solutions and lower bounds on the solution value of the original model are provided. In total, 48 strategies are considered, each consisting of a combination of a regularization norm, a calibration type for the PH pseudo-gradient computation, and a set of value intervals of the influential variables on a representative endogenous uncertainty-based piecewise function in the scenarios. Computational results are reported for a large-scale extension of a well-known real-life pilot case for preparedness resource allocation planning aimed at natural disaster relief. The matheuristics outperform the plain use of a state-of-the-art solver.

Keywords: Stochastic programming; Exogenous and endogenous uncertainties; Time-consistent stochastic dominance; Mixed 0–1 bilinear optimization; Scenario cluster-based decomposition algorithms
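The basic Progressive Hedging iteration that both algorithms build on can be shown on a toy one-variable convex problem: each scenario keeps its own copy of the first-stage decision, and nonanticipativity is enforced through quadratic penalties and dual weights. This is only the textbook PH scheme; the paper's regularized scenario-cluster and simplicial decomposition variants for mixed 0–1 bilinear models are far more elaborate.

```python
import numpy as np

# Toy Progressive Hedging on   min_x  sum_s p_s * (x - xi_s)^2,
# where each scenario s holds a copy x_s and the nonanticipativity
# constraint x_s = xbar is relaxed via PH weights w_s and penalty rho.
xi = np.array([1.0, 2.0, 6.0])   # scenario data
p = np.array([0.5, 0.3, 0.2])    # scenario probabilities
rho = 1.0                        # PH penalty parameter
x = xi.copy()                    # per-scenario solutions
w = np.zeros_like(xi)            # PH dual weights (sum_s p_s w_s = 0)
xbar = p @ x                     # implementable (nonanticipative) point

for _ in range(300):
    # Scenario subproblem  min_x (x - xi_s)^2 + w_s x + (rho/2)(x - xbar)^2
    # has the closed-form minimizer:
    x = (2 * xi - w + rho * xbar) / (2 + rho)
    xbar = p @ x                 # aggregate the scenario solutions
    w = w + rho * (x - xbar)     # update the dual weights
```

For this convex instance all scenario copies converge to the common decision `xbar = sum_s p_s * xi_s = 2.3`; in the mixed 0–1 setting of the paper, PH only serves as a matheuristic and convergence is no longer guaranteed.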

Online Seminar: Joscha Krause

1 June 2020

Título: Regularized Small Area Estimation: A Framework for Robust Estimates in the Presence of Unknown Measurement Errors

Speaker: Joscha Krause (Universität Trier)

Organizer: Domingo Morales

Date: Monday, 1 June at 12:00.




Small area estimation (SAE) provides stable estimates of area statistics in the presence of small samples. This is achieved by combining observations from multiple areas in suitable regression models. These models exploit the functional relation between the area statistic and contextually related covariate data to make predictions for the quantities of interest. An important assumption of this methodology is that the covariate data are measured correctly. If this does not hold, area statistic estimates can be severely biased or highly inefficient. In that case, methodological adjustments are required to allow for reliable results.

There are several approaches in the literature that allow for robust estimates from contaminated databases. Unfortunately, many of them share a common limitation: robust SAE techniques typically require distribution assumptions on the measurement error. These assumptions can be either explicit, by requiring a specific distribution, or implicit, by demanding that the distribution is known. However, both settings are rarely verifiable in practice.

We propose a new approach to robust SAE that does not require distribution assumptions on the measurement error. Using insights from robust optimization theory, we prove that regularized model parameter estimation is equivalent to the robust minimization of loss functions under arbitrary model matrix perturbations. This equivalence holds for many well-established regularized regression methods, such as the LASSO, ridge regression, and the elastic net. It allows us to produce reliable area statistic estimates in the presence of unknown covariate measurement errors.
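The stabilizing effect of regularization under covariate perturbations can be seen in a minimal numerical sketch (not from the talk): closed-form OLS and ridge fits on a clean and a contaminated design matrix, comparing how far each coefficient vector moves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative only: compare how much OLS and ridge coefficients move
# when the design matrix is observed with additive measurement error.
n, p = 200, 5
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.0, 0.0, 0.5, 0.0])
y = X @ beta + rng.normal(scale=0.5, size=n)

X_err = X + rng.normal(scale=0.7, size=X.shape)   # contaminated covariates

def ols(X, y):
    # Ordinary least squares via the normal equations
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, lam):
    # Ridge regression: (X'X + lam I)^{-1} X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

shift_ols = np.linalg.norm(ols(X_err, y) - ols(X, y))
shift_ridge = np.linalg.norm(ridge(X_err, y, 100.0) - ridge(X, y, 100.0))
```

In this toy setting `shift_ridge` comes out clearly smaller than `shift_ols`: the penalized estimator reacts less strongly to the design perturbation, which is the qualitative behaviour the robust-optimization equivalence formalizes.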

We build upon this result to derive a modified Jackknife algorithm that allows for conservative MSE estimation for predictions obtained on contaminated databases. In addition, we discuss consistency of model parameter estimation for regularized regression in this setting. The effectiveness of the methodology is demonstrated in a Monte Carlo simulation study.
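The Jackknife idea the talk builds on is the classical delete-one resampling scheme. A minimal sketch for the variance of a sample mean is below; the talk's modified Jackknife for MSE estimation of SAE predictors on contaminated data is a substantial extension of this.

```python
import numpy as np

# Delete-one jackknife variance estimate for the sample mean.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
n = len(x)
theta_hat = x.mean()

# Leave-one-out replicates of the estimator
theta_i = np.array([np.delete(x, i).mean() for i in range(n)])

# Jackknife variance: (n-1)/n * sum of squared deviations of replicates
var_jack = (n - 1) / n * np.sum((theta_i - theta_i.mean()) ** 2)
```

For the sample mean this reproduces the usual variance estimate s²/n exactly; for nonlinear estimators, such as shrinkage predictors, the replicates additionally capture the variability of the estimated model parameters.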

Evaluation of ontology structural metrics based on public repository data, Briefings in Bioinformatics, 21(2), 2020, 473–485

Manuel Franco (University of Murcia), Juana María Vivo (University of Murcia), Manuel Quesada-Martínez (Miguel Hernández University), Astrid Duque-Ramos (University of Antioquia) and Jesualdo Tomás Fernández-Breis (University of Murcia).

Abstract: The development and application of biological ontologies have increased significantly in recent years. These ontologies can be retrieved from different repositories, which do not provide much information about quality aspects of the ontologies. In the past years, some ontology structural metrics have been proposed, but their validity as measurement instruments has not been sufficiently studied to date. In this work, we evaluate a set of reproducible and objective ontology structural metrics. Given the lack of standard methods for this purpose, we have applied an evaluation method based on the stability and goodness of the classifications of ontologies produced by each metric on an ontology corpus. The evaluation has been done using ontology repositories as corpora. More concretely, we have used 119 ontologies from the OBO Foundry repository and 78 ontologies from AgroPortal. First, we study the correlations between the metrics. Second, we study whether the clusters for a given metric are stable and have a good structure. The results show that the existing correlations are not biasing the evaluation, there are no metrics generating unstable clusterings, and all the metrics evaluated provide at least reasonable clustering structure. Furthermore, our work allows us to review and suggest the most reliable ontology structural metrics in terms of stability and goodness of their classifications.

Keywords: biological ontologies; quantitative metrics; metrics comparison; data analysis.
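One common way to quantify the "goodness of clustering structure" mentioned in the abstract is the mean silhouette width; a self-contained 1-D sketch is below. This is only one plausible criterion, and the paper's exact stability and goodness measures may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two well-separated 1-D clusters as a stand-in for ontologies
# grouped by the value of some structural metric.
pts = np.concatenate([rng.normal(0.0, 0.3, 20), rng.normal(5.0, 0.3, 20)])
labels = np.array([0] * 20 + [1] * 20)

def mean_silhouette(x, labels):
    """Mean silhouette width: values near 1 indicate strong structure."""
    d = np.abs(x[:, None] - x[None, :])       # pairwise distances (1-D data)
    widths = []
    for i in range(len(x)):
        same = labels == labels[i]
        same[i] = False
        a = d[i, same].mean()                 # mean distance within own cluster
        b = min(d[i, labels == c].mean()      # nearest other cluster
                for c in set(labels) - {labels[i]})
        widths.append((b - a) / max(a, b))
    return float(np.mean(widths))

score = mean_silhouette(pts, labels)
```

With clearly separated clusters the score is close to 1; scores near 0 would indicate the kind of weak, unreliable structure the evaluation is designed to flag.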