Alejandro Rabasa (University Miguel Hernández of Elche), Nuria Mollá Campello (University Miguel Hernández of Elche) and Agustín Pérez Torregrosa (University Miguel Hernández of Elche).

Abstract. This paper presents the design of an in-depth descriptive analysis of data collected through public surveys at tourist information points. The dataset compiles information on trips to the Valencian Community (Spain). The aim of the study is to describe the patterns (association rules) associated with a given type of expense and to show how they could be used to improve the services offered to tourists. The expenses analyzed are transport, accommodation, and leisure, as well as total daily expenses and total daily expenses per person. Cases where expenses are especially high or low are considered particularly important because of their strategic interest for the public administration of tourism. The study starts with data preprocessing, followed by pattern extraction for the sub-samples with very high and very low expenses; in some cases, zero expenses are treated not as outliers but as a distinct group of individuals. The study then extracts the most important attributes (feature selection) to build a classification model and compares its efficiency with that of models that use the complete set of attributes. Finally, the paper outlines possible future predictive models that could improve the planning of public tourist services in the Valencian Community (Spain).

Keywords. Feature Selection; Pattern Discovery; Predictive Tourism Analysis.