
A classification strategy for the Yellow-Taxi demand Prediction

Ayoub Berdeddouch



This post shares the abstract of a talk presented at the DSSV-ECDA Colloquium, held 7-9 July 2021, entitled "A classification strategy for the Yellow-Taxi demand prediction and visualization tools".

Data Science, Statistics & Visualisation (DSSV) and the European Conference on Data Analysis (ECDA) are joint virtual conferences that aim to bring together researchers and practitioners interested in the interplay of statistics, computer science, and visualization, and to build bridges between these fields for interdisciplinary research. The event is a satellite conference of the World Statistics Congress and is promoted by the International Association for Statistical Computing (IASC), the Gesellschaft für Klassifikation (GfKl) – Data Science Society, the European Association for Data Science (EuADS), and the Dutch/Flemish Classification Society (VOC). DSSV-ECDA 2021 was hosted by Erasmus University Rotterdam, the Netherlands.


On the first day of this colloquium, alongside the other presenters, I gave my short talk entitled: A classification strategy for the Yellow-Taxi demand prediction and visualization tools.

The work was carried out under the supervision of Profs. Ali Yahyaouy and Rosanna Verde. You can read the abstract below.


Nowadays, there is a variety of transportation alternatives, such as trains, buses, subways, or even taxis, that try to lure new riders every day. This competition to attract passengers depends on factors such as travel time, reliability, and convenience. It produces a huge number of data sets that give planners and stakeholders the means to analyze urban travel patterns and, through them, gain meaningful insights to leverage a dynamic ecosystem. In this paper, we explore spatiotemporal variations of Yellow taxi trips by means of the New York City Taxi and Limousine Commission's open data set, analyzing millions of trip records (about 10 million rows per month). The paper presents an analysis strategy based on a graph representation of the direction of trips during the different hours of the day and days of the week. Hourly trip frequencies and density are compared between weekdays and weekend days, in the center of New York and in each borough of Manhattan. Clustering based on the graph makes it possible to analyze the evolution of the trips. The taxi trip duration is predicted with the Random Forest algorithm, using features such as fare amount, density, distance, and the wind rose. The study explores the taxi trips that neither start nor end at New York City airports, as well as the variation of travel speed during the day. Visualization tools are also proposed for detecting traffic density and the most frequent direction of the trips. This proposal could be used as a decision support tool for investments in rapid transit infrastructure and services that would be particularly effective in increasing transit mode share.

Keywords: Spatiotemporal, Yellow-Taxi, Visualization tool, classification
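
To make two of the ideas in the abstract more concrete, here is a minimal Python sketch of how hourly trip frequencies could be split between weekdays and weekend days, and how trip directions could be binned into the eight sectors of a wind rose. It is an illustration only, not the code used in the study: the trip tuple layout (pickup time, pickup latitude and longitude, drop-off latitude and longitude) and the function names `bearing_deg`, `sector`, and `summarize` are assumptions made for this example.

```python
import math
from collections import Counter
from datetime import datetime

# Eight compass sectors of a wind rose, each 45 degrees wide, centred on N, NE, ...
SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360]."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def sector(bearing):
    """Map a bearing in degrees to its wind-rose sector label."""
    # Shift by half a sector so that "N" covers 337.5..22.5 degrees.
    return SECTORS[int(((bearing + 22.5) % 360.0) // 45.0)]


def summarize(trips):
    """Count trips per (day type, hour) and per direction sector.

    `trips` is an iterable of hypothetical tuples:
    (pickup_time, pickup_lat, pickup_lon, dropoff_lat, dropoff_lon).
    """
    hourly = Counter()
    directions = Counter()
    for pickup_time, plat, plon, dlat, dlon in trips:
        # datetime.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
        day_type = "weekend" if pickup_time.weekday() >= 5 else "weekday"
        hourly[(day_type, pickup_time.hour)] += 1
        directions[sector(bearing_deg(plat, plon, dlat, dlon))] += 1
    return hourly, directions
```

Calling `summarize` on a list of such tuples returns two counters: one keyed by `("weekday" | "weekend", hour)` and one keyed by sector label. Counts like these could serve as input for the frequency comparison and the direction visualizations described in the abstract, or as candidate features for a trip-duration model.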