" profile image

Benoît FAGOT

Big Data Developer

Contact Me

About Me

Hello, I'm Benoît. I recently completed a Master's degree in Intelligent Distributed Systems and have a strong interest in Big Data. I thrive on new technological challenges and am always eager to learn more about the algorithms and practices behind them.
I'm an easy-going person who gets along with everyone and enjoys working in a team. My hobbies include fitness, chess, binge-watching shows, and playing online video games with friends, and I'm always looking to discover new things.
I am currently looking for a Big Data job opportunity to gain more experience alongside experts, so feel free to contact me if you are interested in my profile.

Latest Projects


Web Sensors - Detection of events

We used streams of data from social networks such as Twitter and RSS feeds from news websites with the aim of discovering events in real time. We used different NLP techniques to pinpoint the place, date, description and topic of each event (Geonames API, TF-IDF n-grams, LDA topic extraction...). We backed up the output of our algorithms by matching each detected event with the best-fitting article harvested from our RSS feeds.
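As an illustration only (this is not the project's code), scoring word n-grams with TF-IDF can be sketched in plain Python. The function names and the sample headlines are made up for the example, and the real pipeline may have weighted or tokenized things differently.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_ngrams(docs, n=2):
    """Score every n-gram of every document with a basic TF-IDF weight.

    TF is the n-gram's relative frequency in its document; IDF is
    log(number of documents / number of documents containing the n-gram).
    """
    tokenized = [ngrams(doc.lower().split(), n) for doc in docs]

    # Document frequency: in how many documents each n-gram appears.
    doc_freq = Counter()
    for grams in tokenized:
        doc_freq.update(set(grams))

    num_docs = len(docs)
    scores = []
    for grams in tokenized:
        term_freq = Counter(grams)
        total = len(grams) or 1  # avoid dividing by zero on very short texts
        scores.append({
            gram: (count / total) * math.log(num_docs / doc_freq[gram])
            for gram, count in term_freq.items()
        })
    return scores


# Hypothetical headlines, invented for this example.
HEADLINES = [
    "fire in paris tonight",
    "concert in paris tonight",
    "fire in lyon",
]
```

Calling `tfidf_ngrams(HEADLINES)` returns one dictionary per headline; the n-grams that are rare across the collection get the highest scores, which is what makes them useful for describing a specific event.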

Python, Geonames, TF-IDF, DateFinder, MongoDB Atlas, PHP

Find out more


Management of subcontractor costs

I designed a data model that integrates the daily and monthly results of Coursier.fr's subcontracted couriers (number of kilometers, deliveries, extra services, vehicle used...) together with a calendar of variable prices that depend on the day and vehicle, which courier managers can edit from a web interface. I then automated the salary computations with a MariaDB procedure called from a shell script scheduled with Crontab.

PHP, AJAX, MariaDB, Shell, Google APIs


Wine dataset analysis - pattern mining

We analysed the wine dataset and applied classification algorithms to find out which features were most decisive for the class.

R, Rpart, Decision Trees, K-NN, Data Mining

Find out more


Multiple data sources mediator - data warehousing

We populated a local database from a CSV file, completed it with web scraping, looked up additional data on DBpedia, and finally used the OpenMovieDatabase REST service as our last data source.


Find out more

Other Interesting Projects

Smart automated stay offers from a Lucene-indexed SQL database

An intelligent system capable of automatically constructing stay offers for a fixed destination, based on client characteristics and requirements.


GitHub repository

Extraction of spatio-sequential phenomena from chess databases

I studied how the number of valid and safe moves available to each player, and the power of their pieces, evolve throughout the game, to test the hypothesis that these are correlated with the result of the game.

Java, JFreeChart, Chesslib, Data Mining

GitHub repository

Online Quiz game

We created a website where users can take quizzes in various categories and difficulty levels, track their scores, and compare them with others. Questions could be yes/no, open-ended, or multiple choice. We designed the entire data model and used a PostgreSQL database to store user data, quiz questions and answers, online chat history, and more.


GitHub repository

More on GitHub

Work Experience

SQL, AI, and Web Developer apprenticeship - Coursier.fr (2018 - 2020)

I managed a MariaDB database with millions of entries, integrated data from various APIs (Google Directions, OpenWeatherData...) and web scraping, and computed courier salaries and other metrics such as premiums and service rate. I built dashboards for HR, the accounting department, and courier managers. I also improved the performance of procedures by using memory tables, careful index choices, etc. I automated many processes, such as exporting couriers' theoretical and actual working hours to a third-party platform and calculating the distance of each courier's route. I helped improve the automatic delivery dispatching system by working on a linear regression of distance and speed, looking for a correlation between traffic/weather and delivery time, and mapping Paris as a list of rectangle coordinates to keep couriers from going in and out of the city.
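To give an idea of the rectangle approach, here is a minimal Python sketch, not the production code. The `Rect` type, the helper names, and the single bounding box below are assumptions invented for the example; the real mapping used a list of rectangles whose coordinates are not reproduced here.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle in latitude/longitude degrees."""
    south: float
    west: float
    north: float
    east: float


def in_rect(rect, lat, lon):
    """True if the point lies inside the rectangle (borders included)."""
    return rect.south <= lat <= rect.north and rect.west <= lon <= rect.east


def inside_city(rects, lat, lon):
    """True if the point falls in at least one rectangle of the city map."""
    return any(in_rect(rect, lat, lon) for rect in rects)


def route_stays_inside(rects, points):
    """True if every (lat, lon) stop of a route is inside the mapped city."""
    return all(inside_city(rects, lat, lon) for lat, lon in points)


# Hypothetical, very rough box for illustration only: a real map would
# use many smaller rectangles to follow the city boundary.
EXAMPLE_RECTS = [Rect(south=48.82, west=2.25, north=48.90, east=2.42)]
```

A dispatcher could call `route_stays_inside(EXAMPLE_RECTS, stops)` before assigning a delivery, so that a courier is not sent back and forth across the city boundary.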

Java Developer internship - University of Cergy-Pontoise (2018)

I looked for chess databases and analysed chess games to find interesting patterns. Based on the findings, I was asked to analyse whether some features of each player (number of pieces, power of the pieces, domination of the board, possibilities of safe moves, etc.) were correlated with the final result. I used Java to parse text files of games in FEN notation and recreate each move on a 2D array, calculating each feature over time and plotting the results on line charts.
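The idea of turning a FEN string into a 2D array and computing a feature from it can be sketched as follows. This is a Python illustration rather than the Java written during the internship, and the piece values are the conventional 1/3/3/5/9 scale, assumed here as a stand-in for "power of the pieces".

```python
# Conventional piece values (an assumption for this sketch); the king is
# given 0 because it is never captured.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


def fen_to_board(fen):
    """Convert the piece-placement field of a FEN string to an 8x8 list.

    Row 0 is rank 8 and row 7 is rank 1. Empty squares are None, white
    pieces are uppercase letters and black pieces are lowercase letters.
    """
    placement = fen.split()[0]
    board = []
    for rank in placement.split("/"):
        row = []
        for char in rank:
            if char.isdigit():
                # A digit encodes that many consecutive empty squares.
                row.extend([None] * int(char))
            else:
                row.append(char)
        board.append(row)
    return board


def material(board):
    """Return the (white, black) sums of piece values on the board."""
    white = black = 0
    for row in board:
        for piece in row:
            if piece is None:
                continue
            value = PIECE_VALUES[piece.lower()]
            if piece.isupper():
                white += value
            else:
                black += value
    return white, black
```

Applying `material(fen_to_board(fen))` to the position after each move gives one time series per player, which is the kind of curve that can then be plotted on a line chart and compared with the game result.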

My GitHub

GitHub contribution graph calendar (credit: IonicaBizau's GitHub Calendar widget).

GitHub activities (credit: Casey Scarborough's GitHub Activity Stream widget).