Christmas is almost here, and we've been hearing a ton of Christmas carols and Christmas songs. My question is: how jolly are these songs, actually? This is a nice opportunity to test my Web Scraping & Text Mining skills.
This blog is the place where I share my work, projects, frustrations and discoveries. I designed it to be entertaining to some and educational to others. Sometimes, hopefully, both.
What I’ve been up to

This blog has been quiet for almost a year now. Mostly because I joined a Marketing Analytics company called Gradient Metrics as an all-round quantitative analyst. Gradient gave me the opportunity to combine my previous marketing experience with my newfound love for data science, and I never looked back.
Combining Marketing & Data Science

A successful marketing strategy will always require two dimensions. On one end, you’ll find creativity.
Together with a couple of friends, I’ve created our own personal Awesome Mix Vol. 1. Instead of being a tape with 13 songs, however, ours holds roughly 1,500 songs. Now I’m curious how our musical tastes differ from one another, and also what kind of musical clusters we have created in our playlist.
Let’s get started.

##     Rspotify    spotifyr   tidyverse       knitr  kableExtra    ggthemes
##         TRUE        TRUE        TRUE        TRUE        TRUE        TRUE
##  highcharter   htmltools widgetframe     cluster  factoextra        here
##         TRUE        TRUE        TRUE        TRUE        TRUE        TRUE

First, I’ll have to extract the audio features of each song in the playlist.
Introduction

I love traveling, and I love world maps, even more so when I can hang them on my walls. Now that got me thinking. What if I could create maps, with a similar look and feel, of all my holiday destinations? And what if I plot my location progression on top of the destination? That would be absolutely awesome!
In a recent post, I described how Google often tracks your location (if you agreed to it), and plotted each of my tracked locations on a map.
1 Introduction
2 Packages and initialisations
3 Exploratory Data Analysis (EDA)
  3.1 Missing Values
  3.2 Discrete Variables
  3.3 Continuous Variables
  3.4 Correlations
  3.5 Boxplots
4 Data Description
5 Data Preparation
6 Model Choices
7 Model Preparations
8 Applying the Models
  8.1 Linear Regression
  8.2 Partial Least Squares
  8.3 Principal Component Regression
  8.4 Ridge Regression
  8.5 Lasso Regression
  8.6 Elastic Net
  8.7 Neural Networks
  8.8 MARS
  8.9 SVM
  8.10 KNN
9 Model Performances
10 Model Comparison
11 Final Notes

1 Introduction

This blog post is a comprehensive summary of predictive modeling with regression techniques.
Web Scraping
Required libraries
List of Songs
Lyrics of the Songs
Text Analysis
Top 15 words
Christmas Carol Sentiments
Positive vs Negative
Most Positive & Most Negative Christmas Carol

Christmas is almost here, and we’ve been hearing a ton of Christmas carols and Christmas songs. My question is: how jolly are these songs, actually? This is a nice opportunity to train my Web Scraping & Text Mining skills.
A Researcher’s Love for Excel and SPSS

Working for the research department at the municipality of Rotterdam (OBI), I discovered their love for Excel spreadsheets and SPSS documents. Hundreds, no, thousands of them. Each file additionally has another 100 sheets. Sometimes I needed to join sheets, or bind columns and rows, which is tedious in Excel itself. I started scouring packages and functions to find what I needed. In this post, you’ll find some tips and tricks that helped me speed things up.
Loading Libraries
Data Wrangling
House Prices over time - Netherlands
Netherlands
Rotterdam
Rotterdam Vs. Netherlands

As a Data Science Trainee for the municipality of Rotterdam, I was tasked with finding out-of-the-box data sources to measure economic growth within the city. Naturally, I turned towards online estate agents. Funda and Jaap.nl are the biggest players in the Netherlands. Fortunately for me, Jaap.nl makes aggregated data for each municipality publicly available.
Visualising Google Tracking Data
Loading & Cleaning data
Static Visualisation
Interactive Visualisation

Visualising Google Tracking Data

In 2016 I let Google track my location. Every time my phone sent an update to Google, a new record was created. By adding up the records for each combination of longitude and latitude coordinates, I was able to recreate the spots where I spent most of my time.
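The counting step described above can be sketched roughly as follows. This is a minimal base R illustration, not necessarily the code used in the full post: the data frame `locations`, its column names `lat` and `lon`, and the sample coordinates are all assumptions made up for the example.

```r
# Hypothetical sample of tracked records; a real export would have far more rows
# (column names and values are assumptions for illustration only)
locations <- data.frame(
  lat = c(51.92, 51.92, 51.92, 52.37, 52.37),
  lon = c(4.48, 4.48, 4.48, 4.90, 4.90)
)

# Give every record a weight of 1, then sum the weights
# for each latitude/longitude combination
location_counts <- aggregate(
  n ~ lat + lon,
  data = transform(locations, n = 1L),
  FUN = sum
)

# Put the combinations with the most records first
location_counts <- location_counts[order(-location_counts$n), ]

print(location_counts)
```

With real tracking data, the coordinates would probably need to be rounded first, since raw readings rarely repeat to the last decimal.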
Loading & Cleaning data

## Load packages & Install if necessary
ipak <- function(pkg) {
  new.
When you’ve written the same code 3 times, write a function
When you’ve given the same in-person advice 3 times, write a blog post
- David Robinson (@drob), November 9, 2017

My Story

I started learning R for real in the summer of 2017, after graduating with my MSc in Marketing Management. My graduation thesis was all about Recommendation Systems and their impact. I wrote my thesis for a festival application called Appic.