(image credits: Akshay Gupta, Instagram: odds_are_awed)
Music streaming platforms are now part of everyday life, and most of us listen to our favourite songs on them. To give personalized recommendations, these platforms classify songs, for example by analyzing the audio itself or by other methods.
In my DataCamp project, I worked with data compiled by a research group known as The Echo Nest. It has the following fields for each track: acousticness, danceability, energy, instrumentalness, liveness, speechiness, tempo, track_id and valence. Here track_id only identifies the track and is not a musical feature. Features such as acousticness and danceability are recorded on a scale of -1 to 1.
These are the features
that will be used for classification of songs as “hip-hop” or “rock”. There is another fi
After importing the data from both files and merging them into a single data frame, I checked for correlation between the features and plotted the result as a heat-map.
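The merge and correlation check can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual code: the column `genre_top`, the four feature columns chosen, and the random values are all assumptions made so the example runs on its own.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the two files (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 200
features = pd.DataFrame({
    "track_id": np.arange(n),
    "acousticness": rng.uniform(0, 1, n),
    "danceability": rng.uniform(0, 1, n),
    "energy": rng.uniform(0, 1, n),
    "tempo": rng.uniform(60, 200, n),
})
genres = pd.DataFrame({
    "track_id": np.arange(n),
    "genre_top": rng.choice(["Rock", "Hip-Hop"], size=n, p=[0.8, 0.2]),
})

# Join the two tables on the shared track identifier.
tracks = features.merge(genres, on="track_id")

# Pairwise Pearson correlations between the numeric features only;
# the identifier and the label are dropped first.
corr = tracks.drop(columns=["track_id", "genre_top"]).corr()
print(corr.round(2))
```

The resulting `corr` table is what a heat-map would visualize: values near 1 or -1 off the diagonal would point to strongly related features.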
No strong correlations between the features were found, so no feature could be dropped as obviously redundant. To still reduce the number of dimensions and help the model perform well, I applied the most common dimensionality reduction technique, Principal Component Analysis (PCA).
PCA finds new axes (principal components) along which the data varies the most, and shows how much each feature contributes to that variance.
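A small sketch of this step with scikit-learn is shown below. It runs on random data, and both the scaling step and the 85% cumulative-variance cutoff are assumptions chosen for illustration; the project may have picked the number of components differently.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Random stand-in for the feature matrix (8 hypothetical features).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=300)  # make two columns nearly redundant

# PCA is sensitive to feature scale, so standardize first.
scaled = StandardScaler().fit_transform(X)

# Fit with all components to inspect how the variance is distributed.
pca = PCA().fit(scaled)
cum_var = np.cumsum(pca.explained_variance_ratio_)

# Keep the smallest number of components reaching the assumed 85% cutoff.
n_components = int(np.argmax(cum_var >= 0.85) + 1)
projected = PCA(n_components=n_components, random_state=0).fit_transform(scaled)

print(n_components, projected.shape)
```

The rows of `pca.components_` hold the weight of each original feature in each component, which is one way to read off how much a feature contributes.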
In the dataset, there are more entries labelled Rock than Hip-Hop. To keep the model from being biased toward the larger class, I reduced the number of Rock entries to match the number of Hip-Hop entries and performed the final PCA on the balanced data.
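One common way to do this kind of balancing is random downsampling of the larger class, sketched here with pandas. The data frame, the `genre_top` column name, and the class sizes are made up for the example, and random sampling is an assumption about how the reduction was done.

```python
import numpy as np
import pandas as pd

# Hypothetical imbalanced data: 400 Rock rows and 100 Hip-Hop rows.
rng = np.random.default_rng(0)
tracks = pd.DataFrame({
    "energy": rng.uniform(0, 1, 500),
    "genre_top": ["Rock"] * 400 + ["Hip-Hop"] * 100,
})

hop = tracks[tracks["genre_top"] == "Hip-Hop"]

# Randomly keep only as many Rock rows as there are Hip-Hop rows.
rock = tracks[tracks["genre_top"] == "Rock"].sample(n=len(hop), random_state=10)

balanced = pd.concat([rock, hop])
counts = balanced["genre_top"].value_counts()
print(counts)
```

Downsampling throws away data from the larger class, which is a trade-off: the model sees fewer examples overall but is no longer rewarded for simply predicting Rock.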
Finally, models were trained on the training set and used to predict on the test set. I used a Decision Tree and Logistic Regression, and evaluated both with a classification report.
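The training and evaluation step could look something like the sketch below. The data is synthetic, and the 75/25 split, the random seeds, and the default model settings are assumptions for illustration, so the printed numbers say nothing about the project's real results.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, balanced stand-in for the PCA-projected features and labels.
rng = np.random.default_rng(0)
n = 400
labels = np.array(["Hip-Hop"] * (n // 2) + ["Rock"] * (n // 2))
X = rng.normal(size=(n, 6))
X[labels == "Rock", 0] += 1.5  # give the classes some artificial separation

# Hold out a quarter of the rows for testing, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=10, stratify=labels
)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=10),
    "Logistic Regression": LogisticRegression(random_state=10),
}

# Fit each model and collect its per-class precision, recall and F1 report.
reports = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    reports[name] = classification_report(y_test, model.predict(X_test))
    print(name)
    print(reports[name])
```

Each report lists precision, recall and F1-score per genre, which makes it easy to see whether a model does noticeably worse on one class than the other.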
I hope you enjoyed reading about this project. Thanks to DataCamp for the opportunity to work on this interesting project. It was a great learning experience for me.
Find the code and datasets in my GitHub repository.
Find my LinkedIn profile here.