(Image credits: Akshay Gupta, Instagram: odds_are_awed) With the boom in technology, music streaming platforms have become very common. We all listen to our favourite songs on these platforms, and they use various classification methods to give personalized recommendations. This can be done by analyzing the audio itself or by various other methods. In my DataCamp project, I worked on data compiled by a research group known as The Echo Nest. Each track is identified by a track_id and described by several musical features: acousticness, danceability, energy, instrumentalness, liveness, speechiness, tempo and valence. These features are recorded on a scale of -1 to 1, and they are what will be used to classify songs as “hip-hop” or “rock”. There is another file with general information about the songs. Only the Genre column is needed from this file; it will be the target variable. After importing data from both files and merging to form a single data frame, I checked for correlation between
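The merge-and-correlation step described above can be sketched roughly as follows, assuming pandas. The two small in-memory tables stand in for the two files; their values are made up for illustration, and the `title` column and the lowercase `genre` column name are assumptions rather than the real file layout.

```python
import pandas as pd

# Made-up stand-in for the audio-features file (values are illustrative only).
features = pd.DataFrame({
    "track_id": [1, 2, 3, 4],
    "acousticness": [0.10, 0.80, 0.30, 0.65],
    "danceability": [0.70, 0.20, 0.60, 0.35],
    "energy": [0.90, 0.30, 0.75, 0.40],
})

# Made-up stand-in for the general-information file; only genre is needed.
tracks = pd.DataFrame({
    "track_id": [1, 2, 3, 4, 5],
    "title": ["a", "b", "c", "d", "e"],  # hypothetical extra column
    "genre": ["Hip-Hop", "Rock", "Hip-Hop", "Rock", "Rock"],
})

# Keep only the join key and the target column, then inner-join on track_id
# so that only tracks present in both tables survive.
merged = features.merge(tracks[["track_id", "genre"]], on="track_id", how="inner")

# Pairwise Pearson correlation between the numeric features
# (the identifier and the target label are dropped first).
corr = merged.drop(columns=["track_id", "genre"]).corr()
print(corr.round(2))
```

An inner join is used here so that tracks without a genre label are dropped; whether the original project did the same is not stated.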
Hello, everyone! My name is Manan. I am a student at NMIMS University, pursuing a B.Tech in Data Science. I am starting this blog to talk about the Data Science projects I do, the new things I learn, and much more about this field. So far I have done only a few basic projects (such as Titanic, Housing Prices, MNIST, etc.). Today, I will be talking about the one I finished recently. It was the first time that I did a project completely on my own. The dataset I worked on is the PIMA Indians Diabetes Database, which is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Let's get started! Preparing the Data: If you check the dataset, there are no missing values. But, note that the values which were missing