Christmas is almost here, and with it comes Santa, Eggnog & Spotify Wrapped.
Even if you’ve been living under a rock without a friend’s Spotify account to
borrow, you likely haven’t been able to escape the shock-and-awe bombardment that is Spotify’s genius ad campaign.
Confused? Let me remind you.
The sheer power of the listening data generated by Spotify’s 172 million paid premium subscribers [Q3 2021] might not have been realized had the right person not seen it as a marketing opportunity.
Spotify Wrapped has managed to turn this into a social media advertising phenomenon. The platform’s data and each individual user’s listening data are distributed in a stylized, visual format that is ripe for consumption on social media.
Looking at Google Trends, we can see that Spotify has managed to eclipse last year’s interest over time and hit the mythical 100 rating. This is no small feat for a large company that has had major peaks in the news cycle.
Something major must have happened, and that something is Spotify Wrapped.
Why is Spotify Wrapped so popular with its user base?
Take a look at these usage statistics [source]
Spotify usage time by region
| Region | Average daily usage (minutes) |
|---|---|
| Middle East & Africa | 124 |
With COVID-19 accelerating people’s dependency on the internet for work, entertainment and communication, I would argue that these figures are quite strong and show how much people depend on paid streaming services like Spotify, Apple Music & YouTube Music for their music fix.
As Spotify’s importance and market share have increased, there is a huge wealth of data that can be gleaned from your average user.
Now, as a marketer myself, I can safely say that most successful businesses find a way to tell a story. Whether it’s a true story or not, if it’s engaging, it will get attention.
Tapping into the passion that people feel as superfans of certain bands, singers and genres, Spotify Wrapped has transformed the simple act of listening to music into a performative expression of one’s music tastes.
Spotify’s neat, crisp visuals shape this data into an annual narrative, showing people the stats on the music & podcasts that they love. This form of interpretative data, in a shareable visual medium, is gold on social media.
There has been post after post on Reddit by users sharing which artists they were a top listener of. Peep the upvotes next to the main thread & comments. This specific thread hit the front page, which is no easy feat.
You can see that not only is there high positive sentiment and engagement around Spotify Wrapped, it’s also abundantly clear that artists take these listeners seriously, with tickets & promotional gifts given away as a thank you for being a top listener.
Limitations of Spotify’s data set
Interestingly, this data comes with a major consideration: how much does Spotify’s recommendation system influence the annual Wrapped data?
Is it logical to assume that the algorithm suggests Aesop Rock to this user, influencing them to listen to 78,010 minutes of Aesop Rock?
Or that a user decided that all they wanted to do was listen to Blink 182 ad nauseam to enter the top 0.05% of listeners?
I ask this question as it is important to understand the power of AI. In some ways, the mathematical models of recommendation systems program our tastes, or at the very least influence them.
Netflix spends millions upon millions fine-tuning and configuring its recommendation system, going as far as to personalize the movie artwork according to the end user. This gives us a clue as to how important search, discovery and recommendation algorithms are to software companies, as well as the underlying data that fuels them.
Deeper mining of Spotify Wrapped.
The prospect of mining the data set of Spotify’s user base, or at the very least its artists’ music data and statistics, is mouthwatering. Alas, there is no publicly accessible API for the entire platform’s data (there’s an idea for you, Spotify!). However, through their Web API, they do offer the ability to make requests for accounts you can access: https://developer.spotify.com/documentation/web-api/
The Spotify REST API is primarily used to pull JSON metadata regarding songs, artists & albums on the platform.
Recognizing that user data is incredibly useful and engaging, Spotify has expanded the API to cover requests for user-related data, like playlists and music that the user saves in the Your Music library.
To access this data for a specific user, that user must authorize your application via the Spotify Accounts service. Without this authorization, the API will not return the data you are asking for, even if you specify a Spotify user ID.
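As a minimal sketch of that authorization step, the snippet below builds the URL a user would open to grant access, then prepares (without sending) a request for their saved tracks. The two endpoints and the `user-library-read` scope are taken from Spotify’s Web API documentation. The client ID, redirect URI and access token are placeholders, and the helper names are my own, not part of any Spotify SDK.

```python
from urllib.parse import urlencode
from urllib.request import Request

AUTHORIZE_URL = "https://accounts.spotify.com/authorize"
SAVED_TRACKS_URL = "https://api.spotify.com/v1/me/tracks"


def build_authorize_url(client_id, redirect_uri, scope="user-library-read"):
    """Return the URL the user opens to approve access (authorization code flow)."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": scope,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def build_saved_tracks_request(access_token, limit=20):
    """Prepare (but do not send) a GET request for the user's saved tracks."""
    url = f"{SAVED_TRACKS_URL}?{urlencode({'limit': limit})}"
    return Request(url, headers={"Authorization": f"Bearer {access_token}"})


# Placeholder values: real ones come from your app in the Spotify developer dashboard
# and from the token exchange that follows the user's approval.
auth_url = build_authorize_url("YOUR_CLIENT_ID", "http://127.0.0.1:8000/callback")
request = build_saved_tracks_request("ACCESS_TOKEN_FROM_SPOTIFY")
print(auth_url)
print(request.full_url)
```

The step in between, exchanging the returned code for an access token, is left out here to keep the sketch short. Without a valid token in the `Authorization` header, the saved-tracks request would be rejected.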
For a great overview of the deeper processes involved, it’s highly recommended that you view Pavan Sanagapati’s tutorial on Spotify Music API – Data Extraction.
Technical Commentary by Michelangiolo Mazzeschi – Data Evangelist at Relevance AI
Spotify has not released information regarding the details of its recommendation system. There are several ways in which Machine Learning can be used to enhance the user experience based on a user’s historical preferences, from Collaborative Filtering to Neural Network-based recommendations.
In this post, we are going to explore one of the most advanced recommendation approaches, which uses vector-based search as its core search engine. The advantage of using vectors to encode songs into a multi-dimensional space is that we can find the relationships between different songs, meaning that we can immediately group songs that are similar. Once a model has understood the main preferences of a user, it can suggest the songs that best correspond to their musical taste.
Encoding Spotify’s songs
Before we can recommend any song to a user, the main issue is that all the data is categorical. Even when using tags or song content, the data is too complex to represent directly: how would you convert a song lyric into a number? For this purpose we use something called an encoder, an algorithm that converts categorical data into a series of numbers that we call a vector.
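To make the idea concrete, here is a deliberately simple sketch of an encoder, assuming each song is described only by a handful of tags. Real encoders are learned models. The multi-hot scheme, the `encode_tags` helper, the vocabulary and the tags assigned to each song below are all illustrative assumptions, not anything Spotify has published.

```python
# Hypothetical tag vocabulary: each position in the vector stands for one tag.
VOCABULARY = ["rock", "heavy", "guitar", "pop", "sixties", "vocal"]


def encode_tags(tags, vocabulary=VOCABULARY):
    """Turn a list of categorical tags into a multi-hot vector of 0.0/1.0 values."""
    present = set(tags)
    return [1.0 if tag in present else 0.0 for tag in vocabulary]


# Tags invented for illustration only.
highway_to_hell = encode_tags(["rock", "heavy", "guitar"])
its_my_party = encode_tags(["pop", "sixties", "vocal"])

print(highway_to_hell)  # [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(its_my_party)     # [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
```

A learned encoder would produce dense vectors in which related tags or lyrics land near each other, which a multi-hot scheme like this cannot do.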
Once we have converted all the songs in Spotify into vectors, we can visualize them on a Cartesian plane (note that in this example I am only using two AC/DC songs and one song by Lesley Gore):
As we can immediately see, songs that are similar to each other in terms of tags, content, description and lyrics occupy the same region in space. For example, I can assume that the cluster of blue dots represents Heavy Metal songs, while the turquoise dots are likely vintage songs.
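The notion of songs occupying the same region can be sketched with a similarity measure. The example below uses cosine similarity, one common choice among several. The 2-D coordinates are made up purely for illustration, and "Back in Black" is my assumed stand-in for the second AC/DC song.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up 2-D coordinates, chosen only to illustrate clustering.
songs = {
    "Highway to Hell": [0.9, 0.1],
    "Back in Black": [0.8, 0.2],
    "It's My Party": [0.1, 0.9],
}


def most_similar(title, catalog):
    """Return the other song whose vector is closest in angle to `title`."""
    others = (name for name in catalog if name != title)
    return max(others, key=lambda name: cosine_similarity(catalog[title], catalog[name]))


print(most_similar("Highway to Hell", songs))  # Back in Black
```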
The difficulty of encoding songs and podcasts into a vector space is that several features provide valuable information on the relationships between elements. The content of a song would be the main feature, but each song is also given one or more tags, belongs to an album or a collection, and may have one or more artists who contributed to its creation.
To take advantage of multiple features, we use a technology called multi-vector search, which performs a search that takes into consideration the various features of each item, each one corresponding to a separate vector space.
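One way to picture multi-vector search, offered as a rough sketch rather than a description of any particular product, is to keep one vector per feature for every item, score each feature separately, and combine the scores with weights. The field names (`lyrics`, `tags`), the weights and all vectors below are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def multi_vector_score(query, item, weights):
    """Weighted sum of per-field cosine similarities between query and item."""
    return sum(w * cosine_similarity(query[f], item[f]) for f, w in weights.items())


def multi_vector_search(query, catalog, weights):
    """Rank catalog titles from best to worst combined score."""
    return sorted(
        catalog,
        key=lambda title: multi_vector_score(query, catalog[title], weights),
        reverse=True,
    )


# Each item holds one vector per feature, i.e. one point in each vector space.
catalog = {
    "Highway to Hell": {"lyrics": [0.9, 0.1], "tags": [1.0, 0.0]},
    "It's My Party": {"lyrics": [0.2, 0.8], "tags": [0.0, 1.0]},
}
query = {"lyrics": [0.7, 0.3], "tags": [0.9, 0.1]}
weights = {"lyrics": 0.6, "tags": 0.4}  # hypothetical importance of each feature

ranking = multi_vector_search(query, catalog, weights)
print(ranking)  # ['Highway to Hell', "It's My Party"]
```

Changing the weights shifts how much each feature space contributes, which is the main tuning knob in a scheme like this.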
The way a vector-based recommendation system works is by creating a dedicated vector space for each user on the platform. This dedicated space only hosts vectors that are part of the user’s history. Depending on the complexity we wish to adopt and our storage limitations, we can weight these vectors differently according to several factors, sometimes even ignoring some of them.
Given all of a user’s preferences, captured in their dedicated embedding space, we can identify one location that best represents the average of all their previous preferences.
We can then use this location in the general item embedding to find the best matches. Note that we are not limited to a single representative user vector, but we can use multiple, if we wish.
The representative dot of user1 is closer to It’s My Party, while the representative dot of user2 is closer to Highway to Hell: we have just simulated a vector-based recommendation system.
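The simulation described above can be sketched in a few lines, assuming made-up 2-D vectors for the two songs and for each user’s listening history. The `user_vector` and `recommend` helpers are hypothetical names, and Euclidean distance is just one possible way to measure "closer".

```python
import math

# Made-up 2-D item vectors for the two songs named in the text.
catalog = {
    "Highway to Hell": [0.9, 0.1],
    "It's My Party": [0.1, 0.9],
}


def user_vector(history, weights=None):
    """Weighted average of the vectors in a user's listening history."""
    weights = weights or [1.0] * len(history)
    total = sum(weights)
    dims = len(history[0])
    return [sum(w * v[d] for w, v in zip(weights, history)) / total for d in range(dims)]


def recommend(user_vec, items):
    """Return the title whose vector is nearest (Euclidean) to the user vector."""
    return min(items, key=lambda title: math.dist(user_vec, items[title]))


# Hypothetical histories: each entry is the vector of a song the user played.
user1_history = [[0.2, 0.8], [0.1, 0.7]]
user2_history = [[0.8, 0.3], [0.7, 0.1]]

print(recommend(user_vector(user1_history), catalog))  # It's My Party
print(recommend(user_vector(user2_history), catalog))  # Highway to Hell
```

The optional `weights` argument reflects the idea of weighting history vectors differently, for example giving recent plays more influence. A weight of zero ignores a vector entirely.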
Key takeaways and considerations
It is clear that, despite valid privacy considerations and concerns, the benefits of big data are too large to ignore. The world is richer for the insights that our collective data provides, whether from a user experience, healthcare, business analytics or legislation point of view.
With great power comes great responsibility, yet with the right ethical considerations, businesses can put the anonymized data they collect to effective use.
Data such as:
- Initial trial account usage
- User firmographics
- User demographics
- UI engagements
- API & SDK Requests
- Usage predictions based on usage data
- Customer churn data
- Product Purchasing data
Can be used for:
- Data as content
- Marketing Strategy
- Business Strategy
- Understanding weaknesses in the UI / UX to flag for improvement
- Buyer Persona Insights
- Conversion rate optimization
- Content customization
- Sales and Marketing automation
- Development & Product Roadmaps
With this in mind, and the clear power of Spotify’s data as content, we ask you:
1) What useful and important data does your business capture currently?
2) And how will you leverage this data in an ethical manner within your product, development, sales, marketing, or general business activities?