Carousel Personalization in Music Streaming Apps with Contextual Bandits

Recommending relevant and personalized content to users is crucial for media service providers, such as news,
video or music streaming platforms. Indeed, effective recommender systems improve the users’ experience and
engagement on the platform, by helping them navigate through massive amounts of content, enjoy their favorite videos
or songs, and discover new ones that they might like. As a consequence, significant efforts have been devoted to
transposing promising research on these topics to industrial-scale applications.

In particular, many global mobile apps and websites, notably from the music streaming industry, currently
leverage swipeable carousels to display recommended content on their homepages. These carousels,
also referred to as sliders or shelves, consist of ranked lists of items or cards (albums,
artists, playlists…). A few cards are initially displayed to the users, who can click on them or swipe on the
screen to see some of the additional cards from the carousel.

Carousels on Deezer

Selecting and ranking the most relevant cards to
display is a challenging task, as the catalog size is usually significantly larger than the number of available
slots in a carousel, and as users have different preferences. While close to slate recommendation
and learning-to-rank settings, carousel personalization also requires dealing with user feedback to adaptively
improve the recommended content via online learning strategies, and accounting for the fact that some cards from
the carousel might not be seen by users due to the swipeable structure.

In this paper, we model carousel personalization as a multi-armed bandit problem with multiple plays.
Within our proposed framework, we account for important characteristics of real-world swipeable carousels,
notably by considering that media service providers have access to contextual information on user preferences,
that they might not know which cards from a carousel are actually seen by users, and that feedback data from
carousels might not be available in real time.
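
To make the "multiple plays" setting more concrete, here is a minimal Python sketch in which each round selects several cards at once and then updates per-card click estimates. It is an illustration only, not the method evaluated in the paper: the class name `EpsilonGreedyMultiplePlays`, the epsilon-greedy strategy, the `n_initial` parameter and the `simulate` helper are all assumptions introduced here. The sketch also ignores user context and updates after every round, whereas the framework above allows for contextual information and delayed feedback. The update rule assumes that the cards initially displayed are always seen, and that a click further down the carousel implies that all cards above it were seen too.

```python
import random


class EpsilonGreedyMultiplePlays:
    """Hypothetical epsilon-greedy policy that picks several cards per round."""

    def __init__(self, n_cards, n_slots, epsilon=0.1, seed=0):
        self.n_cards = n_cards      # catalog size (number of arms)
        self.n_slots = n_slots      # cards placed in the carousel each round
        self.epsilon = epsilon      # probability of a fully random carousel
        self.rng = random.Random(seed)
        self.displays = [0] * n_cards   # times each card was assumed seen
        self.clicks = [0] * n_cards     # clicks observed for each card

    def estimated_rate(self, card):
        # Empirical click rate; 0.0 for cards that were never assumed seen.
        if self.displays[card] == 0:
            return 0.0
        return self.clicks[card] / self.displays[card]

    def select(self):
        # Explore: random carousel. Exploit: top cards by estimated click rate.
        if self.rng.random() < self.epsilon:
            return self.rng.sample(range(self.n_cards), self.n_slots)
        ranked = sorted(range(self.n_cards), key=self.estimated_rate, reverse=True)
        return ranked[: self.n_slots]

    def update(self, carousel, clicked, n_initial=3):
        # Assumed feedback model: the first n_initial cards are always seen,
        # and every card up to the last clicked position is treated as seen.
        clicked = set(clicked)
        positions = [i for i, card in enumerate(carousel) if card in clicked]
        last_seen = max([n_initial - 1] + positions)
        for card in carousel[: last_seen + 1]:
            self.displays[card] += 1
            if card in clicked:
                self.clicks[card] += 1


def simulate(policy, click_probs, n_rounds, seed=1):
    # Toy environment: each displayed card is clicked independently with its
    # own probability (a simplification that ignores swiping behavior).
    rng = random.Random(seed)
    total_clicks = 0
    for _ in range(n_rounds):
        carousel = policy.select()
        clicked = [card for card in carousel if rng.random() < click_probs[card]]
        policy.update(carousel, clicked)
        total_clicks += len(clicked)
    return total_clicks
```

For example, `simulate(EpsilonGreedyMultiplePlays(n_cards=20, n_slots=5), click_probs, n_rounds=1000)` with a list of 20 made-up click probabilities returns the total number of clicks collected. Treating cards below the last click as unseen keeps the policy from penalizing cards the user may never have swiped to.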

Focusing on music streaming applications, we show the effectiveness of our approach by addressing a
large-scale carousel-based playlist recommendation task on the global mobile app Deezer.

Cumulative Regrets of Bandit Policies for Playlist Recommendation

Along with this paper, we publicly release large-scale datasets of user preferences for curated playlists on Deezer, and an open-source environment to recreate comparable learning problems.
The code is available on GitHub and the datasets are available
on Zenodo.

Datasets

This paper has been published in the proceedings of the 14th ACM Conference on Recommender Systems (RecSys 2020), and has been shortlisted among the “Best Short Paper Candidates”.