February 15, 2021

Voice Assistant Coexistence on the Sonos Platform

John Tolomei

Conversation between Alexa and Google Assistant

Sonos pioneered multiroom wireless audio, made it sound amazing, and changed the way people listen at home. We continue to build on this legacy with a mission to inspire the world to listen better. Over the years, we have added support for more than 100 streaming audio services worldwide. Our commitment to freedom of choice naturally extends to the way you control your music and content. Sonos is the only company that enables both Alexa and the Google Assistant to operate on one system - and offers both assistants built into every voice-enabled speaker. A user can install Amazon Alexa on some players in a household and Google Assistant on others in the same household. In such a household, a user can ask an Alexa-enabled speaker in the bedroom to play a song on a Google-enabled speaker in the kitchen (and vice versa).

A lesser-known feature of Sonos voice-enabled speakers is that they can be grouped to form coexistent, or concurrent, voice assistants that operate side by side within a speaker group. The Sonos platform recognizes such a speaker group as a single user-interface entity. When a user invokes either assistant in the group, the Sonos system infers that the voice commands are targeting the group rather than any individual speaker in the group.

This article provides a general overview of voice-assistant coexistence within a speaker group on the Sonos platform and how Sonos intermediates between multiple content and voice assistant services. This overview is presented in connection with an example user experience in which an Alexa-enabled Sonos speaker (Figure 1, left speaker) has been grouped into a speaker group with a Google-Assistant-enabled Sonos speaker (Figure 1, right speaker).

Fig. 1 - Sonos speakers grouped to form coexisting Alexa and Google voice assistants

In the above speaker group, two Sonos One speakers, named Bookshelf and Cabinet, have been grouped to form a playback group. The woman on the left has assigned the Bookshelf speaker to the Alexa voice assistant. The man on the right has assigned the Cabinet speaker to the Google voice assistant. Each speaker can be assigned to one of the voice assistants in the Sonos App, and both speakers can be grouped in the App, as shown in the inset.
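
The assignment and grouping described above can be pictured with a small sketch. This is an illustration only, not the Sonos implementation: the dictionary layout, the `resolve_target` helper, and the lowercase assistant labels are all hypothetical names chosen for this example. It shows the idea that a command heard by either speaker resolves to the whole group.

```python
from typing import Dict, FrozenSet, List

# Hypothetical household configuration mirroring the example in this article.
ASSISTANT_BY_SPEAKER: Dict[str, str] = {"Bookshelf": "alexa", "Cabinet": "google"}
GROUPS: List[FrozenSet[str]] = [frozenset({"Bookshelf", "Cabinet"})]


def resolve_target(speaker: str, groups: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Return the group containing the speaker that heard the wake word.

    An ungrouped speaker is treated as a group of one.
    """
    for group in groups:
        if speaker in group:
            return group
    return frozenset({speaker})


for speaker, assistant in sorted(ASSISTANT_BY_SPEAKER.items()):
    target = resolve_target(speaker, GROUPS)
    print(f"{assistant} on {speaker} -> {' + '.join(sorted(target))}")
```

In this sketch, both assistants end up addressing the same target, which is the property the rest of the walkthrough relies on.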

Using the “Hey Google” phrase, the man initiates music playback on the Cabinet + Bookshelf speaker group with a request to play “classic rock.” The Cabinet speaker, upon detecting its wake word, captures the spoken request and sends it to Google’s voice service on the back end. The Google voice service converts the spoken request to text and determines that the request involves music playback and control. Google and Sonos servers then coordinate to search for content and retrieve URIs for a particular content service. In this case, the Cabinet + Bookshelf group begins playing music via YouTube Music, which the user can assign as the default content service for Google Assistant. A user may likewise assign a default music service for Alexa, such as Spotify or Amazon Music.

Fig. 2 - Initiating voice playback on the Cabinet + Bookshelf speaker group via Google Assistant

To support many content providers and voice assistants, Sonos provides an API that allows voice assistants to access certain metadata and state information for each Sonos speaker in a household. This information includes a speaker’s playback state; the tracks in its queue; and the names of songs, artists, and albums. For instance, after the man initiates playback of the selected content, the metadata and state information indicate that the Cabinet + Bookshelf speaker group is currently playing the song “Simple Man” by “Lynyrd Skynyrd,” with the song “Sweet Child O’ Mine” next in the queue, as shown in Figure 3.

Fig. 3 - Metadata and state information of Cabinet and Bookshelf speakers after initiating music playback via Google Assistant in Fig. 2
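
To make the idea of shared metadata and state more concrete, here is a minimal sketch of how the information in Figure 3 might be modeled. The class and field names (`Track`, `GroupState`, `playback_state`, and so on) are assumptions made for this illustration and are not the actual Sonos API schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Track:
    title: str
    artist: str


@dataclass
class GroupState:
    """Hypothetical snapshot of a speaker group's shared playback state."""

    members: List[str]
    service: str
    playback_state: str
    queue: List[Track] = field(default_factory=list)
    position: int = 0  # index of the current track in the queue

    def now_playing(self) -> Optional[Track]:
        """Return the current track, or None if the queue is empty."""
        if 0 <= self.position < len(self.queue):
            return self.queue[self.position]
        return None

    def up_next(self) -> Optional[Track]:
        """Return the track after the current one, or None at the end."""
        nxt = self.position + 1
        return self.queue[nxt] if nxt < len(self.queue) else None


state = GroupState(
    members=["Cabinet", "Bookshelf"],
    service="YouTube Music",
    playback_state="PLAYING",  # placeholder value; the real state names may differ
    queue=[
        Track("Simple Man", "Lynyrd Skynyrd"),
        Track("Sweet Child O' Mine", "Guns N' Roses"),
    ],
)
```

Because the state is keyed to the group rather than to the assistant that started playback, any assistant with access to it can answer a question such as "what is playing?"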

When the woman asks Alexa what is playing (Figure 4), the Bookshelf speaker picks up where Google Assistant left off. In particular, the woman’s request is captured and sent to the Alexa voice service. Sonos facilitates this handoff by providing the Alexa service with the song name and artist of the track currently playing on the speaker group (Figure 3).

Fig. 4 - Navigating voice playback on Cabinet + Bookshelf speaker group with Alexa

Next, the woman asks Alexa to skip the current song. Notably, Sonos advances the queue without re-invoking the Google Assistant or switching away from the YouTube Music service. Instead, the Sonos cloud directs the speaker group to advance to the next track in the queue for the music service currently providing content to the speaker group. After advancing the queue, Sonos updates the state information of the speaker group, as shown in Figure 5.

Fig. 5 - Metadata and state information of the Cabinet and Bookshelf speakers after advancing music playback via Alexa in Figure 4
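
The skip behavior can be sketched as an operation on the group's queue that does not depend on which assistant asked for it. The `Group` class, the `skip_track` function, and the `started_by` field are hypothetical names for this illustration; the real coordination happens in the Sonos cloud and is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Group:
    service: str
    queue: List[str]
    position: int = 0
    started_by: str = ""  # assistant that began playback, kept for reference only


def skip_track(group: Group, requested_by: str) -> str:
    """Advance the shared queue and return the new current track.

    requested_by is accepted only to show that it does not influence the
    result: the content service and the queue stay as they were.
    """
    if group.position + 1 >= len(group.queue):
        raise IndexError("no next track in queue")
    group.position += 1
    return group.queue[group.position]


group = Group(
    service="YouTube Music",
    queue=["Simple Man", "Sweet Child O' Mine"],
    started_by="google",
)
current = skip_track(group, requested_by="alexa")
```

After the call, the group is one track further along in the same YouTube Music queue, even though the request came through Alexa.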

Next, while the YouTube Music stream is still playing on the speaker group, the woman asks Alexa to play some heavy metal (Figure 6). Alexa processes this voice request and initiates playback of a new queue associated with a different music service on the Cabinet + Bookshelf speaker group. In response to the request for new music, the Alexa service changes the content provider to Spotify, which may be the default music service for the Bookshelf speaker. Sonos, in turn, updates the track information and the queue of music on Spotify (Figure 7A). The woman also asks Alexa to increase the Bookshelf speaker volume, which is reflected in the volume state information for this speaker (Figure 7B).

Fig. 6 - Selecting new content on Cabinet + Bookshelf speaker group with Alexa

Fig. 7A - Metadata and state information of Cabinet and Bookshelf speakers after selecting new content via Alexa in Figure 6

Fig. 7B - Volume and EQ state information of the Bookshelf speaker after asking Alexa to increase the volume in Figure 6
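
In contrast to a skip, a request for new content replaces both the content service and the queue. The following is a minimal sketch under stated assumptions: the `DEFAULT_SERVICE` mapping, the `GroupPlayback` class, and the `play_new_content` function are hypothetical, and the track names are placeholders rather than real search results.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Assumed per-assistant defaults, matching the example in this article.
DEFAULT_SERVICE: Dict[str, str] = {"google": "YouTube Music", "alexa": "Spotify"}


@dataclass
class GroupPlayback:
    service: str
    queue: List[str] = field(default_factory=list)
    position: int = 0


def play_new_content(group: GroupPlayback, assistant: str, tracks: List[str]) -> None:
    """Replace the group's queue with new content from the assistant's default service."""
    group.service = DEFAULT_SERVICE[assistant]
    group.queue = list(tracks)
    group.position = 0  # playback restarts at the head of the new queue


group = GroupPlayback("YouTube Music", ["Simple Man", "Sweet Child O' Mine"], position=1)
# Placeholder track names; a real request would return search results.
play_new_content(group, "alexa", ["<heavy metal track 1>", "<heavy metal track 2>"])
```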

When the man returns to the room (Figure 8), he asks Google to turn down the volume of the Bookshelf speaker. The Google Assistant takes control of the volume on the Alexa-enabled speaker by instructing Sonos on the back end to lower the volume of this speaker. For example, the Google Assistant service may send an instruction for Sonos to lower the volume to level 40 on the Bookshelf speaker, as reflected in the change in state information shown in Figure 9.

Fig. 8 - Controlling volume on the Alexa-enabled Bookshelf speaker with Google Assistant

Fig. 9 - Volume and EQ state information of the Alexa-enabled Bookshelf speaker after instructing Google Assistant to lower the volume in Figure 8
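
Volume differs from queue state in that it is tracked per speaker, even inside a group. The sketch below illustrates that distinction. The `set_volume` helper, the starting levels, and the 0 to 100 range are assumptions made for this example; only the target level of 40 comes from the scenario above.

```python
from typing import Dict


def set_volume(volumes: Dict[str, int], speaker: str, level: int) -> int:
    """Set one speaker's volume, clamped to an assumed 0-100 range.

    Other speakers in the group are left unchanged.
    """
    if speaker not in volumes:
        raise KeyError(f"unknown speaker: {speaker}")
    clamped = max(0, min(100, level))
    volumes[speaker] = clamped
    return clamped


# Starting levels are placeholders, not values taken from the figures.
volumes = {"Bookshelf": 55, "Cabinet": 30}

# The request arrives through Google Assistant on the Cabinet speaker,
# but it targets the Alexa-enabled Bookshelf speaker.
new_level = set_volume(volumes, "Bookshelf", 40)
```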

The examples above can be expanded to many other scenarios involving additional speakers and/or different types of players, such as the Sonos Beam, Move, and Arc. For any speaker group formed with voice-enabled Sonos speakers, users can initiate playback using their preferred voice assistant without mentioning the name of the group or any particular speaker in the group. Users can also skip tracks or otherwise navigate the speaker group’s queue without disrupting the current content source, playlist, or station. When a user wishes to change content, she may invoke her preferred assistant to do so, and Sonos will coordinate with the target content service on the back end. At any time, a user can ask a voice assistant to identify what is currently playing in the group, or to perform any of a number of other requests, such as controlling the volume on any individual speaker in the group. In all of these cases, Sonos seamlessly hands off control from one voice assistant to another without disrupting the user experience.