How Microsoft is cutting through the noise to create a more useful, beautiful ‘sound world’
You might never have thought about the sounds your computer emits when an email arrives, your battery runs low or a meeting reminder pops up on your screen. Matthew Bennett has. A lot.
Bennett personally composed, performed and digitally manipulated more than 400 versions of the Windows 10 calendar alert sound before choosing the perfect one.
“That’s just how long it took to get it right,” Bennett said with a shrug during a recent visit to his sound studio in Redmond, Washington. The ambiently lit, sound-damped room features a mixer, multiple high-end studio monitors and large LCD screens, and, front and center, a multi-octave synthesizer keyboard.
As audio creative director for a large portfolio of Microsoft software and devices, Bennett has played a key role in the company’s sound design for 15 years. He has strong opinions and well-developed philosophies about sound, as well as a highly specialized vocabulary to discuss it.
Summarizing his role, he reflected, “Our responsibility to customers is, first, do no harm – no annoying audio! Second, make it functional, and third, make it beautiful. Beauty and function go hand in hand. The more beautiful the design, the better it will support the experiences we’re creating.”
The Windows 10 family of sounds took many months to perfect, as he collaborated closely with key members of his team, including visual designers, researchers, project managers and engineers. “We iterate a lot to be sure every sound is just right,” he said.
A composer of classical and improvised music who has done extensive research on non-Western music cultures, Bennett carried out Ph.D. work in ethnomusicology (the anthropology of music) at the University of Washington, before leaving the program to accept his first full-time position at Microsoft. After a five-year stint, he struck out to form his own agency, and for the next decade devoted himself to creating scores for film and television, as well as brand sound design for Fortune 500 companies. But he eventually became dissatisfied with the music he was creating.
Seeking new inspiration, he quit composing to study medieval chant and the musical cultures of West Africa, India, the Middle East and Indonesia. When he gradually resumed composing, his goal was to create a personal musical language – “a sound world that I could live with,” as he describes it. These examples show the results.
Once back at Microsoft, Bennett dug in hard. Now his work can be heard not only throughout the Windows platform, but also in the Xbox operating system and products including Office, Surface, Cortana and Skype. Having a strong sound design philosophy and creative point of view at the center is intended to help unify the soundscape of Microsoft products, just as the company’s user-interface design principles attempt to create a company-wide visual and functional continuity among its products.
Beyond that, Microsoft’s Fluent Sound and Sensory Design development environment seeks to influence sound design in the technology industry more broadly.
“We use sound to shape the rhythm and emotional texture of the user experience,” Bennett said. “Sound is an element that’s integrated with other sensory experiences like touch, texture and movement. We’re shifting the way we think about sound design at Microsoft, and hopefully the industry at large. Our goal is to help orchestrate harmony across devices and senses.”
Rick Senechal, a Microsoft media solutions architect, has worked with Bennett for 20 years. Senechal directs a worldwide music service for company teams and agencies. Each year the service provides 4,000 songs for events, videos, podcasts and products.
Bennett takes his time and is extremely deliberate, Senechal said.
“Matthew is the most focused person I’ve ever worked with,” he said. “He takes a long, in-depth view of his craft and really thinks things through. He’s not just making sounds and saying, ‘Oh, that sounds good.’ There’s a logic and intelligence behind the sounds and textures he creates.”
Bennett is quick to declare what Microsoft sounds are not.
“We’re not sound effects, game sounds, generic sounds (beeps and bloops), novelty sounds (dogs and fog horns), futuristic sounds, wall-to-wall music or alarms,” he said. “Our product sounds are not live musicians or sampled bits of real instruments, like a piano or guitar or analog synthesizer, because those evoke specific musical styles and emotional memory, which is very subjective between individuals and across cultures. Those design approaches don’t make sense for the kinds of modern digital experiences our teams are creating. Our goal is to develop a sound design language that feels unique and authentic and deeply integrated with our products and devices.”
Sounds in older versions of Windows were quite different from those in Windows 10, Bennett noted. For one thing, there were a lot more of them. Triumphant sounds denoting a successful boot-up “aren’t necessary anymore,” he said. “We no longer need to celebrate the fact that our devices are turned on. That’s something we can take for granted at this point.”
Many modern product sounds tend to be shorter. Earlier sounds, such as the shutdown signal in Windows NT Workstation 4.0 (1996), lasted 8 seconds – interminable by today’s standards, which call for less intrusive sounds measured in milliseconds (1/1,000 of a second). And, like start-up sounds, shut-down sounds are a thing of the past, deemed just another needless contributor to tech-induced noise pollution.
The start-up sound in Windows NT Workstation 5 (2000), nearly 12 seconds long, sounded like a squadron of fighter jets taking off, followed by twinkling marimbas. Today’s sounds are “more deeply integrated with the product and as calm, quiet and non-intrusive as possible,” Bennett said.
Gone are sounds that specialists call skeuomorphic – those that replicate their real-world counterparts, like a piece of paper being crumpled up when a document is deleted or the clacking of 19th-century mechanical typewriter keys denoting on-screen keystrokes.
“In earlier stages, those sounds helped people get familiar with technology, but we don’t need them anymore. They no longer add to the experience, and they tend to feel more like clutter now,” Bennett said. “For many years now, the visual design world has been reducing clutter and using more space,” he observed. “Now sound is starting to do the same.”
Windows 7 had about 40 sounds. Windows 10 has about eight, though legacy sounds are included with the OS to ensure backwards compatibility, he said. “When I started, there were seven different system error sounds. They had accrued over years and no one knew what they meant. There were no clear guidelines for partners or for ourselves. We got rid of the whole set and replaced them with two much more focused sounds – one gentle background notification and another more urgent sound.”
One design technique Bennett has developed involves the extensive recording and comparison of vocal contours – the melodic and rhythmic aspects of speech – from many different languages, to identify universal patterns that can help create a sound design language. For example, a statement that means “Ready to go?” can have a very similar pitch pattern when spoken in English, French, Japanese, Mandarin, Spanish or Russian. It’s basically “up, down, and a small leap,” he says.
Bennett took that particular vocal contour and replicated it musically, so that it can be heard underlying the 2.5-second Windows 10 calendar alert prompt. This technique has shaped the entire set of Windows 10 sounds. “The language contours are deeply integrated, not intended to be heard literally, or consciously,” he said. “They should just be felt intuitively to create an emotional connection that feels natural, instinctive.”
Bennett believes the best operating system sounds should be deeply integrated with the events they support. For example, texting is more time-sensitive than emailing, so the Windows 10 text messaging sound “pulls you forward a bit and is a little more alertful,” he said. For a new email, “you still want to know something’s come in, but the sound pulls back a bit. It’s a little more relaxed.”
Does he call his creations “music”?
We design sound with silence in mind.
“In the broadest sense, yes. I would describe them as paramusical,” he said. “They utilize musical elements – rhythm, melody and harmony – to make sounds that feel beautiful, but they should never call attention to themselves as a piece of music.”
Musical concepts certainly play a major role in Bennett’s design thinking.
“The error-message tone uses a minor 9th interval, which is definitely a little dissonant and says, ‘You really need to pay attention to this,’” he said.
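For readers curious about the arithmetic behind that interval, here is a minimal sketch. It assumes 12-tone equal temperament and a root note of A4 at 440 Hz; the article does not say which pitches or tuning the actual error tone uses, and the function names are purely illustrative. A minor 9th spans 13 semitones, an octave plus one semitone, so the upper note lands just above the octave of the root, which is one common explanation for why the interval sounds tense.

```python
# Illustrative only: assumes 12-tone equal temperament and an A4 = 440 Hz root.
# The actual pitches of the Windows error tone are not given in the article.
SEMITONES_PER_OCTAVE = 12
MINOR_NINTH_SEMITONES = 13  # an octave (12 semitones) plus a minor second (1)


def interval_ratio(semitones: int) -> float:
    """Frequency ratio of an interval in 12-tone equal temperament."""
    return 2 ** (semitones / SEMITONES_PER_OCTAVE)


def upper_frequency(root_hz: float, semitones: int) -> float:
    """Frequency of the note the given number of semitones above the root."""
    return root_hz * interval_ratio(semitones)


if __name__ == "__main__":
    root_hz = 440.0  # hypothetical root note
    print(interval_ratio(MINOR_NINTH_SEMITONES))
    print(upper_frequency(root_hz, MINOR_NINTH_SEMITONES))
```

Under these assumptions the ratio is a little over 2.1, so the upper note sits one semitone above the doubled frequency of a pure octave.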
While more tech companies are now employing audio directors like Bennett, “as a discipline, sound design still lags a little behind hardware and visual design,” he said. “We traditionally haven’t been deeply integrated into product design teams, aside from games. Microsoft was one of the first companies to realize the value of embedding sound designers with product teams.”
In addition to influencing Microsoft and technology design more broadly, Bennett thinks the discipline of sound design has an obligation to the world at large. The New York Times, in a Feb. 9, 2018 story, noted the cacophony produced by today’s ubiquitous electronic devices, asserting that “bombastic, attention-grabbing inorganic noises are become the norm [and] disruptive sonic alerts trigger Pavlovian feedback.”
Bennett hears that.
“There are so many device sounds in our environment now. Windows sounds alone are heard hundreds of millions of times a day around the world,” he said. “That’s a lot of sound affecting a lot of lives. Even if they are relatively short, every sound has an emotional impact, whether we’re aware of it or not. We have a responsibility to approach this as a system and to help create an audio ecology that supports healthy relationships between people and technology.”
The World Health Organization has recognized that unexpected loud sounds can cause stress and anxiety, which are detrimental to public health, and that unnecessary sounds and excessive volume are just another form of pollution.
“In a rainforest, there’s an incredible amount of information being communicated through sound, with many layers in motion simultaneously – birds, insects, trees, plants, water and wind. And it’s all very intelligible because the acoustic design of a rainforest has evolved to be naturally orchestrated, with a deep harmony that lets all the layers breathe and function together. That’s a powerful metaphor for how we should be designing sound.”
Toward the end of our conversation, I made a confession to Matthew: I haven’t operated my Windows computer with the sounds turned on since, oh, about 1990. I found them unnecessary and even irritating.
I asked him what I’d been missing – whether there is some subtle aspect of the OS that is being lost on me.
He answered, “The right sounds at the right time can support a more efficient and more pleasant user experience. They can convey important information and improve the rhythm and flow of attention, which is really our most important resource. They can convey crucial information when we’re away from a screen. They can improve the way our technology feels. We want people to know it’s OK to turn your sounds back on. Our modern approach to sound design is deeply respectful. We’re not going to boot up loudly in a meeting or in the library, we’re not going to disturb the people around you. It’s not going to be random noise. It’s going to be a small set of beautiful sounds that are carefully curated to communicate important information very efficiently and to sit well in your environment.”
A Gentle Reminder
Matthew Bennett on creating the Windows 10 calendar alert sound
A lot of people feel anxiety over their calendar sounds, because it means there’s something they have to do. Some of them say it’s like responding to fire alarms all day. We needed something that was alertful but not anxiety-producing. And we wanted to get the right amount of optimism and energy, pulling the user forward to their next activity, but with the feeling of a calm, supportive friend.
This sound is meant to be heard at lower volumes and to be more felt than heard. It has a beginning, middle and end. If you listen closely, you’ll hear that it’s a rhythm of seven equal pulses. It starts low and slow, with three pulses that are designed to be felt more than heard. And it lasts a long time for a user-interface sound – 2.5 seconds – but at normal volume you only really hear part of the sound because those first three tones are so soft. They’re like a breath, a musical pick-up, to let you know something is about to happen. Then the volume swells a bit, it blooms, to make the middle section more audible. And at the end there’s a long reverb tail, falling off, that feels very transparent and light but can also improve audibility in certain loud contexts or when users are away from their device.
So it’s a long sound, but very open. It’s definitely not alarming. It feels lightweight and pleasant and has a nice emotional texture.
There’s also a subtle left-to-right movement in the sound field that you can hear through headphones or decent speakers, like those on a good laptop or tablet.
There are foreground and background layers baked into the finished sound. The foreground is digitally sculpted plucks and tuned percussion. The textures sound familiar but they aren’t real-world instruments.
There’s a triplet feel to this sound and to a lot of the others in Windows 10. Over the years, the sounds that usually feel the most fluid, and that can balance the right qualities of energy and calmness, have tended to be resolved to an underlying triplet rhythm. So that pulse, that rhythmic substructure, has become part of our DNA.
We want to sound organic, and integral. That means we definitely don’t want the sounds to feel like they’ve been programmed on a computer. But we also don’t want to sound like a human being performing a little piece of music inside your device. So we resolve to a subtle temporal grid, to feel a little machine-like, while still keeping a little soulfulness.
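One way to picture "resolving to a subtle temporal grid" is partial quantization: nudging each note onset most of the way toward the nearest point on a triplet grid while leaving a trace of the performed timing. The sketch below is an illustration under assumptions, not a description of the tools actually used; the tempo, the `strength` value and both function names are hypothetical.

```python
# Illustrative sketch: the tempo, strength and function names are assumptions,
# not details taken from the article.

def triplet_grid_step(bpm: float) -> float:
    """Seconds per eighth-note triplet: three equal steps per beat."""
    return 60.0 / bpm / 3


def pull_toward_grid(onsets, step, strength=0.8):
    """Move each onset part of the way to its nearest grid point.

    strength=1.0 snaps fully to the grid (machine-like);
    strength=0.0 leaves the performed timing untouched.
    """
    adjusted = []
    for t in onsets:
        nearest = round(t / step) * step
        adjusted.append(t + strength * (nearest - t))
    return adjusted


if __name__ == "__main__":
    step = triplet_grid_step(120)   # hypothetical tempo of 120 beats per minute
    performed = [0.0, 0.18, 0.31]   # hypothetical hand-played onset times, in seconds
    print(pull_toward_grid(performed, step))
```

A strength below 1.0 keeps a little of the human timing, which is the trade-off the passage describes between sounding machine-like and sounding performed.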