Decoding the Human Voice

What is said, how it's said, how it sounds, and why it matters

By Brendon O'Connor

Thu Feb 10 2022

Words are just a piece of the puzzle

Imagine decoding the human voice to gain a deeper understanding of what is said, how it’s said, how it sounds, and why it matters. As experts in sociolinguistics – the study of language as a window into human behavior – this is exactly what inVibe’s analysts do every day.

But why should we prioritize listening to the human voice in market research? Aren’t the words themselves all that we really need? At first, it might seem that way. The truth, however, is that trying to determine what someone means from words alone invites misunderstanding (and if there’s one thing companies want to avoid when talking with valued stakeholders, it’s misunderstandings).

To illustrate this using a classic example from the study of linguistics, consider how a guest who tells their host: “It’s cold in here” may be disappointed when the host simply agrees instead of offering to close the nearby window. In this situation, the host succeeds in grasping the superficial meaning of the guest’s words (i.e., a statement of fact), but fails in grasping their situational intention (i.e., a request to alleviate discomfort).

This scenario demonstrates that when we communicate, our words are just a piece of the puzzle. The reality is that when we speak, there are many factors accompanying our words that contribute to what we want to convey. To put it simply, it’s not just what we say that matters – it’s also how we say it and how it sounds. To take another example, the simple phrase “You look great today” can be a genuine compliment if spoken in earnest, but also a sarcastic or derisive remark if spoken with a hint of a snicker. Alternatively, placing emphasis on “You” (i.e., “You look great today”) implies that while the recipient may look great, someone else present looks to be in comparatively rough shape.

While these examples are extremely simplified, they provide evidence that listening for and interpreting cues beyond words is a natural process that we all engage in, as well as a critical process for truly understanding where people are coming from and what they really need.

Voice is the big picture

None of this is to suggest that words themselves – or “linguistic” cues – can’t still convey a great deal about how someone is feeling. It’s simply that if we ignore the voice, we lose other communicative cues that can further inform meaning. These types of cues are sometimes referred to as “paralinguistic” and include things like tone, pitch, and even pauses in speech. These features allow us to better understand the emotion and intent behind words. As a final example, consider how even the words in a declarative statement like “It’s Monday” can convey palpable uncertainty and confusion when the speaker adds a pause and an upward inflection (i.e., “It’s… Monday?”).

In our research, inVibe’s focus on the voice means we simultaneously analyze both the linguistic and paralinguistic aspects of a respondent’s speech – with the end result being a complete, high-res picture of what they’re trying to tell us. In a way, providing equal weight to what is said, how it’s said, and how it sounds is like seeing in color, while examining words alone is like seeing in black and white.

Healthcare market research often touches on areas that are complicated, high-stakes, and stressful for stakeholders, and the constant question being asked is: “Why should we focus so much on listening to voice?” Given the extra dimension that voice adds to our data, however, the real question should be: “Why aren’t we listening more?”

How We inVibe the Process

Collect

Of course, the first step in analyzing voice data is to collect voice responses. Even at this early stage in the research process, inVibe’s practices are informed by fundamental understandings of how people communicate. In the field of sociolinguistics, it’s well established that participants in a conversation are influenced by their “interlocutors” – meaning that they change the way they speak based on who they’re talking to.

Let’s consider the traditional qualitative research approach of using an in-person moderator and the effects it can have on data. From the perspective of the research participant, the moderator is usually a complete stranger, making it likely for many participants to become nervous, uncomfortable, or unable to speak as they would under normal circumstances. They might even feel pressure to change their answers based on what they think the moderator wants to hear. On the other end, the moderator can also be biased or inconsistent with their inquiries, in some cases even “leading” participants to certain preferred answers.

By inVibing the process, participants call in at their convenience and engage in our research from wherever they feel the most comfortable. Since our prompts are pre-recorded, each participant is asked the same question, the same way, every time, allowing us to control the experimental variable and eliminate the potential for interviewer bias. This unique methodology elicits responses that are open and uninhibited, free from the influences of pressure or judgment that may be more typically present in a face-to-face scenario. This is a huge benefit for any type of research, but it is particularly helpful in healthcare, where patients, caregivers, and HCPs alike are more likely to hold back when discussing delicate, personal aspects of their lives and professions.

Analyze

While it’s important to have good data, that’s truly only the starting point. The next step is to make sure the analysts working with the data have the expertise necessary to get the most out of it. Voice data is a rich resource, brimming with potential for valuable insight, but without specialized knowledge in the way language works, it’s likely that a good deal of that resource would go untapped. That’s why inVibe’s research team is built out with sociolinguists, who by trade are highly skilled in both research and the critical analysis of language data. Armed with inVibe’s powerful proprietary platform, our analysts are empowered to identify and track key themes and attitudes as they listen to all types of stakeholders shed light on the areas where they need the most help.

Deliver

At the final stage of research, we translate what we hear into actionable insights. But we don’t simply report findings and expect our clients to take our word for it. We bring the voice of our participants into the spotlight, providing concrete examples of where it drives our intuitions and the scientific framework for why those intuitions are valid. In other words, we highlight not only what we find in the data, but how we arrived there and why it matters.

"Talking is the most natural thing we do. It’s one of the most engrained skills we have as human beings. The most effective form of communication. We should be using that enabled skill to let people express themselves naturally,” Fabio Gratton, Co-Founder and CEO said.

At the end of the day, inVibe makes listening a standard by providing confidence and clarity to the decision-making process across the industry. See our use cases for examples of how we’ve helped companies gain deeper insight through the power of voice, and reach out to learn more about how our capabilities can best serve your needs. And of course, if you already have some ideas about how your company might benefit from harnessing the power of voice, get in touch – we’ll be more than happy to listen.
