That unseasonably end-of-March-like gust of wind you all felt during the first few days of the week of February 21st was actually a climatic vacuum effect caused by the collective gasp of shocked and horrified lovers of language, innocent people who had the misfortune to find in their SpeechTek West freebie conference bag, and then to open and actually read, the Official Glossary of Service Automation Terms (OGSAT).
The pamphlet seemed innocuous enough, its cover fairly representing the reliable creative spirit of Speech Technology Marketing collateral – insets of three Caucasian professionals (as opposed to professional Caucasians) engaged in some form of mediated communication. It makes you wonder, though, why the football-player type is in his SUV, on his cellphone, and working on his laptop…such frenzied times in which we live, in the Age of Service Automation Terms. I sure hope his vehicle isn’t moving… The three thumbnails are set against the galactic backdrop of Space, The Final Frontier. (No depictions of speech or sound waves – a rarity.) The association of Outer Space with the Speech Biz struck me as interesting. The OGSAT cover looks serious and unaware of its Kitsch Quotient, but Space and Speech are, indeed, both tokens of Futurism. …And then there’s the über Voice User Interface (okay fictitious ü-VUI) of all time, Hal, from the wildly popular but obtuse sci-fi classic 2001: A Space Odyssey.
But you can’t judge an Official Glossary of Service Automation Terms by its cover. I milled about the exhibition hall, OGSAT in hand, casually leafing through an alphabetical arrangement of what I guess some conference organizers believed would assist me in my Speech Technology endeavors. The San Francisco Marriott conference area then took on a surreal, sinister cast as I inexorably began to sink, slow-motion, into the sickening sensation of what date rape must feel like. (The personal connection I feel with the topic I’m about to discuss, gentle reader, will be understood as a matter of course, if you will kindly indulge me…)
It was the glossary entry for “Discourse Markers.” At the top of page 18:
Discourse Markers (DM)
In the context of speech and dialog design, discourse markers help to bind adjacent phrases or provide emphasis. For example, words or phrases such as “like,” “you know,” “well,” and “or….” Discourse markers are not essential parts of speech, but provide for a more natural conversation. A multi-phrased utterance can be grammatically sound without a DM. The discourse marker can also be removed and adjacent phrases would still be strongly associated from a semantic point of view. Consider the following phrases: “It’s time to make, you know, the choices between 1 or 2. Please, like, make your choices.” the [sic] discourse markers “you know” and “like” could be expurgated from the text and the phrases would nonetheless have meaning and grammatical continuity. Discourse markers, then, are a way to inject casual or colloquial speech patterns into a dialog.
[end disturbing excerpt]
Whoa! What’s wrong with this picture? There are some mixed signals in this message, but if I were a betting linguist, I’d say that based on the ridiculousness of the example prompts (“It’s time to make, you know, the choices between 1 or 2. Please, like, make your choices”), the author has a thing against discourse markers. But where to begin?
(1) “Like” and “you know” are indeed discourse markers, but does that make them appropriate for any prompts you've written lately? Um, no. “Like,” in particular, is stigmatized in careful speech settings. The general effect of these example discourse markers is unfavorable, suggesting that the utterances are unplanned and careless -- now how is that for help with your branding effort! Presumably unaware that his or her example prompts are laughably ill-conceived, the author’s message is loud and clear: “If you dare to use discourse markers, your VUI will sound as stupid and unattractive as my examples.”
For these example messages to be appropriate, the application’s tasks, users, business goals and branding objectives would have to warrant (seriously?) a VUI persona that is, I’d say, an intermediate-level speaker of English as a Foreign Language. "After all," ( = discourse marker indicating a request for minimal concession), it’s hard to imagine a native-speaker coming up with a construction like “make choices ['choices' in the plural] between 1 or 2 [not 'between 1 and 2'].”
Even if the persona were intended to be a native-speaker, who in their right mind would try to pass off as legitimate a prompt that so boldly illustrates the notion of “sociolinguistic clash”? I’m talking about “…, like, …” and “…, you know, …” in the communicative setting of a commercial VUI production, and alongside the formal (and, in my opinion, stilted and stale) DTMF-speak construction “Please VERB.” ...Again, the read I’m getting on this persona is that he or she is not a native speaker of English, and that they have yet to master the appropriate use of disjunctive adverbs, also known as “discourse markers.” Meaning is meaning, but knowing how to use language in context is quite another. “Extinguish” and “put out [e.g. a fire, a cigarette]” mean the same thing, but you wouldn’t use them interchangeably. ...When I put myself to the challenge of retro-fitting the OGSAT example prompt to a likely persona from the world I know, several teenage Vietnamese students readily come to mind – I taught high school English for the Los Angeles Unified School District for about three years. Downtown LA. Deep downtown.
For the record, Deborah Schiffrin proposes an interesting analysis of “You know” and the related marker “I mean,” in her book called Discourse Markers (Cambridge University Press, 1996), appropriately enough. As I understand her story, “Y’know” is used to create a situation in which the speaker is acknowledging knowledge ( = meta-knowledge!) that is assumed to be shared with the hearer. "Y’know" sets up a framework for participation by gaining the attention of the hearer in order "to open an interactive focus on speaker-provided information” (p. 267). Schiffrin provides lots of real-life dialog-contextualized examples (p. 275ff.): “You know, they say an apple a day keeps the doctor away”, “A bastard’s a bastard regardless, y’know”, “We’re not all perfect, y’know”, “Y’know: what you lean towards and what you do, are two different things.”
The OGSAT also adduces “well” as an example discourse marker (but neatly overlooks that the “For example” at the beginning of that same sentence is also a discourse marker). Schiffrin offers that the prototypical use of “well” is to introduce a contribution as somehow inferior in quality to what the listener deserves or expects. (She discusses other uses, so there’s more to the story than I'm giving here.) Accepting her scholarly analysis of “well,” I can’t think of any of the VUIs that I’ve ever designed or consulted on where “well” would have been appropriate…anywhere.
Yes, people say “well,” but that doesn’t mean a VUI has to. People fart, too, and they use the F word. Y’know, not everything anthropomorphic is good for your VUI. Just as there are nouns and verbs that offend the ear, so too can discourse markers be inappropriate in certain contexts. Either the OGSAT author doesn’t realize this, or this fact is being withheld for the purpose of manipulating a naïve audience into espousing negative views about something the author is personally not comfortable with.
The sheer absurdity of these “example” prompts is an insult to our intelligence and should not be tolerated. Did someone actually find these prompts in an actual application? Or even in a design specification document? Who would create such implausible discourse-marker examples for the purpose of instruction? (I wonder what Schiffrin's opinion would be as to their plausibility?) And who the hell put this in my conference freebie bag, anyway? Why does someone in charge at SpeechTek think that I need to be misled by only shockingly bad prompting examples that make a mockery of conversational prompting and its value to user-centered design?
(2) So the subtext of the “Discourse Markers (DM)” entry, through its use of sociolinguistically inappropriate (read “insufferably idiotic”) examples, is presumably to persuade us to avoid discourse markers. Notice, however: the development of the OGSAT author’s insidious and spurious argument against the use of discourse markers paradoxically relies on – can you guess? – discourse markers! For example, the second sentence of the “discourse markers” entry starts with “For example,” which is, of course, a discourse marker.
(Can you imagine designing a VUI that avoids “For example” and all synonyms of this disjunctive adverbial phrase?)
Now look at the entry’s last sentence: “Discourse markers, then, [italics mine] are a way to inject casual or colloquial speech patterns into a dialog.” “Then” is a discourse marker whose function is to couch the upcoming language in a logically resultative relationship with what came before, and is synonymous with “therefore,” or “thus,” which are also discourse markers. Not only does the author appear unaware that “for example” and “then” are discourse markers (this is safe to assume, right?), but he or she also seems to be ignorant of the fact that they are not confined to casual and colloquial “injections,” since he or she is also using them in this intellectual slop posing as a technical reference. It's self-referentially twisted madness. ...If “for example” and “then” are good enough for this “official” glossary, why aren’t they good enough for VUI prompts? How are VUIs more successful without them? If all human languages have discourse markers, why is it better to avoid them in automation? Did Mother Nature screw up or something?
(3) Discourse markers constitute a grammatical class, and as such there is nothing inherently “casual or colloquial” about them. You want to see some discourse markers, open the Holy Bible. In fact, [by the way, “in fact” is a discourse marker, and so is “by the way”], this same resultative use of “then” (as we saw in the last sentence of the OGSAT entry, above) makes its first appearance in the King James Bible in Genesis 29:25, when Jacob figures out that his new father-in-law Laban had deceitfully sent his other daughter in (Leah, not Rachel, who Jacob was expecting and who was apparently the mo' fly honey of the two) for the big wedding-night shtoopfest:
"And it came to pass, that in the morning, behold, it was Leah: and he [Jacob] said to Laban, What is this thou hast done unto me? did not I serve with thee for Rachel? wherefore then hast thou beguiled me?”
The conversational analog of this sort of “resultative” discourse marker is “so” (e.g. “…So what card did you want to put that on? We take Visa, Mastercard, and the Discover Card.”) Besides the fact that it gives the listener a preview of the resultative meaning that is to follow, “so” simultaneously signals that the speaker is about to turn the floor over to the listener ( -- another Schiffrin analysis here), which can be a very useful device in engineering a conversation with everyday users.
There is no linguistic validity to the claim that discourse markers, as a distinct grammatical category, are “to inject casualness or colloquial speech patterns into a dialog,” since they are equally in evidence in formal genres. Just as there are formal nouns and verbs (like the noun “medial epicondylitis” for “tennis [or golfer’s] elbow,” or the verb “extinguish” for “put out”), so are there formal vs. informal discourse markers. ...When you go to apply for a loan at the bank or arrange a funeral for a loved one (both being formal settings), the bank officer and funeral director don’t suspend or discontinue use of spoken discourse markers, although their choice of discourse markers likely reflects a level of formality that befits the topic and setting (i.e. context).
Ironically, the OGSAT author tells us that sentences still make sense and have grammatical continuity without discourse markers (which I don’t dispute), but in doing so, he uses the adverb “nonetheless,” which is – you guessed it! – another discourse marker. “Nonetheless” is a disjunctive adverb frequently used in formal and academic styles; its discourse function is often to introduce some unexpected outcome. Other discourse markers that betoken more careful and formal styles: Regretfully, Theoretically, Nominally, Fundamentally, Curiously, Wisely, Rightly, Indubitably, Unquestionably, Reportedly, In conclusion, Hence, Moreover, Consequently, In contrast, Parenthetically.
(4) …Which brings me to my next point: the OGSAT’s definition of a discourse marker. With all that has been written about discourse markers, the description “to provide emphasis” must’ve been pulled fresh out of the author’s ass…with all due respect, of course. How many of the discourse markers you’ve seen in this article can be satisfactorily summed up as “providers of emphasis”? Where’s the beef? For Schiffrin, they reinforce a kaleidoscope of richly different meaning relationships that arise when "brackets of talk" come together.
In Quirk & Greenbaum’s discussion of disjunctive adverbs in their Concise Grammar of Contemporary English there are over a dozen meaning-categories, only one of which seems analogous to the “provide emphasis” OGSAT definition.
As for the first part of the OGSAT definition, the bit about “helping to bind” looks like it came off one of my presentation slides, or maybe from the Voice User Interface Design book (Cohen, Giangola, Balogh, 2004).
(5) Yes, it is true that discourse markers can be “expurgated…and the phrases would nonetheless have meaning and grammatical continuity.” But here again, the OGSAT author seems to be flaunting ignorance. According to Schiffrin, the nature of discourse markers is essentially redundant. (Can you say "raison d'être"?) It is precisely this redundancy that facilitates comprehension and enhances the functional unity (a/k/a “coherence”) of a stretch of language, and which is probably why discourse markers are a design feature shared by all human languages. Because discourse markers explicitly reinforce meaning relationships between sentences, listeners don’t need to work as hard to figure out how the speaker intends units of syntax to cohere. Stripping them out makes listeners work harder, which runs counter to a user-centered design approach, yet that is what the OGSAT advocates by implication. “Expurgating” discourse markers from a dialog whose structure is unfamiliar to users, as is often the case with infrequent callers, only serves to increase the cognitive burden inherent in the listening comprehension task.
By the way, the idea that discourse markers can help or hinder an interaction, depending on the communicative context, is a not-so-new idea that should have been considered by the author of the Angel newsletter article entitled “The VUI View: Spice up your VUI with dialog [sic] markers.” Definitely in the running for the A-Little-Knowledge-Is-A-Dangerous-Thing Award, this perky piece begins: "One way to ensure that your Voice User Interface (VUI) sounds cold and robotic is by stripping it of dialog markers such as ‘ok,’ ‘and,’ ‘next’ and ‘finally’ that give form and texture to a dialog flow." (I wonder if the tricky, here’s-one-mistake-you-don’t-want-to-make angle might actually backfire on this Messenger from Heaven, as if the cold-and-robotic-sounding VUI is an ideal that designers should seek to “ensure” – the thought crosses my mind, what with speech technology people being so literal, and all.) ...Anyway, if I use the app every day, maybe the discourse markers should be stripped out – after all, the comprehension of familiar, ritual-like dialogs doesn’t need facilitating, so the redundancy benefits of discourse markers are irrelevant here. Squandered. They may come off as inappropriate. They may weigh down the pace of the interaction and annoy the user. You can read the article yourselves, and draw your own conclusions, at http://www.angel.com/newsletter/3-05/vuiView.jsp .
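For the programmers in the audience, here is a rough sketch of that "it depends on the caller" idea. It is only an illustration under my own assumptions: the ten-call cutoff, the `Caller` record, and the function names are all hypothetical, and the sequencing markers ("First," "Next," "Finally,") are just one plausible set. A real design would pick markers and thresholds to suit its own users, tasks, and persona.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical cutoff: callers with at least this many prior calls are
# treated as "frequent" and hear the stripped-down prompts.
FREQUENT_CALLER_THRESHOLD = 10


@dataclass
class Caller:
    prior_calls: int  # assumed to come from some caller-history lookup


def capitalize_first(text: str) -> str:
    """Uppercase only the first character, leaving the rest untouched."""
    return text[:1].upper() + text[1:]


def sequence_markers(n: int) -> list[str]:
    """One sequencing marker per step; a single step gets no marker."""
    if n == 0:
        return []
    if n == 1:
        return [""]
    return ["First,"] + ["Next,"] * (n - 2) + ["Finally,"]


def render_prompts(steps: list[str], caller: Caller) -> list[str]:
    """Add discourse markers for infrequent callers; omit them for frequent ones."""
    if caller.prior_calls >= FREQUENT_CALLER_THRESHOLD:
        # Familiar, ritual-like dialog: the redundancy would only slow things down.
        return [capitalize_first(step) for step in steps]
    markers = sequence_markers(len(steps))
    return [
        f"{marker} {step}" if marker else capitalize_first(step)
        for marker, step in zip(markers, steps)
    ]


steps = ["say your account number", "enter your PIN", "tell me what you'd like to do"]
novice_prompts = render_prompts(steps, Caller(prior_calls=0))
expert_prompts = render_prompts(steps, Caller(prior_calls=50))
```

The first-time caller hears the sequence spelled out, while the daily user gets the same steps without the scaffolding. Either way, the decision is driven by context, not by a blanket rule for or against discourse markers.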
Inarguably, the success of a VUI interaction depends in great part on its success as a listening comprehension activity. By blindly “expurgating” discourse markers from a dialog, because the most god-awful prompting examples imaginable have scared you into thinking that discourse markers will make your VUI sound like a 16-year old ESL student from Lincoln Park, Los Angeles, the VUI designer consequently obliges users to expend undue mental energy in order to infer meaning relationships that the message no longer makes explicit. And what advantage would this serve? Whose advantage would this serve? How does expurgating the discourse marker “For example” from a prompt help the listener know that what follows is an example?
The logician Quine explains the value of redundancy in the construction of systems (in Pinker 1994):
"It is judicious excess over minimum requisite support. It is why a good bridge does not crumble when subjected to stress beyond what reasonably could have been foreseen. It is fallback and failsafe. It is why we address our mail to city and state in so many words, despite the zip code… A kingdom, legend tells us, was lost for want of a horseshoe nail. Redundancy is our safeguard against such instability."
The redundancy conferred by discourse markers is a phenomenally effective design feature of human language, one that VUI designers should exploit, when appropriate.
.....
Now back to the eponymous topic of why VUIs suck, Mommy. We learn on the back cover of the OGSAT that the organization also provides: consulting services, workshops and training, industry benchmarking, usability experts, call flow logic designs, speaking engagements (discourse markers expurgated?), research reports, and how-to books. How can an organization so blind to the workings of natural language and so cocksure in its dismissal of linguistics as a science and its relevance to the design of a spoken language experience make the world a safer place for user-centered design? This glossary reeks of professional negligence and potential malpractice. Y'know, with friends like these...
Wow! (Not a discourse marker.) I feel a lot better now that I got that off my chest (and gosh, I feel like eating raw meat!) I honestly did not have a restful sleep that first night of SpeechTek West – the OGSAT had provoked so many disturbing questions in my mind – not just about VUI design and the precarious future of this industry (which isn’t what I would call real “progress” at large), but about people’s motivations and intentions, and the professional “community” that we VUI people have made for ourselves.
The next morning, an old friend saw that I wasn’t handling this OGSAT thing very well. It wasn’t just the Rape of the Sabine Women’s Discourse Markers. That was just a sample. Anyway, this friend says, “Look, no one’s even going to read this piece of ****. And even if they read it, or if they don’t read it, it’s not going to make a bit of difference one way or the other. Now fugheddabout it!”
He’s right. What kept me tossing and turning through the night had very little to do with discourse markers, or the OGSAT’s truly incomprehensible treatment of the “passive” voice (now a “Service Automation Term”?!?), or the fact that “concatenation” isn’t even listed. It has more to do with pathetic, fearful, pitiful attempts at (in the words of friend and colleague Diego M., personal communication) "fencing in one’s own little cabbage patch."
It really, like, makes you wonder…you know, about people.