The language of data: fear + words

by Randall Snare

I just read Super Sad True Love Story, the latest novel by Gary Shteyngart. It took a bleak but hilarious look at the very near future, which involved the devaluation of the dollar, sexual liberation gone wrong, and constant, terrible streams of data. To me, the terrifying data was the most interesting.



Everyone in the novel carried their ‘apparat’ at all times - a thinly veiled iPhone - with which they rated everyone in the room by attractiveness (you’re the 8th ugliest person in the bar right now!) and constantly streamed their opinions. It's kind of like sitting with someone who never stops tweeting.

“But I don’t even know her personality,” I said . . .
“The personality score depends on how ‘extro’ she is,” Vishnu explained. “Check it out. This girl done got three-thousand-plus images, eight hundred streams, and a long multimedia thing on how her father abused her. Your apparat runs that against the stuff you’ve downloaded about yourself and then it comes up with a score. Like, you’ve dated a lot of abused girls, so it knows you’re into that shit.”

Shteyngart is no traditionalist - he admits his love for his iPhone in interviews - but this is a dystopic view of the future. The constant projections of personal data remove people from actual reality. And they shorten attention spans to almost nil: no one reads books anymore (which are called ‘printed, bound media artifacts’). Their brains can only follow bits of information, nothing longer than a phrase.

But what’s most interesting is the leap in logic that their apparats make. They behave as if the mere existence of data (“you’ve dated a lot of abused girls”) removes the need for interpretation, and jump straight to a conclusion (therefore you’d like this person who blogs about being abused). That’s the best satire in the book.

How data works (or doesn’t)

How many of us have seen a completely useless Google ad next to an email? Maybe the real question is: how many of us have seen a useful one?

For example, my friend wrote the following email to me:

Email says: I went to a horse and acrobatics show last night. Jealous? It was so French Canadian. Also it started with a video of a horse giving birth in a meadow.

Links say:

  • Dublin Daily Deals
  • Thai Chi Ireland
  • Optical Express - Ireland
  • Thai Massage - Temple Bar

Just because my friend is mocking a horse show doesn’t mean I need laser eye surgery. The only thing the ads got right is that I’m in Dublin.

The only way to interpret language is by understanding its context.

Google and context

But Google AdWords doesn’t take context into account. Google’s explanation of how Gmail ads work is the same as its explanation of how AdWords ads are placed next to everything else. From Google’s help centre:

Google’s Gmail service, which is part of the Display Network, also displays AdWords ads. Gmail ads are placed by Google computers using the same automated process that matches relevant AdWords ads to webpages and newsletters.

The examples Google uses in its explanation of how its ads work (which is really a ‘don’t worry, we don’t use your personal data’ disclaimer) are ‘cameras’ and ‘soccer.’ (“If you’ve recently received a lot of messages about photography or cameras, a deal from a local camera store might be interesting,” Google says.)

They’ve mechanised human language and removed its nuance in the process.

Nuance: the unalgorithmable

Emails are made of lots of words. They aren’t as simple as “Hey Jesop - I just got a new camera!” for which Google’s targeted ads would work. Emails - and other written things - aren’t just filled with semantic meaning, but with subtext. Algorithms treat words like the basic components of language, while the actual basic components are often hidden - elements like association, nuance, emotion and humour.
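To see why word-level matching misfires, here is a minimal sketch in Python. It is not Google's actual process, which isn't described beyond the help centre quote above; the ad inventory, the keywords and the match_ads function are all made up for illustration. It simply checks whether any word in an email appears in an ad's keyword list.

```python
import re

# Hypothetical ad inventory (an assumption for illustration only):
# each ad is triggered by a handful of keywords.
ADS = {
    "Camera store deal": {"camera", "photography", "lens"},
    "Riding lessons": {"horse", "riding", "saddle"},
    "Baby supplies": {"birth", "baby", "newborn"},
}


def match_ads(email):
    """Return every ad whose keywords share at least one word with the email."""
    # Reduce the email to a bag of lowercase words: tone, humour and
    # context are thrown away at this step.
    words = set(re.findall(r"[a-z]+", email.lower()))
    return sorted(ad for ad, keywords in ADS.items() if words & keywords)


simple = "Hey Jesop - I just got a new camera!"
mocking = (
    "I went to a horse and acrobatics show last night. Jealous? "
    "It was so French Canadian. Also it started with a video of a "
    "horse giving birth in a meadow."
)

print(match_ads(simple))   # the straightforward email gets a sensible ad
print(match_ads(mocking))  # the mocking email gets riding lessons and baby supplies
```

The matcher has no way of knowing that my friend was making fun of the show. The words "horse" and "birth" are there, so the ads follow.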

All web designers should know about ELIZA

In 1966, a computer scientist named Joseph Weizenbaum built a computer programme called ELIZA. It was a language programme with which people could have full typed conversations. People could talk to ELIZA and it would respond based on certain keywords that it picked out of what they said.

Weizenbaum had programmed several phrases into ELIZA that it would use to make conversation. For example, if ELIZA spotted the word “mother” it would respond with a question about “family.” That may work. But then again, it may not.
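A minimal sketch of that keyword-and-response idea, in Python, might look like the following. The rules and phrases here are hypothetical stand-ins written for this example, not Weizenbaum's actual script, and the real ELIZA did more than a single lookup.

```python
import re

# Hypothetical keyword rules in the spirit of ELIZA (assumptions for
# illustration; not the original script).
RULES = {
    "mother": "Tell me more about your family.",
    "father": "Tell me more about your family.",
}
DEFAULT = "Please go on."


def respond(message):
    """Reply with the canned phrase for the first known keyword, if any."""
    for word in re.findall(r"[a-z]+", message.lower()):
        if word in RULES:
            return RULES[word]
    return DEFAULT


print(respond("My mother never calls me"))                 # a fitting reply
print(respond("That was the mother of all traffic jams"))  # same reply, wrong context
```

Both sentences contain "mother", so both get the family question. The rule can't tell a parent from a figure of speech.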

Consider the 80s

Remember the 80s and the prevalence of ‘yo mama’ jokes? Remember the use of “mother” as a slang adjective meaning “huge”? Those are just two instances of straightforward definitions slanted by humans.


Language is dependent on context. No set of rules can incorporate this. You can’t codify context and emotion. That’s why the semantic web is such a challenge - and why the ads next to my email are such a joke.

The rules we’ve built so far work in the currency of the word, and therein lies the problem. Modularity can’t be at word level. If we’re to build a useful and flexible semantic web, shouldn’t we build it with blocks that we can actually work with?

But what is there to work with? Taking a closer look at language programmes is a good first step:

1) Cleverbot: Rollo Carpenter wrote this language programme. It has no responses or phrases built in; rather, it simply collects information as you talk to it. That is, the only things the robot can say are things that it has heard. Right now it talks to about 3 million people per month, and stores all of that information. Try talking to it; it’s very interesting, but more poetic than sensible.

2) Bina 48: Martine Rothblatt built this bot to look exactly like her wife, Bina Rothblatt, and the programme itself is actually built with Bina’s words, thoughts and memories. What ensues, however, is mostly confusion. This video shows that confusion (the best part? When Bina says friendship is all about “conspiring to take over the planet”).
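The idea behind the first programme in the list, a bot with nothing built in that can only say what it has heard, can be sketched in a few lines of Python. This is a toy under stated assumptions, not the real bot's algorithm: the ParrotBot class and its word-overlap matching are invented for illustration.

```python
import re


class ParrotBot:
    """Toy bot with no built-in phrases: it can only repeat lines it has heard."""

    def __init__(self):
        self.heard = []  # every line typed at the bot so far, oldest first

    @staticmethod
    def _words(text):
        return set(re.findall(r"[a-z']+", text.lower()))

    def reply(self, line):
        earlier = list(self.heard)
        self.heard.append(line)  # remember the new line for future replies
        if not earlier:
            return None  # nothing heard yet, so nothing to say
        target = self._words(line)
        # Hypothetical matching rule: answer with the earlier line that shares
        # the most words with the input; ties go to the most recent line.
        return max(reversed(earlier), key=lambda old: len(self._words(old) & target))


bot = ParrotBot()
print(bot.reply("Do you like horses?"))      # None: the bot has heard nothing yet
print(bot.reply("I saw a horse show"))       # repeats the only line it knows
print(bot.reply("Do you like acrobatics?"))  # repeats the closest-sounding line
```

Everything the bot says was once said by a person, which is probably why the results feel more poetic than sensible: the words are human, but the choice of when to say them is not.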

It seems there’s no real programme we can build that will fully replicate human language. And that’s probably a good thing, or else our earth will start to look like Blade Runner. One thing we can do, though, is learn more about language.

The more we understand language and comprehension, the better the modular web universe we’ll build. A future free from irrelevant Google ads. Or at least apparats.



6 responses to The language of data: fear + words

  1. Cool post Randall :)

    Of course, I suspect that we’ll soon have computer programs that will be able to determine context - nothing is impossible. I suspect more that computers will never be able to grasp the human psyche - something that makes every relationship we have unique.

    Rock on!

  2. Great post!

    My sister and I were just talking about this on the phone last night. She recently bought a camera online, and keyword ads started showing up.

    You know what would be an interesting outcome? If advances in the semantic web could take into account the obvious chronology of a situation. When she purchased the camera, she received an order confirmation. When it shipped, she received a shipping confirmation. YOUR MOVE, INTERNET.

    What could be served up? Ads for accessories. Books about photography. Lens cleaner. Etc., etc. Amazon does this now, of course (with varied success). And Amazon even knows what you’ve purchased, and likely, the steps you took to get there.

    Outside of places like Amazon, it gets complicated. The sales funnel has un-funneled itself. People draw research for purchases from so many different sites and social platforms (not to mention real-world interactions) that offering up relevant ads becomes a nearly intractable situation.

    My solution: get some people to read every email, and hand-serve ads based on context. (Or not.)

    I will be very interested to see where this leads. And if companies can get there without totally freaking people out regarding privacy (e.g., “We’ve been scanning your tweets and texts and Facebook updates and Foursquare check-ins and search history and your cookies. BUY THIS PRODUCT.”). Then again, the comfort level with that sort of thing varies by demographic.

    Again, thanks for another 100% awesome post!

  3. Gr8 post.

    We digital strategists need to understand the context of web services first and find out how ads work for those services.

  4. Thanks for your comments!

    Clinton - that’s a great point: we should start with things we can measure, about which we can make reasonably accurate conclusions. Stages of purchase is a pretty clear one.

    And Leo - yes, I certainly hope computers can never understand human psyche. Although they seem to be making strides in collective human psyche. Did you try talking to Clever bot?

  5. Pingback: Interview With a Robot - « dogsmeat

  6. Nice article. Elegant and inviting blog.

    I am reminded of Isaac Asimov’s Robot Series; a tattered copy lies open on my bedside table. I pick it up, re-read a story, connect it to my life today, smile, wonder, stare up at the ceiling contemplating the impromptu map made from the peeling paint in the beveled curve, and feel, time and time again, that he deeply understood the most fundamental difference between human and machine.

    Context, nuance, the plane of tools used by humans with language that is the mark of our humanity, of our live thinking feeling sensing spirit.

    Yes, I did say spirit. The closest word in English that gets at the flash of synapse, energy, connection, fire, that is life, that is human thinking, that is the living flame of context.

    A computer program will not do for content strategy. It is a human activity, by design. This is the hinge. A human must sort things out for other humans.

    A machine may sort things out for other machines.

    Thanks for the stimulating thoughts. And connections.




