I just read Super Sad True Love Story, the latest novel by Gary Shteyngart. It took a bleak but hilarious look at the very near future, which involved the devaluation of the dollar, sexual liberation gone wrong, and constant, terrible streams of data. To me, the terrifying data was the most interesting.
Everyone in the novel carried their ‘apparat’ - a thinly veiled iPhone - at all times. With it they rated everyone in the room by attractiveness (you’re the 8th ugliest person in the bar right now!) and constantly streamed their opinions - kind of like sitting with someone who never stopped tweeting.
“But I don’t even know her personality,” I said . . .
“The personality score depends on how ‘extro’ she is,” Vishnu explained. “Check it out. This girl done got three-thousand-plus images, eight hundred streams, and a long multimedia thing on how her father abused her. Your apparat runs that against the stuff you’ve downloaded about yourself and then it comes up with a score. Like, you’ve dated a lot of abused girls, so it knows you’re into that shit.”
Shteyngart is no traditionalist - he admits his love for his iPhone in interviews - but this is a dystopian view of the future. The constant projections of personal data remove people from actual reality. And they shorten attention spans to almost nil: no one reads books anymore (which are called ‘printed, bound media artifacts’). Their brains can only follow bits of information, nothing longer than a phrase.
But what’s most interesting is the leap in logic that their apparats make. As if the fact that data exists (“you’ve dated a lot of abused girls”) removes the need for interpretation (therefore you’d like this person who blogs about being abused). That’s the best satire in the book.
How data works (or doesn’t)
How many of us have seen a completely useless Google ad next to an email? Maybe the question should really be, how many of us have seen a useful one?
For example, my friend wrote the following email to me:
Email says: I went to a horse and acrobatics show last night. Jealous? It was so French Canadian. Also it started with a video of a horse giving birth in a meadow.
The ads next to it:
- Dublin Daily Deals
- Thai Chi Ireland
- Optical Express - Ireland
- Thai Massage - Temple Bar
Just because my friend is mocking a horse show doesn’t mean I need laser eye surgery. The only thing the ads got right is that I’m in Dublin.
The only way to interpret language is by understanding its context.
Google and context
But Google AdWords doesn’t take context into account. Google’s explanation of how Gmail ads work is the same as its explanation of how AdWords ads are placed next to anything else. From Google’s help centre:
Google’s Gmail service, which is part of the Display Network, also displays AdWords ads. Gmail ads are placed by Google computers using the same automated process that matches relevant AdWords ads to webpages and newsletters.
Google’s explanation of how its ads work (which is really a ‘don’t worry, we don’t use your personal data’ disclaimer) uses the examples ‘cameras’ and ‘soccer.’ (“If you’ve recently received a lot of messages about photography or cameras, a deal from a local camera store might be interesting,” Google says).
They’ve mechanised human language and removed its nuance in the process.
Nuance: the unalgorithmable
Emails are made of lots of words. They aren’t as simple as “Hey Jesop - I just got a new camera!” for which Google’s targeted ads would work. Emails - and other written things - aren’t just filled with semantic meaning, but with subtext. Algorithms treat words like the basic components of language, while the actual basic components are often hidden - elements like association, nuance, emotion and humour.
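To make the point concrete, here is a minimal sketch of word-level ad matching. It is a toy, not Google's actual process (which the help centre does not describe in detail): the `ADS` inventory, the keyword sets and the function names are all made up for illustration.

```python
import re

# Hypothetical ad inventory: each ad is triggered by a set of keywords.
ADS = {
    "Local camera store deal": {"camera", "cameras", "photography"},
    "Riding lessons": {"horse", "horses", "riding"},
}

def tokenize(text):
    """Lowercase the text and split it into a set of bare words."""
    return set(re.findall(r"[a-z']+", text.lower()))

def match_ads(email):
    """Return every ad that shares at least one keyword with the email."""
    words = tokenize(email)
    return sorted(ad for ad, keywords in ADS.items() if words & keywords)

sincere = "Hey Jesop - I just got a new camera!"
mocking = "It started with a video of a horse giving birth in a meadow. Jealous?"

print(match_ads(sincere))  # ['Local camera store deal']
print(match_ads(mocking))  # ['Riding lessons']
```

The matcher handles the sincere email well enough. It also offers riding lessons to someone who is plainly making fun of a horse show, because the mockery lives in the tone and the word "horse" is all the code can see.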
All web designers should know about ELIZA
In 1966, a computer scientist named Joseph Weizenbaum built a program called ELIZA. It was a language programme with which people could have full conversations. People could type to ELIZA and it would respond based on certain keywords that it found in what they wrote.
Weizenbaum had programmed several phrases into ELIZA that it would use to make conversation. For example, if ELIZA heard the word “mother” it would respond with a question about “family.” That may work. But then again, it may not.
Consider the 80s
Remember the 80s and the prevalence of ‘yo mama’ jokes? Remember the use of “mother” as a slang adjective meaning “huge”? Those are just two instances of straightforward definitions slanted by humans.
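A minimal sketch of the keyword rule described above shows where it breaks. This is not Weizenbaum's actual script: the `RULES` table, the replies and the `respond` function are hypothetical, written only in the spirit of ELIZA.

```python
import re

# Hypothetical rules in the spirit of ELIZA: keyword -> canned reply.
RULES = [
    ("mother", "Tell me more about your family."),
    ("father", "Tell me more about your family."),
]
DEFAULT_REPLY = "Please go on."

def respond(sentence):
    """Reply with the canned line for the first keyword found, if any."""
    words = re.findall(r"[a-z']+", sentence.lower())
    for keyword, reply in RULES:
        if keyword in words:
            return reply
    return DEFAULT_REPLY

print(respond("My mother called me last night."))
# Tell me more about your family.
print(respond("That was one mother of a traffic jam."))
# Tell me more about your family.
```

Both sentences get the same reply. The rule fires on the word itself, so it cannot tell a parent from a piece of slang.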
Language is dependent on context. No set of rules can incorporate this. You can’t codify context and emotion. That’s why the semantic web is such a challenge - and why the ads next to my email are such a joke.
The rules we’ve built so far work in the currency of the word, and therein lies the problem. Modularity can’t be at word level. If we’re to build a useful and flexible semantic web, shouldn’t we do it with building blocks that we can actually work with?
But what is there to work with? Taking a closer look at language programmes is a good first step:
1) Cleverbot: Rollo Carpenter wrote this language programme. It has no responses or phrases built in; rather, it simply collects information as you talk to it. That is, the only things the bot can say are things it has heard. Right now it talks to about 3 million people per month and stores all of that information. Try talking to it; it’s very interesting, but more poetic than sensible.
2) BINA48: Martine Rothblatt built this bot to look exactly like her wife, Bina Rothblatt, and the program itself is actually built with Bina’s words, thoughts and memories. What ensues, however, is mostly confusion. This video shows that confusion (the best part? When Bina says friendship is all about “conspiring to take over the planet”).
It seems there’s no real programme we can build that will fully replicate human language. And that’s probably a good thing, or else our earth will start to look like Blade Runner. One thing we can do, though, is learn more about language.
The more we understand language and comprehension, the better the modular web we’ll build. A future free from irrelevant Google ads. Or at least apparats.