Archive for April, 2010

As the world Turings…

For two years I’ve flirtatiously circled the Pygmalion myth, toying with human-machine interactions in which it’s not necessarily clear to the human that s/he’s interacting with a machine, or human-human interactions in which both participants are convinced that the other is a machine. I can’t seem to get away from this idea of tricking people into adopting mistaken mental models of interactions. I thought it would be fun to create two bots that would follow each other on Twitter. Caleb Larsen, whom I’ve written about before and with whom I’m beginning to believe I share an eerie and otherworldly mental connection (I found this today, compare to my Obama piece), created a script that updated his Facebook and tweeted randomly generated status messages as part of Whose Life is it Anyway, though in the end he abandoned algorithmically generated messages for appropriation of other people’s statuses—which I find conceptually stronger but no longer relevant to the topic at hand.

In any case, in 1950, Alan Turing wrote a paper about thinking machines. In it he proposed a thought experiment in which a person is asked to converse via teletype with both a person and a computer pretending to be a person. If s/he is unable to definitively distinguish between the two, goes his argument, the computer is effectively intelligent. People have taken issue with Turing’s conception of intelligence, but nonetheless, over the years, this “Turing test” has spawned doctoral dissertations, colloquia, academic prizes, late-night geek-outs, and many software implementations of computerized interlocutors or “chatterbots.” The first of note was Joseph Weizenbaum’s ELIZA, a Rogerian psychotherapist (you can still talk to her here). She was followed by PARRY, a paranoid schizophrenic. A match made in heaven, I know, but their conversations weren’t nearly as interesting as this exchange between more sophisticated later chatterbots (in this case ALICE and Jabberwacky). Awesome.

There have been a couple of really good ITP projects that riff on what I might call the nebulous interlocutor. Generative Social Networking is my absolute favorite ITP project—in conception, in execution, even in documentation. After using a Bluetooth exploit to download all the contacts on your cellphone, a program calls each number in succession, playing a recording of the last person it called as the other half of the conversation. The most amazing thing when you listen to the demo is the realization that most people have no idea they’re talking to a recording! And some of the “conversations” that develop would easily fool a casual observer too. The ritualized form of phone conversation, combined with the latencies and poor connections to which frequent cell phone use has accustomed us, makes it really hard to tell the difference.

That was part of what made the Popularity Dialer so much fun (and ultimately led to its demise—though creators JennyLC and Cory referred to it last week as “dormant” rather than dead). The premise: popular people get lots of phone calls, so what better way to enhance your popularity than by increasing the number of calls you receive? Enter your phone number on a website and schedule a call from one of five characters who think you’re awesome (girl dying to date you being my favorite). At the appointed time, your phone rings and the voice you’ve selected speaks its half of a recorded phone conversation, pausing several beats for you to respond. It seems totally real to onlookers. The problem? It seems totally real to many of the people receiving the calls after their friends entered their number as a prank. Worked great until a humorless FCC lawyer got a call late one night from a dude wanting to get some beers.

I love both of these projects. They raise questions about the subjective nature of interaction that don’t get discussed all that much in the literature. So much of an interaction is in our heads. That’s the great lesson of Apple’s marketing—you can take a shitty phone that’s uncomfortable to hold and inconvenient to talk on, but if people are emotionally attached to it, they’ll find using it a pleasure anyway (I think Donald Norman might have said something similar a little more eloquently). In our case, if I think I’m talking to a real person, my experience of that conversation will be radically different from my experience of the exact same conversation if I know I’m talking to a recording—just think of that weird, disjointed feeling you get when a friend’s answering machine picks up and you think it’s him and start talking only to realize a second later that it’s a recording. Richard Powers’s Galatea 2.2 deals with this notion of artificial intelligence as deception, and I want to as well.

I propose a film of a woman with her back to the viewer. She is obviously concentrating hard, occasionally tapping a pencil or reaching for her coffee mug but otherwise moving very little. A phone number is displayed beneath the frame. The viewer calls the number and suddenly the phone on the desk next to the woman rings. She picks it up, and the viewer is amazed to hear her voice both on the screen and through his phone. He speaks to her. She responds that the connection is not clear, she can’t hear him well. He tries to gauge whether she is a real video or a clever program. She hangs up in anger and frustration. She looks at her phone and decides to call back. The viewer’s phone rings and when he picks up, she apologizes for the poor connection and asks him a question. When he answers, she asks another. Suddenly, she has to go. She apologizes, turns toward the screen, waves, and hangs up. The viewer scratches his head and calls back. Her phone rings, she looks at the number and sends the caller directly to voicemail with an over-the-shoulder wag of the finger. And scene!

I’ve seen a couple implementations of phone-enabled interactive movies, but they’re infantile choose-your-own-adventure narratives constructed like corporate phone trees (“if you’d like to see the hero die, please press pound now, otherwise, stay on the line for more options”). I want the interaction to be the purpose of the piece, not a means of advancing a canned story, though I do love the bizarro preview man voiceover in this German interactive “horrah” film:

My system works in a similar way, though without all the voice recognition. I’m interested in exploring how much of such an interaction is actually reactive. In Japan, for instance, it’s definitely over 50%, but I’m working on the assumption that it will be similar for the viewer speaking on the phone, that the character in my movie won’t need to respond directly to the viewer’s words because the social inertia that carries people through uncomfortable party conversations with socially maladroit companions will cause him to behave a certain way in this particular interaction—enough that I’ll be able to maintain some doubt as to whether he’s actually participating in a real conversation. Based on several recent interactions with customer service representatives over the phone, I can’t swear that health insurance companies haven’t already commercialized and adopted this system.


Plentiful bandwidth, virtually free storage, and internet-connected cameras have translated into a glut of online video. When anyone can upload to the online panopticon, it’s only a matter of time before people start exploiting the web’s massive audience to crowdsource moonwalks, personal interpretations of the Mos Eisley Cantina scene, ads, or homemade porn—for fun and for profit.

Well, guess what? I don’t want to see your videos. Not the ones you’ve uploaded at least.

The proliferation of cameras everywhere makes it less and less likely that you are ever not being recorded and uploaded the minute you do something remotely interesting. See, for instance, Hong Kong Bus Uncle, the infamous “don’t tase me, bro” (which I find so distasteful that I refuse to link to it), Chinese Airport Woman, and el niñato de Valencia. But again, these are actions performed in public—the operating assumption has to be that someone is recording. And with sites that make live broadcasting as easy as hitting a button on your phone (UStream for instance) popping up like nefarious little mushrooms, it’s entirely possible that your public meltdown will be captured and transmitted live and from several different angles. Totally unscripted reality TV: it’s like your real life, only more interesting.

But not to me. I’m more interested, at least for the purposes of this argument, in recording deviously, either in secret or with unacknowledged intentions. At some point in the future, it’s conceivable that there will be no place where one is legally protected from being filmed and/or photographed. Or that there will be just so many people and devices filming and uploading so many things that prosecuting them all will be impossible, which is functionally equivalent. It is from said future that the ideas that follow come.

What if I created an iPhone app that requires you to hold the device up to your ear as if you were talking on the phone (or when you’re actually talking on a phone with an open source platform) entirely as a pretense to upload video that the camera on the back of the phone is recording without your knowledge? There would probably be a lot of hands in the way, but that would make it easier to filter through the results in software. You’d never be in the video so it would be hard to definitively identify it as yours.

A slightly more elaborate variation on that theme would be to build cameras into other devices. One of the big payoffs for me of the Eternal Moonwalk mentioned above is that the majority of people tend to moonwalk across their living rooms, so you get to see the insides of people’s homes all over the world. What if everyone who bought a Roomba were unwittingly inviting an autonomous, wireless streaming surveillance camera into their home? The easiest way I can think of doing this is embedding cameras into particularly nice pieces of furniture left out on New York City sidewalks.

Page scraping and iframes offer another interesting alternative video source which might actually be much less illegal since technically you’re not moving the video from its original location. Instead, you’re finding video content, preferably unembeddable proprietary stuff, and using a web script to strip away any surrounding material and reproduce it in a different place—and it never moves from its original location.

My favorite approach, though, is simply to lie about your intentions. It might be as simple as creating a video high score board for an online game, where instead of their initials, people leave a ten-second taunt for the players they’ve just displaced. A database filled with video taunts has many potential uses. It might be more complicated, for instance creating an online application that uses face detection to perform some non-camera-related function—shaking your head to pan an image back and forth for instance—so that when the application requests access to the user’s web camera, he thinks nothing of pressing “OK,” never suspecting that his face is being displayed on a billboard somewhere across the globe with the supertitle “Did you know that 1 in 3 people has genital herpes?”

Or, as I discovered in the process of writing this post, offer some sort of online video conversion. Video formats are confusing as hell. Put up an all-in-one converter, make it look slick, and simply “keep a backup copy” of people’s video when people upload it!

The Sound of White Space or Pregnant Pause Parturition

As anyone who’s tried to write fiction knows, the real hurdle is not deciding what to write, it’s deciding what not to write. The empty page, like the empty score or the empty canvas, is white not because there is nothing on it but because everything’s on it—possibility, like light, is additive. The act of putting a word on a page, a note in the air, a drop of paint on the canvas removes a bit of that possibility, revealing a glimpse of what it may actually become. I know I’ve read somewhere that a block of uncarved stone contains every sculpture and that it’s only by chipping and chiseling that an artist collapses the artistic wave function into a singular reality. The result is the interface between the remaining possibility (positive space) and the absence of what has been removed (negative space).

Sometimes, though, that interface is deceptive. Foams, for instance, have voluminous contours, but pressure or heat or vigorous movement reveals their deception, reducing them to a mere puddle. Impermeability between negative and positive spaces in a work of art may well correlate to its quality and perdurance; I put that out there to the turtlenecked Barthes- and Foucault-reading crowd to discuss. In any case, designers who find commercial success of the Architectural Digest sort are great fans of drawing rigid lines of Germanic severity to divide what’s there from what’s not.

While I find this hard-edged contrast alienating in architectural spaces (I much prefer the worn edges and threadbare plush of vernacular utilitarianism), I all but require it linguistically. There is no place for diaphanous prose on my bookshelf, nor will I fight you for tickets to any recent mainstream movie. Blurring the boundary between meaning and nonsense purposefully is either comedy or chicanery; accidentally, it’s the sign of mental rot.

I find political discourse in general and American political discourse in particular a perfect example of this foamy, insubstantial nonsense, a populist pastiche of pre-chewed jingoistic pablum that fills heads with bubbles that quickly deliquesce to nothing. I was curious, in exploring notions of positive and negative space, to discover how this discourse is actually constructed.

To this end, I took this year’s State of the Union address, all 69 minutes of it, and reduced it to its negative space, cutting out all of President Obama’s utterances. What remains is a strangely compelling silent dance between the beats. How we don’t speak is as idiosyncratic as how we do. In Obama’s case, the space around his words is punctuated with generous pauses and a constantly turning head (though one suspects this may have more to do with the dual teleprompters than with his oratorical style). His hands are animated while he speaks, but drop with a thud against the lectern as he pauses, fingers interlocked.
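One way to picture the cut described above: mark where the speech happens, then keep everything else. The sketch below is only an illustration, not the process actually used for the video. It assumes the utterances have already been marked by hand as (start, end) times in seconds, and the names `silences`, `speech`, and `min_gap` are made up for the example.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def silences(speech: List[Interval], duration: float,
             min_gap: float = 0.0) -> List[Interval]:
    """Return the stretches of [0, duration] not covered by any speech interval.

    Overlapping or out-of-order speech intervals are tolerated.
    Gaps no longer than min_gap are dropped.
    """
    gaps: List[Interval] = []
    cursor = 0.0  # end of the speech seen so far
    for start, end in sorted(speech):
        # Clamp each interval to the timeline.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start - cursor > min_gap:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Whatever is left after the last utterance is also silence.
    if duration - cursor > min_gap:
        gaps.append((cursor, duration))
    return gaps


if __name__ == "__main__":
    # Hypothetical hand-marked utterances in a ten-second clip.
    marked = [(1.0, 4.0), (3.5, 6.0), (8.0, 9.5)]
    kept = silences(marked, duration=10.0)
    print(kept)
    print(sum(end - start for start, end in kept), "seconds of negative space")
```

The resulting list of gaps is what you would hand to a video editor as the segments to keep. Raising `min_gap` throws away the tiny breaths between words and leaves only the pauses long enough to watch.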

The rhythm of his silences follows the rhythm of his speech. Even without hearing a word, we can tell by watching the crowd how his rhetoric moves between introductory remarks, political self-congratulation, and exhortation, before ending on a note of overwrought patriotism. The accompanying silence in the room is mesmerizing, both for its depth and its duration. Here is a man who can hold an audience for over an hour, and, I’d argue, comes pretty close to holding an audience for half an hour without saying a damn thing.

How much of the negative space that we experience do we throw away as soon as we can see where the positive space begins, and what do we lose in the process? I think it depends on what it is we’re experiencing, but my guess is that, regardless, it’s more than we think.