-
Rise of the socialbots
This week’s CBC tech column is election-related, but to my delight, it’s also about robots. There’s a copy up at cbc.ca/tech, and one below, for posterity.
This past weekend, Canadians tweeted almost 18,000 election-related messages in 48 hours, according to analysis by Ottawa-based digital public affairs strategist Mark Blevis. But — and this isn’t as big a “but” as you might think — what if some of those messages weren’t written by human beings?
A new breed of computer programs called socialbots are now online, and they could be used to influence online political discourse.
What’s a socialbot? Basically, it’s a piece of automated software that controls a social media account.
Now, automated social networking accounts are nothing new. For instance, CBC has several Twitter accounts that do nothing but automatically post headlines with links to news stories. Or I can sign up for weather updates on Twitter that are published by a robot. And once, I interviewed the creator of a robotic toilet that automatically tweeted with every flush.
But socialbots are different. Socialbots hide the fact that they’re robots. Many are specifically programmed to infiltrate online communities and pass themselves off as human beings. And they’re out there in the wild, right now.
If you’re like me (a human being with a Twitter account), you can post messages. You can reply to messages. You can re-post, or retweet others’ messages. You can follow other users.
Socialbots mimic all of these actions in an effort to blend in; in other words, to appear human. They reply to tweets. They retweet popular messages. Some of them even appropriate others’ tweets. There’s an old New Yorker cartoon: “On the Internet, nobody knows you’re a dog.” Socialbots are a bit like that, but in this case, on Twitter, nobody knows you’re a computer program.
For example, back in 2008, computer engineering students Zack Coburn and Greg Marra created a socialbot called @trackgirl, designed to infiltrate a group of running enthusiasts. When @trackgirl started following people on Twitter, they followed her back. As Marra wrote on his blog, @trackgirl started tweeting about her marathon training, and “she wove her way into the community. One day trackgirl tweeted that she had fallen and hurt her knee. Her followers immediately replied with concern, asking if she was ok.” People had developed some level of emotional connection with @trackgirl without knowing she was a robot.
That’s what I find so fascinating and disturbing about social robots online. Imagine hundreds or thousands of autonomous software personas, each programmed to infiltrate and influence popular political opinion.
For his perspective on this, I called Tim Hwang. He’s the co-director of the Web Ecology Project, a research community that recently held a Socialbots coding competition.
Tim told me that socialbots can be programmed for a variety of desired outcomes. For instance, he’s seen bots designed to create new connections between users. “One of these bots is very prone to introducing people to one another. And we actually find that these bots have a really powerful influence on getting those people to talk to one another.”
As Tim explained this, the word “coalition” kept coming to mind.
Socialbots could also be programmed for the opposite effect: to disconnect and disrupt existing groups. This software, and the popularity of online social networks, “opens up the possibility for these bots to have an aggregate impact on the way that real humans connect online,” he said.
This also has huge potential implications for election news coverage. When sites like Twitter and Facebook are so often used as a barometer for public opinion, what does it mean when some of the participants are silicon?
As far as I know, no Canadian politicians or parties are using socialbots. From a technical perspective, the software required to launch a socialbot campaign is available, open source, and free.
From a legal perspective, Canada’s Elections Act (not surprisingly) makes no explicit reference to the use of automated online personas. However, the act does have rules about reporting election spending, and rules about collusion that could prevent politicians from employing such a strategy.
If a politician were to try to influence public opinion through software and it came to light, I can’t imagine it’d be anything but scandalous. For example, during Toronto’s last mayoral election, a member of Rob Ford’s team allegedly used a fictional Twitter account to mislead a voter into handing over incriminating material. Many were critical of that strategy.
Public affairs strategist Mark Blevis calls the prospect of socialbot armies “extraordinarily creepy. I would say it’s unethical.”
Though I suppose that creepiness would only be apparent if a socialbot failed and was exposed as such.
Public socialbot projects have been small so far, but they’re scaling up quickly. Researcher Tim Hwang says he’s working on a “large-scale social architecture project” that will involve 10,000 Twitter users over the next three to six months. “The idea is to create a bot-constructed social bridge. So basically, these two groups of 5,000 users will become more and more connected without being aware that this aggregated effect is happening.”
So then, how do you tell the difference between a human Twitter user and a socialbot?
This can be tricky, as socialbots are designed to mimic human behaviour. And if you ask a bot whether it is indeed a bot, you’re not likely to get a truthful response.
Hwang’s advice? Try to hold an extended conversation with the suspected bot. “One of the things that the bots are able to leverage on Twitter is that the interactions are often very short. So it’s easy for them to get by that way. But if you have an extended conversation with a bot, you may be able to tell that it’s not responding in a human-like way or an intelligent way.”
[REDACTED: Pithy closing joke about holding an extended, intelligent, human conversation with *anyone* on Twitter.]
-
Do Not Track me online, please
This week’s CBC tech column is all about the Do Not Track feature baked into today’s release of Firefox 4, and last week’s Internet Explorer 9. Column is online at cbc.ca/tech, and posted below for posterity.
===
Almost everywhere you go online, you’re being watched. From news sites (like this one), to Facebook, to YouTube, your online behaviour is being tracked, often without your knowledge or consent.
In response, two of the biggest web browsers are throwing their weight behind a new anti-tracking mechanism.
Both Microsoft’s Internet Explorer 9 (launched March 14), and Mozilla’s Firefox 4 (slated for release on March 22) support Do Not Track, a proposed internet standard that puts the privacy onus on website trackers and advertisers.
Why use Do Not Track? Let’s first put on our tinfoil hats and pretend that I have a heart condition.
Let’s say I spend the whole afternoon researching heart medication online. Then, later in the day, I try to get a quote for medical insurance. Could my potential insurer know how I spent my afternoon online?
According to Jonathan Mayer, a security researcher at the Stanford Law School Centre for Internet and Society, “There is absolutely no technical reason they wouldn’t have access to that. And that’s a real concern.”
Mayer is also one of the authors of the Do Not Track standards proposal.
The situation I just described isn’t some made-up example. Right now, in Britain, the website of the National Health Service contains trackers – small bits of code, often invisible – from Facebook, Google and others.
Here in Canada, the Health Canada website contains tracking code from Google that logs the length and frequency of my visits, my geographic location, and other details. The question is, do you really want a third party knowing that you visited an infopage on syphilis?
Profiling
The concern here isn’t so much about collecting one or two pieces of information, but rather, about persistent tracking of your online behaviour across time and across different sites. Essentially, profiling.
The other concern is transparency. Or, rather, lack of transparency.
Most people don’t know that this kind of tracking is even happening. Many top websites (CBC.ca included) contain tracking code that quietly watches you in the background while you’re visiting their site. Sites also commonly leave behind cookies, which can also be used to track you across multiple websites.
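To make the cross-site part concrete, here is a minimal simulation of how one tracker embedded on several sites could stitch visits together with a cookie. It is a sketch under stated assumptions, not any real company's code: the class, the `visitor_id` cookie name, and the `.example` site names are all hypothetical.

```python
from collections import defaultdict


class ThirdPartyTracker:
    """Toy model of a tracker whose code is embedded on many different sites."""

    def __init__(self):
        self._next_id = 1
        self._visits = defaultdict(list)  # visitor id -> [(site, page), ...]

    def on_page_load(self, cookie_jar, site, page):
        """Runs each time a page that embeds the tracker is loaded.

        `cookie_jar` stands in for the cookies the browser holds for the
        tracker's own domain; the same jar is presented on every site.
        """
        visitor_id = cookie_jar.get("visitor_id")  # hypothetical cookie name
        if visitor_id is None:
            # First sighting of this browser: hand out a fresh identifier.
            visitor_id = f"v{self._next_id}"
            self._next_id += 1
            cookie_jar["visitor_id"] = visitor_id
        self._visits[visitor_id].append((site, page))
        return visitor_id

    def profile(self, visitor_id):
        """Everything the tracker has seen this visitor do, across all sites."""
        return list(self._visits.get(visitor_id, []))


tracker = ThirdPartyTracker()
my_browser = {}  # one browser, one cookie jar
tracker.on_page_load(my_browser, "health.example", "/heart-medication")
visitor = tracker.on_page_load(my_browser, "insurance.example", "/get-a-quote")
print(tracker.profile(visitor))
```

Neither site needs to share anything with the other. The tracker sees both visits because the browser presents the same cookie each time.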
Here in Canada, we have data privacy laws that require both knowledge and consent in order to collect personal information. Privacy advocates are concerned that current tracking methods collect personal information without either.
For his part, Stanford’s Mayer characterizes your online behaviour as intimate. “If someone asks, ‘Could I take a look at your browsing history?’ I would imagine that your answer would be, ‘No, of course not.’
“And yet that’s essentially what we have as an everyday business practice on the web.”
Pretty please
The Do Not Track technology attempts to address the privacy issues surrounding third-party web tracking. This is how it works.
Right now, every time you go to a website, your browser says something like, “Hello, YouTube, please show me a video.”
If you turn on these new Do Not Track features, your web browser still makes that request. But it also adds, “and by the way, you know how you ordinarily track me? Don’t do that, please.”
In that sense, it’s a bit like the National Do Not Call registry for telemarketers, but with two important differences: there’s no centralized list and, for now, there’s nothing that compels websites to honour your opt-out request.
By turning on Do Not Track, you’re simply communicating your preference not to be tracked. The system puts the onus on trackers, and right now, it’s a bit of an honour system.
Unlike the National Do Not Call Registry, there are no fines or disincentives for companies or advertisers that track you against your wishes.
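Under the hood, the "don't do that, please" is a single extra HTTP request header, `DNT: 1`. The sketch below shows both halves of the honour system: a browser adding the header, and a cooperating site checking for it. The function names and the `ExampleBrowser` user-agent string are made up for illustration, and nothing forces a real site to run a check like this.

```python
def browser_headers(do_not_track_enabled):
    """Headers a browser might send with a page request."""
    headers = {"User-Agent": "ExampleBrowser/1.0"}  # placeholder value
    if do_not_track_enabled:
        headers["DNT"] = "1"  # the Do Not Track opt-out signal
    return headers


def site_may_track(request_headers):
    """How a site that chooses to honour the request could decide.

    HTTP header names are case-insensitive, so normalise them before looking.
    """
    normalised = {name.lower(): value.strip() for name, value in request_headers.items()}
    return normalised.get("dnt") != "1"


print(site_may_track(browser_headers(do_not_track_enabled=False)))
print(site_may_track(browser_headers(do_not_track_enabled=True)))
```

A site that ignores the header behaves exactly as it did before, which is why the enforcement question matters.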
Government action?
So the question is, should we have regulations that would force websites and advertisers to heed your request not to be tracked?
In the U.S., both the Federal Trade Commission and the Obama administration are pushing for Do Not Track legislation that would compel organizations and advertisers to respect the opt-out mechanisms.
Here in Canada, Privacy Commissioner Jennifer Stoddart has acknowledged the issues surrounding online tracking in a 2010 report.
When asked specifically about Do Not Track, her office responded: “We are following with interest the U.S. Federal Trade Commission’s proposal for a Do Not Track mechanism. Our Office has concerns about the lack of visibility with respect to online tracking, profiling and targeting. If people don’t know about such practices, they can’t take steps to limit tracking.”
This response highlights one of the main barriers to adoption for Do Not Track: lack of awareness.
What’s more, even though Firefox and Internet Explorer both support this Do Not Track mechanism, it’s not turned on by default. Users have to know it exists, and know how to turn it on.
The other challenge is that what we’re talking about today is part of a much larger online privacy movement that includes a wide array of technologies. In addition to the Do Not Track technology, Microsoft and Google have their own anti-tracking approaches and technologies, including block lists and cookie-based opt-out mechanisms.
Though technically simple concepts, these can be confusing for users.
If you’re still wearing your tinfoil hat, you may be wondering what you can do right now.
Do Not Track is currently supported by Internet Explorer 9 and Firefox 4. You can also add Do Not Track functionality to other browsers with plug-ins and add-ons.
And, as many websites do not currently support Do Not Track, it may also be worth investigating other anti-tracking plug-ins such as AdBlock Plus or Ghostery.
-
$2.99
Once upon a time, I was in a rock and roll band called The Canaries. This past week, Tristan found a copy of our first and last recording at the Value Village near Queen and Logan. I’m not quite sure how this correlates to fame and fortune.

-
Do I want an app that tells me what I like?
This week’s CBC tech column is all about the double-edged sword of online personalization. There’s a copy up at cbc.ca, and one below for posterity.
===
Last week, a Vancouver-based app-maker called Zite launched a new iPad application of the same name that it bills as “a personalized iPad magazine that gets smarter as you use it.”
As a member of the so-called Generation Y, I am, of course, a narcissistic egomaniac with an affinity for anything that promises to shape itself in my image. So, of course, I downloaded it.
Here’s the idea: Zite brings together stories from across the web — blog posts, magazine articles, stories from news websites — and filters them by your particular interests, creating an up-to-the-minute personalized reading list just for you.
The underlying technology was developed at the University of British Columbia’s Laboratory for Computational Intelligence.
Zite’s fundamental innovation is that it tracks how it’s being read.
After narrowing down the subject areas, it shows you a number of stories it thinks you might like, then it tracks how you interact with them.
“We have an underlying philosophy that ‘you are what you read,’” explains CEO Ali Davar.
“When you’re on Zite, and we see you bypassing articles that we’ve recommended to you, that tells us as much as when you select an article. So, when you do that continuously, through time, we learn something about you.”
Staying on track
Zite also pays attention to the form and content of what you’re reading. Is it a long article? A short article? Who wrote it? Does it come from a particular political viewpoint?
By tracking your reading habits, Zite tries to give you more of what you’re interested in, and less of what you’re not.
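Zite's actual algorithm is proprietary, so what follows is only a toy illustration of the general idea: nudge a reader's topic scores up when they open a recommended article, nudge them down when they pass one by, then rank new articles by those scores. The class name, the weights, and the topic labels are all assumptions of mine.

```python
from collections import defaultdict


class InterestProfile:
    """Toy reader profile built from which recommended articles get opened."""

    def __init__(self, select_weight=1.0, bypass_weight=0.25):
        self.select_weight = select_weight  # boost for opening an article
        self.bypass_weight = bypass_weight  # penalty for skipping one
        self.scores = defaultdict(float)    # topic -> running score

    def record(self, topics, selected):
        """Log one article that was shown to the reader."""
        delta = self.select_weight if selected else -self.bypass_weight
        for topic in topics:
            self.scores[topic] += delta

    def rank(self, articles):
        """Sort (title, topics) pairs from most to least likely to appeal."""
        def score(article):
            _title, topics = article
            return sum(self.scores.get(topic, 0.0) for topic in topics)
        return sorted(articles, key=score, reverse=True)


profile = InterestProfile()
profile.record(["gadgets", "ipad"], selected=True)   # opened
profile.record(["celebrity"], selected=False)        # shown, but skipped
for title, _topics in profile.rank([
    ("Gossip roundup", ["celebrity"]),
    ("Another iPad 2 review", ["ipad"]),
    ("Local election primer", ["politics"]),
]):
    print(title)
```

In this sketch a skipped article counts for less than an opened one. That is a design guess on my part, not something Zite has described.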
This tactic will sound familiar if you’ve ever bought something from Amazon, used a TiVo, or watched Netflix.
All of these services track people’s behaviour, then use that information to give them more of what they like.
When I tried the Zite app, I really did have the sense that it was learning about me.
And while I recognize that these recommendation services can be useful, part of me can’t help but worry. Specifically, I’m concerned that online personalization will perpetuate my bad or lazy habits.
For instance, I spend a lot of time reading gadget blogs. Arguably, too much time. Over the past two weeks, there’s no question that I’ve read more reviews of the iPad 2 than necessary.
I know others with similar vices: Hollywood gossip blogs or obsessive sports coverage. My question is: Do I really need a tool that will help me find yet another iPad 2 review? Or would I be better off reading something new and unfamiliar?
Comfort versus challenge
For another perspective on this, I called Ethan Zuckerman, a researcher at the Berkman Centre for Internet and Society at Harvard University.
As he put it, “personalization is absolutely a double-edged sword. You can imagine it being a force to challenge you, and push you towards things you might not otherwise have read.
“You can also imagine personalization cocooning you in a world of familiar, unthreatening, unchallenging, but copacetic news.”
Zuckerman frames this tension as “comfort versus challenge.”
And it is the first part that worries me. That by giving me more of what they think I want, these personalization tools might actually narrow my worldview. They might cocoon me in the comfortable to keep me coming back.
When I asked Zite’s Davar about this, he told me his company’s technology is focused on what he calls discovery.
“It’s not simply about getting more of the same. It’s actually quite the opposite. The challenge is to give you the things that you wouldn’t typically find if you went out and looked for yourself.”
From a technical point of view, this is apparently a difficult feat. There are two very different answers to the question, “Why didn’t you read that?”
One is, I didn’t read it because I know I won’t like it. Another is, I didn’t read it because I didn’t know it existed.
Training computers to tell the difference is hard, but Zite believes they’ve found a way, using a secret algorithm sauce.
Zuckerman, on the other hand, told me that “anytime someone is providing algorithmically-organized information, there are some politics behind it. And you really owe it to yourself to think about what those politics are.”
The personalized recommendations that come from Netflix and Amazon are generated by proprietary algorithms. We don’t know exactly how they work, but we do know their objectives: to sell more stuff, and keep subscribers watching.
Zite’s algorithm is also proprietary and Davar says it took “a lot of money and a lot of time to develop.”
Zite’s service is currently ad-free, but Davar told me the company plans to add advertising and a subscription model to generate revenue.
It seems we’re headed towards a world of increasing personalization. As such, it’s important to think critically about personalization technologies.
When faced with an algorithmically-generated recommendation, we need to ask questions like, “Why is this being recommended to me?” Is it making me comfortable, or is it challenging me? And who’s getting paid?
For me, it’s about recognizing your own personal blind spots, and not necessarily trusting a computer algorithm to help fill them.
Now, if you’ll excuse me, I have to go read another iPad 2 review.
-
Warning: this book will self-destruct
My CBC tech column this week is all about the self-destructing e-books that HarperCollins now licenses to public libraries. There’s a copy up at cbc.ca/tech and one below, for posterity.
—
As of March 7, HarperCollins e-book titles licensed to Canadian schools and public libraries come with a new restriction: after 26 checkouts, they self-destruct.
The e-books simply won’t work anymore. If a library wants to keep lending that book, it’ll have to buy a new license, potentially buying the same book over and over again.
Right now, I can borrow electronic books from my local public library and download them to my computer, e-reader or other portable device. But here’s the thing: when I download an e-book, I’m not actually downloading it from my library. In many cases, I’m downloading it from a service called OverDrive, an e-book distributor that many Canadian public libraries use.
When OverDrive distributes an electronic book, it wraps it in DRM (digital rights management) software — basically, a digital lock that determines what I can and can’t do with that e-book. Essentially, HarperCollins has told OverDrive to add a new restriction, rendering its digital books useless after 26 checkouts.
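As a rough mental model, the new restriction amounts to a counter attached to each license. This sketch is not how OverDrive's DRM is actually built, which I don't know. The class and method names are invented, and only the cap of 26 comes from the announcement.

```python
from dataclasses import dataclass
from typing import Optional

CHECKOUT_CAP = 26  # the per-license limit described above


class LicenseExpired(Exception):
    """Raised when a capped license has no checkouts left."""


@dataclass
class EbookLicense:
    title: str
    max_checkouts: Optional[int] = CHECKOUT_CAP  # None models an unlimited license
    checkouts_used: int = 0

    def is_expired(self):
        return self.max_checkouts is not None and self.checkouts_used >= self.max_checkouts

    def check_out(self):
        """Lend the e-book once, or refuse if the cap has been reached."""
        if self.is_expired():
            raise LicenseExpired(f"{self.title!r} must be re-licensed before it can circulate again")
        self.checkouts_used += 1
        return self.checkouts_used


capped = EbookLicense("Sample Title")
for _ in range(CHECKOUT_CAP):
    capped.check_out()
print(capped.is_expired())  # a 27th check_out() would raise LicenseExpired
```

Passing `max_checkouts=None` models the older unlimited-use licenses, which never expire in this sketch.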
Why 26? When I called HarperCollins Canada for an explanation, they didn’t have one for me. But according to trade publication Library Journal, the number 26 has to do with “the average lifespan of a print book and wear and tear on circulating copies.”
Obviously, paper books are susceptible to many kinds of wear and tear. People spill coffee on paper books. They ruin books by dropping them into bathtubs. They dog-ear pages and scribble in margins. Eventually, libraries get rid of old, worn books. From the publisher’s point of view, that’s a good thing, because it means an opportunity to sell replacement copies. Not so with the unlimited-use e-books publishers have been licensing to libraries for the past several years.
It’s almost impossible for your dog to eat a digital copy of a book. HarperCollins’s 26-use limit seems to be an attempt to create artificial scarcity.
26-checkout limit too low
OverDrive says this is the first time one of its publishers has placed a per-use limit on e-books. The digital lending caps apply worldwide but only to titles that libraries license after March 7. Existing licenses will remain unlimited.
The cost of e-book licences varies by title and can be as low as $3 or many times that. OverDrive said there’s no universal standard on what books cost more and that it’s up to publishers to set their own prices.
When I first heard about the HarperCollins announcement, I called up Vickery Bowles. She’s the director of collections management at the Toronto Public Library. She understands why publishers want to place this kind of restriction on their e-books and agrees that there needs to be another business model that works for both libraries and publishers. But, Bowles says, the 26-use cap “isn’t the right one.”
I also talked to Keith Walker, president of the Canadian Library Association, who said libraries need to be able to control the circulation of digital books.
“Libraries need to be able to own that content, as we do with print, [and to] be able to set our own circulation use policies,” he said. “There’s conflict right now with publishers who may not understand that the libraries want to be able to continue the same way that we have with print.”
Walker and other librarians take issue with the assertion that paper books wear out after 26 checkouts. One group of librarians in Oklahoma posted a video to YouTube showcasing paper versions of HarperCollins books that have been checked out 26 times or more: most were still in very good shape.
2-tier system
An OverDrive representative said HarperCollins titles are now segregated from the rest of the distributor’s offerings to keep librarians from unintentionally licensing e-books with use-based limits. As librarians explained it to me, this has created a two-tiered system for e-books: unlimited-use titles and titles that expire after 26 checkouts.
Of course, HarperCollins is perfectly free to change the terms of new licenses whenever it wants, and libraries are free to take up the new offerings (or not). The Pioneer Library System in Oklahoma (which made the YouTube video) wrote in an open letter to HarperCollins that “until a change is made in the licensing, the Virtual Library cannot, in good conscience, spend our limited budget to repeatedly purchase e-book titles from HarperCollins or any other publisher who enforces checkout limits.”
It seems this is just another example of an old, scarcity-based business model butting heads with a new digital model. It’s a story that’s playing out across the digital media landscape and has repercussions far beyond libraries. For me, the danger with the HarperCollins decision has to do with the precedent it sets and the slippery slope it might send us down.
What happens if publishers decide they’d prefer a 13-use cap? Or an entirely pay-per-use model?
Time to figure out new rules
Beyond books and libraries, we’re seeing a larger trend toward DRM-restricted media: digital movie rentals that expire 24 hours after you press play or music files that are rendered worthless if you stop paying a monthly access fee.
When I buy a paper book, I can lend it to a friend, store it on my shelf indefinitely or donate it to a rummage sale. Not so with DRM-restricted material.
Here in Canada, library e-book circulation is still low. Bowles said e-books currently represent less than one per cent of the Toronto Public Library’s overall circulation. But, she says, they are “growing exponentially.”
As e-book use grows, so does the importance of collaboration between publishers and libraries in the digital space, says Walker.
If you ask me, the time to figure out these business models is now — while the stakes are (relatively) low.
Do we want our digital media with or without DRM? Do we want electronic books, movies and music that self-destruct or belong to us permanently? Should we be able to lend our digital media to one another? And how does anybody get paid?
These are the things we should be thinking about, because the decisions and precedents that we set now will shape the digital media landscape for years to come.