Dealing with harassment (and spam) on the Web

Dealing with harassment (and spam) on the Web
nick@npdoty.name
10/26/2014 11:11:02 AM Ben, Ryan, Tantek, Wendy Andrew, DKM, Seb https://bcc.npdoty.name/

I see that work is ongoing for anti-spam proposals for the Web — if you post a response to my blog post on your own blog and send me a notification about it, how should my blog software know that you're not a spammer?

But I'm more concerned about harassment than spam. By now, it should be impossible to think about online communities without confronting directly the issue of abuse and harassment. That problem does not affect all demographic groups directly in the same way, but it effects a loss of the sense of safety that is currently the greatest threat to all of our online communities. #GamerGate should be a lesson for us. Eg. Tim Bray:

Part of me suspects there’s an upside to GamerGate: It dragged a part of the Internet that we always knew was there out into the open where it’s really hard to ignore. It’s damaged some people’s lives, but you know what? That was happening all the time, anyhow. The difference is, now we can’t not see it.

There has been useful debate about the policies that large online social networking sites are using for detecting, reporting and removing abusive content. It's not an easy algorithmic problem, it takes a psychological toll on human moderators, it puts online services into the uncomfortable position of arbiter of appropriateness of speech. Once you start down that path, it becomes increasingly difficult to distinguish between requests of various types, be it DMCA takedowns (thanks, Wendy, for chillingeffects.org); government censorship; right to be forgotten requests.

But the problem is different on the Web: not easier, not harder, just different. If I write something nasty about you on my blog, you have no control over my web server and can't take it down. As Jeff Atwood, talking about a difference between large, worldwide communities (like Facebook) and smaller, self-hosted communities (like Discourse) puts it, it's not your house:

How do we show people like this the door? You can block, you can hide, you can mute. But what you can't do is show them the door, because it's not your house. It's Facebook's house. It's their door, and the rules say the whole world has to be accommodated within the Facebook community. So mute and block and so forth are the only options available. But they are anemic, barely workable options.

I'm not sure I'm willing to accept that these options are anemic, but I want to consider the options and limitations and propose code we can write right now. It's possible that spam could be addressed in much the same way.

Self-hosted (or remote) comments are those comments and responses that are posts hosted by the commenter, on his own domain name, perhaps as part of his own blog. The IndieWeb folks have put forward a proposed standard for WebMentions so that if someone replies to my blog on their own site, I can receive a notification of that reply and, if I care to, show that response at the bottom of my post so that readers can follow the conversation. (This is like Pingback, but without the XML-RPC.) But what if those self-hosted comments are spam? What if they're full of vicious insults?

We need to update our blog software with a feature to block future mentions from these abusive domains (and handling of a block file format, more later).

The model of self-hosted comments, hosted on the commenter's domain, has some real advantages. If joeschmoe.org is writing insults about me on his blog and sending notifications via WebMention, I read the first such abusive message and then instruct my software to ignore all future notifications from joeschmoe.org. Joe might create a new domain tomorrow, start blogging from joeschmoe.net and send me another obnoxious message, but then I can block joeschmoe.net too. It costs him $10 in domain registration fees to send me a message, which is generally quite a bit more burdensome than creating an email address or a new Twitter account or spoofing a different IP address.

This isn't the same as takedown, though. Even if I "block" joeschmoe.org in my blog software so that my visitors and I don't see notifications of his insulting writing, it's still out there and people who subscribe to his blog will read it. Recent experiences with trolling and other methods of harassment have demonstrated that real harm can come not just from forcing the target to read insults or threats, but also from having them published for others to read. But this level of block functionality would be a start, and an improvement upon what we're seeing in large online social networking sites.

Here's another problem, and another couple proposals. Many people blog not from their own domain names, but as a part of a larger service, e.g. Wordpress.com or Tumblr.com. If someone posts an abusive message on harasser.wordpress.com, I can block (automatically ignore and not re-publish) all future messages from harasser.wordpress.com, but it's easy for the harasser to register a new account on a new subdomain and continue (harasser2.wordpress.com, sockpuppet1.wordpress.com, etc.). While it would be easy to block all messages from every subdomain of wordpress.com, that's probably not what I want either. It would be better if, 1) I could inform the host that this harassment is going on from some of their users and, 2) I could share lists with my friends of which domains, subdomains or accounts are abusive.

To that end, I propose the following:

That, if you maintain a Web server that hosts user-provided content from many different users, you don't mean to intentionally host abusive content and you don't want links to your server to be ignored because some of your users are posting abuse, you advertise an endpoint for reporting abuse. For example, on grouphosting.com, I would find in the <head> something like:

<link rel="abuse" href="https://grouphosting.com/abuse">

I imagine that would direct to a human-readable page describing their policies for handling abusive content and a form for reporting URLs. Large hosts would probably have a CAPTCHA on that submission form. Today, for email spam/abuse, the Network Abuse Clearinghouse maintains email contact information for administrators of domains that send email, so that you can forward abusive messages to the correct source. I'm not sure a centralized directory is necessary for the Web, where it's easy to mark up metadata in our pages.
That we explore ways to publish blocklists and subscribe to our friend's blocklists.

I'm excited to see blocktogether.org, which is a Twitter tool for blocking certain types of accounts and managing lists of blocked accounts, which can be shared. Currently under discussion is a design for subscribing to lists of blocked accounts. I spent some time working on Flaminga, a project from Cori Johnson to create a Twitter client with blocking features, at the One Web For All Hackathon. But I think blocktogether.org has a more promising design and has taken the work farther.

Publishing a list of domain names isn't technically difficult. Automated subscription would be useful, but just a standard file-format and a way to share them would go a long way. I'd like that tool in my browser too: if I click a link to a domain that my friends say hosts abusive content, then warn me before navigating to it. Shared blocklists also have the advantage of hiding abuse without requiring every individual to moderate it away. I won't even see mentions from joeschmoe.org if my friend has already dealt with his abusive behavior.

Spam blocklists are widely used today as one method of fighting email spam: maintained lists primarily of source IP addresses, that are typically distributed through an overloading of DNS. Domain names are not so disposable, so list maintainance may be more effective. We can come up with a file format for specifying inclusion/exclusion of domains, subdomains or even paths, rather than re-working the Domain Name System.

Handling, inhibiting and preventing online harassment is so important for open Web writing and reading. It's potentially a major distinguishing factor from alternative online social networking sites and could encourage adoption of personal websites and owning one's own domain. But it's also an ethical issue for the whole Web right now.

As for email spam, let's build tools for blocking domains for spam and abuse on the social Web, systems for notifying hosts about abusive content and standards for sharing blocklists. I think we can go implement and test these right now; I'd certainly appreciate hearing your thoughts, via email, your blog or at TPAC.

Nick

P.S. I'm not crazy about the proposed vouching system, because it seems fiddly to implement and because I value most highly the responses from people outside my social circles, but I'm glad we're iterating.

Also, has anyone studied the use of rhymes/alternate spellings of GamerGate on Twitter? I find an increasing usage of them among people in my Twitter feed, doing that apparently to talk about the topic without inviting the stream of antagonistic mentions they've received when they use the #GamerGate hashtag directly. Cf. the use of "grass mud horse" as an attempt to evade censorship in China, or rhyming slang in general.

Labels: Web, gamergate, spam, abuse, indieweb, harassment

Bcc