I'll try to make it to tonight's Homebrew meeting. Maybe I can get "fragmentions" (ugh, terminology) or hypothes.is annotations on academic papers working beforehand.


P.S. While last time I RSVP'ed I worried that these irrelevant posts in my feed were needless, I ended up getting multiple emails with really valuable responses (about hiding certain types of posts and about the academic writing on the web project in general). So I'm persuaded not to worry urgently about hiding them from my index page or feed.

Subject: imperialist/sexist maps
From: nick@npdoty.name
Date: 7/27/2014 03:05:22 PM
To: all my friends with whom I've been discussing maps and imperialism
Bcc: https://bcc.npdoty.name/

I’ve been thinking about the qualities of maps (imperialist, humanitarian, democratizing) and the demographics of cartographers, neogeographers and Web mapping folks.[1] Did you all see this article on the possibilities of sexism in street maps? I’m encouraged to see writing on the topic, but I see two important lessons to remember.

First, let’s use data, quantitative and qualitative, to investigate sexism and fairness in maps.

The FastCo article makes a specific claim that OpenStreetMap “may contain more strip clubs than day care centers”. Getting an exhaustive answer to that question would take some time, but it doesn’t take long to gather some initial data.[2] Because OpenStreetMap is actually more an accessible database than it is a map, we can issue queries and get statistics. In fact, we have an awesome opportunity to ask and answer some of these questions of fairness in the map, such that it’s worth spending some time investigating.

Strip clubs are typically tagged as “amenity=stripclub”, a well-defined tag, of which there are 455 in OSM. Day care centers are a little more difficult to count (more on this later) because people use different tags for them: one might be “amenity=kindergarten”, which covers pre-school centers (what we often call “kindergarten” in the US) and services that look after young children but which aren’t educational. I count 124,197 of these in OSM, but I can’t quickly tell how many are pre-schools and how many are child care centers. A few years ago there was a proposal to formally start using amenity=childcare to refer to child care facilities, a proposal that was rejected by some voters who thought it overlapped too much with amenity=kindergarten. Nevertheless, OSM users can use whatever tags they want,[3] and many of them are using amenity=childcare: there are 1,329 instances (triple the amenity=stripclub count, although that tag is more formally approved). There are 504 instances of the more obscure syntax of social_facility:for=child and :for=juvenile, although I suspect many of those are covering group homes, orphanages, community centers and various social work facilities for children.

(We could also search OSM by name to try to count child care centers and strip clubs. There seem to be roughly 1100 with “day care” in the name and 640 with “child care”, but those names are likely very English-specific and don’t provide for good comparisons with strip clubs, which I expect rarely include “strip club” in the venue name. I can find about 20 that use some variation of “gentlemen’s club”.)
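For anyone who wants to reproduce counts like these, here is a minimal sketch of one way to ask for them. It assumes the public Overpass API endpoint at overpass-api.de and the usual JSON layout of an `out count` response; the function names are mine, purely for illustration, and this is not necessarily how the numbers above were gathered. A worldwide query for a common tag can be slow or time out, so treat it as a starting point.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a public Overpass API endpoint; other instances exist.
OVERPASS_URL = "https://overpass-api.de/api/interpreter"


def count_query(key, value):
    """Build an Overpass QL query that counts nodes, ways and relations tagged key=value."""
    return f'[out:json][timeout:180];nwr["{key}"="{value}"];out count;'


def fetch_count(key, value, url=OVERPASS_URL):
    """Send the count query and return the total number of matching elements.

    Assumes the response holds a single "count" element whose tags
    include a "total" field.
    """
    body = urllib.parse.urlencode({"data": count_query(key, value)}).encode()
    with urllib.request.urlopen(url, data=body) as response:
        result = json.load(response)
    return int(result["elements"][0]["tags"]["total"])


if __name__ == "__main__":
    # Tags discussed above; counts will differ from the 2014 figures.
    for value in ("stripclub", "childcare", "kindergarten"):
        print(value, fetch_count("amenity", value))
```

Counting by tag this way has the same limits described above: it says nothing about venues that are mapped under a different tag, or not mapped at all.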

But these numbers don’t discount the concern that the distribution of mapped venues may be skewed in a way that might be sexist in intent or impact. Rather, I suspect that statistics would actually bring the problem into greater relief. For example, if business license records show 100 or 1000 times as many child care centers as strip clubs in many jurisdictions and the OSM database only shows 5 times as many child care centers, that would be an important result. Rather than comparing only two numbers, we would do better to compare the proportions of different venues to some independent measure to see which are disproportionately present or missing. Side note: if you gather that data, give it to OSM volunteers so they can identify where or why that skew is happening.[4] We would learn more by comparing other categories as well: while the stripclub/childcare example may be relevant (in particular because of the taxonomic question, see below), not all and not only women care about childcare services.
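To make that comparison concrete, here is a small sketch of the arithmetic. The OSM counts are the ones quoted in this post; the "license" counts are made-up placeholders standing in for an independent measure (I don't have real business license data), and the function names are just for illustration.

```python
# Counts quoted in this post (OSM, July 2014).
OSM_COUNTS = {"stripclub": 455, "childcare": 1329}

# HYPOTHETICAL reference counts standing in for business license records.
# These are invented for illustration only, not real data.
LICENSE_COUNTS = {"stripclub": 10, "childcare": 1000}


def ratio(counts, category, baseline):
    """How many `category` venues there are per `baseline` venue."""
    return counts[category] / counts[baseline]


def skew(osm, reference, category, baseline):
    """Ratio seen in OSM divided by the ratio seen in the reference data.

    A value near 1 means OSM mirrors the reference proportions; a value
    well below 1 means `category` is under-represented relative to `baseline`.
    """
    return ratio(osm, category, baseline) / ratio(reference, category, baseline)


if __name__ == "__main__":
    print(round(ratio(OSM_COUNTS, "childcare", "stripclub"), 2))
    print(round(skew(OSM_COUNTS, LICENSE_COUNTS, "childcare", "stripclub"), 3))
```

With these placeholder numbers the skew comes out far below 1, which is the kind of result that would be worth handing to OSM volunteers. Real conclusions would need real reference data, ideally gathered per jurisdiction.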

Beyond statistical counts of features, we can and should use qualitative methods to evaluate sexism in the map and in the community. For example, that rejected 2011 proposal for a recommended amenity=childcare tag revealed that the (mostly male) OSM editors may discount the need for a separate tag, to the detriment of users who would benefit from it. Because OSM conducts votes on these taxonomic questions, complete with explanations memorialized in wiki form, researchers and the public can review the debate, comment by comment. That proposal is also an interesting case because it reveals a linguistic difference. As I understand it, “kindergarten” is used differently in Germany (which many OSM editors call home), where a day care (Kindertagesstätte) might be more closely related. None of this is my discovery, so don’t take my word for it:

  • Dr. Monica Stephens gave a nice (short, I just watched it, it’s awesome, go watch it!) talk on exactly this topic in 2012, which you can watch online, comparing the sparse selection of childcare tags to the large diversity of bar/nightclub/swingerclub tags. See also her paper from 2013.[5]
  • The discussion page on the OSM wiki has a good “post-mortem” discussion of the childcare proposal that’s worth reading.
  • This blog post from April does some numerical analyses after the childcare tag controversy, and also tries to analyze the presence of commercial venues that might tend to bias towards one gender or another, with mixed results.

The second lesson to remember is that maps always reflect the perspectives of their creators; there is no present or future “objective” map.

“But unlike Google Maps, which rigorously chronicles every address, gas station, and shop on the ground, OpenStreetMap’s perspective on the world is skewed by its contributors.”

I don’t dispute the latter clause: OSM is absolutely skewed by its contributors. However, I don’t see that maps that don’t rely on crowdsourced data (whether it’s Google Maps or the USGS or any other) are, in contrast, objective in a way that OpenStreetMap could never be.

All maps are skewed by the selections of the humans who make them or, increasingly, the technology that humans build in order to make them. In fact, one might define a map as exactly the process of selecting some geographic data and leaving out all others. This American Life illustrates this point beautifully.[6]

Google Maps may have a relatively exhaustive accounting of commercial venues. Even in that incredibly narrow category, though, consider last week’s article on mistakes in the Google business directory from malicious or mistaken reports. I say “narrow” because what about the parts of our physical world that aren’t commercial venues or roads for automobiles? Here’s a quick list of some of the interesting feature types in OpenStreetMap that aren’t as easy to find in Google Maps.[7]

  • Car-sharing locations: as a user of the CityCarShare nonprofit, I’m pleased that many CityCarShare locations are available in the OSM database and are typically rendered on the basemap. Searching in my neighborhood, I find that Google Maps actually does let you find car-sharing locations, but maybe only for Zipcar?

  • Benches: OSM has the locations of 400,000 benches! (Mostly in Europe and some in the US and I bet this isn’t nearly exhaustive enough, but I love that it’s there.)

  • Mailboxes: While the USPS can let you search online for locations of those blue mailboxes, Google Maps only directs you to FedEx or Mailboxes Etc venues. In Oakland there aren’t many of these marked in OSM yet, but when I was traveling in Brussels, I thought it was pretty awesome to be able to pull up the exact location of the nearest red postbox. (161,982 in OSM.)

  • Fire Hydrants: Almost a quarter million of these in OSM. Maybe they’d be useful for Adopt-a-Hydrant websites, without a city having to import all the data themselves.

  • Wheelchair Accessibility: This one is a real challenge. It would be awesome if streets, sidewalks, businesses, toilets could all have metadata about their wheelchair accessibility, so that, for example, your navigation software could tell you how to get from point A to point B without ever directing you to take stairs, or cross a road where there isn’t a curb ramp. OSM has 600,000 tags with wheelchair accessibility metadata, but even that surely isn’t nearly enough. (OSM Wiki has a page on wheelchair routing and there are also some Google Maps projects for crowdsourcing that kind of data.)

  • Trees: OSM has 3.4 million individual trees mapped. Ha, awesome. I really want to map all the trees in our neighborhood in order to make a more beautiful and detailed print map of the area. (See also, the Urban Forest Map of all 88,000 trees on San Francisco's streets.)

Personally, I like to use OpenStreetMap for its detailed data on hiking trails, including gates and fences along the route. Others use OSM for bicycle routing and Google Maps also has a different mode for viewing bike lanes. But it should be clear that no single map contains everything, and certainly not everything in an objective way that doesn’t involve the perspectives of both the designer of the map and the contributor of the data. Even the distribution of categories themselves is a pragmatic, rather than essentialist, exercise.

It can be tempting, and perhaps more so now with maps that are more databases than static cartographic projections, to believe that a map can contain everything, such that claims of sexism could simply be refuted. Maybe if we just had more and more data, all the data, then the perspective of the cartographer would disappear as it became more and more precise, until the map itself contained everything in the territory. Indeed, digital maps have made it possible to represent different maps in different situations; to show different ownership of the same territory based on who’s looking at the map, for example. But mapping, like any data collection and analysis project, will always have perspectives. We can do better or worse at being aware of these perspectives and adjusting our practices to address disparities in the design of maps, but we shouldn’t imagine that one day there will be such an authoritative source that we can stop asking whether the map is sexist or how to make it less so.

Please forgive my verbose enthusiasm. Yes, of course, I’m super into maps, but I can’t help but think that these same lessons will arise in every data project we pursue.

Thanks for reading my ramblings,

P.S. And thanks to Brendan, Geoff, Julie and Zeina for helping to clarify some of this before I posted it publicly.

  1. In short, is my vague impression correct that mapping technology meetings are more disproportionately white even than other technology-focused communities? Is there greater representation of Brits and Americans? And if so, why, and what are the implications?

  2. A commenter on the FastCo site has already pointed this out in brief, but I’ll share my data with some links anyway.

  3. This point is extremely important — a case where the users/implementers can behave in ways that contradict the attempt to standardize. It’s a good check on what I think was a real mistake in not approving the tag. I’m not sure, however, if this voting affects the common renderings of the map, like at openstreetmap.org.

  4. As one friend put it, and maybe this is the issue with many disparities in tech, the community is good-hearted but just naive.

  5. Monica Stephens, “Gender and the Geoweb: Divisions in the Production of User-Generated Cartographic Information,” GeoJournal 78, no. 6 (2013): 981–96, http://link.springer.com/article/10.1007/s10708-013-9492-z.

  6. The whole episode is great, but the first few minutes of the prologue are enough, and lovely.

  7. This might seem like I’m poking fun or diminishing Google’s awesome map, but I really don’t mean it that way at all. Different maps work for different uses, and while I think there’s often a healthy competition among Web mappers, these are just examples.

Sure, I'm in for tonight's Homebrew meeting. I don't have a ton of progress to report, but I've been working on academic writing that can be simultaneously posted to the Web (where it can be easily shared and annotated) and also formatted to PDF via LaTeX. Oh, and I'm excited to chat with people about OpenPGP for indieweb purposes.

P.S. While I like the idea of posting RSVPs via my website, it seems a little silly to include them in RSS feeds or the blog index page like any other blog entry. What are people doing to filter/distinguish different kinds of posts?

Thanks for writing. I’m inspired to write a couple of comments in response.

First, are academic, professional ethicists as irrelevant as you suggest? (Okay, that’s a bit of a strawman framing, but I hope the response is still useful.)

Floridi is an interesting example. I’m also a fan of his work (although I know him more for his philosophy of information work — I like to cite him on semantics/ontologies, for example (Floridi 2013) — rather than his ethics work), but he’s also in the news this week because he’s on Google’s panel of experts (their “Advisory Council”) for determining the right balance in processing right-to-be-forgotten requests.

Also, I think we see the influence of these ethical and other academic theories play out in practical terms, even if they’re not cited in a direct company response to a particular scandal. For example, you can see Nissenbaum’s contextual integrity theory of privacy (Nissenbaum 2004) throughout the Federal Trade Commission’s 2012 report on privacy (FTC 2012), even though she’s never explicitly cited. And, forgive me for rooting for the home team here, but I think Ken and Deirdre’s research of “on the ground” privacy (Bamberger and Mulligan 2011) played a pretty prominent role in the White House framework for consumer privacy (“Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy” 2012).

But second, I’m even more excited about your conclusion. Yes, decentralize!, despite the skepticism about it (Narayanan et al. 2012). But more than just repeating that rallying cry (which I still think needs repeating; I’m trying to support #indieweb as my part of that), what excites me is the form of the problem.

I think a really cool project that everybody who cares about this should be working on is designing and executing on building that alternative to Facebook. That’s a huge project. But just think about how great it would be if we could figure out how to fund, design, build, and market that. These are the big questions for political praxis in the 21st century.

Politics in our century might be defined by engineering challenges, and if that’s true, then it emphasizes even more how coding is not just entangled with, but is itself a question of, policy and values. I think our institution could dedicate a group blog just to different takes on that.


Some references:

Bamberger, KA, and DK Mulligan. 2011. “Privacy on the Books and on the Ground.” Stanford Law Review. http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1568385.

“Consumer Data Privacy in a Networked World: A Framework for Protecting Privacy and Promoting Innovation in the Global Digital Economy.” 2012. White House, Washington, DC. http://www.whitehouse.gov/the-press-office/2012/02/23/fact-sheet-plan-protect-privacy-internet-age-adopting-consumer-privacy-b.

Floridi, Luciano. 2013. “Semantic Conceptions of Information.” In Stanford Encyclopedia of Philosophy, edited by Edward N. Zalta, Spring 2013. http://plato.stanford.edu/archives/spr2013/entries/information-semantic/.

FTC. 2012. “Protecting Consumer Privacy in an Era of Rapid Change: Recommendations for Businesses and Policymakers.” Technical report, March. Federal Trade Commission. http://ftc.gov/os/2012/03/120326privacyreport.pdf.

Narayanan, Arvind, Vincent Toubiana, Helen Nissenbaum, and Dan Boneh. 2012. “A Critical Look at Decentralized Personal Data Architectures.” http://arxiv.org/abs/1202.4503.

Nissenbaum, Helen. 2004. “Privacy as Contextual Integrity.” Washington Law Review 79 (1): 101–139. http://heinonlinebackup.com/hol-cgi-bin/get_pdf.cgi?handle=hein.journals/washlr79&section=16.

Subject: notary digital?
From: nick@npdoty.name
Date: 7/05/2014 09:42:00 PM
To: Amanda, Andrew, Brendan, DKM, JCP, Rachel, Sam, Seb, Z
Bcc: https://bcc.npdoty.name/

Recently I had the honor of swearing, and having notarized, an affidavit of bona fide marriage for a good friend as part of an immigration application. Another friend, who had done the same for a friend of hers, remarked that it was such a basic and important thing to do that even if she did nothing else this year it would have been an accomplishment. And the formal, official process of notarization was interesting enough itself that I spent some time looking into how to become a notary.

Notary Public

Becoming a notary is a strange process. By its nature, it's an extremely regulated field: state law specifies exactly what a notary must do, what training they must have, what level of verification is needed for different notarized documents, exactly how much a notary may charge for each service, how the notary may advertise itself, etc. That is, you become a notary public, not just a notary. Presumably this is in part because other legal and commercial processes depend on notarization of certain kinds.

Given all those regulations, if the notary errs or forgets when conducting her duties, the law provides penalties. Forgot to thumbprint someone when you notarized their affidavit? That's $2500. Forgot to inform the Secretary of State when you moved to a new apartment? $500. Screw up the process for identifying an individual in a way that screws up someone else's business? They can sue you for damages. In short, if you're a notary, you need to buy notary errors and omissions insurance, at least $50 for four years. Also, the State wants to be sure that you can pay if you become a rogue notary who violates all these rules. As a result, as soon as you become a notary you're required to execute a bond of $15,000 with your county. In practice, you pay a certified bond organization maybe $50 for the bond; if the State thinks you screwed up, they get the money directly from the bondsman and then the bondsman comes and gets the money from you.

Notary Digital?

But mostly I'm curious about this just because I've been thinking about the idea of a digital notary. (This is not to be confused with completing notary public activities with webcam verification instead of in-person, which appears to be illegal in most states, and not what I'm offering.)

That is, it seems like there are some operations we do in our digital, electronic lives these days that could benefit from some in-person verification. Those operations might otherwise just be cumbersome or awkward, but if we have an existing structure — of people who advertise themselves as carefully completing these verification operations in person — maybe that would actually work well, even with our online personas. These thoughts are, charmingly I hope, inchoate and I would appreciate your thoughts about them.

Backup / Escrow

There are some really important digital files you want to back up in a secure, offline way, where you're guaranteed to be able to get them back. (Say: Bitcoin wallets; financial records; passwords, certificate revocations, private keys.) You meet with the digital notary; she confirms who you are, who can have access to the files, whether you want them encrypted in a way that she can't access them, how and when to get them back to you (offline-only, online with certain verifications, etc.). You pay her a fee then, and another fee at the time of retrieval if you ever need to get them back.

Alternatives: online "cold storage" services; university IT escrow services (not sure if this is common, but Chicago provides it for faculty and staff); bank safety deposit boxes with USB keys in them; online backup you really hope is secure.

Verification and Certification

You can go to a digital notary to get some digital confirmation that you are who you say you are online. The digital notary can give you a certificate, bearing your legal name and her signature (complete with precise verification steps), that you can use to sign electronic documents or sign/encrypt email. Sure, anyone can sign your OpenPGP key and confirm your identity, but the notary can help you set it up and give you a trusted verification (based on her well-known reputation and connection to the Web of Trust and other notaries).

And, traditional to the notary, she can sign a jurat. That is, you can swear an affidavit of some statement and she can verify that it was really you saying exactly what you said, but do so in a way that can be automatically and remotely verified.

Alternatives: key-signing parties; certificate authorities (some do this for free, others require a fee, or require a fee if it's not just personal use); creating your own key and participating in the Web of Trust in order to establish some reputation.


While we hope to see an increase in the thanatosensitivity (oh man, I've been waiting for an excuse to use that term again; here are all my bookmarks related to the topic) of online services, like Google's Inactive Account Manager, it's likely that after we die our online accounts will become defunct and difficult for our next-of-kin to access. It would be useful to give someone instructions for what we want done with our accounts and data after death; that person will likely have to securely maintain passwords and keys and be able to verify, offline, our identities. Pay your digital notary a fee and she can execute certain actions (deleting some data, revealing some passwords to whichever family members you choose, disabling social media accounts) after your death, after verifying it using not just inactivity, but also confirmation with government or family.

Alternatives: a lawyer who understands technology well enough to execute these digital terms of your will just as they do your regular will and testament. (Does anyone know the current state of the art for lawyers who know how to handle these things?)


And actually what might be most valuable about digital notary services is that she can explain to you how these digital verifications work. That is, not only can a digital notary provide digital execution with in-person verification, she can provide the basic capability, explain how it works and then conduct it. Another advantage of in-person meetings: you can seek individualized counsel, not just formalistic execution of tasks.

It would be nice if information technology had a profession with a fiduciary responsibility to its clients; the implications of digital work are increasingly important to us but remain hard for non-experts to understand, much less control. Just as we expect with our doctors and our lawyers, we should be able to ask technological experts for advice and services that are in our own best, and varied, interests. Relatedly, it would be useful if the law reflected that relationship and provided liability, but also confidentiality, for such transactions. That latter part will take a little while (the law is slow to change, as we know), but a description of the profession and some common ethical guidelines of its own could help.

A Shingle?

As an experiment, I offer you all and our friends the services described above — escrow of files/keys; authentication, encryption and certification of messages; execution of a digital will and testament — at a nominal $2 fee per service.

Sincerely yours,


P.S. Did you know that payment of fees is one factor used to determine that a privileged client-attorney relationship has been established?