Recently, we shared a privacy threat model centered on the people of Seattle, rather than on the technologies they use.
That led us to different scoping decisions than I’ve made previously, and I’m working through what those decisions mean.
First, we cataloged how data is being gathered. We didn’t get to “what can go wrong?”, and we haven’t yet asked about secondary uses or transfers. I think that was the right call for the first project, because the secondary data flows are a can of worms, and drawing them would, frankly, look like a can of worms. We know that most of the data gathered by most of these systems is weakly protected from government agencies. Understanding what secondary data flows can happen will be quite challenging: many organizations don’t disclose them beyond saying “we share your data to deliver and improve the service,” and those that go farther disclose little about what data is transferred to whom. So I’d like advice: how would you tackle secondary data flows?
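To make the question concrete, here’s a minimal sketch (in Python) of one way disclosed flows could be recorded and walked as a directed graph. Everything in it (the parties, the data types, and the sources of the claims) is a hypothetical placeholder, not a finding from this project:

```python
# A minimal sketch of recording secondary data flows as a directed graph.
# All parties, data types, and sources below are hypothetical placeholders.

from collections import defaultdict

# flows[collector] -> list of (recipient, data_types, source_of_claim)
flows = defaultdict(list)

def record_flow(collector, recipient, data_types, source):
    """Record a disclosed (or suspected) transfer of data between parties."""
    flows[collector].append((recipient, frozenset(data_types), source))

# Example entries, modeled on the vague language privacy policies tend to use.
record_flow("RideApp", "analytics vendor", {"trip start", "trip end"},
            "privacy policy: 'to deliver and improve the service'")
record_flow("RideApp", "law enforcement", {"trip history"},
            "transparency report")

def reachable_from(collector):
    """Walk the graph to find every party data could plausibly reach."""
    seen, stack = set(), [collector]
    while stack:
        party = stack.pop()
        for recipient, _, _ in flows.get(party, []):
            if recipient not in seen:
                seen.add(recipient)
                stack.append(recipient)
    return seen

print(reachable_from("RideApp"))
```

Even a toy structure like this makes the can-of-worms problem visible: each vague disclosure becomes an edge you can’t label precisely.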
Second, we didn’t systematically look at the question of what could go wrong. Each of those examinations could be roughly the size and effort of a product threat model, and each requires an understanding of a person’s risk profile: victims of intimate partner violence are at risk differently than immigrants. We suspect there are models there, and working on them is a collaborative task. I’d like advice here, too. Are there good models of different groups and their concerns on which we could draw?
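As a strawman for discussion, such a model might start as simply as a mapping from a group to the threat categories that concern it most, used to filter a shared threat catalog. The groups, categories, and catalog entries below are illustrative assumptions, not validated risk models:

```python
# Illustrative only: hypothetical risk profiles filtering a threat catalog.
THREATS = {
    "cell site location": {"government access", "location tracking"},
    "license plate readers": {"government access", "location tracking"},
    "social media posts": {"targeted harassment", "doxxing"},
}

PROFILES = {
    "IPV survivor": {"location tracking", "doxxing", "targeted harassment"},
    "immigrant": {"government access", "location tracking"},
}

def threats_for(group):
    """Return catalog entries whose categories overlap a group's concerns."""
    concerns = PROFILES[group]
    return [threat for threat, cats in THREATS.items() if cats & concerns]

print(threats_for("immigrant"))
```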
(Cross-posted to my personal blog.)
[Update Feb 23: Updated spreadsheet based on initial feedback]
I’m pleased to say that we have some first results from our project on threat modeling privacy for Seattle residents. In this post, I’m going to share those results and look ahead to what we might do next. (See A Privacy Threat Model for the People of Seattle and Introducing Threat Modeling For Seattlites for more background.)
This blog post provides an overview; there’s a longer discussion in the Seattle Resident Threat Model white paper (draft).
Overall, I’m happy to say that the effort has been a success, and opens up a set of possibilities.
- Every participant learned about threats they hadn’t previously considered. This is surprising in and of itself: there are few better-educated sets of people than those willing to commit hours of their weekends to threat modeling privacy.
- We have a new way to contextualize the decisions we might make, evidence that we can generate these models in a reasonable amount of time, and an example of the form they can take.
- We learned about how long it would take (a few hours to generate a good list of threats, a few hours per category to understand defenses and tradeoffs), and how to accelerate that. (We spent a while getting really deep into threat scenarios in a way that didn’t help with the all-up models.)
- We saw how deep and complex a role mobile phones and apps play in privacy.
- We got to some surprising results about privacy in your commute.
What we can learn from this:
- Walking and biking are the most privacy-preserving commutes. Everything else generates long-term records of your movement. However, some electric bikes have anti-theft GPS built in, as do the new dockless rental bikes.
- It’s easier to prevent camera tracking on a bike because a helmet is not as attention-grabbing as a mask. Bikes also limit “gait biometrics.”
- Motorcycles have far fewer electronics and radios than a car, but still carry license plates and may be tracked via road toll systems. There are obviously complex tradeoffs involved in motorcycle commuting, but it wasn’t obvious to us going in that privacy could play into those tradeoffs.
- Between Lyft/Uber and your own car, your own car is trackable in more ways, and more ways that tie to you. Unless you’re worried about those companies, you’re better off with a taxi or carshare. If you’re worried about Feds or local government, there are a lot of parties a government will subpoena, and so that’s neutral. (Taxis vs app-driven: if you call a taxi, your pickup location/phone combo may be recorded. If you hail it, then pay with a card, your dropoff location may be recorded. If you hail and pay cash, then you’re more private than with an app. Thanks to @internmike for teasing that out.)
We also looked at phones. There’s a set of radios, some of which (Bluetooth, wifi) can be turned off with less impact on usability; the cellular network radios can only be turned off with a substantial loss of function. We also discussed how the usability of turning off apps’ access to location differs between brands.
One of the things we did not do was a risk assessment for any particular vulnerable group, but we believe that the information we’ve gathered can support and accelerate such analyses. For example, we know that cell site location information can only be disabled by discarding a mobile phone or leaving it in airplane mode. We also know that DHS collected mobile phone information from DACA applicants. We have not attempted to analyze this or its implications, but we’d be happy to do so in partnership with organizations that have specific concerns.
Since we were exploring how we might do this, we have not yet produced a guide to doing it yourself.
The Raw Data
The raw data is available under a Creative Commons Attribution license. Here it is as an Excel spreadsheet. (We use xlsx rather than CSV because we needed Excel’s “sheets” feature.) There’s also a web view, exported to HTML here.
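If you’d rather explore the data programmatically than in Excel, here’s a minimal sketch using pandas; the filename is a placeholder for wherever you save the spreadsheet:

```python
# A minimal sketch for exploring the raw data; the filename is a placeholder.
import pandas as pd

# sheet_name=None loads every sheet into a dict keyed by sheet name,
# which matters here because the data is organized across sheets.
sheets = pd.read_excel("seattle-threat-model.xlsx", sheet_name=None)

for name, df in sheets.items():
    print(f"{name}: {len(df)} rows, columns: {list(df.columns)}")
```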