More about the privacy pledge

Plenty of you have seen—and indeed signed—the App Makers’ Privacy Pledge on GitHub. If you haven’t signed yet but are interested after reading it, see the instructions in the project README.

It’s great to see so many app makers taking an interest in this issue, and the main goal of the pledge is to raise awareness of app privacy concerns: awareness among developers that this is something to take seriously, and awareness among our customers that there are developers committed to respecting their identities and their data.

But awareness is useless if not followed through, so we need to do more. We need materials that developers can refer to: the GSM Association have good guidelines on app practices. We need actionable tasks that developers can implement right away, like Matt Gemmell’s hashing guide for social apps. We need sample code and libraries that developers can rely on. We need data lawyers to explain what the current regulations are, and what’s coming down the pipe. We need to convince the industry and the governments that we can regulate our own actions. We need the ability to audit our apps and determine whether they’re privacy-preserving. We need to be able to demonstrate to customers what we’ve done, and explain why that’s a good thing. We need to earn customer trust.

So there’s a lot to do, and the pledge is only the start. It’s a good start, but there’s still a long way to go.

On privacy, hashing, and your customers

I’ve talked before about not being a dick when it comes to dealing with private data and personally-identifying information. It seems events have conspired to make it worth diving into some more detail.

Only collect data you need to collect (and have asked for)

There’s plenty of information on the iPhone ripe for the taking, as fellow iOS security boffin Nicolas Seriot discussed in his Black Hat paper. You can access a lot of this data without prompting the user: should you?

Probably not: that would mean being a dick. Think about the following questions.

  • Have I made it clear to my customers that we need this data?
  • Have I already given my customers the choice to decline access to the data?
  • Is it obvious to my customer, from the way our product works, that the product will need this data to function?

If the answer is “no” to any of these, then you should consider gathering the data to be a risky business, and the act of a dick. By the way, you’ll notice that I call your subscribers/licensees “your customers” not “the users”; try doing the same in your own discussions of how your product behaves. Particularly when talking to your investors.

Should you require a long-form version of that discussion, there’s plenty more detail on appropriate handling of customer privacy in the GSMA’s privacy guidelines for mobile app developers.

Only keep data you need to keep

Paraphrasing Taligent: There is no data more secure than no data. If you need to perform an operation on some data but don’t need to store the inputs, just throw the data away. As an example: if you need to deliver a message, you don’t need to keep the content after it’s delivered.

Hash things where that’s an option

If you need to understand associations between facts, but don’t need to be able to read the facts themselves, you can store a one-way hash of the fact so that you can trace the associations anonymously.

As an example, imagine that you direct customers to an affiliate website to buy some product. The affiliates then send the customers back to you to handle the purchase. This means you probably want to track the customer’s visit to your affiliate and back into your purchase system, so that you know who to charge for what and to get feedback on how your campaigns are going. You could just send the affiliate your customer’s email address:

X-Customer-Identifier: iamleeg%40gmail.com

But now everybody who can see the traffic – including the affiliate and their partners – can see your customer’s email address. That’s oversharing, or “being a dick” in the local parlance.

So you might think to hash the email address using a function like SHA1; you can track the same hash in and out of the affiliate’s site, but the outsiders can’t see the real data.

X-Customer-Identifier: 028271ebf0e9915b1b0af08b297d3cdbcf290e3c
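
For illustration only (this sketch isn’t from the original example), here’s roughly how you’d compute that kind of digest in Swift using CryptoKit, whose Insecure.SHA1 type is helpfully named to remind you of the limitations discussed next:

    import CryptoKit
    import Foundation

    // Naive approach: a bare SHA-1 of the customer's email address.
    // Anyone who can guess the input can recompute this digest.
    func naiveCustomerIdentifier(for email: String) -> String {
        let digest = Insecure.SHA1.hash(data: Data(email.utf8))
        return digest.map { String(format: "%02x", $0) }.joined()
    }

    // e.g. request.setValue(naiveCustomerIdentifier(for: "iamleeg@gmail.com"),
    //                       forHTTPHeaderField: "X-Customer-Identifier")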

We still have a couple of problems though. Anyone who can see this hash can take some guesses at what the content might be: they don’t need to reverse the hash, just figure out what it might contain and have a go at that. For example if someone knows you have a user called ‘iamleeg’ they might try generating hashes of emails at various providers with that same username until they hit on the gmail address as a match.

Another issue is that if multiple affiliates all partner with the same third business, that business can match the same hash across those affiliate sites and build up an aggregated view of that customer’s behaviour. For example, imagine that a few of your affiliates all use an analytics company called “Slurry” to track use of their websites. Slurry can see the same customer being passed by you to all of those sites.

So an additional step is to append a different random value called a salt to the data before you hash it in each context. Then the same data seen in different contexts cannot be associated, and it becomes harder to precompute a table of guesses at the meaning of each hash. So, let’s say that for one site you send the hash of “sdfugyfwevojnicsjno” + email. Then the header looks like:

X-Customer-Identifier: 22269bdc5bbe4473454ea9ac9b14554ae841fcf3

[OK, I admit I’m cheating in this case just to demonstrate the progressive improvement: in fact in the example above you could hash the user’s current login session identifier and send that, so that you can see purchases coming from a particular session and no-one else can track the same customer on the same site over time.]
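
For completeness, a sketch of the salted variant (again not from the original post, and using SHA-256 rather than SHA-1 since that’s the safer default today); the salt string and function name are made up for illustration:

    import CryptoKit
    import Foundation

    // One random salt per affiliate (per context), generated once and stored on your side.
    // The same email hashed with different salts produces unrelated identifiers,
    // so a shared partner like "Slurry" can't correlate the customer across affiliates.
    func customerIdentifier(for email: String, contextSalt: String) -> String {
        let digest = SHA256.hash(data: Data((contextSalt + email).utf8))
        return digest.map { String(format: "%02x", $0) }.joined()
    }

    let idForAffiliateA = customerIdentifier(for: "iamleeg@gmail.com",
                                             contextSalt: "sdfugyfwevojnicsjno")
    // A different salt for another affiliate yields a completely different identifier.

As noted in the aside above, hashing a per-session token instead of the email address gets you the same tracking with even less linkability.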

N.B. I previously discussed why Apple are making a similar change with device identifiers.

But we’re a startup, we can’t afford this stuff

Startups are all about iterating quickly, finding problems and fixing them or changing strategy, right? The old pivot/persevere choice? Validated learning? OK, tell me this: why doesn’t that apply to security or privacy?

I would say that it’s fine for a startup to release a first version that covers the following minimum requirements (something I call “Just Barely Good Enough” security):

  • Legal obligations to your customers in whatever country those customers (and your data) reside
  • Standard security practices such as mitigating the OWASP top ten or OWASP mobile top ten
  • Not being a dick

In the O2 Labs I’ve been working with experts from various groups – legal, OFCOM compliance, IT security – to draw up checklists covering all of the above. Covering the baseline security won’t mean building the thing then throwing it at a pen tester to laugh at all the problems: it will mean going through the checklist. That can even be done while we’re planning the product.

Now, as with everything else in both product engineering and in running a startup, it’s time to measure, optimise and iterate. Do changes to your product change its conformance with the checklist issues? Are your customers telling you that something else you didn’t think of is important? Are you detecting intrusions that existing countermeasures don’t defend against? Did the law change? Measure those things, change your security posture, iterate: use the metrics to ensure that you’re pulling in the correct direction.

I suppose if I were willing to spend the time, I could package the above up as “Lean Security” and sell a 300-page book. But for now, this blog post will do. Try not to be a dick; check that you’re not being a dick; be less of a dick.

Don’t be a dick

In a recent post on device identifiers, I wrote a guideline that I’ve previously invoked when it comes to sharing user data. Here is, in both more succinct and complete form than in the above-linked post, the Don’t Be A Dick Guide to Data Privacy:

  • The only things you are entitled to know are those things that the user told you.
  • The only things you are entitled to share are those things that the user permitted you to share.
  • The only entities with which you may share are those entities with which the user permitted you to share.
  • The only reason for sharing a user’s things is that the user wants to do something that requires sharing those things.

It’s simple, which makes for a good user experience. It’s explicit, which means culturally-situated ideas of acceptable implicit sharing do not muddy the issue.

It’s also general. One problem I’ve seen with privacy discussions is that different people have specific ideas of what the absolutely biggest privacy issue that must be solved now is. For many people, it’s location: they don’t like the idea that an organisation (public or private) can see where they are at any time. For others, it’s unique identifiers that would allow an entity to form an aggregate view of their data across multiple functions. For others, it’s conversations they have with their boss, mistress, whistle-blower or others.

Because the DBADG mentions none of these, it covers all of these. And more. Who knows what sensors and capabilities will exist in future smartphone kit? They might use mesh networks that can accurately position users in a crowd with respect to other members. They could include automatic person recognition to alert when your friends are nearby. A handset might include a blood sugar monitor. The fact is that by not singling out any particular form of data, the above guideline covers all of these and any others that I didn’t think of.

There’s one thing it doesn’t address: just because a user wants to share something, should the app allow it? This is particularly a question that makers of apps for children should ask themselves. Children (and everybody else) deserve the default-private treatment of their data that the DBADG promotes. However, children also deserve impartial guidance on what it is a good or a bad idea to share with the interwebs at large, and that should be baked into the app experience. “Please check with a responsible adult before pressing this button” does not cut it: just don’t give them the button.

On the top 5 iOS appsec issues

Nearly 13 months ago, the Intrepidus Group published their top 5 iPhone application development security issues. Two of them are valid issues; the other three they should perhaps have thought about for longer.

The good

Sensitive data unprotected at rest

Secure communications to servers

Yes, indeed, if you’re storing data on a losable device then you need to protect the data from being lost, and if you’re retrieving that data from elsewhere then you need to ensure you don’t give it away while you’re transporting it.
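
As a hedged illustration of the first point: on iOS the cheapest way to protect a file at rest is to ask for data protection when you write it. A minimal Swift sketch using the modern Foundation API (the file-name parameter is made up for the example):

    import Foundation

    // Ask iOS to keep this file encrypted whenever the device is locked.
    // .completeFileProtection corresponds to NSFileProtectionComplete.
    func saveSensitiveData(_ data: Data, named name: String) throws {
        let documents = try FileManager.default.url(for: .documentDirectory,
                                                    in: .userDomainMask,
                                                    appropriateFor: nil,
                                                    create: true)
        try data.write(to: documents.appendingPathComponent(name),
                       options: [.atomic, .completeFileProtection])
    }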

Something I see a bit too often is people turning off SSL certificate validation while they’re dealing with their test servers, and forgetting to turn it back on in production.
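
One way to avoid shipping that mistake is to make the test-server exception impossible to build into a release configuration. A sketch using URLSession (a more modern API than the post assumes; the host name and class name are hypothetical):

    import Foundation

    final class CautiousSessionDelegate: NSObject, URLSessionDelegate {
        func urlSession(_ session: URLSession,
                        didReceive challenge: URLAuthenticationChallenge,
                        completionHandler: @escaping (URLSession.AuthChallengeDisposition, URLCredential?) -> Void) {
            #if DEBUG
            // Debug builds may trust the self-signed test server...
            if challenge.protectionSpace.host == "test.example.com",
               let trust = challenge.protectionSpace.serverTrust {
                completionHandler(.useCredential, URLCredential(trust: trust))
                return
            }
            #endif
            // ...but release builds always perform full certificate validation.
            completionHandler(.performDefaultHandling, nil)
        }
    }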

The bad

Buffer overflows and other C programming issues

While you can indeed crash an app this way, I’ve yet to see evidence that you can exploit an iOS app through any old buffer overflow, given the stack guards, restrictive sandboxes, address-space layout randomisation and other mitigations. While there are occasional targeted attacks, I would have preferred it if they’d been specific about which problems they think exist and what devs can do to address them.

Patching your application

Erm, no. Just get it right. If there are fast-moving parts that need to change frequently, extract them from the app and put them in a hosted component.

The platform itself

To quote Scott Pack in “The DMZ”: “If you can’t trust your users to implement your security plan, then your security plan must work without their involvement.” In other words, if you have a problem and the answer is to train 110 million people, then you have two problems.

Protecting source code

As I mentioned on the missing iDeveloper.tv Live episode, one of the consequences of the Gawker hack was that the source code for their internal software was leaked onto the Internet. I doubt any of my readers would want that to happen to their code, so I’m going to share the details of how I protect my clients’ code when I’m working. Maybe some of this will work for you.

In the office, I work at a desktop iMac. This has an external Time Machine backup disk and a Dropbox for off-site storage. However, client code does _not_ go into Dropbox. Instead I keep a separate, encrypted sparse disk image for each project I’m working on. The password for each is different. As well as protecting against snooping, this helps stop cross-contamination. I rarely have two such images mounted at once. Note that it’s not just source that goes into these images: build products, notes, Instruments traces, and images all go into the encrypted containers.

Obviously that means a lot of passwords, and no I can’t remember them all. I use a keychain. It locks automatically when not in use, and has a passphrase that’s different from my login passphrase.

The devices I test on are all encrypted where available (if a client needs me to test on an iPhone 3G, then I can, but it isn’t encrypted). They are passphrase locked, set to require passphrase immediately. And I NEVER take them away from the desk before deleting any developer builds, unless I need to do something special like a real-world location services test.

I rarely do coding work on the laptop, but when I do I copy the appropriate encrypted image onto it. The laptop additionally has FileVault configured, though I’m evaluating full-disk encryption options. Keychain configuration as above, additionally with a password required on wake from sleep or screensaver, and a firmware password.

For pushing work back to the clients, most clients use GitHub or Bitbucket, which offer SSL-encrypted connections to the repositories. Personally, I have a self-run repo host available over HTTPS or SSH, but will probably move that to a GitHub-like service because life’s too short. Their security policy seems acceptable to me.

On Trashing

Back in the 1980s and 1990s, people who wanted to clandestinely gain information about a company or organisation would go trashing.[*] That just meant diving in the bins to find information about the company structure – who worked there, who reported to whom, what orders or projects were currently in progress etc.

You’d think that these days trashing had been thwarted by the invention of the shredder, but no. While many companies do indeed destroy or shred confidential information, this is not universal. Those venues where shredding is common leave it up to their staff to decide what goes in the bin and what goes in the shredder; these staff do not always get it correct (they’re likely to think about whether a document is secret to them rather than the impact on the company). Nor do they always want to think about which bin to put a worthless sheet of paper in.

Even better: in those places that do shred secret papers, they helpfully collect all of the secrets in big bins marked “To Shred” to help the trashers :). They then collect all of these bins into a big hopper, and leave that around (sometimes outside the building, in a covered or open yard) for the destruction company to come and pick up.

So if an attacker can get entry to the building, he just roots around in the “To Shred” bins. If someone asks, he tells them he put a printout there in the morning but now thinks he needs it again. Even if he can’t get in, he just dives into the hopper outside and gets access to all those juicy secrets (with none of the banana peelings and teabags associated with the non-secret bin).

But attackers who don’t like getting their hands dirty can gain some of the same information using technological means. LinkedIn will helpfully provide a list of employees – including their positions, so the public can find out something of the reporting structure. Some will be looking for recruitment opportunities – these are great people to phone for more information! So are ex-employees, something LinkedIn will also help you out with.

But the fun doesn’t stop there. Once our attacker has the names, he now goes over to Twitter and Facebook. There he can find people griping about work…or describing what the organisation is up to, to put it another way.

All of the above information about 21st-century trashing comes from real experience with an office I was invited into in the last 12 months. Of course, I will not name the organisation in charge of that office (or their data destruction company). The conclusion is that trashing is alive and well, and that those who participate need no longer root around in, well, in the trash. How does your organisation deal with the problem?

[*] for me, it was mainly the 1990s. I was the perfect size in the 1980s for trashing, but still finding my way around a Dragon 32.

A solution in need of a problem

I don’t usually do product reviews, in fact I have been asked a few times to accept a free product in return for a review and have turned them all down. This is just such an outré product that I have decided to write a review, despite not being approached by the company and having no connection. The product is the 3M Privacy Filter Gold. Here is one I was given at InfoSec, on my MacBook:

[Image: 3M Privacy Filter Gold - front view]

You’ll probably notice that the screen has some somewhat unsightly plastic tabs around the edge. They are holding in the main feature of this product, which is the thing giving the screen that slightly coppery colour. It’s actually a sheet of plastic which I assume is etched with a blazed diffraction grating, because at more acute viewing angles it makes the screen look like this:

[Image: 3M Privacy Filter Gold - side view]

OK, so you can still see the plastic tabs, but now it’s hard to make out the text. And that’s the goal of the Privacy Filter Gold: to stop shoulder surfers by reducing the usable viewing angle of your screen. Hold on a moment, while I count the number of times I’ve been told about a business or government agency that leaked sensitive data through shoulder surfing.

0.

And the number of times I’ve heard (or discovered, through risk analysis) that it’s an important risk for an organisation to address? About the same. OK, so this product is distinctive, and gimmicky, and evidently does what it’s designed to do. But I don’t see the problem that led to this product being developed. It might be useful if you want to browse porn in Starbucks or goof off on Facebook while you’re at work, except that someone stood behind you can still see the screen. OK, you could browse porn while sat on the London Underground – except you won’t be able to get a network signal.

If anti-shoulder-surfing is important to you, you may want to take these issues into account. When used in strong sunlight, the Privacy Filter Gold makes the screen look like this (note: availability of handsome security expert holding iPhone is strictly limited):

[Image: 3M Privacy Filter Gold - in the sunshine]

The MacBook isn’t great in strong sunlight anyway, but with the filter over the top it becomes positively unusable. All of that dust is actually on the filter (although it was fresh out of its packaging when the picture was taken), and causing additional scatterings through its grating leading to a “gold dust” effect on the screen. And yes, the characters “3M GPF13.3W” are indeed etched into the filter at the top-right, near that thing that yes, is indeed a notch you can put your finger in for extracting the filter from the plastic tabs.

That’s one issue, the other is price. Retail price for these filters is around £50, varying slightly depending on screen size. I’m really not sure that’s a price worth paying considering that I have no idea what the benefit of the filter will be.

So it’s not just the Department of Homeland Security, then

What is it about government security agencies and, well, security? The UK Intelligence and Security Committee has just published its Annual Report 2008-2009 (pdf, because if there’s one application we all trust, it’s Adobe Reader), detailing financial and policy issues relating to the British security services during that year.

Sounds “riveting”, yes? Well the content is under Crown copyright[*], so I can excerpt some useful tidbits. According to the director of GCHQ:

The greatest threat [to government IT networks] is from state actors and there is an increasing vulnerability, as the critical national infrastructure and other networks become more interdependent.

The report goes on to note:

State-sponsored electronic attack is increasingly being used by nations to gather intelligence, particularly when more traditional espionage methods cannot be used. It is assessed that the greatest threat of such attacks against the UK comes from China and Russia.

and yet:

The National Audit Office management letter, reporting on GCHQ’s 2007/08 accounts, criticised the results of GCHQ’s 2008 laptop computer audit. This showed that 35 laptops were unaccounted for, including three that were certified to hold Top Secret information; the rest were unclassified. We pressed GCHQ about its procedures for controlling and tracking such equipment. It appears that the process for logging the allocation and subsequent location of laptops has been haphazard. We were told:
Historically, we just checked them in and checked them out and updated the records when they went through our… laptop control process.

So our government’s IT infrastructure is under attack from two of the most resourceful countries in the world, and our security service is giving out Top Secret information for free? It sounds like all the foreign intelligence services need to do is employ their own staff to empty the bins in Cheltenham. In fairness, GCHQ have been mandated to implement better asset-tracking mechanisms; if they do so then the count of missing laptops will be reduced to only reflect thefts/misplaced systems. At the moment it includes laptops that were correctly disposed of, in a way that did not get recorded at GCHQ.

[*] Though significantly redacted. We can’t actually tell what the budget of the intelligence services is, nor what they’re up to. How the budget is considered sensitive information, I’m not sure.

Look what the feds left behind…

So what conference was on in this auditorium before NSConference? Well, why don’t we just read the documents they left behind?

[Image: folder.jpg]

Ooops. While there’s nothing at higher clearance than Unrestricted inside, all of the content is marked internal eyes only (don’t worry, feds, I didn’t actually pay too much attention to the content. You don’t need to put me on the no-fly list). There’s an obvious problem though: if your government agency has the word “security” in its name, you should take care of security. Leaving private documentation in a public conference venue does not give anyone confidence in your ability to manage security issues.