Structure and Interpretation of Computer Programmers

I make it easier and faster for you to write high-quality software.

Thursday, March 28, 2019

There’s more to it

We saw in Apple’s latest media event a lot of focus on privacy. They run machine learning inferences locally so they can avoid uploading photos to the cloud (though Photo Stream means they’ll get there sooner or later anyway). My Twitter stream frequently features adverts from Apple, saying “we don’t sell your data”.

Of course, none of the companies that Apple are having a dig at “sell your data”, either. That’s an old-world way of understanding advertising, from the days when unscrupulous magazine publishers might have sold their mailing lists to bulk mail senders.

These days, it’s more like the postal service says “we know which people we deliver National Geographic to, so give us your bulk mail and we’ll make sure it gets to the best people”. Only in addition to National Geographic, they’re looking at kids’ comics, past due demands, royalty cheques, postcards from holiday destinations, and of course photos back from the developers.

To truly break the surveillance capitalism economy and give me control of my data, Apple can’t merely give me a private phone. But that is all they can do, hence the focus.

Going back to the analogy of postal advertising, Apple offer a secure PO Box service where nobody knows what mail I’ve got. But the surveillance-industrial complex still knows what mail they deliver to that box, and what mail gets picked up from there. To go full thermonuclear war, as promised, we would need to get applications (including web apps) onto privacy-supporting backend platforms.

But Apple stopped selling Xserve, Mac Mini Server, and Mac Pro Server years ago. Mojave Server no longer contains: well, frankly, it no longer contains the server bits. And because they don’t have a server solution, they can’t tell you how to do your server solution. They can’t say “don’t use Google cloud, it means you’re giving your customers’ data to the surveillance-industrial complex”, because that’s anticompetitive.

At the Labrary, I run my own Nextcloud for file sharing, contacts, calendars, tasks etc. I host code on my own gitlab. I run my own mail service. That’s all work that other companies wouldn’t take on, expertise that’s not core to my business. But it does mean I know where all company-related data is, and that it’s not being shared with the surveillance-industrial complex. Not by me, anyway.

There’s more to Apple’s thermonuclear war on the surveillance-industrial complex than selling privacy-supporting edge devices. That small part of the overall problem supports a trillion-dollar company.

It seems like there’s a lot that could be interesting in the gap.

posted by Graham at 10:11  

Sunday, May 27, 2012

Is privacy a security feature?

I’ve spoken a lot about privacy recently: mainly because it’s an important problem. Important enough to hit the headlines; important enough for trade associations and independent developers alike to make a priority. Whether it’s talks at conferences, or guiding people on designing or implementing their apps, there’s been a lot of privacy involved. But is it really on-topic for a security boffin?

The “yes” camp: Microsoft

In Michael Howard and David LeBlanc’s book, “Writing Secure Code, Second Edition”, there’s a whole chapter on privacy:

Most privacy threats are information disclosure threats. When performing threat analysis, you should look at all such threats as potential privacy violations.

In this view, a privacy problem is a consequence of a failure of confidentiality. You model your application, taking into account what data it protects, what value the customers put on that data, and how important it is to protect its confidentiality. Personally-identifying information is modelled in exactly this way.

Privacy automatically falls out of this modelling technique: if people can get access to confidential data, then you have a privacy violation (that also looks like a security vulnerability because it appears in your threat model).

The “no” camp: Oh, it’s Microsoft again

A different viewpoint is expressed in another book by Michael Howard (with Steve Lipner this time): “The Security Development Lifecycle”.

Many people see privacy and security as different views of the same issue. However, privacy can be seen as a way of complying with policy and security as a way of enforcing policy. […] Privacy’s focus is compliance with regulatory requirements[…], corporate policy, and customer expectations.

So in this model, privacy is a statement of intent, and security is a tool to ensure your software follows through on your intent. It’s the difference between design and implementation: privacy is about ensuring you build the right thing, and security helps you build the thing right. The two have nothing to say about each other, except that if you didn’t get the security right you can’t make any claim about whether the policy expressed in the privacy requirements will successfully be met in deployment.

The “who cares?” camp: me

The argument above seems to be a question of semantics, and trying to apportion responsibility for different aspects of development to different roles. In fact, everyone involved in making a product has the same goal – to make a great product – and such niggling is distracting from that goal.

Most of my professional work fits into one of a few categories:

  • Learning stuff
  • Making stuff
  • Helping other people make better stuff
  • Making other people better at making stuff than I am

So if, in the process of helping someone with their security, I should be able to help with their app’s privacy too, should I really keep quiet until we’ve solved some quibbling point of semantics?

posted by Graham at 15:35  

Sunday, March 25, 2012

More about the privacy pledge

Plenty of you have seen—and indeed signed— the App Makers’ Privacy Pledge on GitHub. If you haven’t, but after reading it are interested, see the instructions in the project README.

It’s great to see so many app makers taking an interest in this issue, and the main goal of the pledge is to raise awareness of app privacy concerns: awareness among developers that this is something to take seriously, and awareness among our customers that there are developers committed to respecting their identities and their data.

But awareness is useless if not followed through, so we need to do more. We need materials that developers can refer to: the GSM Association have good guidelines on app practices. We need actionable tasks that developers can implement right away, like Matt Gemmell’s hashing guide for social apps. We need sample code and libraries that developers can rely on. We need data lawyers to explain what the current regulations are, and what’s coming down the pipe. We need to convince the industry and the governments that we can regulate our own actions. We need the ability to audit our apps and determine whether they’re privacy-preserving. We need to be able to demonstrate to customers what we’ve done, and explain why that’s a good thing. We need to earn customer trust.

So there’s a lot to do, and the pledge is only the start. It’s off to a good start, but there’s still a long way to go.

posted by Graham at 21:06  

Wednesday, February 8, 2012

On privacy, hashing, and your customers

I’ve talked before about not being a dick when it comes to dealing with private data and personally-identifying information. It seems events have conspired to make it worth diving into some more detail.

Only collect data you need to collect (and have asked for)

There’s plenty of information on the iPhone ripe for the taking, as fellow iOS security boffin Nicolas Seriot discussed in his Black Hat paper. You can access a lot of this data without prompting the user: should you?

Probably not: that would mean being a dick. Think about the following questions.

  • Have I made it clear to my customers that we need this data?
  • Have I already given my customers the choice to decline access to the data?
  • Is it obvious to my customer, from the way our product works, that the product will need this data to function?

If the answer is “no” to any of these, then you should consider gathering the data to be a risky business, and the act of a dick. By the way, you’ll notice that I call your subscribers/licensees “your customers” not “the users”; try doing the same in your own discussions of how your product behaves. Particularly when talking to your investors.

Should you require a long-form version of that discussion, there’s plenty more detail on appropriate handling of customer privacy in the GSMA’s privacy guidelines for mobile app developers.

Only keep data you need to keep

Paraphrasing Taligent: There is no data more secure than no data. If you need to perform an operation on some data but don’t need to store the inputs, just throw the data away. As an example: if you need to deliver a message, you don’t need to keep the content after it’s delivered.

Hash things where that’s an option

If you need to understand associations between facts, but don’t need to be able to read the facts themselves, you can store a one-way hash of the fact so that you can trace the associations anonymously.

As an example, imagine that you direct customers to an affiliate website to buy some product. The affiliates then send the customers back to you to handle the purchase. This means you probably want to track the customer’s visit to your affiliate and back into your purchase system, so that you know who to charge for what and to get feedback on how your campaigns are going. You could just send the affiliate your customer’s email address:

X-Customer-Identifier: iamleeg@gmail.com

But now everybody who can see the traffic – including the affiliate and their partners – can see your customer’s email address. That’s oversharing, or “being a dick” in the local parlance.

So you might think to hash the email address using a function like SHA1; you can track the same hash in and out of the affiliate’s site, but the outsiders can’t see the real data.

X-Customer-Identifier: 028271ebf0e9915b1b0af08b297d3cdbcf290e3c
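In code, that identifier might be computed something like this (a minimal Python sketch; the email address and header value are made-up examples, not the hash shown above):

```python
import hashlib

def customer_identifier(email):
    # One-way hash: the affiliate sees a stable token for this customer,
    # not the address itself.
    return hashlib.sha1(email.encode("utf-8")).hexdigest()

print(customer_identifier("customer@example.com"))
```

The same address always yields the same 40-character token, which is what lets you trace the visit out to the affiliate and back.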

We still have a couple of problems, though. Anyone who can see this hash can take some guesses at what the content might be: they don’t need to reverse the hash, just figure out what it might contain and have a go at that. For example, if someone knows you have a user called ‘iamleeg’, they might try generating hashes of email addresses at various providers with that same username until they hit on the Gmail address as a match.
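That guessing attack is cheap to carry out. Here’s a hypothetical Python sketch using the ‘iamleeg’ example above; the attacker never reverses the hash, they just hash their own guesses:

```python
import hashlib

def sha1_hex(s):
    return hashlib.sha1(s.encode("utf-8")).hexdigest()

# The identifier observed on the wire; the attacker never sees the email.
observed = sha1_hex("iamleeg@gmail.com")

# The attacker knows the username and simply tries popular providers.
recovered = None
for provider in ("gmail.com", "yahoo.com", "outlook.com", "icloud.com"):
    guess = "iamleeg@" + provider
    if sha1_hex(guess) == observed:
        recovered = guess
        break

print(recovered)  # prints "iamleeg@gmail.com"
```

A handful of hashes is all it takes when the input space is guessable, which is exactly why an unsalted hash of low-entropy data gives so little protection.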

Another issue is that if multiple affiliates all partner with the same third business, that business can match the same hash across those affiliate sites and build up an aggregated view of that customer’s behaviour. For example, imagine that a few of your affiliates all use an analytics company called “Slurry” to track use of their websites. Slurry can see the same customer being passed by you to all of those sites.

So an additional step is to append a different random value called a salt to the data before you hash it in each context. Then the same data seen in different contexts cannot be associated, and it becomes harder to precompute a table of guesses at the meaning of each hash. So, let’s say that for one site you send the hash of “sdfugyfwevojnicsjno” + email. Then the header looks like:

X-Customer-Identifier: 22269bdc5bbe4473454ea9ac9b14554ae841fcf3

[OK, I admit I’m cheating in this case just to demonstrate the progressive improvement: in fact in the example above you could hash the user’s current login session identifier and send that, so that you can see purchases coming from a particular session and no-one else can track the same customer on the same site over time.]
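A minimal Python sketch of the salted, per-context scheme described above (the salt values here are made up for illustration; in practice each would be a long, randomly generated secret stored only on your side):

```python
import hashlib

# Hypothetical per-affiliate salts, one random secret per context.
SALTS = {
    "affiliate-a": "sdfugyfwevojnicsjno",
    "affiliate-b": "qpwoeirutyalskdjfhg",
}

def customer_identifier(affiliate, email):
    # Prepending a different salt per context means the same email
    # yields unrelated hashes, so a third party working with several
    # affiliates can't correlate the customer across their sites.
    return hashlib.sha1((SALTS[affiliate] + email).encode("utf-8")).hexdigest()

a = customer_identifier("affiliate-a", "customer@example.com")
b = customer_identifier("affiliate-b", "customer@example.com")
print(a != b)  # prints "True": same customer, uncorrelatable identifiers
```

The salts also defeat precomputed tables of guesses: an attacker would have to know the per-context secret before the guessing attack described earlier becomes feasible.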

N.B. I previously discussed why Apple are making a similar change with device identifiers.

But we’re a startup, we can’t afford this stuff

Startups are all about iterating quickly, finding problems and fixing them or changing strategy, right? The old pivot/persevere choice? Validated learning? OK, tell me this: why doesn’t that apply to security or privacy?

I would say that it’s fine for a startup to release a first version that covers the following minimum requirements (something I call “Just Barely Good Enough” security):

  • Legal obligations to your customers in whatever countries those customers (and your data) reside
  • Standard security practices such as mitigating the OWASP top ten or OWASP mobile top ten
  • Not being a dick

In the O2 Labs I’ve been working with experts from various groups – legal, OFCOM compliance, IT security – to draw up checklists covering all of the above. Covering the baseline security won’t mean building the thing then throwing it at a pen tester to laugh at all the problems: it will mean going through the checklist. That can even be done while we’re planning the product.

Now, as with everything else in both product engineering and in running a startup, it’s time to measure, optimise and iterate. Do changes to your product change its conformance with the checklist issues? Are your customers telling you that something else you didn’t think of is important? Are you detecting intrusions that existing countermeasures don’t defend against? Did the law change? Measure those things, change your security posture, iterate: use the metrics to ensure that you’re pulling in the correct direction.

I suppose if I were willing to spend the time, I could package the above up as “Lean Security” and sell a 300-page book. But for now, this blog post will do. Try not to be a dick; check that you’re not being a dick; be less of a dick.

posted by Graham at 13:44  

Tuesday, September 6, 2011

Don’t be a dick

In a recent post on device identifiers, I wrote a guideline that I’ve previously invoked when it comes to sharing user data. Here is, in both more succinct and complete form than in the above-linked post, the Don’t Be A Dick Guide to Data Privacy:

  • The only things you are entitled to know are those things that the user told you.
  • The only things you are entitled to share are those things that the user permitted you to share.
  • The only entities with which you may share are those entities with which the user permitted you to share.
  • The only reason for sharing a user’s things is that the user wants to do something that requires sharing those things.

It’s simple, which makes for a good user experience. It’s explicit, which means culturally-situated ideas of acceptable implicit sharing do not muddy the issue.

It’s also general. One problem I’ve seen with privacy discussions is that different people have specific ideas of what the absolutely biggest privacy issue that must be solved now is. For many people, it’s location: they don’t like the idea that an organisation (public or private) can see where they are at any time. For others, it’s unique identifiers that would allow an entity to form an aggregate view of their data across multiple functions. For others, it’s conversations they have with their boss, mistress, whistle-blower or others.

Because the DBADG mentions none of these, it covers all of these. And more. Who knows what sensors and capabilities will exist in future smartphone kit? They might use mesh networks that can accurately position users in a crowd with respect to other members. They could include automatic person recognition to alert when your friends are nearby. A handset might include a blood sugar monitor. The fact is that by not stopping to cover any particular form of data, the above guideline covers all of these and any others that I didn’t think of.

There’s one thing it doesn’t address: just because a user wants to share something, should the app allow it? This is particularly a question that makers of apps for children should ask themselves. Children (and everybody else) deserve the default-private treatment of their data that the DBADG promotes. However, children also deserve impartial guidance on what it is a good or a bad idea to share with the interwebs at large, and that should be baked into the app experience. “Please check with a responsible adult before pressing this button” does not cut it: just don’t give them the button.

posted by Graham at 11:00  

Saturday, December 4, 2010

A site for discussing app security

There’s a new IT security site over at Stack Exchange. Questions and answers on designing and implementing IT security policy, and on app security are all welcome.

I’m currently a moderator at the site, but that’s just an interim thing while the site is being bootstrapped. Obviously, if people subsequently vote for me as a permanent moderator I’ll stay in, but the converse is also true. Anyway, check out the site, ask and answer questions, let’s make it as good a venue for app security discussion as Stack Overflow is for general programming.

posted by Graham at 16:32  

Friday, December 3, 2010

On Fuzzy Aliens

I have just launched a new company, Fuzzy Aliens[*], offering application security consultancy services for smartphone app developers. This is not the FAQ list, this is the “questions I want to answer so that they don’t become frequently asked” list.

What do you offer?

The company’s services are all focussed on helping smartphone and tablet app developers discover and implement their applications’ security and privacy requirements. When planning an app, I can help with threat modelling, with training developers, securing the development lifecycle, requirements elicitation, secure user experience design, and with developing a testing strategy.

When it comes to implementation, you can hire me to do the security work on your iOS or Android app. That may be some background “plumbing” like storing a password or encrypting sensitive content, or it might be an end-to-end security feature. I can also do security code reviews and vulnerability analysis on existing applications.

Why would I want that?

If you’re developing an application destined for the enterprise market, you probably need it. Company I.T. departments will demand applications that conform to local policy regarding data protection, perhaps based on published standards such as the ISO 27000 family or PCI-DSS.

In the consumer market, users are getting wise to the privacy problems associated with mobile apps. Whether it’s accidentally posting the wrong thing to facebook, or being spied on by their apps, the public don’t want to—and shouldn’t need to—deal with security issues when they’re trying to get their work done and play their games.

Can I afford that?

Having been a Micro-ISV and contracted for others, I know that many apps are delivered under tight budgets by one-person companies. If all you need is a half day together to work on a niggling problem, that’s all you need to pay for. On the other hand I’m perfectly happy to work on longer projects, too :).

Why’s it called Fuzzy Aliens?

Well, the word “fuzz” obviously has a specific meaning in the world of secure software development, but basically the answer is that I knew I could turn that into a cute logo (still pending), and that it hadn’t been registered by a UK Ltd yet.

So how do I contact you about this?

You already have – you’re here. But you could see the company’s contact page for more specific information.

[*] More accurately, I have indicated the intent to do so. The articles of association have not yet been returned by Companies House, so for the next couple of days the blue touch paper is quietly smouldering.

posted by Graham at 17:08  

Monday, February 22, 2010

Look what the feds left behind…

So what conference was on in this auditorium before NSConference? Well, why don’t we just read the documents they left behind?


Ooops. While there’s nothing at higher clearance than Unrestricted inside, all of the content is marked internal eyes only (don’t worry, feds, I didn’t actually pay too much attention to the content. You don’t need to put me on the no-fly list). There’s an obvious problem though: if your government agency has the word “security” in its name, you should take care of security. Leaving private documentation in a public conference venue does not give anyone confidence in your ability to manage security issues.

posted by Graham at 13:48  

Friday, January 29, 2010

It’s just a big iPod

I think you would assume I had my privacy settings ramped up a little too high if I hadn’t heard about the iPad, Apple’s new touchscreen mobile device. Having had a few days to consider it and allow the hype to die down, my considered opinion on the iPad’s security profile is this: it’s just a big iPod.

Now that’s no bad thing. We’ve seen from the iPhone that the moderated gateway for distributing software—the App Store—keeps malware away from the platform. Both the Rickrolling iKee worm and its malicious sibling, Duh, rely on users enabling software not sanctioned through the app store. Now whether or not Apple’s review process is a 100% foolproof way of keeping malware off iPhones, iPods and iPads is not proven either way, but it certainly seems to be doing its job so far.

Of course, reviewing every one of those 140,000+ apps is not a free process. Last year, Apple were saying 98% of apps were reviewed in 7 days; this month, only 90% are approved in 14 days. So there’s clearly a scalability problem with the review process, and if the iPad does genuinely lead to a “second app store gold rush” then we’ll probably not see an improvement there, either. Now, if an app developer discovers a vulnerability in their app (or worse, if a zero-day is uncovered), it could take a couple of weeks to get a security fix out to customers. How should the developer deal with that situation? Should Apple get involved (and if they do, couldn’t they have used that time to approve the update)? Update: I’m told (thanks @Reversity) that it’s possible to expedite reviews by emailing Apple. We just have to hope that not all developers find out about that, or they’ll all try it.

The part of the “big iPod” picture that I find most interesting from a security perspective, however, is the user account model. In a nutshell, there isn’t one. Just like an iPhone or iPod, it is assumed that the person touching the screen is the person who owns the data on the iPad. There are numerous situations in which that is a reasonable assumption. My iPhone, for instance, spends most of its time in my pocket or in my hand, so it’s rare that someone else gets to use it. If someone casually tries to borrow or steal the phone, the PIN lock should be sufficient to keep them from gaining access. However, as it’s the 3G model rather than the newer 3GS, it lacks filesystem encryption, so a knowledgeable thief could still get the data from it. (As an aside, Apple have not mentioned whether the iPad features the same encryption as the iPhone 3GS, so it would be safest to assume that it does not).

The iPad makes sense as a single-user or shared device if it is used as a living room media unit. My girlfriend and I are happy to share music, photos, and videos, so if that’s all the iPad had it wouldn’t matter if we both used the same one. But for some other use cases even we need to keep secrets from each other—we both work with confidential data so can’t share all of our files. With a laptop, we can each use separate accounts, so when one of us logs in we have access to our own files but not to the other’s.

That multi-user capability—even more important in corporate environments—doesn’t exist in the iPhone OS, and therefore doesn’t exist on the iPad. If two different people want to use an iPad to work with confidential information, they don’t need different accounts; they need different iPads. [Another aside: even if all the data is “in the cloud” the fact that two users on one iPad would share a keychain could mean that they have access to each others’ accounts anyway.] Each would need to protect his iPad from access by anyone else. Now even though in practice many companies do have a “one user, one laptop” correlation, they still rely on a centralised directory service to configure the user accounts, and therefore the security settings including access to private data.

Now the iPhone Configuration Utility (assuming its use is extended to iPads) allows configuration of the security settings on the device such as they are, but you can’t just give Jenkins an iPad, have him tell it that he’s Jenkins, then have it notice that it’s Jenkins’s iPad and should grab Jenkins’s account settings. You can do that with Macs and PCs on a network with a directory service; the individual computers can be treated to varying extents as pieces of furniture which only become “Jenkins’s computer” when Jenkins is using one.

If the iPad works in the same way as an iPhone, it will grab that personal and account info from whatever Mac or PC it’s synced to. Plug it in to a different computer, and that one can sync it, merging or replacing the information on the device. This makes registration fairly easy (“here’s your iPad, Jenkins, plug it in to your computer when you’re logged in”) and deregistration more involved (“Jenkins has quit, we need to recover or remove his PIN, take the data from the iPad, then wipe it before we can give it to Hopkins, his replacement”). I happen to believe that many IT departments could, with a “one iPad<->one computer<->one user” system, manage iPads in that way. But it would require a bit of a change from the way they currently run their networks and IT departments don’t change things without good reason. They would probably want full-device encryption (status: unknown) and to lock syncing to a single system (status: the iPhone Enterprise Deployment Guide doesn’t make it clear, but I think it isn’t possible).

What is clear based on the blogosphere/twitterverse reaction to the device is that many companies will be forced, sooner or later, to support iPads, just as when people started turning up to the helpdesks with BlackBerries and iPhones expecting them to be supported. Being part of that updated IT universe will make for an exciting couple of years.

posted by Graham at 06:03  
