It’s understandable when things change as fast as they do these days that it takes a bit for our ideas of how things work to catch up to how they actually work. One misunderstanding worth clearing up, since it’s so sensitive, is the suggestion that Apple (or Google, or whoever) is somewhere maintaining a special folder in which all your naughty pics are kept. You’re right to be suspicious, but fortunately, that’s not how it works.
What these companies are doing, one way or another, is analyzing your photos for content. They use sophisticated image recognition algorithms that can easily recognize anything from dogs and boats to faces and actions.
When a dog is detected, a “dog” tag is added to the metadata that the service tracks in relation to that photo — alongside things like when you took the picture, its exposure settings, location, and so on. It’s a very low-level process — the system doesn’t actually know what a dog is, just that photos with certain numbers associated with them (corresponding to various visual features) get that tag. But now you can search for those things and it can find them easily.
This analysis generally happens inside a sandbox, and very little of what the systems determine makes it outside of that sandbox. There are special exceptions, of course, for things like child pornography, for which very special classifiers have been created and which are specifically permitted to reach outside that sandbox.
The sandbox once needed to be big enough to encompass a web service — you would only get your photos tagged with their contents if you uploaded them to Google Photos, or iCloud, or whatever. That’s no longer the case.
Because of improvements in the worlds of machine learning and processing power, the same algorithms that once had to live on giant server farms are now efficient enough to run right on your phone. So now your photos get the “dog” tag without having to send them off to Apple or Google for analysis.
This is arguably a much better system in terms of security and privacy — you are no longer using someone else’s hardware to examine your private data and trusting them to keep it private. You still have to trust them, but there are fewer parts and steps to trust — a simplification and shortening of the “trust chain.”
But expressing this to users can be difficult. What they see is that their private — perhaps very private — photos have been assigned categories and sorted without their consent. It’s kind of hard to believe that this is possible without a company sticking its nose in there.
Part of that is the UI’s fault. When you search in the Photos app on iPhone, it shows what you searched for (if it exists) as a “category.” That suggests that the photos are “in” a “folder” somewhere on the phone, presumably labeled “car” or “swimsuit” or whatever. What we have here is a failure to communicate how the search actually works.
The limitation of these photo classifier algorithms is that they’re not particularly flexible. You can train one to recognize the 500 most common objects seen in photos, but if your photo doesn’t have one of those in it, it doesn’t get tagged at all. The “categories” you’re seeing listed when you search are those common objects that the systems are trained to look for. As noted above, it’s a pretty approximate process — really just a threshold confidence level that some object is in the picture. (In the image above, for instance, the picture of me in an anechoic chamber was labeled “carton,” I suppose because the walls look like milk cartons?)
The whole “folder” thing and most ideas of how files are stored in computer systems today are anachronistic. But those of us who grew up with the desktop-style nested folder system often still think that way, and it’s hard to think of a container of photos as being anything other than a folder — but folders have certain connotations of creation, access, and management that don’t apply here.
Your photos aren’t being put in a container with the label “swimsuit” on it — it’s just comparing the text you wrote in the box to the text in the metadata of the photo, and if swimsuits were detected, it lists those photos.
This doesn’t mean the companies in question are entirely exonerated from all questioning. For instance, what objects and categories do these services look for, what is excluded and why? How were their classifiers trained, and are they equally effective on, for example, people with different skin colors or genders? How do you control or turn off this feature, or if you can’t, why not?
Fortunately, I’ve contacted several of the leading tech companies to ask some of these very questions, and will detail their responses in an upcoming post.