Thoughts on Identity, Part 1
Jesper M. Johansson
A concept many of us thought we had a good handle on keeps coming back up these days: identity. With unrelenting speed, we are seeing various new identity projects, most in the form of some kind of distributed identity system and/or something that will finally replace passwords. These systems all have one basic goal: to replace the plethora of username-and-password combinations we all have to remember with a single identity that gets us access to everything.
As I look at the 127,128 passwords I have stored in my Password Safe, I think that actually sounds quite appealing. And each time I go to any Microsoft.com property and am asked, again, for my Windows Live ID and password, I realize we also have a very long way to go before reaching a world of single sign-on (SSO).
Identity, however, is a nebulous concept and an oft-misunderstood one. In this article, I lay out some thoughts on identity and principles that an identity system has to meet. I do not have enough gray hair yet to pretend that these are fundamental laws that these systems must obey. However, I do believe that any identity system that fails to abide by these principles will fail in the court of consumer opinion, as well as in the one of enterprise support and deployment. An identity system, as theoretically pleasing as it may be, must ultimately provide value to a business and to a user if it is going to succeed.
What Is an Identity?
An identity, simply put, is an abstract representation of an entity in a computer system. Philosophy defines it as the "sameness" of two things. Two (or more) things are identical if they are the same. Yes, that would qualify as a highly circular definition. So, instead, let's say that identity states that an entity is definable and recognizable. Maybe this is simpler:
That is the basic definition of identity in symbolic logic. P is equal to Q if P is the same thing as Q. Restated slightly differently, if Peter is Peter then clearly Peter is Peter. Now all we need is a way for Peter to prove that he (or she?) is in fact Peter.
Methods of proving identity are highly interesting in digital systems. In the digital world, we have a slightly different definition of identity from that in symbolic logic. In his seminal paper "The Laws of Identity," Kim Cameron defines digital identity as "a set of claims made by one digital subject about itself or another digital subject."
That is a different definition from the pure logical one. The pure logical definition is simply who you are. In the digital world, identity is rooted in the claims you choose to present. In other words, a digital identity is not who you are, but rather, who you choose to be. In a digital identity system, as long as Peter, or whoever claims to be Peter, can provide an acceptable proof of that claim (meaning that he can authenticate the claim), we must accept Peter's identity. The claims Cameron talks about are essentially a manifestation of identity. Those are the authenticators Peter, or whoever needs to be Peter, is presenting.
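To make the claims-based definition concrete, here is a minimal Python sketch. The class names, the fields, and the example values (Contoso, nightowl42) are hypothetical and are not taken from Cameron's paper; the only idea borrowed is that a digital identity is nothing more than the set of claims presented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    """One statement a subject makes about itself or another subject."""
    issuer: str      # who is making the claim
    subject: str     # whom the claim is about
    attribute: str   # e.g. "name" (hypothetical attribute names)
    value: str


@dataclass
class DigitalIdentity:
    """A digital identity modeled as the set of claims presented."""
    claims: frozenset = field(default_factory=frozenset)

    def presents(self, attribute: str, value: str) -> bool:
        """True if this identity includes a claim with the given attribute and value."""
        return any(c.attribute == attribute and c.value == value
                   for c in self.claims)


# One physical person, two different digital identities.
work = DigitalIdentity(frozenset({
    Claim("peter", "peter", "name", "Peter"),
    Claim("peter", "peter", "employer", "Contoso"),
}))
forum = DigitalIdentity(frozenset({
    Claim("peter", "peter", "handle", "nightowl42"),
}))

print(work.presents("name", "Peter"))   # the work identity claims a name
print(forum.presents("name", "Peter"))  # the forum identity does not
```

Nothing in the sketch checks whether a claim is true; that is the job of authentication, which comes up below.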
Therein lies the first identity problem. Digital identity is not the same as an ontological (in the philosophical sense) identity—a representation of a real world entity that actually exists. Digital identity is ephemeral. Digital identity is incomplete. Digital identity is optional. And, most interesting of all, a single physical entity can present many different sets of claims and, therefore, have many different digital identities. Digital identities are quite possibly most useful when they do not have a 1:1 correspondence to an ontological entity.
The Identity Problem
A claims-based digital identity concept causes many interesting problems. First is the fact that an identity has a price. You can buy a whole new identity for only a few thousand dollars. For that money you get a functioning social security number along with a valid address that you can change. Driver's licenses, credit cards, passports, and the rest can be had with that. If you want an identity with a verifiable personal history, that will probably cost you quite a bit more.
Some people, including myself, claim that this means that ontological identity is meaningless as it relates to digital identity systems. Personally, I find true identity (or at least, correctness of identity) to be irrelevant and uninteresting when dealing with it in the context of a digital system. For the vast majority of applications, identity merely means that you can lie consistently. If you are able to present the same lies to me today that you did two weeks ago, I am going to assume you are the same person/entity/computer and give you whatever access I granted to you last time.
Given that true, ontological identity is not only ephemeral, but irrelevant to most applications, I don't care about binding you to a real-world entity. I just need correlation, not true physical identity, for most purposes. Any application that relies on binding identity to a physical and immutable entity is doomed from the start, and relying on such a strong connection means you will always make mistakes and they will always be critical. Any real-world person can undermine such a system merely by lying or withholding the claims of identity or, of course, by sharing them (inadvertently or deliberately) with another entity.
Rather, by building a system that (1) is based on the assumption that any claimed identity is fake, and (2) merely attempts to correlate identity between sessions, we can build a far more resilient, useful, and valuable system. And we waste far fewer resources attempting to solve a problem that, throughout history, nobody has been able to solve without outrageous violations of personal privacy.
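As a rough illustration of correlation without binding to a real-world entity, the following Python sketch remembers a digest of whatever claims were presented and recognizes the same claims when they come back. The names `fingerprint` and `SessionCorrelator` are made up for this example, and a real system would also have to authenticate the claims rather than merely compare them.

```python
import hashlib
import json


def fingerprint(claims: dict) -> str:
    """Stable digest of a set of claims; says nothing about whether they are true."""
    canonical = json.dumps(claims, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class SessionCorrelator:
    """Grants a returning entity whatever it was granted last time."""

    def __init__(self):
        self._grants = {}  # fingerprint -> access granted on an earlier visit

    def remember(self, claims: dict, access: str) -> None:
        self._grants[fingerprint(claims)] = access

    def recognize(self, claims: dict):
        """Return the earlier grant if the same claims are presented, else None."""
        return self._grants.get(fingerprint(claims))


correlator = SessionCorrelator()
correlator.remember({"name": "Peter", "first_pet": "Rex"}, "read-only")

# Two weeks later: the same "lies", in a different order, still correlate.
print(correlator.recognize({"first_pet": "Rex", "name": "Peter"}))
# A different story does not.
print(correlator.recognize({"name": "Peter", "first_pet": "Fido"}))
```

The correlator never asks whether there is a real Peter or a real Rex. It only checks that today's story matches the earlier one.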
Another problem with identity is that identity by itself is not something you can use. Yet, throughout history, identities have often been taken for granted, without authentication. In Shakespeare's Henry IV, Part 2, there is a passage where a character says, "I am Robert Shallow, sir, a poor esquire of this county, and one of the King's justices of the peace." Curiously, the character to whom Justice Shallow was speaking took this claim of identity as truth, with no further verification. Go back into earlier forms of literature, such as the Norse Sagas, and you will find similar claims of identities made by Thor and Odin, all of which were immediately accepted without question.
Clearly, if you have only one eye, are riding on an eight-legged steed, and claim to be Odin, one might argue that you have provided some amount of verification of identity. However, on the Internet, we cannot use the mounts we ride as verification of identity. As the old saying goes, "On the Internet nobody knows you are a dog." Therefore, unauthenticated claims of identity are hardly ever accepted, and we need to somehow authenticate identity.
"Ay, there's the rub." (Hamlet.) How do you prove your identity? After all, the proof of your identity is as important, perhaps more so, as the actual identity. So, how do we prove identities? The method we used to use was a form of shared secret authentication, known as "passwords." On February 14, 2006, Microsoft Chairman Bill Gates declared that passwords would be gone where the dinosaurs rest in three to four years.
But as I write this in March 2009, it is pretty clear that Bill was wrong. I have more passwords now than I had in February 2006. Just at work, I have my network password, my Unix password, the password for the expense reporting system, the password for the HR system, passwords for the company stock broker and benefits providers, the root passwords for my developer machine and a few servers, the admin passwords for my laptops, the PIN for my phone, and a couple of extraneous passwords to access the company products as various users. Still, despite having so many passwords, I can honestly say that I really do not wish passwords were dead. They are not actually a bad authentication claim.
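For readers who want to see what shared secret authentication looks like in practice, here is a minimal Python sketch using PBKDF2 from the standard library. It is an illustration under assumptions, not a description of any particular product: the function names and the iteration count are made up for this example, and a production system should pick its parameters from current guidance.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 200_000  # assumed work factor for this sketch; tune for real use


def enroll(password: str) -> Tuple[bytes, bytes]:
    """Derive a salted digest to store in place of the password itself."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify(password: str, salt: bytes, digest: bytes) -> bool:
    """Re-derive the digest from the offered password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)


salt, stored = enroll("correct horse battery staple")
print(verify("correct horse battery staple", salt, stored))  # same secret
print(verify("Tr0ub4dor&3", salt, stored))                   # different secret
```

The verifier never needs to know who is typing. It only needs the same secret to show up again.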
What was Bill Gates thinking? He thought, three years ago, that by now InfoCard would have replaced all these passwords. InfoCard, which was eventually renamed Windows CardSpace, is a kind of authentication technology that shipped in Windows Vista and later in Windows XP SP3.
Points go to anyone who has ever created a card. I will give you extra credit if you find a single Web site that accepts a Windows CardSpace credential. If you just want to try it, you can take part in a beta test of InfoCard for Windows Live.
Stakeholders in Identity
There are, of course, other ways to authenticate an identity. Some are more useful in certain situations. To understand where, you need to understand the difference between the stakeholders in an identity system.
A provider of identity services is some entity that provides a service to the other parties. Typically, the provider holds the identity database and authenticates the end users of identities. Typically, the provider is not a person. Thus, the provider can deal with much larger volumes of structured data than a person can. For example, a provider can use a shared secret of arbitrary length to prove its own identity. Therefore, the credential used to authenticate the identity of an identity provider can often be many orders of magnitude more complex than what you and I can remember. Consequently, it is potentially many orders of magnitude more secure. I say "potentially" because whether it really is depends greatly on how the secrets are managed. It does no good to have a 4096-bit secret to prove your identity if you do not actually keep it secret.
Identity providers can very easily use constructs such as Digital Certificates to authenticate themselves. A Digital Certificate is simply a relatively sizeable chunk (a couple of thousand bytes) of structured information. A computer can easily send that as part of every transaction. You and I, however, would not be too thrilled if we had to enter that to authenticate for a transaction.
The end user is the entity that needs to have its identity asserted. Very often, the user is a person, but that is not always the case—the user can also be a computer system. Therefore, the quality of the claims the user can present varies greatly.
Finally, there's the relying party. The relying party is the entity that trusts the identity provider to verify the user before providing the user some service. The relying party is almost always a computer system. To the extent that the relying party proves its own identity to a user and the identity provider, it can do so with claims as strong as those the identity provider uses.
It is important to note that a user in one scenario is often a relying party in another. If the user is a piece of software, or has the help of a piece of software, this user can easily deal with the same types of structured data as the provider. That is why distributed identity protocols, such as OAuth and the various security specifications that are part of WS-*, focus heavily on significant quantities of structured data for authentication: they are primarily focused on services authenticating to each other. Even when those specifications deal directly with people, they tend to be brokered through services that can handle the structured data on behalf of the people. Often, the broker is just an application on the user's computer.
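The three roles can be sketched in a few lines of Python. This is not OAuth and not any WS-* specification; it is a toy assertion format invented for this example, in which the identity provider and the relying party are assumed to share an HMAC key. The function names and the field names (`sub`, `aud`, `exp`) are illustrative choices.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_assertion(subject, audience, key, ttl=300, now=None):
    """Identity provider: vouch for `subject` to one relying party for `ttl` seconds."""
    now = time.time() if now is None else now
    body = json.dumps({"sub": subject, "aud": audience, "exp": now + ttl},
                      sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(body).decode("ascii") + "." +
            base64.urlsafe_b64encode(tag).decode("ascii"))


def accept_assertion(token, audience, key, now=None):
    """Relying party: return the subject if the assertion checks out, else None."""
    now = time.time() if now is None else now
    try:
        body_b64, tag_b64 = token.split(".")
        body = base64.urlsafe_b64decode(body_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None  # not issued by a holder of the shared key
    claims = json.loads(body)
    if claims["aud"] != audience or claims["exp"] < now:
        return None  # meant for someone else, or expired
    return claims["sub"]


shared_key = b"\x01" * 32  # placeholder key for the example only
token = issue_assertion("customer-42", "shop.example", shared_key)
print(accept_assertion(token, "shop.example", shared_key))  # accepted
print(accept_assertion(token, "bank.example", shared_key))  # wrong audience
```

The token is exactly the kind of structured data a service handles effortlessly and a person would never type by hand, which is why the broker on the user's machine matters.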
The concept that a user (a person) uses a computer system as an intermediary when accessing services from a relying party is quite important. It goes to the heart of how to make identities usable by people, as opposed to systems. Unfortunately, other than InfoCard, which so far has found little success, few systems really address that topic. To a large extent, the process flow when a physical person uses an identity service to get access to something looks something like Figure 1.
Figure 1 Process Flow of Using an Identity Service
Notice the "Then A Miracle Occurs" (TAMO) box. The sociotechnical aspects of the system (how you enable users to take full advantage of it) are where a system is truly made or broken.
There have been many attempts at resolving the TAMO issue. Most came about because of the incongruity that on the Internet we want to use strong identities, typically based on certificates, but users persistently refuse to type them in every time they need to access a service. Microsoft Passport, currently called Windows Live ID, was an early attempt. Google has entered the fray with its Google Account, and there are others, as well. Even Windows CardSpace can be said to have been an attempt at solving this problem. All are designed to provide some level of SSO. All have largely succeeded in doing so within the domains owned by their purveyors. All have failed almost entirely outside those domains.
In the remainder of this article and Part 2, I will address this problem in more detail. Note that I do not purport to solve the problem, but merely to point out some reasons why nobody else has yet solved the problem either. Eventually, I will arrive at a set of principles that I believe any solution to this problem must follow. That, in turn, will lead to a conclusion that says, to sum it up, ontological identity may just be a pipe dream.
How Many Identities Do You Have?
The big challenge with respect to identity is not in designing an identity system that can provide SSO, even though that is where most of the technical effort is going. It's not even in making the solution smoothly functioning and usable, where, unfortunately, less effort is going. The challenge is that users today have many identities. As I mentioned above, I have well over 100. On a daily basis, I use at least 20 or 25 of those. Perhaps users have too many identities, but I would not consider that a foregone conclusion.
The purist would now say that "SSO can fix that problem." However, I don't think it is a problem. At least it is not the big problem. I like having many identities. Having many identities means I can rest assured that the various services I use cannot correlate my information. I do not have to give my e-mail provider my stock broker identity, nor do I have to give my credit card company the identity I use at my favorite online shopping site. And only I know the identity I use for the photo sharing site. Having multiple identities allows me to keep my life, and my privacy, compartmentalized.
In addition, if the credit card company manages to get itself hacked, all the other identities are unaffected. If I had a single identity, that may or may not be the case, depending on how that single identity was implemented. A properly implemented SSO system would never expose one site using the system to a failure in another, unrelated site. However, enforcing that separation is not trivial, and if the identity provider is compromised, every site that depends on it is automatically compromised.
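One commonly discussed way to get SSO while keeping sites from correlating users is for the provider to hand each relying party a different, stable pseudonym. The Python sketch below derives such pairwise identifiers with HMAC. The function name, the secret, and the site names are hypothetical, and the sketch deliberately shows the weakness noted above: everything hinges on the provider's secret.

```python
import hashlib
import hmac


def pairwise_id(master_secret: bytes, user_id: str, relying_party: str) -> str:
    """Derive a per-site pseudonym that is stable for one site and unlinkable across sites
    by anyone who does not hold the provider's master secret."""
    message = f"{relying_party}\x00{user_id}".encode("utf-8")
    return hmac.new(master_secret, message, hashlib.sha256).hexdigest()


secret = b"provider-master-secret"  # placeholder; a real secret would be random and guarded

at_bank = pairwise_id(secret, "peter", "bank.example")
at_photos = pairwise_id(secret, "peter", "photos.example")

print(at_bank != at_photos)                                     # sites see different IDs
print(at_bank == pairwise_id(secret, "peter", "bank.example"))  # but each is stable
```

If one relying party is breached, its pseudonyms are useless elsewhere. If the provider's secret leaks, every pseudonym at every site can be recomputed.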
This latter point is one reason why the digital identity systems Cameron described have, so far, failed to become the single universal identity provider—any single identity provider is an extremely sensitive entity.
Principles of Identity
There are several principles of identity that must be met in at least some way to provide a successful digital identity system. These are very different from the laws Cameron outlined in the Laws of Identity paper. Cameron's laws were design principles—use cases, more or less—that essentially define technical requirements for a trustworthy digital identity system. As such, they are a set of necessary but insufficient criteria for success. While many systems can be designed to meet or exceed Cameron's laws, I do not think any of them will be broadly successful without also taking into account the principles I define here. My principles are higher-level principles, dealing with business requirements, not with direct design points of the system.
1. The identity provider is at least as sensitive as the most sensitive relying party.
First, as I mentioned earlier, the identity provider is trusted by all parties. This is why we refer to them as "relying parties." That means that the identity provider must be at least as well protected as the most sensitive relying party requires. This single point is why many of the candidate universal systems have failed to create broad-reaching SSO systems—the trust simply is not there. Even if the trust is warranted, it is extraordinarily difficult to prove why one party should trust another.
Would you trust your e-mail provider with your bank account information and all your checking account data? If the answer is "no," you should not trust it to provide the identity you use to sign into your bank. Consumers today are deluged by unreliable software, incomprehensible pop-ups, and anti-malware vendors (legitimate and not-so-legitimate) who claim you can't possibly surf the Internet safely unless you pay them. Consequently, and quite wisely, consumers trust almost nobody. Very few organizations today are trusted enough to be candidate identity providers for other organizations. Building that kind of trust is not cheap. Keeping it is a fragile proposition. And earning it back once lost due to a breach is nearly impossible.
If we ever see a company providing a successful universal digital identity system, it will be a company that has earned a very high level of trust. Interestingly, many of the players in the identity provider space are not very high up in the surveys, or they do not show up at all (see the Ponemon Institute's fifth annual survey of Most Trusted Companies for Privacy).
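The first principle amounts to a simple inequality, which the following Python sketch spells out. The numeric sensitivity levels and the site names are invented for illustration; the article does not define any such scale.

```python
def required_provider_level(relying_parties: dict) -> int:
    """The provider must be protected at least as well as its most sensitive relying party."""
    return max(relying_parties.values(), default=0)


def provider_is_adequate(provider_level: int, relying_parties: dict) -> bool:
    """True if the provider's protection level meets the strictest requirement."""
    return provider_level >= required_provider_level(relying_parties)


# Hypothetical sensitivity levels, higher meaning more sensitive.
sites = {"photo-sharing": 1, "e-mail": 2, "bank": 5}

print(required_provider_level(sites))   # driven entirely by the bank
print(provider_is_adequate(2, sites))   # an e-mail-grade provider falls short
print(provider_is_adequate(5, sites))   # a bank-grade provider suffices
```

Adding one highly sensitive relying party raises the bar for the provider, and therefore for everything else that depends on it.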
2. Permit the enterprise to protect its customer relationships and to own and control information that it deems as business confidential.
Anyone who has an MBA knows that there are three ways to succeed with a business: you have the most innovative product, or you have the lowest price, or you excel at customer intimacy (or some combination of all three). Interestingly, virtually every business is interested in the last of these: customer intimacy. Customer relationships, especially in an online business, are sacred! In a world where switching costs are next to nothing, where everyone guarantees (for some loose definition of "guarantees") the lowest price, and the same product is available everywhere, customer relationships become key. Even where innovation does happen, such as in social networking and e-mail providers, customer connection is still critical.
One primary reason why no large Web site accepts credentials from another company (or, at least, from a company that did not once own it, as in the case of Expedia) is that doing so would dilute the site's relationship with its customers. Imagine, for example, if Yahoo simply received a claim from NetIdentitiesRUs that, yes, it really is customer 923071235309342-2 that just signed in. Yahoo would no longer own the database of its customers. It would not know who the customer actually is. Yahoo could no longer manage the customer and the system the way it wanted. It could no longer do customer research by cross-checking the identity with the databases in its subsidiaries and acquisitions. It could not even sell the information to third parties for extra revenue (should it wish to and had the customers opted in, of course). Clearly, Yahoo would not be particularly interested in giving all that up and, consequently, would need some exceptional incentives to do so.
Now imagine instead that Yahoo is the identity provider. Yahoo can get information on exactly where customer 923071235309342-2 goes on the Web because it knows exactly which sites request claims for that customer. Yahoo could compile invaluable data on that customer's browsing habits, allowing extraordinarily well-targeted advertisements with astonishingly high click-through rates. For an ad-funded organization, and especially one that is actually in the ad business itself, that is a competitive advantage you just cannot give up.
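To see why the provider's seat is so valuable, consider this small Python sketch of a provider that simply counts which relying parties ask about each customer. The class name and site names are hypothetical, and authentication itself is stubbed out, since the point is only the side effect.

```python
from collections import Counter, defaultdict


class ObservingProvider:
    """An identity provider that records who asked about whom."""

    def __init__(self):
        self._visits = defaultdict(Counter)  # customer -> Counter of relying parties

    def authenticate(self, customer_id: str, relying_party: str) -> bool:
        # Real credential checking is out of scope; this sketch only logs the request.
        self._visits[customer_id][relying_party] += 1
        return True

    def profile(self, customer_id: str):
        """Relying parties this customer signed in to, most frequent first."""
        return self._visits[customer_id].most_common()


provider = ObservingProvider()
for site in ["shop.example", "news.example", "shop.example"]:
    provider.authenticate("923071235309342-2", site)

print(provider.profile("923071235309342-2"))
```

The relying parties each see one sign-in; only the provider sees the whole pattern.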
Trusting identities provided by other companies would be equivalent to giving up huge opportunity. Providing your own to others would be jumping on it. In other words, there's a conflict and the result is exactly what we are seeing on the Web today. Within conglomerates, we see one universal identity, and while all of the major conglomerates are also identity providers, virtually nobody relies on them.
This analysis points to a fundamental asymmetry in identity provisioning. There is huge value in being the identity provider and no value at all in being the relying party. Unless the market can figure out ways to equalize the two, we will continue to see as many identity providers as relying parties, with virtually a 1:1 relationship between them.
Check back next month when I will look at the remaining principles in this two-part series.
Jesper M. Johansson is Principal Security Architect for a well-known Fortune 200 company, working on risk-based security vision and security strategy. He is also a contributing editor to TechNet Magazine. His work consists of ensuring security in some of the largest, most distributed systems in the world. He holds a Ph.D. in Management Information Systems, has more than 20 years of experience in security, and is an MVP in Enterprise Security. His latest book is the Windows Server 2008 Security Resource Kit.