Thoughts on Identity, Part 2
Jesper M. Johansson
Last month, I embarked on the somewhat daunting task of describing identity systems and, in particular, why we still don’t have an industry standard one. First, I defined what identity really is, and why we care about identities in digital systems. Then I covered the problems with identity—notably, the fact that we all have not just one identity, but many. Then I pointed out that if you don't believe you've got enough, you can always create, or buy, a few more.
As discussed, this creates a problem in authentication because we typically think of identity as an ontological identity—a representation of a real, physical person. In most systems, that relationship turns out to be an unnecessary complication. Most identities don't need to be representative of the real-world entity they identify or even identify any real-world entity at all—in fact, it's often undesirable for them to do so.
Finally, I covered how authentication works—specifically, that the really interesting part happens in the socio-technical system where the consumer of some service provides proof of identity. I noted that a successful digital-identity system must fulfill certain criteria and conform to a set of basic principles. In Part I, I covered the first two such principles: "The identity provider is at least as sensitive as the most sensitive relying party" and "Permit the enterprise to protect its customer relationships and to own and control information that it deems as business confidential." Here, in Part II, I conclude the series by covering additional principles that successful digital identity systems must meet.
Be Platform Agnostic
Because they're in business to make money, companies will cater to their customers. They'll endeavor to reduce the amount of "friction" required for the customer to hand over their hard-earned dollars. Any smart business will avoid imposing requirements that make customers less likely to remain customers, or that shrink the customer base. That means that no mainstream business will use an identity solution that excludes some number of its customers from being able to spend money at the business. Likewise, no rational business is going to implement an identity solution that requires its customers to go to extra lengths, such as installing new software or components, in order to use it.
A similar calculation applies to an identity solution that works for only a subset of customers. Let's say that such a solution costs $5 million to implement. Let's also say that the free cash flow for the business (the proportion of its revenue that is not earmarked for any current development project) is 7 percent of its gross revenues. In this case, the company would need to generate roughly an additional $71.4 million in revenues from the solution just to cover the cost of implementation. That is a significant number, especially if the solution provides only an incremental improvement for a subset of customers. There would have to be a measurable impact on the security of those customers to justify it. Few, if any, identity solutions will meet those requirements.
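The break-even arithmetic above can be checked in a few lines. This is a toy calculation using only the figures given in the text; the function name is mine, not from any accounting library:

```python
# Back-of-the-envelope break-even check for an identity solution,
# using the article's figures: a $5 million implementation cost and
# free cash flow at 7 percent of gross revenues.

def breakeven_revenue(implementation_cost: float, free_cash_flow_rate: float) -> float:
    """Revenue needed so that free cash flow covers the implementation cost."""
    return implementation_cost / free_cash_flow_rate

required = breakeven_revenue(5_000_000, 0.07)
print(f"Additional revenue required: ${required:,.0f}")  # roughly $71.4 million
```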
Work Well with the User's Cognitive Framework
Ultimately, an identity system that people use must work within the human cognitive framework. Surprising as it may seem, human beings didn't evolve to deal with digital identity systems. Nor, it would seem, were they particularly intelligently designed to do so. Many scholars in cognitive science claim that the fundamental wiring of human beings was designed to deal with life in a cave, foraging for food (and mates), and trying to survive attacks by saber-toothed cats. Undoubtedly, mankind has spent a lot longer living in caves than in cubicles and a lot longer communicating with smoke signals than electrons. In cave life, we had little use for digital identities, and, consequently, we didn't evolve with wiring specifically geared to understanding them.
Consequently, our mental model of the world doesn't include managing hundreds of digital identities to safeguard electronic transactions. However, it is perfectly reasonable to assume that people had to use code words while living in caves. Certain words likely had special meaning and got special results. "Please" probably had some kind of forerunner in caveman-speak, just to mention one.
I'm not saying that using code words, or the current equivalent—passwords—is the only reasonable way to authenticate a digital identity. I am saying that if a majority of users is to accept your digital-identity system, it must not require them to change their cognitive model of the world. Cognitive models are largely hard-wired. We can certainly deal with systems that don't fit with cognitive models—but doing so comes at a cost: stress, frustration and anger. Using such a system would have to include a benefit that outweighs the cost of those factors. By contrast, an intuitively obvious system will receive greater levels of adoption even if it provides fewer other benefits.
Guarantee Two-Way Identification
Among the most important aspects of a digital-identity system—and one that, unfortunately, isn't well understood by the users of such systems—is the identification of the relying party and/or identity provider to end users. The relying party—or more commonly, the service provider—is often implicitly trusted by end users. That's why phishing attacks—that is, stealing an identity by posing as a trusted party—are so incredibly successful. It doesn't take much to fool a sufficient number of users into revealing their secrets.
A successful digital-identity system must permit all parties to authenticate themselves in ways that fit the cognitive model of the system user. Unfortunately, this crucial aspect of the system is often ignored by the service providers. In my "Security is About Passwords and Credit Cards" series, I illustrated how a popular credit-card company actively refuses to identify itself to you before you authenticate to it. At this writing, it gives me no pleasure to report that Discover still refuses to show a user a digital identity before requesting the user to authenticate. If you go to the Discover Card Web site, you'll be redirected and asked for your username and password without any opportunity to verify that you aren't sending your credentials to a phishing site. One can only wonder how many Discover accounts have been compromised because cardholders have been conditioned to provide their usernames and passwords to anyone who asks.
On the other hand, it's impossible to argue that SSL has been a successful component of digital-identity systems. The fact that it permits user authentication is virtually unknown, at least to users. The fact that it is primarily designed to prove server identity is ignored by many service providers. Instead, SSL is used as an expensive key-exchange mechanism and as a primary source of revenue to companies that issue the certificates that nobody bothers to inspect. As currently implemented, SSL is failing as a component in digital-identity systems. It doesn't work with users' cognitive models, and while it permits service providers to identify themselves, this identification is presented to end users so poorly that most don't understand how to use it.
A successful end-to-end digital-identity system must make identification of both parties to the transaction an integral part of the authentication workflow. However, a digital-identity system must first and foremost solve the digital-identity problem. Managing digital identities, refreshing digital identities, storing digital identities, transmitting claims proving digital identities, verifying digital identities, and granting appropriate access to digital identities are all problems that are difficult enough on their own. If a digital-identity system can be used to solve other problems as well, that's a bonus—but it shouldn't be among the system's design goals.
A perfect example is phishing. Phishing is a human problem, not a digital-identity one. Phishing attacks human beings, not technologies. Ultimately, the only solution to phishing will be to help people make more intelligent security decisions. An identity system can certainly do that, but not at the expense of the core purpose of such a system which is to help users and service providers identify themselves to each other.
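To make the two-way idea concrete, here is a minimal sketch of one pattern in this family, in the style of the "site phrase" schemes some financial sites have used: the site proves it knows something only the real site would know before the user types a password. All names here are hypothetical, and a real system would store password hashes, not plaintext:

```python
# Sketch of a two-way identification flow. The site identifies itself
# first (step 1) by echoing a phrase the user chose at enrollment;
# only then does the user authenticate (step 2).
# Toy in-memory store -- a real system would hash the password.

users = {}  # username -> {"site_phrase": ..., "password": ...}

def enroll(username, site_phrase, password):
    users[username] = {"site_phrase": site_phrase, "password": password}

def login_step1(username):
    """The site proves its identity by showing the user's chosen phrase."""
    record = users.get(username)
    return record["site_phrase"] if record else None

def login_step2(username, password, phrase_shown):
    """The user authenticates only after confirming the phrase was correct."""
    record = users.get(username)
    return (record is not None
            and phrase_shown == record["site_phrase"]
            and password == record["password"])

enroll("deepdiver13", "purple walrus", "s3cret")
phrase = login_step1("deepdiver13")  # the user checks this before typing anything
assert login_step2("deepdiver13", "s3cret", phrase)
```

The design point is simply ordering: the service identifies itself before the user does, which fits the cognitive model of checking whom you're talking to before handing over a secret.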
Permit the User to Provide Claims with an Assurance Level Appropriate to the Service Provided
Most users use many different information services. Some services provide highly sensitive information, such as bank-accounts or retirement-savings accounts. Some provide information that may be sensitive in some cases, such as e-mail. Others provide information of value to users' reputation, such as social-networking sites. Still others provide information that isn't at all sensitive, such as a software supplier's technical-support site.
Yet, despite the different levels of sensitivity involved with these services, many try to use the same identity. For example, the same identity I use to access my e-mail is requested every time I try to get product information from my software supplier; if I chose to do so, I could manage my banking information with the same identity. I believe that my e-mail has significant value; my banking information definitely does. Software product information? Not so much. Because I could easily get a salesperson to print it out and even drive 30 miles to deliver it all, I consider that information to have virtually no value at all.
This type of overloaded use of credentials is endemic to identity systems. It is called "single sign-on." In some ways, single sign-on is an appealing concept. In an enterprise environment, it has great business value and, if it isn't already available, it should be added to the agenda right now. Outside the enterprise, where we deal with information of very disparate value, single sign-on is dangerous. If you walked up to your newsstand and the salesperson asked you for two forms of identification before you were allowed to buy your morning paper, you'd surely object, but the same request made before you can withdraw $2,000 from your bank account or drive away in a new car wouldn't raise your eyebrows.
The same separation of claims must be supported by a successful digital-identity system. Users should be able to present a set of credentials appropriate to the level of risk represented by the data or services they are requesting.
Single sign-on is inappropriate in broadly used digital-identity systems. It's certainly acceptable to use single sign-on to access the system itself, but the actual credentials presented to the relying party—the service provider—must be commensurate with the service the user is receiving. For that reason, it's highly likely that, whatever form a successful digital-identity system takes, it will support a two-layer identity system: one layer to sign into the digital-identity system itself and another to actually identify the user to the relying party. A system that provides this facility nicely is Microsoft's InfoCard system.
Better still, a good digital-identity system should make it easy for the user to submit a set of claims defined for a service to that service, but warn the user when submitting them to another service. In other words, the digital-identity system should help users identify the service providers they are attempting to access.
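The two-layer model can be sketched in a few lines, under assumed names: one credential signs the user into the identity system itself, which then derives a separate, per-relying-party token so that no single credential is shared across services. This is an illustration of the layering idea, not how any particular product such as InfoCard actually works:

```python
# Two-layer identity sketch. Layer one authenticates the user to the
# identity system; layer two derives a credential unique to each
# user/relying-party pair, so the token shown to the bank differs
# from the one shown to the forum.

import hashlib
import hmac
import secrets

MASTER_KEY = secrets.token_bytes(32)  # held by the identity system only

def sign_in(presented_password: str, stored_password: str) -> bool:
    """Layer one: authenticate the user to the identity system itself."""
    return hmac.compare_digest(presented_password, stored_password)

def token_for(user_id: str, relying_party: str) -> str:
    """Layer two: derive a credential scoped to one relying party."""
    msg = f"{user_id}:{relying_party}".encode()
    return hmac.new(MASTER_KEY, msg, hashlib.sha256).hexdigest()

# Compromising the forum's token reveals nothing about the bank's.
assert token_for("alice", "bank.example") != token_for("alice", "forum.example")
```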
Focus on Consistency of the Claims as Opposed to a Canonical Identity
A successful digital-identity system must respect the individual's right to privacy. While the exact expectation of privacy differs greatly between cultures, the overall human desire for privacy, however it is defined, is indisputable. You might even call privacy an innate human need. It can easily be considered a safety need, under Abraham Maslow's hierarchy of needs shown in Figure 1. Maslow, writing in pre-Internet times, may simply have excluded it because privacy was not as much of an issue in 1943.
Figure 1 Maslow’s Hierarchy of Needs
If we accept privacy as a human need, or at least a desire, we can discuss that need in the context of a digital-identity system. Consider, for example, a system such as a discussion board about your favorite hobby. Most such discussion boards require authentication; in other words, they implement a digital-identity system. What identity do you use for that board? Do you use your real name or do you go by a moniker such as "Deep Diver 13"? Most of us would probably use some kind of nickname. Requiring us to register a valid phone number and home address for such a purpose would probably be considered a violation of privacy. Requiring us to use a digital identity that maps to our physical person, and which also maps to our health records, is almost certainly going to be considered a violation of privacy.
I've often said that, for most purposes, digital-identity systems need only be concerned with whether a user can lie consistently. A user doesn't have to bind his or her digital identity in a given system to a physical identity or even to a digital identity used in another system. Users should be able to conceal their true identities and hide links to other systems, whether at the same sensitivity level or other ones, by using different digital identities.
The system, in essence, must focus on the consistency of the claims presented by the user, rather than on binding those claims to a canonical, typically ontological, identity. As long as the same set of claims is presented, the user should be accepted, and for most purposes, that is all the system requires. Take a retail Web site, for instance. The retailer really doesn't need to care about who a user actually is, unless the laws governing the transaction require it. The retailer needs to care only about whether users can present the same lies today as when they set up their accounts, and whether their methods of payment still work. In most cases, that is sufficient to make a successful transaction. Many identity systems overdo the "identity" part and attempt to tie identity to a person, as opposed to a user.
Perhaps even more important than providing the user the ability to protect his or her ontological identity is the ability to make it easy to manufacture identities. A digital-identity system is, after all, just software. It could very easily help users manage identities to maximize user privacy. A digital-identity system that implements this ability stands a much better chance of success than one that ignores it.
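The consistency-of-claims idea can be sketched in a few lines: the service stores only a digest of whatever the user asserted at sign-up and later checks that the same assertions, true or not, are presented again. Names and structure here are illustrative, not a production scheme:

```python
# Consistency check for claims: the site never verifies the claims
# against the real world; it only verifies that the same claims are
# presented each time.

import hashlib
import json

def claims_digest(claims: dict) -> str:
    """Digest of a claim set, canonicalized so key order doesn't matter."""
    canonical = json.dumps(claims, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

registered = claims_digest({"name": "Deep Diver 13", "city": "Atlantis"})

# The same "lies" presented later are accepted, regardless of truth.
assert claims_digest({"city": "Atlantis", "name": "Deep Diver 13"}) == registered
# Inconsistent claims are rejected.
assert claims_digest({"name": "Deep Diver 13", "city": "Metropolis"}) != registered
```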
Don't Mix Trust Levels
One interesting feature of digital-identity systems is that they often require the use of an identity provider that differs from the service provider. If the trust the user places in the service provider doesn't match the level of trust placed in the identity provider, there's obvious discord. If the user trusts the service provider, but not the identity provider that the service provider uses, the user may be reluctant to use the service provider for that very reason. An interesting example where this might occur is Expedia.com. Formerly a Microsoft subsidiary, Expedia uses Microsoft Passport, now Windows Live ID, for authentication, as shown in Figure 2.
Figure 2 Expedia supports sign-in with Windows Live ID.
Now let's say that a user trusts Expedia, but not Microsoft. If Expedia required use of Windows Live ID (which it doesn't) would that user be willing to use Expedia at all? Maybe. Maybe not. Whichever the case may be, the user is able to use an Expedia identity on Expedia as opposed to a Windows Live ID one, and some users may prefer that.
Of course, the same trust issue goes the other way. If a user trusts Microsoft but not Expedia, the user may be unwilling to use Expedia if it requires use of Windows Live ID. Perhaps the user has used Windows Live ID for highly sensitive purposes, such as protecting bank-account information through MSN Money. In that case, some users may be reluctant to use the same Windows Live ID that protects their bank-account information to book travel reservations.
This property of a system is very much in line with the previous item about requiring claims in accordance with the sensitivity of the information that they protect. In a nutshell, the user must trust certain parties in any transaction, but the extent to which those parties are trusted may differ. Mixing trust levels is likely to provide users with reasons to distrust the system.
Don't Violate Laws or Regulations—or Expectations
Few issues in information-security management today are as vexing as legal and regulatory compliance. While many security professionals probably don't want to think of lawyers as their best friends, attorneys are becoming absolutely necessary to the cause. Privacy is a matter of law and regulation, and those laws and regulations differ across jurisdictions.
This raises an interesting issue. Let's say that a particular digital-identity system is actively recruiting relying parties. The identity claims get picked up by a relying party that provides information that isn't very sensitive, such as access to a discussion forum. The relying party provides a level of security for the identity claims commensurate with the sensitivity of the information they serve—in other words, very little. Now let's say that the identity provider signs a contract with a credit-reporting agency, and consequently modifies the claims packet to include a national ID number or some representation of it. This new claim is suddenly being transmitted to the discussion board. The discussion board's security protocols don't provide the security level required for such information, so it may violate various laws and regulations.
These are typically simple issues to work around, and if the remaining principles are followed, it's unlikely that this one will be violated, but it's still an important consideration. Likewise, digital-identity systems that are used across national borders must be sensitive to different rules. Asking for a particular piece of identifying information may be perfectly legitimate in one jurisdiction; asking for the same information somewhere else may be illegal or subject to regulation.
For example, asking for age as part of a digital-identity system makes the identity provider subject to the Children's Online Privacy Protection Act in the United States and similar laws in other jurisdictions. If an identity provider makes that information available to any relying party that doesn't request it, the compliance requirement is transferred to that relying party even if it didn't want the information. Obviously, it's critical that a digital-identity system is capable of respecting those types of laws and regulations and not put anyone in violation.
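The safeguard implied by this scenario can be sketched simply: the identity provider releases only the claims a relying party explicitly requested, so adding a sensitive claim (an age, a national ID number) for one partner never leaks it to the others. The helper below is hypothetical:

```python
# Claims minimization: a relying party receives only the subset of
# the user's claims that it explicitly asked for.

def release_claims(all_claims: dict, requested: set) -> dict:
    """Return only the requested subset of the user's claims."""
    return {k: v for k, v in all_claims.items() if k in requested}

claims = {"nickname": "DeepDiver13", "age": 34, "national_id": "123-45-6789"}

# The discussion forum asked only for a nickname; it never sees the
# national ID claim added later for the credit-reporting agency.
assert release_claims(claims, {"nickname"}) == {"nickname": "DeepDiver13"}
```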
Permit All User-facing Attributes to Be Modified or Deleted
Finally, a digital-identity system must put users in control of their own data. This is perhaps the most controversial of all the principles. Identity providers often view the information they collect as business data. Users, by contrast, consider their names, addresses and other information their personal property to use as they see fit, which might include revoking someone's right to have it.
Clearly, current practices already reject users' rights to completely control access to their information; anyone who has attempted to get one of the three U.S. credit-reporting agencies to delete personal information can attest to this. Even if that information is documented as incorrect, the credit-reporting agencies are usually happy to keep selling it. Accuracy and approval by the subject of the data is irrelevant. Discussing the ethics of a system where someone else is permitted to profit from your identifying information without your consent is beyond the scope of this article, but for a system to be widely accepted by users, a digital-identity system must not put users' information beyond their control.
This means that all information must be modifiable, including aspects such as usernames. A very common type of username is an e-mail address. Using an e-mail address as an identifier may violate some of the separation principles discussed earlier, but it is attractive because an e-mail address is guaranteed to be unique. However, it is also mutable, and a previously used e-mail address is often granted to another person. That means that the identity system must not only permit a user to modify his or her user ID if it's an e-mail address, but also that the system must deal with problems that occur when new users pick up old addresses. What would happen if a new user tried the forgotten-password feature and received access to another user's account information? That's probably not optimal. Decommissioning an e-mail address is a perfectly valid use case. The digital-identity system needs to deal with this contingency as well as with other changes in information, such as names.
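One way to meet this requirement, sketched under assumed names: key each account on an internal, immutable identifier, treat the e-mail address as a changeable attribute, and refuse to send a password reset to an address the system no longer recognizes:

```python
# Mutable usernames over an immutable internal key. Changing the
# e-mail address decommissions the old one, so a stranger who later
# acquires it cannot trigger a reset against the original account.

import uuid

accounts = {}     # internal_id -> account record
email_index = {}  # currently valid e-mail -> internal_id

def create_account(email: str) -> str:
    internal_id = str(uuid.uuid4())  # never changes, never shown to the user
    accounts[internal_id] = {"email": email}
    email_index[email] = internal_id
    return internal_id

def change_email(internal_id: str, new_email: str) -> None:
    old = accounts[internal_id]["email"]
    del email_index[old]  # decommission the old address
    accounts[internal_id]["email"] = new_email
    email_index[new_email] = internal_id

def can_send_reset(email: str) -> bool:
    """A reset goes out only if the address still maps to a live account."""
    return email in email_index

uid = create_account("old@example.com")
change_email(uid, "new@example.com")

assert not can_send_reset("old@example.com")
assert can_send_reset("new@example.com")
```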
Clearly, digital-identity systems are quite complicated. Not only are they complicated to build, as we already knew, but the principles they need to comply with to be broadly successful are complex as well. We can draw two conclusions from this discussion. First, digital identity systems must be simplified. We've spent several years trying to design systems that work around users, not through users. Digital-identity systems must provide users with control of their information without making undue requests of them. At the extreme end, they must support users who wish to be anonymous. Likewise, they must not require businesses to expend more on implementing the system than the system is worth.
Second, digital-identity systems must provide identification services that are appropriate to the way they are used. They must provide for multiple levels of claims to support the assurance level required for the systems that those claims are being used to authenticate. In other words, on the Internet as a whole, single sign-on is probably going to be appropriate only to the system used to manage identities—it won't be the identity itself.
Whether we will ever see a single, very successful system that meets all these principles remains to be seen. Most people agree that we are not there yet.
Jesper M. Johansson is Principal Security Architect for a well-known Fortune 200 company, working on risk-based security vision and security strategy. He is also a contributing editor to TechNet Magazine. His work consists of ensuring security in some of the largest, most distributed systems in the world. He holds a Ph.D. in Management Information Systems, has more than 20 years of experience in security and is a Microsoft Most Valuable Professional in Enterprise Security. His latest book is the Windows Server 2008 Security Resource Kit (Microsoft Press, 2008).