by Efraim Berkovich

Graphs for security

news
Sep 13, 200415 mins

Presenting the ACG datastructure

Authorizing access is a vital part of most any system’s security. But, for strong access control, simply asking “Where are you going?” is not enough. Asking “Where did you come from?” is also important. This question is standard practice at airports (one hopes), but usually not in software security.

The authorization or access-control component of the system security infrastructure protects system resources against inappropriate or undesired user access. A security architecture defines what it means by a “resource.” A security architecture also decides what kinds of user entities are granted access. For example, in “role-based authorization,” users are assigned roles or groups, and then the role or group is given access to a resource.

For instance, in the case of directory files, resources are files along with the actions that can be done to the files such as read, write, and delete. Access is controlled by individual user or, more properly, by role. So, File1 can be modified only by administrators, File2 can be read by everyone but modified by administrators and power users, and so on. Most people are familiar with this type of access control.

When controlling access to a directory of files, it does not usually matter which file a user accesses first, second, or last. Rather, the operating system authorizes access based only on the user’s credentials and the file requested. That kind of “flat” structure is typically enforced by an access-control list (ACL). An ACL is a list of resources in the system and information about the types of users that can access each resource.

Unfortunately, most systems apply the same flat datastructure to more complex authorization requirements. Most computer security access-control mechanisms use an access-control list for authorization. Each secured resource has an entry in the list, and usually a combination of datastructures and business logic determines who has access to what resource. However, a list does not really capture the way most applications want to restrict access.

By asking “Where did you come from?” a system gives context to a user’s request. In a computer system’s access control, having a context for a user’s request adds extra protection to the system.

I propose using an access-control graph (ACG) instead of an ACL for access control. A graph does everything an ACL can do, offers additional security, and provides other useful features not available in an ACL design.

Directed graphs

An ACL can be thought of as a matrix, where each resource corresponds to a row and each user (or role) corresponds to a column. A cell in this matrix is marked either TRUE, if the user (or role) is allowed access to the resource, or FALSE, if access for that user (or role) is disallowed.

An ACL is an appropriate security mechanism to use when the resources being secured are not particularly interdependent. One everyday example is a hallway in an office building. Each office is secured by its own locked door, entrance to which is obtained only from the hallway. Even if a single key happens to open multiple doors, having access to one office does not necessarily say anything about having access to another office. This type of flat structure is exactly what an ACL represents.

For comparison, an ACG is like an office building where some offices are reachable only by first going through another room. To gain entry to the second office, a key has to open both the door to the first room and the door to the second room. This concept is the basic idea behind an ACG. The datastructure that represents this idea is the “directed graph.”

A directed graph is a graph datastructure where the vertices between the nodes have a direction. The nodes correspond to secured resources. With security controlled by a directed graph, a user cannot jump arbitrarily to any resource in the structure. Rather, a user must start at a root node and traverse the graph to the desired resource.

For example, suppose you have an application that searches a database of people. It is implicit that you go to a search screen before you are granted access to a search results screen. From the search results screen, you may be able to go to a detail screen. It would be a violation of application security if you could jump to a detail screen directly, bypassing all previous screens (see Figure 1). If you think about it, applications implicitly use a directed graph as the actual model of access control.

Directed-graph security control is implicitly built into most thick-client applications. Most programmers use this implicit graph without awareness of its existence and use an ACL structure for access control. However, in building n-tier (Web-based) applications, an explicit graph makes sense. The nature of Web applications allows hackers to bypass control logic and make arbitrary requests to the Web server. Obviously, this detail is a potential security breach. N-tier applications should not assume, like traditional client-server applications, that the application implicitly controls a user’s available actions and context.

From a software architecture viewpoint, there are benefits to making an access-control graph explicit even for thick-client applications. Separating the authorization of application flow from the UI layer is good for constructing more robust applications. If a system uses a Model-View-Controller (MVC) pattern implementation, the ACG fits in at the controller level and helps separate the view from security considerations (more about this later, when I discuss implementation).

The graph concept provides a way to check the sequential consistency of access rights in the application flow. Sequential consistency means that if a particular node is meant to be reachable by certain users/roles, then an unbroken path exists through the application flow graph to that node. If a user is supposed to be able to go from A to B to C, we can ensure that the graph contains paths for that user between A to B and B to C. Unfortunately, an ACL does not provide an intrinsic way to check for sequential consistency of access rights.

For example, a system security administrator for a database application may grant access to application screens that can be reached only if some other screens are available first. These dependent resources are generally unreachable unless the administrator also grants access to the other preceding screens (see Figure 2). In an ACL, no direct approach ensures sequential consistency of access rights because the application flow information is not stored anywhere in the datastructure. System developers and administrators can look at using some of the many known graph algorithms to analyze application flow and security, but that topic is beyond this article’s scope.

To summarize, ACLs:

  • Do not consider a user’s current context within an application.
  • Do not provide a way to check dependencies among different services and resources; that is, ACLs do not provide a way to check sequential consistency of access rights.
  • May compromise application integrity and/or security in a stateless environment (such as the Web) since a user may request services arbitrarily without regard to application flow.

It is beneficial to use the implied directed graph of application flow as a real method of access control in computer applications, specifically in n-tier applications. The model of access-control graphs can handle anything an ACL can. Furthermore, ACGs:

  • Provide extra security since a user’s current context, in addition to security credentials, determines access rights.
  • Allow verification of application flow. The graph can be traversed to make sure no skipped paths exist for a role. For example, if a user is supposed to be able to go from A to B to C, we can ensure the graph contains paths for that user between A to B and B to C.
  • Lead to good software architecture by helping separate access-control issues from the user interface layer.

The ACG approach is a superset of the ACL approach; a graph can be made to hold a list or many lists. ACG does suffer from a minor performance disadvantage since the algorithm requires one more additional lookup than the ACL approach. ACGs also require more memory than ACLs. But in most cases, performance and memory overhead is not significant.

Implementation example

Building an ACG can be straightforward. In the Java code provided with this article (see Resources to download the accompanying source code), the ACG has a graph where each node corresponds to a particular business method, and each incoming vertex to that node is associated with a list of roles allowed access to that business method. I use this ACG implementation in a sample Web application that follows the Model-View-Controller design pattern. This example demonstrates the ACG idea and is not meant as a real system implementation. However, extending the application of the ACG concept to more robust application frameworks such as Apache Struts is not difficult.

The code for the graph structure is nothing unusual. I have a Graph object that holds a collection of Node objects. Each Node holds a collection of outgoing Vertex objects. The Node corresponds to a security resource, which, in this implementation, is a business method’s string identifier. (Of course, for production systems, programmers can substitute a more general Resource object instead of the string identifier.) Each Vertex holds a collection of associated roles (which, in the example, are strings, but should really be objects of some sort of role class). The Graph object has a method hasAccess() that takes as parameters the source and destination node identifiers as well as the role of the user making the request. I store the graph as XML for easy editing. (Take a look at the sidebar “Using XML Files for ACGs” for more tips on graph persistence.)

Once we have the graph, the next step is to integrate it as an access-control mechanism into an application. In the sample MVC Web application, I use a single controller servlet. This servlet provides a good point to plug in the ACG. In this application, the user’s role is stored in the session context. When a request arrives to the servlet, the request-processing code extracts the request’s source and destination and the user’s role, and checks with the ACG for permission to route the request. If permission is denied, the servlet routes the request to an error page. If permission is granted, the servlet routes the request to the appropriate business method. In the sample, I use a lookup table to route requests (see code snippet below), but you can consider using a more data-driven design and Java reflection, for example.

// This sample is a snippet from the controller.
// The code gets the user's roles from the HttpSession
// and stores the current application location in the HttpSession.
HttpSession session = request.getSession( false );
      
if( session == null ) return handleLoggedOut( request );
String source = (String)session.getAttribute( RoutingMap.SERVICE_SOURCE );
String destination = request.getParameter( RoutingMap.SERVICE_REQUESTED );
String[] roles = (String[])session.getAttribute( RoutingMap.USER_ROLES );
if( ACG.hasAccess( roles, source, destination ) )
{
forwardingURL = handleRegular( request );
session.setAttribute( RoutingMap.SERVICE_SOURCE, destination );
}
else
{
forwardingURL = handleUnauthorized( request );
}

If an existing ACL-based application has a well-defined design pattern implemented, a programmer can normally replace the ACL with an ACG without too much effort.

Web applications

In a Web application, its UI resides on the user’s untrusted machine. The user’s browser controls what the user sees and what the user sends back to the Web server for processing. Because the UI tier is distributed outside the trusted zone, Web application programmers must be aware of the security and control problems that occur. To be safe, the application should assume that all data being sent from the user is potentially corrupt.

The user can subvert application flow by using the Back button, opening links in new windows, clicking buttons or links multiple times, reloading pages, and so on. A user need not be maliciously inclined to wreak havoc in the application. For example, in a poorly designed application, a form can be submitted multiple times accidentally, and, as a result, a credit card may be doubly debited. A casual hacker can potentially do damage or access sensitive data by modifying the URLs that the browser sends to the Web server. More sophisticated hackers may modify the form field data that the browser sends. These are, of course, some of the concerns that lead to considering using ACGs for authorization.

One important question for ACG-based designs is the method for storing and extracting the source node. One simple option (and the one used in the sample app) is to store the source node in the session context; that is, the application stores the identifier of the last requested resource and uses that as the source context for the next request. This approach is simple to implement and keeps the source node information secure from client-side manipulation.

While storing the source identifier in the session context simplifies the access-control process, it includes some disadvantages. First, the programming becomes more complex if the application allows multiple windows to be open at once. In that case, there are multiple valid source nodes, and the application must track and store all of them. Second, the assumption that responses to the requests are always received by the client browser may not be valid. For example, the user may push the Stop button on the Web browser. In that case, the user’s source node stored in the session context may not correspond to the source of the displayed Webpage, and subsequent requests may be not authorized (depending on the access graph’s structure).

Therefore, it makes sense to consider an approach that relies less on session variables. For instance, the program can send the source node as a form field with every request. That way, the source node is associated with the page displayed on the client and no longer associated with a current context for the user’s session. The advantage, clearly, is that the issues discussed above are no longer a problem. Unfortunately, the system has become open to manipulation since the data sent from the client is now trusted. A hacker who learns the nodes’ identifiers and graph’s structure can issue requests that do not reflect the user’s actual application context and thus puts the system into the same security state the system has with just ACL-based authorization. One way to address this issue is to encode and/or encrypt the identifiers of the source and the destination nodes. As a side note, from a security perspective, obfuscating the identifiers of business methods is a good idea regardless of the authorization scheme (For example, consider the following sample of bad security in URLs: https://server.company.com/app/c?ACTION=FETCH_PERSON&ID=12939&).

ACGs are not the only possible solution to out-of-order access in n-tier applications. Some current Web systems avert out-of-order access through a current access token or similar device. In using a current token, each request has an expected identifier; if the client sends the wrong identifier, the request is not authorized. This method is also effective for preventing duplicate form submissions. While effective in stopping certain kinds of inappropriate requests, current-token control does not offer the full protection of ACGs because, if a hacker gets the current token, she can potentially submit a request to an arbitrary business method as in the ACL case. Moreover, given the architectural benefits of ACGs, the current access token method begins to look like a patch on a structural problem.

Conclusion

Since many applications and systems implicitly use a directed graph as the model of access control, I think it makes sense to create a datastructure that implements the model explicitly—an access-control graph. These ACGs offer several advantages over access control implemented via ACLs including: greater security, better software design, and verification of sequential consistency of access in applications.

I implemented the ACG model on two real-world systems at a global financial company. One system is an intranet (with portions to be extranet) workflow and imaging system for a broker-dealer group. The other is an extranet application for financial-plan delivery to the sales force. Both are J2EE systems and follow the Model-View-Controller design pattern.

ACG-based authorization systems do require more upfront planning regarding application flow and may take some getting used to. However, even with these issues, ACGs are a more robust and more secure alternative than ACLs for access control in many computer applications.

Efraim Berkovich has more than 10 years’ experience as a software engineer and architect. Consulting as an architect at one of the world’s leading financial companies, Berkovich implemented the ideas in this article on two Web-based systems: an intranet imaging and workflow application and an extranet financial plan preparation application. Berkovich has a BS in mathematics, an MS in electrical engineering, and is currently pursuing a PhD in economics at the University of Pennsylvania. He can be reached at eqberkovich@yahoo.com.