Agile best practices in Waterfall-based Java enterprise development

Some have said that 2007 was the year that Agile arrived, with agile development best practices such as automated builds, test automation, and continuous integration finally beginning to reach critical mass among Java developers. While this may be true in theory, most Java enterprise projects are still based on more traditional development methods, such as the Waterfall model. In this article, ShriKant Vashishtha explores some of the "pain points" of traditional Java EE development, from both a developer's and a build manager's perspective. He then shows how certain agile best practices can easily and naturally resolve these problems — without altering the flow of a Waterfall project.

Java EE best practices

This article builds on my discussion in the JavaWorld article "J2EE project execution: Some best practices." That discussion focused on improving the productivity of Waterfall-based projects using Java EE development best practices such as a reference implementation, an effective developer's handbook, and automated code reviews. In this article I focus on the benefits of incorporating agile best practices like automated builds, continuous integration, and test automation into Waterfall-based Java EE projects.

The buzz around agile development methodology is undeniable and growing, but actually transitioning a traditional shop to an agile one is a huge undertaking. The scope, timeline, and deliverables associated with traditional software development do not map easily to the agile process, and some must be thrown out entirely. Managers, developers, and even clients must commit to the less predictable rhythm and greater personal demands of agile development. Change has to be managed across the entire enterprise, and results must be evaluated against a less familiar set of criteria.
Geographically distributed teams are especially challenged to stay on target while implementing agile processes, which favor face-to-face communication. Is it any wonder that agile development appears to be more popular in theory than in practice?

According to a Forrester Research study, only 26% of North American and European enterprises were using agile practices in 2007. Despite rumors of its demise, the Waterfall model (where each stage of the software creation lifecycle must be complete before the next one begins) is still the status quo in Java enterprise development. For some Java developers this is a frustrating state of affairs: we want to improve the process of development and the quality of our products, and we believe that Agile is the way to do it; but the Waterfall model stands in the way. Or does it?

While some of us would like to simply trade in traditional models for agile ones, a gradual transition is more practical. Even if your company or development group is tied to the Waterfall model, it is possible to incorporate agile best practices to improve productivity and efficiency. In fact, many Waterfall-based projects already support agile best practices, and could support even more.

In this article I present a scenario that illustrates common issues in traditional development projects, from both a developer's and a build manager's perspective. I then show where agile best practices such as automated builds, test automation, and continuous integration can help overcome some of these problems, even in a Waterfall-based context. I also introduce a number of tools that can be used to test, measure, and improve code quality in test automation and CI environments.

The evolution of the Waterfall model

It is tempting to start a discussion of this sort by enumerating the weaknesses of the Waterfall model and then showing how they can be overcome through agile practices.
In fact, the Waterfall model has evolved to incorporate practices very similar to ones used in agile development. Consider these:

Build frequency: Gone are the days when most teams could wait until the final stage of development to integrate all of the pieces of an application. In today's world, it is common to develop, unit test, and integrate software in a short timespan. I have seen many Waterfall-based projects where builds happen two to three times per day.

Iterative development (small releases) and frequent feedback: In traditional Waterfall-style projects, software is delivered for system testing or acceptance testing after it has been developed. Few large or mission-critical projects can afford to work that way today. Customers often expect to monitor the quality of constructed software through continuous code reviews, encouraging the use of multiple iterations in shorter development cycles. The delivered functionality is prioritized based on the customer's requirements and development feasibility, and the customer provides feedback on each delivered iteration. Feedback has a direct impact on the next iteration, and there are fewer (if any) surprises for the customer when the final product is received.

Change control: It has been argued that the basic disadvantage of the Waterfall model is that the requirements phase is closed before development begins. This process allows no room for changes later in the development cycle, even though the need for changes almost inevitably becomes apparent in these later stages. Many Waterfall-based IT shops have adopted a change control process based on how critical the change is, as well as the effort required to implement the change, as illustrated in Figure 1.

Figure 1. A typical change control process

An effective change control process is defined and documented to ensure that changes are effectively managed and controlled.
(A change, in this context, can mean a revision to the program scope, deliverables, milestones or levels of services, new clarity configurations, or enhancements that affect the cost, schedule, resources, quality or conformance of the services to the agreed specifications.)

All of these typically agile best practices complement Waterfall-based development and make it more accessible and responsive to the needs of the client, while also breaking down the rigid separation of cycles that characterizes the Waterfall model. Traditional Java EE projects could benefit from incorporating a few more best practices from the agile development world, as the next section reveals.

The project that ate my life

A project scenario is perhaps the easiest way to explore the weaknesses of the Waterfall model. Consider a typical Java enterprise project that involves a team of 150 to 200 people. The project has been divided into functional modules, which in themselves may look like multiple smaller projects. In total, the project consists of 10 to 15 EAR (enterprise archive) and WAR (Web archive) files. To test the entire application, it is necessary to set up all the EARs and WARs. Subversion (SVN) is the source code repository and version control system for the project. Now let's consider the day-to-day activities of a developer and a build manager, to find out what kind of challenges they face.

A developer's perspective

The first step in setting up my workspace is to check out the projects from SVN. But whenever I take the latest build from SVN, my workspace goes haywire. As it turns out, some of the files checked in by my colleagues do not compile. I suspect some files are committed to SVN with unresolved compilation issues. I also get into trouble when the method signatures of some internal-library classes change all of a sudden. Nobody informed me! It's worse when the client library provided by an application changes but the changes aren't committed in SVN.
I get runtime errors when my code interacts with the application. Worse, even if the library is committed in one project, it might have a different version in another.

Hmm, that's a lot of trouble, isn't it? As a humble developer, I just want to work on my part of the software and not bother about what is wrong with other components. Unfortunately, my part of the application is dependent on others. Also, I am just starting out as a Java developer and am not necessarily aware of all the nuances of application setup, classpaths, libraries, and so on. Believe it or not, sometimes it takes me nearly all day to resolve setup issues. I am left with very little time to devote to the project itself, which causes me a lot of stress.

I have another issue that is a kind of living hell for me: the functionality of my module comes last, in terms of the entire development project. I want to test my module, but I need the data provided by all the other functional modules in order to set up the data for mine. Nine application steps come before my own. If even one of them breaks, it stands in the way of me reaching my last step. It can take six or seven hours for me to fix those other issues before I am able to test my own business case!

You might say "Aw, poor fellow," but I'm afraid that's what I am. Instead of concentrating on developing and fixing the functionality I own, I spend a lot of time on issues I shouldn't be bothered with. Worse, everybody in the project is busy with his or her own work. Sometimes fixing all the issues I've mentioned takes a lot of coordination, and that means a lot of time.

What I need

My needs are pretty simple: whenever I check out the latest copy of my project from SVN, I should be able to set up the project on a virgin machine (an operating system, JRE, and SVN client) and fully build the system. For this basic need to be met, the code available in SVN should be compilable at all times.
It should have the latest libraries available, and there should be very few manual steps to build the whole system. I also should be able to test my functionality, even in isolation. I just want to be able to build and test my functionality before integrating it into the bigger environment.

A build manager's perspective

If you take a look at the problems mentioned above, my situation is no different. Right now, each application component (EAR/WAR) in this big project is individually built. Each component has a Java project, which in turn has a folder containing its various dependencies. If one client JAR of a component changes, it's up to me to make sure the lib folder of every other component that uses the JAR is updated. That's a manual task. Sometimes my team members fail to update that JAR in one lib folder and everything goes downhill fast. The result is runtime errors that are hard to debug in such a complex system.

Making a change in one JAR and updating it in all the project lib folders is a pain. Each application component has its own set of libraries and build scripts, so it's time-consuming and requires continuous manual intervention to build the entire system. (Sighs.) Oh, if everything goes well it only takes around 30 to 45 minutes to execute all build scripts and re-deploy the application on the server. But if a compilation problem comes up (which usually happens), I have to chase people around and get it fixed before I can deploy it on the server. This makes a lot of people unhappy. Developers are not able to test the application in a holistic environment. The testing team doesn't get the application in time for testing, which wastes their time and the company's money. People sit idle, the testing manager and project manager start complaining, and then I am in trouble again. I need an easier way to tackle these problems!

Hey, hold on — I haven't finished yet!
It is also my job to take care of the project's multiple build environments: one for development, another for testing, and yet another for performance testing. Having three build environments/machines can mean three constantly running application versions. For each build environment, I need to make changes in the application configuration. And — you guessed it right — I need to make changes in the configuration properties file for each deployment before building the system. Now, imagine manually changing the configuration properties of a system that contains 15 EARs, just to make a build. That's a mechanical job? Ha! I simply hate it, and my team members hate it even more.

What I need

I should be able to execute the system build with a single click. It should require minimal-to-no manual intervention. I also need a way to track whether the code in the various modules is compilable and ready to be built and deployed. Because we need to create builds for different environments on a regular basis, I need a mechanism to automatically configure properties for new build environments. I also need a way to update JAR files across all projects. Essentially, I need to be able to make changes in one place and have the changes propagate to all concerned projects.

Common needs

The needs of the developer and build manager in the above scenario are probably familiar to you. It is easy to see that a developer working on a specific piece of a project should not have to bother about the project's larger technical infrastructure. Likewise, anyone can understand the build manager's need to minimize manual effort and focus on other important tasks in such a big project.
I summarize the developer's and build manager's needs for this project as follows:

- It should be possible to check out a project, make minimal configuration changes, and build it on a virgin machine.
- Code checked into the SVN repository should be compilable and ready to build at all times.
- The latest libraries should be available in all Java projects at all times.
- It should be possible to test individual code modules without dependencies on other projects.
- It is important to track whether all project environments are ready to build, and to keep that information where everyone can see it at any time.
- It should be possible to make configuration changes in one place to create a new build for any environment.
- It should be possible to update internal JARs in one place, so that every individual component picks up the change, instead of making changes in the lib directory of each and every component.

In the next sections we'll see how certain best practices in agile development meet these needs, even within a Waterfall-based project.

Automated builds

The first requirement says that you should be able to start up a virgin machine, do a checkout, and fully build the system without a hitch. Automated builds, a staple of agile development, resolve this requirement. Many developers use IDEs that include some kind of build management process. These are helpful for faster development, but you also need a script to build the project independently. Ant and Maven are two tools commonly used for build automation in Java-based projects. Gant is also gaining ground with developers who favor Groovy. In any case, build scripts are not proprietary and can be executed on any platform.

A successful build automation process for the example Java EE project would need to achieve the following:

- The application code available in the SVN repository should have all default configuration steps implemented that otherwise would have to be performed by the developer.
- It should also contain a build script that updates the configuration properties file for each working environment (local, development, test, production, etc.) instead of requiring them to be updated manually.
- A master build script should be created that contains the targets of all application components. This script should make it possible to build the entire project with a single click.

Code quality measurement

Build automation isn't limited to just compiling, building, and deploying the application. Many other important tasks can be included as part of an automated build. For instance, you can execute xUnit test cases, or get the HTML output of static code-checker tools like CheckStyle or PMD, which can be sent to various project stakeholders. You can also integrate code coverage tools such as EMMA or Cobertura with your automated test cases.

It is a good idea to test the efficacy of your automated build setup. For this, a fresh developer should review all of the above-mentioned implementation steps to validate the ability to set up, build, and deploy the entire application with minimal effort. This is an iterative process; if the developer gets stuck somewhere, the process should be revised and updated.

Every Java project should also contain a readme.txt in the root project directory that describes the manual steps (steps that cannot be performed by a build script) to be performed by the developer. For instance, certain projects may need changes in the JRE security settings for encryption implementations, or for JAAS configurations. The readme file should also list the steps necessary to build and deploy the application on the server.

Build setup: The master build script

Given that this project consists of numerous EAR and WAR files, it's a pain to build the entire application by executing the individual Ant build scripts for each application component.
Manual builds require the person responsible (in this case the build manager) to remember the order in which he or she has to execute all the build targets. The solution to this kind of problem, as mentioned above, is to have a master build script for the project. Note that it is still good to have individual build targets in the root directory of each component (or module) of the Java project. The master build script simply orchestrates all the individual build targets. Listing 1 is an example of a master build script.

Listing 1. A master build script using Ant

    <target name="build-ABCSecurityServices"
            depends="call-ABCSecurityServices-build,export-abc-securityservices.jar">
        <copy todir="${artifact}/${env}">
            <fileset dir="${SecurityServices}/dist">
                <include name="*.ear"/>
            </fileset>
        </copy>
    </target>

    <target name="call-ABCSecurityServices-build">
        <ant antfile="${Security}/ABCSecurityServices/build.xml"
             inheritAll="false" target="all"/>
    </target>

    <target name="export-abc-securityservices.jar">
        <copy file="${abc-securityservices.jar}"
              tofile="${project.internallib.dir}/abc-securityservices.jar"/>
    </target>

The Ant master build script in Listing 1 calls the call-ABCSecurityServices-build target available in the build.xml of the ABCSecurityServices Java project. So, instead of containing individual build targets, the master build script just delegates the work and orchestrates multiple tasks. It also decides the order in which tasks are executed. All artifacts such as EAR and WAR files are copied to a single location, from which they can be easily deployed.

Given that the build manager needs to make builds for multiple environments, it also would be a good idea to externalize the factors that change from environment to environment. Most of the time, these are application configuration files. Instead of being included inside EAR/WAR files, they can be externalized and placed on the application server classpath. In this way, EAR/WAR files will not change from environment to environment.
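One way to sketch this externalization with Ant is shown below. The target, property, and file names here are illustrative assumptions, not taken from the project:

```xml
<!-- Illustrative Ant target: select the property file for the chosen
     environment and place it on the server classpath, instead of
     repackaging it inside each EAR/WAR.
     Invoked as: ant -Denv=test configure-env -->
<target name="configure-env">
    <!-- Fail fast if no environment was chosen on the command line. -->
    <fail unless="env" message="Set -Denv=local, dev, test, or prod"/>
    <!-- For env=test, config/application-test.properties becomes the
         live application.properties on the server classpath. -->
    <copy file="config/application-${env}.properties"
          tofile="${server.classpath.dir}/application.properties"
          overwrite="true"/>
</target>
```

With something like this in place, only the small properties files differ per environment; the EAR/WAR artifacts themselves stay identical across the development, test, and performance machines.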
The build script will just change these configuration properties based on the environment. Naturally, this task also should be automated. The master build script makes the build manager's life a lot easier by realizing the goal of making the build with a single click.

Why not Maven?

It could be argued that Maven is a better build tool than Ant for a development scenario involving multiple Java projects. Agreed. Practically speaking, however, a good percentage of Java EE projects still use Ant for various reasons, including ease of use, legacy code, and Maven's dependence on an online repository. I chose to focus on Ant because it is a likely solution for many Java enterprise shops.

Test automation

In many Waterfall-based projects the developer writes a software component and then performs a set of unit tests manually. If the developer later needs to modify the same component, he or she must perform the entire manual unit testing process again. Often, when the changes are small, developers just test the impacted functionality. This seems reasonable given time constraints, or the assumption that the change will not affect other parts of the application. Unfortunately, even small changes can break related functionality, and the breakage won't be discovered until the entire application is tested.

JUnit is an answer to such problems. In JUnit, a separate file contains all the test cases for a given component, so they can be run at the click of a button. JUnit makes it easy to test software components in isolation and to run numerous tests easily.
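To make the payoff concrete, here is a minimal sketch of an automated unit test. The DiscountCalculator component and its pricing rule are hypothetical, invented for illustration; the checks are written as plain Java methods in a main method so the sketch runs standalone, but under JUnit each check would become its own test method using assertEquals.

```java
// Hypothetical component under test, with a simple automated check suite.
public class DiscountCalculatorTest {

    // The component itself (in a real project this would live in its
    // own source file, not inside the test class).
    static class DiscountCalculator {
        // Orders of 100 units or more get a 10% discount.
        double priceFor(int units, double unitPrice) {
            double total = units * unitPrice;
            return units >= 100 ? total * 0.90 : total;
        }
    }

    // Under JUnit, assertEquals(expected, actual, delta) replaces this.
    static void check(double expected, double actual, String message) {
        if (Math.abs(expected - actual) > 0.0001) {
            throw new AssertionError(message + ": expected " + expected
                    + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        DiscountCalculator calc = new DiscountCalculator();
        check(20.0, calc.priceFor(10, 2.0), "small order is not discounted");
        check(180.0, calc.priceFor(100, 2.0), "bulk order gets 10% off");
        System.out.println("All tests passed");
    }
}
```

Once such checks exist, rerunning them after every change costs seconds rather than a manual test session, which is the whole argument for automation.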
JUnit is also a valuable component in test-driven development, where developers actually write test cases to test the impact of a desired change throughout the system before doing any coding. Some developers argue against test automation, saying "Why should I waste time writing test cases when I can make the change and test the functionality manually in 20 minutes?" While this is a convincing argument on the surface, it falls apart when you consider today's accelerated development cycle, where the same manual process might need to happen 10 times in a row. Manual testing also doesn't leave much documentation for new developers; and then there is the human tendency to err.

Many ways to test

There are many ways to expand on the basic approach of test automation, and at least a handful of very good tools for specific types of scenarios. DBUnit, for instance, is a unit testing framework built on top of JUnit that is used to put your database into a known state between test runs. DBUnit is helpful in cases where the side effects of component execution are changes in the database, or where you need to ensure a certain state in the database before you can execute test cases for a given component. Combining DBUnit and JUnit is also useful for testing components that perform extensive database-driven operations.

EasyMock is an open source utility that can be used to generate mock objects or mock interfaces so that you don't have to wait around for third-party code. Using EasyMock also means you don't waste time writing mock objects when you could be testing your code.

DBUnit and EasyMock are two tools that provide solutions for individual component testing. But what about the problem mentioned by the developer in the original scenario, whose module is the tenth step in the overall application design?
How can that developer ensure that his module is going to work with all the other modules? The key to this problem is creating end-to-end test cases for the various modules, so that modules are combined and tested as a group. Integration testing can be done using any of the above-mentioned tools or frameworks in combination. In some cases you may require a technical infrastructure for integration testing. One option in this case is to use the Spring testing module without using the Spring framework as a whole. Selenium is another excellent and easy-to-learn testing tool used to test Web applications. Selenium tests run directly in a browser and mimic the user's interaction with the system and UI.

Another open source tool that can round out your test automation suite is Cobertura. Rather than just counting the number of test cases at the end of your development cycle, Cobertura lets you measure code coverage, or how much of your code, line by line, has actually been tested. Cobertura reports the percentage of code tested and not tested for each file in your application.

Now we get to the best part of all the above tools and frameworks: using them in a continuous integration environment.

Continuous integration

Uncompilable code was one of the problems mentioned by both the developer and the manager in our Java EE project scenario. The developer often has to deal with broken code that has been uploaded to the SVN repository, and the manager also spends valuable time resolving compilation issues. As a first observation, this development team needs to adopt a basic discipline: nobody commits code to SVN if it doesn't compile. Beyond that, the basic problem is that to err is human.

Who ensures that the build is ready to compile at all times? In the scenario described, it is a manual process in which the build team checks out the code from the source repository, makes the build, and often discovers that the build doesn't compile.
Something is wrong with this process, don't you think? Build time is too late to report an error to a developer and then let him correct it! These common problems are compounded by leaving integration as the very last step in a software development cycle. Instead, why not integrate the project multiple times per day? In a continuous integration environment each integration is verified by an automated build, which includes executing test cases, so errors are detected as soon as possible. Continuous integration, or CI, involves communication, discipline, and automation of trivial tasks. As such, it offers some good solutions to the issues raised by the developer and build manager.

CI servers

In a large project, manual checks are likely not sufficient to curb all of the problems raised by the developer and build manager. Getting them fixed can itself be a big task. Wouldn't it be better to automate this process? Here is where a continuous integration server like CruiseControl comes in handy. CruiseControl performs many trivial tasks automatically. Consider these:

- Whenever someone commits changes to the source repository, CruiseControl checks out those source files and builds the project. It always allows some kind of waiting period, so that if a person is committing multiple sources, it doesn't start compiling partway through. This time lag is configurable.
- Whenever a build breaks, CruiseControl checks the user ID of the person who last committed changes to the source repository and sends a snapshot of the compilation issue through email. You can configure the email IDs of stakeholders for each component. The team should take immediate action to fix the build issue based on the information provided.
- CruiseControl provides a Web-based user interface where stakeholders can view the state of the build at all times.
- If you click on the link of a failed build, the interface shows all the problems from the last build.
- If a build fails, CruiseControl doesn't execute the build again until someone commits changes to the repository.
- Various artifacts like EARs, WARs, JUnit reports, PMD reports, and more can be downloaded or viewed from the Web-based user interface.

In small projects where there is no interdependence between components, it's fine to build whenever someone commits changes to the repository. For projects with tight interdependence between components, however, this could be a problem. In that case, CruiseControl can be set to compile individual components at the time of the last commit, but build the entire project only periodically. (Building the entire project too frequently can lead to a situation where people are constantly buried in email; after a while it becomes tempting to ignore email from the build team.)

Ready to go?

Even with all of these solutions and tools at their disposal, the developer and build manager in our initial project scenario may still have some basic organizational problems to work out. Most importantly, both the developer and the build manager mentioned the need for code checked into the SVN repository to be compilable and ready to build at all times. This project is typical of enterprise development projects in that it actually consists of many interrelated Java projects, or functional components. Each of these is likely dependent on one or more other projects, and each project likely provides either a client JAR file (containing only the interfaces and classes of the functional component needed to compile the client) or a full JAR file (containing the entire distribution of the functional component), depending on the type of functionality provided.
All of these JAR files are available in the lib folder of each Java project, but keeping them updated across all the Java projects in such a large project is a big undertaking, as the manager mentioned. It's difficult to track which application component is running on which version of the internally generated JAR files, and this is also true for third-party JARs. It is possible to resolve such a problem by creating a single, centralized Java project that contains only JARs (third-party as well as internally generated).

Now, if the project team creates new internal JARs on a daily basis, and the source code available in SVN is in a work-in-progress state, an update may break the functionality of other components and you may start seeing runtime issues. The answer lies in released versions of the internal JARs. First of all, developers should commit code to SVN only after it has passed a set of test cases; the source repository should not contain unstable code. Whenever the source code is updated and functionally stable, the application component team may release an updated version of the JAR file to the centralized libs. Each application component team should have a person assigned to this role. One needs to run the local package's tests and ensure they pass before committing changes to the source repository.

In conclusion

It has not been my goal in this article to advocate for agile development, per se. Even while suggesting some best practices from the agile world, I have tried to keep the spirit of the Waterfall model intact. You should be able to incorporate automated builds or test automation into a Waterfall project without disrupting its essential flow. When you start thinking about pair programming, just-enough documentation or no documentation, business people as part of the development team, no upfront design, and no roles (architect, build manager, etc.)
— well, then you are meddling with the spirit of the Waterfall model. These factors directly impact stakeholders and the way the IT team works as a whole. The best practices discussed in this article, by contrast, gel nicely with a Waterfall-based project: automated builds provide ready-to-build code at all times; test automation brings predictability, precision, and reliability to the process of testing code; and continuous integration helps maintain ready-to-build code. Taken as a whole, these techniques improve the efficiency of the project and optimize the way individual team members work, and they can do so without affecting the steady, incremental flow we've come to associate with Waterfall-based projects.

See the Resources section to learn more about the techniques and tools discussed in this article. You can also share your own experiences in the discussion forum associated with this article.

ShriKant Vashishtha currently works as a Principal Consultant for Xebia IT Architects India Private Limited. He has more than nine years of experience in the IT industry and has been involved in designing technical architectures for various large-scale Java EE-based projects in the banking and retail domains. ShriKant holds a bachelor's degree in engineering from the Motilal Nehru National Institute of Technology in Allahabad, India.