Use Jakarta POI to generate Excel spreadsheets from XML documents Often it is useful for XML data to be presented as a spreadsheet. A typical spreadsheet (for example, a Microsoft Excel spreadsheet) consists of cells represented in a grid of rows and columns, containing textual data, numeric data, or formulas. An Excel spreadsheet defines some standard functions such as SUM and AVERAGE that you can specify in cells. The Apache Jakarta POI project provides the HSSF API to create an Excel spreadsheet from an XML document or to go the opposite way, parsing an Excel spreadsheet and converting to XML. The HSSF API has provisions for setting the layout, border settings, and fonts of an Excel document. In this article, you’ll learn how to generate an example Excel spreadsheet by parsing an XML document and adding data from the XML document to a spreadsheet. Subsequently, you’ll convert the Excel spreadsheet to an XML document.OverviewThe Jakarta POI HSSF API provides classes to create an Excel workbook and add spreadsheets to the workbook. With the POI API, the HSSFWorkbook class represents a workbook, and you set the spreadsheet fonts, sheet order, and cell styles in the HSSFWorkbook class. You can represent the spreadsheet using the HSSFSheet class. Specifically, you set the sheet layout, including the column widths, margins, header, footer, and print setup using the HSSFSheet class. You can represent a spreadsheet row using the HSSFRow class, and you set the row height using the HSSFRow class. The HSSFCell class represents a cell in a spreadsheet row, and you set the cell style using the HSSFCell class. The indexing of spreadsheets in a workbook, of rows in a spreadsheet, and of cells in a row is zero based. In this article, we’ll show how to convert an example XML document to an Excel spreadsheet and then convert the spreadsheet to an XML document. Listing 1 shows the example document, incomestatements.xml.Listing 1. incomestatements.xml <?xml version="1.0" encoding="UTF-8"?> <incmstmts> <stmt> <year>2005</year> <revenue>11837</revenue> <costofrevenue>2239</costofrevenue> <researchdevelopment>1591</researchdevelopment> <salesmarketing>2689</salesmarketing> <generaladmin>661</generaladmin> <totaloperexpenses>7180</totaloperexpenses> <operincome>4657</operincome> <invincome>480</invincome> <incbeforetaxes>5137</incbeforetaxes> <taxes>1484</taxes> <netincome>3653</netincome> </stmt> <stmt> <year>2004</year> <revenue>10818</revenue> <costofrevenue>1875</costofrevenue> <researchdevelopment>1421</researchdevelopment> <salesmarketing>2122</salesmarketing> <generaladmin>651</generaladmin> <totaloperexpenses>6069</totaloperexpenses> <operincome>4749</operincome> <invincome>420</invincome> <incbeforetaxes>5169</incbeforetaxes> <taxes>1706</taxes> <netincome>3463</netincome> </stmt> <incmstmts> Creating an Eclipse projectIn this article, we create and parse an Excel spreadsheet using the Apache POI HSSF API. Before you can set up your project, you need to download Apache POI 2.5.1 and extract the zip file to an installation directory. You also need to download and install Java SE 5.0. (You can also use another version, such as 1.4 or 6.0.)To compile and run the code examples, you will need an Eclipse project. You can download project Chapter11 from the Apress Website and import it into your Eclipse workspace by selecting File, then Import.To compile and run the Apache POI code examples, you need some jar files in your project’s Java build path; Figure 1 shows these JAR files. The JAR file required for an Apache POI application is poi-2.5.1-final-20040804.jar, which consists of the Apache POI API. You also need to set the Java Runtime Environment system library to JRE 5.0, as shown in Figure 1. Figure 1. Java build pathFigure 2 shows the project directory structure.If you haven’t got a copy of Excel handy, you can instead open the Excel spreadsheet generated from the example XML document using Excel Viewer. Converting an XML document to an Excel spreadsheetIn this section, we will show how to convert the example XML document in Listing 1 to an Excel document using the Apache POI HSSF API. Specifically, you will parse the example XML document, retrieve values from the document, and construct an Excel spreadsheet. The procedure to create a spreadsheet is as follows: Create an Excel spreadsheet workbook and an empty spreadsheetDefine a cell styleSet the spreadsheet column widthAdd a header row to the spreadsheetParse the XML documentAdd statement columns to the spreadsheetOutput the spreadsheetYou need to import the Apache POI HSSF package, org.apache.poi.hssf.usermodel. You can create an Excel workbook using a no-arguments constructor for HSSFWorkbook, as shown in Listing 2. You create a spreadsheet, represented with the HSSFSheet class, by using the createSheet(String sheetName) method of the HSSFWorkbook class.Listing 2. Creating an Excel workbook and spreadsheetHSSFWorkbook wb=new HSSFWorkbook(); HSSFSheet spreadSheet=wb.createSheet("spreadSheet"); You can represent a cell in a spreadsheet using the HSSFCell class. You set the cell style using the HSSFCellStyle class. To set the cell style in the example spreadsheet being generated, create a cell style object using the createCellStyle() method of the HSSFWorkbook class, as shown in Listing 3. The example cell style defines a cell border and is used for cells that represent totals for a column or subcolumn. You can set the border settings for an HSSFCellStyle object using the setter methods setBorderTop(short), setBorderLeft(short), setBorderBottom(short), and setBorderRight(short). Listing 3. Setting the cell styleHSSFCellStyle cellStyle=wb.createCellStyle(); cellStyle.setBorderRight(HSSFCellStyle.BORDER_MEDIUM); cellStyle.setBorderTop(HSSFCellStyle.BORDER_MEDIUM); cellStyle.setBorderLeft(HSSFCellStyle.BORDER_MEDIUM); cellStyle.setBorderBottom(HSSFCellStyle.BORDER_MEDIUM); You can represent a border type with a short value, as shown in Listing 3. Table 1 lists some of the commonly used types of borders.Table 1. Border types ShortDescriptionBORDER_DASH_DOTDash-dot borderBORDER_DASHEDDashed borderBORDER_DOUBLEDouble-line borderBORDER_MEDIUMMedium borderBORDER_NONENo borderBORDER_THICKThick borderBORDER_THINThin border You can set the border color using the setter methods setBottomBorderColor(short color), setLeftBorderColor(short color), setRightBorderColor(short color), and setTopBorderColor(short color). You can represent spreadsheet color using the HSSFColor subclasses. For example, the class HSSFColor.BLUE represents the color blue. You can obtain a short value corresponding to a color using the field index. The following is an example of setting a color:short blue= HSSFColor.BLUE.index; cellStyle.setRightBorderColor(blue); You can set background color and foreground color using the methods setFillBackgroundColor(short fg) and setFillForegroundColor(short bg). You can set text indentation using the setIndention(short indent) method. You can wrap cell text using the setWrapText(boolean wrapped) method. For example, you can set cell-style indentation to 4 and add text wrapping, as shown here:cellStyle.setIndention((short)4); cellStyle.setWrapText(true); Further, you can add text rotation to cell text using the setRotation(short rotation) method. You specify rotation in degrees using values from -90 to +90. You can horizontally align cell text using the setAlignment(short) method. You represent cell alignment using a short value. Some of the commonly used cell alignment types are ALIGN_CENTER, ALIGN_RIGHT, ALIGN_LEFT, and ALIGN_FILL. You can set vertical alignment using the setVerticalAlignment(short align) method. Vertical alignment short values are VERTICAL_TOP, VERTICAL_CENTER, VERTICAL_BOTTOM, and VERTICAL_JUSTIFY. You define the spreadsheet font using the HSSFFont class. Listing 4 shows an example of creating an italicized font using font height 24 and font name Courier New. As shown in the listing, a font is created using the method createFont() of the HSSFWorkbook class. Listing 4. Setting the fontHSSFFont font = wb.createFont(); font.setFontHeightInPoints((short)24); font.setFontName("Courier New"); font.setItalic(true); cellStyle.setFont(font); A row in the spreadsheet created from the example XML document has cells corresponding to each of the elements in the stmt tag of the example XML document. You set the column width in a spreadsheet at column level using the HSSFSheet method setColumnWidth(short column, short width). For example, you specify the column width of the first column of a spreadsheet as shown here: spreadSheet.setColumnWidth((short)0, (short)(256*25));A spreadsheet has a header row that specifies headers for the columns in the spreadsheet. Therefore, add a header row to the HSSFSheet class. A header row is just like any other row and is created using the createRow(int rowNumber) method, as shown in Listing 5. You add column headers to the header row using the createCell(short) method, as shown in Listing 5. You set the cell value using the setCellValue(String) method. For example, add a column header for the Year 2005 column.Listing 5. Adding the spreadsheet header rowHSSFRow row = spreadSheet.createRow(0); HSSFCell cell = row.createCell((short) 0); cell.setCellValue("Year 2005"); You can add the column header for the Year 2004 column similarly. You need to parse the example XML document using a DocumentBuilder to navigate an XML document and retrieve the values from the document. You need to create a DocumentBuilderFactory from which you will create a DocumentBuilder parser, as shown in Listing 6. Subsequently, parse the example XML document and obtain a Document object. Listing 6. Parsing an XML documentDocumentBuilderFactory factory =DocumentBuilderFactory.newInstance(); DocumentBuilder builder = factory.newDocumentBuilder(); Document document = builder.parse(xmlDocument); You can obtain a node list that consists of stmt nodes from the Document object using the getElementsByTagName(String) method as shown in Listing 7. Each stmt node represents a column in a spreadsheet. The subelements in an stmt element represent the row values for a column. In the spreadsheet, add 11 rows corresponding to the subelements of an stmt element. For example, the following code shows how to add row 1: HSSFRow row1 = spreadSheet.createRow(1);To construct a spreadsheet, iterate over the node list and add a column to the spreadsheet corresponding to each of the stmt nodes in the node list, as shown in Listing 7. You add a spreadsheet column using the HSSFRow object. The node list of stmt elements has two nodes corresponding to the two stmt elements in the example XML document. Using a switch statement, you’ll add row labels and row values for two columns. For example, to add a row labeled Revenue, create a row label and create row cells for the two nodes in the stmt element node list, as shown in Listing 7. A column consists of cells corresponding to each of the elements in the stmt element. You create a cell using the createCell(short) method of the HSSFRow object, as shown in Listing 7. You set the cell value using the setCellValue(String) method. Listing 7. Constructing a spreadsheetNodeList nodeList = document.getElementsByTagName("stmt"); for (int i = 0; i < nodeList.getLength(); i++) { switch(i){ case 0: HSSFCell cell = row1.createCell((short) 0); cell.setCellValue("Revenue ($)"); cell = row1.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))). getElementsByTagName ("revenue").item(0).getFirstChild() .getNodeValue()); break; case 1: HSSFCell cell = row1.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))). getElementsByTagName("revenue"). item(0).getFirstChild().getNodeValue()); break; } } The first cell in a row has index 0. Earlier in the section, you defined a cell style. The cell style is set at the cell level using the setCellStyle() method of the HSSFCell object, as shown here: cell.setCellStyle(cellStyle);Similarly, you need to set row values for other cells in a column. HSSFSheet provides some methods to set different characteristics of a spreadsheet. Table 2 discusses some of these methods. Table 2. HSSFSheet methodsMethod nameDescriptionsetColumnBreak(short column)Sets a page break at the specified columnsetDefaultColumnWidth(short width)Sets the default column width, if the width is not specified atthe column levelsetDefaultRowHeight(short height)Sets the default row height, if the height is not specifiedat the row levelsetFitToPage(boolean b)Sets it to fit to the pagesetHorizontallyCenter(boolean value)Sets the output to be horizontally centeredsetMargin(short margin,double size)Sets the style sheet marginsetRowBreak(int row)Sets a page break at the specified rowsetZoom(int numerator, int denominator)Sets the zoom magnification forthe style sheet To output the Excel workbook to an .xls file, create a FileOutputStream object, as shown in Listing 8. You can output the Excel workbook using the write(HSSFWorkbook) method, and you can close the FileOutputStream object using the close() method.Listing 8. Outputting the Excel workbook FileOutputStream output=new FileOutputStream(new File("IncomeStatements.xls")); wb.write(output); output.flush(); output.close(); Listing 9 shows the Java application, XMLToExcel.java, used to convert an XML document to an Excel spreadsheet. The application consists of a method generateExcel(File) that generates an Excel spreadsheet from an XML document. In the generateExcel() method, an Excel workbook is created and a spreadsheet is added to the workbook. The cell style is added using an HSSFCellStyle object. An XML document is parsed, and the stmt element node list is iterated over to retrieve node values. A spreadsheet column is added corresponding to each of the stmt nodes in the example XML document. A column header value is set from the year element in an stmt element. A spreadsheet row is added corresponding to each of the subelements in an stmt element. A switch statement is used to set row values for a column. Subsequently, the Excel workbook is output using a FileOutputStream.Listing 9. XMLToExcel.javapackage com.apress.excel; import org.apache.poi.hssf.usermodel.*; import org.w3c.dom.*; import java.io.*; import javax.xml.parsers.DocumentBuilder; import javax.xml.parsers.DocumentBuilderFactory; import javax.xml.parsers.ParserConfigurationException; import org.xml.sax.SAXException; public class XMLToExcel { public void generateExcel(File xmlDocument) { try {// Creating a Workbook HSSFWorkbook wb = new HSSFWorkbook(); HSSFSheet spreadSheet = wb.createSheet("spreadSheet"); spreadSheet.setColumnWidth((short) 0, (short) (256 * 25)); spreadSheet.setColumnWidth((short) 1, (short) (256 * 25)); // Parsing XML Document DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); DocumentBuilder builder = factory.newDocumentBuilder(); Document document = builder.parse(xmlDocument); NodeList nodeList = document.getElementsByTagName("stmt"); // Creating Rows HSSFRow row = spreadSheet.createRow(0); HSSFCell cell = row.createCell((short) 1); cell.setCellValue("Year 2005"); cell = row.createCell((short) 2); cell.setCellValue("Year 2004"); HSSFRow row1 = spreadSheet.createRow(1); HSSFRow row2 = spreadSheet.createRow(2); HSSFRow row3 = spreadSheet.createRow(3); HSSFRow row4 = spreadSheet.createRow(4); HSSFRow row5 = spreadSheet.createRow(5); HSSFRow row6 = spreadSheet.createRow(6); HSSFRow row7 = spreadSheet.createRow(7); HSSFRow row8 = spreadSheet.createRow(8); HSSFRow row9 = spreadSheet.createRow(9); HSSFRow row10 = spreadSheet.createRow(10); HSSFRow row11 = spreadSheet.createRow(11); for (int i = 0; i < nodeList.getLength(); i++) { HSSFCellStyle cellStyle = wb.createCellStyle(); cellStyle.setBorderRight(HSSFCellStyle.BORDER_MEDIUM); cellStyle.setBorderTop(HSSFCellStyle.BORDER_MEDIUM); cellStyle.setBorderLeft(HSSFCellStyle.BORDER_MEDIUM); cellStyle.setBorderBottom(HSSFCellStyle.BORDER_MEDIUM); switch (i) { // Creating column1 (Row label) and column 2 (Year 2005 stmt) case 0: cell = row1.createCell((short) 0); cell.setCellValue("Revenue ($)"); cell = row1.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("revenue").item(0) .getFirstChild().getNodeValue()); cell = row2.createCell((short) 0); cell.setCellValue("Cost of Revenue ($)"); cell = row2.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("costofrevenue").item(0) .getFirstChild().getNodeValue()); cell = row3.createCell((short) 0); cell.setCellValue("<a background-color="transparent" href="/?page=2#" itxtdid="2625781" target="_blank" rel="noopener">Research and Development</a> ($)"); cell = row3.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("researchdevelopment") .item(0).getFirstChild().getNodeValue()); cell = row4.createCell((short) 0); cell.setCellValue("Sales and Marketing ($)"); cell = row4.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("salesmarketing").item(0) .getFirstChild().getNodeValue()); cell = row5.createCell((short) 0); cell.setCellValue("General and Administrative ($)"); cell = row5.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("generaladmin").item(0) .getFirstChild().getNodeValue()); cell = row6.createCell((short) 0); cell.setCellValue("Total Operating Expenses ($)"); cell.setCellStyle(cellStyle); cell = row6.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("totaloperexpenses").item(0) .getFirstChild().getNodeValue()); cell.setCellStyle(cellStyle); cell = row7.createCell((short) 0); cell.setCellValue("Operating Income ($)"); cell = row7.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("operincome").item(0) .getFirstChild().getNodeValue()); cell = row8.createCell((short) 0); cell.setCellValue("Investment Income ($)"); cell = row8.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("invincome").item(0) .getFirstChild().getNodeValue()); cell = row9.createCell((short) 0); cell.setCellValue("Income Before Taxes ($)"); cell.setCellStyle(cellStyle); cell = row9.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("incbeforetaxes").item(0) .getFirstChild().getNodeValue()); cell.setCellStyle(cellStyle); cell = row10.createCell((short) 0); cell.setCellValue("Taxes ($)"); cell = row10.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("taxes").item(0) .getFirstChild().getNodeValue()); cell = row11.createCell((short) 0); cell.setCellValue("Net Income ($)"); cell.setCellStyle(cellStyle); cell = row11.createCell((short) 1); cell.setCellValue(((Element) (nodeList.item(0))) .getElementsByTagName("netincome").item(0) .getFirstChild().getNodeValue()); cell.setCellStyle(cellStyle); break; // Creating column 3 (Year 2004 stmt) case 1: cell = row1.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("revenue").item(0) .getFirstChild().getNodeValue()); cell = row2.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("costofrevenue").item(0) .getFirstChild().getNodeValue()); cell = row3.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("researchdevelopment") .item(0).getFirstChild().getNodeValue()); cell = row4.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("salesmarketing").item(0) .getFirstChild().getNodeValue()); cell = row5.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("generaladmin").item(0) .getFirstChild().getNodeValue()); cell = row6.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("totaloperexpenses").item(0) .getFirstChild().getNodeValue()); cell.setCellStyle(cellStyle); cell = row7.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("operincome").item(0) .getFirstChild().getNodeValue()); cell = row8.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("invincome").item(0) .getFirstChild().getNodeValue()); cell = row9.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("incbeforetaxes").item(0) .getFirstChild().getNodeValue()); cell.setCellStyle(cellStyle); cell = row10.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("taxes").item(0) .getFirstChild().getNodeValue()); cell = row11.createCell((short) 2); cell.setCellValue(((Element) (nodeList.item(1))) .getElementsByTagName("netincome").item(0) .getFirstChild().getNodeValue()); cell.setCellStyle(cellStyle); break; default: break; } } // Outputting to Excel spreadsheet FileOutputStream output = new FileOutputStream(new File("IncomeStatements.xls")); wb.write(output); output.flush(); output.close(); } catch (IOException e) { System.out.println("IOException " + e.getMessage()); } catch (ParserConfigurationException e) { System.out .println("ParserConfigurationException " + e.getMessage()); } catch (SAXException e) { System.out.println("SAXException " + e.getMessage()); } } public static void main(String[] argv) { File xmlDocument = new File("incomestatements.xml"); XMLToExcel excel = new XMLToExcel(); excel.generateExcel(xmlDocument); } } You can run the XMLToExcel.java application in Eclipse. This generates an Excel spreadsheet. IncomeStatements.xls, the example spreadsheet generated from the example XML document, gets added to the Chapter11 project, as shown in Figure 3. Figure 3. The Excel spreadsheet IncomeStatements.xls in the Chapter11 projectFigure 4 shows the Excel spreadsheet generated with the Apache POI HSSF API.Figure 4. IncomeStatements.xls spreadsheet Converting an Excel spreadsheet to an XML documentIn the previous section, you learned how to generate an Excel document from an XML document. In this section, you’ll convert the Excel document to an XML document. The procedure to generate an XML document from a spreadsheet is as follows:Create an empty XML document using DocumentBuilderAdd top-level stmt elementsCreate an HSSFSheet object from the Excel fileIterate over the spreadsheet and add subelements to the XML documentOutput the XML document using the Transformer APIYou can use the Apache POI HSSF API to parse an Excel spreadsheet and retrieve cell values from the spreadsheet. As in the previous section, first you need to import the Apache POI package org.apache.poi.hssf.usermodel.The root element of the XML document (Listing 1) that you will generate is incmstmts, and you’ll add an stmt element corresponding to each of the columns of the Excel spreadsheet. Therefore, generate an XML document using a DocumentBuilder object as shown in Listing 10 and add the root element of the document. You can obtain the DocumentBuilder object from a DocumentBuilderFactory object, as shown in Listing 10. Listing 10. Creating an XML documentDocumentBuilderFactory factory =DocumentBuilderFactory.newInstance(); DocumentBuilder builder = factory.newDocumentBuilder(); Document document = builder.newDocument(); Element rootElement=document.createElement("incmstmts"); document.appendChild(rootElement); You can read the XLS spreadsheet that is to be converted to an XML document using an InputStream, as shown in Listing 11. Subsequently, obtain a workbook from the InputStream object and obtain the spreadsheet in the Excel workbook.Listing 11. Obtaining spreadsheetInputStream input=new FileInputStream(new File("IncomeStatements.xls") ); HSSFWorkbook workbook=new HSSFWorkbook(input); HSSFSheet spreadsheet=workbook.getSheetAt(0); To construct an XML document, add an stmt element for each of the columns in the spreadsheet. You also need to add an element, year, corresponding to the column header, to each of the stmt elements, as shown in Listing 12.Listing 12. Adding stmt elementsElement stmtElement1 = document.createElement("stmt"); rootElement.appendChild(stmtElement1); Element year1 = document.createElement("year"); stmtElement1.appendChild(year1); year1.appendChild(document.createTextNode("2005")); To add subelements to stmt elements, iterate over spreadsheet rows and, using a switch statement, retrieve row cell values; use these row cell values to create the subelements. Because the first row(corresponding to index 0)) is a header row, iterate from the second row. For example, to add a revenue element, retrieve the second spreadsheet row, which corresponds to index 1, using the getRow(int) method of the HSSFSheet class. You can retrieve a row cell value using the HSSFRow method getCell(short) and a row cell value using the method getStringCellValue(), as shown in Listing 13. You can obtain the number of rows in a spreadsheet using the HSSFSheet class method getLastRowNum().Listing 13. Adding elements to the XML documentfor (int i = 1; i <= spreadsheet.getLastRowNum(); i++) { switch (i) { case 1: HSSFRow row1 = spreadsheet.getRow(1); Element revenueElement1 = document.createElement("revenue"); stmtElement1.appendChild(revenueElement1); revenueElement1.appendChild (document.createTextNode (row1.getCell((short)1). getStringCellValue())); Element revenueElement2 = document.createElement("revenue"); stmtElement2.appendChild(revenueElement2); revenueElement2.appendChild (document.createTextNode (row1.getCell((short) 2). getStringCellValue())); break; } } Similarly, other cell values are retrieved from the spreadsheet and specified in the XML document. You can generate the XML document using the Transformer API. You obtain a Transformer object from a TransformerFactory object, as shown in Listing 14. You can output the XML document using the transform(DOMSource, StreamResult) method with a DOMSource object as input and a StreamResult object as output, as shown in Listing 14.Listing 14. Outputting an XML documentTransformerFactory tFactory = TransformerFactory.newInstance(); Transformer transformer = tFactory.newTransformer(); DOMSource source = new DOMSource(document); StreamResult result = new StreamResult(new File(<a background-color="transparent" href="/?page=3#" itxtdid="2878689" target="_blank" rel="noopener">System</a>.out)); transformer.transform(source, result); Listing 15 shows the Java application, ExcelToXML.java, used to convert an Excel spreadsheet to an XML document. The application consists of a method generateXML(File excelFile) that converts a spreadsheet to an XML document. An XML document is created using a DocumentBuilder object. A spreadsheet is parsed, and an XML document element, stmt, is added corresponding to each of the columns in the spreadsheet. Elements are added to the stmt element that corresponds to the rows in a column. The XML document is output using the Transformer API.Listing 15. ExcelToXML.javapackage com.apress.excel; import org.apache.poi.hssf.usermodel.*; import org.w3c.dom.*; import java.io.*; import javax.xml.parsers.*; import javax.xml.transform.*; import javax.xml.transform.dom.DOMSource; import javax.xml.transform.stream.StreamResult; public class ExcelToXML { public void generateXML(File excelFile) { try { //Initializing the XML document DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); DocumentBuilder builder = factory.newDocumentBuilder(); Document document = builder.newDocument(); Element rootElement = document.createElement("incmstmts"); document.appendChild(rootElement); //Creating top-level elements Element stmtElement1 = document.createElement("stmt"); rootElement.appendChild(stmtElement1); Element stmtElement2 = document.createElement("stmt"); rootElement.appendChild(stmtElement2); //Adding first subelements Element year1 = document.createElement("year"); stmtElement1.appendChild(year1); year1.appendChild(document.createTextNode("2005")); Element year2 = document.createElement("year"); stmtElement2.appendChild(year2); year2.appendChild(document.createTextNode("2004")); //Creating an HSSFSpreadsheet object from an Excel file InputStream input = new FileInputStream(excelFile); HSSFWorkbook workbook = new HSSFWorkbook(input); HSSFSheet spreadsheet = workbook.getSheetAt(0); for (int i = 1; i <= spreadsheet.getLastRowNum(); i++) { switch (i) { //Iterate over spreadsheet rows to create stmt element //subelements. case 1: HSSFRow row1 = spreadsheet.getRow(1); Element revenueElement1 = document.createElement("revenue"); stmtElement1.appendChild(revenueElement1); revenueElement1.appendChild (document.createTextNode (row1.getCell((short) 1). getStringCellValue())); Element revenueElement2 = document.createElement("revenue"); stmtElement2.appendChild(revenueElement2); revenueElement2.appendChild (document.createTextNode (row1.getCell((short) 2). getStringCellValue())); break; case 2: HSSFRow row2 = spreadsheet.getRow(2); Element costofrevenue1 = document.createElement("costofrevenue"); stmtElement1.appendChild(costofrevenue1); costofrevenue1.appendChild (document.createTextNode (row2.getCell((short)1). getStringCellValue())); Element costofrevenue2 = document.createElement("costofrevenue"); stmtElement2.appendChild(costofrevenue2); costofrevenue2.appendChild (document.createTextNode (row2.getCell((short) 2). getStringCellValue())); break; case 3: HSSFRow row3 = spreadsheet.getRow(3); Element researchdevelopment1 = document.createElement("researchdevelopment"); stmtElement1.appendChild(researchdevelopment1); researchdevelopment1.appendChild (document.createTextNode (row3.getCell((short) 1).getStringCellValue())); Element researchdevelopment2 =document.createElement("researchdevelopment"); stmtElement2.appendChild(researchdevelopment2); researchdevelopment2.appendChild (document.createTextNode (row3.getCell((short) 2). getStringCellValue())); break; case 4: HSSFRow row4 = spreadsheet.getRow(4); Element salesmarketing1 = document.createElement("salesmarketing"); stmtElement1.appendChild(salesmarketing1); salesmarketing1.appendChild(document.createTextNode(row4.getCell((short) 1).getStringCellValue())); Element salesmarketing2 = document.createElement("salesmarketing"); stmtElement2.appendChild(salesmarketing2); salesmarketing2.appendChild(document.createTextNode(row4.getCell((short) 2).getStringCellValue())); break; case 5: HSSFRow row5 = spreadsheet.getRow(5); Element generaladmin1 = document.createElement("generaladmin"); stmtElement1.appendChild(generaladmin1); generaladmin1.appendChild(document.createTextNode(row5 .getCell((short) 1).getStringCellValue())); Element generaladmin2 = document.createElement("generaladmin"); stmtElement2.appendChild(generaladmin2); generaladmin2.appendChild(document.createTextNode(row5 .getCell((short) 2).getStringCellValue())); break; case 6: HSSFRow row6 = spreadsheet.getRow(6); Element totaloperexpenses1 = document.createElement("totaloperexpenses"); stmtElement1.appendChild(totaloperexpenses1); totaloperexpenses1.appendChild(document.createTextNode(row6 .getCell((short) 1).getStringCellValue())); Element totaloperexpenses2 = document.createElement("totaloperexpenses"); stmtElement2.appendChild(totaloperexpenses2); totaloperexpenses2.appendChild(document.createTextNode(row6 .getCell((short) 2).getStringCellValue())); break; case 7: HSSFRow row7 = spreadsheet.getRow(7); Element operincome1 = document.createElement("operincome"); stmtElement1.appendChild(operincome1); operincome1.appendChild(document.createTextNode(row7 .getCell((short) 1).getStringCellValue())); Element operincome2 = document.createElement("operincome"); stmtElement2.appendChild(operincome2); operincome2.appendChild (document.createTextNode (row7.getCell((short) 2). getStringCellValue())); break; case 8: HSSFRow row8 = spreadsheet.getRow(8); Element invincome1 = document.createElement("invincome"); stmtElement1.appendChild(invincome1); invincome1.appendChild (document.createTextNode (row8.getCell((short) 1). getStringCellValue())); Element invincome2 = document.createElement("invincome"); stmtElement2.appendChild(invincome2); invincome2.appendChild (document.createTextNode (row8.getCell((short) 2). getStringCellValue())); break; case 9: HSSFRow row9 = spreadsheet.getRow(9); Element incbeforetaxes1 = document.createElement("incbeforetaxes"); stmtElement1.appendChild(incbeforetaxes1); incbeforetaxes1.appendChild (document.createTextNode (row9.getCell((short) 1). getStringCellValue())); Element incbeforetaxes2 =document.createElement("incbeforetaxes"); stmtElement2.appendChild(incbeforetaxes2); incbeforetaxes2.appendChild (document.createTextNode (row9.getCell((short)2). getStringCellValue())); break; case 10: HSSFRow row10 = spreadsheet.getRow(10); Element taxes1 = document.createElement("taxes"); stmtElement1.appendChild(taxes1); taxes1.appendChild(document.createTextNode(row10.getCell( (short) 1).getStringCellValue())); Element taxes2 = document.createElement("taxes"); stmtElement2.appendChild(taxes2); taxes2.appendChild(document.createTextNode(row10.getCell( (short) 2).getStringCellValue())); break; case 11: HSSFRow row11 = spreadsheet.getRow(11); Element netincome1 = document.createElement("netincome"); stmtElement1.appendChild(netincome1); netincome1.appendChild(document.createTextNode(row11 .getCell((short) 1).getStringCellValue())); Element netincome2 = document.createElement("netincome"); stmtElement2.appendChild(netincome2); netincome2.appendChild(document.createTextNode(row11 .getCell((short) 2).getStringCellValue())); break; default: break; } } TransformerFactory tFactory = TransformerFactory.newInstance(); Transformer transformer = tFactory.newTransformer(); //Add indentation to output transformer.setOutputProperty (OutputKeys.INDENT, "yes"); transformer.setOutputProperty( "{http://xml.apache.org/xslt}indent-amount", "2"); DOMSource source = new DOMSource(document); StreamResult result = new StreamResult(System.out); transformer.transform(source, result); } catch (IOException e) { System.out.println("IOException " + e.getMessage()); } catch (ParserConfigurationException e) { System.out .println("ParserConfigurationException " + e.getMessage()); } catch (TransformerConfigurationException e) { System.out.println("TransformerConfigurationException "+ e.getMessage()); } catch (TransformerException e) { System.out.println("TransformerException " + e.getMessage()); } } public static void main(String[] argv) { ExcelToXML excel = new ExcelToXML(); File input = new File("IncomeStatements.xls"); excel.generateXML(input); } } You can run the application ExcelToXML.java in Eclipse. Listing 11-16 shows the output from the ExcelToXML.java application.Listing 16. Output from ExcelToXML.java<?xml version="1.0" encoding="UTF-8"?> <incmstmts> <stmt> <year>2005</year> <revenue>11837</revenue> <costofrevenue>2239</costofrevenue> <researchdevelopment>1591</researchdevelopment> <salesmarketing>2689</salesmarketing> <generaladmin>661</generaladmin> <totaloperexpenses>7180</totaloperexpenses> <operincome>4657</operincome> <invincome>480</invincome> <incbeforetaxes>5137</incbeforetaxes> <taxes>1484</taxes> <netincome>3653</netincome> </stmt> <stmt> <year>2004</year> <revenue>10818</revenue> <costofrevenue>1875</costofrevenue> <researchdevelopment>1421</researchdevelopment> <salesmarketing>2122</salesmarketing> <generaladmin>651</generaladmin> <totaloperexpenses>6069</totaloperexpenses> <operincome>4749</operincome> <invincome>420</invincome> <incbeforetaxes>5169</incbeforetaxes> <taxes>1706</taxes> <netincome>3463</netincome> </stmt> </incmstmts> SummaryThe Apache POI API provides a useful mechanism for converting data between XML and spreadsheets. In this article, you learned how to convert an example XML document to an Excel spreadsheet and then convert the spreadsheet to an XML document. With XML being a universal format, there really is no limit to what you can do with it!Ajay Vohra is senior architect at DataSynapse, currently working on virtualizing business-critical applications on a shared infrastructure. He has 15 years of experience in the software industry, working on diverse projects such as making X Windows Toolkit thread-safe, developing embedded software for an ATM network switch, and architecting Java EE business-critical applications. He is the founder and a principal member of NuBean. He has an MBA from the University of Michigan, Ann Arbor and a masters of science in computer science from Southern Illinois University, Carbondale. Deepak Vohra is an independent consultant and a principal member of the NuBean.com software company. He has worked in the area of XML and Java programming for more than 5 years and has published numerous articles on related topics in various print and online trade journals. He has a masters of science in mechanical engineering from Southern Illinois University, Carbondale and has published original research papers in the area of fluidized bed combustion. Currently, he is working on an automated, Web-based Java EE development environment for NuBean.com. JavaProgramming LanguagesMicrosoft OfficeOpen Source