Use JDBC metadata and precompiled SQL to turn data into database data types

Suppose you need to write a program that populates a table in a database with data from a text file. Say the table contains ticket information for an airline, including ticket number, date of purchase, date of travel, departure and arrival locations, and the ticket's price. Each piece of information has a specific data type associated with it in the database, such as number, text, date, or currency. The program must read information for several such tickets from a text file, translate each piece of information to the appropriate data type in the database, and load the table in the database with this translated information.

In order to perform such a translation, the program needs to know the database-specific type of each piece of information. A simple approach would be to hardcode the type conversion in your program, treating data types as static information known at programming time. By adopting this simple approach, you would need to write similar code in another program to populate a second table of the same database, such as a customer table with customer information. You could reuse a lot of boilerplate code from the previous program, but you would need to redo all the database-specific data-type-handling code. As the number of tables that need to be populated increases, you'll need to write more code.

But even this is only the case for a single database. If the data is ported to a different database system, you must rewrite all the existing code to account for the different data types in the new database. The issue of cross-database data typing has earned well-deserved notoriety in database programming.

Create reusable code

How can you avoid all this rewriting? The code you write to populate the table of a database from a text file can make the database name, table name, and column names into parameters.
The question is, how can you do the same for the data types, so that you can write a class, or collection of classes, just once to populate any set of tables in any database? The answer is to postpone the determination of data types until runtime, using the DatabaseMetaData interface in the java.sql package. By writing to this interface, you avoid the hardcoding of data types and develop generic, reusable code.

The DatabaseMetaData interface provides metadata information for a database. Metadata is data that describes data. For instance, our airline database table contains ticketing information. This is data. In this scenario, metadata would contain information on things like the number of columns in the table, the data types of these columns, whether a column can have null values, and so on. This is data about data. The specific focus of this article is on the data types of columns.

In the following discussion, I will show you how to develop a reusable library of three Java classes that are organized in layers, from closest to the database to farthest. The farthest layer from the database becomes the nearest layer to the application that uses the library. Although this library is meant for populating a database table, you can use the techniques illustrated in building it, as well as some of the code, to build libraries for data-type-independent database queries and updates.

Follow the scenario

Suppose you need to populate a table called emp in an organization database. Here is a sample data file that contains the table name on the first line and a set of column names on the second. Every subsequent line contains row data for the specified columns.
emp
hiredate    sal       ename    empno
1996-09-01  1250.00   jackson  7123
1980-01-01  2500.50   walsh    7124
1985-01-01  12345.67  gates    7125

While all input data is ASCII text, the data in these columns will be translated to the following database-specific types:

column     database type
--------   -------------
hiredate   date
sal        decimal
ename      ASCII text
empno      number

The structure of emp is shown in the top row of Figure 1. Not all columns in a table need to be populated at once. In this example, only those columns marked as true are to be populated; the active columns are empno, ename, hiredate, and sal.

Also, the order in which the columns appear in the input file need not be the same as the order in which they are arranged in the database table. An index array maintains a mapping from the columns in the input to their respective positions in the table. In this example, the activeColumnOrder array is the index array. If i is the position of a column in the input file (hiredate is 0, sal is 1, and so forth), then activeColumnOrder[i] is the position of this column in the table (activeColumnOrder[0] is 5, which means hiredate is the fifth column in the emp table).

Building library class layers

Our library consists of three layers:

- A TableColumns class, closest to the database, which discovers and manages table column information
- A TableMediator class, which prepares for populating the table using information managed in TableColumns
- A TableBuilder class, farthest from the database and thus closest to the application, which reads the data from input and uses the TableMediator class to populate the given table

Figure 2 illustrates this layering.

Figure 2. Application layers

These classes are all collected into a package called tablebuild. The Java source files for the tablebuild package are available in the Resources section below. In order to avoid clutter, only relevant portions of the code appear in this article.
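To make the index-array idea concrete, here is a minimal sketch that computes activeColumnOrder from the input-file column order and the table's column order. The computation itself is my own illustration (the library derives column information through database metadata, as described below); only the array name and the 1-based positions come from the example above.

```java
import java.util.Arrays;
import java.util.List;

public class ActiveColumnDemo {
    // Compute the index array: for each input column, find its 1-based
    // position in the table's column list. This sketch assumes every
    // input column actually exists in the table.
    static int[] indexArray(String[] inputColumns, String[] tableColumns) {
        int[] order = new int[inputColumns.length];
        List<String> table = Arrays.asList(tableColumns);
        for (int i = 0; i < inputColumns.length; i++)
            order[i] = table.indexOf(inputColumns[i]) + 1;  // 1-based position
        return order;
    }

    public static void main(String[] args) {
        String[] input = { "hiredate", "sal", "ename", "empno" };
        String[] emp = { "empno", "ename", "job", "mgr",
                         "hiredate", "sal", "comm", "deptno" };
        System.out.println(Arrays.toString(indexArray(input, emp)));
        // [5, 6, 2, 1] -- hiredate is the fifth column of emp, and so on
    }
}
```

The 1-based positions are deliberate: they line up with the parameter numbering of JDBC prepared statements, which is also 1-based.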
Layer 1: The TableColumns class

The TableColumns class discovers information about the given columns of a database table by querying the given database metadata instance. It stores column information in two parallel arrays: a string array named columnNames, and an array of short integers named columnTypeCodes. The types of all the columns in a given table can be discovered using DatabaseMetaData's getColumns method:

public abstract ResultSet getColumns(String catalog,
                                     String schemaPattern,
                                     String tableNamePattern,
                                     String columnNamePattern)
    throws SQLException

The getColumns method takes four parameters: a catalog name, a schema name, a table name, and a column name. Of these, the last three, which carry the Pattern suffix (see the code above), are interesting in that they let you search for all target strings that match a pattern expression. In other words, you specify search criteria through these parameters. This is what is being said: Find all columns that satisfy the following criteria: they belong to the given catalog, to ANY of the schemas that match the given schema pattern, and to ANY of the tables that match the given table name pattern. Their names also match the given column name pattern.

The pattern is specified with syntax similar to that which is applied in SQL statements, in which the LIKE phrase is used to match names. In particular, the underscore character (_) in a pattern expression matches any single character of a target string, and the percent character (%) matches any number of consecutive characters in a target string.
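As a side note, the _ and % matching rules can be emulated with java.util.regex. This self-contained sketch is my own illustration of the semantics, not part of the tablebuild library:

```java
import java.util.regex.Pattern;

public class MetaPatternDemo {
    // Translate a JDBC metadata search pattern (_ and %) into an
    // equivalent regular expression, quoting all other characters literally.
    static String toRegex(String metaPattern) {
        StringBuilder sb = new StringBuilder();
        for (char c : metaPattern.toCharArray()) {
            if (c == '_')      sb.append('.');    // _ matches any single character
            else if (c == '%') sb.append(".*");   // % matches any run of characters
            else               sb.append(Pattern.quote(String.valueOf(c)));
        }
        return sb.toString();
    }

    static boolean matches(String metaPattern, String target) {
        return target.matches(toRegex(metaPattern));  // whole-string match
    }

    public static void main(String[] args) {
        System.out.println(matches("j_b", "job"));   // true
        System.out.println(matches("j_b", "jazz"));  // false: _ matches one character
        System.out.println(matches("j%b", "jb"));    // true: zero intervening characters
    }
}
```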
For instance, the pattern expression j_b would match the target strings "job" and "jab," while the pattern expression j%b would match any target string starting with "j" and ending with "b," with any number (including zero) of intervening characters.

The following is the invocation of the getColumns method in the TableColumns constructor, where dbMeta is a DatabaseMetaData instance for a given database:

ResultSet rset = dbMeta.getColumns(null, null, table.toUpperCase(), "%");

Notice that the values of the catalog name and schema pattern parameters are null. A null value indicates that a parameter should be dropped from the search criteria. Also note that we specifically send in the uppercase version of the given table name for the table name pattern. It is not clear whether this is a JDBC requirement or a JDBC driver preference. I spent some frustrating moments looking at all kinds of other bugs before I discovered that the uppercase version of the table name makes my code work. Finally, the catch-all pattern "%" is specified for the column name pattern. This all translates to: "Get ALL columns in the GIVEN table."

The java.sql.ResultSet returned by the getColumns method consists of one row per matching column. Each row consists of 18 descriptive fields, which are the result-set columns. The relevant fields for our class are the column name and the column type code, which are the fields numbered four and five, respectively. The column type code is a constant in the java.sql.Types class that represents the SQL type of the column. In this example, the getColumns invocation with emp for the table name would return a description of all eight columns, one per row of the result set. The type codes for the columns are extracted and stored in the array called columnTypeCodes for the TableMediator class to use. We'll examine this process next.
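Here is a sketch of the discovery loop just described. The discover method assumes a live DatabaseMetaData instance, so it cannot run standalone; the typeName helper is my own diagnostic addition for turning type codes back into java.sql.Types constant names, and is not part of the article's TableColumns class.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;
import java.util.List;

public class ColumnDiscovery {
    // Read field 4 (column name) and field 5 (type code) from each row
    // of the getColumns result set. Requires a live database connection's
    // metadata, so it is not exercised in main below.
    static void discover(DatabaseMetaData dbMeta, String table,
                         List<String> names, List<Integer> typeCodes)
            throws SQLException {
        ResultSet rset = dbMeta.getColumns(null, null, table.toUpperCase(), "%");
        while (rset.next()) {
            names.add(rset.getString(4));   // COLUMN_NAME
            typeCodes.add(rset.getInt(5));  // DATA_TYPE, a java.sql.Types constant
        }
        rset.close();
    }

    // Map a type code to the name of its java.sql.Types constant,
    // purely for diagnostics, using reflection over the Types class.
    static String typeName(int code) {
        for (Field f : Types.class.getFields()) {
            try {
                if (Modifier.isStatic(f.getModifiers()) && f.getInt(null) == code)
                    return f.getName();
            } catch (IllegalAccessException ignored) { }
        }
        return "UNKNOWN(" + code + ")";
    }

    public static void main(String[] args) {
        System.out.println(typeName(Types.DATE));     // DATE
        System.out.println(typeName(Types.DECIMAL));  // DECIMAL
    }
}
```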
Layer 2: The TableMediator class

The TableMediator class encapsulates the following functionality:

- It constructs a prepared statement (a precompiled SQL statement with parameters for which values are supplied at runtime) to populate a table row. The statement parameters correspond to the values of the active columns.
- It executes the prepared statement for a given set of active columns.

The TableMediator class uses the TableColumns class to get the complete column information for the given table, and then matches this information against the given set of active columns to arrive at the configuration illustrated in Figure 1.

The prepareRowInserts method constructs the required prepared statement. In this example, the prepared statement would look like this:

insert into emp (empno, ename, job, mgr, hiredate, sal, comm, deptno)
values (?, ?, ?, ?, ?, ?, ?, ?)

All the columns in the table will be populated for every row, with values for the inactive columns (those that do not appear in the input file) set to null.

The first key to making your library portable is to use the DatabaseMetaData interface to discover column types, as seen in the TableColumns class. The second is to use the generic setObject method of the java.sql.PreparedStatement interface:

ps.setObject(activeColumnOrder[i], columnValues[i]);

This line of code appears in the insertRow method of the TableMediator class, and ps is the PreparedStatement instance constructed earlier. The setObject method takes two parameters: the first is the parameter position in the associated prepared statement, and the second is the value of this parameter. (Recall that these parameters stand for columns in a table.) In general, the second parameter can be any type of object; in this code, it is a string. The JDBC driver is responsible for converting this string into the appropriate database type.

In this statement, we have not used the column types discovered earlier.
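For illustration, here is a sketch of the string-building step behind a prepareRowInserts-style method. The method name and structure are my assumptions; only the shape of the resulting SQL comes from the article.

```java
public class InsertSqlDemo {
    // Build an INSERT statement with one ? placeholder per table column,
    // suitable for Connection.prepareStatement. Inactive columns still get
    // a placeholder; they are later set to null.
    static String buildInsertSql(String table, String[] columns) {
        StringBuilder sql = new StringBuilder("insert into ")
            .append(table).append(" (");
        for (int i = 0; i < columns.length; i++)
            sql.append(i == 0 ? "" : ", ").append(columns[i]);
        sql.append(") values (");
        for (int i = 0; i < columns.length; i++)
            sql.append(i == 0 ? "?" : ", ?");
        return sql.append(")").toString();
    }

    public static void main(String[] args) {
        String[] emp = { "empno", "ename", "job", "mgr",
                         "hiredate", "sal", "comm", "deptno" };
        // Prints the eight-column insert statement from the article,
        // on a single line.
        System.out.println(buildInsertSql("emp", emp));
    }
}
```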
However, we do need the column types for values that are null, such as when populating a column for which no value is supplied. In that case, we use the supporting setNull method in the PreparedStatement interface. Here is how it is used in the insertRow method of the TableMediator class:

ps.setNull(i, tc.columnTypeCodes[i]);

This statement sets the value of prepared statement parameter i to null, but this setting requires that the type of the corresponding column be sent as the additional, second parameter.

Layer 3: The TableBuilder class

Once the TableColumns and TableMediator classes are established, the code for populating the table with input data is straightforward, as developed in the TableBuilder class. The TableBuilder class encapsulates the following functionality:

- It reads the table name from the first line of the text file, and creates an instance of the TableMediator class
- It reads the column names from the second line of the text file, and asks the TableMediator instance to set the active column list to these columns
- It asks the TableMediator instance to prepare for inserting rows into the table
- It reads the column data from each subsequent line of the text file, and asks the TableMediator instance to insert this data into the next row of the table

Using the tablebuild library

Given the tablebuild library, it is very easy to write an application that populates the table of a database. The following is such a sample application.
import java.sql.*;
import java.io.*;
import tablebuild.*;

public class PopulateTable {
    public static final String usage =
        "usage: java PopulateTable <datafile>";

    public static void main(String[] args)
            throws SQLException, ClassNotFoundException, IOException {
        if (args.length != 1) {
            System.err.println(usage);
            System.exit(1);
        }
        Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
        Connection conn = DriverManager.getConnection("jdbc:odbc:orgdb");
        BufferedReader br = new BufferedReader(new FileReader(args[0]));
        // tab-delimited input data file
        TableBuilder tb = new TableBuilder(conn, br, "\t");
        tb.buildTableInfo();
        tb.buildActiveColumns();
        tb.buildTable();
    }
}

The example uses the JDBC-ODBC bridge to drive a Microsoft Access database, whose data source is named orgdb. I chose to illustrate this bridge mainly because it serves as a reference implementation for JDBC drivers, covering most JDBC capabilities.

To complete the example, you can run the above application to populate the emp table of the orgdb data source. If the input file name is empfile, you run the application as follows:

java PopulateTable empfile

This populates the hiredate, sal, ename, and empno columns with the given data, translating each input data value to the required database-specific type.

If you have a new data source, you don't need to modify any of the tablebuild classes or the application, so long as the same input format is maintained: the first line contains the table name, the second line contains the column names, and every subsequent line contains the data for all the columns of some row. The column delimiter can be anything, since it is set as a parameter to the TableBuilder constructor.

If the database itself changes, all you need to do is change the lines of code in the application in which the connection to the database is established.
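The input-format convention can be sketched in isolation. This is my own illustration of the parsing half of TableBuilder, not the library's actual code; the delimiter-as-parameter idea comes from the article.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class InputFormatDemo {
    // Split every line of the input on the given delimiter: line 0 is the
    // table name, line 1 the active column names, and each remaining line
    // one row of column data.
    static List<String[]> parse(BufferedReader br, String delim) throws IOException {
        List<String[]> lines = new ArrayList<>();
        String line;
        while ((line = br.readLine()) != null)
            lines.add(line.split(delim));
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String file = "emp\n"
                    + "hiredate\tsal\tename\tempno\n"
                    + "1996-09-01\t1250.00\tjackson\t7123\n";
        List<String[]> lines = parse(new BufferedReader(new StringReader(file)), "\t");
        System.out.println(lines.get(0)[0]);      // emp (the table name)
        System.out.println(lines.get(1).length);  // 4 (four active columns)
        System.out.println(lines.get(2)[2]);      // jackson
    }
}
```

Feeding the parsed column names to the metadata-discovery step, and each parsed row to the prepared statement, is all that remains for a TableBuilder-style class.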
Again, none of the tablebuild classes need to be touched; the TableBuilder constructor is simply passed the new connection as a parameter.

Test your JDBC driver

In this article, you have seen how to build a data-type-independent class library to populate the columns of a table in a database. The key is to leave all the data-type management to the JDBC driver's implementation of the java.sql.DatabaseMetaData and java.sql.PreparedStatement interfaces. You can use the techniques illustrated in this article for database-insertion operations, as well as for database queries and updates.

There is one important caveat: the ability to use dynamic type discovery, as well as type translation, ultimately depends on the capabilities of the JDBC driver you use. For instance, some JDBC drivers may not support the type-translation capability of PreparedStatement's setObject method. Also, different JDBC drivers implement the DatabaseMetaData interface to different extents. As a general rule in JDBC programming, when capabilities promised by the JDBC specifications do not materialize, suspect the JDBC driver. More often than not, switching to a good, JDBC-compliant driver solves the problem, overcoming apparent limitations in portability or capability.