New Weapons Help Liberate Your Database

Are engineers preprogrammed to refuse even the most benign requests from data analysts? Is there a correlation between an engineer’s seniority and their propensity to say no? When, exactly, did engineers become “application architects”?

These are the questions upon which data analysts reflect while waiting for the engineering department to run their latest request.

At most large enterprises, data analysts and engineers seem to be in naturally opposing camps. The biggest challenge often faced by the analysts is getting access to the data. The biggest challenge faced by engineers is managing the data infrastructure — keeping servers from crashing and maintaining the enterprise’s ability to keep mission-critical functions running. Engineers, frankly, aren’t paid to enable you to get to the data. They’re paid to make sure things don’t break. Viewed in this light, your requests for increased access can only be seen as one thing: a threat.

Most managers of large companies or business units won’t really talk about these sorts of problems, which are lumped into the unacknowledged category of “internal politics.” That’s unfortunate, because this “natural” opposition between engineers and analysts is often the single largest hindrance to increased profitability within an enterprise. The most successful companies find ways to free their analysts: to find correlations in the data, to access dispersed data sets and link marketing costs to lifetime customer value, and to incorporate data from across the value chain into improving the profitability of the enterprise. The more data the analysts have, the more successful an enterprise can become.

Recent trends in database management are set to alter the balance in favor of increased access while enabling engineers to keep their jobs. These trends include the proliferation of Java, consolidation around XML as a universal interchange standard, more and cheaper processing power, and more and cheaper local-area bandwidth.

Infrastructure application providers are increasingly joining the XML bandwagon, and for good reason. XML enables communication of information across servers and formats in a universal protocol. Since most databases can now be queried from Java (through JDBC drivers), enterprises can let analysts send queries directly to database applications in XML, have a local Java-based driver translate the XML requests into the database’s native query language, and get response sets back.

These response sets can, in turn, be converted to XML and, from there, to the protocol that is most useful for the analyst.
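As a rough sketch of that last step, here is how a tabular response set (the kind a JDBC driver hands back) might be wrapped in XML using only the JDK’s built-in DOM and transform APIs. The column and table contents are invented for illustration; a real setup would walk a live `ResultSet` instead of string arrays:

```java
// Sketch: turning a tabular response set into XML for delivery to the analyst.
// Uses only standard JDK APIs; column names and data are illustrative.
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.StringWriter;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class ResultSetToXml {
    // Converts rows (e.g. fetched over JDBC) into a <rows><row>...</row></rows> document.
    public static String toXml(String[] columns, String[][] rows) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("rows");
        doc.appendChild(root);
        for (String[] row : rows) {
            Element rowEl = doc.createElement("row");
            for (int i = 0; i < columns.length; i++) {
                Element col = doc.createElement(columns[i]);
                col.setTextContent(row[i]);
                rowEl.appendChild(col);
            }
            root.appendChild(rowEl);
        }
        // Serialize the DOM tree to a string.
        StringWriter out = new StringWriter();
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = toXml(new String[]{"custno", "revenue"},
                           new String[][]{{"1001", "250.00"}});
        System.out.println(xml);
    }
}
```

From this XML, a last transformation step (XSLT, or a simple parser) can emit whatever format the analyst’s tool prefers.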

Here’s an example of how it all works. Say your company has data residing on a mainframe. The data sits within a legacy database on a system such as IBM’s AS/400. By using a freely available JDBC driver for this database and mapping its tables to XML, an analyst can query it directly from their desktop. The entire production will take a few weeks to set up and cost next to nothing (really). All it takes is an initial job to map data structures to XML.
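To make the pattern concrete (as an illustration, not a drop-in implementation): the snippet below builds the JDBC URL and query that such a setup would submit. The host, schema, and table names are hypothetical; actually running the query requires the vendor’s JDBC driver, such as IBM’s Toolbox for Java (jt400) driver, on the classpath:

```java
// Sketch of the JDBC side of the setup. Host, schema, and table names are
// hypothetical; the jdbc:as400:// URL form is the one used by IBM's
// Toolbox for Java (jt400) driver.
public class LegacyQuery {
    // Builds a Toolbox-style JDBC URL for an AS/400 host.
    static String jdbcUrl(String host) {
        return "jdbc:as400://" + host;
    }

    public static void main(String[] args) {
        String url = jdbcUrl("mainframe01");                      // hypothetical host
        String sql = "SELECT CUSTNO, REVENUE FROM SALES.ORDERS";  // hypothetical table
        System.out.println(url + " :: " + sql);
        // With the driver on the classpath, the query itself is plain JDBC:
        // try (java.sql.Connection c = java.sql.DriverManager.getConnection(url, user, pass);
        //      java.sql.Statement s = c.createStatement();
        //      java.sql.ResultSet rs = s.executeQuery(sql)) {
        //     // walk rs and emit XML, as sketched above
        // }
    }
}
```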

These methods enable an enterprise to access all the right information and services from enterprise applications, legacy systems, partner extranets, and the Internet, delivering them to an end user’s desktop.

Increased access to data creates some additional problems. First, legacy databases are probably sitting on legacy servers, where capacity could be severely constrained. Second, increased access could choke the network. Fortunately, both of these problems are addressable. The easiest solution is to queue requests, or simply run lengthy queries during off-peak hours. Adding server and LAN capacity is the other, still relatively cheap, solution.
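The queuing idea can be sketched in a few lines. The off-peak window below (1 a.m. to 5 a.m.) and the first-in, first-out discipline are assumptions for illustration, not a standard:

```java
// Sketch: holding heavy queries until an off-peak window opens.
// The window boundaries and FIFO ordering are illustrative assumptions.
import java.time.LocalTime;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class OffPeakQueue {
    private final Queue<String> pending = new ArrayDeque<>();
    private final LocalTime windowStart, windowEnd;

    public OffPeakQueue(LocalTime start, LocalTime end) {
        this.windowStart = start;
        this.windowEnd = end;
    }

    boolean isOffPeak(LocalTime now) {
        return !now.isBefore(windowStart) && now.isBefore(windowEnd);
    }

    // Returns true if the caller may run the query immediately;
    // otherwise the query is held for the off-peak window.
    public boolean submit(String sql, LocalTime now) {
        if (isOffPeak(now)) return true;
        pending.add(sql);
        return false;
    }

    // Releases held queries once the window is open.
    public List<String> drain(LocalTime now) {
        List<String> ready = new ArrayList<>();
        if (isOffPeak(now)) {
            while (!pending.isEmpty()) ready.add(pending.poll());
        }
        return ready;
    }

    public static void main(String[] args) {
        OffPeakQueue q = new OffPeakQueue(LocalTime.of(1, 0), LocalTime.of(5, 0));
        q.submit("SELECT * FROM SALES.ORDERS", LocalTime.of(14, 0)); // held: mid-afternoon
        System.out.println(q.drain(LocalTime.of(2, 0)));             // released at 2 a.m.
    }
}
```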

The third problem created by access to too much data is the issue of what to do with it. You need to think about consolidating response sets from multiple data sources. However, as any analyst will tell you, having too much data is a good problem to have.
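Consolidation can be as simple as joining response sets on a shared key. The sketch below joins marketing cost against lifetime customer value by customer number; the field names are made up for illustration:

```java
// Sketch: consolidating response sets from two data sources by a shared key.
// Field names (customer number, cost, lifetime value) are illustrative.
import java.util.Map;
import java.util.TreeMap;

public class Consolidate {
    // Inner-joins two maps keyed by customer id into {cost, lifetimeValue} pairs.
    public static Map<String, String[]> join(Map<String, String> marketingCost,
                                             Map<String, String> lifetimeValue) {
        Map<String, String[]> merged = new TreeMap<>();
        for (Map.Entry<String, String> e : marketingCost.entrySet()) {
            String ltv = lifetimeValue.get(e.getKey());
            if (ltv != null) {
                merged.put(e.getKey(), new String[]{e.getValue(), ltv});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String[]> merged = join(
                Map.of("1001", "12.50"),                  // cost per customer
                Map.of("1001", "250.00", "1002", "90.00") // lifetime value per customer
        );
        System.out.println(merged.size()); // only customers present in both sources
    }
}
```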
