March 26, 2012 / Keyhole Software

Build vs. Buy, Creating a Report Writing Framework

During one of my engagements, a requirement arose to produce numerous financial audit reports in PDF format. These reports already existed and were produced by COBOL applications. Because COBOL (running on a mainframe) was being retired, they needed to be replaced with a Java solution.

I was tasked with defining a mechanism that would let developers define these reports efficiently. My first instinct was to use a report writing tool such as JasperReports or Crystal Reports. These tools provide a robust WYSIWYG mechanism for defining reports against a relational database source, and they have runtime libraries that allow reports to be launched from within applications. However, as I reviewed the population of reports that had to be created, I found that they were all standard control break reports, typical of reports created with a procedural language like COBOL. By “control break” reports, I mean column-based reports whose subtotals and totals are printed when a column value changes. I learned this term in my COBOL course in college during the ’80s. Yes, I’m old.
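For readers who have not met the term, here is a minimal sketch of control-break logic in plain Java. It is not taken from the framework: the class name ControlBreakSketch, the Row type, and the sample data are all made up for illustration, and the rows are assumed to arrive already sorted on the break column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: none of these names come from the framework.
class ControlBreakSketch {

    // One input row: a break column (region) and an amount to total.
    static final class Row {
        final String region;
        final int amount;

        Row(String region, int amount) {
            this.region = region;
            this.amount = amount;
        }
    }

    // Emits detail lines, a subtotal whenever the region changes,
    // and a grand total at the end. Rows must be sorted by region.
    static List<String> render(List<Row> rows) {
        List<String> out = new ArrayList<>();
        String current = null;
        int subtotal = 0;
        int total = 0;

        for (Row row : rows) {
            if (current != null && !current.equals(row.region)) {
                // Control break: the key changed, so flush the subtotal.
                out.add("  Subtotal " + current + ": " + subtotal);
                subtotal = 0;
            }
            current = row.region;
            out.add(row.region + " " + row.amount);
            subtotal += row.amount;
            total += row.amount;
        }
        if (current != null) {
            // Flush the subtotal of the final group.
            out.add("  Subtotal " + current + ": " + subtotal);
        }
        out.add("Total: " + total);
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("East", 10), new Row("East", 5), new Row("West", 7));
        render(rows).forEach(System.out::println);
    }
}
```

The only state carried between rows is the current key and the running sums, which is why this style of report maps so naturally onto a simple loop.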

None of the reports had what I would call desktop publishing requirements. They were used for information and audit purposes only, so they all followed a common theme: headings, detail (columns), totals, subtotals, and footings. My customer indicated that they had enterprise licenses for Crystal Reports, and it had an Eclipse plugin for designing reports, so I started down the path of authoring reports and coming up with a quasi-generalized way to launch them. I soon realized that the Crystal Reports learning curve, along with the required server runtime binary, was making this solution more complicated than it needed to be.

The other option was to enable the Java developers to write their own reports using pure Java code. I’d rather teach Java programmers how to apply a Java framework than a report editing tool, so I decided to create a framework that generalizes the report layout requirements above using OO design patterns and techniques. This approach was successful: numerous reports were created quickly, and because the framework was pure Java code and easily digested, new requirements that crept up during report development were easily accommodated. The remainder of this post describes the design of the framework and, at the end, provides a link to a fully functioning implementation on GitHub.

Report Framework Design

The three main objects of the framework are the report iterator, the report processor, and the report writer. The developer implements a report iterator, whose responsibility is to provide input records to the framework when asked. The report processor “processes” the input and then forwards report records to a writer implementation. For extensibility, the writer is typed as an interface. A PDF writer is supplied, but HTML, CSV, XML, JSON, or other implementations could be created.

Here’s the model. The ReportProcessor and ReportPDFWriter classes are provided with the framework; the developer implements an iterator and a factory. The factory defines a report’s layout (i.e. columns, headings, footings, groups, totals, etc.).

Report Writer

The ReportIterator contract is to supply a List of Data objects, which models a report row, whenever the processor requests one. The processor asks for a row of data by invoking nextRow(), and the iterator returns null when no rows remain. The Data type is supplied by the framework and provides convenience methods for implementations, as the example code later in this post shows.
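To make the pull-style contract concrete, here is a small sketch of an iterator and the kind of read loop a processor might run against it. The types are stand-ins I invented for illustration (RowIterator, QueueIterator, and rows typed as List<String> rather than the framework’s Data); the framework’s own classes are not reproduced here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

// Stand-ins for illustration; not the framework's ReportIterator or Data types.
class IteratorContractSketch {

    // Assumed shape of the contract: one row per call, null when exhausted.
    interface RowIterator {
        List<String> nextRow();
    }

    // An in-memory iterator backed by a queue of rows.
    static final class QueueIterator implements RowIterator {
        private final Queue<List<String>> rows = new ArrayDeque<>();

        QueueIterator(List<List<String>> source) {
            rows.addAll(source);
        }

        @Override
        public List<String> nextRow() {
            return rows.poll(); // poll() returns null once the queue is empty
        }
    }

    // What a processor's read loop might look like: pull rows until null.
    static int drain(RowIterator iterator, List<List<String>> sink) {
        int count = 0;
        List<String> row;
        while ((row = iterator.nextRow()) != null) {
            sink.add(row);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> source = new ArrayList<>();
        source.add(Arrays.asList("KO", "Coca-Cola"));
        source.add(Arrays.asList("IBM", "International Business Machines"));

        List<List<String>> sink = new ArrayList<>();
        int read = drain(new QueueIterator(source), sink);
        System.out.println(read + " rows read");
    }
}
```

Because the iterator only has to answer “what is the next row?”, the data can come from a database cursor, a file, or a web service without the processor knowing the difference.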

The ReportPDFWriter is supplied by the framework and uses the iText library to turn directives sent by the report processor into iText PDF elements in order to generate a PDF document.

Report directives are received by a report writer implementation through the write(String[] row) method. The String[] argument represents a report row and its column values. The first element of the array contains a directive id indicating the row type (e.g. header, detail, footer), so the writer implementation can format the row appropriately.
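As a rough illustration of how an alternative writer might dispatch on that first element, here is a plain-text sketch. The directive ids (HEADER, DETAIL, FOOTER), the RowWriter interface, and the output formats are assumptions of mine; the framework defines its own ids and writer interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: the ids and types below are not the framework's.
class DirectiveWriterSketch {

    // Hypothetical directive ids; the framework defines its own values.
    static final String HEADER = "HEADER";
    static final String DETAIL = "DETAIL";
    static final String FOOTER = "FOOTER";

    // Assumed shape of the writer contract described above.
    interface RowWriter {
        void write(String[] row);
    }

    // A plain-text writer: element 0 selects the formatting,
    // the remaining elements are the column values.
    static final class TextWriter implements RowWriter {
        final List<String> lines = new ArrayList<>();

        @Override
        public void write(String[] row) {
            String directive = row[0];
            String[] values = Arrays.copyOfRange(row, 1, row.length);
            switch (directive) {
                case HEADER:
                    lines.add("== " + String.join(" ", values) + " ==");
                    break;
                case DETAIL:
                    lines.add(String.join(",", values));
                    break;
                case FOOTER:
                    lines.add("-- " + String.join(" ", values) + " --");
                    break;
                default:
                    throw new IllegalArgumentException("Unknown directive: " + directive);
            }
        }
    }

    public static void main(String[] args) {
        TextWriter writer = new TextWriter();
        writer.write(new String[] { HEADER, "Stock", "Information" });
        writer.write(new String[] { DETAIL, "KO", "38.50" });
        writer.write(new String[] { FOOTER, "End", "of", "report" });
        writer.lines.forEach(System.out::println);
    }
}
```

Keeping the directive in the row itself means the processor never needs to know which output format it is driving.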

Report layout information such as columns, headings, footings, totaling, and grouping is defined in a report factory implementation. Factories are defined by extending the framework-supplied ReportFactory class.

Stock Pricing Usage Example

To see how a report is generated, let’s create a PDF report of pricing information for a set of stocks. In this example we will read stock information from a public Yahoo URL that provides near real-time stock market information. Of course, for many enterprise applications, reading will occur from a relational data source. A link to the complete source code for this implementation on GitHub appears at the end of this post. Here’s the reader implementation:

/**
 * Reads current stock info from YAHOO finance URL and parses into report column rows
 *
 * @author dpitt www.keyholesoftware.com
 *
 */
public class StockReportIterator implements ReportIterator {

	// Some methods omitted for brevity..

	/**
	 * Returns the next report row, or null when done.
	 */
	public List<Data> nextRow() {

		Stock stock = stocks.poll();
		if (stock == null) {
			return null;
		}

		List<Data> cols = new ArrayList<Data>();
		cols.add(Data.convertToData(TICKER, stock.ticker));
		cols.add(Data.convertToData(NAME, stock.name));
		cols.add(Data.convertToData(TRADE_DATE, stock.tradeDate));
		cols.add(Data.convertToData(PRICE, stock.price));
		cols.add(Data.convertToData(PE, stock.pe));
		cols.add(Data.convertToData(DIVIDEND_YIELD, stock.dividendYield));

		return cols;
	}

	// Some methods omitted for brevity..

}

The report layout is defined by implementing a report factory. A report factory extends the framework’s ReportFactory class, as shown below:

public class StockQuoteReportFactory extends ReportFactory {
…
}

Report columns are defined by overriding getColumns(), and each column can be marked as grouped or totaled:

	@Override
	public List<Column> getColumns() {
		List<Column> cols = new ArrayList<Column>();
		cols.add(Column.New(TICKER, TICKER));
		cols.add(Column.New(NAME, NAME));
		cols.add(Column.New(TRADE_DATE, TRADE_DATE));
		cols.add(Column.NewNumeric(PRICE, PRICE));
		cols.add(Column.NewNumeric(PE, PE));
		cols.add(Column.NewNumeric(DIVIDEND_YIELD, DIVIDEND_YIELD));
		return cols;
	}

Headings and footers are defined by returning a String[] array in which each element is one header or footer line. ~ characters indicate left, center, or right justification of a line. An example is shown below:

	@Override
	// report header with current date and time
	public String[] getHeader() {
		Calendar cal = Calendar.getInstance();
		String date = DateFormatUtils.format(cal, ReportingDefaultConstants.DATE_FORMAT);
		String time = DateFormatUtils.format(cal, ReportingDefaultConstants.TIME_FORMAT);
		// header array
		return new String[] { "Date: " + date, "Time: " + time, "~Stock Information~" };
	}
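The post does not spell out the exact rule for the ~ markers, so the following is only a guess at how such a convention could be interpreted. This sketch assumes that tildes on both sides mean centered (as in "~Stock Information~"), a leading tilde alone means right-justified, and no leading tilde means left-justified. The class and method names are hypothetical and are not part of the framework.

```java
// Illustrative only: the tilde rules below are assumed, not the framework's.
class HeaderJustifySketch {

    enum Align { LEFT, CENTER, RIGHT }

    // Assumed rule: "~text~" is centered, "~text" is right, anything else is left.
    static Align alignOf(String line) {
        boolean lead = line.startsWith("~");
        boolean trail = line.length() > 1 && line.endsWith("~");
        if (lead && trail) {
            return Align.CENTER;
        }
        return lead ? Align.RIGHT : Align.LEFT;
    }

    // Strips at most one marker from each end of the line.
    static String textOf(String line) {
        String s = line;
        if (s.startsWith("~")) {
            s = s.substring(1);
        }
        if (s.endsWith("~")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    // Pads the text on the left so it sits at the assumed position
    // within a fixed-width line.
    static String layout(String line, int width) {
        String text = textOf(line);
        int pad = Math.max(0, width - text.length());
        switch (alignOf(line)) {
            case CENTER:
                return spaces(pad / 2) + text;
            case RIGHT:
                return spaces(pad) + text;
            default:
                return text;
        }
    }

    static String spaces(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + layout("Date: 03/26/2012", 40) + "]");
        System.out.println("[" + layout("~Stock Information~", 40) + "]");
    }
}
```

A PDF writer would map the alignment onto the rendering library’s own alignment settings rather than padding with spaces; the padding here just keeps the sketch self-contained.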

The following shows how to configure the framework objects and generate a stock quote PDF report:

	// output writer with output stream
	ReportPDFWriter writer = new ReportPDFWriter();
	writer.setOut(new FileOutputStream(new File(new URI("file:/users/dpitt/stocks.pdf"))));

	// create processor with factory and iterator
	ReportProcessor processor = new ReportProcessor(new StockQuoteReportFactory(), new StockReportIterator());
	// set writer
	processor.writer = writer;
	// process report
	processor.process();

PDF generated by the configuration above looks like this:

As it turned out, the decision to implement an easy-to-use framework worked out. Reports were produced quickly, and the framework was easy to understand and use. Additionally, the server runtime components of the commercial reporting product, and their associated cost, weren’t required. In many cases building your own framework is not the most efficient decision. There are so many excellent, robust open source frameworks and products available that writing your own solution is usually not justified. Sometimes, however, adopting a framework or product is not as efficient as building your own tailored framework. The decision of which way to go lies collectively in requirements, time, and instinct.

The report writing framework along with the example report is available as open source on GitHub here.

Also, for those of you using the Spring Batch framework, there is a Spring Batch Report Writer version on GitHub here.

— David Pitt, asktheteam@keyholesoftware.com

2 Comments

  1. Mark Adelsberger / Apr 6 2012 00:21

    I found this example particularly interesting as I’ve spent a lot of years working around reporting and analytic applications. (Even before I got pulled into the data warehousing world, much of my prior Java and C++ work had related to a product called Parallel Crystal, which was – or maybe still is, if you know where to find it – enterprise reporting software with a very strange relationship to Crystal Reports.)

    The conventional wisdom is averse to reinventing the wheel, and not without good reason. This example shows how sometimes the off-the-shelf wheel costs more than it delivers. To me it’s important to emphasize that this “build vs. buy” decision depends on the specifics of the situation. While that decision process may always be partially art rather than science, I find it useful to quantify it where possible, to lessen the urge to use a “one-size-fits-all” approach that will sometimes be wrong (and costly).

    In my view, it’s mostly about requirements match-up. What problems do I need to solve? What problems does the tool solve? What are my costs to make the tool solve my problem? What overhead costs will I bear if the tool is hard to specialize from the full range of problems it solves to the specific problems I need it to solve?

    I’m not sure if everyone here has dealt much with off-the-shelf reporting software, so thought it might be worth calling out some details about what requirements they’re good at meeting, and why (it seems to me) they turned out to be a bust in the above scenario.

    First, if I think about where commercial reporting packages are at their best, I think about letting non-programmers write reports. When the Business Objects reporting suite (of which Crystal is essentially now a component) is put to best use, it’s enabling dynamic and ad-hoc analytic operations by business users. But all of that value is inapplicable to this problem. The tools are notably less good at leveraging the existing skills and knowledge of developers; they aren’t in their “sweet spot” if the task is enabling developers to generate specific canned reports for the business to passively consume.

    I’m half-inclined to call that a special case. The reporting tools were simply not geared to the audience who would be using them. On the other hand, maybe the general take-away is that identifying the best tool for a job may require more detailed requirements than you’d initially think.

    Second, even looking more specifically at Crystal, its strength is in flexible report template design and advanced features for organizing the data merged into the report. Again, most of those features aren’t called for if every report is a simple table with control breaks. My experience was that a lack of modular design ensured that you’d pay in learning curve for much more than you might really need to use.

    Light-weight, modular frameworks really gain an edge in being cost-effectively applicable to a wider range of problems when they enable you to not just USE only what you need, but LEARN only what you need. Consider Spring, which (among other things) provides a great deal of flexibility in how you can configure various components, but then provides XML configuration with namespaces that directly facilitate the most common configurations.

  2. David / Apr 9 2012 15:27

    Could not have said better myself, thanks David
