PQPF Verification System Design

Introduction

The Probabilistic Quantitative Precipitation Forecast Verification System (PQPFVS) is a software program, currently in development, that is being implemented in ANSI C++. The system has two primary parts: the preprocessor and the verification logic. The preprocessor takes raw forecast and observational data, formats the data, and creates the forecasts and observations that will be used for verification. The verification logic calculates calibration scores and informativeness scores for the different forecast elements with respect to the observations.

This document provides only a high-level explanation of the classes involved in the software. For detailed descriptions of the classes, their member functions and member data, refer to the detailed class documentation.

Preprocessor Design

Two sets of input data are needed: forecast data and the observational data used to evaluate the forecasts. Both sets of data contain multiple elements. Each element is represented as a matrix of points (called a grid) that covers a large area. The data must first be divided into smaller sets that represent particular areas, such as river basins or squares of some predetermined size. Next, the raw data must be processed to produce forecasts and observations. These are the tasks of the preprocessor.

The preprocessor part of the PQPFVS was built to resemble the Builder Pattern (Gamma et al. 1995). The idea behind the Builder Pattern is that an interface is specified for an operation; different ways of carrying out the operation can then be implemented without affecting the other parts of the system. For the PQPFVS the interface in question is the DataFormatter class. Since it was known at the outset that a variety of methods for formatting grids would be necessary, the DataFormatter class was made abstract. The two classes that use the DataFormatter, namely ForecastHandler and ObservationHandler, interact only with the specified interface, while the actual work is done by a subclass of DataFormatter that was specified as input to the handler class.
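The arrangement described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the member function format(), the stand-in types GridStub and AreaStub, the two example formatters, and HandlerSketch are all hypothetical, since the real signatures are not given in this document.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the Grid and Area data classes.
struct GridStub { std::vector<double> values; };
struct AreaStub { std::vector<double> values; };

// Abstract interface: the handlers see only this class.
class DataFormatter {
public:
    virtual ~DataFormatter() {}
    // Assumed signature; the real member functions are not listed here.
    virtual std::vector<AreaStub> format(const GridStub& grid) const = 0;
};

// Illustrative formatter: the whole grid becomes a single area.
class WholeGridFormatter : public DataFormatter {
public:
    std::vector<AreaStub> format(const GridStub& grid) const {
        AreaStub area;
        area.values = grid.values;
        return std::vector<AreaStub>(1, area);
    }
};

// Illustrative formatter: every grid point becomes its own area.
class PerPointFormatter : public DataFormatter {
public:
    std::vector<AreaStub> format(const GridStub& grid) const {
        std::vector<AreaStub> areas;
        for (std::size_t i = 0; i < grid.values.size(); ++i) {
            AreaStub area;
            area.values.push_back(grid.values[i]);
            areas.push_back(area);
        }
        return areas;
    }
};

// Handler stand-in: it depends only on the DataFormatter interface.
class HandlerSketch {
public:
    explicit HandlerSketch(const DataFormatter& formatter)
        : formatter_(formatter) {}
    std::size_t countAreas(const GridStub& grid) const {
        return formatter_.format(grid).size();
    }
private:
    const DataFormatter& formatter_;
};
```

Swapping WholeGridFormatter for PerPointFormatter changes the result of countAreas() without any change to HandlerSketch, which is the property the design relies on.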

Preprocessor Class Overview

There are six primary classes in the preprocessor, and they form its basis. The classes are DbInterface, ForecastHandler, ObservationHandler, Forecast, Observation and DataFormatter.

Two data classes are used to pass different forms of data between classes: Grid and Area.

Preprocessor Class Descriptions

Primary Classes

The DbInterface is a simple interface class that interacts with the database that holds all of the raw observational and forecast data. The main job of DbInterface is to collect the specified data from the database and construct Grid objects.

The requests for Grid objects come from ForecastHandler and ObservationHandler (referred to below as the handlers). The handlers are constructed by the user according to which Forecasts or Observations are desired. The handlers request specific Grids from the DbInterface and then ask the specified DataFormatter to format the Grids into Areas. This generally means that a particular basin is extracted from the Grid and returned in the form of an Area. These Areas are then used to construct either Forecasts or Observations, depending on which handler was called. The result is that a ForecastHandler object will have a vector of Forecasts available and an ObservationHandler object will have a vector of Observations available. Both ForecastHandler and ObservationHandler are children of the GridDataHandler template class, which implements a large amount of the functionality common to the two handler classes.
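One way a template base class can hold the functionality common to both handlers is sketched below. All names ending in Sketch, the process() and build() members, and the use of a single double per area are assumptions made for illustration; the real GridDataHandler interface is not described here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical product types; the real Forecast and Observation hold more.
struct ForecastSketch { double value; };
struct ObservationSketch { double value; };

// Shared behaviour lives in the template base; Product is the type built.
template <typename Product>
class GridDataHandlerSketch {
public:
    virtual ~GridDataHandlerSketch() {}
    // Build one Product per formatted area value and store it.
    void process(const std::vector<double>& areaValues) {
        for (std::size_t i = 0; i < areaValues.size(); ++i) {
            products_.push_back(build(areaValues[i]));
        }
    }
    const std::vector<Product>& products() const { return products_; }
protected:
    // The only step that differs between the two handlers.
    virtual Product build(double areaValue) const = 0;
private:
    std::vector<Product> products_;
};

class ForecastHandlerSketch : public GridDataHandlerSketch<ForecastSketch> {
protected:
    ForecastSketch build(double areaValue) const {
        ForecastSketch f;
        f.value = areaValue;
        return f;
    }
};

class ObservationHandlerSketch
    : public GridDataHandlerSketch<ObservationSketch> {
protected:
    ObservationSketch build(double areaValue) const {
        ObservationSketch o;
        o.value = areaValue;
        return o;
    }
};
```

After process() has run, each handler exposes its own vector of results through products(), mirroring the vector of Forecasts or Observations described above.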

The Forecast class creates a PQPF from raw forecast data, and the Observation class combines the hourly observational data into a usable format.

The DataFormatter class is an abstract base class that provides an interface for different formatting methods. Two formatting methods that have been implemented to this point are Squares and Basins. Squares simply divides a Grid into Areas in the shape of a square of specifiable size. Basins creates Areas for particular basins and sub-basins from the Grids.
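The kind of partitioning performed by Squares might look like the sketch below. The function name divideIntoSquares, the index-pair representation of points, and the choice to clip squares at the grid edges are all assumptions; the document does not say how the real class treats grids whose size is not a multiple of the square size.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<std::size_t, std::size_t> PointIndex;  // (row, column)
typedef std::vector<PointIndex> SquareSketch;

// Split a rows x cols grid into side x side squares, scanning row by row.
// Squares on the bottom and right edges are clipped to the grid (assumption).
std::vector<SquareSketch> divideIntoSquares(std::size_t rows,
                                            std::size_t cols,
                                            std::size_t side) {
    std::vector<SquareSketch> squares;
    if (side == 0) {
        return squares;  // no valid square size, so nothing to produce
    }
    for (std::size_t r0 = 0; r0 < rows; r0 += side) {
        for (std::size_t c0 = 0; c0 < cols; c0 += side) {
            const std::size_t rEnd = std::min(r0 + side, rows);
            const std::size_t cEnd = std::min(c0 + side, cols);
            SquareSketch square;
            for (std::size_t r = r0; r < rEnd; ++r) {
                for (std::size_t c = c0; c < cEnd; ++c) {
                    square.push_back(PointIndex(r, c));
                }
            }
            squares.push_back(square);
        }
    }
    return squares;
}
```

For a 4 x 4 grid and a side of 2 this yields four squares of four points each.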

Data Classes

The two data classes are Grid and Area. A Grid object stores and allows access to the data held in an xmrg file, the file format in which all forecast and observational data is stored. Area objects are created by DataFormatter subclasses. An Area consists of a vector of vectors of points. Each vector of points represents the points in a sub-basin, and collectively the vectors of points form the basin. If a format is used in which there are no sub-basins, an Area simply holds one vector of points.
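The vector-of-vectors layout of an Area can be sketched as follows. PointSketch, AreaSketch, makePoint and the member functions shown are hypothetical names chosen for illustration; only the nesting of sub-basins inside a basin comes from the description above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical point type: a grid position and the value stored there.
struct PointSketch {
    int row;
    int col;
    double value;
};

PointSketch makePoint(int row, int col, double value) {
    PointSketch p;
    p.row = row;
    p.col = col;
    p.value = value;
    return p;
}

// An area is a basin made of sub-basins; each sub-basin is a vector of points.
class AreaSketch {
public:
    void addSubBasin(const std::vector<PointSketch>& points) {
        subBasins_.push_back(points);
    }
    std::size_t subBasinCount() const { return subBasins_.size(); }
    // Total number of points across all sub-basins.
    std::size_t pointCount() const {
        std::size_t total = 0;
        for (std::size_t i = 0; i < subBasins_.size(); ++i) {
            total += subBasins_[i].size();
        }
        return total;
    }
    // Mean of all point values in the basin; 0.0 for an empty area.
    double meanValue() const {
        const std::size_t n = pointCount();
        if (n == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (std::size_t i = 0; i < subBasins_.size(); ++i) {
            for (std::size_t j = 0; j < subBasins_[i].size(); ++j) {
                sum += subBasins_[i][j].value;
            }
        }
        return sum / static_cast<double>(n);
    }
private:
    std::vector<std::vector<PointSketch> > subBasins_;
};
```

A format without sub-basins, such as a square, would call addSubBasin() exactly once.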

Verification Design

The verification classes are fewer in number but contain far more complicated mathematics than the preprocessor classes, which primarily handle and format data.

Verification Class Overview

Verification Class Descriptions

The verification logic consists of three primary classes: PopVerification, FractileVerification and FractionVerification. Each of these classes inherits its interface from the abstract base class Verification. PopVerification calculates the calibration score and informativeness score for the POP. The same values are calculated for the Exceedance Fractiles and Expected Fractions in the associated classes. The other primary task of the Verification classes is to calculate sample statistics. Since each type of verification needs different sample statistics, each class has its own method.

Each of these classes is constructed using a ForecastHandler, an ObservationHandler and beginning and ending dates. The beginning and ending dates are used so that a subset of the days specified by the two handlers can be used for verification. This allows multiple subsets of days to be verified while only reading and formatting the data once.
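The benefit of passing beginning and ending dates can be illustrated with the sketch below. It makes several assumptions: dates are plain yyyymmdd integers rather than Date objects, both endpoints are inclusive, the already formatted data is a simple vector of daily forecast and observation pairs, and the only statistic shown is the observed frequency of precipitation. None of these details, nor the names used, come from the real classes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pairing of one day's POP forecast with what was observed.
struct DailyPairSketch {
    int date;            // yyyymmdd, standing in for a Date object
    double forecastPop;  // forecast probability of precipitation
    bool precipitated;   // whether precipitation was observed
};

// Verifies only the days inside [beginDate, endDate] (inclusive, assumed).
class WindowedVerificationSketch {
public:
    WindowedVerificationSketch(const std::vector<DailyPairSketch>& allDays,
                               int beginDate, int endDate) {
        for (std::size_t i = 0; i < allDays.size(); ++i) {
            if (allDays[i].date >= beginDate && allDays[i].date <= endDate) {
                days_.push_back(allDays[i]);
            }
        }
    }
    std::size_t sampleSize() const { return days_.size(); }
    // Fraction of days in the window with precipitation; 0.0 if empty.
    double observedFrequency() const {
        if (days_.empty()) {
            return 0.0;
        }
        std::size_t wet = 0;
        for (std::size_t i = 0; i < days_.size(); ++i) {
            if (days_[i].precipitated) {
                ++wet;
            }
        }
        return static_cast<double>(wet) / static_cast<double>(days_.size());
    }
private:
    std::vector<DailyPairSketch> days_;
};
```

The same vector of formatted days can be handed to several WindowedVerificationSketch objects with different date ranges, so the data is read and formatted only once.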

Utility Classes

There are a number of utility classes used by this software. These are classes that are not tied to this particular piece of software but are used by it.

Utility Class Overview

The primary utility classes and functions are Date, GenericException, Debug, Riemann and Tostring.

Utility Class Descriptions

The Date class stores a date and time stamp. The class allows easy access to the elements of the date and has features that allow for simple handling.

GenericException is an abstract class. The intent is that GenericException be subclassed for different types of exceptions. For instance, if there is a problem reading the database, an ExternalException will be thrown, since the database is third-party software and external to this software. As another example, if the number of grids returned by the database were incorrect (say 20 observational grids for a day, and hence 4 short), a MissingDataException would be thrown. This allows for simple and flexible exception handling.
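A minimal sketch of such a hierarchy is shown below. The class names come from the description above, but the members (message(), category()) and the helper functions checkGridCount() and classify() are hypothetical; the expected count of 24 hourly grids per day is inferred from the "20 grids, 4 short" example.

```cpp
#include <cstddef>
#include <string>

// Abstract base: the pure virtual category() prevents direct instantiation.
class GenericException {
public:
    explicit GenericException(const std::string& message)
        : message_(message) {}
    virtual ~GenericException() {}
    virtual std::string category() const = 0;
    const std::string& message() const { return message_; }
private:
    std::string message_;
};

// Problems in third-party software, such as the database.
class ExternalException : public GenericException {
public:
    explicit ExternalException(const std::string& message)
        : GenericException(message) {}
    std::string category() const { return "external"; }
};

// Fewer (or more) grids than expected were returned.
class MissingDataException : public GenericException {
public:
    explicit MissingDataException(const std::string& message)
        : GenericException(message) {}
    std::string category() const { return "missing-data"; }
};

// Hypothetical check, e.g. 24 hourly observational grids expected per day.
void checkGridCount(std::size_t returned, std::size_t expected) {
    if (returned != expected) {
        throw MissingDataException("unexpected number of grids");
    }
}

// Callers can catch the base class and still learn what went wrong.
std::string classify(std::size_t returned, std::size_t expected) {
    try {
        checkGridCount(returned, expected);
        return "ok";
    } catch (const GenericException& e) {
        return e.category();
    }
}
```

Catching by reference to GenericException lets one handler deal with every subclass, while code that cares about a specific failure can catch the subclass instead.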

The Debug class is a simple class that allows for debug information to be written in different ways without necessitating recompilation. A value is specified in the $APPS_DEFAULTS configuration file and all debug statements with values less than or equal to the value specified will be written out.
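The threshold behaviour can be sketched as follows. In this illustration the threshold is passed to the constructor instead of being read from the $APPS_DEFAULTS file, because the lookup mechanism is not described here; DebugSketch, write() and the "[level] message" output format are likewise assumptions.

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Writes a debug statement only if its level is at or below the threshold.
class DebugSketch {
public:
    // threshold would come from the configuration file in the real class.
    explicit DebugSketch(int threshold, std::ostream& out = std::cerr)
        : threshold_(threshold), out_(out) {}

    // Returns true if the message was written, false if it was filtered out.
    bool write(int level, const std::string& message) {
        if (level > threshold_) {
            return false;
        }
        std::ostringstream line;
        line << '[' << level << "] " << message << '\n';
        out_ << line.str();
        return true;
    }
private:
    int threshold_;
    std::ostream& out_;
};
```

Because the threshold is a run-time value, raising or lowering the amount of debug output requires only a configuration change, not a rebuild.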

Riemann is actually only a single template function. It calculates the Riemann sum (an estimate of the integral) of an arbitrary function over an arbitrary finite range (infinite limits are not handled) to an arbitrary degree of precision, limited only by running time. The function takes a predicate class (a function object) as an argument, with the type of that class as the template parameter. The operator() of this class is called by riemann() and is the function being integrated. The arguments also specify the range over which the integration occurs and the number of slices into which the range is divided.
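A function of this shape might be sketched as below. The name riemannSketch, the argument order, and the choice of sampling each slice at its midpoint are assumptions; the document does not say which sample point the real riemann() uses.

```cpp
// Estimates the integral of f over [lower, upper] with a Riemann sum.
// Each of the `slices` equal-width slices is sampled at its midpoint
// (an assumption made for this sketch).
template <typename Function>
double riemannSketch(const Function& f, double lower, double upper,
                     unsigned long slices) {
    if (slices == 0) {
        return 0.0;  // nothing to sum
    }
    const double width = (upper - lower) / static_cast<double>(slices);
    double sum = 0.0;
    for (unsigned long i = 0; i < slices; ++i) {
        const double x = lower + (static_cast<double>(i) + 0.5) * width;
        sum += f(x);  // calls the function object's operator()
    }
    return sum * width;
}

// Example function object: its operator() is the integrand x^2.
struct SquareFunction {
    double operator()(double x) const { return x * x; }
};
```

Calling riemannSketch(SquareFunction(), 0.0, 3.0, 1000) approximates the integral of x^2 from 0 to 3, whose exact value is 9; increasing the number of slices tightens the estimate at the cost of more evaluations.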

Tostring is another template function. It takes an object of any class that has an operator<< defined for it and converts the output of operator<< to a string by passing it through a strstream.
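The technique can be sketched as follows. Note that this sketch deliberately uses std::ostringstream in place of the strstream mentioned above, since strstream is deprecated in standard C++; the names tostringSketch and LabelSketch are hypothetical.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Converts any value with an operator<< into a std::string.
// Uses std::ostringstream, not the strstream used by the original.
template <typename T>
std::string tostringSketch(const T& value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Example user-defined type with its own operator<<.
struct LabelSketch {
    int id;
    std::string name;
};

std::ostream& operator<<(std::ostream& os, const LabelSketch& label) {
    return os << label.name << '#' << label.id;
}
```

Any type that can be streamed, built-in or user-defined, can then be converted with a single call such as tostringSketch(42).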

Reference

Gamma, E., Helm, R., Johnson, R., Vlissides, J., 1995: Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 97-106.