Modularity
The General Ecosystem Model (GEM) (Fitz et al., 1996) was designed to simulate a variety of ecosystem types using a fixed model structure, in the hope that the generic nature of the model would help alleviate the "reinventing-the-wheel" syndrome of model development. While the GEM approach still seems extremely important for comparisons across ecosystems and scales, it turned out to be insufficient to cover the full variety of ecosystem processes and attributes that come into play when going from one ecosystem type to another. There is too much ecological variability to be represented efficiently within the framework of one general model. Either something important gets missed, or the model becomes too redundant to be handled efficiently, especially within the framework of larger spatially explicit models.

Similarly, when scale and resolution change, different sets of variables and processes come into play. Certain processes that can be considered at equilibrium at a weekly time scale need to be disaggregated and treated dynamically at an hourly time scale. For example, ponding of surface water after a rainfall event is an important process at fine temporal resolution, but may become redundant if the time step is large enough to ensure that all the surface water is either removed by overland flow or infiltrated. Daily fluctuations in net primary productivity, which are important in a model of crop growth, may be less important in a forest model that is to be run over decades with only average annual climatic data available. Once again the general approach may result in either insufficiency or redundancy.

The modular approach is a logical extension of the general approach.
Instead of creating a model general enough to represent all the variety of ecological systems under different environmental conditions, we develop a library of modules that simulate various components of ecosystems, or entire ecosystems, under various assumptions and resolutions. In this case the challenge is to put the modules together, using consistent and appropriate scales of process complexity, and to make them talk to each other within the framework of the full model.

The concept of modularity gained strong momentum with the spread of the object-oriented approach in software development (Silvert, 1993; Sequeira et al., 1997). Reynolds and Acock (1997) offer an extensive discussion of modular design criteria and rules as applied to plant modeling. The features of decomposability and composability are probably the most important ones. The decomposability criterion requires that a module be an independent, stand-alone submodel that can be analyzed separately. The composability criterion, on the other hand, requires that modules can be put together to represent more complex systems.

Decomposability is mostly attained at the conceptual level, when modules are identified among the variety of processes and variables that describe the system. There is a lot of arbitrariness in choosing the modules. The choice may be driven either by purely logical, physical or ecological considerations about how the system operates, or by quantitative analysis of the whole system, in which certain variables and processes are identified as relatively independent of the others.

The composability of modules is usually treated as a software problem. That aspect is usually resolved by the use of wrappers that enable modules to publish their functions and services using a common high-level interface specification language (the federation approach) (CORBA, 1996; Villa and Costanza, 2000).
The other alternative is the design of model specification formalisms that draw on object-oriented methodology and embed modules within a specific modeling environment, which provides all the software tools essential for simulation development and execution (the specification approach) (Maxwell, 1999).

In both cases, as models find themselves in the realm of software developers, the gap between the engineering and the research views of models and their performance starts to grow. From the software engineering viewpoint, the exponential growth of computer performance offers unlimited resources for the development of new modeling systems. With the advent of the Internet it becomes possible to assemble models from building blocks connected over the Web and distributed over a network of computers (Fishwick et al., 1998). New languages and development tools appear faster than their user communities manage to develop.

From the research viewpoint, on the other hand, if a model is to be a useful simplification of reality, it should enable a more profound understanding of the system of interest. It is more important as a tool for understanding processes and systems than for merely simulating them. In this context there is less demand for overwhelmingly complex modeling systems. Existing software may remain on the shelf if it does not really help in understanding the systems. This is probably especially pertinent to models in biology and ecology where, in contrast to physical science or engineering, the models are much looser and "black-box" much of the underlying complexity, because of the difficulty of parameterizing and simulating all of the mechanisms from first principles. Such models may require a good deal of analysis, calibration and modification before they can actually be used. In this case the focus is on model and module transparency and openness.
For research purposes it is much more important to know all the nuts and bolts of a module in order to use it appropriately. The "plug-and-play" feature so much advocated by some software developers becomes a lower priority. In a way it may even be misleading, creating the illusion that models can be constructed simply from prefabricated components, with no real understanding of process, scale and interaction.

Models delivered by means of some of the icon-based systems such as STELLA (HPS, 1995) offer a lot of transparency, especially if they are properly documented. The STELLA software was used to formulate the GEM, which in part contributed to its fairly wide dissemination. STELLA has a number of advantages, but its support for modularity is very limited. There are no formal mechanisms for putting individual STELLA models together and integrating them. STELLA does allow submodels, or sectors, within a larger model (such as the sectors in the GEM), and each sector can be run independently of the others or in any combination. However, there is no easy way to replace a sector or to move it from one model into another.

One of the important features of the Spatial Modeling Environment (SME) (Maxwell and Costanza, 1997) is that it can take individual STELLA models and translate them into a format that supports modularity. In addition to STELLA modules, SME can also incorporate user-coded modules, which are essential for describing various spatial fluxes in a watershed or a landscape. Instead of a general model that should represent all the variety of ecosystems, with SME we can formulate a general modular framework that defines the set of basic variables and the connections between the modules. Particular implementations of modules are flexible and assume a wide variety of components that are to be made available through libraries of modules. The modules are formulated as stand-alone STELLA models that can be developed, tested and used independently.
However, they can share variables that are the same in different modules, using a convention that is defined and supported in the library specification table. When modules are developed and run independently, these variables are specified as user-defined constants, graphs or time series. Within the SME context these variables are updated in other modules to create a truly dynamic interaction.

For spatial dynamics, modules can be formulated in C++. They can use some of the SME classes to access the spatial data, and can then be incorporated into the SME driver and used to update the local variables described within the STELLA modules. In this case it is hard to offer the same level of transparency as with the STELLA modules, so more emphasis should be placed on explicit documentation and comments in the code. We also hope that by presenting the modules of the LHEM on the web, with detailed descriptions of the modules and their functions, we can increase their utility for reuse and further improvement.

At this time the LHEM is in its initial stages of development. It offers a framework for archiving modules that may be used either as stand-alone models describing certain processes and ecosystem components, or put together into more elaborate structures using the SME. For an example of how the LHEM has been used, see the Patuxent Landscape Model (PLM), a fairly complex spatial watershed model that was put together entirely from LHEM modules and then calibrated and used for scenario runs.

General conventions
They include modeling languages, which are computer languages designed specifically for model development, and extendible modeling systems, which are modeling packages that allow users to add their own code if the existing methods are not sufficient for their purposes. In contrast, there are also modeling systems that are completely prepackaged and do not allow any additions to the methods provided. There is a remarkable gap in user-friendliness between these packaged and extendible systems: the less power the user has to modify the system, the fancier the graphical user interface and the easier the system is to learn. From modeling systems we go to extendible models, which are actually individual models that can be adjusted for different locations and case studies. In these the model structure is much less flexible: the user can choose from a limited list of options, and usually only the parameters and some spatial and temporal characteristics can be changed. Similarly, for modeling environments such as SME, the level of user-friendliness is usually inversely related to generality.

To be able to link both unit and spatial modules together, SME adopts certain conventions on how the modules should be described and which data formats can be used. In SME, local modules can be described as sectors in STELLA. Each module is a separate STELLA model. The sector name should begin with the $ sign. In what follows we call state variables, forcing functions and parameters simply variables when they do not need to be distinguished. The variables within a sector are considered to be owned by that module. All external variables, that is, those defined outside the sector borders, can be defined in other modules.
Within a module, to make it operable as a stand-alone model, these external variables should be defined as constants, as time series (say, defined as graphs in STELLA) that change with time, or as functions of some other independent variables. Variables that are shared between modules should have the same name. The SME translator takes the STELLA equations saved as a text file and translates them into an intermediate formalization called the Modular Markup Language (MML) (Maxwell and Costanza, 1997). It finds the shared names and links them together. A config file is produced that contains all the variables from all the modules. This config file can be edited further to change the values used for the variables in the driver. However, these changes do not affect the values that the variables are set to in the STELLA formulations of the modules. Because of STELLA limitations there is no way back from MML or STELLA equations to the STELLA icon-based diagram and modeling tools. Therefore all changes made to the MML formulation, or directly to the driver in C++, will be lost if we export and process a new STELLA equations file.

Whereas most of the local dynamics can be effectively described within STELLA models, it is hard if not impossible to represent spatial processes using this formalism. To link individual local models into a spatial network, SME can again be used, if the appropriate code is provided. SME allows one to link C++ programs, described as UserCode, with the local ordinary differential (or difference) equations (ODEs) generated from the STELLA formulations. A number of SME classes are made available for writing user code in order to provide access to the spatial and non-spatial data structures handled by SME.
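The name-based linking convention described above can be illustrated with a minimal Python sketch. It is not the SME translator and does not reproduce MML; the Module class, the link_modules function and the module and variable names are all hypothetical, chosen only to show how external variables might be matched by name to the modules that own them, while unmatched ones keep their stand-alone placeholder values.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical stand-in for one STELLA sector exported as a module."""
    name: str                      # sector name; by convention it begins with '$'
    owned: set[str]                # variables defined inside the sector
    # External variables with the constant used when the module runs alone.
    external: dict[str, float] = field(default_factory=dict)


def link_modules(modules):
    """Match each external variable to the module owning the same name.

    Returns (links, unresolved): links maps (module, variable) to the owning
    module; unresolved keeps the stand-alone placeholder value for variables
    that no other module owns.
    """
    owners = {}
    for m in modules:
        if not m.name.startswith("$"):
            raise ValueError(f"sector name {m.name!r} should begin with '$'")
        for var in m.owned:
            owners[var] = m.name

    links, unresolved = {}, {}
    for m in modules:
        for var, placeholder in m.external.items():
            if var in owners:
                links[(m.name, var)] = owners[var]
            else:
                unresolved[(m.name, var)] = placeholder
    return links, unresolved


# Illustrative modules only; these variable names are not taken from the LHEM.
hydrology = Module("$HYDROLOGY", owned={"SURFACE_WATER", "UNSAT_WATER"},
                   external={"LAI": 1.5})
plants = Module("$PLANTS", owned={"LAI", "BIOMASS"},
                external={"UNSAT_WATER": 0.2, "AIR_TEMP": 15.0})

links, unresolved = link_modules([hydrology, plants])
for (module, var), owner in sorted(links.items()):
    print(f"{module}.{var} is supplied by {owner}")
for (module, var), value in sorted(unresolved.items()):
    print(f"{module}.{var} keeps its stand-alone value {value}")
```

In this sketch a variable that no module owns simply remains a constant, which mirrors the idea that a module must stay runnable on its own.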
In addition, because local dynamics are treated in a spatial context in SME, they also acquire the spatial variability associated with parameters that are spatially distributed and related to, say, soil or habitat types. In this case, when moving from one spatial location to another, the same system of ODEs generated from STELLA is solved with a different parameter set, one that is substituted by SME. Currently SME does not incorporate any extensive database features for describing and archiving the numerous parameters encountered in models and modules. However, there are several well-elaborated input mechanisms that allow one to read location-dependent data from various file formats. For example, the habitat-dependent parameters are collected in a file in which the columns represent the different model parameters and the rows describe the various habitats. A parameter described as habitat-dependent in the config file is then read from this file according to the habitat specified for the location by the Land Use map.

Another alternative that we have explored for integrating individual modules and running them jointly is the MADONNA software (Macey and Oster, 1993), which can take STELLA equations, compile them and run them much faster than STELLA can (STELLA interprets the equations on the fly rather than compiling them). In MADONNA it is quite easy to combine equations from several STELLA modules into one Equations file and thus create a new integrated model. Unfortunately, the option of viewing the flowchart diagram of this integrated model is then lost, and the joint model has to be maintained only in the Equations format, forfeiting some of the transparency and visualization that the original modules deliver. For running modular models spatially, SME remains the only feasible software package.
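The habitat-dependent parameter mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CSV layout, the parameter names, the habitat codes, the tiny land use grid and the one-equation biomass model are all invented for the example and do not reproduce the actual SME file formats or any LHEM module.

```python
import csv
import io

# Hypothetical parameter file: rows are habitats, columns are model parameters.
HABITAT_TABLE = """habitat,max_growth_rate,mortality_rate
1,0.05,0.010
2,0.12,0.020
3,0.08,0.015
"""

# Hypothetical land use map: each cell stores the habitat code of that location.
LAND_USE = [
    [1, 1, 2],
    [3, 2, 2],
]


def read_habitat_parameters(text):
    """Parse the table into {habitat_code: {parameter_name: value}}."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        code = int(row.pop("habitat"))
        table[code] = {name: float(value) for name, value in row.items()}
    return table


def step_biomass(biomass, params, dt):
    """One explicit Euler step of dB/dt = (growth - mortality) * B."""
    rate = params["max_growth_rate"] - params["mortality_rate"]
    return biomass + dt * rate * biomass


def step_landscape(biomass_map, land_use, table, dt=1.0):
    """Solve the same local equation in every cell, using the parameter
    set of the habitat that the land use map assigns to that cell."""
    return [
        [step_biomass(b, table[h], dt) for b, h in zip(b_row, h_row)]
        for b_row, h_row in zip(biomass_map, land_use)
    ]


table = read_habitat_parameters(HABITAT_TABLE)
biomass = [[100.0] * 3 for _ in range(2)]
biomass = step_landscape(biomass, LAND_USE, table)
for row in biomass:
    print([round(b, 2) for b in row])
```

The point of the sketch is the lookup chain: the map gives a habitat code for each cell, the code selects a row of the table, and the local equation itself never changes.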
When applying the LHEM, the major complication for the user is putting the modules together in a meaningful and consistent way. In a prefabricated model, the issues of scale consistency are predefined and presumably have already been taken care of by the model developers. With the modular approach, the challenge of combining the modules so that they match the complexity of the system under study and are mutually consistent becomes the task of the library user. Once again this added concern is the price paid for the added flexibility and optimality of the resulting models. In theory, we can envision modeling systems that would keep track of the scales and resolutions of the various processes involved, and automatically allow links only between modules that match these scales. In practice, with all the complexity and uncertainty associated with ecological and socio-economic systems, it may still be a while before such modeling tools appear. In the meantime we think that model transparency will be a very important prerequisite of modularity, especially if the modules are to be used in a research context.