Sustainability—the capacity to endure—has emerged as a concern of central relevance for society. However, the nature of sustainability is distinct from other concerns addressed by computing research, such as automation, self-adaptation, or intelligent systems. It demands the consideration of environmental resources, economic prosperity, individual well-being, social welfare, and the evolvability of technical systems.7 Thus, it requires a focus not just on productivity, effectiveness, and efficiency, but also on the longer-term, cumulative, and systemic effects of technology interventions, as well as lateral side effects not foreseen at the time of implementation. Furthermore, sustainability includes normative elements and encompasses multi-disciplinary aspects and potentially diverging views. As a wicked problem (see the sidebar "Wicked Problems"), it challenges business-as-usual in many areas of engineering and computing research.
The complexity of these integrated techno-socioeconomic systems and their interactions with the natural environment is drawing attention to several areas. These include means for understanding the emergent dynamics of these interactions and for supporting better decision making through predictive simulation and system adaptation. At the heart of this is the notion of a model, an abstraction created for a purpose. Models are used throughout sustainability research (for example, for hydrology or pollution analysis) as well as software engineering (for example, for automated code generation). Models have a long history in research related to sustainability. The Global Modeling (GM) initiatives that started in the 1960s and 1970s developed and used large mathematical dynamic global models to simulate the world, or large portions of it.13 GM was applied to human decision-making in domains such as economics, policy, defense, minimization of poverty, and climate change. The goal of GM is to offer a prediction of the future state of the world, or parts of it, using (perhaps heavily) mathematical equations and assumptions. Mathematical models offer a framework of stability that is useful in domains such as climate modeling, but the same may not hold in social science domains.
In GM, several models can be seen as "modules" of a larger one, where the outputs of one model serve as inputs to others. This vision of modularity was perhaps very advanced for its time. The idea of building models of complex systems from simpler models has progressed enormously in the engineering domains, software engineering included. However, in the social and natural sciences this is not the case.12 Initiatives related to GM, for example, International Futuresa or the GLOBIOM model,b share common qualities with our proposal. However, GM did not treat software engineering practices as a relevant aspect, partly due to the state of software engineering in those years.
Model-driven engineering (MDE) advocates the use of models that are successively refined and help analyze system properties. This article addresses a question at the intersection of MDE and sustainability research: How can we better support and automate sustainability by bringing together models, data, visualization, and self-adaptive systems to facilitate better engagement, exploration, and understanding of the effects that individuals' and organizational choices have on sustainability? We addressed this question with members from the MDE, sustainability design, and sustainability modeling communities, building on earlier contributions.17
The article conducts a focused review of converging research in MDE, data integration, digital curation (see the sidebar "Digital Curation"), public engagement, and self-adaptive systems with the perspective of sustainability as a driving motivation. We draw upon a vision of a highly capable integrated environment that facilitates the integration of models and data from multiple diverse sources and the visual exploration of what-if and how-to scenarios for multiple constituencies. This lens is especially effective for such a review due to sustainability's central relevance and urgency, but also because of the massively heterogeneous nature of the data required to understand sustainability. We note the limitations of existing approaches and the common assumptions around reductionist modeling perspectives, quantification of uncertainty, and resolution of conflicts and contradictions. We leverage these issues to identify and characterize emerging research challenges.
Modeling has been the essential mechanism for coping with complexity. While in science, models are used to describe existing real-world phenomena, in engineering, models are mostly used to describe a system that is to be developed. Thus, engineering models are typically constructive, while scientific models, for example, mathematical and stochastic models, are typically used to predict real-world aspects.
Modeling underpins many activities related to sustainability. As such, research in MDE can provide a framework for conceptualizing and reasoning about sustainability challenges. One key challenge is how to support decision-making and trade-off analysis to guide the behavior of (self-adaptive) systems used for addressing sustainability issues. For this purpose, we present an idealized vision of a conceptual model-based framework, termed Sustainability Evaluation Experience R (SEER) as depicted in the accompanying figure. This system enables broader engagement of the community (for example, scientists, policymakers, and the general public), facilitates more informed decision-making through what-if scenarios, and directly uses these decisions to drive the automatic and dynamic adaptation of self-adaptive systems (SAS).14 We elaborate this vision not as a design for a system to be implemented, but as a framework that enables us to distill the nine main capabilities needed to tackle this multi-disciplinary challenge. Since we argue that MDE is one of the main enablers for a system like the SEER, we contemplate the challenges for MDE research that lie ahead.
Figure 1. The Sustainability Evaluation Experience R.
Vision. This article introduces the SEER, a conceptual entity that brings together sustainability scientists and decision makers, and whose output can be used to guide the dynamic adaptation of an SAS. As such, the SEER focuses on enabling scientists to integrate and then test their heterogeneous models with an existing knowledge base; enabling individuals and policymakers to explore the economic, social, and environmental impact of decisions, investigate trade-offs and alternatives, and express preferences; and automating the acquisition of contextual data and the enactment of decisions by directly feeding into the knowledge that guides the adaptation of an SAS. The SEER provides the context for introducing the nine capabilities.
Model integration. Scientists must be able to continuously integrate new knowledge into the SEER in the form of models or data. For example, an agronomist can contribute a biomass growth model corresponding to a newly discovered cultivation technique, or a city can decide to openly disclose urban data. Scientists can further connect this contributed material to available and relevant open data. Furthermore, they could investigate the consistency and validity of their models by testing them in combination with other existing domain models. This would help scientists to reach a common view or to highlight important divergences for discussion. To this end, the SEER must provide facilities for flexible data and model integration (C1), model curation (C2), as well as enable trustworthy open-world contributions (C3). The SEER should also support those scientists in investigating the consistency between heterogeneous models, accommodating different and possibly divergent world views (C4).
Model exploration and investigation. On the basis of this knowledge, individuals, communities, and policymakers would explore scenarios, evaluate trade-offs along the five sustainability dimensions7 (technological, environmental, social, individual, economic) or planetary boundaries,45 and explore direct, enabling, and structural effects (see the sidebar "Orders of Effects on Sustainability"). Hence, the SEER must enable use by the population at large (C8). For example, a farmer who is considering building a biowaste plant to become energy independent could investigate the consequences of this idea. This analysis must include basic information about the farmer's preferences and the current as-is situation, and must elicit any required information that is not available. To analyze this issue, the SEER needs data and model sources, such as an operational model of the farm or the heating system of the house. The SEER visualizes the analysis results to facilitate exploration. For example, economic analysis might suggest that heating with biowaste is more cost effective than oil. However, the user may doubt this assertion and wish to investigate the result, so the SEER should provide a transparent rationale and quantification of uncertainty (C6), as well as expose the underlying data. In addition to generating what-if scenarios (C5), the SEER should be capable of generating suggestions (C7) of how to reach user-specified goals, including quantifiable impacts.
Model automation. Strategic choices typically require a set of well-defined steps to implement them, a process that can benefit significantly from automation. This is especially pertinent when those steps are controlled by an SAS, for example, in smart cities or smart buildings. In such cases, decisions are used directly to drive the runtime adaptation of the SAS. For example, when a farmer chooses to grow a specific crop, the SEER could continuously adjust the irrigation system to deliver the appropriate amount of water to the fields. Thus, the SEER must perform sustainability evaluation to determine adaptation needs (C9); it thereby enables broader engagement of the various sustainability stakeholders and serves as an adaptation trigger for an SAS.
Here, we revisit the capabilities introduced previously and discuss how techniques from the MDE community and other associated communities can support them.
MDE aims to raise the level of abstraction at which a software system is designed, implemented, and evolved to improve the management of intrinsic complexity.25 In MDE, a model describes an aspect of a system and is typically created for specific development purposes. Separation of concerns is supported through the use of different modeling languages, each providing constructs based on abstractions that are specific to an aspect of a system. But systems like the SEER also require, as a central function, a set of abilities to curate diverse collections of data and manage them throughout a long-lasting lifecycle to address concerns such as authenticity, archiving, transformation, reuse, appraisal, and preservation. In this context, data monitoring involves the continuous, automated acquisition of new datasets.
Accommodate flexible data and model integration (C1). The MDE community has been investigating how to integrate engineering models for various purposes (for example, analyses, code generation, simulation, execution). In addition to comparison operators such as those that can be defined in the Epsilon Comparison Language,c the community has developed various composition operators for model refinement/decomposition,42 model consistency or impact analyses,26 and model merging and weaving.9 While these composition operators have been extensively studied for homogeneous and structural models,15 recent efforts are also considering behavioral and heterogeneous models.33
In the software and systems modeling community, research on domain-specific modeling languages (DSMLs) is investigating technologies for developing languages and tools that enable domain experts to develop solutions efficiently. Unfortunately, the current lack of support for explicitly relating concepts expressed in different DSMLs makes it difficult for domain experts to reason about information distributed across models describing different system views. Supporting the coordinated use of DSMLs led to the grand challenge of the globalization of modeling languages16 and the GEMOC initiative. Beyond current investigations that focus on relating languages with similar foundations, sustainability issues will impose additional research challenges related to multiscale modeling, uncertainty, and approximation or discontinuity.
An alternative to integrating DSMLs is to integrate models by co-simulation or model translation. For example, the functional mock-up interface (FMI) is a tool-independent standard that supports both model exchange and co-simulation of dynamic models using a combination of XML files and compiled C code.d FMI is currently supported by over 130 modeling and simulation tools. Model translation approaches construct model transformation algorithms that integrate models by mapping them into a common modeling formalism. For example, the work described in Castro12 transforms system dynamics models into discrete event system specification (DEVS) models, which can then be integrated further with other discrete modeling formalisms, for example, state automata.
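To make the co-simulation idea concrete, consider the following minimal sketch of a fixed-step co-simulation master in Python. The two toy models are hypothetical stand-ins for FMUs (they merely mirror the set-input/step/get-output pattern of the FMI co-simulation interface); each model advances independently and exchanges outputs at every communication point.

```python
# Minimal fixed-step co-simulation master (illustrative sketch).
# ToySoilModel and ToyControllerModel are hypothetical stand-ins for FMUs,
# mirroring the set-input / step / get-output pattern of FMI co-simulation.

class ToySoilModel:
    """Soil moisture, depleted by evaporation and refilled by irrigation."""
    def __init__(self):
        self.moisture = 0.5       # fraction of field capacity
        self.irrigation = 0.0     # input, set by the master at each step

    def set_input(self, irrigation):
        self.irrigation = irrigation

    def step(self, dt):
        evaporation = 0.02 * self.moisture * dt
        self.moisture = min(1.0, self.moisture - evaporation + self.irrigation * dt)

    def get_output(self):
        return self.moisture


class ToyControllerModel:
    """Proportional controller deciding how much to irrigate."""
    def __init__(self, target=0.6):
        self.target = target
        self.moisture = 0.0

    def set_input(self, moisture):
        self.moisture = moisture

    def step(self, dt):
        pass                      # stateless: output depends only on inputs

    def get_output(self):
        return max(0.0, 0.1 * (self.target - self.moisture))


soil, controller = ToySoilModel(), ToyControllerModel()
dt = 1.0                          # communication step size (hours)
for _ in range(48):               # two simulated days
    # Exchange values at the communication point, then advance both models.
    controller.set_input(soil.get_output())
    controller.step(dt)
    soil.set_input(controller.get_output())
    soil.step(dt)
print(f"final soil moisture: {soil.get_output():.3f}")
```

In a real FMI setting, the master would additionally negotiate step sizes, handle rollbacks, and load the models from packaged FMUs; the essence of the exchange-then-advance loop, however, is the same.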
Unlike traditional software-intensive systems, the SEER requires the integration of numerous scientific models, regulations, preferences, and so on, both when making predictions and when weighing the many trade-offs among potential solutions. The challenges of integrating models within a SEER are due to the following factors:
Different foundations. In traditional MDE, foundational notions, for example, hierarchy/containment or references, are used in constructing models; different notions are used in other modeling spaces (for example, derived attributes in MetaDepth22). The integration process must acknowledge and align these different notions.
Different technological spaces. Models may be constructed using mechanisms from different technological spaces (for example, databases, formulae such as ODEs), with varying assumptions about the basic building blocks of modeling; how those building blocks can be composed; how the well-formedness of models can be established; and how well-formed models can be manipulated.
Different levels and degrees of abstraction. Integrating models involves more than just establishing a consistent vocabulary: disparate models will use different abstractions (for example, patterns specific to the type of model), different layers and layering structures (for example, networking layers versus atmospheric chemistry model layers), and different forms of granularity (a grid of contemporaneous rainfall observations over a large area versus a time series of measurements of cumulative water flow at one location in a river).
Different scales. To integrate models at different scales, a model integration approach would have to clearly distinguish: Which models belong to which layers of abstraction (for example, given a predictive model of evapotranspiration that can be constructed for Earth as a whole, for a continent or a watershed, which one is relevant when integrating this with a model of crop production at a country level?); Which specific model out of a set of alternatives to use when there is no evidence demonstrating superiority of one model over another (for example, with insufficient ground truth to distinguish between two multispectral classifications, what characteristics of the classifiers would help the system to choose an option?); and How conflicts or inconsistencies between models and/or data are resolved (for example, given a set of decision trees that risk being overfitted to their training data, is it necessary to employ an ensemble method such as random forest?).
Different domains. In order to meaningfully integrate data from a variety of domains, it needs to be carefully described with metadata. This should include descriptions of units, phenomena measured, and other conceptual aspects, which are vital for communication when data is released "into the wild."
Composability. A crucial capability for the SEER is to automatically identify which data can and cannot logically be combined. For example, a user might be interested in assessing the economic value of a national park by overlaying its bounds on maps of ecosystem services. Such maps might be calculated in different ways, leading to conflicting results. For example, carbon capture per hectare may be computed for specific land covers by methods which rely on different assumptions about underlying physical processes. Should results derived from such datasets be averaged, or be shown as alternatives? A robust approach to this automated matching requires semantics to describe the underlying worldview implicit in each estimate.
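To illustrate what such automated matching might look like, the following sketch (Python; all field names and values are invented for illustration) compares the descriptive metadata of two datasets and decides whether they may be merged, shown side by side as alternatives, or not combined at all:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DatasetMetadata:
    phenomenon: str                   # what is measured, e.g., "carbon_capture"
    unit: str                         # e.g., "tC/ha/yr"
    assumptions: frozenset = field(default_factory=frozenset)  # implicit worldview

def composability(a: DatasetMetadata, b: DatasetMetadata) -> str:
    """Classify whether two datasets can logically be combined, and how."""
    if a.phenomenon != b.phenomenon or a.unit != b.unit:
        return "incompatible"         # different quantities: do not combine
    if a.assumptions == b.assumptions:
        return "mergeable"            # same worldview: averaging is defensible
    return "alternatives"             # same quantity, conflicting assumptions:
                                      # present side by side, do not average

estimate_a = DatasetMetadata("carbon_capture", "tC/ha/yr",
                             frozenset({"steady-state soil carbon"}))
estimate_b = DatasetMetadata("carbon_capture", "tC/ha/yr",
                             frozenset({"dynamic soil carbon"}))
print(composability(estimate_a, estimate_b))   # -> "alternatives"
```

A production system would of course need richer semantics (ontologies rather than string-labeled assumptions), but the decision structure, combine, contrast, or refuse, is the essential point.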
Curate and evolve models (C2). The SEER must facilitate the continuous management of models to ensure the generation of valid what-if and how-to scenarios. Model management involves supporting updates to models and to model integration. Key activities include model import and creation (for example, scientific model creation out of datasets), enhancement of model quality, and representation of different views.
There are two approaches to scientific model creation: either start with a skeletal model with a few initial data points and incrementally collect relevant data while refining the model relationships; or build a model based on analysis of all accessible data.
From the perspective of the robust management of MDE products over time, version control is essential to reflect the state of the model at the time when a dataset was imported. When this initial dataset does not conform to later, updated versions of the model, maintenance challenges arise for the datasets. Conceptual approaches for version control in MDE have been developed, based on techniques for comparing and differencing models23 as well as for merging models. More recently, tools such as EMF Storee and CDOf have been developed, which are closely aligned with version control systems such as Git. Conflicts are common with such approaches, and hence support for their detection and resolution is critical. Such tools are typically combined with those for comparison and differencing (for detection) and merging (for resolution).
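The core of such comparison and merging can be illustrated with a three-way diff against a common ancestor version. In the following sketch (Python; flat dictionaries stand in for model elements, whereas real tools such as EMF Compare operate on full model graphs), disjoint concurrent edits merge cleanly, while divergent edits to the same property are flagged as conflicts for resolution:

```python
def three_way_diff(base, left, right):
    """Merge two revisions of a model against their common ancestor."""
    merged, conflicts = {}, []
    for key in set(base) | set(left) | set(right):
        b, l, r = base.get(key), left.get(key), right.get(key)
        if l == r:                 # both sides agree (possibly both changed)
            merged[key] = l
        elif l == b:               # only the right revision changed this property
            merged[key] = r
        elif r == b:               # only the left revision changed this property
            merged[key] = l
        else:                      # both changed it differently: a conflict
            conflicts.append((key, l, r))
    return merged, conflicts

base  = {"crop": "wheat", "irrigation": "drip"}
left  = {"crop": "maize", "irrigation": "drip"}       # scientist A's revision
right = {"crop": "wheat", "irrigation": "sprinkler"}  # scientist B's revision
merged, conflicts = three_way_diff(base, left, right)
print(merged)     # crop -> maize, irrigation -> sprinkler (disjoint edits merge)
print(conflicts)  # [] -- no property was changed differently by both sides
```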
From the perspective of digital curation, larger concerns around provenance, authenticity, and stewardship become paramount. The provenance of data has been a central concern in fields such as databases and e-science.11,41,47 Provenance modeling initiatives have focused on conceptual frameworks for representing generally applicable elements that capture provenance information in standardized ways.g Concepts such as research objects capture more than the dataset to support the flexible reuse of various products in research workflows and in particular, model-based scientific workflow software such as Kepler and Taverna.6 Again, data provenance is a central concern and raises new challenges, as we will discuss.
Enable trustworthy open-world contribution (C3). To enable trustworthy open-world contributions, everyone should be allowed to contribute to the SEER, regardless of their social background, domains of expertise, or technical qualifications. A simple example of the utility of such contributions is provided by citizen-science projects. For instance, the U.K.'s Springwatch program enlists radio listeners to report, via text and/or photographs, observations of native wildlife species, which can be a cost-free tool for observing, recording, and, where necessary, taking action to preserve biodiversity. Contributions to the SEER would consist not only of data or models, but also of new mappings or relationships for integrating data and models.
To foster trust toward and use of the contributions, their provenance must be made publicly available. This is essential47 in order to assure potential data users of the quality of the given data (providing answers to questions such as: What is the data source? Were the derivation methods of the current data sound?); support owners and users with an audit trail (Who is using the data? Are there any errors in the data generation?); provide recipes for replicating data derivation in order to maintain the currency of the data; support attribution of data for both copyright and liability assignment purposes; and provide information about the data context, and for data discovery.
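The W3C PROV recommendationg provides a standard vocabulary for exactly such records. As a minimal sketch (using the open source prov Python package; all identifiers and attributes are invented for illustration), a contributed dataset can be linked to its raw source, the derivation activity, and the responsible agent:

```python
# Minimal W3C PROV record for a contributed dataset (illustrative sketch),
# built with the open source `prov` package; all identifiers are invented.
from prov.model import ProvDocument

doc = ProvDocument()
doc.add_namespace("ex", "http://example.org/seer/")

raw = doc.entity("ex:rainfall-raw-2019",
                 {"ex:unit": "mm", "ex:granularity": "per km2"})
clean = doc.entity("ex:rainfall-clean-2019")
cleaning = doc.activity("ex:outlier-removal")
contributor = doc.agent("ex:agronomy-lab")

doc.used(cleaning, raw)                  # the activity consumed the raw data...
doc.wasGeneratedBy(clean, cleaning)      # ...and produced the cleaned dataset
doc.wasDerivedFrom(clean, raw)           # the derivation recipe, for replication
doc.wasAttributedTo(clean, contributor)  # attribution: credit and liability

print(doc.serialize(indent=2))           # exchangeable PROV-JSON record
```

Records of this kind directly answer the audit questions above: where the data came from, how it was derived, and who stands behind it.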
Currently, data curation is often tackled via open-world contributions that carry little provenance, so the quality of the data and of its collection processes is questionable. For example, in the CARMEN bioinformatics project,h researchers can submit data and the metadata that describes it. However, provenance information is limited to the identity of the source. Yet, it is widely acknowledged that, in order to provide credible provenance for scientific workflows, one needs to report the provenance not only of the provided data (for example, its sources and their views, including interests, purpose, concepts, principles, knowledge29) but also of the process through which the data is derived (for example, the methodology and technologies used for data collection).29,47
In MDE's few open repositories for models, for example, ReMoDDi or the ATL Metamodel Zoo,j the situation is even worse, as little information is kept on the provenance or quality of the models, despite the long-established specification of provenance requirements for e-science systems.40
The challenges for trustworthy open-world contributions pertain to the following:
Subject of provenance,47 or the provenance of data and its workflow: It is not clear at what level of detail provenance information needs to be gathered (for example, at what granularity should the data be collected: rainfall per cm2 or per km2?). Which sources are acceptable, for what purposes?29 When pulling together several datasets, or starting analysis for a given purpose, are the data collection methods and technologies used compatible with and appropriate for the said purpose? Who must take responsibility for errors in data collection or derivation? Ultimately, how do the sources, their properties, and the workflow affect the data quality, and how can quality be separated from the notion of provenance itself?
Provenance representation.47 Should data be annotated directly with the provenance details (for example, many scientific workflow tools, such as Taverna, record the provenance data implicitly in event logs21), or should provenance be derived at each workflow stage from the previous one? What syntax and semantics should be used to represent it? Can these be applicable across all kinds of domains, as the SEER has to integrate environmental, economic, technical, societal, individual, policy, and cultural aspects of life?
Storing provenance.47 What are the costs of collecting and storing the provenance data at various granularity? Clearly, the richer the provenance data, the more it will affect the scalability of data collection and storage.
Integration. If the system accommodates the import of new concepts of all kinds, we face integration challenges, for example, finding the most suitable, most reasonable, or most flexible open interfaces and a common description language. Furthermore, the research community must let the ontology evolve iteratively, by adding new parts.
Trust. How do we foster trust in, or calculate the trustworthiness of, a given model's output? How do we build trust models? How can we apply theoretical research models in the real world while large-scale empirical evidence is still missing?
Relationship between risk and trust. How do we deal with the inherent relationship between risk and trust? What are the risks involved in trusting a given model, dataset, or process, and how can these be quantified? Contrary to public perception, high trust does not mean low risk.
Research is ongoing into ways of handling many of these challenges in controlled environments, such as scientific workflows within tools like Tavernak and Keplerl (where datasets and workflows are provided only by scientists, or models by research groups, who stake their professional reputation on the quality of their contributions). When the controls for contributions are removed, however, these challenges redouble and multiply.
Accommodate different world views (C4). The breadth of the impact of sustainability across five dimensions and multiple time scales, from human to global, inevitably brings with it differing and irreconcilable worldviews, and separates stakeholders socially and temporally.
To avoid bias, the SEER should provide all possible futures accommodating multiple and potentially divergent worldviews to the user given the available data and models. Therefore, the SEER must acknowledge that a model is constructed with its own (often implicit) worldview.37 Model integration requires combination of the views, which can be challenging or even impossible if they contradict.
The modeling community deals with situations where worldviews are assumed to be consistent across stakeholders if they share the same modeling background.36 In most engineering environments this is acceptable, since even large-scale systems have an ultimately "bounded" set of stakeholders. In these scenarios, any necessary negotiation of conflicting worldviews is a question of social organization and not addressed in modeling.
Traditional MDE normally resolves contradictions under model integration using constraints and transformations. This is feasible because even when the worldview is not fully shared, there should be overlap arising from agreement on a metamodeling stack (for example, three-tiered) and technology (the Eclipse Modeling Framework, EMF). This cannot be assumed in modeling for sustainability, where the social structure is so disconnected that the common assumption of consistent worldviews in MDE cannot hold. Different modeling schools must be integrated and multiple contradictory worldviews need to be made explicit and embraced.
The worldview has to become an explicit part of the modeling infrastructure, and several possible scenarios arise as noted in the following:
Matching worldviews. In some cases, worldviews can be reconciled. However, there may be no "actual" user/modeler who possesses this integrated view. How can this integrated view be derived/validated?
Incommensurable worldviews and models. Given the fundamentally distinct nature of the concerns of sustainability stakeholders, perspectives on what seems to be a common concern will disagree not only on the relative importance of particular aspects, such as "individual agency," but also on what the concern means and how to evaluate it.
Contradictory worldviews. It should not be assumed that the reconciliation of contradicting worldviews is always desirable and appropriate. Sometimes it may be desirable and useful to keep track of contradictions between models. To illustrate, we provide a few examples of worldviews that disagree at least partially:
Incommensurable. In California, environmental sustainability can be regarded as fundamentally different in the problem context of preserving existing wetlands versus restoring an urban landscape back to the natural desert environment from which it was taken.
Contradictory. In many developing cultures, big families still form the heart of the community. In many developed cultures, family structures have been overshadowed by career paths requiring mobility. One consequence is that two-income families struggle to find local support systems for their children, while grandparents live far away and face lonely old age. Neither worldview is wrong, but they cannot be consolidated completely.
The research challenge arising from this is not an unrealistic attempt at consolidating all existing worldviews. Instead, what we need are modeling concepts and mechanisms that allow us to contrast different worldviews to illustrate and explore conflicts between the assumptions and implications of two or more worldviews.37 One option would be to use system dynamics to reach a group consensus and enhance systems thinking.50
However, system dynamics on its own is arguably incapable of securing consensus.30 Because it lacks the awareness of social theory required to distinguish consensus from coercion, it must be positioned within a critically aware systems thinking framework that reflects upon its own selectivity, aims to emancipate marginalized perspectives and worldviews, and allows for pluralism in methods and theories.39
A useful starting direction in tackling these issues could be provided by model-documenting guidelines (for example, the ODD protocol28) that help to systematize and disambiguate categorizations of heterogeneous models, though full resolution of integration of such models is an open challenge.
Generate what-if scenarios (C5). The system should support the generation of what-if scenarios based on multiple types of models to project the scenarios' effects with regard to the five sustainability dimensions. Interactive exploration of a scenario, as well as of the data and models involved, should be possible. Here, it is important that the user of the SEER gets a feel for how a possible future scenario may look and what effects the anticipated changes will have on the different sustainability dimensions. For example, what would a world look like that no longer used fossil fuel? To help SEER users understand the what-if scenarios and make the experience even more tangible, visualization techniques going beyond the presentation of numbers are needed.
What-if scenarios require query formulation, which is supported through query languages. These languages have been investigated by the MDE community with an intensive focus on automatic model management (for example, constraints, views, transformation). MDE provides languages for expressing structural queries based on first-order logic (OCLm), for the use of optimization and search techniques combined with models,24 as well as for behavioral queries based on temporal logic.38 These languages rely on the modeling language specification for expressing queries related to the corresponding concepts or their associated behavioral semantics. The concept of model experiencing environments (MEEs)43 has been introduced as an approach to support complex model and data integration, while offering customizable interfaces to access model analysis results and their visualizations.
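The mechanics of a what-if query can be sketched compactly. In the following Python fragment (the as-is model, the impact models, and all figures are invented for illustration), a proposed change is applied to the as-is model and its effects are projected, per sustainability dimension, as deltas against the baseline:

```python
# What-if scenario sketch: apply a proposed change to the as-is model,
# then project the impacts per dimension. All models and figures are invented.

as_is = {"heating": "oil", "annual_kwh": 20000}

IMPACT_MODELS = {
    # Each entry stands in for a contributed scientific model.
    "economic (EUR/yr)":
        lambda m: m["annual_kwh"] * (0.11 if m["heating"] == "oil" else 0.06),
    "environmental (tCO2/yr)":
        lambda m: m["annual_kwh"] * (0.00027 if m["heating"] == "oil" else 0.00002),
}

def what_if(model, change):
    scenario = {**model, **change}              # the hypothetical to-be model
    impacts = {dim: f(scenario) - f(as_is)      # delta against the baseline
               for dim, f in IMPACT_MODELS.items()}
    return scenario, impacts

scenario, impacts = what_if(as_is, {"heating": "biowaste"})
for dim, delta in impacts.items():
    print(f"{dim}: {delta:+,.2f} versus as-is")
```

The hard research problems hide inside the lambdas: in the SEER, each impact model is itself an integrated, curated, uncertainty-annotated model rather than a one-line formula.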
The need for broad engagement with diverse communities and decision makers requires an ability to process questions articulated within the mental models and terminologies used by those communities, and to support compatibility and mapping across domains. Different impacts must be presented back to the user (using different kinds of visualizations), in such a form that the indicators and their underlying assumptions can be deeply and interactively analyzed for a better understanding. Current practices must be adapted to support the what-if scenario capability. This requires bridging the gap between the indicators and the modeling concepts manipulated by the SEER. The user must be able to express the indicators of interest, and the specific views to be used for representing them.
Provide transparent reasoning and quantification of uncertainty (C6). If users do not feel they understand what is happening in a system and why, they are less likely to trust it. Therefore, trustworthiness can only be established if the reasoning provided by the SEER is transparent, meaning users can understand where data comes from, to what degree it is reliable, and how it is combined in order to generate predictions.
Intra-model relationships have been a general focus of interest in the MDE community. User-defined mappings between MDE models are supported via model management tools such as the ATLAS Model Weaver, EMF Compare, or the Epsilon Comparison Language (ECL). These approaches enable users to describe mappings between models and model elements and to attach semantics to the relationships that are produced. Such models are usually within a single technological space (for example, EMF). There are also software component interface definitions, such as OpenMI and Taverna, which provide APIs that allow models to be configured to exchange data at runtime within workflows. While such technology is meant to be model agnostic, it supports the connection of models only from within a single technological space. Additionally, such frameworks effectively focus on mappings between data, where the models are used to enable the construction of such mappings.
There has been limited research in the MDE community on dynamic model selection from a large set of models or on runtime conflict resolution between models from disconnected domains and disciplines (most conflict resolution has focused on resolution between models from single or related domains). Current work on justifying model integration reasoning centers on topics such as edit-aware modeling tools that keep track of the steps modelers take in modifying a model (for example, Altmanninger et al.2) and tool support for keeping track of all versions of a model (Sparx Time Aware Modeling, MagicDraw Comparer, EMF Compare).
In goal modeling, the impact of alternative solutions on stakeholders' objectives is modeled to allow reasoning about trade-offs. Based on such models, explanations may be given of what influences what. It is still challenging to generate clear explanations of scenarios built on top of widely different types of models, each requiring different argumentation and concepts. For example, when analyzing a chart with a Pareto front to make an allocation decision, the farmer might see a cut-off on one dimension. She might ask "but couldn't I do this?", for example, increase output beyond x. The SEER would need to be able to explain that the Pareto front takes into account not only physical possibilities but also legal constraints.
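A minimal sketch of such an explanation facility (Python; the options, figures, and the legal constraint are invented): each candidate allocation records the constraint that rules it out, so when a user asks why output cannot exceed the cut-off, the SEER can surface the binding constraint rather than a bare front:

```python
# Pareto front with explainable cut-offs (illustrative toy data).
options = [
    {"output": 80,  "cost": 50, "feasible": True,  "why": None},
    {"output": 95,  "cost": 70, "feasible": True,  "why": None},
    {"output": 120, "cost": 75, "feasible": False, "why": "exceeds legal water quota"},
]

def pareto_front(opts):
    """Keep feasible options that no other feasible option dominates."""
    feas = [o for o in opts if o["feasible"]]
    return [o for o in feas
            if not any(p["output"] >= o["output"] and p["cost"] <= o["cost"]
                       and p is not o for p in feas)]

def explain(target_output):
    """Why can't we push output to the given level?"""
    blockers = [o["why"] for o in options
                if not o["feasible"] and o["output"] >= target_output]
    return blockers or ["no recorded constraint; the models may need extending"]

print([o["output"] for o in pareto_front(options)])  # [80, 95]
print(explain(110))  # ['exceeds legal water quota'] -- law, not physics
```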
Within the domain of environmental modeling, there has been some consideration of integration challenges,52 particularly in relation to the propagation of uncertainty through a series of chained models and its communication in a usable form at the end of the analysis.3 'Models' in that context, however complex, are concrete mathematical transformations that represent physical processes such as soil erosion, or non-physical processes such as market fluctuations. As such they are materializations of the more abstract class of models with which the SEER must work, and form just part of the set of components of which it must be composed.
However, many of the insights from this research also apply to an integrated system such as the SEER: for example, the importance of semantics and controlled vocabularies in describing requirements, constraints, or phenomena, and the fact that physical models may also be matched and merged as appropriate.
The uncertainty of available data and information hinders the precise specification of certain models and their parameters. Uncertainty may be, for example, epistemic, linguistic, or random27 and can derive from many sources, including measurement, data transformation, inaccurate definition of the phenomenon of interest, or generalizations made to ensure tractable computation. As such, uncertainty analysis (UA) and sensitivity analysis (SA) are prerequisites for model building.18 While UA aims to quantify the overall uncertainty associated with the model response as a result of uncertainties in the model input, SA can be used to quantify the impact of parameter uncertainty on the overall simulation/prediction uncertainty. This makes it possible to distinguish between high-leverage variables, whose values have a significant impact on the system behavior, and low-leverage variables, whose values have minimal impact on the system.31,54 Such approaches can be used for various purposes, including model validation, evaluating model behavior, estimating model uncertainties, decision-making using uncertain models, and determining potential areas of research,34 and a variety of SA techniques have been developed to achieve such purposes.35 However, federating several models is likely to enlarge the parameter space, which will require the automated detection of hotspots in the parameter space using approaches such as the ones proposed by Danos et al.20
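A minimal Monte Carlo sketch of UA together with a one-at-a-time SA is shown below (plain NumPy; the biomass response model and all parameter ranges are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

def biomass(rain, temp, fert):
    """Toy biomass response model (illustrative only)."""
    return 2.0 * rain + 0.5 * temp ** 2 + 10.0 * fert

# Uncertainty analysis: propagate input uncertainty to the model response.
N = 10_000
rain = rng.normal(600, 60, N)      # mm/yr, with measurement uncertainty
temp = rng.normal(15, 1.5, N)      # degrees Celsius
fert = rng.uniform(0.0, 1.0, N)    # normalized fertilizer dose
y = biomass(rain, temp, fert)
print(f"response mean {y.mean():.0f}, overall std {y.std():.0f}")

# One-at-a-time sensitivity: vary one input, hold the others at nominal values.
nominal = {"rain": 600.0, "temp": 15.0, "fert": 0.5}
for name, spread in [("rain", 60), ("temp", 1.5), ("fert", 0.29)]:
    inputs = dict(nominal)
    inputs[name] = rng.normal(nominal[name], spread, N)
    yi = biomass(**inputs)
    print(f"{name}: output std {yi.std():.1f}")   # high std = high-leverage variable
```

In this toy setup, rainfall emerges as the high-leverage variable and fertilizer dose as a low-leverage one; in a federated setting, the same analysis must scan a much larger, composed parameter space, hence the need for automated hotspot detection.20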
Nevertheless, not all sources of uncertainty are known, and many are difficult to quantify. Uncertainty whose sources can be assessed statistically may be communicated, for example, using probabilities, which are easily combined across a wide variety of well-supported frameworks and languages, for example, UncertML.51 Fuzzy sets are more complex to combine across domains but can still be represented in mathematical form. However, on many occasions a quality assessment is not easily mapped to a value scale, or a problem does not become apparent until a dataset or model is used or compared to better alternatives that were not originally available. This is a clearly recognized challenge in citizen science, where a number of initiatives aim to harmonize metadata standards,4 to adapt existing data formats to the citizen science context,48 to develop robust ontologies to capture heterogeneous data collection protocols, and to allow flexible annotation by contributors and expert evaluators alike.n,5 Only through such concerted efforts can a potential user assess whether the reliability of a contributed resource matches their criteria, making it fit for purpose.
Generate suggestions (C7). The system should be capable of generating suggestions of how to achieve the user's specified goals. This generation of suggestions is based on the capability to create what-if scenarios (C5), as those are needed to build a knowledge base for a recommender system. Based on such a what-if scenario knowledge base, a recommender system can generate how-to scenarios by using model inference. Inferred models can be compared to current ones, and criteria can be applied to select the most appropriate candidate solutions, for example, the closest to the current situation. Therefore, the SEER must calculate different alternatives to minimize negative impact on the different sustainability dimensions. To do so, the system must be informed of what a user may and may not change; for example, they cannot change the weather. Furthermore, the SEER needs to know user preferences in order to make adequate individual suggestions. Such user preferences include the modeling view of the system under consideration, the agency over individual elements, and the scale at which they can be changed. The preferences could even be changed at runtime and the model recalculated based on the updated constraints.49
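The skeleton of such how-to generation can be sketched simply (Python; the scenario space, the water-use model, the goal, and the "fewest changes" ranking are all invented for illustration): enumerate what-if scenarios over the parameters the user can actually change, filter those that meet the goal, and rank by closeness to the current situation:

```python
from itertools import product

current = {"crop": "wheat", "irrigation": "drip", "weather": "temperate"}
controllable = {"crop": ["wheat", "maize", "soy"],
                "irrigation": ["drip", "sprinkler"]}    # weather is not changeable

def water_use(s):          # stand-in for an integrated scientific model
    base = {"wheat": 450, "maize": 550, "soy": 400}[s["crop"]]
    return base * (1.0 if s["irrigation"] == "drip" else 1.3)

def changes(s):            # preference: suggest options close to the status quo
    return sum(s[k] != current[k] for k in controllable)

goal = lambda s: water_use(s) <= 460    # the user's goal: cap water consumption

scenarios = [dict(current, **dict(zip(controllable, values)))
             for values in product(*controllable.values())]
suggestions = sorted((s for s in scenarios if goal(s)), key=changes)
for s in suggestions:
    print(s["crop"], s["irrigation"], water_use(s), f"({changes(s)} change(s))")
```

Real how-to generation replaces exhaustive enumeration with model inference and search, but the ingredients are the same: user agency (what may change), preferences (how to rank), and integrated models (what the consequences are).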
Enable use by the population at large (C8). Since the SEER is to be used by the population at large, careful consideration must be given to human factors and ergonomics in system design. Some example issues to be addressed here include simple ways to establish and update preferences and goals (for example, via graphical or voice-based interfaces); results interpretation (via visualization or voice feedback explaining the results' implications); and customization of interactions for different user groups (domain-specific model customization support for specialist users). The quality of the users' experience10 should also be considered, accounting for the users' emotional and physiological states, the situational characteristics of the experience, and the experience of model use itself.1
Evaluating adaptation for sustainability (C9). Based on the sustainability evaluation performed by the SEER, adaptation triggers may be generated to guide the self-adaptation of an SAS. In the original framework proposed by Kephart and Chess,32 an SAS has four key stages (the MAPE-K loop): Monitoring environment and system conditions, Analysis to determine whether the system needs to self-reconfigure, Planning how to adapt the system safely to satisfy new requirements/needs, and Execution of the adaptation plan. All four stages make use of a Knowledge resource. While the original intent for Knowledge was static information (for example, sensor properties, policies, and constraints), for our purposes, we realize the Knowledge resource with the SEER. As such, the SEER becomes a dynamic source of sustainability-evaluation knowledge that incorporates input from stakeholders, scientific models and their integration, open data, results of what-if scenario exploration, and user needs to guide the self-adaptation of an SAS. The entire MAPE-K loop is hence open for human assessment and feedback to derive a recommendation that can be realized either by an automated adaptation or by human intervention. For example, Bruel et al.8 present a smart farming system including an irrigation system that determines and delivers the right amount of water every day in order to maximize produced biomass, based on current water stress, the climate series, biomass models, and the farmer's input.
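A minimal sketch of this arrangement (Python; the SEER stand-in, the sensor, and all thresholds are invented for illustration) shows the SEER acting as the dynamic Knowledge resource that every MAPE stage consults:

```python
import random

class SeerKnowledge:
    """Stand-in for the SEER as the MAPE-K Knowledge resource: targets come
    from sustainability evaluation rather than from static policies."""
    def target_moisture(self, crop):
        return {"wheat": 0.55, "maize": 0.65}[crop]   # from integrated models

def monitor():                                        # M: sense the environment
    return {"crop": "maize", "moisture": random.uniform(0.3, 0.8)}

def analyze(state, knowledge):                        # A: compare to SEER target
    return knowledge.target_moisture(state["crop"]) - state["moisture"]

def plan(deficit):                                    # P: derive an adaptation
    return {"irrigate_mm": max(0.0, deficit) * 100}

def execute(adaptation):                              # E: actuate the system
    print(f"irrigating {adaptation['irrigate_mm']:.1f} mm")

knowledge = SeerKnowledge()                           # K: realized by the SEER
for _ in range(3):                                    # three adaptation cycles
    state = monitor()
    execute(plan(analyze(state, knowledge)))
```

In the smart-farming example above, the target-moisture function would be backed by biomass models, climate series, and the farmer's stated preferences rather than a lookup table.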
In this article, we detailed each capability needed by the SEER and reported on how MDE has already contributed toward that capability. However, most of the disciplines in computer science (CS) must come together to realize the SEER vision outlined here. Therefore, we used the ACM Computing Classification Systemo to assess the CS disciplines and create a simplified heat map (see Figure 2) in which we indicate, for each top-level category, whether we, that is, the 16 authors, believe it is not relevant (white), relevant (blue), or highly relevant (red) to realizing the SEER. Whenever we feel that some subcategories are notably more important than others, they are mentioned explicitly in the appropriate cells of the heat map. The heat map represents the biased view of the authors, and as a result, the importance of some categories might have been misjudged. In general, it can be supposed that expertise in CS is needed across all capabilities, that each of the CS categories is highly relevant for at least one of the capabilities, and that MDE is highly relevant across all capabilities.
Figure 2. CS disciplines contributing toward realizing the SEER capabilities.
1. Abrahao, S. et al. User experience for model-driven engineering: Challenges and future directions. Model Driven Engineering Languages and Systems, 2017, 229–236.
2. Altmanninger, K. et al. Why model versioning research is needed! An experience report. MoDSE-MCCM Workshop at MoDELS, 2009, 1–12.
3. Bastin, L. et al. Managing uncertainty in integrated environmental modelling: The uncertweb framework. Environmental Modelling and Software 39, 2013. Elsevier, 116–134.
4. Bastin, L. et al. Good Practices for Data Management. Chapt. 11, 2017.
5. Bastin, L. et al. Volunteered Metadata, and Metadata on VGI: Challenges and Current Practices. Springer, 2017.
6. Bechhofer, S. et al. Why linked data is not enough for scientists. In Proceedings of 2010 IEEE 6th Intern. Conf. e-Science, Dec. 2010, 300–307.
7. Becker, C. et al. Requirements: The key to sustainability. IEEE Software 33, 1 (Jan. 2016), 56–65.
8. Bruel, J.M. et al. MDE in practice for computational science. In Proc. of Intern. Conf. on Computational Science, June 2015.
9. Brunet, G. et al. A manifesto for model merging. In Proc. of 2006 Intern. Workshop on Global Integrated Model Management, 2006, 5–12.
10. Bui, M. and Kemp, E. E-tail emotion regulation: Examining online hedonic product purchases. Int. J. Retail and Distribution Management 41, 2013, 155–170.
11. Buneman, P., Khanna, S., and Wang-Chiew, T. Why and where: A characterization of data provenance. Database Theory ICDT 2001, LNCS. Springer, 2001, 316–330.
12. Castro, R. Open research problems: Systems dynamics, complex systems. Theory of Modeling and Simulation (3rd Edition), chapt. 24. Academic Press, 2019.
13. Castro, R. and Jacovkis, P. Computer-based global models: From early experiences to complex systems. J. Artificial Societies and Social Simulation 18, 1 (2015), 1–13.
14. Cheng, B.H.C. et al. Software engineering for self-adaptive systems: A research roadmap. Software Engineering for SAS, 2009, 1–26.
15. Clavreul, M. et al. Integrating legacy systems with MDE. In Proc. of Intern. Conf. Software Engineering, 2010, 69–78.
16. Combemale, B. et al. Globalizing modeling languages. Computer, (June 2014), 68–71.
17. Combemale, B. et al. Modeling for sustainability. Modeling in Software Engineering, 2016.
18. Crosetto, M., Tarantola, S., and Saltelli, A. Sensitivity and uncertainty analysis in spatial modelling based on GIS. Agriculture, Ecosystems & Environment 81, 1 (2000), 71–79.
19. Dallas, C. Digital curation beyond the wild frontier: A pragmatic approach. Archival Science 16, 4 (2016), 421–457.
20. Danos, A., Braun, W., Fritzson, P., Pop, A., Scolnik, H., and Castro, R. Towards an open Modelica-based sensitivity analysis platform including optimization-driven strategies. In Proc. of EOOLT'17, 2017. ACM, 87–93.
21. Davidson, S.B. and Freire, J. Provenance and scientific workflows: Challenges and opportunities. In Proc. of Intern. Conf. Management of Data, 2008. ACM, 1345–1350.
22. de Lara, J. and Guerra, E. Deep meta-modelling with MetaDepth. In Proc. of the 48th Intern. Conf. Objects, Models, Components, Patterns, 2010, 1–20.
23. Kolovos, D.S. et al. Different models for model matching: An analysis of approaches to support model differencing. In Proc. of Workshop on Comparison and Versioning of Software Models, 2009.
24. Faunes, M. et al. Automatically searching for metamodel well-formedness rules in examples and counter-examples. Model Driven Engineering Languages and Systems, LNCS, 2013, 187–202.
25. France, R.B. and Rumpe, B. Model-driven development of complex software: A research roadmap. In Proc. of Workshop on the Future of Software Engineering, 2007, 37–54.
26. Galvao, I. and Goknil, A. Survey of traceability approaches in model-driven engineering. In Proc. of EDOC 2007, Oct. 2007, 313–313.
27. Giese, H. et al. Living with Uncertainty in the Age of Runtime Models, 2014, 47–100.
28. Grimm, V., Polhill, G., and Touza, J. Documenting social simulation models: The ODD protocol as a standard. Simulating Social Complexity, Springer, 2017, 349–365.
29. Huang, J. From big data to knowledge: Issues of provenance, trust, and scientific computing integrity. Big Data 2018, 2197–2205.
30. Jackson, M.C. Systems Thinking: Creative Holism for Managers. Wiley Chichester, 2003.
31. Jorgensen, S.E. and Fath, B.D. 2—Concepts of modelling. Fundamentals of Ecological Modelling vol. 23, Developments in Environmental Modelling. Elsevier, 2011, 19–93.
32. Kephart, J.O. and Chess, D.M. The vision of autonomic computing. Computer 36 (Jan 2003), 41–50.
33. Larsen, V. et al. A behavioral coordination operator language (BCOoL). MODELS 2015, Aug. 2015.
34. Lehr, W., Calhoun, D., Jones, R., Lewandowski, A., and Overstreet, R. Model sensitivity analysis in environmental emergency management: A case study in oil spill modeling. In Proc. of Winter Simulation Conf. Dec. 1994, 1198–1205.
35. Hamby, D.M. A review of techniques for parameter sensitivity analysis of environmental models. Environmental Monitoring and Assessment 32 (Sept. 1994), 135–154.
36. Meadows, D., Richardson, J., and Bruckmann, G. Groping in the Dark: The First Decade of Global Modelling. John Wiley & Sons, 1982.
37. Meadows, D. H. and Robinson, J.M. The electronic oracle: computer models and social decisions. System Dynamics Review 18, 2 (2002), 271–308.
38. Meyers, B. et al. Promobox: A framework for generating domain-specific property languages. Software Language Engineering, 2014, 1–20.
39. Midgley, G. What is this thing called CST? Critical Systems Thinking. Springer, Boston, MA, 1996, 11–24.
40. Miles, S., Groth, P., Branco, M., and Moreau, L. The requirements of using provenance in e-science experiments. J. Grid Comp. 5, 1 (2007), 1–25.
41. Moreau, L. et al. The provenance of electronic data. Commun. ACM 51, 4 (Apr. 2008), 52–58.
42. Mussbacher, G. et al. Assessing composition in modeling approaches. In Proc. of CMA Workshop, 2012, 1:1–1:26.
43. Mussbacher, G. et al. The relevance of model-driven engineering 30 years from now. MODELS 2014, LNCS 8767, 183–200.
44. Rittel, H.W. and Webber, M.M. Dilemmas in a general theory of planning. Policy Sciences 4, 2 (1973), 155–169.
45. Rockström, J. et al. A safe operating space for humanity. Nature 461 (Sept. 2009), 472–475.
46. Rusbridge, C. et al. The digital curation centre: A vision for digital curation. In Proc. of Intern. Symp. Mass Storage Systems and Technology, 2005, 31–41.
47. Simmhan, Y.L., Plale, B., and Gannon, D. A survey of data provenance in e-science. SIGMOD Rec. 34, 3 (Sept. 2005), 31–36.
48. Simonis, I. et al. Sensor Web Enablement (SWE) for citizen science. In Proc. of the IEEE Int. Geoscience and Remote Sensing Symposium, 2016.
49. Tikhonova, U. et al. Constraint-based run-time state migration for live modeling. Software Language Engineering, 2018.
50. Vennix, J. A. M. Building consensus in strategic decision making: System dynamics as a group support system. Group Decision and Negotiation 4, 4 (July 1995), 335–355.
51. Williams, M. et al. UncertML: An XML schema for exchanging uncertainty. In Proc. of the 16th Conf. GISRUK 2008, 275–279.
52. Wirtz, D. and Nowak, W. The rocky road to extended simulation frameworks covering uncertainty, inversion, optimization and control. Environmental Modelling and Software 93 (2017), 180–192.
53. Yakel, E. Digital curation. OCLC Systems & Services: Intern. Digital Library Perspectives 23, 4 (2007), 335–340; https://doi.org/10.1108/10650750710831466
54. Zheng, Y., Han, F., Tian, Y., Wu, B., and Lin, Z. Chapter 5: Addressing the uncertainty in modeling watershed nonpoint source pollution. Developments in Environmental Modelling, Ecological Modelling and Engineering of Lakes and Wetlands. Elsevier, 2014, 113–159.
c. https://www.eclipse.org/emf/compare/
d. The Functional Mockup Interface Standard, https://fmi-standard.org
e. https://www.eclipse.org/emfstore/
f. http://www.eclipse.org/cdo/
g. https://www.w3.org/TR/prov-overview/
i. https://www.cs.colostate.edu/remodd/
j. https://web.imt-atlantique.fr/x-info/atlanmod/index.php?title=Zoos
k. https://taverna.incubator.apache.org/
l. https://kepler-project.org/
m. https://www.omg.org/spec/OCL/