Analysis of a hybrid neural network as underlying mechanism for a situation prediction engine

This paper presents results on a technique that can be used as an underlying mechanism for situation prediction. We analysed a hybrid neural network called Multi-output Adaptive Neural Fuzzy Inference System (MANFIS) and compared its predictive ability with that of a Multi-Layer Perceptron (MLP). The results demonstrate that, depending on the application, neural networks can be considered a good approach for situation prediction when combined with other techniques.


Introduction
Pervasive and ubiquitous computing (so-called "UbiComp") has been the focus of much research seeking to develop environments endowed with some degree of intelligence, capable of governing the performance of devices and applications autonomously and according to the needs of users. In this area of application, the concepts of Ambient Intelligence (AmI) (Augusto and Shapiro, 2007; Ramos et al., 2008; Cook et al., 2009) and Intelligent Environments (also called smart spaces, pervasive spaces or reactive environments; Lino et al., 2010) are employed. Although both concepts are closely related, they can be distinguished through the "mind/brain" metaphor used in Artificial Intelligence. The former is more focused on what makes the environment behave intelligently (e.g. algorithms and techniques), while the latter is more related to device control, the smart interconnection of resources and their collective behaviour (Buchmayr and Kurschl, 2011).
The products, services and applications built on concepts from these emerging areas involve automation at various levels, ranging from a single room in a residence, providing services in an intelligent manner to local residents, to an entire city, where residents have their daily activities aided by custom computer services. Such environments are dynamic, so the applications do not behave deterministically but depend on the context at hand. The applications must then understand the users' needs and orchestrate resources to provide optimized services (Zaidenberg et al., 2008). This is only possible through constant monitoring of, and interaction with, the environment where users are located, so that new contextual knowledge about them can be discovered and the necessary adaptations to applications can be made (Bhatt, 2009). However, context information acquired from low-level sensors without further interpretation may be meaningless, non-trivial, and sometimes conflicting or uncertain. The limitations of low-level data appear when it becomes necessary to model the behaviour of existing entities in a different environment, as well as their relationships. One way to overcome this problem is to derive context information from low-level values, creating a new high-level layer that takes the perceptions of sensors as inputs to trigger actions in the system. In the literature, different notions have been used to define this high-level layer that represents context. The most used terms are situational context (Gellersen et al., 2002), situation sense (Endsley and Connors, 2008; Endsley, 2006) and situation (Dey, 2001). The notion of situation is used as a higher-level representation of state and is the basis for the definition of a holistic view of context called "situation awareness" (or SAW).
From this premise, authors such as Buchmayr and Kurschl (2011) and Ye et al. (2011) focused their research on how to apply situation awareness in Ambient Intelligence. Among their conclusions, both studies emphasized that research in this area still has shortcomings with regard to primitives to represent situations, the specification of logical situations, and the lack of mechanisms for reasoning and decision-making. Besides these, they also point out that the proposed solutions do not provide a prediction of future situations that would allow an analysis of the impact that the actions taken will have on the environment, and of how these actions should be performed to consider the needs of the user. In other words, applications and services developed today are able to adapt to the momentary context in which they occur and can even infer the user's current situation (high-level vision). However, they have difficulties in orchestrating the resources needed to meet the demands of a future state, because they have no predictive mechanisms capable of assisting applications in decision-making on context adaptation. The possibility of predicting situations can help to provide more invisible and proactive services, in accordance with the ideas of Mark Weiser (who is considered to be the father of UbiComp).
Thus, the development of a computational solution that performs inference and also permits prediction appears to be an alternative to the above-mentioned problems related to ambient intelligence. Accordingly, we argue that neural networks can be used as an underlying technique for this prediction mechanism, since they can make inferences from low-level context data, learn from it and thus predict values that will create a future situation. Hence, this paper aims to demonstrate the preliminary results on the predictive ability of a hybrid neural network called Multi-output Adaptive Neural Fuzzy Inference System (MANFIS), when applied to the prediction of data that characterizes a situation. For the sake of analysis, we compare the predictive power of a MANFIS with that of a widely used neural network called Multi-Layer Perceptron (MLP). We emphasize that our focus here is only on the analysis of the predictive ability of a MANFIS and the possibility of using it as an underlying technique for the prediction mechanism. General issues about the development of overall computational solutions (such as situation modelling, inference, data fusion, uncertainty, etc.) are outside the scope of this work.
The main contribution of this research is the analysis of the feasibility of using a hybrid neural network as the underlying technique for a mechanism of situation prediction. The results of this comparative analysis between a MANFIS and an MLP, applied to Ambient Intelligence, in addition to being original, serve as a direction for further research in this area.
The paper is organized as follows: section "Situation awareness" presents the concepts of situation awareness and approaches to prediction and reasoning; section "Prediction using Neural Networks" characterizes the prediction mechanism and the motivation for using a MANFIS as the base technique for its development; section "Experimental results" presents the experimental results of using a MANFIS to predict data of situations, in comparison with the performance of an MLP; section "Related work" presents some related work; and, finally, the conclusions and future work are presented in section "Conclusions and Future Directions".

Situation awareness
Before talking about situation awareness it is necessary to define situation. According to Yau et al. (2002, 2004), a situation is a set of contexts relevant to an application in a period of time that affect the future behaviour of the system. More recently, Bettini (Bettini et al., 2010) has defined situation as a set of semantic abstractions derived from parts of low-level information, human knowledge and interpretations of the world. Ye (Ye et al., 2011) defines situation as an abstraction of the events occurring in the real world, derived from context and from hypotheses about how observed context relates to factors of interest to designers and applications. Computationally speaking, we can define situation as a particular state that is abstracted from sensor data and is relevant to application objectives, so that certain actions can be taken when the determined situation is happening; that is, the situation can be used to represent a high-level concept of state.
Applications that use the concept of situation to perform internal adaptation are known as "situation awareness applications". For these applications, situations are external semantic interpretations of low-level context, allowing a high-level specification of human behaviour and its interaction with the system. Situations inject meaning into applications, are more stable and easier to define, and keep basic contextual information. Adaptations in context-aware applications are then caused by changes in situations (e.g., a switch in one context value triggers an adjustment if the context update changes the situation). The design of these applications becomes much easier because the designer can operate at a high abstraction level (situation), and not at the level of the small context details that characterize a situation.
The great advantage of using situation is the ability to provide applications with a comprehensive, human-readable representation of sensor data, while abstracting them from the complexity of the data read, the noise in these data and the activities of inference (Ye et al., 2011). What distinguishes a situation from an activity, and the recognition of situation from the recognition of activity, is the inclusion in a situation of temporal and structural aspects, including time of day, duration, frequency etc. A situation can be as simple as an abstract state of a certain entity (e.g. the room is occupied), or a human action that happens in one place (e.g. working or cooking).
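To make the notion concrete, the following minimal sketch derives the situation "the room is occupied" from low-level readings and one temporal aspect (a recency window). The `Reading` type, the `motion` naming convention for sensors and the five-minute window are hypothetical choices made only for illustration; they are not part of any of the cited definitions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Reading:
    """One low-level context sample (hypothetical structure)."""
    sensor: str      # sensor name, e.g. "motion_kitchen"
    active: bool     # binary sensor state
    at: datetime     # time at which the state was observed


def room_occupied(readings, now, window=timedelta(minutes=5)):
    """Illustrative rule: the room counts as occupied if any motion
    sensor was active within the last `window` before `now`."""
    return any(
        r.active
        and r.sensor.startswith("motion")
        and timedelta(0) <= now - r.at <= window
        for r in readings
    )
```

An application would react to a change in the value returned by `room_occupied`, not to each individual sensor event.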

Reasoning and prediction of situation
For Ambient Intelligence to be truly effective, the systems must be able to reason about and predict context information. This also involves the stage of identifying the data that characterizes a given context. Several techniques have been studied by researchers, although there are few works that use the higher-level representation of context named situation. According to the work of Ye et al. (2011), hybrid techniques of reasoning and prediction of situation, which combine specification-based approaches with learning-based approaches, are considered to be the most promising. Specification-based approaches are used in simple scenarios, with few sensors and accurate information. They can use specialist knowledge through logical rules to represent situations; reasoning is only used to make inferences about situations from sensor inputs. Learning-based approaches are meant to be applied to environments where a large number of sensors collect various data (which can often be inaccurate or conflicting) representing complex situations. In these environments, it becomes cumbersome, or even impossible, to use only the knowledge of experts to properly specify the situations that can happen. Figure 1 summarizes these approaches according to the techniques employed and their relationship with increased complexity.
Another note made by the authors refers to the lack of a solution that is capable of predicting situations. On this basis, the authors of this paper are working to develop a mechanism that can predict data likely to characterize future situations. This prediction mechanism is a small part of a hybrid computational solution (using specification-based and learning-based approaches), called Situation Manager, that is capable of dealing with issues related to situation awareness. The Situation Manager will be described in the future and is outside the scope of this work. The next section expounds the conceptual aspects involved in the prediction of situation using neural networks.

Prediction using Neural Networks
The prediction mechanism has the purpose of predicting context data that will probably compose future situations. It infers data about the low-level context (obtained from sensors) that characterizes a situation and, based on the knowledge acquired, provides new data that will probably characterize a future situation. This data can be used by applications to change their own behaviour and for decision-making. The prediction mechanism can be imagined as a small "black box" capable of being implemented as a "pluggable" module in a large middleware, such as Percontrol (Leithardt et al., 2012), EXEHDA (Execution Environment for High Distributed Applications) (Yamin, 2004) or Continuum (Costa, 2008), or provided as a service in some UbiComp platform based on SOA (Service-Oriented Architecture).
Based on the previous section, this prediction mechanism can be constructed using an approach based on learning, since it needs to be able to deal with historical data in order to learn about future situations. Owing to its probabilistic nature and the necessity of learning, a neural network was chosen as the underlying technique for its development. Neural networks are computational structures that employ mathematical models to represent intelligent behaviour, and are able to learn from experience. There are several types of neural network. A widely used type is the Multi-Layer Perceptron (MLP). An MLP consists of three or more layers (an input layer, an output layer and one or more hidden layers). A commonly criticized characteristic of this type of network is its "black box" nature (Haykin, 2011). This can be attributed to the difficulty of understanding the inner workings of the network. Another criticism is the effort required to adjust and select the best network configuration.
An alternative that addresses these two problems is a network that is able to "learn" about the data and represent knowledge in a more human-readable way. One approach is to combine fuzzy logic with neural networks to build a hybrid neural network. Thus, each neuron in the network implements a fuzzy set, and the network adjusts its membership functions via internal mathematical mechanisms, based on the data provided in the training phase. Different types of hybrid networks have been built to implement the Mamdani and Takagi-Sugeno fuzzy inference models, e.g. FALCON, ANFIS, NEFCON, NEFCLASS, NEFPROX, FUN, SONFIN, and EFuNN (Abraham, 2005). In general, the Takagi-Sugeno model has a lower root mean square error (RMSE) and produces more accurate systems than the Mamdani model, although the Mamdani model, despite its lower accuracy, is faster (Marza and Teshnehlab, 2009). However, models such as FALCON, NEFCON, NEFCLASS and EFuNN, or even FUN (since it does not use formal learning techniques: a randomized value is used in the membership functions), are not considered here, because prediction demands accuracy. With respect to the other models, a comparative performance evaluation of some neuro-fuzzy systems for predicting a chaotic time series (Mackey and Glass, 1977) showed that, using the Takagi-Sugeno model, the ANFIS has a lower RMSE than NEFPROX, SONFIN and dmEFuNN (see Table 1).
Thus the choice was to use an ANFIS for prediction. Its schema is depicted in Figure 2. However, in order to be applied to the prediction of the low-level values of a situation, the network must provide more than one output value. An ANFIS only provides one output (see Figure 2). For this reason it was decided to use a Multi-output Adaptive Neuro-Fuzzy Inference System (MANFIS) (Benmiloud, 2010), which is an extension of an Adaptive Neural Fuzzy Inference System (ANFIS) (Jang, 1993).
Table 1. Performance of some neuro-fuzzy models (Mackey and Glass, 1977).
As its name suggests, a MANFIS works as several interconnected ANFIS, providing multiple outputs instead of just one as in an ANFIS (Figure 3).
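The idea of several ANFIS sharing the same inputs can be sketched as follows. This is only an illustrative forward pass under stated assumptions: first-order Takagi-Sugeno rules, Gaussian membership functions, and one independent rule set per output. The function names and the rule layout `(centres, sigmas, coeffs, bias)` are hypothetical and are not taken from Benmiloud (2010) or from any toolbox.

```python
import math


def gaussian(x, centre, sigma):
    """Gaussian membership degree of x in a fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


def anfis_output(x, rules):
    """First-order Takagi-Sugeno inference for a single output.

    Each rule is a tuple (centres, sigmas, coeffs, bias):
    centres/sigmas define one Gaussian set per input (premise),
    coeffs/bias define the linear consequent of the rule.
    """
    strengths, consequents = [], []
    for centres, sigmas, coeffs, bias in rules:
        w = 1.0
        for xi, c, s in zip(x, centres, sigmas):
            w *= gaussian(xi, c, s)          # product T-norm: firing strength
        strengths.append(w)
        consequents.append(sum(p * xi for p, xi in zip(coeffs, x)) + bias)
    total = sum(strengths)
    if total == 0.0:                         # no rule fires: return a neutral value
        return 0.0
    # Weighted average of the rule consequents (normalised firing strengths).
    return sum(w * f for w, f in zip(strengths, consequents)) / total


def manfis_output(x, anfis_bank):
    """One ANFIS per output, all fed with the same input vector."""
    return [anfis_output(x, rules) for rules in anfis_bank]
```

With 14 rule sets in `anfis_bank`, `manfis_output` would return one value per sensor, which is the shape of output needed for the experiment described later.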
As in the case of an ANFIS, the potential of a MANFIS lies in its ability to construct an input-output mapping using human knowledge (in the form of fuzzy if-then rules) and to learn from the data provided. This is possible because each MANFIS neuron implements a fuzzy set. The network adjusts the membership functions used by the internal fuzzy inference mechanisms based on the data provided in the training phase. To adjust the parameters of the membership functions a gradient vector is used, which measures how well the inference system, through its set of rules, models the data; the parameters are then changed so as to reduce the overall error. The adjustment of weights is done through a hybrid algorithm in two steps (or phases) that combines the Least-Square Estimator (LSE) and Gradient Descent (GD) methods. This hybrid approach allows the network to converge more quickly, since the dimensions of the search space are reduced compared with those of the original backpropagation algorithm. In the case of a MANFIS, the algorithm must be adapted to work with multiple outputs rather than just one, as in an ANFIS.
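The two phases can be written down as follows, using the standard ANFIS formulation (Jang, 1993). The notation is introduced here for illustration only: w_i is the firing strength of rule i, x the input vector, d_k the target for output k, and c and σ the centre and width of a membership function (which assumes Gaussian sets). Summing the error over the outputs k is an assumption about how the multi-output case is handled.

```latex
% Output k of a first-order Takagi-Sugeno system with R rules
\bar{w}_i = \frac{w_i}{\sum_{j=1}^{R} w_j}, \qquad
y_k = \sum_{i=1}^{R} \bar{w}_i \left( \mathbf{p}_{k,i}^{\top} \mathbf{x} + r_{k,i} \right)

% Forward phase (LSE): premise parameters fixed, y_k is linear in the
% consequent parameters \theta_k = (\mathbf{p}_{k,1}, r_{k,1}, \dots, \mathbf{p}_{k,R}, r_{k,R})
\hat{\theta}_k = \left( A^{\top} A \right)^{-1} A^{\top} \mathbf{d}_k

% Backward phase (GD): consequent parameters fixed, premise parameters updated
E = \tfrac{1}{2} \sum_{k} \left( d_k - y_k \right)^2, \qquad
c \leftarrow c - \eta \, \frac{\partial E}{\partial c}, \qquad
\sigma \leftarrow \sigma - \eta \, \frac{\partial E}{\partial \sigma}
```

Each row of A holds the terms \bar{w}_1 x, \bar{w}_1, ..., \bar{w}_R x, \bar{w}_R for one training sample, and the vector d_k collects the targets of output k over all samples. Because the consequent parameters are obtained in closed form, gradient descent only has to search over the premise parameters, which is the reduction of the search space mentioned above.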
After the explanation of the factors that determined the choice of a MANFIS and the discussion of its features, the next section presents results that demonstrate the behaviour of a MANFIS for situation prediction.

Experimental results
The development of a neural network basically involves the following steps: defining the input and output data; configuring the network; training it; testing it; and then validating it. The experiment aimed to check the predictive power of a MANFIS and compare it with that of an MLP. We used the publicly available dataset from the CARE project (Context Awareness in Residence for Elders) presented in the work of Kasteren et al. (2010). We chose this dataset because it represents real sensor data that was already processed and validated by other research under the CARE project. The dataset represents the state of 14 sensors positioned in a house, and the set of sensor states represents activities of the residents collected in time segments of 60 seconds (see the original paper by Kasteren et al. (2010) for more information about how this data was collected). Figure 4 depicts the sensor positioning. In the original article, probabilistic methods were applied (Hidden Markov Models, HMM, and Conditional Random Fields, CRF) to identify activities in the environment. The focus was only on identification and not on prediction. In the end, the author showed that CRF are more accurate in identifying activities than HMM. In the present work, the sensor data and the timestamps of its collection (from Kasteren's dataset) were used as input to the various network models, and we then checked whether we could predict the next state of these sensors. If we can predict the next state of the sensors correctly, we can infer the future activity from the situation in which a user is involved. We used 2,000 available entries from the dataset for training and another 2,000 to validate the networks, using the 10-fold cross-validation method.
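The data preparation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes that each sample is a list of 14 binary sensor states plus a numeric timestamp feature, that the target of a sample is the sensor state of the following time segment, and that the folds are contiguous. The function names are hypothetical.

```python
def make_pairs(states, timestamps):
    """Pair each sample (timestamp + sensor states) with the next sensor states."""
    inputs, targets = [], []
    for t in range(len(states) - 1):
        inputs.append([timestamps[t]] + list(states[t]))   # 1 + 14 = 15 inputs
        targets.append(list(states[t + 1]))                # 14 outputs
    return inputs, targets


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size
```

Each of the ten folds serves once as the validation set while the remaining nine are used for training, and the reported error is then averaged over the folds.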
The first network analysed was the MLP. It was built using Matlab and the Neural Network Toolbox. After testing and validation through the confusion matrix, the best topology consisted of 15 input nodes, one hidden layer with five nodes using the hyperbolic tangent as activation function, and 14 output nodes representing the predicted values (the status of each sensor). The learning rate μ used was 1e-5.
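The topology described above (15 inputs, five tanh hidden nodes, 14 outputs) corresponds to the forward pass sketched below. The sketch is in plain Python rather than Matlab and is not the authors' implementation; the sigmoid output activation, the weight initialisation range and all function names are assumptions made for illustration, and training is omitted.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for a fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, weights, biases, activation):
    """Apply one fully connected layer followed by an activation function."""
    return [
        activation(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, hidden, output):
    """15 inputs -> 5 tanh hidden units -> 14 outputs (sigmoid is an assumption)."""
    h = dense(x, *hidden, math.tanh)
    return dense(h, *output, sigmoid)


rng = random.Random(0)
hidden = init_layer(15, 5, rng)
output = init_layer(5, 14, rng)
prediction = mlp_forward([0.0] * 15, hidden, output)  # one value per sensor
```

Thresholding each of the 14 outputs (for example at 0.5) would give a predicted on/off state per sensor.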
A MANFIS was also constructed in Matlab and tested in the same way as the MLP. As expected, it did not require much manual intervention, because the configuration parameters are adjusted by internal mathematical functions on the basis of the input data. However, it was still necessary to select the best method for partitioning the input space. In this work, because of the number of input variables, it was not possible to use the grid partition method (it generates rules by enumerating all possible combinations of membership functions for all inputs, which creates an exponential explosion of rules). Therefore, we selected the subtractive clustering method, which produces a scattered partitioning. This method uses few computational resources owing to the reduced number of rules generated, and the only information needed is the range of influence of a cluster centre for each input and output, assuming that the data lies within a unit hypercube (range [0, 1]). If the cluster radius is small, the method generates many small clusters of data, which results in more rules. After building the confusion matrix, the best configuration was obtained with a cluster radius of 0.25 for each of the inputs.
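The effect of the cluster radius can be illustrated with a simplified sketch of subtractive clustering. It is not the Matlab routine used in the experiment: it assumes data already scaled to the unit hypercube, uses a single rejection threshold instead of the full accept/reject test of the original method, and the `squash` and `reject_ratio` values are illustrative assumptions.

```python
import math


def subtractive_clustering(points, radius=0.25, squash=1.5, reject_ratio=0.15):
    """Return cluster centres chosen among `points` (simplified sketch)."""
    if not points:
        return []
    alpha = 4.0 / radius ** 2                 # neighbourhood of a candidate centre
    beta = 4.0 / (squash * radius) ** 2       # wider neighbourhood used for suppression

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Potential of each point: high when many points lie within `radius`.
    potential = [
        sum(math.exp(-alpha * sq_dist(p, q)) for q in points) for p in points
    ]
    first_peak = max(potential)
    centres = []
    while True:
        best = max(range(len(points)), key=potential.__getitem__)
        peak = potential[best]
        if peak < reject_ratio * first_peak:  # remaining points are too sparse
            break
        centre = points[best]
        centres.append(centre)
        # Subtract the influence of the new centre from every potential.
        potential = [
            pot - peak * math.exp(-beta * sq_dist(p, centre))
            for p, pot in zip(points, potential)
        ]
    return centres
```

For data forming two tight groups, a radius of 0.25 yields two centres (and hence two rules), whereas a much smaller radius turns almost every point into its own centre, which mirrors the growth in the number of rules described above.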
The total number of rules generated in the space partitioning was 7 for each of the 14 inputs, totalling 98 rules (in tests performed with grid partitioning, using three membership functions for each of the 14 inputs, the total was 3^14, or 4,782,969 rules). If we reduced the cluster radius, the number of rules and the time required to compute them grew dramatically, with a corresponding increase in the consumption of computational resources. In that case, it would be necessary to reduce the size of the training data, which would also reduce the generalization power of the network.
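The rule count for grid partitioning follows directly from enumerating every combination of membership functions. With m membership functions per input and n inputs (symbols introduced here for illustration):

```latex
N_{\text{grid}} = m^{n}, \qquad
m = 3,\; n = 14 \;\Rightarrow\; N_{\text{grid}} = 3^{14} = 4\,782\,969,
\qquad\text{compared with}\qquad 7 \times 14 = 98 .
```

The grid count grows exponentially with the number of inputs, whereas the count obtained from clustering depends on the number of clusters found in the data.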
The experimental results show that an MLP can correctly predict the context values that characterize a situation, converging in just 15 epochs. After 90 training epochs the mean square error (MSE) was 0.0063 (RMSE = 0.0794) (Figure 5). On the other hand, the mean square error (MSE) of the MANFIS was 0.00034 (RMSE = 0.0185) (Figure 6). Analysing these results, it can also be seen that the MANFIS has a lower RMSE than the MLP and converged quickly, which demonstrates the ease with which it learns patterns.
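For reference, the two error measures quoted above are related by RMSE = sqrt(MSE). The following minimal sketch computes both over a set of multi-output predictions; the function names and the toy data in the usage note are illustrative only.

```python
import math


def mse(targets, predictions):
    """Mean square error over all samples and all outputs."""
    errors = [
        (t - p) ** 2
        for row_t, row_p in zip(targets, predictions)
        for t, p in zip(row_t, row_p)
    ]
    return sum(errors) / len(errors)


def rmse(targets, predictions):
    """Root mean square error: the square root of the MSE."""
    return math.sqrt(mse(targets, predictions))
```

For example, with targets `[[1, 0], [0, 1]]` and predictions `[[1, 0], [0, 0]]`, one of the four values is wrong by 1, so the MSE is 0.25 and the RMSE is 0.5.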
However, the MLP requires more manual intervention to configure its various parameters and the implementation, test and validation procedures before a satisfactory configuration is achieved. The MANFIS did not demand much manual intervention: we only adjusted the clustering parameters and the network did the rest. Despite this limited intervention, the predictive results of the MANFIS were more accurate than those of the MLP and its convergence occurred faster, but more computational resources were used than with the MLP. It can also be noted that the MANFIS was able to achieve an acceptable RMSE even with a reduced number of values in the training set. This is interesting, since the prediction mechanism did not need to monitor the environment for a long time to define a pattern. However, special care must be taken with the maximum and minimum values in the universe of discourse, since otherwise the generalization power of the network will be degraded. In summary, both models learned about the behaviour of the sensors and could correctly predict values that characterize a future situation. In the context of our test, a MANFIS has better predictive ability than an MLP and requires less configuration and training effort, but in exchange there is an increased use of computational processing.
Therefore, a MANFIS appears to be a good approach to be used as the underlying mechanism for the Situation Manager that we are building. However, in highly dynamic environments (where data patterns change constantly) it will require constant retraining of the system to adapt to the new patterns (recalibration), and this will also involve new processing costs. Depending on the type of application, this feature can make the use of a MANFIS impracticable. One solution is to combine it with another technique, as listed in Figure 1, or to check the possibility of using a network with unsupervised learning.

Related work
This section only discusses works related to the prediction of context using machine-learning techniques. For several years, the Research Group on Pervasive Computing at the University of Athens, in Greece, has been developing works to predict the location of users (Anagnostopoulos et al., 2007). Among the group's more recent projects we can cite the one by Anagnostopoulos et al. (2009b), who presented an ART network (a neural network based on Adaptive Resonance Theory) with a k-means clustering method that uses the Hausdorff distance to calculate neighbourhood. They compared a network that uses an online algorithm with another that uses an offline algorithm. The conclusion was that, although the network is promising, it responds slowly to pattern changes and does not adapt to mobility patterns not seen previously. Further, it requires a lot of space for data storage. This led the authors to develop another work that is capable of responding quickly to changes, has short knowledge of the spatio-temporal pattern of user mobility and uses fewer resources to store historical data of the users' movements. The result was an adaptive short memory for prediction with a fuzzy controller (Anagnostopoulos et al., 2011). Both studies demonstrate the efforts of a group in the development of prediction using machine-learning techniques, as in our current work, but their focus is only on location and not on the high-level abstraction of situation that we pursue. The works of Benta (Benta et al., 2009a, 2009b, 2010) present the use of an MLP network capable of learning user behaviour, making decisions about changes in the environment, and then assessing user satisfaction through facial recognition. What is striking in this work is that, although it makes use of ontologies, it is capable of learning new preferences and incorporating them into the rule base. Its limitation is that it has no mechanisms for predicting the situation and is only reactive to the changes that happen in the environment. Elmahalawy presents the construction of a simulator called Power Management in Smart Home Products (PM-SHP) (Elmahalawy et al., 2010). This simulator is designed for the prediction of energy issues in an intelligent environment, through the use of Evolutionary Algorithms (EA). The idea of using EA is interesting and deserves further investigation for application in the prediction of situation.
These are some initiatives that are related directly to this work. As we can see, none of them has addressed prediction, proposed a new method, or even applied an existing technique at the level of the higher-level context called situation, as was done here. Thus, as our work addresses these issues, we claim that it is innovative and original. The result of the analysis of the feasibility of using a hybrid neural network called MANFIS as the underlying technique for a situation prediction mechanism is our main contribution and serves as a direction for further research in this area.

Conclusions and Future Directions
This work has analysed the predictive ability of a hybrid neural network called MANFIS, seeking to check whether it can be used as an underlying mechanism to predict situation. The results showed that a MANFIS has a lower rate of prediction errors and a faster convergence in comparison with an MLP. However, a MANFIS consumes more computational resources and lacks mathematical mechanisms that allow it to set its configuration parameters "on the fly", which is important in highly dynamic environments where data patterns change often or require real-time processing without prior training. We can conclude that a MANFIS can be used as an underlying technique for a situation prediction engine if, and only if, it is combined with some other technique which circumvents the problem of reconfiguration in real time and consumes fewer computational resources. The ideal would be to combine it with another approach based on machine learning or data mining. If this is not possible and the application requires real-time processing without previous training, it is best to use a network with unsupervised learning instead of a MANFIS.
As future work we will investigate which other technique can be combined with a MANFIS to address the shortcomings identified above. Another topic for future study is predicting not only the information that makes up the next state of the sensors, but also the N states beyond that, which will improve the context adaptations of applications.

Figure 1. Reasoning techniques of situations and their relation to the increased complexity of the problem (Ye et al., 2011).

Figure 4. Floorplan of the house and sensor nodes used in the CARE project (Kasteren et al., 2010).