Gaussian Process Regression

From ILMS-Wiki
Revision as of 09:15, 23 April 2008

Abstract

In the last decade the number of publications on kernel machines has increased enormously. Studies on Support Vector Machines (SVM) are the most widely known, but much effort has also gone into applying Gaussian processes to machine learning problems. This method is a universal and practical approach to learning with kernel machines. Because of its solid statistical foundation, learning with Gaussian processes has advantages over other empirical approaches with respect to the interpretability of model predictions. It also offers an established framework for model selection and subsequent model setup.

Owing to ongoing theoretical and practical developments in recent years, Gaussian processes are now considered a serious alternative in supervised learning. Their promising characteristics make these methods especially well suited to rainfall-runoff modelling. Nevertheless, they have so far attracted little interest in this domain.

Gaussian process regression is based on the assumption that observations follow a normally distributed stochastic process. It follows that new observations do not change the probability distribution of earlier ones. Based on this simple property, Gaussian process regression allows predictions for unknown values. This paper describes the application of a simulation model based on Gaussian process regression to the River Ouse dataset. The results show that this model is very well suited to automated short-term runoff prediction based only on measured precipitation and runoff.
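
For readers who want the prediction step written out, the standard textbook form of the Gaussian process predictive distribution is sketched below. The symbols are assumptions introduced here and are not taken from the JAMS implementation: X and y denote the training inputs and observed values, x_* a new input, k a covariance (kernel) function, and sigma_n^2 the observation noise variance. A zero prior mean is assumed. Which kernel and noise model the package actually uses is not stated on this page.

```latex
\begin{aligned}
\mathbf{y} &\sim \mathcal{N}\!\left(\mathbf{0},\; K + \sigma_n^2 I\right),
  \qquad K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j),
  \qquad (\mathbf{k}_*)_i = k(\mathbf{x}_i, \mathbf{x}_*), \\
\bar{f}_* &= \mathbf{k}_*^{\top} \left(K + \sigma_n^2 I\right)^{-1} \mathbf{y}, \\
\operatorname{Var}[f_*] &= k(\mathbf{x}_*, \mathbf{x}_*)
  - \mathbf{k}_*^{\top} \left(K + \sigma_n^2 I\right)^{-1} \mathbf{k}_*.
\end{aligned}
```

The predictive mean is a weighted combination of the observed values, and the predictive variance shrinks where the new input is strongly correlated with the training inputs.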

Download

zip

Quick start

You need to have the Java Runtime Environment (JRE) installed on your machine. It can be downloaded for free.

  • Download the JAMS Prediction Package from this site.
  • Extract the compressed zip file.
  • Run JAMS.exe
  • From the menu, choose File -> Load Model Configuration
  • From the installation directory, select the predict.jam model configuration file
  • Adjust the initial model parameters as you like
  • Run the model and enjoy ;)
  • To see information about the current model execution, choose Logs -> Model Info Log
  • The predictions will be saved in a file named result.txt

Manual

After starting JAMS you will see a window like the one in Figure 1.

Figure 1

Several model attributes and parameters can be changed.

data file

The data file is a simple text file. It contains all time series to be used for training, validation and verification. The data file has a table-like structure. Each row holds all relevant measurements at time t, represented as tab-separated floating point numbers. Every row must have the same number of elements. The value to be predicted must be the last element in each row. Where this value is unknown, simply write an arbitrary number in that column, but remember not to use those rows in the training process. A simple data file looks like this:

 1.0 0.0 0.5 11.1 -0.1
 0.0 0.2 0.3 11.0 -0.5
 2.3 3.1 1.0 10.5  1.0
 5.0 6.1 4.2 11.5  2.1
 3.0 5.2 1.0 13.6  0.0
 0.8 1.1 0.5 13.6 -0.7
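
Before loading a data file into JAMS, it can help to check that it follows the layout described above. The following is a minimal sketch in Python, not part of the JAMS package: the function name parse_data_rows is hypothetical, and it assumes that fields are separated by tabs or other whitespace and that blank lines may be skipped (the page does not say how JAMS itself treats either case).

```python
from typing import List, Optional, Tuple


def parse_data_rows(text: str) -> Tuple[List[List[float]], List[float]]:
    """Split data-file text into input rows and target values.

    Hypothetical helper: assumes whitespace-separated floats (tabs included)
    with the value to be predicted in the last column of every row.
    """
    inputs: List[List[float]] = []
    targets: List[float] = []
    width: Optional[int] = None
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        values = [float(field) for field in line.split()]
        if width is None:
            width = len(values)  # first row fixes the expected row length
        elif len(values) != width:
            raise ValueError(
                f"line {line_no}: expected {width} values, found {len(values)}"
            )
        inputs.append(values[:-1])   # all measurements except the last
        targets.append(values[-1])   # last column: value to be predicted
    return inputs, targets


# First two rows of the sample data file shown above, tab-separated.
sample = "1.0\t0.0\t0.5\t11.1\t-0.1\n0.0\t0.2\t0.3\t11.0\t-0.5\n"
inputs, targets = parse_data_rows(sample)
print(inputs)
print(targets)
```

A row with a different number of elements raises a ValueError that names the offending line, which makes it easier to find the problem before a model run.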

Literature
