Experiments

xDB is a GUI-powered database for storing the results of experiments. To understand xDB, it is therefore important to understand what an experiment means in the context of xDB. The following descriptions explain this along with related concepts.

Experiment

If you are developing a tool such as a named entity recognizer or a sentiment analyzer, you need to evaluate how accurately it works. This evaluation process can be seen as an experiment, and all of its results can be saved under one experiment created in xDB. Use the "Create a New Experiment" link button in the home tab and choose an appropriate name for the new experiment.

Evaluation Files

xDB assumes that you have an evaluation file to test how well your program works. In a regular research setup, there are two such files: a development file and a test file. However, xDB does not require you to have a test file. The reason for having two evaluation files is that you are expected to improve your model on the development set and then apply the best model to the test set to get the final evaluation score. This allows you to do a blind evaluation of your system.

In the context of xDB, you can use multiple development files in a single experiment and browse the results for each file separately. Use the evaluation file menu to switch between the evaluation files of the selected experiment. When you want to use a test file, use it paired with a development file. As your program runs and reports results via the xDB API, xDB checks which development and test files were used for the evaluation. If a file has not been seen before, xDB adds it to the experiment automatically. Note, however, that xDB does not show the results on the test file directly: only results on the development set are shown in the results listing. To see the result on the test set, click on the score of a run, which opens a dialog window showing the test score.

Check the xDB API for more information about how to declare development and test files.
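The bookkeeping described above can be pictured with a small in-memory model. This is only a sketch of the described behavior, not the xDB API: the Experiment class, its method names, the file names and the scores are all made up for illustration.

```python
class Experiment:
    """Hypothetical in-memory stand-in for an xDB experiment (not the xDB API)."""

    def __init__(self, name):
        self.name = name
        self.dev_files = []   # development files, in order of first use
        self.test_files = []  # test files, in order of first use
        self.runs = []

    def report(self, dev_file, dev_score, test_file=None, test_score=None):
        # Files that have not been seen before are added automatically.
        if dev_file not in self.dev_files:
            self.dev_files.append(dev_file)
        if test_file is not None and test_file not in self.test_files:
            self.test_files.append(test_file)
        self.runs.append({
            "dev_file": dev_file, "dev_score": dev_score,
            "test_file": test_file, "test_score": test_score,
        })

    def listing(self, dev_file):
        # The listing exposes development scores only.
        return [run["dev_score"] for run in self.runs if run["dev_file"] == dev_file]

    def run_detail(self, index):
        # Counterpart of clicking a score: the full record, test score included.
        return self.runs[index]


exp = Experiment("ner-baseline")
exp.report("dev_a.txt", 0.81, test_file="test.txt", test_score=0.78)
exp.report("dev_a.txt", 0.84, test_file="test.txt", test_score=0.80)
exp.report("dev_b.txt", 0.79)  # a test file is optional

print(exp.dev_files)
print(exp.listing("dev_a.txt"))
print(exp.run_detail(1)["test_score"])
```

The test score is kept on the run record but left out of the listing, which mirrors the extra click needed to see it in the GUI.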

Experiment Run

During evaluation, you may try multiple parameter combinations to find the highest-scoring one. Each such attempt is called an experiment run. When a run is saved in xDB, it is stored together with all parameters given for that run and the evaluation file used for the evaluation. This allows you to keep track of which parameters produce which scores. In the results listing, you can see the parameters of each run and filter the listing by selecting one or more parameters.
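As a rough illustration of what filtering runs by parameter means, the sketch below keeps a few runs in a plain Python list and selects those that match given parameter values. The record layout, the filter_runs helper and the sample scores are assumptions made for this example; they are not taken from xDB.

```python
# Made-up sample runs: each one stores its evaluation file, parameters and score.
runs = [
    {"eval_file": "dev.txt", "params": {"lr": "0.1", "layers": "2"}, "fscore": 0.81},
    {"eval_file": "dev.txt", "params": {"lr": "0.01", "layers": "2"}, "fscore": 0.84},
    {"eval_file": "dev.txt", "params": {"lr": "0.01", "layers": "3"}, "fscore": 0.83},
]


def filter_runs(runs, **wanted):
    """Keep the runs whose parameters match every requested name/value pair."""
    return [
        run for run in runs
        if all(run["params"].get(name) == value for name, value in wanted.items())
    ]


two_layer_runs = filter_runs(runs, layers="2")
best = max(two_layer_runs, key=lambda run: run["fscore"])
print(len(two_layer_runs), best["params"])
```

Because every run carries its own parameters, picking the best combination reduces to a filter followed by a maximum over the chosen score.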

Run Parameters

We assume that each time your program runs, it uses different input parameters to build a model, and that you are looking for the model that scores highest in the evaluation. In xDB, parameters are (key, value) pairs. In other words, each parameter has a name and can take different values. A parameter can also consist of just a name; in that case, xDB assigns NA (not available) as its value.
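The (key, value) convention, including the NA default for a name given without a value, can be sketched as follows. The "name=value" input syntax and the normalize_params helper are assumptions made for illustration; how xDB actually receives parameters is defined by its API.

```python
def normalize_params(raw_params):
    """Turn items such as "lr=0.1" or "use_crf" into a {name: value} dict."""
    params = {}
    for item in raw_params:
        name, sep, value = item.partition("=")
        # A bare name has no "=" part, so it gets the placeholder value "NA".
        params[name.strip()] = value.strip() if sep else "NA"
    return params


params = normalize_params(["lr=0.1", "layers=2", "use_crf"])
print(params)
```

Here "use_crf" behaves like a flag: it is recorded under its name with NA as its value.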

Evaluation Metrics

Currently, you can store four evaluation metrics: precision, recall, fscore and accuracy. Depending on the type of task you are working on, your program might be evaluated on one or more of them at the same time. Your program can report scores for the appropriate metrics via the xDB API, and they are stored in the database. When viewing the results, you can select which metric to show in the listing via the Experiment Settings dialog. Note that a loss value can also be stored, but it is not shown in the results listing.
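For reference, the four metrics can be computed from the counts of a binary confusion matrix as in the sketch below. It uses the standard definitions, with fscore taken as the balanced F1 score; that choice, the function name and the sample counts are assumptions, since the text does not say which F-score variant xDB expects.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Compute precision, recall, fscore (F1) and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        fscore = 2 * precision * recall / (precision + recall)
    else:
        fscore = 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "fscore": fscore,
        "accuracy": accuracy,
    }


# Made-up counts: 8 true positives, 2 false positives, 2 false negatives, 8 true negatives.
scores = evaluation_metrics(tp=8, fp=2, fn=2, tn=8)
print(scores)
```

The zero checks avoid a division by zero when a program predicts no positives or the evaluation file contains none.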

Experiment Token

When you submit experiment results via the xDB API, the token stands in for your login information. This is needed because the API does not support a login command. When an API call is made, you provide this token instead of a username and password, which keeps your login information secure. However, anyone who has this token can insert results into your experiment, so sharing it publicly can lead to abuse. On the other hand, you can share it with trusted parties and let them enter new results into your experiment. This is especially useful when people work in groups on the same experiment.
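One way to keep the token out of shared source code is to read it from the environment and attach it to each reported result. The sketch below only builds a JSON payload and sends nothing. The XDB_TOKEN variable name, the payload field names and the build_result_payload helper are hypothetical; the real request format is defined by the xDB API.

```python
import json
import os


def build_result_payload(token, eval_file, params, scores):
    """Bundle one run as JSON; the token takes the place of username and password."""
    if not token:
        raise ValueError("an experiment token is required")
    return json.dumps(
        {"token": token, "eval_file": eval_file, "params": params, "scores": scores},
        sort_keys=True,
    )


# XDB_TOKEN is a made-up variable name; the fallback is a dummy value for the demo.
token = os.environ.get("XDB_TOKEN") or "dummy-token"
payload = build_result_payload(token, "dev.txt", {"lr": "0.1"}, {"fscore": 0.81})
print(payload)
```

A collaborator who has been given the token can report results the same way without ever seeing the account password.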