GSOC2016: Week 8

This week I was working on implementing interactive training mode.

This week I learnt about parallel computing in python. I implemented simple things (ipywidgets button and a running thread), I wanted to generalize this to TMVA but it didn’t work. I learnt that this is because of GIL, and to enable threading to TMVA.Factory.TrainAllMethods I need to set _threaded flag to true for every function. Using this flag seemed to solve my problem, I was able to run ROOT methods on thread, but when I tried to do the same with TMVA.Factory.TrainAllMethods it didn’t work. I did everything in notebook environment, I thought simple python process and notebook process is the same (notebook runs a simple python process in the background), and also test examples worked fine. Enric suggested to try running the same code in simple process without jupyter and for my surprise there everything worked fine.

To be able to work on this week task, I created a function to simulate the training process:

MyTrain

I added this function to Factory, this just calls MyBigLoop if the method is MLP.

MyBigLoop

This is a member function of MethodMLP: contains a loop and calculates some function (1/x and exp(a-x/b)) and goes to sleep to simulate the long calculations of training.

Conclusion about what I will need from MethodMLP

I need 2 private variable:

Int_t fCurrentEpoch: I need to set this variable to current epoch in training loop
bool fExitFromTraining: the training loop should check this variable every step and if it’s true then we have to break the training loop, this variable needs to be initialized in constructor to false

I need 2 public function:

void ExitFromTraining(): this function sets fExitFromTraining value to true
std::vector<Double_t> GetInteractiveTrainingData();: this function returns the fCurrentEpoch and the training and testing errors (now the 1/x and exp(a-x/b) values). I would use the information in CalculateEstimator function to calculate the errors (if I’m not wrong this function calculates exactly what we need). I suggest not using the CalculateEstimator function, instead we should calculate exactly the same in this new function. In this way we can avoid creating the histograms. Or we should set up a flag which can turn of calculating the histograms.

python part

In python I set the _threaded flag for MyTraining function, and I start a thread with this (it doesn’t lock the GIL for whole running time). In main thread I create a button with ipywidgets, if we click on this it will run the TMVA::MethodMLP::ExitFromTraining() function. I write to output cell a JS script call which initializes my drawing class. After this I call in every 500ms an update function which gets new data from TMVA, and insert to output cell a JavaScript code which calls a data inserter method. After the insertion is done, I remove this update methods output. I didn’t use a Sleeping method for delaying (checking for new data in every 500ms), because that will need a thread just for waiting and will make the calculation slower. Instead I use threading.Timer function which will start a thread after the time passes, this is more efficient then using a Sleep method.

Javascript part

I created the IChart module (comes from InteractiveChart). This function will build the data and plot every changes in a nice way.

Google Summer of Code BlogCERN SFT

Week 8: Interactive training

Conclusion about what I will need from MethodMLP

python part

Javascript part

Author

Attila Bagoly

Google Summer of Code BlogCERN SFT

Conclusion about what I will need from MethodMLP

python part

Javascript part

Related Posts

Weeks 14-15: Deep neural network builder 15 Aug 2016

Documentation 15 Aug 2016

Weeks 9-13: Interactive training, new user interface, tutorial notebooks, documentation 14 Aug 2016

Author

Attila Bagoly

Share this post on