You are here: TWiki> CALO Web>CaloKali>KaliOnline (2016-12-22, DariaSavrina)

Online calibration with neutral pions

π0→γγ calibration method

The details of the π0→γγ method for the ECAL calibration as well as the description of its implementation within the KaliCalo package can be found in the internal notes:

Test run

At first run

To run the calibration within the online framework one needs an online account. To get one, contact lbonsupp@cern.ch. For the test run, log in to a plus node

$> ssh -Y lxplus
$> ssh -Y lbgw
$> ssh -Y plus

and set up the build environment:

lb-dev AlignmentOnline v11r0
cd AlignmentOnlineDev_v11r0
getpack PRConfig head
cd PRConfig
ln -s . v1r999

Besides PRConfig, one also needs the KaliOnline code (the AlignmentOnline/PyKaliOnline package):

cd $User_release_area/AlignmentOnlineDev_v11r0
getpack AlignmentOnline/PyKaliOnline head
cd AlignmentOnline/PyKaliOnline/python/PyKaliOnline

Note: if getpack asks for a password, try getpack -p anonymous AlignmentOnline/PyKaliOnline head

Put the scripts needed for the offline tests into the appropriate places:

cp offline_tests/* .
cp -rf offline_tests/CalibTests ../../../../PRConfig/python/.
mv offline_tests/KaliOnlineTest.py ../../../../PRConfig/scripts/.
rm -rf offline_tests/

One also needs local copies of Brunel and DaVinci:

cd $User_release_area
lb-dev Brunel v50r1
cd BrunelDev_v50r1
make configure
make -j 5 install
cd ../
lb-dev DaVinci v40r1p3
cd DaVinciDev_v40r1p3
make configure
make -j 5 install

Don't forget to compile AlignmentOnline itself:

cd $User_release_area/AlignmentOnlineDev_v11r0
make configure
make -j 5 install

Before running a test, create a working directory, for example:

mkdir /home/your_user_name/tests

Then get some data. A small sample of test data files is located on hlte0902 at /localdisk/hlt1/test_align; copy it to a locally available working directory.

At each run

Log in to one of the suitable machines:

  • hltperf-action-x5650
  • hltperf-dell-x5650
  • hltperf-asus-amd627
Beware that you need to load the LHCb environment on these machines:

source /cvmfs/lhcb.cern.ch/group_login.sh

Then set up the running environment:

cd $User_release_area/AlignmentOnlineDev_v11r0
./run bash
export PYTHONPATH=/home/raaij/pydim/lib/python2.7/site-packages:/scratch/jenkins/benchmark/python:$PYTHONPATH
export HLTTCKROOT=/group/hlt/sattelite/MooreOnlinePit_v24r0p1/TCK/HltTCK/
cd $User_release_area/AlignmentOnlineDev_v11r0/PRConfig/scripts

To run the calibration test instead of the alignment, use the command:

python KaliOnlineTest.py --numa --nodes=2 --workers=10 --task-log=log/align.log --directory=/home/your_user_name/tests/ --viewers KaliCalibration
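
When the test is launched repeatedly with different working directories or worker counts, it can help to assemble the command in a small script. The following is a minimal sketch that only reuses the options quoted above; the helper name build_test_command and its keyword arguments are illustrative and are not part of PRConfig.

```python
def build_test_command(workdir, nodes=2, workers=10, task_log="log/align.log"):
    """Assemble the KaliOnlineTest.py command line quoted in the text.

    Only the options shown on this page are used; their meaning is not
    re-interpreted here, they are simply filled in.
    """
    return [
        "python", "KaliOnlineTest.py",
        "--numa",
        "--nodes=%d" % nodes,
        "--workers=%d" % workers,
        "--task-log=%s" % task_log,
        "--directory=%s" % workdir,
        "--viewers", "KaliCalibration",
    ]


if __name__ == "__main__":
    # Print the command so that it can be checked before it is run by hand.
    print(" ".join(build_test_command("/home/your_user_name/tests/")))
```

The list form can be passed directly to subprocess.call from the PRConfig/scripts directory once the environment has been set up as described above.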

Output

The output of the calibration is written to /home/your_user_name/tests/Store (or whatever working directory you use). It contains:

  • a list of zipped files starting with 'Histograms' – the full database of pi0 histograms saved at each iteration;
  • a list of zipped files starting with 'Lambdas' – the databases with calibration constants at each iteration;
  • a list of zipped files starting with 'Problematic' – the databases with lists of badly-fitted cells at each iteration;
  • root files 'OutputHistograms.root' and 'OutputCoefficients.root' – histograms with visual information of the calibration output;
  • a text file 'CalibCoefficients.txt' – the list of calibration constants for each cell.
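
To get a quick overview of what a finished (or interrupted) run has left in the Store directory, the files can be grouped by the prefixes listed above. This is a minimal sketch; the function name group_store_files is hypothetical, and nothing is assumed about the file names beyond the prefixes quoted in the list.

```python
import os
from collections import defaultdict

# Prefixes of the per-iteration databases, as listed above.
PREFIXES = ("Histograms", "Lambdas", "Problematic")


def group_store_files(filenames):
    """Group file names by the prefix they start with.

    Files that match none of the prefixes (for example the root files
    and CalibCoefficients.txt) are collected under the key 'other'.
    """
    groups = defaultdict(list)
    for name in sorted(filenames):
        for prefix in PREFIXES:
            if name.startswith(prefix):
                groups[prefix].append(name)
                break
        else:
            groups["other"].append(name)
    return dict(groups)


if __name__ == "__main__":
    store = "/home/your_user_name/tests/Store"
    if os.path.isdir(store):
        for kind, names in sorted(group_store_files(os.listdir(store)).items()):
            print("%-12s %d file(s)" % (kind, len(names)))
```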
The OutputHistograms.root file contains histograms that allow one to estimate the convergence and to follow the evolution of the neutral pion signal during the calibration. These are:
  • Conv1 (Conv2, Conv3, ...; the number is that of the "secondary" iteration) - the neutral pion peak position at each of the "primary" iterations;
  • Sigm1 (Sigm2, Sigm3, ...; the number is that of the "secondary" iteration) - the neutral pion resolution at each of the "primary" iterations;
  • MassInnerPassNItM, MassMiddlePassNItM, MassOuterPassNItM, MassPassNItM (where N and M are the numbers of the "secondary" and the "primary" iteration respectively) - the diphoton invariant mass distribution at each iteration for the Inner, Middle and Outer parts of the ECAL and for the whole electromagnetic calorimeter respectively.
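
The naming scheme above can be turned into a small helper that lists the histogram names to look for in OutputHistograms.root. This is only a sketch of the naming convention as described here; the function names are hypothetical, and it is an assumption that the iteration numbers are written as plain integers without padding.

```python
# ECAL zones used in the histogram names; "" stands for the whole ECAL.
ZONES = ("Inner", "Middle", "Outer", "")


def mass_histogram_names(secondary, primary):
    """Names of the diphoton mass histograms for one iteration.

    secondary -- number N of the "secondary" iteration
    primary   -- number M of the "primary" iteration
    """
    return ["Mass%sPass%dIt%d" % (zone, secondary, primary) for zone in ZONES]


def convergence_histogram_names(n_secondary):
    """Names of the peak position (Conv) and resolution (Sigm) histograms."""
    names = []
    for n in range(1, n_secondary + 1):
        names.extend(["Conv%d" % n, "Sigm%d" % n])
    return names


if __name__ == "__main__":
    for name in convergence_histogram_names(2) + mass_histogram_names(1, 3):
        print(name)
```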
The OutputCoefficients.root file contains the distributions of the calibration constants:
  • It1distr(Inner/Middle/Outer) - the distributions of the coefficients obtained during the first "secondary" iteration fitted with a Gauss function;
  • It2distr(Inner/Middle/Outer) - the distributions of the coefficients obtained during the second "secondary" iteration fitted with a Gauss function;
  • It12distr(Inner/Middle/Outer) - the total distributions of the coefficients fitted with a Gauss function;
  • comp2DFace(Inner/Middle/Outer) - the 2D distribution of the total coefficients over the surface of the Inner, Middle and Outer zones of the ECAL (one bin corresponds to one cell).
Several notes:

To reduce the time spent on the tests one can reduce the number of iterations (by default there are 2 “secondary” iterations with 7 “primary” iterations in each). To do this, open

$User_release_area/AlignmentOnlineDev_v11r0/AlignmentOnline/PyKaliOnline/python/PyKaliOnline/Paths.py

and change the PassIt (twice the number of primary iterations) and MaxIt (total number of iterations) parameters. For example, for 2 “secondary” iterations with n “primary” iterations in each, set PassIt = 2n and MaxIt = 4n+1.
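
The rule for PassIt and MaxIt quoted above can be written down as a short helper. This is a sketch of that arithmetic only: it covers the case of 2 “secondary” iterations, which is the only one the formula is given for, and the function name iteration_params is hypothetical (it is not defined in Paths.py).

```python
def iteration_params(n_primary):
    """Return (PassIt, MaxIt) for 2 "secondary" iterations.

    n_primary -- number n of "primary" iterations in each "secondary" one.
    The formulas PassIt = 2n and MaxIt = 4n + 1 are the ones quoted in the
    text; other numbers of "secondary" iterations are not covered.
    """
    if n_primary < 1:
        raise ValueError("at least one primary iteration is required")
    pass_it = 2 * n_primary
    max_it = 4 * n_primary + 1
    return pass_it, max_it


if __name__ == "__main__":
    for n in (1, 3, 7):
        print("n = %d: PassIt = %d, MaxIt = %d" % ((n,) + iteration_params(n)))
```

For a quick test, n = 1 gives the smallest values allowed by the formula, PassIt = 2 and MaxIt = 5.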

All data is passed between the Iterator and the Analyser by saving it to files. The filenames always contain the host name of the node on which the calibration is run. So, if you stopped the task and wish to resume it from the existing point (for example, to reuse the already existing DSTs instead of rerunning the reconstruction step), make sure that you run it on the same node.
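
The dependence on the host name can be illustrated as follows. This is a sketch under stated assumptions: the actual naming pattern used by PyKaliOnline is not given on this page, so the pattern stem_host.extension and both function names are hypothetical. The point is only that files written on one node are not found when the task is restarted on another.

```python
import socket


def node_filename(stem, extension, host=None):
    """Build a file name that embeds the host name (hypothetical pattern)."""
    host = host or socket.gethostname()
    return "%s_%s.%s" % (stem, host, extension)


def files_for_host(filenames, host=None):
    """Keep only the files whose name contains the given host name."""
    host = host or socket.gethostname()
    return [name for name in filenames if host in name]


if __name__ == "__main__":
    existing = [node_filename("brunel", "dst", host="hlta0101"),
                node_filename("brunel", "dst", host="hlta0102")]
    # A task restarted on hlta0102 only picks up its own file.
    print(files_for_host(existing, host="hlta0102"))
```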

An example of an output database with histograms can be downloaded from https://cernbox.cern.ch/index.php/s/NocZidC69TUxkGX. An example script to read it is ReadHistoMap.py.txt (remove the .txt extension before use). The DaVinci environment is needed to run the script.

Calibration

The calibration is run from the control panel of the LHCb Align partition (aka LHCbA), which is shared between several tasks. Note that there is a timetable for the partition usage. The calibration also should not be run during data taking (which can be checked with the vistar).

Please note that at the moment one should not use the partition unless explicit permission is given!

Code and data location

The code is available in the AlignmentOnline/PyKaliOnline package of the AlignmentOnline project. The version used for the online calibration is located at /group/calo/cmtuser/AlignmentOnlineDev_v10r4/KaliOnline/PyKaliOnline/python/PyKaliOnline/ (accessible from any plus or hlt machine).

The startup scripts point to the debug build. So, when making any modifications, don't forget to compile the debug build as well:

>> echo $CMTCONFIG
x86_64-slc6-gcc49-opt
>> export CMTCONFIG=x86_64-slc6-gcc49-dbg
>> make
>> export CMTCONFIG=x86_64-slc6-gcc49-opt
>> make
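
The two builds above can also be driven from a short script, so that neither of them is forgotten. This is a minimal sketch: it assumes that the platform string ends in "-opt" as in the example above and that a plain make in the current directory is what is wanted. The function names are hypothetical.

```python
import os
import subprocess


def debug_variant(cmtconfig):
    """Swap the trailing '-opt' of a platform string for '-dbg'."""
    if not cmtconfig.endswith("-opt"):
        raise ValueError("expected a platform string ending in '-opt': %r" % cmtconfig)
    return cmtconfig[:-len("-opt")] + "-dbg"


def build_both(cmtconfig, run=subprocess.check_call):
    """Run 'make' once per build type, debug first, as in the commands above.

    The 'run' argument exists only so that the function can be tried out
    without actually compiling anything.
    """
    for config in (debug_variant(cmtconfig), cmtconfig):
        env = dict(os.environ, CMTCONFIG=config)
        run(["make"], env=env)


if __name__ == "__main__":
    print(debug_variant("x86_64-slc6-gcc49-opt"))
```

Calling build_both(os.environ["CMTCONFIG"]) from the project directory would then compile the debug and the optimised build in turn.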

The startup scripts are /group/online/dataflow/cmtuser/OnlineDev_v5r19/Online/FarmConfig/job/Alig{Drv,Wrk}.sh. You have to be the user online to change them. The raw input data on each node is located in the folder

/localdisk/Alignment/BWDivision/

LHCbA partition

To open it log in to a ui machine:

>> ssh -Y lbgw
>> ssh -Y ui
>> /group/online/ecs/Shortcuts311/LHCb/ECS/ECS_UI_FSM.sh &

That will start the run control. Right-click on LHCbAlign.

LHCbA0.jpg

The control panel will open. To be able to see the output logs, open the error logger. To do this, run in another shell:

>> ssh -Y lbgw
>> ssh -Y plus
>> errorLog LHCbA

(three windows with a black background will open).

Back to the LHCbA partition control panel. When the partition is released, all the buttons are grey. Click the lock sign to take it:
LHCbAreleased.jpg

This will open a small new window; click the "Take" button. Taking the partition can take a few minutes.

Modes.jpg

Click "Dismiss" on the popup.

JustSomeInfoWindow.jpg

The partition is now in the "NOT_ALLOCATED" state. To prepare for running:

  1. Enter your name in the window and click "Reserve alignment", so that everyone knows who is using the partition at the moment.
  2. Select the Calo task from the menu.
  3. Choose the runs to be used.
notAllocated-Beginning.jpg

After clicking on "Choose runs for alignment" a new window will open. Usually not much data is saved under the Calo task, so the data from the BWDivision task is used for the calibration. Select the "BWDivision" task from the menu, then select the runs (Shift+Up or Shift+Down to select several runs). Close the window by clicking "OK".
RunsSelect.jpg

Getting back to the control panel, one can now send the "ALLOCATE" command from the top:
notAllocated-Allocate.jpg

ToAllocate.jpg

The partition will become "NOT_READY". Send "CONFIGURE" from the top.
ToConfigure.jpg

Sometimes during the configuration the partition can go to an "ERROR" state, as not all the nodes are ready to work. This can be overcome by recovering
ToRecover.jpg

This will bring the partition into the "NOT_READY" state. Then try to configure once again. The nodes causing trouble will be excluded. Another way to exclude nodes is to click the "HLT" button.
HLT.jpg

A new window will open, in which one can look at the states of the nodes and include or exclude them. To include (exclude) nodes, select them in the right (left) column and move them to the left (right) one by clicking the left (right) arrow. Then click the "Include" ("Remove") button below. The same can be done for a whole sub-farm.
ToRemove.jpg

When the partition is finally in the "READY" state, send "START_RUN" from the top. That will start everything.
LHCbA1.jpg
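
The sequence of states and commands described in this section can be summarised as a small transition table. This is a sketch of the procedure as described on this page, not of the run control itself: the command name "RECOVER" is an assumption (the text only speaks of "recovering"), and the transition into "ERROR" during configuration is not modelled because it is not triggered by a command.

```python
# (current state, command) -> next state, following the steps above.
TRANSITIONS = {
    ("NOT_ALLOCATED", "ALLOCATE"): "NOT_READY",
    ("NOT_READY", "CONFIGURE"): "READY",
    ("ERROR", "RECOVER"): "NOT_READY",  # command name assumed
    ("READY", "START_RUN"): "RUNNING",
}


def apply_commands(state, commands):
    """Follow a list of commands from the given state and return the final state."""
    for command in commands:
        try:
            state = TRANSITIONS[(state, command)]
        except KeyError:
            raise ValueError("command %s is not expected in state %s" % (command, state))
    return state


if __name__ == "__main__":
    print(apply_commands("NOT_ALLOCATED", ["ALLOCATE", "CONFIGURE", "START_RUN"]))
```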

Error logs

The error logs are the three windows that show the output of the job.
errorLog.jpg

The Message log shows the output from all the nodes in real time. Usually it is more convenient to set its output level to "Warning" or "Error". The History log shows the output from a particular node. The Error logger allows one to change the output settings:
errorLogPanel.jpg

To look at the history output from a certain node, just enter its name (for example "hlta0101") and press "Enter". The Iterator does its job on the "hlt0x" machines (where "x" is usually "1" or "2"); its output can also be viewed through the history display. To set the Message log output level, change the value in the "Severity of the messages" window: press the "Up" key on your keyboard twice and then select the necessary value with the ">" and "<" buttons. To change the number of output lines, press the "Up" key, enter the number of lines you would like to see in the output, and press "Enter".

Output location

Output histograms and calibration coefficients go to

/group/calo/CalibWork/Store/

The fmDSTs, DSTs and root files created during the work are saved to

/localdisk/Alignment/Calo

at each node.

How it works

Regardless of whether the test or the full calibration is being run, the scripts that govern the job are the Iterator and the Analyser. The Iterator runs on a single node, counting the iterations and performing those tasks that do not need to be parallelized. The Analyser script runs on each node of the farm; it has access to the piece of data saved on that node and allows large tasks to be split into many parallel processes.

Since they cannot pass data to each other directly, the Iterator and the Analyser communicate through the Communicator by sending and receiving information about each other's states. All other data that needs to be passed between different algorithms is saved as files in specific folders. The DSTs, fmDSTs and root files are saved on the local disk of each node (accessible by the Analyser only). The databases with histograms and calibration constants are saved to a common folder and can be picked up by both the Analysers and the Iterator.

The Paths.py file contains the locations and names under which the intermediate data is saved, and several functions to get the list of existing files of a certain type (for example, a list of the databases with calibration constants obtained from several nodes). This is done for convenience, as the same files can be used by several scripts. Paths.py also contains the variables for the required number of iterations and an import of some Online configuration.

The RunBrunel.py (RunKali.py) script sets up the Brunel (DaVinci) environment and runs the reconstruction (Kali functions) in this environment as if it were run from the command line. MultiBrunelStep is just a usual Brunel job with some tuning that allows it to run online. To make the reconstruction on each node faster, the job is also split between the cores of the node using the Python multiprocessing module.
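
The idea of splitting a job between the cores of a node can be sketched with the standard multiprocessing module. This is an illustration only, not the code of MultiBrunelStep: the per-core job is replaced by a placeholder that counts files, the file names are invented, and whether the real scripts use a Pool or individual Process objects is not stated on this page.

```python
import multiprocessing


def split_evenly(items, n_parts):
    """Split items into at most n_parts contiguous chunks of near-equal size."""
    if not items:
        return []
    n_parts = max(1, min(n_parts, len(items)))
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        stop = start + size + (1 if i < extra else 0)
        chunks.append(items[start:stop])
        start = stop
    return chunks


def process_chunk(files):
    """Placeholder for the per-core job; here it only counts the files."""
    return len(files)


def run_on_all_cores(files, n_workers=None):
    """Process the chunks in parallel, one worker process per chunk."""
    chunks = split_evenly(files, n_workers or multiprocessing.cpu_count())
    if not chunks:
        return []
    pool = multiprocessing.Pool(processes=len(chunks))
    try:
        return pool.map(process_chunk, chunks)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    input_files = ["run_%03d.raw" % i for i in range(10)]  # invented names
    print(run_on_all_cores(input_files, n_workers=4))
```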

The MultiKaliPi0_producefmDST script is a Kali (DaVinci) job that performs the neutral pion selection and creates the fmDSTs and root files. It can be run in two modes. If no calibration constants are given (which normally means the very first iteration, the "first pass"), it produces the fmDSTs and root files using the DSTs as input. When a database with constants is received, it runs a re-reconstruction from the fmDSTs (the "second pass"). Again, to speed up the processing, this task is split between the cores of each node using the multiprocessing module.

The Run.py file gathers several functions that perform different tasks at different stages of the iterative procedure. These are:

  • FillTheHistograms - given a list of root files, fills the histograms for each cell using the corresponding Kali function;
  • SplitHistos - collects the databases with histograms filled at different nodes and merges them so that the full statistics is used for each cell. Then the full database is split into groups of cells and saved to different files to be processed by the fitting function at different nodes;
  • FitTheHistograms - given a database with histograms, fits them and calculates the calibration constants using the corresponding Kali functions;
  • CollectLambdas - collects the databases with calibration constants for each group of cells obtained at different nodes and saves them to a single file;
  • GetOutput - at the end of the calibration produces the files with some statistical histograms and the final list of calibration constants.
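
The merge-then-split step performed by SplitHistos can be illustrated with plain dictionaries. This is a toy sketch and not the Kali implementation: a "database" is taken to be a dict mapping a cell name to a list of bin contents, the round-robin grouping is an arbitrary choice, and both function names are hypothetical.

```python
def merge_histograms(node_databases):
    """Sum the per-cell bin contents coming from several nodes."""
    merged = {}
    for database in node_databases:
        for cell, bins in database.items():
            if cell in merged:
                merged[cell] = [a + b for a, b in zip(merged[cell], bins)]
            else:
                merged[cell] = list(bins)
    return merged


def split_into_groups(merged, n_groups):
    """Distribute the cells round-robin over n_groups smaller databases."""
    groups = [{} for _ in range(n_groups)]
    for index, cell in enumerate(sorted(merged)):
        groups[index % n_groups][cell] = merged[cell]
    return groups


if __name__ == "__main__":
    node_a = {"cell1": [1, 0, 2], "cell2": [0, 1, 0]}
    node_b = {"cell1": [2, 2, 2], "cell3": [5, 5, 5]}
    full = merge_histograms([node_a, node_b])
    for number, group in enumerate(split_into_groups(full, 2)):
        print(number, sorted(group))
```

Each group can then be fitted independently on a different node, after which the resulting constants are collected into a single file, as CollectLambdas does.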

Some tips

Reconstruction and fmDST production steps

Some problems (errors like 'PropertyConfigSvc: could not obtain alias TCK/0x11291600', or even empty output without any visible errors) may arise at the reconstruction/fmDST production steps because of outdated tags or software versions. Currently these versions are set manually in the code. The latest versions of Online and OnlineBrunel available online can be looked up here:

/group/online/dataflow/cmtuser/

Then open '/group/calo/cmtuser/AlignmentOnlineDev_v10r4/KaliOnline/PyKaliOnline/python/PyKaliOnline/RunBrunel.py' and in the string

brunel_path = "/group/online/dataflow/cmtuser/OnlineBrunel_%s" %'vXrYY'

replace the OnlineBrunel (vXrYY) version with the one you wish to use.
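
Looking up the newest OnlineBrunel version can be scripted. This is a minimal sketch: it assumes that the directories are named OnlineBrunel_vXrY with an optional pN patch suffix, as suggested by the brunel_path string above and by the usual LHCb version naming. The function name is hypothetical, and the directory names used in the test are invented.

```python
import os
import re

# OnlineBrunel_v<major>r<minor>, optionally followed by p<patch>.
VERSION_RE = re.compile(r"^OnlineBrunel_v(\d+)r(\d+)(?:p(\d+))?$")


def latest_online_brunel(dirnames):
    """Return the directory name with the highest version, or None."""
    best_key, best_name = None, None
    for name in dirnames:
        match = VERSION_RE.match(name)
        if not match:
            continue
        # Compare versions numerically, so that r19 is newer than r9.
        key = tuple(int(group or 0) for group in match.groups())
        if best_key is None or key > best_key:
            best_key, best_name = key, name
    return best_name


if __name__ == "__main__":
    area = "/group/online/dataflow/cmtuser/"
    if os.path.isdir(area):
        print(latest_online_brunel(os.listdir(area)))
```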

The latest available CondDB tags can be found in

/group/online/hlt/conditions/

Look for everything starting with 'LHCBCOND_'. The part of the filename of the form 'cond-yyyymmdd' (where yyyymmdd are 8 digits) is the name of the tag. Then open '/group/calo/cmtuser/AlignmentOnlineDev_v10r4/KaliOnline/PyKaliOnline/python/PyKaliOnline/Paths.py' and set the tag in the function

def importOnline():
    [...]
    Online.CondDBTag = 'cond-yyyymmdd'
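
The tag lookup described above can be sketched as follows. The code relies only on what is stated in the text (files starting with 'LHCBCOND_' and a 'cond-yyyymmdd' part in the name). The function name is hypothetical, and the file names used in the test are invented.

```python
import os
import re

# 'cond-' followed by the 8 digits of the date.
TAG_RE = re.compile(r"cond-\d{8}")


def latest_cond_tag(filenames):
    """Return the most recent 'cond-yyyymmdd' tag among LHCBCOND_ files, or None."""
    tags = set()
    for name in filenames:
        if not name.startswith("LHCBCOND_"):
            continue
        match = TAG_RE.search(name)
        if match:
            tags.add(match.group(0))
    # With a fixed yyyymmdd layout, string order is chronological order.
    return max(tags) if tags else None


if __name__ == "__main__":
    conditions = "/group/online/hlt/conditions/"
    if os.path.isdir(conditions):
        print(latest_cond_tag(os.listdir(conditions)))
```

The returned string is what would be assigned to Online.CondDBTag in importOnline().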

If you wish to use a newer version of DaVinci, first check it out from the repository (getpack Phys/KaliCalo if needed):

cd /group/calo/cmtuser
lb-dev DaVinci vXXrY
cd DaVinciDev_vXXrY
make configure
make -j 5 install

then do

source /group/online/dataflow/scripts/shell_macros.sh
cd /path/to/DaVinciDev_vXXrY
do_configure
do_install
cmsetup

and don't forget to run make. Then open '/group/calo/cmtuser/AlignmentOnlineDev_v10r4/KaliOnline/PyKaliOnline/python/PyKaliOnline/RunKali.py' and in the string

dv_path = "/group/calo/cmtuser/DaVinciDev_v40r1p3"

replace the DaVinci version with the one you wish to use. Don't forget to rebuild the AlignmentOnline package as well.

During the run

The configuration of the run is picked up when the "CONFIGURE" command is sent from the top. So, if any changes were made in the calibration code and one wishes them to be included, then after compiling (both the opt and the debug build) it is necessary to configure once again. To do this:

  • Stop the run;
  • Send the "RESET" command; the partition will go into the "NOT_READY" state;
  • Send the "CONFIGURE" command.
Sometimes during an iteration one or several nodes can stall, which hangs everything in the "RUNNING" state with no visible activity (no output in the logs for these nodes). One can exclude them as shown above without stopping the run. The job will then continue (the partition will go to the "READY" state, as it should between iterations).

-- DariaSavrina - 2016-02-05

Topic attachments
| Attachment | Size | Date | Who | Comment |
| ReadHistoMap.py.txt | 1.0 K | 2016-03-15 00:12 | DariaSavrina | |
| ReadLambdaMap.py.txt | 0.5 K | 2016-06-09 21:20 | DariaSavrina | Example on how to read the coefficient database |
| ReadProblematicCells.py.txt | 1.0 K | 2016-06-09 21:17 | DariaSavrina | Example on how to read the Problematic cells database |
Topic revision: r22 - 2016-12-22 - DariaSavrina
 
