ATLAS Analysis Example

Introduction

ROOT can be run in batch mode on the grid to analyze large data samples. This example creates simulated data in ROOT format using trees and then analyzes that data with jobs designed to be processed on the grid. It is based on a demo developed by OU programmer Chris Walker.

Prerequisite

  • Open a new Terminal on your local desktop. NOTE: You are no longer using the browser-based terminal, but the Terminal on your CentOS VM, just as you did to display mandle.gif with firefox.
  • Make a directory for this exercise
$ mkdir -p analysis_example
$ cd analysis_example

Again, the $ sign at the beginning of each command is the command prompt, so do not enter it as part of the command.

Simple Analysis Example

Step 1: Create simulated data using the grid

Note: Since the new training VMs on OSpool do not support running root, we will run root on the local desktops instead of using condor. We therefore do not need the condor submit scripts below, but their instructions are kept for future reference.

Now in your test directory we will create the three files: run-root.cmd, run-root.sh, and run-root.C with the contents given below. This may require running an editor such as emacs or nano on your local desktop.

We will not submit grid jobs so the "run-root.cmd" script is not needed for this exercise.

-----------------------------Skip from here-1-----------------------------------------

First, we will utilize a simple command script to submit the grid jobs. It is run-root.cmd:

universe=vanilla
executable=run-root.sh
transfer_input_files = run-root.C
transfer_executable=True
when_to_transfer_output = ON_EXIT
log=run-root.log
transfer_output_files = root.out,t00.root,t01.root
output=run-root.out.$(Cluster).$(Process)
error=run-root.err.$(Cluster).$(Process)
notification=Never
queue 

-----------------------------------------------Skip to here-1----------------------------------------------------

The executable script is run-root.sh, which is as follows:

#!/bin/bash 
root -b < run-root.C > root.out

This script runs ROOT in batch mode, feeds it the macro run-root.C on standard input, and routes the output to the file root.out. It has to be made executable with the chmod Linux command (permissions can be checked with the command ls -l):

$ chmod +x run-root.sh

The macro run-root.C consists of the following code:

{ 

 // create files containing simulated data

 TRandom g; 
 char c[256]; 
 for ( int j = 0 ; j < 2 ; j++ ){ 
    snprintf(c, sizeof(c), "t%2.2d.root", j); // file names t00.root, t01.root
    TFile f(c,"RECREATE","MyFile", 0/*no compression*/); 
    TTree *t = new TTree("t0","t0"); 
    Int_t Run; 
    TBranch * b_Run = t->Branch("Run",&Run); 
    Int_t Event; 
    TBranch * b_Event = t->Branch("Event",&Event); 
    Float_t Energy; 
    TBranch * b_Energy = t->Branch("Energy",&Energy); 
    Run = j; 

        for( Event = 0 ; Event < 100 ; Event++ ){ 
          Energy = g.Gaus(500.0 , 200.0);   
          t->Fill(); 
        }  
    f.Write(); 
    f.Close(); 
 } 
} 
.q 
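
The file naming and the Gaussian sampling used by the macro can be tried without ROOT. The following is a minimal sketch in standard C++, not part of the original demo. The helper names dataFileName and simulateEnergies are hypothetical, and std::mt19937 with std::normal_distribution is assumed as a stand-in for TRandom::Gaus, so the individual values will not match the ones ROOT produces.

```cpp
#include <cstdio>
#include <random>
#include <string>
#include <vector>

// Build the output file name with the macro's format string:
// "%2.2d" prints the index with at least two digits, so 0 becomes "00".
std::string dataFileName(int index) {
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "t%2.2d.root", index);
    return std::string(buffer);
}

// Draw nEvents energies from a Gaussian with the macro's mean (500) and
// sigma (200). Assumption: std::mt19937 replaces ROOT's TRandom here.
std::vector<float> simulateEnergies(int nEvents, unsigned seed) {
    std::mt19937 generator(seed);
    std::normal_distribution<float> gaus(500.0f, 200.0f);
    std::vector<float> energies;
    energies.reserve(nEvents > 0 ? nEvents : 0);
    for (int event = 0; event < nEvents; ++event) {
        energies.push_back(gaus(generator));
    }
    return energies;
}
```

For example, dataFileName(0) returns "t00.root" and dataFileName(1) returns "t01.root", the two files written by the loop over j, and simulateEnergies(100, 1) returns 100 values, one per event.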

We will not submit grid jobs during this exercise. So we will skip to running root.

-----------------------------Skip from here-2-----------------------------------------

The grid job can be submitted using:

$ condor_submit run-root.cmd

It can be checked with:

$ condor_q YOUR_USER_ID -nobatch

After it runs, you will find a log file that describes the job: run-root.log, an output file: root.out, and the files containing the simulated data: t00.root, t01.root in your test directory. You need to copy the data files into your public directory, so that you can download them to your local desktop:

$ cp t0*.root ~/public/

Now open a different terminal window on your local desktop, and download the root files with:

$ wget http://stash.osgconnect.net/~YOUR_USER_ID/t00.root  http://stash.osgconnect.net/~YOUR_USER_ID/t01.root

-----------------------------------------------Skip to here-2----------------------------------------------------

Execute the script to run root:

$ ./run-root.sh

You can then inspect the contents of t00.root and t01.root by running root in your current directory in the local terminal window:

$ root t00.root

Then enter the ROOT command: TBrowser b

With the TBrowser you can plot the simulated data in branch Energy as well as the other branches. Double click on the name of the root files, and then on the variables you would like to plot.

Each data file contains a TTree named t0. You can plot the contents of the TTrees from all data files (both, in this example) by using a TChain as follows:

In root execute the following commands:

TChain tc("t0");
tc.Add("t*.root");
tc.Draw("Energy");

When you are done with this, you can quit root again with the command .q <Return>.
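
Conceptually, a chain presents the entries of several files as one continuous sequence. The sketch below illustrates that idea in standard C++. It is not ROOT's implementation, the function name locateEntry is hypothetical, and it assumes the files are chained in the order t00.root, t01.root with 100 entries each, as written in Step 1.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Map a global entry number of a chain onto (file index, local entry),
// given the number of entries in each chained file in the order added.
// Returns (-1, -1) if the global entry is outside the chain.
std::pair<int, long long> locateEntry(const std::vector<long long>& entriesPerFile,
                                      long long globalEntry) {
    if (globalEntry < 0) {
        return std::make_pair(-1, -1LL);
    }
    long long offset = 0;  // first global entry number of the current file
    for (std::size_t i = 0; i < entriesPerFile.size(); ++i) {
        if (globalEntry < offset + entriesPerFile[i]) {
            return std::make_pair(static_cast<int>(i), globalEntry - offset);
        }
        offset += entriesPerFile[i];
    }
    return std::make_pair(-1, -1LL);
}
```

With two files of 100 entries each, global entry 150 maps to file index 1 and local entry 50, which is why tc.Draw("Energy") on the chain fills the plot with all 200 simulated events.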

Step 2: Analyze Real Data

We will not submit grid jobs during this exercise. So we will skip the submit script.

-----------------------------Skip from here-3-----------------------------------------

Now we want to have a look at a real ATLAS root file. For this, go back to the remote terminal window on osgconnect. You will need a new condor submit script called run-z.cmd:

universe=vanilla
executable=run-z.sh
transfer_input_files = readEvents.C,/home/pskubic/public/muons.root
transfer_executable=True
when_to_transfer_output = ON_EXIT
log=run-z.log
transfer_output_files = root-z.out,histograms-z.root
output=run-z.out.$(Cluster).$(Process)
error=run-z.err.$(Cluster).$(Process)
notification=Never
queue 

-----------------------------------------------Skip to here-3----------------------------------------------------

The new executable script you need for this job is run-z.sh, which is as follows:

#!/bin/bash 
root -b -q readEvents.C+ > root-z.out

This script runs ROOT in batch mode, compiles and executes the macro readEvents.C (the trailing + asks ROOT to compile the macro before running it, and -q makes ROOT quit afterwards), and routes the output to the file root-z.out. It has to be made executable with the chmod Linux command (permissions can be checked with the command ls -l):

$ chmod +x run-z.sh

The macro readEvents.C consists of the following code:

#include "TFile.h"
#include "TTree.h"
#include "TCanvas.h"
#include "TH1F.h"
#include <cmath>     // abs, cos, sin, sinh, sqrt on floating-point values
#include <iostream>
//#include "TLorentzVector.h"
using namespace std;

void readEvents(){

    // load the ROOT ntuple file
    TFile * f = new TFile("muons.root");
    TTree *tree = (TTree *) f->Get("POOLCollectionTree");
    int nEntries = tree->GetEntries();
    cout << "There are " << nEntries << " entries in your ntuple" << endl;

    // create local variables for the tree's branches
    UInt_t NLooseMuons;
    Float_t LooseMuonsEta1;
    Float_t LooseMuonsPhi1;
    Float_t LooseMuonsPt1;

    Float_t LooseMuonsEta2;
    Float_t LooseMuonsPhi2;
    Float_t LooseMuonsPt2;

    // set the tree's branches to the local variables
    tree->SetBranchAddress("NLooseMuon", &NLooseMuons);
    tree->SetBranchAddress("LooseMuonEta1", &LooseMuonsEta1);
    tree->SetBranchAddress("LooseMuonPhi1", &LooseMuonsPhi1);
    tree->SetBranchAddress("LooseMuonPt1", &LooseMuonsPt1);

    tree->SetBranchAddress("LooseMuonEta2", &LooseMuonsEta2);
    tree->SetBranchAddress("LooseMuonPhi2", &LooseMuonsPhi2);
    tree->SetBranchAddress("LooseMuonPt2", &LooseMuonsPt2);

    // declare some histograms
  TH1F *muPt1 = new TH1F("muPt1", ";p_{T} [GeV/c];Events", 50, 0, 200);
  TH1F *muPx1 = new TH1F("muPx1", ";p_{x} [GeV/c];Events", 50, 0, 200); //added px
  TH1F *muPy1 = new TH1F("muPy1", ";p_{y} [GeV/c];Events", 50, 0, 200); //added py
  TH1F *muPz1 = new TH1F("muPz1", ";p_{z} [GeV/c];Events", 50, 0, 200); //added pz
  TH1F *muEta1 = new TH1F("muEta1", ";#eta;Events", 50, -3, 3);
  TH1F *muPhi1 = new TH1F("muPhi1", ";#phi;Events", 50, -4, 4);
  TH1F *muE1 = new TH1F("muE1", ";Energy;Events", 50, 0, 200);

  TH1F *muPt2 = new TH1F("muPt2", ";p_{T} [GeV/c];Events", 50, 0, 200);
  TH1F *muPx2 = new TH1F("muPx2", ";p_{x} [GeV/c];Events", 50, 0, 200); //added px
  TH1F *muPy2 = new TH1F("muPy2", ";p_{y} [GeV/c];Events", 50, 0, 200); //added py
  TH1F *muPz2 = new TH1F("muPz2", ";p_{z} [GeV/c];Events", 50, 0, 200); //added pz
  TH1F *muEta2 = new TH1F("muEta2", ";#eta;Events", 50, -3, 3);
  TH1F *muPhi2 = new TH1F("muPhi2", ";#phi;Events", 50, -4, 4);
  TH1F *muE2 = new TH1F("muE2", ";Energy;Events", 50, 0, 200);

  TH1F *zPt = new TH1F("zPt", ";p_{T} [GeV/c];Events", 50, 0, 200);
  TH1F *zPx = new TH1F("zPx", ";p_{x} [GeV/c];Events", 50, 0, 200); //added px
  TH1F *zPy = new TH1F("zPy", ";p_{y} [GeV/c];Events", 50, 0, 200); //added py
  TH1F *zPz = new TH1F("zPz", ";p_{z} [GeV/c];Events", 50, 0, 200); //added pz
  //TH1F *zEta = new TH1F("zEta", ";#eta;Events", 50, -3, 3);
  //TH1F *zPhi = new TH1F("zPhi", ";#phi;Events", 50, -4, 4);
  TH1F *zE = new TH1F("zE", ";Energy;Events", 50, 0, 200);  
  TH1F *zMass = new TH1F("zMass", ";Mass;Events", 50, 0, 200);  


    // loop over each entry (event) in the tree
    for( int entry=0; entry < nEntries; entry++ ){
      // progress printout; the "* 0" keeps it switched off, use "== 0" instead to enable it
      if( entry%10000 * 0 ) cout << "Entry:" << entry << endl;

      // check that the event is read properly
      int entryCheck = tree->GetEntry( entry );
      if( entryCheck <= 0 ){  continue; }

      // only look at events containing at least 2 leptons
      if(NLooseMuons < 2) continue;

      // require the leptons to have some transverse momentum
      if(abs(LooseMuonsPt1) *0.001 < 20 || abs(LooseMuonsPt2) *0.001 < 20 ) continue;

      // make a LorentzVector from the muon
      //TLorentzVector Muons1;
     // Muons1.SetPtEtaPhiM(fabs(LooseMuonsPt1), LooseMuonsEta1, LooseMuonsPhi1, 0);

      // print out the details of the two muons every so often
      // (the "* 0" keeps this switched off, use "== 0" instead to enable it)
      if( entry%10000 * 0 ){ 
        cout << "Muons pt1: " << LooseMuonsPt1 << " eta: " << LooseMuonsEta1 << " phi " << LooseMuonsPhi1 << endl;
        cout << "Muons pt2: " << LooseMuonsPt2 << " eta: " << LooseMuonsEta2 << " phi " << LooseMuonsPhi2 << endl;
      }

      //calculation of muon energy
      Double_t muonMass = 0.0;  // assume the mass of the muon is negligible
      Double_t muonPx1 = abs(LooseMuonsPt1)*cos(LooseMuonsPhi1);
      Double_t muonPy1 = abs(LooseMuonsPt1)*sin(LooseMuonsPhi1);
      Double_t muonPz1 = abs(LooseMuonsPt1)*sinh(LooseMuonsEta1);
      Double_t muonEnergy1 = sqrt (muonPx1*muonPx1 + muonPy1*muonPy1 + muonPz1*muonPz1 + muonMass*muonMass);

      Double_t muonPx2 = abs(LooseMuonsPt2)*cos(LooseMuonsPhi2);
      Double_t muonPy2 = abs(LooseMuonsPt2)*sin(LooseMuonsPhi2);
      Double_t muonPz2 = abs(LooseMuonsPt2)*sinh(LooseMuonsEta2);
      Double_t muonEnergy2 = sqrt (muonPx2*muonPx2 + muonPy2*muonPy2 + muonPz2*muonPz2 + muonMass*muonMass);

      Double_t zCompX = muonPx1 + muonPx2;
      Double_t zCompY = muonPy1 + muonPy2;
      Double_t zLongi = muonPz1 + muonPz2;
      Double_t zPerp = sqrt (zCompX*zCompX + zCompY*zCompY);
      Double_t zEnergy = muonEnergy1 + muonEnergy2;
      Double_t zM = sqrt (zEnergy*zEnergy -zCompX*zCompX -zCompY*zCompY -zLongi*zLongi);


      // fill our histograms
      muPt1->Fill((LooseMuonsPt1)*0.001); // in GeV
      muEta1->Fill(LooseMuonsEta1);
      muPhi1->Fill(LooseMuonsPhi1);
      muPx1->Fill( muonPx1*0.001); // in GeV
      muPy1->Fill( muonPy1*0.001); // in GeV
      muPz1->Fill( muonPz1*0.001); // in GeV
      muE1->Fill(muonEnergy1*0.001); // in GeV

      muPt2->Fill((LooseMuonsPt2)*0.001); // in GeV
      muEta2->Fill(LooseMuonsEta2);
      muPhi2->Fill(LooseMuonsPhi2);
      muPx2->Fill( muonPx2*0.001); // in GeV
      muPy2->Fill( muonPy2*0.001); // in GeV
      muPz2->Fill( muonPz2*0.001); // in GeV
      muE2->Fill(muonEnergy2*0.001); // in GeV

      zPt->Fill( zPerp*0.001); // in GeV
      zPx->Fill( zCompX*0.001); // in GeV
      zPy->Fill( zCompY*0.001); // in GeV
      zPz->Fill( zLongi*0.001); // in GeV
      zE->Fill( zEnergy*0.001); // in GeV
      zMass->Fill(zM*0.001); // in GeV

    }

  // draw the Z mass distribution
  zMass->Draw();

  // make a ROOT output file to store your histograms
  TFile *outFile = new TFile("histograms-z.root", "recreate");
  muPt1->Write();
  muEta1->Write();
  muPhi1->Write();
  muE1->Write();
  muPx1->Write();
  muPy1->Write();
  muPz1->Write();

  muPt2->Write();
  muEta2->Write();
  muPhi2->Write();
  muE2->Write();
  muPx2->Write();
  muPy2->Write();
  muPz2->Write();

  zPt->Write();
  zE->Write();
  zPx->Write();
  zPy->Write();
  zPz->Write();
  zMass->Write();

  outFile->Close();
}
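
The core of readEvents.C is the reconstruction of the dimuon invariant mass from each muon's transverse momentum, pseudorapidity and azimuthal angle, with the muon mass neglected. The sketch below isolates that calculation in standard C++ so it can be checked without ROOT or the input file. The function names are hypothetical, the clamp to zero is an added safeguard against floating-point rounding that the macro does not have, and the closed form m^2 = 2 pT1 pT2 (cosh(eta1 - eta2) - cos(phi1 - phi2)) is included only as a cross-check valid for massless particles.

```cpp
#include <algorithm>
#include <cmath>

// Invariant mass of two massless particles from (pT, eta, phi), using the
// same components as readEvents.C: px = pT cos(phi), py = pT sin(phi),
// pz = pT sinh(eta). The result has the same units as pT.
double invariantMass(double pt1, double eta1, double phi1,
                     double pt2, double eta2, double phi2) {
    const double px = pt1 * std::cos(phi1) + pt2 * std::cos(phi2);
    const double py = pt1 * std::sin(phi1) + pt2 * std::sin(phi2);
    const double pz = pt1 * std::sinh(eta1) + pt2 * std::sinh(eta2);
    // For a massless particle E = |p| = pT cosh(eta).
    const double energy = pt1 * std::cosh(eta1) + pt2 * std::cosh(eta2);
    const double m2 = energy * energy - px * px - py * py - pz * pz;
    // Clamp tiny negative values caused by rounding before taking the root.
    return std::sqrt(std::max(0.0, m2));
}

// Cross-check: closed form of the same quantity for massless particles.
double invariantMassClosedForm(double pt1, double eta1, double phi1,
                               double pt2, double eta2, double phi2) {
    const double m2 = 2.0 * pt1 * pt2 *
                      (std::cosh(eta1 - eta2) - std::cos(phi1 - phi2));
    return std::sqrt(std::max(0.0, m2));
}
```

As a sanity check, two back-to-back muons with pT = 45.6 GeV each at eta = 0 give a mass of 91.2 GeV to within rounding, close to where the zMass histogram is expected to peak for Z decays, while two exactly collinear massless muons give zero.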

We will not submit grid jobs during this exercise. So we will skip to running root.

-----------------------------Skip from here-4-----------------------------------------

The grid job can be submitted using:

$ condor_submit run-z.cmd

It can again be checked with:

$ condor_q YOUR_USER_ID -nobatch

After it runs, you will find a log file that describes the job: run-z.log, an output file: root-z.out, and the file containing the histograms: histograms-z.root in your test directory.

You again need to copy that file into your public directory, so that you can download it to your local desktop:

$ cp histograms-z.root ~/public/

Go back to the local terminal window on your local desktop, and download the root file with:

$ wget http://stash.osgconnect.net/~YOUR_USER_ID/histograms-z.root

-----------------------------------------------Skip to here-4----------------------------------------------------

Set up a symbolic link to the input data file, muons.root, and execute the script to run root:

$ ln -s /opt/data/muons.root .
$ ./run-z.sh

You can inspect the contents of histograms-z.root by running ROOT in your current directory in your local terminal window:

$ root histograms-z.root

Then enter the ROOT command: TBrowser b

With the TBrowser you can plot the variables in the root file. Double click on histograms-z.root, and then on the variables to plot them.

Step 3: Make TSelector

Now let's go back to the files created in step 1, in the local terminal window. Start root in your test directory with the following command:

$ root -b

And then execute the following commands:

TFile f("t00.root");
t0->MakeSelector("s0","=legacy");
f.Close();
.q

This will create the files s0.C and s0.h in your test directory. They contain code that matches the definition of the TTree t0 and can be used to process files containing TTrees with that structure.

Now we will add a histogram to the TSelector code. Several code lines have to be added to the TSelector code files s0.C and s0.h.

To s0.h, make the following additions. After the existing include statements, add:

#include <TH1F.h>

After the opening of the class s0 definition:

class s0 : public TSelector {
public :

add

TH1F *e;

To s0.C make the following additions:

After the opening of SlaveBegin:

void s0::SlaveBegin(TTree * /*tree*/)
{

add

e = new TH1F("e", "e", 1000, -199.0, 1200.0);

After the opening of Process:

Bool_t s0::Process(Long64_t entry)
{

add

GetEntry(entry);
e->Fill(Energy);

After the opening of Terminate:

void s0::Terminate()
{

add

TFile f("histograms.root","RECREATE");
f.WriteObject(e,"Energy");
f.Close();
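
The histogram booked in SlaveBegin has 1000 equal-width bins between -199 and 1200, so each bin is 1.399 units wide. The sketch below shows how a value is assigned to a bin, following ROOT's numbering convention in which bin 0 holds underflow, bins 1 to N cover the axis range, and bin N+1 holds overflow. SimpleHist is a hypothetical stand-in written in standard C++ for illustration only, not the TH1F class itself.

```cpp
#include <cstddef>
#include <vector>

// Fixed-width 1D histogram with ROOT-style bin numbering:
// bin 0 is underflow, bins 1..nBins cover [low, high), bin nBins+1 is overflow.
struct SimpleHist {
    int nBins;
    double low;
    double high;
    std::vector<long> counts;  // nBins + 2 slots, including under/overflow

    SimpleHist(int n, double lo, double hi)
        : nBins(n), low(lo), high(hi),
          counts(static_cast<std::size_t>(n) + 2, 0L) {}

    double binWidth() const { return (high - low) / nBins; }

    // Return the bin number that the value x falls into.
    int findBin(double x) const {
        if (x < low) return 0;               // underflow
        if (!(x < high)) return nBins + 1;   // overflow (upper edge excluded)
        return 1 + static_cast<int>(nBins * (x - low) / (high - low));
    }

    void fill(double x) { ++counts[findBin(x)]; }
};
```

With SimpleHist h(1000, -199.0, 1200.0), the mean of the simulated Energy distribution, 500, lands in bin 500, and any value below -199 is counted in the underflow bin rather than lost.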

We will not submit grid jobs during this exercise. So we will skip the submit script.

-----------------------------Skip from here-5-----------------------------------------

Now create the new script files for this step:

Create run-root-2.cmd:

universe=vanilla
executable=run-root-2.sh 
transfer_input_files = s0.C,s0.h,run-root-2.C,t00.root,t01.root 
transfer_executable=True 
when_to_transfer_output = ON_EXIT 
log=run-root-2.log 
transfer_output_files = root-2.out,histograms.root 
output=run-root-2.out.$(Cluster).$(Process) 
error=run-root-2.err.$(Cluster).$(Process) 
notification=Never 
queue 

-----------------------------------------------Skip to here-5----------------------------------------------------

Create run-root-2.sh:

#!/bin/bash 
root -b < run-root-2.C > root-2.out 

It has to be made executable, by use of the chmod Linux command:

$ chmod +x run-root-2.sh

Create run-root-2.C:

.L s0.C++ 
{ 
 //Load and run TSelector 

  s0 *s = new s0(); 

  TChain tc("t0"); 
  tc.Add("t*.root"); 
  tc.Process(s); 

} 
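
The call tc.Process(s) is what drives the code added to s0.C: the setup methods run once, Process runs once per entry in the chain, and the terminate methods run once at the end. The toy below mimics that call order in standard C++. ToySelector and processChain are hypothetical names, and the sketch leaves out the Init and Notify calls that ROOT also makes when a tree is attached.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a TSelector that records which hooks were called.
struct ToySelector {
    std::vector<std::string> calls;  // one-time hooks, in call order
    long processed = 0;              // number of entries seen by Process

    void Begin()          { calls.push_back("Begin"); }
    void SlaveBegin()     { calls.push_back("SlaveBegin"); }      // book histograms here
    bool Process(long long /*entry*/) { ++processed; return true; } // fill histograms here
    void SlaveTerminate() { calls.push_back("SlaveTerminate"); }
    void Terminate()      { calls.push_back("Terminate"); }       // write output here
};

// Toy version of the loop behind tc.Process(s) for a chain with nEntries entries.
void processChain(ToySelector& selector, long long nEntries) {
    selector.Begin();
    selector.SlaveBegin();
    for (long long entry = 0; entry < nEntries; ++entry) {
        selector.Process(entry);
    }
    selector.SlaveTerminate();
    selector.Terminate();
}
```

For the two files from Step 1, processChain(selector, 200) would call Process 200 times, which matches the 200 Energy values expected in the histogram written to histograms.root.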

We can test the root job on the local machine by executing the script to run root:

./run-root-2.sh

We will not submit grid jobs during this exercise. So we will skip running condor.

-----------------------------Skip from here-6-----------------------------------------

If this works, we can process the data files t00.root and t01.root on the Grid with our new command script run-root-2.cmd.

This can be done with the command:

$ condor_submit run-root-2.cmd

Once your job has finished, you again need to copy the output file histograms.root into your public directory, so that you can download it to your local desktop:

$ cp histograms.root ~/public/

Go back to the local terminal window on your local desktop, and download the root file with:

$ wget http://stash.osgconnect.net/~YOUR_USER_ID/histograms.root

-----------------------------------------------Skip to here-6----------------------------------------------------

You can look at the output histogram file histograms.root with TBrowser b as before, in your local terminal window.