ATLAS Analysis Example

Introduction

ROOT can be run in batch mode on the grid to analyze large data samples. This example creates simulated data in ROOT format using trees, then analyzes that data with jobs that run on the grid. It is based on a demo developed by OU programmer Chris Walker.

Prerequisite

  • Open a new Terminal on your VM.
  • Make a directory for this exercise
$ mkdir -p ~/analysis_example
$ cd ~/analysis_example

As before, the $ sign at the beginning of each command is the command prompt and should not be typed as part of the command.

Simple Analysis Example

Step 1: Create simulated data using the grid

In your test directory, create three files: run-root.cmd, run-root.sh, and run-root.C, with the contents given below. You can use an editor such as nano.

First, a simple condor submit file, run-root.cmd, is used to submit the grid job:

universe=vanilla
executable=run-root.sh
transfer_input_files = run-root.C
transfer_executable=True
when_to_transfer_output = ON_EXIT
log=run-root.log
transfer_output_files = root.out,t00.root,t01.root
output=run-root.out.$(Cluster).$(Process)
error=run-root.err.$(Cluster).$(Process)
notification=Never
queue 

The executable named in the submit file is the script run-root.sh:

#!/bin/bash

# setup
source /cvmfs/sft.cern.ch/lcg/views/setupViews.sh LCG_105a x86_64-ubuntu2204-gcc11-opt

# execute
root -b < run-root.C > root.out

This script sets up the software environment, runs ROOT in batch mode with the macro run-root.C as its input, and redirects the output to the file root.out. Make the script executable with the Linux chmod command (you can check the permissions with ls -l):

$ chmod +x run-root.sh

The macro run-root.C consists of the following code:

{ 

 // create files containing simulated data

 TRandom g; 
 char c[256]; 
 for ( int j = 0 ; j < 2 ; j++ ){ 
    snprintf(c, sizeof(c), "t%2.2d.root", j); 
    TFile f(c,"RECREATE","MyFile", 0/*no compression*/); 
    TTree *t = new TTree("t0","t0"); 
    Int_t Run; 
    TBranch * b_Run = t->Branch("Run",&Run); 
    Int_t Event; 
    TBranch * b_Event = t->Branch("Event",&Event); 
    Float_t Energy; 
    TBranch * b_Energy = t->Branch("Energy",&Energy); 
    Run = j; 

        for( Event = 0 ; Event < 100 ; Event++ ){ 
          Energy = g.Gaus(500.0 , 200.0);   
          t->Fill(); 
        }  
    f.Write(); 
    f.Close(); 
 } 
} 
.q 

The grid job can be submitted using:

$ condor_submit run-root.cmd

It can be checked with:

$ condor_q YOUR_USER_ID -nobatch

After the job finishes, your test directory will contain a log file that describes the job (run-root.log), an output file (root.out), and the files containing the simulated data (t00.root and t01.root). Download these ROOT files to your local desktop:

See the instructions for downloading mandle.gif in the Mandelbrot Session.

Now open a different terminal window on your local desktop to view the files.

If you skipped the condor submission, you can instead run the script directly to produce the files:

./run-root.sh

You can then inspect the contents of t00.root and t01.root by running root in your current directory in the local terminal window:

$ root t00.root

Then enter the ROOT command: TBrowser b

With the TBrowser you can plot the simulated data in branch Energy as well as the other branches. Double click on the name of the root files, and then on the variables you would like to plot.

Each data file contains a TTree named t0. You can plot the combined contents of the TTrees in all data files (both of them, in this example) by using a TChain.

In ROOT, execute the following commands:

TChain tc("t0");
tc.Add("t*.root");
tc.Draw("Energy");

When you are done with this, you can quit root again with the command .q <Return>.

Step 2: Analyze Real Data

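Next, we look at a real ATLAS ROOT file. Go back to the remote terminal window on osgconnect. The input file muons.root, which is listed in transfer_input_files, must be present in the directory you submit from (it can be downloaded with the wget command shown later in this step). You will need a new condor submit file called run-z.cmd: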

universe=vanilla
executable=run-z.sh
transfer_input_files = readEvents.C,muons.root
transfer_executable=True
when_to_transfer_output = ON_EXIT
log=run-z.log
transfer_output_files = root-z.out,histograms-z.root
output=run-z.out.$(Cluster).$(Process)
error=run-z.err.$(Cluster).$(Process)
notification=Never
queue 

The new executable script you need for this job is: run-z.sh which is as follows:

#!/bin/bash

# setup
source /cvmfs/sft.cern.ch/lcg/views/setupViews.sh LCG_105a x86_64-ubuntu2204-gcc11-opt

# execute
root -b -q readEvents.C+ > root-z.out

This script sets up the software environment, runs ROOT in batch mode on the macro readEvents.C (the trailing + tells ROOT to compile the macro before running it), and redirects the output to the file root-z.out. Make the script executable with the Linux chmod command (you can check the permissions with ls -l):

$ chmod +x run-z.sh

The macro readEvents.C consists of the following code:

#include "TFile.h"
#include "TTree.h"
#include "TCanvas.h"
#include "TH1F.h"
#include <iostream>
#include <cmath>   // cos, sin, sinh, sqrt, and floating-point abs
//#include "TLorentzVector.h"
using namespace std;

void readEvents(){

    // load the ROOT ntuple file
    TFile * f = new TFile("muons.root");
    TTree *tree = (TTree *) f->Get("POOLCollectionTree");
    int nEntries = tree->GetEntries();
    cout << "There are " << nEntries << " entries in your ntuple" << endl;

    // create local variables for the tree's branches
    UInt_t NLooseMuons;
    Float_t LooseMuonsEta1;
    Float_t LooseMuonsPhi1;
    Float_t LooseMuonsPt1;

    Float_t LooseMuonsEta2;
    Float_t LooseMuonsPhi2;
    Float_t LooseMuonsPt2;

    // set the tree's branches to the local variables
    tree->SetBranchAddress("NLooseMuon", &NLooseMuons);
    tree->SetBranchAddress("LooseMuonEta1", &LooseMuonsEta1);
    tree->SetBranchAddress("LooseMuonPhi1", &LooseMuonsPhi1);
    tree->SetBranchAddress("LooseMuonPt1", &LooseMuonsPt1);

    tree->SetBranchAddress("LooseMuonEta2", &LooseMuonsEta2);
    tree->SetBranchAddress("LooseMuonPhi2", &LooseMuonsPhi2);
    tree->SetBranchAddress("LooseMuonPt2", &LooseMuonsPt2);

    // declare some histograms
  TH1F *muPt1 = new TH1F("muPt1", ";p_{T} [GeV/c];Events", 50, 0, 200);
  TH1F *muPx1 = new TH1F("muPx1", ";p_{x} [GeV/c];Events", 50, 0, 200); //added px
  TH1F *muPy1 = new TH1F("muPy1", ";p_{y} [GeV/c];Events", 50, 0, 200); //added py
  TH1F *muPz1 = new TH1F("muPz1", ";p_{z} [GeV/c];Events", 50, 0, 200); //added pz
  TH1F *muEta1 = new TH1F("muEta1", ";#eta;Events", 50, -3, 3);
  TH1F *muPhi1 = new TH1F("muPhi1", ";#phi;Events", 50, -4, 4);
  TH1F *muE1 = new TH1F("muE1", ";Energy;Events", 50, 0, 200);

  TH1F *muPt2 = new TH1F("muPt2", ";p_{T} [GeV/c];Events", 50, 0, 200);
  TH1F *muPx2 = new TH1F("muPx2", ";p_{x} [GeV/c];Events", 50, 0, 200); //added px
  TH1F *muPy2 = new TH1F("muPy2", ";p_{y} [GeV/c];Events", 50, 0, 200); //added py
  TH1F *muPz2 = new TH1F("muPz2", ";p_{z} [GeV/c];Events", 50, 0, 200); //added pz
  TH1F *muEta2 = new TH1F("muEta2", ";#eta;Events", 50, -3, 3);
  TH1F *muPhi2 = new TH1F("muPhi2", ";#phi;Events", 50, -4, 4);
  TH1F *muE2 = new TH1F("muE2", ";Energy;Events", 50, 0, 200);

  TH1F *zPt = new TH1F("zPt", ";p_{T} [GeV/c];Events", 50, 0, 200);
  TH1F *zPx = new TH1F("zPx", ";p_{x} [GeV/c];Events", 50, 0, 200); //added px
  TH1F *zPy = new TH1F("zPy", ";p_{y} [GeV/c];Events", 50, 0, 200); //added py
  TH1F *zPz = new TH1F("zPz", ";p_{z} [GeV/c];Events", 50, 0, 200); //added pz
  //TH1F *zEta = new TH1F("zEta", ";#eta;Events", 50, -3, 3);
  //TH1F *zPhi = new TH1F("zPhi", ";#phi;Events", 50, -4, 4);
  TH1F *zE = new TH1F("zE", ";Energy;Events", 50, 0, 200);  
  TH1F *zMass = new TH1F("zMass", ";Mass;Events", 50, 0, 200);  


    // loop over each entry (event) in the tree
    for( int entry=0; entry < nEntries; entry++ ){
      if( entry%10000 == 0 ) cout << "Entry:" << entry << endl;

      // check that the event is read properly
      int entryCheck = tree->GetEntry( entry );
      if( entryCheck <= 0 ){  continue; }

      // only look at events containing at least 2 leptons
      if(NLooseMuons < 2) continue;

      // require the leptons to have some transverse momentum
      if(abs(LooseMuonsPt1) *0.001 < 20 || abs(LooseMuonsPt2) *0.001 < 20 ) continue;

      // make a LorentzVector from the muon
      //TLorentzVector Muons1;
     // Muons1.SetPtEtaPhiM(fabs(LooseMuonsPt1), LooseMuonsEta1, LooseMuonsPhi1, 0);

      // print out the details of the muons every so often
      if( entry%10000 == 0 ){ 
        cout << "Muons pt1: " << LooseMuonsPt1 << " eta: " << LooseMuonsEta1 << " phi " << LooseMuonsPhi1 << endl;
        cout << "Muons pt2: " << LooseMuonsPt2 << " eta: " << LooseMuonsEta2 << " phi " << LooseMuonsPhi2 << endl;
      }

      // calculation of muon energy
      Double_t muonMass = 0.0;  // assume the mass of the muon is negligible
      Double_t muonPx1 = abs(LooseMuonsPt1)*cos(LooseMuonsPhi1);
      Double_t muonPy1 = abs(LooseMuonsPt1)*sin(LooseMuonsPhi1);
      Double_t muonPz1 = abs(LooseMuonsPt1)*sinh(LooseMuonsEta1);
      Double_t muonEnergy1 = sqrt (muonPx1*muonPx1 + muonPy1*muonPy1 + muonPz1*muonPz1 + muonMass*muonMass);

      Double_t muonPx2 = abs(LooseMuonsPt2)*cos(LooseMuonsPhi2);
      Double_t muonPy2 = abs(LooseMuonsPt2)*sin(LooseMuonsPhi2);
      Double_t muonPz2 = abs(LooseMuonsPt2)*sinh(LooseMuonsEta2);
      Double_t muonEnergy2 = sqrt (muonPx2*muonPx2 + muonPy2*muonPy2 + muonPz2*muonPz2 + muonMass*muonMass);

      Double_t zCompX = muonPx1 + muonPx2;
      Double_t zCompY = muonPy1 + muonPy2;
      Double_t zLongi = muonPz1 + muonPz2;
      Double_t zPerp = sqrt (zCompX*zCompX + zCompY*zCompY);
      Double_t zEnergy = muonEnergy1 + muonEnergy2;
      Double_t zM = sqrt (zEnergy*zEnergy -zCompX*zCompX -zCompY*zCompY -zLongi*zLongi);


      // fill our histograms
      muPt1->Fill((LooseMuonsPt1)*0.001); // in GeV
      muEta1->Fill(LooseMuonsEta1);
      muPhi1->Fill(LooseMuonsPhi1);
      muPx1->Fill( muonPx1*0.001); // in GeV
      muPy1->Fill( muonPy1*0.001); // in GeV
      muPz1->Fill( muonPz1*0.001); // in GeV
      muE1->Fill(muonEnergy1*0.001); // in GeV

      muPt2->Fill((LooseMuonsPt2)*0.001); // in GeV
      muEta2->Fill(LooseMuonsEta2);
      muPhi2->Fill(LooseMuonsPhi2);
      muPx2->Fill( muonPx2*0.001); // in GeV
      muPy2->Fill( muonPy2*0.001); // in GeV
      muPz2->Fill( muonPz2*0.001); // in GeV
      muE2->Fill(muonEnergy2*0.001); // in GeV

      zPt->Fill( zPerp*0.001); // in GeV
      zPx->Fill( zCompX*0.001); // in GeV
      zPy->Fill( zCompY*0.001); // in GeV
      zPz->Fill( zLongi*0.001); // in GeV
      zE->Fill( zEnergy*0.001); // in GeV
      zMass->Fill(zM*0.001); // in GeV

    }

  // draw the Z mass distribution
  zMass->Draw();

  // make a ROOT output file to store your histograms
  TFile *outFile = new TFile("histograms-z.root", "recreate");
  muPt1->Write();
  muEta1->Write();
  muPhi1->Write();
  muE1->Write();
  muPx1->Write();
  muPy1->Write();
  muPz1->Write();

  muPt2->Write();
  muEta2->Write();
  muPhi2->Write();
  muE2->Write();
  muPx2->Write();
  muPy2->Write();
  muPz2->Write();

  zPt->Write();
  zE->Write();
  zPx->Write();
  zPy->Write();
  zPz->Write();
  zMass->Write();

  outFile->Close();
}

The grid job can be submitted using:

$ condor_submit run-z.cmd

It can again be checked with:

$ condor_q YOUR_USER_ID -nobatch

After the job finishes, your test directory will contain a log file that describes the job (run-z.log), an output file (root-z.out), and the file containing the histograms (histograms-z.root).

You again need to download those files to your local desktop:

See instructions for downloading mandle.gif in a previous session.

Go back to the local terminal window on your local desktop and locate the files you downloaded.

To run the analysis directly instead (for example, if you skipped the condor submission), use wget to download the input data file, muons.root, and then execute the script to run ROOT:

$ wget https://www.nhn.ou.edu/~hs/tmp/muons.root
$ ./run-z.sh

You can inspect the contents of histograms-z.root by running ROOT in your current directory in your local terminal window:

$ root histograms-z.root

Then enter the ROOT command: TBrowser b

With the TBrowser you can plot the variables in the root file. Double click on histograms-z.root, and then on the variables to plot them.

Step 3: Make TSelector

Now let's go back to the files created in step 1, in the local terminal window. Start root in your test directory with the following commands:

$ root -b

And then execute the following commands:

TFile f("t00.root");
t0->MakeSelector("s0","=legacy");
f.Close();
.q

This creates the files s0.C and s0.h in your test directory. They contain code generated from the definition of the TTree t0, which can be used to process files containing data in such TTrees.

Now we will add a histogram to the TSelector code. Several code lines have to be added to the TSelector code files s0.C and s0.h.

Make the following additions to s0.h. After the existing include statements, add:

#include <TH1F.h>

After class s0 definition:

class s0 : public TSelector {
public :

add

TH1F *e;

To s0.C make the following additions:

After the SlaveBegin entry:

void s0::SlaveBegin(TTree * /*tree*/)
{

add

e = new TH1F("e", "e", 1000, -199.0, 1200.0);

After Process entry:

Bool_t s0::Process(Long64_t entry)
{

add

GetEntry(entry);
e->Fill(Energy);

After terminate entry:

void s0::Terminate()
{

add

TFile f("histograms.root","RECREATE");
f.WriteObject(e,"Energy");
f.Close();

Now create the new script files for this step (Step 3).

Create run-root-2.cmd:

universe=vanilla
executable=run-root-2.sh 
transfer_input_files = s0.C,s0.h,run-root-2.C,t00.root,t01.root 
transfer_executable=True 
when_to_transfer_output = ON_EXIT 
log=run-root-2.log 
transfer_output_files = root-2.out,histograms.root 
output=run-root-2.out.$(Cluster).$(Process) 
error=run-root-2.err.$(Cluster).$(Process) 
notification=Never 
queue 

Create run-root-2.sh:

#!/bin/bash
# setup
source /cvmfs/sft.cern.ch/lcg/views/setupViews.sh LCG_105a x86_64-ubuntu2204-gcc11-opt

# execute
root -b < run-root-2.C > root-2.out 

It has to be made executable, by use of the chmod Linux command:

$ chmod +x run-root-2.sh

Create run-root-2.C:

.L s0.C++ 
{ 
 //Load and run TSelector 

  s0 *s = new s0(); 

  TChain tc("t0"); 
  tc.Add("t*.root"); 
  tc.Process(s); 

} 

We can test the root job on the local machine by executing the script to run root:

./run-root-2.sh

If this works, we can process the data files t00.root and t01.root on the Grid with our new command script run-root-2.cmd.

Use the upload feature in ospool to upload files needed to submit this to condor:

s0.C, s0.h, run-root-2.C, run-root-2.cmd, run-root-2.sh.

Make sure t00.root and t01.root are present in ospool.

The condor job can now be submitted from ospool using this command:

condor_submit run-root-2.cmd

Go back to the local terminal window on your local desktop, and download the root files:

See download instructions for mandle.gif file in a previous session.

As before, you can inspect the output histogram file histograms.root with TBrowser b in your local terminal window.