NanoScouting in Prompt #4991

Open
silviodonato opened this issue Sep 9, 2024 · 8 comments


@silviodonato

Dear Tier-0 experts,
from the Scouting group we would like to start the NanoScouting production in Prompt, possibly before the end of the pp collision data taking (16/10/2024). All the code is available in IB and will be included in 14_2_0_pre1 (#44970) and in one of the upcoming 14_0_X releases (#45950).

Describe the new feature you would like T0 to implement.

  • What are the goals of the new feature?
    • Provide the NanoScouting automatically during the data taking as soon as PromptReco is ready.
  • Which systems will it affect?
    • Tier-0 and storage of offline data

What are the changes and technical challenges?

  • The request is to run the nanoSequence defined in PhysicsTools/NanoAOD/python/custom_run3scouting_cff.py at Tier-0. A working example can be generated with cmsDriver.py [1]. Note that this sequence is tested in IB through workflow 2500.237.
    The input data format is HLTSCOUT, specifically the scouting dataset, i.e. /ScoutingPFRun3/Run2024G-v1/HLTSCOUT.
    The output dataset should be in the NANO format, i.e. something like /ScoutingPFRun3/Run2024G-v1/NANO.
    Please note that this is a production of NANO directly from HLTSCOUT, as it is not possible to run the standard offline reconstruction (RECO step) on the scouting dataset.
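The dataset naming described above (same primary dataset and processing string, with only the data tier swapped) can be sketched in Python. This is a minimal illustration, not Tier-0 code: the helper name nano_dataset_name and its output_tier parameter are assumptions introduced here.

```python
def nano_dataset_name(input_dataset: str, output_tier: str = "NANO") -> str:
    """Swap the data tier of a /Primary/Processed/TIER dataset path.

    Hypothetical helper for illustration only; it is not part of Tier-0 or CMSSW.
    """
    parts = input_dataset.split("/")
    # A well-formed path splits into ['', primary, processed, tier].
    if len(parts) != 4 or parts[0] != "" or not all(parts[1:]):
        raise ValueError(f"unexpected dataset path: {input_dataset!r}")
    primary, processed, _input_tier = parts[1:]
    return f"/{primary}/{processed}/{output_tier}"


# Example with the dataset quoted in this request.
print(nano_dataset_name("/ScoutingPFRun3/Run2024G-v1/HLTSCOUT"))
# prints /ScoutingPFRun3/Run2024G-v1/NANO
```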

Describe the implementation timeline of the new feature and relevance for the CMS data taking.

  • perform a test with 14_2_0_pre1 as soon as the release is available (#44970)
  • deploy it online as soon as a 14_0_X release containing #45950 is available (hopefully before 16/10/2024)

Additional information
This has already been discussed at the DeepDive on NanoAOD, with PPD (@malbouis @vlimant) and TSG (@missirol @mtosi). @elfontan

[1]
cmsDriver.py step2 -s NANO:@Scout --process NANO --data --eventcontent NANOAOD --datatier NANOAOD -n 10000 --era Run3_2024 --conditions auto:run3_data_prompt --no_exec

The command above generates the following configuration:

import FWCore.ParameterSet.Config as cms

from Configuration.Eras.Era_Run3_2024_cff import Run3_2024

process = cms.Process('NANO',Run3_2024)

# import of standard configurations
process.load('Configuration.StandardSequences.Services_cff')
process.load('SimGeneral.HepPDTESSource.pythiapdt_cfi')
process.load('FWCore.MessageService.MessageLogger_cfi')
process.load('Configuration.EventContent.EventContent_cff')
process.load('Configuration.StandardSequences.GeometryRecoDB_cff')
process.load('Configuration.StandardSequences.MagneticField_cff')
process.load('PhysicsTools.NanoAOD.custom_run3scouting_cff')
process.load('Configuration.StandardSequences.EndOfProcess_cff')
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')

process.maxEvents = cms.untracked.PSet(
    input = cms.untracked.int32(10000),
    output = cms.optional.untracked.allowed(cms.int32,cms.PSet)
)

# Input source
process.source = cms.Source("PoolSource",
    fileNames = cms.untracked.vstring('file:step2_PAT.root'),
    secondaryFileNames = cms.untracked.vstring()
)

process.options = cms.untracked.PSet(
    IgnoreCompletely = cms.untracked.vstring(),
    Rethrow = cms.untracked.vstring(),
    TryToContinue = cms.untracked.vstring(),
    accelerators = cms.untracked.vstring('*'),
    allowUnscheduled = cms.obsolete.untracked.bool,
    canDeleteEarly = cms.untracked.vstring(),
    deleteNonConsumedUnscheduledModules = cms.untracked.bool(True),
    dumpOptions = cms.untracked.bool(False),
    emptyRunLumiMode = cms.obsolete.untracked.string,
    eventSetup = cms.untracked.PSet(
        forceNumberOfConcurrentIOVs = cms.untracked.PSet(
            allowAnyLabel_=cms.required.untracked.uint32
        ),
        numberOfConcurrentIOVs = cms.untracked.uint32(0)
    ),
    fileMode = cms.untracked.string('FULLMERGE'),
    forceEventSetupCacheClearOnNewRun = cms.untracked.bool(False),
    holdsReferencesToDeleteEarly = cms.untracked.VPSet(),
    makeTriggerResults = cms.obsolete.untracked.bool,
    modulesToCallForTryToContinue = cms.untracked.vstring(),
    modulesToIgnoreForDeleteEarly = cms.untracked.vstring(),
    numberOfConcurrentLuminosityBlocks = cms.untracked.uint32(0),
    numberOfConcurrentRuns = cms.untracked.uint32(1),
    numberOfStreams = cms.untracked.uint32(0),
    numberOfThreads = cms.untracked.uint32(1),
    printDependencies = cms.untracked.bool(False),
    sizeOfStackForThreadsInKB = cms.optional.untracked.uint32,
    throwIfIllegalParameter = cms.untracked.bool(True),
    wantSummary = cms.untracked.bool(False)
)

# Production Info
process.configurationMetadata = cms.untracked.PSet(
    annotation = cms.untracked.string('step2 nevts:10000'),
    name = cms.untracked.string('Applications'),
    version = cms.untracked.string('$Revision: 1.19 $')
)

# Output definition

process.NANOAODoutput = cms.OutputModule("NanoAODOutputModule",
    compressionAlgorithm = cms.untracked.string('LZMA'),
    compressionLevel = cms.untracked.int32(9),
    dataset = cms.untracked.PSet(
        dataTier = cms.untracked.string('NANOAOD'),
        filterName = cms.untracked.string('')
    ),
    fileName = cms.untracked.string('step2_NANO.root'),
    outputCommands = process.NANOAODEventContent.outputCommands
)

# Additional output definition

# Other statements
from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, 'auto:run3_data_prompt', '')

# Path and EndPath definitions
process.nanoAOD_step = cms.Path(process.nanoSequence)
process.endjob_step = cms.EndPath(process.endOfProcess)
process.NANOAODoutput_step = cms.EndPath(process.NANOAODoutput)

# Schedule definition
process.schedule = cms.Schedule(process.nanoAOD_step,process.endjob_step,process.NANOAODoutput_step)
from PhysicsTools.PatAlgos.tools.helpers import associatePatAlgosToolsTask
associatePatAlgosToolsTask(process)



# Customisation from command line

# Add early deletion of temporary data products to reduce peak memory need
from Configuration.StandardSequences.earlyDeleteSettings_cff import customiseEarlyDelete
process = customiseEarlyDelete(process)
# End adding early deletion
@silviodonato
Author

silviodonato commented Sep 16, 2024

FYI: cms-sw/cmssw#45950 has just been merged, so this request can be implemented directly with the next 14_0_X release (i.e. CMSSW_14_0_15_patch2 or CMSSW_14_0_16).

@patinkaew

patinkaew commented Sep 18, 2024

Hi @silviodonato, all

CMSSW_14_0_16 was just released with cms-sw/cmssw#45950. Release note

The PR also includes the scenario hltScoutingEra_Run3_2024, which is designed for producing ScoutingNano from the HLTSCOUT data tier (ScoutingPFRun3 dataset).

Some testing with RunPromptReco.py was also performed in the PR:

python3 Configuration/DataProcessing/test/RunPromptReco.py --scenario=hltScoutingEra_Run3_2024 \
--global-tag=140X_dataRun3_Prompt_v4 --nanoaod --nanoFlavours=@Scout --lfn=fileIN.root
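
For scripted tests, the same invocation can be assembled as an argument list. The sketch below only reuses the script path and flags quoted above; the helper name prompt_reco_command is an assumption, and actually running the command requires a CMSSW environment containing the scenario.

```python
import shlex
from typing import List


def prompt_reco_command(lfn: str,
                        global_tag: str = "140X_dataRun3_Prompt_v4",
                        scenario: str = "hltScoutingEra_Run3_2024",
                        nano_flavours: str = "@Scout") -> List[str]:
    """Build the RunPromptReco.py argument list quoted in this thread.

    Hypothetical helper; the defaults are the values used in the PR test.
    """
    return [
        "python3",
        "Configuration/DataProcessing/test/RunPromptReco.py",
        f"--scenario={scenario}",
        f"--global-tag={global_tag}",
        "--nanoaod",
        f"--nanoFlavours={nano_flavours}",
        f"--lfn={lfn}",
    ]


cmd = prompt_reco_command("fileIN.root")
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
# Inside a CMSSW area one could then call subprocess.run(cmd, check=True).
```

Passing the arguments as a list avoids shell quoting issues with values such as @Scout.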

@malbouis
Contributor

Dear @patinkaew and @silviodonato, would you please take a look at this cmsTalk post from Antonio, where he tried out the NanoScouting workflow, and let him know if anything is missing from the configuration?
Thanks!

@LinaresToine
Contributor

Thank you @malbouis @patinkaew @silviodonato for your responses on the cmsTalk post. The replay was successful after disabling the AOD and MINIAOD outputs.

@malbouis
Contributor

Thanks, @LinaresToine!
I think it would be nice if the NanoScouting experts could check the files produced by the replay and give feedback on the content and on whether it matches expectations.
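
One possible way to start such a check is to group the branch names of the output Events tree by collection, relying on the usual NanoAOD naming convention Collection_variable. The sketch below is only an illustration: the function name group_branches and the example branch names are hypothetical, not the actual replay content, and reading the names from the ROOT file is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_branches(branch_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group 'Collection_variable' branch names by collection.

    Names without an underscore (e.g. 'run') go under '(ungrouped)'.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in branch_names:
        collection, sep, variable = name.partition("_")
        if sep:
            groups[collection].append(variable)
        else:
            groups["(ungrouped)"].append(name)
    return {key: sorted(values) for key, values in groups.items()}


# Hypothetical branch names, for illustration only.
example = ["ScoutingMuon_pt", "ScoutingMuon_eta", "ScoutingPFJet_pt", "run"]
for collection, variables in group_branches(example).items():
    print(collection, variables)
```

The resulting summary can then be compared by eye, or against a list of expected collections supplied by the scouting experts.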

@silviodonato
Author

Hi @malbouis,
from the scouting point of view, we confirm that the event content looks as expected, so everything looks OK to deploy it online.

@LinaresToine
Contributor

In agreement with ORM @jeyserma, we will deploy it in production together with the era change expected during MD4.

@LinaresToine
Contributor

For the record, the new scenario was deployed and the ScoutingPFRun3 PD will now produce NANOAOD in production.

https://cms-talk.web.cern.ch/t/acqusition-era-change-to-run2024i/51084
