UMICH-2015: Simulations & Data Analysis Break-Out Session 1
== What needs to be in the S&DA section of the CMB-S4 Science Book? ==

For each mission phase (design, construction, deployment, observation):
* What are the S&DA goals?
* What (human, compute) resources are required? available?

Given this:
* Where does the balance lie between veracity and tractability?
* How do we ensure that our progress through phases is evolutionary, not revolutionary?

''Please add to the italic-headed lists of resources and their points of contact below.''


'''Simulations'''

[[File:sim.jpg|400px]]

A hierarchy of methods of decreasing veracity/increasing tractability.

Simulations used for:
* Forecasting/Design
** sample mission model space for given sky model(s)
** requires explicit metric(s), e.g. Science/$ (see the Fisher sketch after this list)
* Validation & verification
** representative & self-consistent mission realization(s)
** bootstrapping V&V of both data analysis and simulation codes
* Debiasing & uncertainty quantification
** accuracy & statistics
** dominant computational cost
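
As a concrete, hedged illustration of the forecasting use case, here is a minimal single-spectrum Fisher-matrix sketch in Python. The sky fraction, noise level, beam, and the two-parameter toy model are all hypothetical placeholders, not S4 design numbers; a real forecast would take its fiducial spectra and derivatives from a Boltzmann code and span many channels and spectra.

<pre>
# Minimal Fisher forecast sketch: one TT-like spectrum, white noise,
# Gaussian beam, and a two-parameter toy model (amplitude A, tilt n).
import numpy as np

fsky = 0.4                       # hypothetical sky fraction
ells = np.arange(30, 3001)
noise_uK_arcmin = 1.0            # hypothetical map depth
beam_fwhm_arcmin = 1.4           # hypothetical beam FWHM

def cl_model(A, n):
    """Toy spectrum; real forecasts would call a Boltzmann code here."""
    return A * 1.0e3 * (ells / 1000.0) ** (n - 2.0)

A0, n0 = 1.0, 1.0
cl_fid = cl_model(A0, n0)

# Beam-deconvolved white-noise spectrum N_ell
arcmin = np.radians(1.0 / 60.0)
sigma_beam = beam_fwhm_arcmin * arcmin / np.sqrt(8.0 * np.log(2.0))
nl = (noise_uK_arcmin * arcmin) ** 2 * np.exp(ells * (ells + 1) * sigma_beam ** 2)

# Numerical derivatives of C_ell with respect to the toy parameters
eps = 1.0e-4
dcl = [(cl_model(A0 + eps, n0) - cl_model(A0 - eps, n0)) / (2 * eps),
       (cl_model(A0, n0 + eps) - cl_model(A0, n0 - eps)) / (2 * eps)]

# F_ij = sum_ell fsky (2 ell + 1)/2 * dC_i dC_j / (C_ell + N_ell)^2
inv_var = fsky * (2 * ells + 1) / 2.0 / (cl_fid + nl) ** 2
F = np.array([[np.sum(inv_var * dcl[i] * dcl[j]) for j in range(2)]
              for i in range(2)])
sigmas = np.sqrt(np.diag(np.linalg.inv(F)))
print("forecast 1-sigma errors on (A, n):", sigmas)
</pre>

A Science/$ style metric would then divide such forecast figures of merit by a cost model for each candidate mission configuration.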

For each simulation element (sky simulation, data simulation):
* What is the current status?
* What is required in each mission phase?
* What are the challenges to meeting these requirements?


''Sky Simulation Tools/POC''
* PSM (Delabrouille): IDL code generating a 10-component foreground model (see the toy scaling sketch after this list)
* PPSM working group to further develop the post-Planck sky model - port to Python?
* Hydrodynamical sims for extragalactic secondary anisotropies - lensing, tSZ, kSZ, CIB, cross-correlations, point sources, etc. - for forecasts
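
For orientation only, here is a minimal two-component frequency-scaling sketch (power-law synchrotron plus modified-blackbody dust) in Python; the amplitudes, pivots, and spectral parameters are illustrative placeholders, and the actual PSM/PPSM machinery (ten components, spatial templates, polarization) is far richer.

<pre>
# Toy foreground frequency scalings: power-law synchrotron + modified-blackbody
# dust.  All amplitudes and spectral parameters below are placeholders.
import numpy as np

H_OVER_K = 0.04799  # Planck constant over Boltzmann constant, in K per GHz

def synchrotron(nu_ghz, amp_30=30.0, beta=-3.0):
    """Power-law scaling pivoted at 30 GHz (arbitrary units)."""
    return amp_30 * (nu_ghz / 30.0) ** beta

def thermal_dust(nu_ghz, amp_353=300.0, beta=1.6, t_dust=19.6):
    """Modified-blackbody scaling pivoted at 353 GHz (arbitrary units)."""
    planck = nu_ghz ** 3 / np.expm1(H_OVER_K * nu_ghz / t_dust)
    planck_pivot = 353.0 ** 3 / np.expm1(H_OVER_K * 353.0 / t_dust)
    return amp_353 * (nu_ghz / 353.0) ** beta * planck / planck_pivot

for nu in (30.0, 95.0, 150.0, 220.0, 270.0):
    print(f"{nu:5.0f} GHz  total = {synchrotron(nu) + thermal_dust(nu):8.2f}")
</pre>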


''Data Simulation Tools/POC''
* PSM (Delabrouille): IDL code applying detector band-passes to the sky model; pyPSM python re-implementation in progress as a precursor to a web interface
* TOAST (Kisner): C++ MPI/OpenMP tools for massively parallel TOD manipulation, including on-the-fly simulation (OTFS) of colored, correlated noise timelines; pyTOAST python re-implementation in progress, including OTFS of full-beam-convolved sky timelines & a diskless interface with pre-processing (see the noise sketch after this list)
* CONVIQT/libCONVIQT (Prezeau/Keskitalo): full-beam convolution in the time domain
* FEBeCoP (Rocha): effective-beam convolution in the pixel domain
* Fisher matrix methods
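
To make the time-domain simulation step concrete, here is a minimal sketch of generating a 1/f-plus-white noise timeline by shaping Gaussian Fourier modes with a target PSD. The sample rate, knee frequency, slope, and white-noise level are placeholders, and TOAST's actual noise model, detector correlations, and parallel decomposition are considerably more involved.

<pre>
# Minimal on-the-fly noise simulation: draw Gaussian Fourier modes shaped by
# a 1/f + white one-sided PSD, then inverse FFT to a real timeline.
import numpy as np

def simulate_noise_tod(n_samples, f_sample=100.0, white_psd=1.0e-8,
                       f_knee=0.1, alpha=2.0, seed=0):
    """Timeline with S(f) = white_psd * (1 + (f_knee / f)**alpha); units arbitrary."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / f_sample)
    psd = white_psd * np.ones_like(freqs)
    psd[1:] *= 1.0 + (f_knee / freqs[1:]) ** alpha   # skip f = 0
    psd[0] = 0.0                                     # no DC power
    # Mode amplitudes chosen so the timeline variance matches sum(S(f) * df).
    scale = 0.5 * np.sqrt(psd * f_sample * n_samples)
    modes = scale * (rng.standard_normal(freqs.size)
                     + 1j * rng.standard_normal(freqs.size))
    return np.fft.irfft(modes, n=n_samples)

tod = simulate_noise_tod(2 ** 20)   # ~2.9 hours of one detector at 100 Hz
print("timeline rms:", tod.std())
</pre>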


'''Data Analysis'''

[[File:da.jpg|400px]]

How do we combine data from multiple telescopes of multiple classes at multiple sites?
* Working group needed here

What do we do about data covariance (functional forms, approximate matrices, Monte Carlos, other)?
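
One option from that list, sketched minimally below: estimate a bandpower covariance from an ensemble of Monte Carlo realizations. The random array here is just a stand-in for the bandpowers recovered from end-to-end simulations; the Hartlap factor is the standard debiasing correction when inverting a covariance estimated from a finite number of sims.

<pre>
# Monte Carlo covariance sketch: sample covariance of bandpowers across sims.
import numpy as np

n_sims, n_bands = 500, 20
rng = np.random.default_rng(1)
spectra = rng.standard_normal((n_sims, n_bands))   # stand-in for MC bandpowers

resid = spectra - spectra.mean(axis=0)
cov = resid.T @ resid / (n_sims - 1)               # unbiased sample covariance

# Hartlap correction for the inverse of a simulation-estimated covariance
hartlap = (n_sims - n_bands - 2) / (n_sims - 1)
inv_cov = hartlap * np.linalg.inv(cov)
print("condition number:", np.linalg.cond(cov))
</pre>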

For each data analysis element (mission characterization, pre-processing, map-making, component separation, power spectrum estimation, parameter estimation, ...):
* What is the current status?
* What is required in each mission phase?
* What are the challenges to meeting these requirements?


''Map Making Tools/POC''
* MADAM/TOAST (Keskitalo): Fortran MPI/OpenMP destriper with TOAST OTFS & Monte Carlo capabilities (see the binning sketch after this list).
* Springtide (Ashdown): Fortran MPI/OpenMP destriper with TOAST OTFS & Monte Carlo capabilities.
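
For reference, the simplest map-making step is noise-weighted binning of samples into pixels; a minimal sketch follows, with random placeholder pointing and data. Destripers such as MADAM and Springtide additionally solve for and subtract correlated baseline offsets before this binning.

<pre>
# Simple binned map-maker: m_p = sum_t w_t d_t / sum_t w_t over samples t
# falling in pixel p, with w_t = 1/sigma_t^2.  Pointing and data are placeholders.
import numpy as np

n_pix = 12 * 64 ** 2                              # e.g. a HEALPix nside=64 map
n_samples = 1_000_000
rng = np.random.default_rng(2)
pix = rng.integers(0, n_pix, n_samples)           # placeholder pointing
tod = rng.standard_normal(n_samples)              # placeholder timeline
weights = np.ones(n_samples)                      # per-sample 1/sigma^2

wsum = np.bincount(pix, weights=weights, minlength=n_pix)
dsum = np.bincount(pix, weights=weights * tod, minlength=n_pix)
map_est = np.where(wsum > 0, dsum / np.maximum(wsum, 1e-30), np.nan)
print("unobserved pixels:", int(np.sum(wsum == 0)))
</pre>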


''Component Separation Tools/POC''


''Power Spectrum Estimators/POC''


''Parameter Estimators/POC''



'''Data Challenge & Computational Resources'''

[[File:cmb_hpc_scaling.jpg|400px]]

Suborbital data growth closely tracks Moore's Law; satellite data grow at half that rate.

Beware of committing to compression (lossy, and choices can become outdated).
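
For orientation, a back-of-the-envelope sample count is easy to write down; every number in the sketch below is a hypothetical placeholder rather than an S4 design value.

<pre>
# Back-of-the-envelope raw data volume; all inputs are placeholders.
n_detectors = 500_000            # hypothetical detector count
f_sample_hz = 150.0              # hypothetical sampling rate
years = 5.0

seconds = years * 365.25 * 24 * 3600
n_samples = n_detectors * f_sample_hz * seconds
bytes_raw = n_samples * 4        # 4 bytes per sample, uncompressed
print(f"{n_samples:.2e} samples ~ {bytes_raw / 1e15:.0f} PB raw")
</pre>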


''Computational Resources (Including Scale & Accessibility)''
* Currently available
** NERSC: DOE general-purpose HPC center with a new top-20 system every 3 years; 1% of cycles annually for 20 years, accounts for anyone, public data repository, http://crd.lbl.gov/cmb
* Future potential
** ALCF: DOE leadership HPC center with a new top-10 system every 3 years; limited number of users & 'heroic' computations, next-generation architecture in common with NERSC
** NSC: Chinese HPC center with the world #1 system; limited number of Chinese-led users & 'heroic' computations
** SciNet: Canadian HPC center


'''Collaboration'''

* Dedicated sky modeling activity!
** Clean
** Self-consistent
** Usable
* Pipelines & interfaces
** Tightly-coupled pipelining where I/O is prohibitive (time domain) - interface in memory (see the interface sketch after this list).

[[File:tight.jpg|400px]]

** Loosely-coupled pipelining where I/O is reasonable (pixel, multipole, and parameter domains) - interface on disk.

[[File:loose.jpg|400px]]

** Support for both rapid prototyping and efficient production
* Standard data objects & formats - combine with the data working group
** Define file and memory formats for interfacing, informed by computational efficiency.
** Generalized mission model
* Data distribution
** Take-out vs Eat-in
* Synergies with S3, LiteBIRD, COrE+, etc.
** Common sky models
** Parallel pipelines (V&V, prototyping, general vs specific)
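
As a minimal illustration of the two interfacing styles in the pipelines item above, the sketch below passes a map between stages either directly in memory or via an agreed on-disk format; the stage functions and the .npy format are arbitrary choices for the example.

<pre>
# Two ways to hand a map from one pipeline stage to the next.
import numpy as np

def make_map():
    """Placeholder producer stage (e.g. a map-maker)."""
    return np.zeros(12 * 512 ** 2)

def estimate_spectrum(sky_map):
    """Placeholder consumer stage (e.g. a power spectrum estimator)."""
    return np.array([sky_map.var()])

# Tightly coupled: both stages run in the same process (or MPI job) and
# exchange the array directly in memory -- no I/O at all.
cl_tight = estimate_spectrum(make_map())

# Loosely coupled: the producer writes an agreed on-disk format and the
# consumer, possibly a different code run later, reads it back.
np.save("map.npy", make_map())
cl_loose = estimate_spectrum(np.load("map.npy"))
</pre>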

----

Fast mocks
 
==Wiki navigation==

[[Cosmology with CMB-S4|Return to main workshop page]]

[[UMICH-2015: Simulations & Data Analysis|Return to Simulations & Data Analysis page]]