UMICH-2015: Simulations & Data Analysis Break-Out Session 1
What needs to be in the S&DA section of the CMB-S4 Science Book?
For each mission phase (design, construction, deployment, observation):
- What are the S&DA goals?
- What (human, compute) resources are required? available?
Given this:
- Where does the balance lie between veracity and tractability?
- How do we ensure that our progress through phases is evolutionary not revolutionary?
Please add to the italic-headed lists of resources and their points of contact below
Simulations
A hierarchy of methods of decreasing veracity/increasing tractability.
Simulations used for:
- Forecasting/Design
- sample mission model space for given sky model(s)
- requires explicit metric(s), e.g. Science/$ (see the sketch after this list)
- Validation & verification
- representative & self-consistent mission realization(s)
- bootstrapping V&V of both data analysis and simulation codes
- Debiasing & uncertainty quantification
- accuracy & statistics
- dominant computational cost
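The Science/$ metric above needs to be made concrete; as a minimal sketch, one possible choice is a Fisher-matrix figure of merit (inverse volume of the parameter error ellipsoid) divided by mission cost. All names and numbers below are illustrative placeholders, not CMB-S4 values:

import numpy as np

def figure_of_merit(fisher):
    # Inverse volume of the parameter error ellipsoid: sqrt(det F)
    return np.sqrt(np.linalg.det(fisher))

def science_per_dollar(fisher, cost_dollars):
    # One candidate "Science/$" metric for ranking mission configurations
    return figure_of_merit(fisher) / cost_dollars

# Toy 2x2 Fisher matrices for two hypothetical configurations
fisher_a = np.array([[4.0e6, 1.0e3],
                     [1.0e3, 2.5e5]])
fisher_b = np.array([[9.0e6, 2.0e3],
                     [2.0e3, 3.0e5]])

print(science_per_dollar(fisher_a, cost_dollars=3.0e8))
print(science_per_dollar(fisher_b, cost_dollars=5.0e8))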
For each simulation element (sky simulation, data simulation):
- What is the current status?
- What is required in each mission phase?
- What are the challenges to meeting these requirements?
Sky Simulation Tools/POC
- PSM (Delabrouille): IDL code generating a 10-component foreground model
- PPSM working group to further develop the post-Planck sky model - port into Python?
- Hydrodynamical simulations of extragalactic secondary anisotropies (lensing, tSZ, kSZ, CIB, cross-correlations, point sources, etc.) for forecasts
Data Simulation Tools/POC
- PSM (Delabrouille): IDL code applying detector band-passes to the sky model; pyPSM Python re-implementation in progress as a precursor to a web interface
- TOAST (Kisner): C++ MPI/OpenMP tools for massively parallel TOD manipulation, including on-the-fly simulation (OTFS) of colored, correlated noise timelines; pyTOAST Python re-implementation in progress, including OTFS of full-beam-convolved sky timelines and a diskless interface with pre-processing (see the noise-timeline sketch after this list)
- CONVIQT/libCONVIQT (Prezeau/Keskitalo): full-beam convolution in the time domain
- FEBeCoP (Rocha): effective-beam convolution in the pixel domain
- Fisher matrix methods
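As a concrete illustration of on-the-fly noise simulation, a minimal sketch of drawing a colored (1/f + white) noise timeline by shaping Gaussian white noise with a model PSD in Fourier space; this shows the technique only, not the TOAST/pyTOAST interface, and all parameter values are placeholders:

import numpy as np

def simulate_noise_tod(n_samples, f_sample, net, f_knee, alpha, seed=0):
    # One-sided model PSD: P(f) = NET^2 * (1 + (f_knee/f)^alpha)
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / f_sample)
    psd = net**2 * (1.0 + (f_knee / np.maximum(freqs, freqs[1]))**alpha)
    psd[0] = 0.0  # no power in the DC mode
    # Scale Gaussian Fourier modes so the periodogram of the output
    # matches the model PSD in expectation
    sigma = np.sqrt(psd * f_sample * n_samples / 4.0)
    modes = sigma * (rng.standard_normal(len(freqs)) +
                     1j * rng.standard_normal(len(freqs)))
    return np.fft.irfft(modes, n=n_samples)

# e.g. ~3 hours at 100 Hz for a 350 uK sqrt(s) detector with a 100 mHz knee
tod = simulate_noise_tod(2**20, f_sample=100.0, net=350e-6, f_knee=0.1, alpha=1.0)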
Data Analysis
How do we combine data from multiple telescopes of multiple classes at multiple sites?
- Working group needed here
What do we do about data covariance (functional forms, approximate matrices, Monte Carlos, other)?
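For the Monte Carlo option, a minimal sketch of estimating a bandpower covariance (and correlation) matrix from an ensemble of end-to-end simulation realizations; array shapes and names are illustrative:

import numpy as np

def mc_covariance(bandpowers):
    # bandpowers: (n_realizations, n_bins) band powers from the full pipeline
    resid = bandpowers - bandpowers.mean(axis=0)
    return resid.T @ resid / (bandpowers.shape[0] - 1)

# placeholder ensemble: 1000 realizations of 30 band powers
sims = np.random.default_rng(1).normal(size=(1000, 30))
cov = mc_covariance(sims)
corr = cov / np.sqrt(np.outer(np.diag(cov), np.diag(cov)))
# NB: inverting a covariance estimated from a finite ensemble is biased,
# so the required number of realizations grows with the number of bins.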
For each data analysis element (mission characterization, pre-processing, map-making, component separation, power spectrum estimation, parameter estimation, ... ):
- What is the current status?
- What is required in each mission phase?
- What are the challenges to meeting these requirements?
Map Making Tools/POC
- MADAM/TOAST (Keskitalo): Fortran MPI/OpenMP destriper with TOAST OTFS & Monte Carlo capabilities (see the destriping sketch after this list).
- Springtide (Ashdown): Fortran MPI/OpenMP destriper with TOAST OTFS & Monte Carlo capabilities.
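For orientation, a minimal destriping sketch, assuming the standard model d = P m + F a + white noise, where P is the pointing matrix and F assigns one constant baseline offset per chunk of samples; the map and baselines are solved for by simple alternation. This illustrates the idea only, not the MADAM or Springtide implementations:

import numpy as np

def bin_map(tod, pixels, n_pix):
    # Simple binned (noise-weighted with uniform weights) map
    hits = np.bincount(pixels, minlength=n_pix)
    sig = np.bincount(pixels, weights=tod, minlength=n_pix)
    return np.where(hits > 0, sig / np.maximum(hits, 1), 0.0)

def destripe(tod, pixels, n_pix, baseline_len, n_iter=20):
    baselines = np.zeros(int(np.ceil(len(tod) / baseline_len)))
    idx = np.arange(len(tod)) // baseline_len   # baseline index per sample
    for _ in range(n_iter):
        # subtract current baselines and bin a map
        m = bin_map(tod - baselines[idx], pixels, n_pix)
        # refit one offset per chunk to the map-subtracted residual
        resid = tod - m[pixels]
        sums = np.bincount(idx, weights=resid, minlength=len(baselines))
        counts = np.bincount(idx, minlength=len(baselines))
        baselines = sums / np.maximum(counts, 1)
    return bin_map(tod - baselines[idx], pixels, n_pix), baselines

# toy usage (placeholder pointing): 1-minute baselines at 100 Hz sampling
# pix = np.random.default_rng(2).integers(0, 10000, size=len(tod))
# destriped_map, offsets = destripe(tod, pix, n_pix=10000, baseline_len=6000)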
Component Separation Tools/POC
Power Spectrum Estimators/POC
Parameter Estimators/POC
Data Challenge & Computational Resources
Suborbital data volumes grow at exactly the Moore's Law rate; satellite data volumes grow at half that rate.
Beware of committing to compression schemes that are lossy or may become outdated.
Computational Resources/POC
- Currently available
- NERSC (Borrill): DOE general-purpose HPC center with a new top-10 system every 3 years; 1% of cycles annually for 20 years, accounts available to anyone, public data repository, http://crd.lbl.gov/cmb
- Future potential
- ALCF: DOE leadership HPC center with a new top-5 system every 3 years; limited number of users & 'heroic' computations, next-generation architecture in common with NERSC
- NSC: Chinese HPC center with the world's #1 system; limited number of Chinese-led users & 'heroic' computations
- SciNet: Canadian HPC center
Collaboration
- Dedicated sky modeling activity!
- Clean
- Self-consistent
- Usable
- Pipelines & interfaces
- Tightly-coupled pipelining where I/O is prohibitive (time-domain) - interface in memory.
- Loosely-coupled pipelining where I/O is reasonable (pixel-, multipole-, parameter-domains) - interface on disk.
- Support for both rapid prototyping and efficient production
- Standard data objects & formats - combine with data working group (see the data-object sketch at the end of this section)
- Define file and memory formats for interfacing, informed by computational efficiency.
- Generalized mission model
- Data distribution
- Take-out vs eat-in (ship data products out to users vs bring users/analyses to the data)
- Synergies with S3, LiteBIRD, COrE+, etc
- Common sky models
- Parallel pipelines (V&V, prototyping, general vs specific)
- Fast mocks
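Related to the standard data objects & formats item above, a sketch of what a common in-memory time-domain data object for tightly-coupled pipeline interfaces might look like; the field names and units are illustrative, not an agreed standard:

from dataclasses import dataclass
import numpy as np

@dataclass
class TODChunk:
    telescope: str         # e.g. "LAT-1" (placeholder identifier)
    detector: str          # detector identifier
    start_time: float      # UTC seconds at the first sample
    f_sample: float        # sampling rate [Hz]
    signal: np.ndarray     # (n_samples,) calibrated signal [K_CMB]
    flags: np.ndarray      # (n_samples,) per-sample quality flags
    boresight: np.ndarray  # (n_samples, 4) boresight pointing quaternions

    def good(self) -> np.ndarray:
        # Return only the unflagged samples
        return self.signal[self.flags == 0]

The same object would also need an agreed on-disk serialization (e.g. HDF5) for the loosely-coupled, pixel-/multipole-/parameter-domain interfaces.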