Design
Goals
Likelihood maximization frameworks tend to have lots of moving parts, and there are many ways to peel this potato [1]. Here are some of the goals that drive the design of csky:
Modularity: csky should be extensible, both internally by developers and externally by the user. Separation of concerns should be taken seriously.
Performance: csky should be fast. Profilers can guide our eyes towards computational “hot spots”; we should use that information to optimize for speed. It’s ok to introduce tighter coupling than desired between the various parts, for the sake of performance, as long as the ugliness can still be limited in scope.
Brevity: csky usage should be concise. If you are interested in quickly reproducing a result you saw in some slides, it should be as straightforward as possible to do so; ideally you could write it out without referring to an example script.
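As an illustration of the profiler-guided approach mentioned under Performance, here is a minimal sketch using Python's standard cProfile and pstats modules. The functions log_likelihood_ratio and scan are hypothetical stand-ins for a hot spot, not part of csky; the point is only to show how a profile report can be collected and sorted by cumulative time.

```python
import cProfile
import io
import math
import pstats


def log_likelihood_ratio(ratios, ns):
    """Toy hot spot (hypothetical): sum of per-event log terms for a signal count ns."""
    n = len(ratios)
    return 2.0 * sum(math.log1p(ns / n * (w - 1.0)) for w in ratios)


def scan(ratios, steps=200):
    """Evaluate the toy test statistic on a grid of ns values."""
    return [log_likelihood_ratio(ratios, float(k)) for k in range(steps)]


def profile_scan(ratios):
    """Profile scan() and return the report text, sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    scan(ratios)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)  # keep the five costliest entries
    return buffer.getvalue()


# Made-up signal/background ratios, just to give the profiler some work.
ratios = [0.5 + (i % 7) * 0.3 for i in range(2000)]
report = profile_scan(ratios)
print(report)
```

The report lists where the time is spent, which is the information one would use to decide which piece deserves a tighter, faster implementation.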
Lightning Tour
csky is organized into several modules with simple names. They are listed below, roughly in order of subjective importance (which is well-correlated with the extent to which they are deeply thought-out and thus likely to be relatively stable):
Analysis configuration is found in csky.conf, with several items imported directly under the top level of csky.
The signal vs. background discrimination PDFs are found in csky.pdf.
The likelihood and its log likelihood optimization routines are found in csky.llh.
Real data, scrambled data, and simulated signal injection are found in csky.inj.
Trial operations are implemented in csky.trial.
Various event selections are described in, and can be loaded using, csky.selections.
The basic building blocks common to many analyses – background space PDFs, signal acceptance parameterization, and energy PDF ratios – are organized using csky.analysis.
Spectral hypotheses are characterized using csky.hyp. (TODO: it turned out that other features of signal hypotheses never fit well in this module; it should probably be renamed to e.g. csky.fluxes or csky.spectra.)
Test statistic distribution fitting is handled by csky.dists.
Data manipulation and random state management are implemented in csky.utils.
Inspection of implementation details is made a little easier by csky.inspect.
A few plotting helpers are implemented in csky.plotting.
A simple timing tool is given in csky.timing.
The following additional modules are highly stable, but don’t seem to belong near the top of the “lightning tour”:
Coordinate transformations are handled in csky.coord.
Some relatively generic bookkeeping assistance (useful for getting information out of cluster job outputs) is found in csky.bk.
Noisy healpy outputs are silenced using csky.quiet_healpy.
Likelihood Implementation
The core task of this library is to define and evaluate the source search likelihood; pretty much everything else serves only to get data into and out of that machinery. The likelihood implementation in csky is structured as follows:
csky.pdf defines PDF ratio models (the specifications) and evaluators (plug-and-chug calculators).
csky.llh defines log likelihood ratio (LLH) models and evaluators, largely in terms of the PDF ratio models and evaluators. This module also provides a framework for parameter fitting via likelihood maximization.
csky.inj provides an interface for generating dataset realizations using actual or randomized (thus background-like) data and/or simulated signals.
csky.trial gives the user-level interface for generating actual or randomized datasets and evaluating and/or maximizing likelihoods. This module also includes tools for performing batches of trials; estimating threshold quantities such as sensitivities, discovery potentials, and upper limits; and distributing trials over multiple local cores.
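To make the model/evaluator split concrete, here is a toy, pure-Python sketch. It is not csky's code: every class and method name below is hypothetical, the "sky" is a one-dimensional interval, and the test statistic is assumed to take the form commonly used in point-source searches, TS(ns) = 2 Σ log(1 + (ns/N)(W_i − 1)) with W_i = S_i/B_i. The model only stores a specification; the evaluator binds it to a dataset and does the arithmetic; the LLH evaluator is built on top of the ratio evaluator and fits ns by likelihood maximization.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianSpaceRatioModel:
    """Specification only (hypothetical): Gaussian signal around `source`,
    uniform background over [lo, hi]."""
    source: float
    sigma: float
    lo: float
    hi: float

    def evaluator(self, events):
        """Bind the specification to a concrete dataset."""
        return SpaceRatioEvaluator(self, events)


class SpaceRatioEvaluator:
    """Plug-and-chug calculator: precomputes per-event ratios W_i = S_i / B_i."""

    def __init__(self, model, events):
        background = 1.0 / (model.hi - model.lo)
        norm = 1.0 / (model.sigma * math.sqrt(2.0 * math.pi))
        self.ratios = [
            norm * math.exp(-0.5 * ((x - model.source) / model.sigma) ** 2) / background
            for x in events
        ]


class LLHEvaluator:
    """Toy log likelihood ratio, defined in terms of a PDF ratio evaluator."""

    def __init__(self, ratio_evaluator):
        self.ratios = ratio_evaluator.ratios
        self.n = len(self.ratios)

    def ts(self, ns):
        """TS(ns) = 2 * sum_i log(1 + ns/N * (W_i - 1))."""
        return 2.0 * sum(math.log1p(ns / self.n * (w - 1.0)) for w in self.ratios)

    def fit(self, tol=1e-6):
        """Maximize TS over ns with a golden-section search.

        TS is concave in ns, so a bracketing search is sufficient here.
        The upper bound stays just inside ns = N, where the log argument
        can reach zero for events far from the source.
        """
        inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
        a, b = 0.0, 0.999 * self.n
        while b - a > tol:
            c = b - inv_phi * (b - a)
            d = a + inv_phi * (b - a)
            if self.ts(c) > self.ts(d):
                b = d  # maximum lies in [a, d]
            else:
                a = c  # maximum lies in [c, b]
        ns = 0.5 * (a + b)
        return ns, self.ts(ns)


# Made-up dataset: 100 evenly spread "background" events plus 10 events
# clustered around the hypothetical source position x = 5.
background = [0.05 + 0.1 * i for i in range(100)]
signal = [5.0 + dx for dx in (-0.3, -0.2, -0.1, -0.05, 0.0, 0.05, 0.1, 0.15, 0.2, 0.3)]

model = GaussianSpaceRatioModel(source=5.0, sigma=0.2, lo=0.0, hi=10.0)
llh = LLHEvaluator(model.evaluator(background + signal))
ns_hat, ts_hat = llh.fit()
print(f"best-fit ns = {ns_hat:.2f}, TS = {ts_hat:.2f}")
```

Keeping the specification separate from the evaluator means the same model can be bound to many dataset realizations (for example, one per trial) without being rebuilt, which is the kind of reuse the structure above is meant to allow.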