
Configuration files

Gary Bernstein edited this page Jul 23, 2020 · 18 revisions

The first step in using the gbdes codes is to create configuration tables that form a small relational database describing all of the extensions, exposures, instruments, devices, and fields that you are matching together. The program configure.py is run first to create this database as a single output file, which we usually name <basename>.db.

You will need to tell configure.py where all of your catalogs are, what fields you have defined, and where it should look for all the information that will be needed by the code later. You do this by giving configure.py one or more YAML-format configuration files of your own. The command line for this step is

% configure.py input1.config [input2.config ...] <basename>.db

Each argument except the last is the name of an input file in YAML format. The last argument is the output FITS filename. The formats of these input files are described below. One important thing to remember is that the information in these files is given increasing priority as we go through the list - anything defined in input2.config will override a definition in input1.config, etc., and the same thing holds for order within a file.
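As a rough illustration of the ordering rule (this is not the actual configure.py code), the sketch below splits a command line into input files and the output name, then concatenates already-parsed root-level lists in command-line order, so that entries from later files end up later in the list. The function names split_args and concatenate, and the output name survey.db, are hypothetical.

```python
def split_args(args):
    """Split command-line arguments into (input YAML files, output filename)."""
    if len(args) < 2:
        raise ValueError("need at least one input file and an output filename")
    return list(args[:-1]), args[-1]


def concatenate(configs, key):
    """Join the lists stored under `key` in each parsed config, in order."""
    merged = []
    for config in configs:
        # A missing or empty key contributes nothing.
        merged.extend(config.get(key) or [])
    return merged


# "survey.db" is a made-up output name used only for this example.
inputs, output = split_args(["input1.config", "input2.config", "survey.db"])
print(inputs, output)

# Plain dicts standing in for two parsed YAML files.
first = {"Attributes": [{"key": "BAND", "value": "< BAND"}]}
second = {"Attributes": [{"key": "BAND", "value": "r"}]}
print(concatenate([first, second], "Attributes"))
```

Because the entry from the second file lands at the end of the combined list, any rule that gives later entries priority will favor it.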

This program will run straightforwardly but can take a long time because it needs to access the headers of all the extensions of all the FITS files that you are using as input.

YAML format is very simple to read and write by hand; here is a quick introduction. gbdes does not use any fancy features like cross-references.

Each YAML input file is a map at its root level. You must supply three kinds of information, corresponding to three keys in the root-level YAML map:

  • Fields: Each entry describes one field for object matching
  • Files: Each entry describes where to find one (or more) input catalogs
  • Attributes: Each entry describes an input characteristic of one or more of the input catalogs, either giving the information outright or saying where to find it in a catalog header or column.

Field specifications

Each exposure must be assigned to a field. Here is where you say what the fields are. Here's an example:

Fields:
- name:  0640-34
  coords:    06:40:00  -34:00:00
  radius: 2.
  pmepoch: 2016.8
- name:  0730-50
  coords:    07:30:00 -50:00:00
  radius: 2.
- name:  1327-48
  # Outskirts of Omega Cen:
  coords:    13:27:00 -48:45:00
  radius: 2.

As you can see, Fields is a key in the root-level YAML map node. The value associated with this key is a list. Each entry of the list describes one field. The field description is a map with these keys:

  • name: A string name for this field.
  • coords: ICRS coordinates of the center of the field, with RA in hours and Dec in degrees. Any format that astropy.coordinates.SkyCoord can read is good here.
  • radius: The half-width of a box that contains all the objects you want to match in this field, in degrees. The matching code in WCSFoF needs to know this and will fail if anything falls outside the box, so be generous.
  • pmepoch: (optional) Used only in WCSFit when freePM=false (not fitting proper motion and parallax for each star) and you have a reference catalog (Gaia DR2) that gives proper-motion data. In this case pmepoch is the epoch at which 2d positions are derived for the reference stars in this field, with proper-motion corrections applied to RA, Dec. Currently no parallax shift is applied.

You can have as many fields as you want in the list, and each input YAML file can have its own distinct Fields key - they'll be merged. [Note the use of # to denote a comment line in YAML.]
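To make the units of coords explicit, here is a minimal pure-Python sketch that converts the sexagesimal strings used in the example (RA in hours, Dec in degrees) into decimal degrees. The text above indicates that astropy does the real parsing and accepts many more formats; the helper names below are hypothetical.

```python
def sexagesimal_to_float(text):
    """Convert 'dd:mm:ss' (optionally signed) to a float in the same unit."""
    text = text.strip()
    # Take the sign from the string itself so that '-00:30:00' stays negative.
    sign = -1.0 if text.startswith("-") else 1.0
    parts = [float(p) for p in text.lstrip("+-").split(":")]
    # Pad so that 'dd' and 'dd:mm' are also accepted.
    parts += [0.0] * (3 - len(parts))
    return sign * (parts[0] + parts[1] / 60.0 + parts[2] / 3600.0)


def field_center_degrees(coords):
    """Return (RA, Dec) in degrees from an 'hh:mm:ss dd:mm:ss' string."""
    ra_text, dec_text = coords.split()
    # One hour of right ascension is 15 degrees.
    return 15.0 * sexagesimal_to_float(ra_text), sexagesimal_to_float(dec_text)


print(field_center_degrees("06:40:00  -34:00:00"))
print(field_center_degrees("13:27:00 -48:45:00"))
```

The first field in the example is therefore centered near RA = 100 degrees, Dec = -34 degrees.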

File specifications

Next you need to say where all the input FITS catalogs can be found. Here is an example:

Files:
- glob: CATS/D00*.cat
  expkey: < EXPNUM
  translation: '(.*)=D\1'
# The Gaia catalog extracted for this zone
- glob:  CATS/*.gaia.cat
  expkey: gaia

Under the Files key of the root node is a list of entries. Each is a YAML map node, with these keys:

  • glob: A UNIX-style filename glob (or a single filename). Everything that matches it will be assumed to be a FITS file with valid object catalogs as described in Preparing catalogs. As a special bonus, you can specify numerical ranges in parentheses, such as
- glob: D00453(255-272).*.cat

which will be expanded by matching all globs D00453255.*.cat, D00453256.*.cat, ... D00453272.*.cat. It's ok to have zero matches for a glob.

  • expkey: Tells how to assign the attribute EXPOSURE to the catalogs found in the matching files. There are multiple ways to specify this:
    • Just give a string-valued name, such as gaia, which is assigned to the second glob above.
    • If the string begins with <, as in the first example, the value will be read from the specified FITS header keyword. Recall that the catalog's extension header is searched first, and then the primary header. In the example above, the value under header keyword EXPNUM is taken as the value of the EXPOSURE attribute.
    • The special value _FILENAME means that the name of the FITS file will be taken as the value of EXPOSURE.
    • Enter nothing: the default is expkey: < EXPNUM, since DECam exposures all record a unique serial exposure number in the header (which is treated as a string here).
  • translation: (optional) specifies a regular expression operation that can be performed on the value retrieved from the specified expkey. The format is :regex:=:replace:, where :regex: and :replace: follow the syntax for Python's regular expression module. In the example above, the EXPNUM string will be prefixed by D so that we end up with something like EXPOSURE="D453255". [Obviously your regular expression cannot include the = character.]
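The translation format can be sketched in a few lines of Python. This is an illustration of the regex=replace convention described above, not the code in configure.py; the function name translate is hypothetical, and the use of count=1 is an assumption made for this sketch.

```python
import re


def translate(value, translation):
    """Apply a 'regex=replace' translation string to a header value."""
    # The regex may not contain '=', so split on the first one.
    pattern, replacement = translation.split("=", 1)
    # count=1 is a choice made for this sketch: it stops '(.*)' from matching
    # a second time on the empty string at the end of the value.
    return re.sub(pattern, replacement, value, count=1)


# The translation from the first Files entry, applied to an EXPNUM string.
print(translate("453255", r"(.*)=D\1"))
```

With the EXPNUM string 453255 this yields D453255, matching the EXPOSURE value quoted above.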

The Files lists from your input YAML files will be concatenated. If an input file matches more than one glob, it will be used multiple times in the subsequent fitting processes, which is probably not what you want.
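The numerical-range shorthand accepted by the glob key can be illustrated as follows. This is a sketch of the expansion described above, not the configure.py implementation; the function name expand_ranges is hypothetical, and preserving the zero-padding width of the first number is an assumption.

```python
import re

RANGE = re.compile(r"\((\d+)-(\d+)\)")


def expand_ranges(pattern):
    """Expand every '(first-last)' range in a glob into one glob per number."""
    match = RANGE.search(pattern)
    if match is None:
        return [pattern]
    first, last = match.group(1), match.group(2)
    width = len(first)  # assumption: keep the zero-padding of the first number
    expanded = []
    for number in range(int(first), int(last) + 1):
        candidate = (pattern[:match.start()]
                     + str(number).zfill(width)
                     + pattern[match.end():])
        # Recurse in case the pattern holds more than one range.
        expanded.extend(expand_ranges(candidate))
    return expanded


globs = expand_ranges("D00453(255-272).*.cat")
print(len(globs), globs[0], globs[-1])
```

Each resulting string is an ordinary glob that could then be handed to Python's glob.glob; collecting the matches in a set would be one way to notice a file that matches more than one glob.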

Attribute specifications

The bulk of your configuration inputs will specify attributes that are to be associated with each input extension. Each attribute will end up being a column in the Extensions table of the output config file. Here's a part of our standard list of attributes, which we'll use to illustrate the ways you can specify attributes.

Attributes:
- key:    INSTRUMENT
  value:  < BAND
  translation: '^\s*(.).*$=\g<1>${EPOCH}'
- key:    BAND
  value:  < BAND
  translation: '^\s*(.).*$=\1'
- key:    FIELD
  value:  _NEAREST
- key:    DEVICE
  value:  < DETPOS
- key:    RA
  value:  < RA
- key:    DEC
  value:  < DEC
- key:    EPOCH
  value:  < CALEPOCH
  default: '00000000'
- key:    AFFINITY
  value:  < BAND
  default: STELLAR
- key:    MJD
  vtype:  float
  value:  < MJD-OBS
- key:    MAGKEY
  value:  MAG_APER[7]
# Now overrides that apply to the Gaia reference fields
- key:  INSTRUMENT
  select: gaia.*
  value: REFERENCE
- key:  BAND
  select: gaia.*
  value: REFERENCE
- key:  WCSIN
  select: gaia.*
  value: _ICRS
- key:  XKEY
  select: gaia.*
  value: RA_ICRS
- key:  YKEY
  select: gaia.*
  value: DE_ICRS

As with Fields and Files, the Attributes key at YAML root level has a value that is a list of attributes, and the lists from your multiple input files are concatenated. Each attribute on the list has the following keys:

  • key: The name of the attribute.
  • value: The value to be assigned to the attribute. There are three ways to give it:
    • Specify a constant value directly. Note that for attributes like MAGKEY, whose values are names of columns in the FITS table, the format MAG_APER[7] indicates that the column is array-valued and that element 7 of the array (zero-indexed) will be extracted.
    • Give < followed by the name of a header keyword, whose value will be given to the attribute. Recall that the extension's header is searched first, then the primary header.
    • Use one of the special values beginning with _ that certain attribute keys accept. A table of these is below.
  • default: (optional) When the value for an attribute refers to a header keyword, and the keyword is not found in either the extension header or the primary header, then the value specified as default is used. If no default is given, an error results.
  • translation: (optional) The value for the attribute that was given, retrieved from a header, or defaulted can be optionally translated by a regular expression match and replacement. The format is :regex:=:replace:, as above. Note that all attribute values are treated as strings at this point.
  • vtype: (optional) The attribute value is assigned the type given here. It should be a Python built-in type, such as float or int. The default is str.
  • select: (optional) A regular expression which is applied to the EXPOSURE value for each extension. The value specified here is assigned to the key only if the expression matches. This allows you to specify values that will be used only for particular subsets of the input files. In the example above, the select: gaia.* entries for the XKEY and YKEY values mean that for the Gaia catalogs, we'll be searching for coordinates under the columns RA_ICRS and DE_ICRS instead of the values used for the other fields.
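The array-column notation mentioned under value, such as MAG_APER[7], can be unpacked as in the sketch below. This only illustrates the notation; the function name parse_column is hypothetical and the downstream codes may parse these strings differently.

```python
import re

# A column name, optionally followed by a zero-based index in square brackets.
COLUMN = re.compile(r"^(\w+)(?:\[(\d+)\])?$")


def parse_column(spec):
    """Split 'NAME' or 'NAME[i]' into (column name, zero-based index or None)."""
    match = COLUMN.match(spec.strip())
    if match is None:
        raise ValueError(f"cannot parse column specification {spec!r}")
    name, index = match.groups()
    return name, (int(index) if index is not None else None)


print(parse_column("MAG_APER[7]"))
print(parse_column("RA_ICRS"))
```

For a table row held as a mapping, the first result would correspond to reading row["MAG_APER"][7], that is, the eighth element of the array.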

The select property for an attribute allows you to have multiple attribute entries for a given key and get a value that depends on the EXPOSURE value. configure.py expects that you have entered the possible values in increasing order of priority: the list of attributes that you give is searched from back to front, and the first entry whose select value matches the EXPOSURE value is the one assigned to that extension. So you should put your more specialized attribute entries at the back of your list, as the example above has done for the Gaia catalogs.
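The back-to-front search can be sketched as follows. This is a simplified illustration of the priority rule, not the configure.py code: the function name resolve is hypothetical, and both the use of re.match and the treatment of entries without select as matching every exposure are assumptions.

```python
import re


def resolve(attributes, key, exposure):
    """Return the entry for `key` that applies to `exposure`, or None."""
    # Search from back to front so that later entries take priority.
    for entry in reversed(attributes):
        if entry["key"] != key:
            continue
        select = entry.get("select")
        # Assumption: an entry without `select` applies to every exposure,
        # and `select` is tested with re.match (anchored at the start only).
        if select is None or re.match(select, exposure):
            return entry
    return None


attributes = [
    {"key": "INSTRUMENT", "value": "< BAND"},
    {"key": "INSTRUMENT", "select": "gaia.*", "value": "REFERENCE"},
]
print(resolve(attributes, "INSTRUMENT", "gaia")["value"])
print(resolve(attributes, "INSTRUMENT", "D453255")["value"])
```

The Gaia exposure picks up the specialized entry at the back of the list, while a DECam exposure falls through to the general entry.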

Variable substitution

The value strings for attributes can contain variables with the format ${<varname>}. In the example above, look at the INSTRUMENT attribute. Its value is < BAND, meaning that the value of the header keyword BAND is read as the starting value for INSTRUMENT. Let's say this is the string g. The translation regular expression then converts this to g${EPOCH}. The variable substitution mechanism is called after all other attributes, including EPOCH, have been assigned values, and the value of attribute EPOCH is substituted into the expression. The default value for EPOCH is the string 00000000, so we'll end up with INSTRUMENT=g00000000. If we had a CALEPOCH key somewhere in the header with value 20140809, we would end up with INSTRUMENT=g20140809.

The name of any string-valued attribute can be used as a variable, and variables can be used in the values of any string-valued attribute (except EXPOSURE). It is up to you to make sure that there are no circular dependencies in the use of variable names in attributes - the substitution algorithm is not sophisticated about dependencies.
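A minimal sketch of the substitution step is shown below, assuming the attributes of one extension are held in a dictionary. It is not the algorithm used by configure.py; the function name substitute and the max_passes limit (a simple guard against circular references) are choices made for this example.

```python
import re

VARIABLE = re.compile(r"\$\{(\w+)\}")


def substitute(attributes, max_passes=10):
    """Replace each ${NAME} in string values with the value of attribute NAME."""
    resolved = dict(attributes)
    for _ in range(max_passes):
        # Attributes whose values still contain at least one variable.
        pending = [key for key, value in resolved.items()
                   if isinstance(value, str) and VARIABLE.search(value)]
        if not pending:
            return resolved
        for key in pending:
            resolved[key] = VARIABLE.sub(
                lambda m: str(resolved[m.group(1)]), resolved[key])
    raise ValueError("variables still unresolved; check for circular references")


print(substitute({"INSTRUMENT": "g${EPOCH}", "EPOCH": "00000000"}))
print(substitute({"INSTRUMENT": "g${EPOCH}", "EPOCH": "20140809"}))
```

The two calls reproduce the INSTRUMENT values g00000000 and g20140809 discussed above. A reference to an attribute that does not exist raises KeyError in this sketch.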

Special Attributes

Certain attributes get special treatment by configure.py because they're assumed to be characteristics of an exposure, and there is therefore some checking that there is agreement between the values assigned to all the different devices' catalogs from the same exposure. These are INSTRUMENT, FIELD, RA, DEC, AIRMASS, EXPTIME, MJD, BAND, EPOCH, and APCORR.
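The kind of agreement check described here might look like the sketch below, which groups extensions by EXPOSURE and reports any exposure-level attribute that takes more than one value. The exact checks performed by configure.py are not spelled out on this page, so this is an assumption-laden illustration; the function name find_disagreements and the sample attribute values are hypothetical.

```python
from collections import defaultdict

EXPOSURE_ATTRIBUTES = ("INSTRUMENT", "FIELD", "RA", "DEC", "AIRMASS",
                       "EXPTIME", "MJD", "BAND", "EPOCH", "APCORR")


def find_disagreements(extensions):
    """Return {(exposure, attribute): set of values} wherever extensions disagree."""
    seen = defaultdict(set)
    for ext in extensions:
        for name in EXPOSURE_ATTRIBUTES:
            if name in ext:
                seen[(ext["EXPOSURE"], name)].add(ext[name])
    return {key: values for key, values in seen.items() if len(values) > 1}


# Made-up extensions: two devices of one exposure that disagree on BAND.
extensions = [
    {"EXPOSURE": "D453255", "DEVICE": "N1", "BAND": "g", "FIELD": "0640-34"},
    {"EXPOSURE": "D453255", "DEVICE": "N2", "BAND": "r", "FIELD": "0640-34"},
]
print(find_disagreements(extensions))
```

DEVICE is deliberately left out of the checked attributes, since it is expected to differ between catalogs of the same exposure.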

A few other attributes are generated automatically; you don't specify them in the YAML file. These are EXPOSURE (which comes from the Files part of the YAML file), EXTENSION, and FILENAME.

Special Values

This table lists the special value fields that are available for certain attribute keys.

Attribute    Value           Meaning
expkey       _FILENAME       Use the catalog filename as starting value for EXPOSURE (in Files)
FIELD        _NEAREST        Assign field whose center coordinates are closest to exposure's (RA,DEC)
INSTRUMENT   _REFERENCE      This catalog has no free parameters in its maps
INSTRUMENT   _TAG            Do not use this catalog in solutions, just match it to others
WCSIN        _ICRS           The input (x,y) coordinates are already ICRS (RA, Dec) in degrees, so WCS is Identity map
WCSIN        _HEADER         Read the starting WCS from the extension's FITS header
WCSIN        <name>@<file>   The starting WCS will be the one with given name serialized in file.
IDKEY        _ROW            Detections' ids will simply be the number of the table row in which they appear.
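The _NEAREST value for FIELD can be pictured as a great-circle distance comparison between the exposure's (RA, DEC) and each field center. The sketch below uses the haversine formula in pure Python and is only an illustration under that assumption; configure.py may compute the separation differently, and the function names are hypothetical. Field centers are given in decimal degrees, converted from the Fields example earlier on this page.

```python
import math


def separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two (RA, Dec) points in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def nearest_field(fields, ra, dec):
    """Return the name of the field whose center is closest to (ra, dec)."""
    return min(fields, key=lambda f: separation(ra, dec, f["ra"], f["dec"]))["name"]


fields = [
    {"name": "0640-34", "ra": 100.0, "dec": -34.0},
    {"name": "0730-50", "ra": 112.5, "dec": -50.0},
    {"name": "1327-48", "ra": 201.75, "dec": -48.75},
]
# A made-up exposure pointing close to the first field center.
print(nearest_field(fields, 100.4, -34.3))
```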

Required for processing

The file config/sample.yaml is an annotated example of all the entries that are required or often useful for running the PhotoFit and/or WCSFit processes.
