Establishing the provenance of science data products is important to the users of HLSP collections for a number of reasons, including:
- Promoting data traceability and reproducibility
- Establishing the pedigree for the quality of the data processing
- Understanding the facilities, instruments, and configurations used to obtain the data, and the environmental circumstances under which the data were obtained
Provenance should be recorded in multiple ways, including the Project Description file and the README file that must be included in every HLSP collection (see Required Contents). Provenance is also established in the primary journal paper that describes each HLSP collection. Several of the metadata items listed in this chapter as required or recommended partially address these needs. This article focuses on recording sufficient metadata to create associations between HLSP products and the MAST mission products (or the products from other observatories) from which they were derived. Two approaches have been used for this purpose, as described in the subsections below.
The other sections of this chapter identify some keywords whose values may differ among the contributing products. To represent a meaningful value for such a keyword in the HLSP product:
- the value should be 'MULTI'
- the keyword record should be followed by supplemental records with an abbreviation of the original keyword, with a 2-digit numerical suffix
- each supplemental record should contain a value appropriate to the contributing product, in numerical order
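The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not an official tool: the function name `multi_keyword_records`, the choice to abbreviate by truncating the keyword to six characters, and the instrument names are all assumptions made for the example. The six-character stem is chosen only because FITS keyword names are limited to eight characters, which leaves room for the 2-digit suffix.

```python
def multi_keyword_records(keyword, values, stem_len=6):
    """Build (keyword, value) records for a keyword whose value differs
    among the contributing products.

    Hypothetical helper: truncating the keyword to `stem_len` characters
    is one possible way to abbreviate it, not a prescribed rule.
    """
    if len(values) > 99:
        raise ValueError("a 2-digit suffix allows at most 99 values")
    stem = keyword[:stem_len]
    # The original keyword carries the placeholder value 'MULTI'.
    records = [(keyword, "MULTI")]
    # Supplemental records follow in numerical order: stem + 01, 02, ...
    for index, value in enumerate(values, start=1):
        records.append((f"{stem}{index:02d}", value))
    return records


# Illustrative values only: two instruments contributing to one product.
records = multi_keyword_records("INSTRUME", ["COS", "STIS"])
for key, value in records:
    print(f"{key:<8} = '{value}'")
```

The resulting pairs can then be written to the product header, in this order, with whatever FITS library the project already uses.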
The following table illustrates the concept for a composite UV spectrum of a target, using two HST instruments. Note that the contributing observations are linked to the HLSP product through the MAST Observation ID, shown in the table as '
An acceptable synonym for
FITS Provenance Extension
It is sometimes tedious, even impractical, to record critical metadata for every data product that contributed to constructing an HLSP product. Examples include:
- combined images constructed from many exposures,
- multi-band catalogs that draw data from many observing facilities,
- SEDs composed from spectra from multiple observing facilities, telescopes, and instruments in one or more configurations
In these cases it may be better to collect this information in a BINTABLE extension to each HLSP FITS product. The table should have the following attributes:
- Set EXTNAME = 'PROVENANCE' in the extension header
- List each contributing data product, one per table row
- Use one table column per attribute
- At a minimum, include all relevant attributes that vary among contributing products
- Specify the data type and units (if applicable) in the header for each attribute
The following table gives examples of attributes that may be applicable to an HLSP collection.
|File name or observatory-unique identifier of the contributing observation. For products from MAST missions, provide the Observation ID so that the contributing data may be linked within MAST.
|ISO 8601-formatted date-time string for observation start
|ISO 8601-formatted date-time string for observation end
|Name of dispersing optical element used
|Name of (possibly passband limiting) filter used
|Name of instrument used
|Coordinate reference frame
|Name of telescope used
|Total duration of exposure in sec, exclusive of dead-time