{ "cells": [ { "cell_type": "markdown", "id": "cd633213", "metadata": {}, "source": [
"# JWST SI Keyword Search for Observations\n",
"## Introduction\n",
"\n",
"This tutorial will illustrate how to use the MAST API to search for JWST science data by values of [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) header keywords, and then retrieve all products for the corresponding Observations. \n",
"Searching by SI Keyword values and accessing all data products is not supported in the [MAST Portal](https://mast.stsci.edu/portal/Mashup/Clients/Mast/Portal.html), nor with the [astroquery.mast](https://astroquery.readthedocs.io/en/latest/mast/mast.html) `Observations` class by itself. \n",
"\n",
"Specifically, this tutorial will show you how to:\n",
"* Use the `Mast` class of [astroquery.mast](https://astroquery.readthedocs.io/en/latest/mast/mast.html) to search for JWST science files by values of [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) header keywords\n",
"* Construct a unique set of Observation IDs to perform a search with the astroquery.mast `Observations` class\n",
"* Fetch the unique data products associated with the Observations\n",
"* Filter the results for science products\n",
"* Download a bash script that retrieves the filtered products\n",
"\n",
"The two classes used for these searches differ as follows:\n",
"* The `Mast` class searches for FITS products that match values of user-specified keywords, where the set of possible keywords is very large. It returns only FITS products, and finds only the highest level of calibrated products (generally, L-2b and L-3).\n",
"* The `Observations` class searches for data products that match certain metadata values. The metadata available for such a search is limited to coordinates, timestamps, and a modest set of instrument configuration information. It returns MAST `Observations` objects, which are collections of all levels of products (all formats) and all ancillary data products.\n",
"\n",
"Keywords that hold dates come in two forms: the keyword name itself (e.g., `date_obs`) and the same name with `_mjd` appended (e.g., `date_obs_mjd`). The values are equivalent, but are expressed in ISO-8601 and MJD representations, respectively. \n",
"\n",
"Change or add keywords and values in the `keywords` dictionary below to customize your criteria. Note that multiple discrete values for a parameter are given in a list. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7454aea",
"metadata": {},
"outputs": [],
"source": [
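"# Imports and helpers used below; the helper definitions are assumed, as they are not shown elsewhere in this notebook.\n",
"from astropy.table import unique, vstack\n",
"from astropy.time import Time\n",
"from astroquery.mast import Mast, Observations\n",
"\n",
"def set_params(parameters):\n",
"    # Build the list of filter objects from a {name: values} dictionary\n",
"    return [{'paramName': p, 'values': v} for p, v in parameters.items()]\n",
"\n",
"def set_mjd_range(start, end):\n",
"    # Express ISO-8601 date limits as a min/max range in MJD\n",
"    return {'min': Time(start, format='isot').mjd, 'max': Time(end, format='isot').mjd}\n",
"\n",
"keywords = {\n",
"    'category': ['COM', 'ERS'],\n",
"    'exp_type': ['NIS_SOSS'],\n",
"    'tsovisit': ['T'],\n",
"    # 'productLevel': [3],\n",
"    'date_obs_mjd': [set_mjd_range('2022-06-01', '2022-08-04')],\n",
"}\n",
"\n",
"params = {'columns': '*', 'filters': set_params(keywords)}"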
]
},
{
"cell_type": "markdown",
"id": "eeecd0b8",
"metadata": {},
"source": [
"The following cell displays the constructed parameter object to illustrate the syntax for the query, which is described formally [here](https://mast.stsci.edu/api/v0/_services.html#MastScienceInstrumentKeywordsNircam). "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3ef4b17",
"metadata": {},
"outputs": [],
"source": [
"params"
]
},
{
"cell_type": "markdown",
"id": "9491f5c9",
"metadata": {},
"source": [
"The full selection of keywords upon which to build search criteria is described in the [Field Descriptions for JWST Instrument Keywords](https://mast.stsci.edu/api/v0/_jwst_inst_keywd.html). Note that [astroquery.mast](https://astroquery.readthedocs.io/en/latest/mast/mast.html) parameter names do not always match the FITS keyword names. "
]
},
{
"cell_type": "markdown",
"id": "d584c72a",
"metadata": {},
"source": [
"## Execute the SI Keyword Search\n",
"\n",
"\n",
"This type of query is a little more primitive in [astroquery.mast](https://astroquery.readthedocs.io/en/latest/mast/mast.html) than that for the `Observations` class. Begin by specifying the webservice for the query, which for this case is the [SI keyword search for NIRISS](https://mast.stsci.edu/api/v0/_services.html#MastScienceInstrumentKeywordsNiriss). Then execute the query with arguments for the service and the search parameters that were created above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e08f9e66",
"metadata": {},
"outputs": [],
"source": [
"service = 'Mast.Jwst.Filtered.Niriss'\n",
"t = Mast.service_request(service, params)"
]
},
{
"cell_type": "markdown",
"id": "509e27b3",
"metadata": {},
"source": [
"## Construct the Observation Search\n",
"\n",
"\n",
"The keyword search returns an astropy table of *files* that match the query criteria. We need to construct MAST Observation IDs from the file names in order to query for all JWST *Observations* that match our criteria. Each ID can be derived from a file name by removing the final underscore character and everything after it. Here we make a list of Observation IDs for the subsequent query. Note that we limit the list to *unique* IDs, as many file names have common roots."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d2cdd491",
"metadata": {},
"outputs": [],
"source": [
"# Unique file names:\n",
"fn = list(set(t['filename']))\n",
"# Set of derived Observation IDs:\n",
"ids = list(set(['_'.join(x.split('_')[:-1]) for x in fn]))"
]
},
{
"cell_type": "markdown",
"id": "d1b1567a",
"metadata": {},
"source": [
"Print the list of unique ids if you like."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f34e0487",
"metadata": {},
"outputs": [],
"source": [
"ids"
]
},
{
"cell_type": "markdown",
"id": "425d4bd7",
"metadata": {},
"source": [
"### Execute the Query for Observations\n",
"\n",
"\n",
"Now search for Observations that match the list of Observation IDs constructed above. This search uses the [astroquery.mast](https://astroquery.readthedocs.io/en/latest/mast/mast.html) `Observations` class, where the available search criteria are described [here](https://mast.stsci.edu/api/v0/_c_a_o_mfields.html). Note that we specify the MAST Mission (i.e., the `obs_collection` field) as `JWST` to limit the scope of the query (which also greatly speeds up the search). "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ab8c0310",
"metadata": {},
"outputs": [],
"source": [
"matched_obs = Observations.query_criteria(\n",
" obs_collection='JWST',\n",
" instrument_name='Niriss', \n",
" obs_id=ids\n",
")"
]
},
{
"cell_type": "markdown",
"id": "9080f483",
"metadata": {},
"source": [
"Verify that your query matched at least one observation, or the remaining steps will fail."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "646b96c7",
"metadata": {},
"outputs": [],
"source": [
"print('Found {} matching Observations'.format(len(matched_obs)))"
]
},
{
"cell_type": "markdown",
"id": "61a482b4",
"metadata": {},
"source": [
"## Query for Data Products\n",
"\n",
"\n",
"Next, fetch the data products that are connected to each Observation. Here we take care to fetch the products from Observations a few at a time (in chunks) to avoid server timeouts, which can happen if one or more of the matched Observations contains a large number of files. A larger chunk size will execute faster, but increases the risk of a server timeout.\n",
"\n",
"The following list comprehension splits a single long list into a list of smaller lists, each of which has a size no larger than `sz_chunk`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c5c1c08a",
"metadata": {},
"outputs": [],
"source": [
"sz_chunk = 4\n",
"chunks = [matched_obs[i:i+sz_chunk] for i in range(0,len(matched_obs), sz_chunk)]"
]
},
{
"cell_type": "markdown",
"id": "0349477a",
"metadata": {},
"source": [
"Now fetch the constituent products in a list of tables."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "84b78624",
"metadata": {},
"outputs": [],
"source": [
"t = [Observations.get_product_list(obs) for obs in chunks]"
]
},
{
"cell_type": "markdown",
"id": "0075100e",
"metadata": {},
"source": [
"We need to stack the individual tables and extract a unique set of file names. This avoids redundancy because Observations often have many files in common (e.g., guide-star files). "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5c6386ca",
"metadata": {},
"outputs": [],
"source": [
"products = unique(vstack(t), keys='productFilename')\n",
"print(' Number of unique products: {}'.format(len(products)))"
]
},
{
"cell_type": "markdown",
"id": "b883a0d8",
"metadata": {},
"source": [
"Display the resulting list of files if you like. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "21a450dc",
"metadata": {},
"outputs": [],
"source": [
"products.show_in_notebook(display_length=10)"
]
},
{
"cell_type": "markdown",
"id": "9a115fee",
"metadata": {},
"source": [
"### Filter the Data Products\n",
"\n",
"\n",
"If only a subset of products is of interest (or there is a set of products you would like to exclude), there are a number of ways to select them. The cell below applies a filter to select only products classified as `SCIENCE` plus the files that define product associations; it also excludes guide-star products. See the full set of [Products Field Descriptions](https://mast.stsci.edu/api/v0/_productsfields.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1cbf1fd7",
"metadata": {},
"outputs": [],
"source": [
"filtered_products = Observations.filter_products(\n",
"    products,\n",
"    productType=['SCIENCE', 'INFO']\n",
")"
]
},
{
"cell_type": "markdown",
"id": "16e0283e",
"metadata": {},
"source": [
"Display the filtered product table if you like."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8877fa35",
"metadata": {},
"outputs": [],
"source": [
"filtered_products.show_in_notebook(display_length=10)"
]
},
{
"cell_type": "markdown",
"id": "7b6ec36e",
"metadata": {},
"source": [
"### MAST Login\n",
"\n",
"\n",
"If you intend to retrieve data that are protected by an Exclusive Access Period (EAP), you will need to be both *authorized* and *authenticated*. You can authenticate by presenting a valid [Auth.MAST](https://auth.mast.stsci.edu/info) token with the login function. (See [MAST User Accounts](https://outerspace.stsci.edu/display/MASTDOCS/MAST+User+Accounts) for more information about whether you need to log in.) Note: this step is unnecessary if you are only retrieving public data. \n",
"\n",
"