This page archives Slack comments from Day 1, Session 3 of the Improving JWST Data Products Workshop (IJDPW).



Marco Sirianni - Updated schedule for this session:

  • 1:50 PM 0:30 Exploring Uncharted Territory with JWST Wide Field Slitless Spectroscopy Gabriel Brammer
  • 2:20 PM 0:05 Flickerings in MIRI-MRS observations Gabriel Luan Oliveira
  • 2:25 PM 0:05 Overcoming Telegraph Pixels Timothy Brandt
  • 2:30 PM 0:05 MIRI and NIRSpec IFU observations of the HH46IRS protostar Maria Navarro
  • 2:35 PM 0:20 A Kernel Phase Pipeline for High-Contrast Imaging below the Diffraction Limit with JWST Thomas Vandal
  • 2:55 PM 0:05 Corrections and utilities for imaging and IFU data by the PDRs4All team Dries Van De Putte
  • 3:00 PM 0:20 High-SNR Spectral Extraction Using Empirical PSF Fitting Ian Wong
  • 3:20 PM 0:20 NSClean: An Algorithm for Removing Correlated Noise from JWST NIRSpec Images Bernie Rauscher

Melanie Clarke - I’m not sure I understood the comment at the end of the last session - will the NSClean talk be accessible to virtual participants or no?

Howard Bushouse - I think you already know all there is to know.

Melanie Clarke - I was looking forward to Bernie’s perspective, though!

Marco Sirianni - There will be no broadcast or recording of the NSClean talk.

Melanie Clarke - So not on bluejeans either?

Martha Boyer - The slides aren’t being shared on bluejeans.


Harry Ferguson - @Gabe Brammer When you define custom associations, are you doing that with scripts that parse header metadata and apply some logic, or scripts that use proposal files (and some logic), or really by hand one by one by typing in filenames for each data set?

Nathan Adams - i would presume using the metadata as by hand would be a lot of work and I know gabe works on a lot of surveys!From my experience, i make asn files for STAGE3 because i sometimes grab them from mast too fast before its run its own stage3 reduction and put the products online. I use a python script to sort files by filter keywords and use the stsci asn_from_list.py code to make a stage3 asn from that.  asn_from_list

Gabe Brammer - It's fairly automated based on MAST queries with the astroquery API. All based on rate/cal queries. Here's a query demo.

# Demo grizli / mastquery query for defining associations

# github/gbrammer/mastquery
from mastquery import jwst
import mastquery.utils
import mastquery.overlaps

filters = []
filters += jwst.make_program_filter([4111]) # Suess Abell 2744
filters += jwst.make_query_filter('productLevel', values=['1a','1b','2a','2b'])

# Add criteria to reduce search
filters += jwst.make_query_filter('filter', values=['F480M'])

extensions = ['rate','cal','uncal']

instruments = ['MIR','NRC','NIS'] #,'NRS']

res = jwst.query_all_jwst(instruments=instruments,
                          recent_days=30,
                          filters=filters,
                          columns='*',
                          extensions=extensions,
                          fix=True)

# Compute parent "fields" of overlapping footprints
buffer = 3 # arcmin
tabs = mastquery.overlaps.find_overlaps(res, use_parent=True,
                                        buffer_arcmin=buffer, close=False)
                                        
# Define associations within a "field"
assoc_args = {'max_pa': 5,     # allowed PA offset
              'max_sep': 1.2,  # "visit-splitting"
              'max_time': 2.0, # delta-time, days
              'match_filter': True,
              'match_instrument': True,
              'match_program': True,
              'hack_grism_pa': False,
              'parse_for_grisms': False, 
              'match_detector':True, # Separate associations by detector
              }

for i, tab in enumerate(tabs[:]):
    _ = mastquery.overlaps.split_associations(tab, assoc_min=0, 
                                    assoc_args=assoc_args,
                                    force_split=True,
                                    with_ls_thumbnail=False,
                                    xsize=8, ls_args={},
                                    pad_arcmin=2.5,
                                   )


The results are then sent to my grizli RDS database: grizli-v2 assoc_table. Actually, this notebook is the full workflow: step1-define-associations.ipynb. It looks complicated, but I just "run all" the notebook.


Jeff Valenti - @Gabe Brammer - For NIRCam images, is there a significant difference in RA, Dec for each pixel using FITS SIP vs GWCS? In other words, can one represent the other?

Varun Bajaj - I'm not sure about the grisms, but for the imaging filters it kind of depends. In the header they are stored by default with a SIP degree of 3. This plot shows the offset between sky coords computed from the GWCS and the FITS WCS (degree=3) stored in the header of an NRCA1 image. The offsets are generally small, but can be noticeable when matching catalogs precisely.

Going up to degree=5 makes them effectively 0 for this detector, but I think for some of the other detectors you have to go higher.  To do that though, you still have to read in the GWCS.  I assume offsets this small aren't as relevant for WFSS though.

Gabe Brammer -  Haven't looked recently, but in early days I found that the SIP representation provided directly by the pipeline wasn't reliable.  In grizli I fit the SIP coefficients for each exposure from the GWCS model with a 4th or 5th degree polynomial: grizli/jwst_utils.py#L525. I haven't actually checked or tried to use the fact that the coefficients should always be the same for a given distortion model.  But it's a pretty deterministic / noiseless fit in my experience.
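
(Editor note: the fit Gabe describes — sampling a WCS distortion on a pixel grid and solving for SIP-like polynomial coefficients by least squares — can be sketched with numpy alone. The function and the synthetic distortion below are illustrative stand-ins, not grizli's actual code, which evaluates the real GWCS model.)

```python
import numpy as np

def fit_distortion_poly(x, y, dx, deg):
    """Least-squares fit of a 2D polynomial dx ~ sum c_ij x^i y^j with i+j <= deg."""
    terms = [(i, j) for i in range(deg + 1) for j in range(deg + 1 - i)]
    A = np.column_stack([x**i * y**j for i, j in terms])
    c, *_ = np.linalg.lstsq(A, dx, rcond=None)
    return dict(zip(terms, c)), A @ c

# Synthetic "true" distortion standing in for a GWCS evaluation on a pixel grid
yy, xx = np.mgrid[0:2048:64, 0:2048:64]
x, y = xx.ravel() / 1024.0, yy.ravel() / 1024.0   # normalized pixel coordinates
dx_true = 0.5 * x**2 * y - 0.05 * x * y**3 + 0.01 * y**4

coeffs, dx_fit = fit_distortion_poly(x, y, dx_true, deg=4)
resid = np.abs(dx_fit - dx_true).max()
print(f"max residual: {resid:.2e}")
```

Because the underlying model is smooth and noiseless, the fit is essentially exact, consistent with Gabe's description of a "deterministic / noiseless" fit.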

Jeff Valenti - Very useful information - thanks!


Howard Bushouse - Regarding FITS SIP for WFSS images, forgot to mention in my talk this morning that WFSS is the one spectroscopic mode in which we do include a FITS SIP approximation to the WCS in the headers of the images downstream of the assign_wcs step in the calwebb_spec2 pipeline, for the purpose of backwards compatibility with other packages, such as grizli.


Paul Goudfrooij - @Gabe Brammer - the “continuum” also often contains significant absorption lines, especially Halpha if the underlying continuum is a relatively young population. That would be washed out by using a large median filter, right? (still, it works well to clarify where the emission lines are, of course.)

Gabe Brammer - When doing the redshift / emission line fits, I usually fit with population synthesis continuum templates, e.g., FSPS, along with the emission lines.

Paul Goudfrooij - Right, makes sense, I figured the subtraction of a median filter was “just” to indicate which sources have emission lines.
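
(Editor note: the large-median-filter trick discussed above can be demonstrated on a synthetic spectrum; a filter window much wider than any line traces the continuum — washing out broad features, per Paul's point — so the residual isolates narrow emission lines. Purely illustrative numbers throughout.)

```python
import numpy as np
from scipy.ndimage import median_filter

# Synthetic spectrum: sloped continuum plus one narrow emission line
wave = np.linspace(1.0, 2.0, 500)
continuum = 1.0 + 0.3 * (wave - 1.5)
line = 0.8 * np.exp(-0.5 * ((wave - 1.6) / 0.002) ** 2)
flux = continuum + line

# A median filter much wider than the line follows the continuum only,
# so subtracting it leaves the emission feature behind
cont_model = median_filter(flux, size=51)
residual = flux - cont_model

peak = wave[np.argmax(residual)]
print(f"line recovered near {peak:.3f} um")
```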


Jo Taylor - FYI: the NIRISS team is working to improve the trace calibration.

Rachel Plesha - And we do see that there is an offset in the absolute wavelengths (especially present in F200W) that we are very close to tracking down/fixing.

Gabe Brammer - Great, thanks!  It's a tricky problem!  Much easier with a single artificial source during cryovac testing.


Loic Albert - @Gabe Brammer Is the trace position calibration what limits the model subtraction with NIRISS WFSS? Could undersampling/intra-pixel sensitivity be at play?

Gabe Brammer - Yes, that will also be a factor.  But at least up to now, in the grizli models the traces were just slightly offset. More precise fitting for point sources is eventually possible.  For example, you'd like to have a wavelength-dependent effective PSF that you could evaluate along the trace.  Maybe WebbPSF would be enough. But for galaxies I generally ignore wavelength-dependence along the profile within a particular filter.


Everett Schlawin - @Gabe Brammer, any comment on 1/f noise with GRISMR versus GRISMC? Perhaps for these emission lines it's not as bad as stars with more continuous spectra.

Gabe Brammer - Haven't looked in much detail.  At least for the programs I've been working with the grism exposures are relatively long, so 1/f importance somewhat reduced.


Jeff Valenti - @Gabe Brammer - Please put DJA links here in slack.


Tyler Pauly - @Gabe Brammer This may relate only to the slide describing grizli, but how much of the WFSS science you’ve shown requires changes to MAST/pipeline products, and how much requires a detailed, hands-on and/or iterative reduction by a scientist?

Gabe Brammer - It pretty much stops with the pipeline products after the Level 1 rate files. It doesn't take much manual interaction, though, to get to the point of extracting and fitting spectra. These are all pretty badly out of date, but the general procedures haven't changed much: grizli-notebooks JWST. I'll try to update this week.


David Law - @Gabriel Oliveira Those look like bad/warm pixels, which evolve over time.  In large part they should be taken care of when the bad pixel masks are updated (planned to happen roughly every 6 months).  You should also be able to dial the parameters of the spec3 outlier detection to be more or less aggressive in flagging them; have you had any luck changing this from the default parameters?  (The crosses are from inter-pixel capacitance spilling charge from a very warm central pixel, where the central peak is being flagged but the surrounding spilled pixels are not.)

Anil Seth - Hi David — have you had luck in removing these features yourself in MIRI spectra?  Any recommendations on the outlier detection flagging would be appreciated!

Gabriel Oliveira - @David Law Hi. I tried different kernel_size values in outlier detection, as well as different values for threshold_percent; I just can't remember now how low I set it. I will try this later, thanks for the suggestion. I've been using the latest version of the pipeline, so I think there are no update-related problems. Our observations were made in June; maybe we need to wait longer for the pipeline?

David Law - At the moment the most reliable method I've seen is to use dedicated backgrounds; bad/warm pixels will be consistent between a science observation and the corresponding background.  I median the background observations together, percentile-flag everything above a certain level as a bad pixel, and then set those flags in the science data too.  That gets a little tricky in Channel 3/4, where the background is higher, so I use a notebook that models that and removes it first.  I can point you to a notebook I've been using. In the longer term, this should be improved enormously when we update our bad pixel masks in the next few weeks.  The bad pixel mask in use is from commissioning, and we've been finding that it evolves fast enough that it really needs to be updated every few months.  The outlier detection routine in spec3 does a reasonable job of finding leftover bad pixels, but the bad pixel mask is so outdated at this point that the outlier routine can't handle it all for data taken more recently.
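
(Editor note: David's recipe — median-combine the dedicated background exposures, percentile-flag bright outliers, and propagate the flags to the science DQ — could be sketched roughly as below. The function name and percentile value are illustrative, not those of his notebook; DO_NOT_USE = 1 follows the jwst DQ bit convention.)

```python
import numpy as np

DO_NOT_USE = 1  # jwst DQ convention: bit 0 marks unusable pixels

def flag_warm_pixels(bkg_stack, sci_dq, percentile=99.5):
    """Median-combine background exposures and flag outlier-bright pixels.

    bkg_stack : (n_exp, ny, nx) background rate images
    sci_dq    : (ny, nx) science DQ array, updated in place
    """
    med = np.median(bkg_stack, axis=0)        # median suppresses cosmic rays
    cut = np.nanpercentile(med, percentile)   # everything brighter is "warm"
    warm = med > cut
    sci_dq[warm] |= DO_NOT_USE                # propagate flags to the science DQ
    return warm

# Demo: 3 background exposures sharing two persistent warm pixels
rng = np.random.default_rng(0)
bkg = rng.normal(1.0, 0.05, size=(3, 64, 64))
bkg[:, 10, 20] += 50.0
bkg[:, 40, 5] += 30.0

dq = np.zeros((64, 64), dtype=np.uint32)
warm = flag_warm_pixels(bkg, dq, percentile=99.9)
print(warm.sum(), "pixels flagged")
```

Per the later discussion in this thread, setting the flags at the rate stage lets pixel_replacement interpolate over the flagged pixels.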

Anil Seth - Thanks, David, this is a very helpful reply. And we’d certainly appreciate links to any notebooks you have that implement this.

Gabriel Oliveira - That is a very clever way to find these pixels. I actually didn't look in the background to realize they show the same warm pixels. I'll look for that. Good to know that the outlier detection will be improved. @David Law Thank you for the feedback.

David Law - @Anil Seth @Gabriel Oliveira Here's that notebook I was talking about that uses background data to find and flag new warm pixels.  It's a little clunky, but I've found it to work pretty well in general. Flag_badpix.ipynb

Gabriel Oliveira - @David Law I re-ran the updated pipeline today and dialed threshold_percent to be more aggressive. The problem with lowering threshold_percent is that, just as the removal of hot pixels is getting good, it starts to flag too many pixels on the emission lines and on the continuum. About the notebook you sent: when you flag a warm pixel, do you also flag the pixels around it? Or does the pipeline avoid spreading the warm pixel signal when you update the data quality in the rate files? I'm asking because my script updates the data quality in the cal files, just to remove the artifacts that show up in the data cube. Is this not ideal, or is it wrong in some way?

David Law - Hi @Gabriel Oliveira Just realized I'd missed your follow-up.  The notebook above just flags warm pixels, not those around them explicitly.  Although, if the surrounding pixels are sufficiently affected then they too would get flagged.  The pipeline does not try to do any explicit correction for the inter-pixel capacitance spreading flux from one pixel to another, but relies on individual pixels being flagged to use or not use.  Flagging in the rate or cal files would both be OK, though flagging at the rate stage means that the pixel_replacement routine would be able to try to interpolate a reasonable value for the missing data, which can improve the overall SNR of the final spectra.


Anil Seth - @Gabriel Oliveira is your implementation available anywhere?

Gabriel Oliveira - Not yet. As soon as I clean the code, I will push it to github.


Everett Schlawin - @Timothy Brandt, sorry if you already said this, but do you use the Gaussian mixture model with all pixels to identify random telegraph pixels, or do you focus on the ones that are already marked as telegraph in the DQ extension?

Timothy Brandt - I identify them myself by the distribution of read-to-read differences in long darks. I then apply this approach only to the telegraph pixels.

Jeff Valenti - Good question!

Michael Regan - It seems like we need to use input that hasn’t run the Jump step. So, we need to replace the last two steps of detector 1 for telegraph pixels.

Timothy Brandt - For example, the 40th, 70th, and 95th percentiles of absolute differences should have certain ratios under Gaussian assumptions.  Deviations from these ratios in a series of long darks indicate misbehaving pixels. Yes, no jump step, that is correct.  And I am assuming the read noise limit; otherwise the problem is much harder.  If photon noise dominates, then it hardly matters whether a pixel is a telegraph pixel anyway.
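
(Editor note: the percentile-ratio diagnostic Tim describes can be illustrated with simulated read-to-read differences. For pure Gaussian read noise, the ratio of fixed percentiles of the absolute differences is a constant, independent of the noise level; a two-level telegraph pixel drives the ratio far from that constant. The switching rate and jump amplitude below are arbitrary.)

```python
import numpy as np
from scipy.special import erfinv

rng = np.random.default_rng(42)
n = 20000  # read-to-read differences from a stack of long darks

# Well-behaved pixel: pure Gaussian read noise
gauss = rng.normal(0.0, 5.0, n)

# Telegraph pixel: same read noise plus random switches between two levels
state = np.cumsum(rng.random(n) < 0.1) % 2          # two-level switching
rts = gauss + 40.0 * np.diff(np.concatenate([[0], state]))

def ratio_95_40(diffs):
    p40, p95 = np.percentile(np.abs(diffs), [40, 95])
    return p95 / p40

# For |N(0, sigma)| the p-th quantile is sigma*sqrt(2)*erfinv(p), so the
# 95/40 percentile ratio is a fixed number, independent of the noise level
expected = erfinv(0.95) / erfinv(0.40)
print(f"expected {expected:.2f}  gaussian {ratio_95_40(gauss):.2f}  "
      f"telegraph {ratio_95_40(rts):.2f}")
```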


Jeff Valenti - @Thomas Vandal - For reference: jwst-kpi


Jeff Valenti - @Thomas Vandal - How many pixels remain after the first "crop" step in the KPI stage 3 pipeline? Next slide suggests ~60 pixels on a side.

Thomas Vandal - Yes, it's up to the user, but the default is 70 (half-size=35)


Kevin Volk - I have lost the live stream on youtube.  Have others had this issue?

Oliver King - It’s just died for me too.

Mario Giuseppe Guarcello - Me too.

Macarena Garcia Marin - Thanks for reporting. We are looking into it.

Marco Sirianni - It should be back.


Aaran Shaw - @Thomas Vandal how does this technique compare with ground-based NIR interferometry and AO capabilities? (Really cool technique by the way!)

Aaran Shaw - Great answer, thank you. Looks like this technique can really expand upon and complement ground based capabilities (editor note: see recorded Q&A after talk)

Thomas Vandal - Yes! It can also be used with ground-based images, see this paper with VLT/NaCO by Jens: 2019MNRAS.486..639K. Typically it will reach shorter separations, but lower contrast than coronagraphic imaging


Kevin Volk - The stream is back now, thank you.


Jeff Valenti - @Thomas Vandal - Can KPI be used on some, lots, or all archival imaging data?

Thomas Vandal - It can be applied to any observation technically, though a relatively high SNR will probably help. A point I did not really cover is that we typically want a reference observation of a point source to calibrate the observations.


Tyler Pauly - @Thomas Vandal Does the sensitivity derived from simulations show any dependence on position angle?

Thomas Vandal - The test has been done for bright targets by Ceau et al. (2019) and the sensitivity does not vary too much.

For more realistic simulations (or injection in real data), I don't have the answer yet, unfortunately!

Tyler Pauly - Was just a question of curiosity, thanks for the reply!


James Davies - @Gabe Brammer - DJA


David Law - @Maria Gabriela Navarro Ovando Did you have a dedicated background for your observations?  For the fringing, was this on pipeline extracted spectra or spectra of individual cube spaxels?


David Law - @Dries Van De Putte Did the 1d residual fringe correction help for the MIRI MRS spectra?

Dries Van De Putte - It improved things, but there were still strange curves left. So smaller amplitude, but harder to interpret if they were real or not. For that specific paper (identifying new emission features), we turned it off, and used the empirical method I presented instead. At a later stage, I tried the 1D fringe correction (applied after extracting spectra from the cube), and this worked better than the ‘whole cube’ residual defringer.

David Law - Gotcha, I've found that the 1d fringe correction works better than the 2d in general as well.

Jane Morrison - When you applied the 1D fringe correction, did you also have to use your empirical method?

Dries Van De Putte - I tried it with some of the PDR GTO data (Horsehead and NGC 7023). The 1D defringer is mainly needed for channel 2 long and channel 3 short (where the 11.3 feature is). I don’t think I will need the empirical method for my science goals. But I will take a look at some of my notebooks from a while ago and double-check what it looked like after the correction. Here’s what it looks like for the Horsehead (original without residual defringe + 1D defringe applied)

There are still some suspicious things left for the biggest peaks.

David Law - Looks like it's improving quite a lot, but not as much as it could.  I wonder if it might do better running the 1d defringer on individual exposures, and if the spectral coadding is washing out the fringe signal a little so that the fitter can't get such a good solution.  I've seen cases before where it fits the residual fringes well at most wavelengths, but will then miss a chunk of one waveband.
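
(Editor note: the 1D defringing idea — fit a quasi-sinusoidal modulation to the continuum-normalized extracted spectrum and divide it out — can be sketched on synthetic data. The pipeline's residual_fringe step fits far more sophisticated multi-component sine models; the single sinusoid, its frequency, and the helper names here are purely illustrative.)

```python
import numpy as np
from scipy.optimize import curve_fit

def continuum_normalize(k, s, deg=3):
    """Divide out a low-order polynomial continuum (fit on centered k)."""
    kc = k - k.mean()
    return s / np.polyval(np.polyfit(kc, s, deg), kc)

# Synthetic extracted spectrum: smooth continuum times a sinusoidal fringe
wavenum = np.linspace(600.0, 700.0, 500)
continuum = 1e3 / wavenum
fringe_true = 1.0 + 0.08 * np.sin(2 * np.pi * 0.35 * wavenum + 0.4)
spec = continuum * fringe_true

def fringe(k, amp, freq, phase):
    return 1.0 + amp * np.sin(2 * np.pi * freq * k + phase)

# Fit the fringe on the continuum-normalized spectrum, then divide it out
norm = continuum_normalize(wavenum, spec)
popt, _ = curve_fit(fringe, wavenum, norm, p0=[0.05, 0.35, 0.0])
defringed = spec / fringe(wavenum, *popt)

before = np.std(norm - 1.0)
after = np.std(continuum_normalize(wavenum, defringed) - 1.0)
print(f"fringe rms: {before:.4f} -> {after:.4f}")
```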


Inbal Mizrahi - @Ian Wong How do you decide the size of the box you use as your background? And do you adjust based on how crowded the field is?

Ian Wong - For the background region, the performance is not substantially affected by a wide range of region boundaries, as long as it is big enough to exclude the source’s PSF wings. For crowded fields (with several discrete background sources), I manually mask non-target sources prior to running the PSF fitting.


Anna Pusack - @Ian Wong What do you think is the applicability of PSF fitting in a crowded field, as in star clusters, etc.?

Ian Wong - If the PSFs of the individual sources are overlapping, then my current PSF fitting implementation does not work, unless you assume that the sources are identical, in which case you can effectively extract them simultaneously. Otherwise, you would need an external PSF model (e.g., webbpsf).


Kevin Volk - The stream seems to have gone down again for me.

Marco Sirianni - AV is aware and working on it.


Jeff Valenti - @Ian Wong How difficult would it be to automate PSF photometry?  It sounds like it needs expert tuning.

Ian Wong - For most faint point sources in clear fields, we have arrived at some default prescriptions that work well across the board. The knobs that I turn for PSF photometry are primarily (1) extraction region size, (2) background region inner boundary, and (3) width of the moving median interval used to generate template PSFs. There are other more minor knobs like the outlier rejection threshold (my PSF fitting procedure is iterative, with outlier masking based on residual array values between each fit), but the three parameters above handle most of the “tuning”.
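
(Editor note: a toy version of the scheme Ian describes — template PSFs built from a moving median of normalized profiles, with an iterative amplitude fit that masks residual outliers — might look like the sketch below. 1D profiles stand in for 2D PSF stamps; all names, window sizes, and thresholds are illustrative, not Ian's actual implementation.)

```python
import numpy as np

def extract_empirical_psf(cube, halfwin=10, n_iter=3, clip=4.0):
    """Toy empirical-PSF extraction for a (n_wave, n_pix) spectral cube.

    For each wavelength, the template PSF is the moving median of the
    normalized profiles in a +/-halfwin window; the flux is the
    least-squares amplitude, refit after masking residual outliers.
    """
    n_wave, n_pix = cube.shape
    norm = cube / cube.sum(axis=1, keepdims=True)
    flux = np.empty(n_wave)
    for k in range(n_wave):
        lo, hi = max(0, k - halfwin), min(n_wave, k + halfwin + 1)
        template = np.median(norm[lo:hi], axis=0)   # moving-median template
        good = np.ones(n_pix, dtype=bool)
        for _ in range(n_iter):
            a = (cube[k, good] @ template[good]) / (template[good] @ template[good])
            resid = cube[k] - a * template
            good = np.abs(resid) < clip * np.std(resid[good])  # mask outliers
        flux[k] = a
    return flux

# Demo: Gaussian profile with smoothly varying flux, plus one hot pixel
rng = np.random.default_rng(1)
x = np.arange(31)
prof = np.exp(-0.5 * ((x - 15) / 2.0) ** 2)
true_flux = 100.0 + 10.0 * np.sin(np.linspace(0, 3, 200))
cube = true_flux[:, None] * prof[None, :] + rng.normal(0, 0.1, (200, 31))
cube[50, 15] += 500.0  # hot pixel landing right on the trace

flux = extract_empirical_psf(cube)
err = np.median(np.abs(flux - true_flux * prof.sum()))
print(f"median flux error: {err:.3f}")
```

The hot pixel at slice 50 is rejected by the residual masking rather than contaminating the extracted flux, which is the point of the iterative fit.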


David Law - @Ian Wong Athalea is time variable on hour-long scales, and there are systematic differences in the absolute calibration between different standard stars.  Which are you tying your recalibration to?

Ian Wong - The custom flux calibration I’m using takes a range of different standard stars (G and A types). By and large, I’ve found that different standard stars result in a ~constant common-mode offset in the flux correction factors, i.e., not much wavelength dependence (offset = multiplicative factor).

David Law - Right, it's a nearly constant multiplier for each star, generally the A stars seem more reliable than the G if I recall.

Ian Wong - Yes, for MIRI I’m mostly using A stars. For NIRSpec, where we’re interested in reflectance spectra, I’m directly taking the G-star solar standard extraction and dividing it from the extracted target spectrum.


Jeff Valenti - We are not broadcasting @Bernie Rauscher's talk because STScI requires copyright forms to broadcast talks, but civil servants are not allowed to sign copyright forms themselves. There is a process to get a signed copyright form, but we did not get that done.

Jane Rigby - Apologies that this mixup prevented online folks from hearing Bernie’s talk.  Bernie’s NSClean paper is here on arXiv; I know it’s just been accepted, so please look for the accepted version in PASP soon, or reach out to Bernie via email. Also, we’re working to find a general solution so this issue doesn’t recur for other workshops.


Jeff Valenti - @Bernie Rauscher - NSClean software (tar ball) is linked from the Tools from the Community page.


Marco Sirianni - @Bernie Rauscher Besides the convenience of the "masking", is there anything NIRSpec-specific in the code? Can it be applied to other NIR data with appropriate masking?

Michael Regan - The key with NIRSpec is there are pixels that are not exposed to the sky.


Loic Albert - @Bernie Rauscher If you measure the background scatter in a small region (say 16x16 pixels), how much scatter improvement will NSClean bring?
(In other words, does the low-pass filter leave high spatial frequency 1/f residuals?)


Howard Bushouse - The NSClean implementation for the "official" pipeline is in draft form at pull/8000. It inserts the nsclean step near the beginning of the Spec2Pipeline, after executing assign_wcs and msa_flagging, so that WCS is available and DQ flags for stuck open MSA shutters are also available. It uses the WCS info to create the mask on the fly, masking out all IFU slices, MOS slitlets, etc., so that only unilluminated pixels are used to do the background fitting. It works for all NIRSpec modes: IFU, MOS, fixed-slit, and BOTS. In the interest of giving appropriate attribution, the pipeline implementation is largely modeled on the python notebooks created by @Melanie Clarke (see the jwst-caveats repo) and of course @Bernie Rauscher’s underlying nsclean library. Will hopefully be merged into the development version of the pipeline soon and then available in the DMS B10.1 release coming out early next year.  </advertisement>
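
(Editor note: NSClean itself fits a Fourier-domain background model to the unilluminated pixels, as described in Bernie's paper; the sketch below shows only the masking idea in its crudest per-row form, a quick 1/f-style correction. It is not the NSClean algorithm, and all values are illustrative.)

```python
import numpy as np

def subtract_row_background(image, illuminated_mask):
    """Simplified 1/f-style correction: estimate each detector row's
    correlated noise level from unilluminated pixels only and subtract it.
    (NSClean instead fits a Fourier-domain background model.)
    """
    cleaned = image.copy()
    for i in range(image.shape[0]):
        dark = image[i, ~illuminated_mask[i]]
        if dark.size:
            cleaned[i] -= np.median(dark)
    return cleaned

# Demo: row-correlated noise plus a bright "spectral" region to be masked
rng = np.random.default_rng(7)
img = rng.normal(0, 1, (64, 64)) + rng.normal(0, 5, (64, 1))  # 1/f-like row offsets
mask = np.zeros((64, 64), dtype=bool)
mask[:, 25:35] = True            # illuminated pixels, excluded from the fit
img[:, 25:35] += 100.0

clean = subtract_row_background(img, mask)
print(f"row-to-row scatter: {img[:, :25].mean(axis=1).std():.2f} -> "
      f"{clean[:, :25].mean(axis=1).std():.2f}")
```

The masking matters because including illuminated pixels in the background estimate would subtract real signal along with the correlated noise — the same reason the pipeline step builds its mask from the WCS and MSA flagging first.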


Harry Ferguson - @Bernie Rauscher How much better is the 1/f removal for NRSIRS2RAPID relative to NRSIRS2 for the same number of frames? For NIRCam if we could download just the reference pixels from all the reads, could we basically cure 1/f, or would we still have a large pattern to remove empirically?


Bethan James - For those interested, @Melanie Clarke has produced a jupyter notebook for NSClean which shows an example of how to clean residual 1/f noise in NIRSpec IFU Products


James Davies - @Gabe Brammer you had mentioned you rely on information about which exposures use the same guide star to decide whether you do any tweaking to the WCS.  Is this information available somewhere in the science data headers, or do you have to pull the _gs data?

Gabe Brammer - Hmm, I think you're referring to that I said that I essentially define associations as groups of exposures obtained with the same GS acquisition.  I don't actually check that explicitly, I just meant that my associations aren't necessarily exactly as JWST / APT would define them.

Benjamin Johnson - There is a guide star ID in the headers.

Sarah Kendrew - Yes, guide star ID is in the header.

Gabe Brammer - I always refine the WCS, with something like "tweakshifts" for exposure shifts within an association and then a global shift + rotation relative to some external reference catalog (e.g., legacy_surveys DR9, PS1, GAIA)
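
(Editor note: the "global shift + rotation" refinement Gabe mentions amounts to fitting a rigid transform between matched catalogs, e.g. via the 2D Procrustes/Kabsch solution. A minimal sketch, illustrative rather than grizli's implementation:)

```python
import numpy as np

def fit_shift_rotation(xy, xy_ref):
    """Least-squares rigid alignment (rotation + shift) of matched catalogs,
    via the 2D Procrustes/Kabsch solution."""
    c, c_ref = xy.mean(axis=0), xy_ref.mean(axis=0)
    H = (xy - c).T @ (xy_ref - c_ref)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    shift = c_ref - R @ c
    return R, shift

# Demo: recover a 0.1 deg rotation and a small shift from matched positions
rng = np.random.default_rng(3)
ref = rng.uniform(0, 2048, (100, 2))
theta = np.deg2rad(0.1)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
obs = (ref - ref.mean(0)) @ R_true.T + ref.mean(0) + [5.0, -3.0]

R, shift = fit_shift_rotation(obs, ref)
angle = np.rad2deg(np.arctan2(R[1, 0], R[0, 0]))
print(f"fitted rotation: {angle:.3f} deg")   # the inverse of the applied 0.1 deg
```

Here the fit recovers the inverse of the applied rotation exactly because the inputs are noiseless; with real catalogs the matched positions carry measurement noise and the fit is done after outlier rejection.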

Thomas Williams - Currently for the PHANGS stuff we just use a shift (no rotation) for simplicity. Do you have a feel for (a) whether there are rotations from tile to tile in a mosaic, and (b) whether there’s some overall rotation that should be included? And a sense of their size (a few degrees of rotation or smaller)?

Gabe Brammer - A small fraction of a degree, but not always zero.

Thomas Williams - Constant between tiles in an observation? So we can get away with just a shift to an absolute catalogue once we have a mosaic?

Benjamin Johnson - @Gabe Brammer have you looked at trends or typical values per detector for these within association tweaks?

Gabe Brammer - No, I haven't really studied patterns in the alignment.


Savannah Gramze - @Thomas Vandal How well does KPI work on fields with a high stellar density? Do targets need to be well separated from other stars?

Loic Albert - An important requirement is that the SNR of the star needs to be >>1000. KPI works for accumulated photons > a few 10^6.

Thomas Vandal - Indeed, and if you reach a high enough SNR, you'd need a good forward model of the scene. I'm not aware of cases where it was used for a case like this one, but it can work on extended sources (e.g. a disk around a star)

Savannah Gramze - Could KPI be used to identify stars with disks then? That could be very useful for identifying young stellar object candidates.

Loic Albert - Yes, if the disk to star contrast is more favorable than ~1/200 to 1/300. If the disk is fainter, then KPI won't have the sensitivity

Savannah Gramze - If the disk is bright enough, could it then be used to find properties like the radius and inclination of the disk?

Thomas Vandal - Yes, if you have a model for your scene, you can model it in the Fourier plane on the kernel phases!
