This page archives Slack comments from the focused discussion on NGDEEP during the Improving JWST Data Products Workshop (IJDPW).



Dan Coe - NGDEEP deep NIRCam images are not achieving the expected depth. The NIRCam parallels use DEEP8 with 4 groups and 7 integrations (2079.pdf). Description of the issue with NGDEEP, as @Mic Bagley reported via @Alicia Canipe:

  1. The measured depths from the final mosaics in NGDEEP are generally 0.1-0.5 magnitude shallower than what is predicted by the ETC.
  2. Comparing with the MIRI Deep Survey, which has a similar strategy in a partially overlapping field, NGDEEP is clearly not deeper, even though it has 50-70% more exposure time in some filters.

Timothy Brandt - Ok, I have significantly better results for jw02079004001_03201_00001 than from the _rate.fits.  I fit a ramp including jump detection, combining all of the integrations with synthetic jumps between resets.  I assumed a gain of 2.05 and the read noise from the read noise files.  I am using the first read in addition to the four resultants per integration, subtracting the first read from the first resultant (res1 -> 8/7*(res1 - read1/8))

The first of the two images is mine (brandt_samplerates1.fits); the second is the _rate.fits file.  I flatfielded both to make them easier to visualize.  I did not apply a bad pixel mask. I can do slightly better than this file on the jump detection I think.


Mic Bagley - This looks great, thanks for sharing! Would you mind describing what you mean by synthetic jumps?


Timothy Brandt - Here's another look (hopefully the gif blinks).  I mean that I consider it not as seven exposures of four resultants, but as a single series of 35 resultants.  The first resultant is the first read, the second resultant is reads 2, 3, 4, 5, 6, 7, 8, etc.  There is a reset between resultants 5 and 6, so I pretend that there is a jump/cosmic ray there as far as the ramp fitting algorithm is concerned.

I can clean up and share a notebook if you'd like to do this yourself.  I am currently running no linearity correction, applying no bad pixel mask, and the reference pixel correction could probably be improved.
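
A minimal sketch of the rearrangement Tim describes above (the array contents, shapes, and reset handling are illustrative stand-ins, not the actual notebook):

    import numpy as np

    # Illustrative shapes for one pixel of a DEEP8 exposure: 7 integrations,
    # 4 resultants each, plus the separately downlinked first read.
    nints, nres = 7, 4
    resultants = np.zeros((nints, nres))   # stand-in for the measured resultants
    first_reads = np.zeros(nints)          # stand-in for the first read of each integration

    # Remove the first read's contribution from the first resultant,
    # res1 -> 8/7 * (res1 - read1/8), as described earlier in the thread.
    resultants[:, 0] = 8.0 / 7.0 * (resultants[:, 0] - first_reads / 8.0)

    # Treat the data as one series of 5 samples per integration (first read
    # followed by the four resultants), i.e. 35 samples in total, and flag a
    # "synthetic jump" at each reset so the ramp fitter ignores the sample
    # difference that straddles the reset.
    samples = np.concatenate(
        [np.concatenate(([first_reads[i]], resultants[i])) for i in range(nints)]
    )
    synthetic_jump = np.zeros(samples.size, dtype=bool)
    synthetic_jump[5::5] = True   # first sample of every integration after the first

The flagged sample differences are then treated exactly like cosmic-ray hits by the jump-detection and ramp-fitting machinery.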


Mic Bagley - Yeah, that would be awesome, thank you!  This looks really promising.


Timothy Brandt - See google drive folder. I think you have everything you need there, at least for nrca1.  You will need the corresponding flats and noise files for the other detectors.  You'll also need to adjust your prefixes, but I think everything else should work. Run time should be less than a minute.


Mic Bagley - Thank you!! We'll check this out and play around with it. Really appreciated!  Quick question: did you do anything special to the flat and rnoise file, or are they just renamed from CRDS?


Timothy Brandt - Just renamed.


Mario Gennaro - I am not sure it is an immediate test of what @Timothy Brandt is doing, but bear with me. I guess the hypothesis we are testing is that CR rejection is really poor with just 4 DEEP8 groups (Tim, you are clearly already a Roman team member, with this "resultants" parlance), am I correct? (sorry for missing some of the discussion on Friday). And Tim's image seems to really indicate that. Has anyone checked the DQ arrays of the MIRI GTO to simply count the found CRs, then scale that by the ratio of the GTO vs NGDEEP measurement times? I guess, from what Tim is showing, that we expect NGDEEP to have way fewer CRs flagged than the GTO.

Having said that, since I missed some of the discussion on Friday, are we saying that possibly the undetected CRs are boosting the variance of the background, as measured by @Mic Bagley, and hence, if the SNR she is testing is in the background-limited regime, this could be the reason for the apparent discrepancy?

As an aside (likely mentioned by others when I wasn't there), the ETC does not inject CRs in its "simulations", so we don't lose groups in the ETC. The ETC (pending details I could check later) simply reduces the SNR by some small fraction that corresponds to the average loss of groups due to the known CR rate (i.e. if, given the exposure setup, you'd expect e.g. x% of pixels to have lost 1 group due to CRs, then the effective exposure time is reduced by x/100 * tgroup and the SNR is reduced by the square root of that factor - don't check my math here, just follow my logic).
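
Following that logic with made-up numbers (the CR-hit fraction and group time below are purely illustrative):

    # Suppose x = 5% of pixels lose one group to a CR in an exposure with
    # ngroups = 4 and tgroup = 100 s (illustrative values only).
    x, ngroups, tgroup = 5.0, 4, 100.0

    t_exp = ngroups * tgroup                    # nominal measurement time per pixel
    t_eff = t_exp - (x / 100.0) * tgroup        # average effective time after CR losses
    snr_scale = (t_eff / t_exp) ** 0.5          # background-limited SNR scaling
    print(f"SNR reduced to {snr_scale:.3f} of nominal")   # ~0.994 here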


Timothy Brandt - @Mario Gennaro Yes, the CR rejection with DEEP8 and 4 groups seems to be poor.  A tidbit of interest here: 48% of the pixels have a DQ flag of 4 in the _rate.fits file.  When I do my own fit, I reject at least one resultant difference in 26% of pixels.  So the pipeline is flagging a lot more pixels than I am, but somehow it isn't handling them very well.

@Mario Gennaro We have seven integrations here, so the impact of CRs should be relatively minor.  The mean number of CRs per pixel is less than one, so we should typically lose less than a factor of sqrt(7/6) in SNR from lost data due to CRs.  I also find that the chi squared values for my fits are pretty good; they suggest that we may be underestimating read noise by ~5-10% in this readout mode but not by more than that.

I am finding ~1% of pixels to be bad by a metric of either having a chi squared value much higher than expected, or having at least 4 jumps detected in the ramp.


Timothy Brandt - One semi-random question: the dark for nrca1, jwst_nircam_dark_0331.fits, looks like nonsense to me.  The science frame is almost all zeros for almost all pixels and reads--there are exactly 2561 pixels with nonzero values in each read.  The uncertainties are larger than I expect and larger than in the read noise file supplied for this detector, but don't look obviously corrupted.  Might there have been an error in the type conversion for the dark data, and if so, is it recoverable?


James Davies - This is a ground-based CV3 dark.  No on-orbit darks have been delivered yet.  Here’s the info on it from CRDS: jwst_nircam_dark_0331.fits. And under Database > Descrip:  “CV3 based dark with zero for nominal pixels and non zero for hot pixels and noutputs added.” I’m sure @Bryan Hilbert can add more info.


Bryan Hilbert - @James Davies is right. In the shortwave channel, the dark rate is low enough that we didn’t get enough data for a decent SNR measurement of it. Many of the darks we took were contaminated with low level persistence. Given that, we decided to set the dark rate to zero for all pixels except those with a dark rate above some threshold. The dark rate in the long wave channel is higher, and so we decided not to zero-out any pixels.


James Davies - I will add that the uncertainty in the dark reference file is not currently propagated by the pipeline.  It is ignored.


Mario Gennaro - @Timothy Brandt - if 48% of the pixels in the _rate.fits file have a DQ flag of 4, that might mean that more than one integration got a CR detection for each of those pixels, right?  @Bryan Hilbert, do we have a per-integration DQ flag for the presence of CRs? And out of curiosity, I guess a quick way to test this would be to take the standard deviation of the background as processed by Tim vs. the pipeline, and see whether we indeed typically lose less than a factor of sqrt(7/6) in SNR from lost data due to CRs, or whether, through multiple-integration CR flagging, we are losing more than that (but again, probably not enough to justify the observed discrepancy).
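
One quick way to run that background check (the file names, extensions, and cutout region are placeholders):

    import numpy as np
    from astropy.io import fits
    from astropy.stats import sigma_clipped_stats

    # Placeholder file names: the pipeline rate image and the re-fit of the
    # same exposure; adjust extensions to match the actual files.
    pipeline = fits.getdata("jw02079004001_03201_00001_nrca1_rate.fits", extname="SCI")
    refit = fits.getdata("brandt_samplerates1.fits", ext=0)

    # Compare the sigma-clipped scatter in a source-free cutout chosen by eye.
    region = (slice(900, 1100), slice(900, 1100))   # illustrative "clean" region
    for name, image in [("pipeline", pipeline), ("re-fit", refit)]:
        mean, median, std = sigma_clipped_stats(image[region], sigma=3.0)
        print(f"{name}: background sigma = {std:.4f}")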


Bryan Hilbert - There's no separate per-int DQ flag for jumps. But you could get that information from the rateints files.
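
A minimal way to pull that per-integration information out of a rateints product (the file name is a placeholder; 4 is the standard JWST JUMP_DET bit, which should propagate into each integration's DQ plane via the ramp-fitting step):

    import numpy as np
    from astropy.io import fits

    JUMP_DET = 4   # JWST DQ bit for a detected jump

    # The rateints DQ cube has one plane per integration.
    dq = fits.getdata("jw02079004001_03201_00001_nrca1_rateints.fits", extname="DQ")
    for i, plane in enumerate(dq):
        frac = np.mean((plane & JUMP_DET) > 0)
        print(f"integration {i + 1}: {100 * frac:.1f}% of pixels flagged with a jump")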


Mario Gennaro - And since we are trying to account for all the various effects here, in the MIRI-GTO vs NGDEEP comparison we have to take into account the difference between the exposure times (as reported by APT) and the measurement times (which can be inferred from, e.g., the ETC and are the relevant times for estimating the SNR scaling, given the way the pipeline performs the up-the-ramp fit). This table might come in handy in future discussions. It is limited to F115W and, for NGDEEP, just to the DEEP8 exposures.

Case                 | NGroups | NINTS | Dithers                             | Total exp. time | Total meas. time
---------------------|---------|-------|-------------------------------------|-----------------|------------------
MIRI-GTO (PID 1283)  | 7       | 2     | 10x2 (10 dithers times 2 obs)       | 55,186 s        | 51,536 s
NGDEEP (PID 2079)    | 4       | 7     | 3x6 (3 dithers times 6 exp. specs)  | 93,152 s        | 81,170 s

That is admittedly a small factor, but indeed the ratio of the relevant times goes from ~1.69 (exposure) down to ~1.58 (measurement).
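
Those two ratios follow directly from the table; under an idealized background-limited sqrt(t) scaling they translate into a modest expected depth advantage for NGDEEP:

    import numpy as np

    exp_ratio = 93152 / 55186    # NGDEEP / MIRI-GTO total exposure time    -> ~1.69
    meas_ratio = 81170 / 51536   # NGDEEP / MIRI-GTO total measurement time -> ~1.58

    # Idealized background-limited case: SNR scales as sqrt(time), so the
    # measurement-time ratio corresponds to ~0.25 mag of extra depth.
    depth_advantage = 2.5 * np.log10(np.sqrt(meas_ratio))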


Michael Regan - @Mic Bagley - Tim strings all the integrations together and inserts a “synthetic jump” between integrations. Since he works with the differences, the reset is effectively a jump.


Timothy Brandt - @James Davies, @Bryan Hilbert - Thanks for the answer on the dark.  Is the original (non-zeroed) file still available somewhere for diagnosing read noise, telegraph pixels, behavior next to hot pixels, etc.?


Bryan Hilbert - @Timothy Brandt I’ll have a look around and see if I can find them. I’m not sure offhand whether that information was saved.


Timothy Brandt - @Mic Bagley, @Mario Gennaro, @James Davies - Some more results: I was able to get rid of almost all artifacts except the 1/f noise using only the bad pixel mask + raw groups (I am currently doing nothing about 1/f apart from a reference pixel correction).  The image looks nice and clean, and I am throwing away just under 0.5% of the non-reference pixels for the nrca1 detector.  I will send a fits file in a following message.  Also, I tested the impact of cosmic rays, discarded jumps, and possible underestimates of the noise on sensitivity.  With the noise inflation I need to get good chi squared values and the read differences I am rejecting, I am getting a ~5% SNR penalty in most of the image over the ideal case of no discarded resultant differences and the read+photon noise being exactly correct.  This rises to a ~10% SNR penalty near a snowball.  So I would expect sensitivity to be no more than ~5-10% worse than the ETC estimate for these particular exposures, i.e., <0.1 mag.

In the attached file (brandt_samplerates2.fits) the first slice is the pipeline _rate.fits file with the flatfield divided out (nearly half of the pixels are flagged with DQ=4; I am not saving the flags).  The second slice is my rate measurement, with a flatfield applied and with pixels I view as unrecoverable set to zero.  The third slice is the uncertainty on the second slice.  The final slice is the ideal noise divided by my best estimate of the noise.  It might not be perfect but it should be reasonably close, and it suggests that we should be able to get within 10% of the ideal ETC sensitivity for this data set, and potentially a fair bit closer than that if various calibrations like 1/f corrections improve things.


Mario Gennaro - @Timothy Brandt, thanks again for all the work. May I ask whether I am interpreting your last message and your data correctly? Can one say that the ETC is reasonably accurate, i.e., to within 10%, and that the SNR loss that @Mic Bagley is reporting may be related to the relatively poor handling of integrations with very few groups by the pipeline?


Timothy Brandt - @Mario Gennaro. I believe that this is the case, yes.  I think we would need to do end-to-end testing to be absolutely certain. Fundamentally there is not much to do with four groups on their own.  If there is a CR within group 2, for example, the difference between groups 1 and 2, and between groups 2 and 3 are both corrupted.  Only the difference between groups 3 and 4 is valid.  But how would you know that?  The pipeline sees only three group differences, all of which disagree. How to pick one?  You could pick the lowest value, assuming CRs to be positive fluctuations, but that isn't a great fix.  We get around it in two ways: the extra read at the beginning makes 5 groups rather than 4, and with more than one integration we are back in business.  So a guideline for the instrument, I think, would be that you should use no fewer than six groups unless you use multiple integrations.
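
A toy numerical version of that argument (all numbers invented, in arbitrary count units):

    import numpy as np

    # A clean 4-group ramp accumulating 50 counts per group:
    clean = np.array([50, 100, 150, 200])
    print(np.diff(clean))   # [50 50 50] -> three consistent group differences

    # An 80-count cosmic ray landing a quarter of the way through group 2 adds
    # 3/4 of the hit to the group-2 average and all of it to groups 3 and 4:
    hit = np.array([50, 160, 230, 280])
    print(np.diff(hit))     # [110  70  50] -> only the last difference is valid,
                            # but with all three disagreeing there is no robust
                            # way to tell which one to keep.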


Mario Gennaro - In fact that IS the recommended strategy, but in the case of parallels it becomes harder to enforce due to data rate and data volume constraints.


Timothy Brandt - In RAPID mode (or anything with single read groups) this requirement is relaxed because a jump is guaranteed to corrupt only one group difference, not two.


Mario Gennaro - With a simple ds9 look at your samplerates.fits file, I see the following: if you pick a "clean" region of the _rate image, it does not look that different from yours (e.g. in terms of the variance of the sky). BUT there is a huge number of "low level" CR-affected pixels (e.g. with a rate that is double or triple the sky rate, or something like that) that are not correctly handled by the pipeline and that you instead do manage to flag out (i.e. your image looks so much better, even by eye!).

I think that @Mic Bagley told me that one of the ways they are checking the SNR is to estimate the variance of the background in level 3 images. Although we can't say for sure without an ETE test, as you said, I suspect that what is happening is that even with image recombination from several dithers, these defects that are visible by eye in the _rate image end up propagating into the level 3 products (possibly they are smoothed out by averaging over several dithers, but they are not sigma-clipped because they are not hugely different from the sky level). Given the large number of these "unnoticed blemishes" in individual dithers, they end up percolating quite ubiquitously into the level 3 images. The end result is that the sky level is higher than it should be. Not sure if I am oversimplifying things.


Timothy Brandt - @Mario Gennaro, that is my guess, yes.  Without a jump the pipeline and my algorithm should give nearly identical slopes, so this is all down to the jump flagging/masking. I am not doing some of the calibration steps (bias removal, nonlinearity correction), so there are small differences due to that. I think the issues could also be mostly avoided with a median rather than a mean combination of the integrations, or a sigma-clipped mean.  A median is far from ideal, because missed jumps tend to be positive (so you will still get a biased answer) and the median entails an SNR penalty with nearly Gaussian noise.  A sigma-clipped mean of the integrations should come closer, but it still throws away valid group differences that I can use.
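
For concreteness, the two alternatives mentioned above would look roughly like this when combining per-integration slopes (the slope values and clip threshold are placeholders; with only seven integrations, a fairly aggressive clip is needed to catch a single outlier):

    import numpy as np
    from astropy.stats import sigma_clip

    # Placeholder per-integration slope estimates for one pixel (e-/s),
    # with the last integration biased high by an undetected cosmic ray.
    slopes = np.array([1.02, 0.98, 1.01, 0.97, 1.03, 0.99, 1.85])

    median_rate = np.median(slopes)       # robust, but noisier than a mean and
                                          # still biased if several ints are hit

    clipped = sigma_clip(slopes, sigma=2.0)
    clipped_rate = clipped.mean()         # closer to optimal, but rejects whole
                                          # integrations rather than the single
                                          # corrupted group differences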


Michael Regan - @Timothy Brandt, @Mario Gennaro, The current jump step is optimal when pixels are Poisson noise dominated. We’ve known it was not optimal for read noise limited observations. The plan has been to switch to Generalized Least Squares eventually. Karl Gordon has tested GLS against the current ramp fitting routine and it ran significantly slower except where there were a lot of cosmic rays. Karl was concerned about correlated noise which breaks the fast solution method. We don’t have a model for that yet and I don’t think we should wait for it. Addressing the small number of groups problem: By using the read noise reference file we are able to find more jumps with short ramps.
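
For reference, a bare-bones generalized least squares slope fit for a single ramp might look like the sketch below. This is not the pipeline's planned implementation; it assumes one read per group (no frame averaging) and uses the simplest standard covariance model of independent read noise plus accumulated Poisson noise:

    import numpy as np

    def gls_slope(reads, times, read_noise, flux_guess):
        """Generalized least squares fit of a slope to one up-the-ramp series.

        Sketch only: one read per group, and a known approximate flux
        (flux_guess, in the same units per second) for the Poisson term.
        """
        reads = np.asarray(reads, dtype=float)
        times = np.asarray(times, dtype=float)
        n = len(reads)

        # Covariance of accumulated reads: shared Poisson noise,
        # cov(y_i, y_j) = flux * min(t_i, t_j), plus independent read noise.
        C = flux_guess * np.minimum.outer(times, times) + read_noise**2 * np.eye(n)

        A = np.column_stack([np.ones(n), times])     # design matrix: [intercept, slope]
        Cinv_A = np.linalg.solve(C, A)
        param_cov = np.linalg.inv(A.T @ Cinv_A)
        params = param_cov @ (Cinv_A.T @ reads)
        return params[1], np.sqrt(param_cov[1, 1])   # slope and its 1-sigma uncertainty

    # Example with invented numbers:
    # gls_slope([12., 25., 40., 51.], [10.7, 21.4, 32.1, 42.8],
    #           read_noise=5.0, flux_guess=1.2)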


Timothy Brandt - @Michael Regan, Is there evidence of significant correlated noise, i.e., noise that is correlated between groups at a fixed pixel?  I found very little evidence of that.  There is certainly correlated noise (1/f) between pixels, but by the time a pixel is read out again the 1/f has died down a lot, and you can correct most of that.


Mario Gennaro - @Michael Regan, I didn't mean to imply a "sub-optimal" pipeline behaviour in a strict sense. I think @Timothy Brandt summarized correctly what I meant by "the pipeline can't handle small numbers of groups" in his message above, where he says that it basically CAN'T handle that unless NINTS>1, in which case the ramps can be rearranged in a way that allows more lever arm for CR identification. I think that in this particular case we are background (Poisson) limited, so in that respect the optimal weights are still good; it is simply the small-number statistics effect of having too few groups that is tripping the pipeline, nothing fundamentally wrong with the algorithm.


Michael Regan - @Timothy Brandt, @Mario Gennaro - I’m referring to spatial correlation not temporal. It doesn’t really matter since I don’t see us ever using this term.


Timothy Brandt - @Michael Regan, If it's just spatial correlation I think GLS is correct, and you can remove the spatial correlation component at the group level to improve the ramp fit and jump detection.

