
The QUEST Data Processing Software Pipeline

Peter Andrews et al.

Published 2008 May 28 © 2008. The Astronomical Society of the Pacific. All rights reserved. Printed in U.S.A.
Citation: Peter Andrews et al. 2008, PASP, 120, 703. DOI: 10.1086/588828


Abstract

A program that we call the QUEST Data Processing Software Pipeline has been written to process the large volumes of data produced by the QUEST camera on the Samuel Oschin Schmidt Telescope at the Palomar Observatory. The program carries out both aperture and PSF photometry, combines data from different repeated observations of the same portion of sky, and produces a Master Catalog. A rough calibration of the data is carried out. This program, the calibration procedures, and quality checks on the output are described.


1. INTRODUCTION

The QUEST Large Area Camera, covering about 10 deg² on the sky, has been installed and commissioned on the Samuel Oschin Schmidt Telescope at the Palomar Observatory. Science-quality data-taking with the camera started in the fall of 2003. As a permanently mounted survey instrument, this camera has been used successfully to find low-mass stars (Slesnick et al. 2006), new dwarf planets (Brown et al. 2005), nearby supernovae (Copin et al. 2006), and other transients. As part of the Palomar QUEST Survey (Djorgovski et al. 2008), 15,000 deg² in an equatorial strip between declinations -25° and +25° have been observed multiple times in drift-scan mode in seven different passbands: Johnson filters U, B, R, and I and the Sloan Digital Sky Survey (SDSS) filters r', i', and z'.

This large data set is used to study a variety of science topics, and a number of different software packages are in use to reduce the data:

  • 1.  
    The Yale Pipeline to analyze drift-scan data for quasar identification, blazar variability studies, and searches for strongly lensed and highly redshifted quasars and variable objects (Andrews 2003; Bauer 2008).
  • 2.  
    The Caltech Data Cleaning Program to remove instrumental artifacts.
  • 3.  
    The Berkeley Program developed by the Supernova Cosmology Project (SCP) to search for supernovae (Perlmutter et al. 1999; Copin et al. 2006).
  • 4.  
    The Caltech Real Time Pipeline to carry out rapid, real-time detection of transients (Djorgovski et al. 2008).

The purpose of this paper is to present a detailed description of the Yale Pipeline to process drift-scan data in pursuit of goal (1) above. The general requirements for the pipeline are to produce a catalog of observations with stellar photometry in the seven passbands with absolute precision ≲10% and relative precision ≲2% for repeated observations.

A detailed description of the QUEST Large Area Camera has been presented in a previous paper (Baltay et al. 2007). A very brief summary of some of the features relevant to this paper is as follows: The camera consists of 112 CCDs arranged in four rows (labeled A, B, C, D) with 28 CCDs in each row (labeled 1 to 28), as shown in Figure 1. Each of the four rows has a different filter. Rows A, B, C, D carry the R, I, B, U filters, respectively, in the Johnson set and the z', z', i', r' filters in the SDSS set, where the z' filter is doubled (note that the SDSS filter set is referred to erroneously as the "Gunn" set in Baltay et al. 2007). While the order of the filters is adjustable, they have been fixed for the duration of the survey. Generally the Johnson set is used during dark time and the SDSS set is used in gray and bright time. In the drift-scan mode the telescope is locked in a fixed position, typically for a whole night, and the rotation of the Earth causes the image of any given star to move across the camera. The camera is rotated in such a way that the images traverse rows A, B, C, D in sequence, thus providing data in four passbands essentially simultaneously. The CCDs are 600 × 2400 pixels each, with a pixel size of 13 μm × 13 μm (0.88'' × 0.88''). Near the equator it takes a star image 140 seconds to traverse each CCD, which is clocked synchronously with the motion of the star image. Given the 22 mm gaps between the rows (each taking roughly 99 s to cross at the same drift rate), it takes a star image about 14 minutes to traverse the entire field of view of the camera. Some of the properties of the camera are summarized in Table 1, with additional information provided by Baltay et al. (2007). The typical gain in the CCD readout electronics is 4 electrons per digital unit and the typical full well for a CCD is 50,000 electrons, but there is considerable variation from CCD to CCD.

Fig. 1.— Arrangement of the 112 CCDs in the QUEST Large Area Camera. The correspondence between the Johnson filter set (U, B, I, R) and the SDSS filter set (z', i', r') and the fingers (A, B, C, D) is shown along with the scan direction (arrow).

The 28 CCDs in a row, which we call the 28 columns, are each at a different declination, and thus the star images move at different rates across the 28 CCDs. In drift-scan mode the CCDs have to be synchronized at slightly different clocking rates. There was a concern that this would introduce too much noise if one CCD were read out (typically microvolt signals) while a neighboring CCD was clocked (typically 10 volt signals). To avoid this, all CCDs are clocked synchronously at the same rate, but clocking signals are dropped at different intervals for the different columns to achieve the appropriate average clocking rates. These "line drops" have to be carefully taken into account in the data reduction programs. All observational information, including clocking parameters, UT start and stop times, telescope position, CCD temperatures, and filter options are recorded in the image headers and operator logs. Each 600-pixel row of each CCD image is also clocked out with an additional 40 pixels of overscan signal to keep a continuous record of the bias level.
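To make the line-drop bookkeeping concrete, the sketch below is a simplified model of the scheme just described, not the actual camera firmware; the reference declination in the example and the rounding convention are our assumptions. It estimates how often a clock line must be dropped for a column at a given declination so that its average clocking rate matches the local sidereal drift rate.

```python
import numpy as np

def line_drop_interval(dec_col_deg, dec_ref_deg):
    """Estimate how often a clock line must be dropped for a CCD column
    at declination dec_col_deg when all columns share the clock of a
    reference column at dec_ref_deg (the fastest, lowest-|dec| column).

    A star at declination dec drifts at a rate proportional to cos(dec),
    so this column needs only a fraction cos(dec_col)/cos(dec_ref) of
    the reference clock ticks; the remainder are dropped.
    """
    ratio = np.cos(np.radians(dec_col_deg)) / np.cos(np.radians(dec_ref_deg))
    drop_fraction = 1.0 - ratio
    if drop_fraction <= 0.0:
        return None  # the reference column itself: no drops needed
    return int(round(1.0 / drop_fraction))  # drop one line every N ticks

# Example: top vs. bottom of a 4.6-degree-wide strip starting at +20 deg.
print(line_drop_interval(24.6, 20.0))  # -> about one line in 31
```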

2. GENERAL DESCRIPTION OF THE YALE PIPELINE

The software packages to process drift-scan data were written at Yale University. They were designed to deal optimally with closely separated sources (e.g., gravitationally lensed quasars), to deal with the peculiarities of the data from the QUEST camera (such as dropped lines), and to process the huge volume of data in a time-efficient manner. It was a priority to be able to process each night's data during the following day so as not to fall behind in the data reduction.

The first version of this pipeline (Andrews 2003) was written to analyze data from the QUEST1 survey using a 16-CCD prototype of the present camera on the Schmidt telescope at the CIDA Observatory in Venezuela (Sabbey et al. 1998). This first version was then modified and optimized for the Palomar QUEST survey.

The data from the camera are transferred from Palomar via a radio link and arrive at Yale in essentially real time. The data flow and archiving scheme are shown in Figure 2. The raw data from a night of drift-scanning (a strip 4.6° wide in declination and typically ∼120° long in R.A.) are packaged into 28 folders, one for each of the 28 columns of CCDs, each folder containing the full night's data stream in each of the four passbands. The 28 columns are independent data sets, each covering a different declination strip of the sky. The 4 CCDs in each column contain the same objects, in different passbands. Initially each column was processed on a different processor, using a farm of 28 processors to process the data simultaneously. At the present time, with faster processors available, 12 processors are sufficient to reduce a full night's data every night. The raw data are compressed by a factor of 2 using a lossless compression program and archived. A typical night's data amount to about 50 Gbytes compressed.

Fig. 2.— Flowchart showing the Palomar QUEST Data Distribution, Processing, and Archiving.

The pipeline consists of three major programs:

  • 1.  
    The PHOTOMETRY program. This handles preprocessing of the raw image data (bias subtraction, flat-fielding), object detection, astrometric solutions, CCD-to-CCD coordinate transformations, and flux measurement with both aperture and point-spread function (PSF) photometry. The input to the program is a setup file defining the threshold for object detection, the aperture radii for photometry and sky-background measurement, and other information needed to process a given column of image data. The setup file also specifies the bias- and flat-field images, which are one-dimensional images (single rows) measured from calibration scans (darks and twilight flats) using an auxiliary program. The output is a binary catalog with J2000 R.A. and decl. positions, the flux and flux error measurements in each passband, and an error code flagging any problems with the photometry (such as saturated pixels or bad PSF fit) for each identified object.
  • 2.  
    The CATALOG program. This reads the output catalogs of the PHOTOMETRY program, combines repeated measurements of any given object from observations on different nights, and writes a Master Catalog of all objects detected. The program attempts to resolve ambiguous associations (e.g., observations of closely spaced objects that are resolved on one night but not the next). The Master Catalog is a binary catalog, stored on disk in quarter-degree zones that cover the survey area. It is designed for rapid loading and searching with the ANALYSIS program.
  • 3.  
    The ANALYSIS program. This reads the Master Catalog, applies the photometric calibrations, and sets up the framework for users to carry out the science data analysis, generate distributions, and write various output files. For example, cuts can be applied to eliminate objects with large measurement errors or to identify objects with significant variability. The calibration of the data is performed with auxiliary programs that provide calibration tables accessible by ANALYSIS, and with functions within the ANALYSIS program itself. These calibrations allow for extinction corrections and transformation of the instrumental magnitudes to standard photometric systems.

The detailed components of the Yale pipeline program are described in the following sections of this paper. The final section of the paper describes the photometric calibration of the data.

3. THE PHOTOMETRY PROGRAM

The PHOTOMETRY program consists of a series of seven operations, executed in sequence without loops, on the raw drift-scan data in the following order: preprocessing, object detection, CCD coordinate transformation, astrometry, seeing determination, aperture photometry, and PSF photometry. The following subsections describe each of these operations in detail.

3.1. Initial Data Formatting

Prior to starting the PHOTOMETRY program, each night's drift-scan data are separated into smaller files on disk. Each column of data, typically spanning ∼120° in R.A., is saved on disk as separate images (frames) of 640 × 2400 pixels each. These are the pixel dimensions of the CCDs (600 image columns plus 40 overscan pixels), corresponding to 0.15° × 0.58° on the sky. A typical night's data consist of 28 columns, with four passbands per column and 200 drift-scan frames per passband. Each column is processed independently. Within each column, the frames of each passband are also processed independently in the initial stages (preprocessing and object detection). To obtain CCD coordinate transformations and astrometric solutions, scan-wide solutions are derived for each passband. For seeing and photometry, the frames are once again processed independently, but information from the CCD coordinate and astrometric transformations is used to tie the photometry across passbands.

3.2. Preprocessing

The first stage of the PHOTOMETRY program is to bias-subtract, dark-subtract, and flat-field the data on a frame-by-frame basis for each of the 112 CCDs in the camera. For each row of each frame, the first correction is to subtract the overall bias level of the CCD readout amplifier. This level is determined by median-averaging the overscan pixels in each row of the frame. Doing this on a row-by-row basis corrects for occasional low-level (a few counts) intermittent changes in the bias level that sometimes occur during image readout. The second correction is to subtract the fixed, column-dependent deviations in the dark level that are caused by nonnegligible dark currents and the presence of bad pixels in some of the CCDs (bad pixels appear as bad columns in drift scans). This column-dependent correction (the dark field) is determined from a dark drift scan. Finally, variations in the response of the CCD are corrected by dividing by a flat field that is derived from a twilight drift scan. The dark and twilight flat fields required for these corrections are taken nightly, but we create only a single set of correction fields per lunation. For this, we use the darks and flats taken on the first dark night (Johnson filters) and the first gray night (SDSS filters).

To make the dark correction field, a single raw dark frame is first corrected by subtracting the overscan bias level. The dark field is then divided into 240 subframes of 10 rows each. These subframes are median-averaged together, pixel by pixel, to yield a single average subframe. This subframe is then reduced to a single-row image by computing the average signal in each column. This final result is the dark correction field that is subtracted from each row of the drift scan to be corrected.
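This construction maps naturally onto array operations. Below is a minimal sketch, assuming NumPy arrays of shape 2400 × 640 with the 40 overscan pixels trailing each 600-pixel row; the function name and array layout are ours, not the pipeline's.

```python
import numpy as np

def make_dark_field(raw_dark_frame, n_overscan=40):
    """Build the one-row dark correction field from a single raw dark
    frame, following the steps described above. The frame is assumed
    to be a (2400, 640) array with the overscan pixels at the end of
    each row."""
    image = raw_dark_frame[:, :-n_overscan].astype(float)
    overscan = raw_dark_frame[:, -n_overscan:].astype(float)

    # Row-by-row overscan bias subtraction (median of overscan pixels).
    image -= np.median(overscan, axis=1, keepdims=True)

    # Divide the 2400 rows into 240 subframes of 10 rows each and
    # median-combine them, pixel by pixel, into one average subframe.
    avg_subframe = np.median(image.reshape(240, 10, -1), axis=0)

    # Reduce to a single row: the average signal in each column.
    return avg_subframe.mean(axis=0)
```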

To make the flat correction field, the first step is to select the appropriate twilight frames. Ideally the sky level in counts should be above 5,000 so that the signal-to-noise ratio is high, yet lower than 10,000 to avoid levels that are close to the full well of some of the CCDs. During twilight, we take a long drift scan (∼20 frames), starting when the sky is bright and ending when it is dark, to ensure that at least three consecutive frames in each filter meet this criterion. These chosen frames are then processed, like the darks, to remove the overscan bias. Then the dark correction, previously determined from the dark field, is subtracted from each row of each frame to remove column-dependent biases and dark currents. Next a one-dimensional, row-dependent fit to the sky level is determined for each frame. Dividing each frame by this fit removes the sky background that varies monotonically with time during the twilight scan. The three frames are then median-averaged together. This average frame is further divided into 240 subframes of 10 rows each, which are median-averaged together to yield a single subframe. A final averaging of the subframe rows, with the average level normalized to unity, yields the flat-field correction.
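A companion sketch for the flat field follows, under the same assumed array layout and using the dark field from make_dark_field above. The text does not specify the form of the row-dependent sky fit, so a quadratic in row number is assumed here.

```python
import numpy as np

def make_flat_field(twilight_frames, dark_field, n_overscan=40):
    """Build the one-row flat-field correction from (at least) three
    twilight frames with sky levels in the useful range."""
    corrected = []
    for frame in twilight_frames:
        image = frame[:, :-n_overscan].astype(float)
        overscan = frame[:, -n_overscan:].astype(float)
        image -= np.median(overscan, axis=1, keepdims=True)  # bias
        image -= dark_field                                  # dark columns

        # Remove the monotonically varying twilight sky level with a
        # one-dimensional, row-dependent fit, then divide it out.
        rows = np.arange(image.shape[0])
        sky = np.median(image, axis=1)
        fit = np.polyval(np.polyfit(rows, sky, 2), rows)
        corrected.append(image / fit[:, None])

    combined = np.median(corrected, axis=0)              # median frame
    subframe = np.median(combined.reshape(240, 10, -1), axis=0)
    flat = subframe.mean(axis=0)                         # single row
    return flat / flat.mean()                            # unit level
```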

To check that no mistakes have been made in the creation of the dark and flat fields (owing, for example, to processing errors, lights left on in the dome, scattered moonlight, or electronics problems), we compare newly created darks and flats to those from the previous lunation. Fractional differences smaller than a few percent are acceptable. If large deviations are found, however, a new set of darks and flats is created using dark and twilight scans from a subsequent night. Occasionally, we have used night-sky drift scans for flat fields when we have not been able to obtain twilight flats (owing to weather or equipment problems). The twilight and night-sky scans yield equally valid flat fields if the night-sky background is high enough (this depends on the passband). For drift scans, fringing is not an issue because fringe effects are averaged out in the scanning process.

3.3. Object Detection

Object detection is carried out on one frame at a time, separately for each of the four filters. It is a two-pass procedure. The first pass is a simple, threshold-based routine. In the second pass, these initial detections are refined by PSF fitting. This fitting procedure more carefully separates closely spaced stars, which would otherwise be counted as single objects. Since the sky level across a whole frame can vary (owing to twilight, moonrise, extinction changes, etc.), each frame is subdivided into thousands of small regions and the local sky level for each region is calculated as a two-stage clipped average of the pixel values.

In the first pass, the detection algorithm steps through the pixels in an image in a raster fashion, assembling objects from pixels where the flux exceeds 2.5 standard deviations above the local sky background. An object is completed when it is surrounded on all sides by below-threshold pixels. Objects are kept only if they consist of at least four adjacent pixels above threshold. This criterion avoids the detection of single-pixel noise artifacts such as cosmic-ray hits. Because the CCDs are thinned to ∼15 μm thickness and because exposure times for drift scans are only a few minutes, cosmic-ray hits exceeding a few pixels in length rarely appear. At this stage of the pipeline, detections in one filter are not tied to those in another (this occurs later; see § 3.7).
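The following sketch illustrates this first pass, with scipy.ndimage.label standing in for the pipeline's raster-scan object assembly; the pipeline's actual code is not reproduced here.

```python
import numpy as np
from scipy import ndimage

def first_pass_detect(frame, sky, sigma_sky, nsigma=2.5, min_pixels=4):
    """First-pass threshold detection: flag pixels more than nsigma
    standard deviations above the local sky, group adjacent flagged
    pixels into objects, and reject objects with fewer than min_pixels
    members (removing single-pixel artifacts such as cosmic-ray hits).
    sky and sigma_sky are the precomputed local-sky maps."""
    above = frame > sky + nsigma * sigma_sky
    labels, n_objects = ndimage.label(above)  # 4-connectivity by default
    sizes = np.bincount(labels.ravel())
    # Return the pixel coordinates of each surviving object.
    return [np.argwhere(labels == i)
            for i in range(1, n_objects + 1) if sizes[i] >= min_pixels]
```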

During the raster scan of each row of a given CCD, a map of bad columns for that CCD is consulted as each pixel is evaluated. When an object is found that overlaps or abuts a bad column, the boundary for that object is terminated at the bad column. No further pixels from the row are associated with the object. There is no attempt to smooth over the bad column. At this stage in the pipeline, there is no special treatment for objects abutting bad columns or any edges of the frame. There is also no effort made to merge the objects at the start or end row of a frame with the previous or following frame in the scan sequence. Later, when fluxes are measured, flags are set to indicate which objects are at frame boundaries or connect to bad columns, and there are additional checks to remove bad detections.

In the second pass of the detection routine, the first step is to obtain an accurate model of the PSF for each frame (this accounts for changes in the PSF over time owing to variable seeing conditions). The PSF is viewed as the two-dimensional distribution of light on the imaging device that the atmosphere and the instrument produce from an idealized point source such as a distant star. We do not attempt to describe this as an analytic function but instead use the actual measurements of stars on the image that have typical shapes and good statistics. The two-dimensional flux distribution varies slightly depending on the location within a pixel where the object is centered; i.e., the PSF is slightly different for a star centered at the center of a pixel than for a star centered near the corner of a pixel. To take this into account, the central pixel is divided into 4 × 4 segments. Selected good-quality stars are sorted into 16 classes, each class consisting of images centered on one of the 4 × 4 segments of the central pixel. For each class the pixel fluxes of many images are summed and then normalized to a total unit flux. This produces 16 different PSF tables. In the PSF fitting of a particular object, the segment on which the object is centered within its central pixel is first determined; this selects the appropriate PSF table to use.
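A minimal sketch of the PSF-table construction is given below, assuming the good-quality stars and their centroids have already been selected; the cutout size and function names are illustrative, not the pipeline's.

```python
import numpy as np

def build_psf_tables(star_cutouts, centroids, cutout_size=15, n_sub=4):
    """Build the 16 empirical PSF tables: each selected star is
    assigned to one of n_sub x n_sub classes by the subpixel position
    of its centroid, and the cutouts in each class are summed and
    normalized to unit total flux."""
    tables = np.zeros((n_sub, n_sub, cutout_size, cutout_size))
    for cutout, (xc, yc) in zip(star_cutouts, centroids):
        # Subpixel phase of the centroid within its central pixel.
        ix = int((xc % 1.0) * n_sub)
        iy = int((yc % 1.0) * n_sub)
        tables[iy, ix] += cutout
    # Normalize each class to a total unit flux.
    sums = tables.sum(axis=(2, 3), keepdims=True)
    return tables / np.maximum(sums, 1e-30)
```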

The PSF fitting procedure for each object is an iterative one. The fluxes in the individual pixels of the object (determined in the first pass) are examined. If there is a single maximum, the object is fit to a single PSF. If there is more than one local maximum, the object is fit to a number of separate PSFs corresponding to the number of local maxima. Both the fluxes and positions of the PSFs are simultaneously fit to each object. Then the fluxes expected from the fitted PSFs are subtracted from the object image. If there is a significant positive residual, the residual flux is considered as a separate object and a new PSF is added centered on the residual flux. This allows the algorithm to detect multiple images or faint stars previously obscured by a close bright star. The procedure is then repeated for a maximum number of iterations specified by the user, typically three fitting iterations in total. In each iteration, the best fit is determined by a search for the χ² minimum. The search is implemented using Powell's multidimensional method with Brent's inverse parabolic algorithm for each successive line minimization (Press et al. 1986).
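The sketch below illustrates this iterative fit, with SciPy's Powell minimizer (which performs Brent line minimizations internally) standing in for the Press et al. (1986) routines. The 5 σ threshold for adding a new PSF is our assumption, as the paper does not quote the value.

```python
import numpy as np
from scipy.optimize import minimize

def fit_psfs(pixels, psf_model, peaks, sky_sigma, max_iter=3):
    """Iterative multi-PSF fit. psf_model(dx, dy) is assumed to
    evaluate a unit-flux empirical PSF at pixel offsets from its
    center; peaks is a list of (x, y, flux) starting values."""
    ny, nx = pixels.shape
    yy, xx = np.mgrid[0:ny, 0:nx]

    def model_of(params):
        m = np.zeros_like(pixels, dtype=float)
        for i in range(len(params) // 3):
            x, y, f = params[3 * i:3 * i + 3]
            m += f * psf_model(xx - x, yy - y)
        return m

    for _ in range(max_iter):
        p0 = np.array([v for peak in peaks for v in peak], dtype=float)
        # Simultaneously fit all fluxes and positions by chi-square.
        res = minimize(lambda p: np.sum((pixels - model_of(p)) ** 2)
                       / sky_sigma ** 2, p0, method="Powell")
        peaks = [tuple(res.x[3 * i:3 * i + 3]) for i in range(len(peaks))]

        # Subtract the fitted PSFs; a significant positive residual
        # becomes a new PSF (e.g., a faint companion of a bright star).
        resid = pixels - model_of(res.x)
        if resid.max() > 5.0 * sky_sigma:
            j, i = np.unravel_index(resid.argmax(), resid.shape)
            peaks.append((float(i), float(j), float(resid[j, i])))
        else:
            break
    return peaks
```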

3.4. Astrometry

To obtain the transformation from CCD pixel coordinates to astrometric positions, objects detected in each filter are matched to stars in the USNO-A2.0 catalog (Monet 1998). All the objects detected with a given CCD during an entire drift scan are pooled together into a single catalog so that a single average transformation, linear in pixel coordinates, can be found for the entire scan. Within a given catalog, the pixel coordinate corresponding to an object's image row number is indexed cumulatively from the starting row of the drift-scan image. A linear solution is possible in the decl. direction because each CCD covers only a small angle in declination. The solution is necessarily linear in the R.A. direction because any nonlinear distortions are averaged out in the scanning direction. Once the average solution is found, residual deviations from the average solution (owing to small clocking errors, any small telescope motions during the scan, and the effect of line drops discussed above) are tracked as a function of right ascension and used as second-order corrections to the average solution.
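A minimal sketch of the scan-wide linear solution as a least-squares fit follows; the triangle matching and the second-order residual corrections are omitted, and the function names are ours.

```python
import numpy as np

def fit_linear_astrometry(pix_xy, sky_radec):
    """Scan-wide linear astrometric solution: R.A. and decl. are each
    modeled as a*x + b*y + c in the cumulatively indexed pixel
    coordinates, using objects matched to catalog stars.
    pix_xy is (N, 2); sky_radec is (N, 2) in degrees."""
    design = np.column_stack([pix_xy[:, 0], pix_xy[:, 1],
                              np.ones(len(pix_xy))])
    coeffs, *_ = np.linalg.lstsq(design, sky_radec, rcond=None)
    return coeffs      # (3, 2): columns are the R.A. and decl. fits

def pixels_to_sky(coeffs, pix_xy):
    design = np.column_stack([pix_xy[:, 0], pix_xy[:, 1],
                              np.ones(len(pix_xy))])
    return design @ coeffs
```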

To derive an astrometric transformation, the first step is to make a crude estimate of the pixel positions of the USNO catalog stars, given the known telescope pointing, the known pixel scale of the drift-scan images, and the catalog positions. A triangle-matching algorithm is then used with selected bright objects to associate the catalog stars with the detected stars, and then the linear transformation is computed. We use the matching algorithm distributed by NOAO with their IRAF image-processing software (see the FOCAS routine at http://iraf.noao.edu), described in detail by Valdes et al. (1995). In our pipeline, the matching is done one frame at a time, each time using the solution from the previous frame to predict the pixel positions of the catalog stars in the next frame. Here we choose a frame size of 2,400 rows because it is convenient and usually contains enough stars to give a good solution, but the choice is not critical. Once all frames have been solved, the average solution is determined for the whole scan. If any frames do not yield good solutions, they are skipped and the average solution is assumed to hold for them.

It should be noted that the USNO catalog is in J2000 coordinates, which is not the natural coordinate system that is observed on any given night. The USNO coordinates are precessed to the epoch of the observations before the matching is attempted. Once the final transformations are obtained, the astrometric coordinates of the detected objects are precessed back to J2000 coordinates. The SLALIB software library (Wallace 1994) is used to make these transformations.

To judge the astrometric precision of this procedure, we compared the right ascension (R.A.) and declination (decl.) from the QUEST survey with the astrometric standards of Stone et al. (1999). We found that the mean values of USNO and the Stone standards disagree by about 0.2''. To bring our data into agreement with the Stone standards we applied a final calibration correction of typically -0.1'' in R.A. and -0.2'' in decl. After this final correction (which is applied in the ANALYSIS program described below), the comparison with the astrometric standards is shown in Figure 3. We see from this comparison that the astrometric precision of the drift-scan data is 0.1''. However, over the entire area of our survey, the systematic error is limited by that of the USNO catalog that we use for calibration (∼0.2''; see Assafin et al. 2001).

Fig. 3.— Astrometric precision of the Palomar QUEST observations in the drift-scan mode. The figures show the number of observations versus their measurement error in R.A. and decl. as determined by comparison with the astrometric standards from Stone et al. (1999). Smooth Gaussian fits are shown for both distributions.

3.5. CCD-to-CCD Coordinate Transformations

Bright objects should be present in all four passbands in each of the 28 columns. Fainter objects or those with extremely blue or red color may be bright enough for detection in some of the filters but below the threshold for detection in others. So that we can measure the flux or an upper limit to the flux in the passband where a detected object is below the detection threshold, and so that detections of a given object in multiple passbands can be identified as the same object, we calculate linear transformations between CCDs that relate their pixel coordinates. As in the astrometric solutions (see § 3.4), all the detections by a given CCD in a given drift scan are pooled into a single catalog, ignoring frame boundaries, and pixel rows are indexed from the starting row of the drift scan.

The process starts with the selection of a lead CCD. Although the choice is not critical, we pick the CCD for which the corresponding passband is most likely to yield detections of any given object (i.e., the passband with the faintest magnitude limit). For the Johnson filters it is the R filter, and for the SDSS filters, it is the r' filter. Next, given the astrometric solutions previously determined for each passband (see § 3.4), the objects detected in each of the nonlead CCDs are matched to the objects from the lead CCD. Once matched, a Powell minimization algorithm (Press et al. 1986) is used to determine the linear equations transforming the nonlead pixel coordinates to the lead-CCD pixel coordinates. As in the case of the astrometric solution described in § 3.4, a linear transformation is expected, but we track the residuals to the solution as a function of R.A. and use these as second-order corrections to the transformation. The accuracy of the final CCD-to-CCD transformations has been measured to be 0.08 pixels in both the x and y directions on the CCDs. Note that this is more accurate than the astrometric solutions discussed above (see § 3.4) because no transformation to a catalog is involved, other than to match the stars on two different CCDs. The final transformations are used to match the objects detected in the four different passbands to each other and to register the apertures and PSF fits used for photometry (see §§ 3.7 and 3.8).

3.6. Seeing Determination

This routine takes the list of detected bright objects and calculates the average seeing for each frame in each filter by performing a clipped mean of the measured FWHM values (determined from the PSF fitting). These frame-wise computed values of FWHM, as well as the number of objects per frame, are stored for later use in the aperture photometry routine.

3.7. Aperture Photometry

Before making any photometric measurements, the first step is to merge the detection lists of the different passbands. The flux of a given object typically varies widely among the different filters. For example, in the Johnson set, R and I usually have the highest flux, B less, and U the least. Often objects are detected in the reddest filters but not in the U filter. To obtain a flux measurement in a passband where the object is too faint for detection but its position is known from its detection in another passband, the lists of objects detected in the four passbands in a scan are merged. Using the CCD-to-CCD coordinate transformations (see § 3.5), the pixel coordinates of these merged objects are transformed into the coordinate system of each separate CCD. For each object, a flux is measured using an aperture centered at the transformed location. Even if there is no detectable signal in a given passband, a measurement is made. In cases where an object is detected in multiple passbands including the lead CCD, the position is taken to be the one determined from the lead CCD, as this is likely to give the best estimate of the centroid.

Aperture photometry is performed in a standard way with four different apertures: a circular aperture with a radius of one FWHM, one with a radius of two FWHM, and two fixed apertures with 3- and 10-pixel radii. The one-FWHM aperture is optimal for photometry, as the contribution of the sky noise is minimal. Fluxes are measured in the other apertures to allow for later discrimination of extended or blended objects from point sources (see § 5). The choices for these larger apertures were not tied to any specific survey goal, but are in the size range useful for characterizing the shapes of extended sources. For each aperture the flux is taken to be the total flux in the aperture minus the sky flux (the number of pixels times the average sky background per pixel). The average sky background is obtained as the clipped mean sky level in an annulus (typically with inside and outside radii of 20 and 25 pixels, respectively) around the object.

The error on the flux, σF, is calculated by adding in quadrature the error in the object flux within the aperture, the error in the sky flux within the aperture, and the error in the estimate of the sky background:

$$\sigma_F = \sqrt{\frac{F}{G} + N_{\mathrm{obj}}\,\sigma_{\mathrm{sky}}^{2} + \frac{N_{\mathrm{obj}}^{2}}{N_{\mathrm{sky}}}\,\sigma_{\mathrm{sky}}^{2}},$$

where F is the object flux in counts (ADC units), G is the gain in electrons per count, Nobj is the number of pixels inside the aperture, Nsky is the number of pixels in the sky annulus, and σsky is the per-pixel sky noise in counts. We do not separately account for read noise because the measured sky noise already includes the contribution from read noise. Gains are checked every few months or any time adjustments have been made to the electronics. The flux and its error are then converted to the instrumental magnitude, MI, and its error, σM, using the following formulas:

$$M_I = -2.5\,\log_{10} F, \qquad \sigma_M = \frac{2.5}{\ln 10}\,\frac{\sigma_F}{F}.$$
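For concreteness, a direct transcription of these formulas (the variable names are ours):

```python
import numpy as np

def aperture_flux_error(F, G, n_obj, n_sky, sigma_sky):
    """Flux error from the formula above: F is the sky-subtracted
    object flux in counts, G the gain in electrons per count, n_obj
    and n_sky the pixel counts of the aperture and sky annulus, and
    sigma_sky the per-pixel sky noise in counts."""
    var = F / G + n_obj * sigma_sky**2 + (n_obj**2 / n_sky) * sigma_sky**2
    return np.sqrt(var)

def instrumental_magnitude(F, sigma_F):
    """Instrumental magnitude and its error, as defined above."""
    m = -2.5 * np.log10(F)
    sigma_m = (2.5 / np.log(10.0)) * (sigma_F / F)
    return m, sigma_m
```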

For the one-FWHM aperture, the ellipticity, orientation, and size of the minor and major axes of the image, the FWHM, the sky level, the maximum pixel value, and the number of pixels above threshold are recorded. Error flags are set to indicate problems such as saturated pixels, a badly measured sky background, or negative flux. The CCDs have a nonlinear response, which becomes significant only for U- and B-band observations on dark nights when the sky background is very low; it is corrected for in the calibration of the instrumental magnitudes (see § 6.2).

3.8. PSF Photometry

The QUEST pipeline performs PSF as well as aperture photometry for all objects. As in the aperture photometry, the list of object positions is provided by the detection routine and is the merged list from all passbands. Much of the PSF photometry code is very similar to that used in the PSF detection routines (the second pass of object detection). The main difference is that object positions are not allowed to vary in the PSF fitting. The positions determined at detection are frozen; only fluxes are allowed to vary. As in the detection stage, a set of empirical PSFs is constructed and fitted to each object. Objects within 13 pixels of each other are gathered into a group, and their fluxes are simultaneously fit using the appropriate PSFs. A 13-pixel separation is chosen because it is much larger than the FWHM, ensuring that any one group will not contain flux from a neighboring group. Simultaneous fitting of all the group members ensures proper handling of objects whose PSFs overlap. The flux errors are obtained by evaluating the flux range, for a given object, over which the resulting χ² varies by unity. This PSF routine is quite robust; it fails to give a meaningful flux <1% of the time.

The errors obtained from the χ² fits are essentially statistical errors. To get a measure of the systematic errors in this procedure, we compared the results of repeated measurements of the same object in the same filter carried out on different nights. It was found that a systematic error of 0.015 mag had to be added in quadrature to the statistical errors to bring the errors into agreement with the rms of the repeated measurements. No specific reason for this systematic error could be found (or else we would have corrected the problem). Likely causes are variable extinction from night to night and frame to frame that is not completely removed by the extinction corrections, and variations in the PSF with position in each field.

4. THE CATALOG PROGRAM

In the PHOTOMETRY program described above, the measurements of any given object on a particular night in the four different bandpasses are treated in a correlated fashion. The program processes one scan at a time. The output is the four bandpass measurements for each object, already tied together. In the Palomar QUEST drift-scan survey the same piece of sky has typically been observed 4 to 8 times during different nights. It is the purpose of the CATALOG program to collect the results of the different observations of any given object. Because the bandpass observations are already tied together for each night (more specifically each drift scan), the task of the CATALOG program is only to tie together the multiband observations of the same object from different scans. The output of the CATALOG program is the Master Catalog containing for each object all of the information from all measurements in all passbands and all the different nights. This Master Catalog is updated periodically as the survey progresses.

The CATALOG program consists of two parts, the LOAD section and the COMBINE section. The LOAD section takes the output of the PHOTOMETRY program and reformats the data into structures more convenient for further analysis. The output of the PHOTOMETRY program is organized by drift scan. The LOAD output has the data organized into quarter-degree bins on the sky (approximately 240,000 bins for the area covered by the survey).

The COMBINE section works on one quarter-degree bin of data at a time, stepping in turn through all of the bins. In each bin the routine examines each entry in turn, using the R.A. and decl. coordinates to match up all the multiple measurements of the same object. The matching algorithm works in two passes, as follows.

In the first pass, a separate list is generated for each scan covering the same area. Each list records the R.A., decl., and brightness measurements of all the objects detected by the corresponding scan. For each object in each list, the algorithm then checks the other objects in the same list and in the other lists to see if they are separated by <0.5''. Such objects are grouped together as observations of the same master object. After the first pass, a list of master objects is created, each coupled to a group of one or more observations from one or more scans. During this first pass, however, occasional ambiguities arise. For example, if an extended object is resolved into two objects separated by <1'', both components can match a single master object. In addition, two close objects may be separated in one scan but not in another. To handle these cases, any multiple matches from a given scan to a single master object are provisionally listed as separate master objects. In the second pass, all close master objects are reexamined. If an observation associated with one master object is better grouped with those of another, the new association is made.
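The sketch below illustrates the first pass only, as a greedy grouping; the real program's data structures and its second pass are not reproduced here, and an O(N²) loop is used for clarity where a real implementation would exploit the R.A.-sorted bins.

```python
import numpy as np

def first_pass_group(obs_lists, tol_arcsec=0.5):
    """Greedy sketch of the first matching pass: pool the per-scan
    observation lists and group entries whose positions agree to
    within tol_arcsec; each group becomes one provisional master
    object. obs_lists holds one (N_i, 2) array of [ra_deg, dec_deg]
    per scan."""
    tol = tol_arcsec / 3600.0
    masters = []                               # each: list of members
    for scan_id, obs in enumerate(obs_lists):
        for ra, dec in obs:
            for group in masters:
                _, gra, gdec = group[0]        # compare to first member
                dra = (ra - gra) * np.cos(np.radians(dec))
                if np.hypot(dra, dec - gdec) < tol:
                    group.append((scan_id, ra, dec))
                    break
            else:
                masters.append([(scan_id, ra, dec)])
    return masters
```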

The output of the CATALOG program, the Master Catalog, is a catalog of all objects detected one or more times, with objects organized into the quarter-degree bins, and within each bin listed sequentially by R.A. The information on all of the multiple measurements of each object is not physically copied into the Master Catalog; instead, the Master Catalog, in addition to some summary information such as R.A., decl., and the number of measurements in each passband, contains pointers to where the individual measurements can be found in the LOAD output. In order to flag possible transients and moving objects, the Master Catalog also counts the number of times a given object is missed (i.e., the number of scans covering the object position for which the object was not detected).

5. THE ANALYSIS PROGRAM

The ANALYSIS program is the last section of the Yale Pipeline. This program reads the Master Catalog one object at a time and performs the following functions:

  • 1.  
    The pointers are followed to the LOAD output and all of the relevant information for all of the measurements of this object is collected.
  • 2.  
    The photometric calibrations are applied to each individual measurement to convert the instrumental magnitudes to calibrated magnitudes. The appropriate calibration parameters have been obtained by a previous calibration procedure and are in the ANALYSIS program. The details of the calibration procedure are discussed in § 6 of this paper.
  • 3.  
    Data quality cuts are applied to flag individual measurements that are of inferior quality and should not be used in calculating average magnitudes and other science analysis. These quality cuts set limits on quantities like sky brightness, extinction, and seeing quality (see § 7 for the actual cuts used). Cuts can also be used to flag transients and moving objects by requiring a detection on only one night and missed detections on other nights. Cuts that set threshold values for the aperture flux ratios (e.g., large to small) can be used to discriminate point sources from extended objects.
  • 4.  
    A framework is provided to carry out further science analysis of the data and to facilitate writing output text, graphs, catalogs, etc., as desired. For example, 1 σ upper limits to the measured magnitudes can be substituted for measured magnitudes where the flux values are close to zero or negative.

6. PHOTOMETRIC CALIBRATION

The fluxes for detected objects recorded by the camera have to be corrected, or calibrated, to account for two different kinds of effects. The first is extinction due to light cloud cover or other atmospheric effects that tend to vary not only from night to night but possibly more frequently during one night of observation. The second is due to the differing responses of the CCDs, variation of response over the area of one CCD, filter transmissions, nonlinear response of the CCDs, and other instrumental effects. This second kind of instrumental effect does not vary significantly with time (constant to better than the 5% level) except for discrete incidents like replacement or adjustments of electronics, telescope mirror resurfacing, etc., which are carefully noted and tracked. Thus, the photometric calibrations consist of two parts: the extinction corrections, and the calibration of the instrument.

6.1. Extinction Corrections

If all nights were perfectly photometric, and all observations were made at the same air mass, we would not need this correction. On nights with significant cloud cover or other adverse weather conditions, the dome does not open and no observations are made. However, even on nights when data are recorded, there is often a significant amount of extinction. In this survey we also use the nonphotometric nights with extinctions <0.5 mag. We thus have to make a correction for the extinction. The only large-area survey that covers all of the area of the Palomar QUEST survey is the USNO survey (Monet 1998). We started out using the USNO catalog for the extinction correction but found that the photometry of the USNO catalog was too variable (at a level exceeding 10%) to be useful for this purpose. We therefore adopted the following procedure using only our own survey data, making use of the fact that we covered each area of sky in the survey multiple times over different nights.

The procedure starts by taking each drift-scan strip (recall that these are 4.6° wide in declination) and dividing it into 1° segments in R.A. (about 4 minutes of scan time). In each passband all the observations in a given R.A. segment are collected and the brightness of a sample of stars is compared. We then assume that the observation with the highest flux for the selected stars is a good approximation to a photometric observation, and we assign an extinction correction to all of the other observations to bring them up to the level of the brightest night. No other requirements are placed on the brightest night.
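A minimal sketch of this correction for a single segment and passband follows; the array layout and the use of a median over the comparison stars are our assumptions.

```python
import numpy as np

def extinction_corrections(mags_by_scan):
    """Relative extinction corrections per the procedure above.
    mags_by_scan is an (n_scans, n_stars) array of instrumental
    magnitudes for the same comparison stars on each night; the scan
    with the brightest median magnitude is taken as approximately
    photometric, and every other scan is assigned the offset that
    brings it up to that level."""
    medians = np.median(mags_by_scan, axis=1)
    return medians - medians.min()   # extinction in magnitudes, >= 0
```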

The distribution of the extinction corrections (the number of 1° segments vs. their extinction, as defined above) is shown in Figure 4. We find that ∼80% of the data are extincted by <0.2 mag. In other words, 80% of the data yield photometry consistent to ∼20% under repeated observation. If the extinction correction is larger than 0.5 mag, the data are flagged as bad by the ANALYSIS program and not used for science data analysis. About 10% of the data are lost due to this cut. These extinction corrections are fed into the ANALYSIS program described above and are used to correct the instrumental magnitudes. The precision of these corrections is discussed in § 6.3 below.

Fig. 4.— Distribution of the extinction corrections for the Palomar QUEST survey data. Plotted are the number of 1-degree scan segments as a function of their relative extinction correction. See text for details.

6.2. Calibration of Instrumental Effects

The Sloan Digital Sky Survey (SDSS) Data Release 4 (Adelman-McCarthy et al. 2006) is used to do the photometric calibration of the QUEST survey, using areas of sky where the two surveys overlap. Several clear nights of QUEST data, both with Johnson and with SDSS filters, are selected (these are nights with the least extinction based on comparison with at least four other nights covering the same area). Objects in QUEST are matched to those in SDSS (using objects classified as stars by SDSS) and the QUEST instrumental magnitudes (after being corrected for extinction as described above) are compared to the SDSS-quoted magnitudes. The calibration consists of calculating the zero-point offset, i.e., the magnitude that has to be added to the QUEST instrumental magnitudes to bring them into agreement with SDSS. This is straightforward for the SDSS filters because those are the filters used by SDSS. To calibrate the data with Johnson filters, an algorithm (Fukugita et al. 1996) was used to convert the magnitudes of the SDSS catalog to Johnson magnitudes with sufficient precision to compare to QUEST data with Johnson filters.

This calibration procedure was carried out for all 112 CCDs, separately for the Johnson and SDSS filter sets, so there were a total of 224 calibrations (each CCD sees only two filters). Furthermore, the calibration of a single CCD was allowed to vary as a function of the column across the CCD, the magnitude of the object, and the sky brightness. The full calibration of the camera thus consisted of 112 CCDs × 2 filters × 10 regions across a CCD × 6 magnitude bins × 6 sky-brightness bins = 80,640 calibration parameters. As discussed in Baltay et al. (2007), several of the CCDs in the full array are not functional, so the number of calibration parameters is slightly smaller. Thanks to the substantial overlap area with the SDSS survey, we could use ∼10⁶ stars to evaluate all of these parameters. These parameters were fed into the ANALYSIS program described above and were used to transform the instrumental magnitudes (already corrected for extinction) into the final calibrated magnitudes.
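Each calibration parameter is, in essence, a zero point computed in one cell of this binning. A hedged sketch of one such cell follows; the 3 σ clipping scheme is our assumption for robustness.

```python
import numpy as np

def zero_point(quest_inst_mags, sdss_mags):
    """Zero-point offset for one calibration cell (one CCD, filter
    set, region across the CCD, magnitude bin, and sky-brightness
    bin): the magnitude added to the extinction-corrected QUEST
    instrumental magnitudes to bring them into agreement with the
    matched SDSS stars."""
    diff = sdss_mags - quest_inst_mags
    med, std = np.median(diff), np.std(diff)
    if std == 0.0:
        return med
    clipped = diff[np.abs(diff - med) < 3.0 * std]  # 3-sigma clip
    return np.median(clipped)
```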

6.3. Quality of the Photometric Calibration

Extensive checks were carried out to verify the quality of the photometric calibration. In one of the most stringent tests, we looked at the absolute photometric magnitude calibration by comparing with the SDSS data using a 1500 deg² region where the two surveys overlap. We compared each of the eight filters separately. The comparison for the SDSS r' filter is shown in Figure 5. We use in this comparison the entire QUEST data set from September 2003 to September 2006, with the standard quality cuts applied in the ANALYSIS program (see § 7), including both photometric and nonphotometric nights. The comparison, Figure 5a, has a standard deviation σ = 0.11 mag. This includes both the SDSS and QUEST statistical errors as well as the calibration errors.

Fig. 5.— Comparisons of the calibrated SDSS r magnitudes measured by Palomar QUEST to the r magnitudes measured by SDSS for the total SDSS-QUEST overlap area (∼1550 deg²), including all QUEST data from the first three years of the survey: (a) the number of observations, N, vs. SDSS − QUEST r. The standard deviation is 0.11 mag, which includes both the statistical errors and the calibration errors; (b) the number of observations vs. (SDSS − QUEST r)/error, where the error is the combined error after adding an 8% calibration error in quadrature to the SDSS and QUEST statistical errors. This additional error is required to produce a Gaussian fit with a standard deviation of 1 (dashed line).

To separate the statistical and calibration errors, we plot the distribution of the magnitude difference between SDSS and QUEST divided by the total error, where the total error is the SDSS and QUEST errors on each object added in quadrature to the estimated calibration error. We vary the calibration error estimate and choose the one that gives σ = 1 for this distribution. For the r' filter, the best estimate of the overall systematic calibration error was 0.08 mag, as shown in Figure 5b. The results for the estimated calibration error for all of the other passbands including both photometric and non-photometric nights vary between 0.06 and 0.12 mag, except for the U and B filters where the errors are larger. There is also typically 15% of the data in a non-Gaussian tail.
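A sketch of this estimation procedure follows; the grid of trial values is our choice.

```python
import numpy as np

def fit_calibration_error(residuals, stat_errors):
    """Find the calibration (systematic) error that, added in
    quadrature to the per-object statistical errors, makes the
    standard deviation of residual/total-error equal to 1, as in the
    procedure above. residuals are e.g. SDSS minus QUEST magnitudes."""
    best, best_dev = 0.0, np.inf
    for sys_err in np.linspace(0.0, 0.3, 301):   # trial values, mag
        total = np.hypot(stat_errors, sys_err)
        dev = abs(np.std(residuals / total) - 1.0)
        if dev < best_dev:
            best, best_dev = sys_err, dev
    return best
```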

This rough overall photometric calibration error is for the entire survey, including nonphotometric nights, with only the standard quality cuts. When we restrict the comparison to photometric nights only, the calibration errors are considerably smaller. Furthermore, this procedure uses a single set of magnitude zero points for all nights, with the relative extinction corrections as described in § 6.1. No attempt was made at this point to determine color terms in the corrections, though the ANALYSIS program has procedures to do this. In any particular science analysis where better photometric precision is required, calibration procedures appropriate for that particular analysis will have to be used. For the general goals outlined in § 1, the precision indicated by Figure 5 is adequate.

For example, to check the systematic errors of the relative magnitude calibration we took frames with repeated measurements of the same area of the sky spread out over the three years of the survey. We normalized the different epochs to each other using a sample of 10 or more field stars, and plotted the distribution of the deviation of the magnitudes of the individual measurements from the average magnitude. An example plot for the Johnson R filter, using data for bright stars with six or more measurements at different epochs, is shown in Figure 6a. The standard deviation of this distribution is σ = 0.03 mag. This error includes both the statistical and the systematic components of the error. To separate the two we used a procedure similar to that described above. We divided each deviation from the average by the error, where the error was taken as both the statistical and estimated systematic errors added in quadrature, and varied the estimated error. We needed a systematic error of 0.02 mag to produce a distribution with σ = 1 as shown in Figure 6b. Thus, our best estimate for the systematic error on relative photometry is 2%. It is also gratifying that a single Gaussian with σ = 1 is a good fit to the data with very little non-Gaussian tail. By careful selection and further calibration of the data, with cuts to eliminate the lower-quality observations, we obtain a systematic error of 0.7% (Bauer 2008).

Fig. 6.— (a) Number of observations, N, vs. the individual-minus-average magnitude for objects measured at multiple epochs; (b) number of observations vs. (individual − average magnitude)/(combined error), where the combined error is the statistical error on each measurement added in quadrature to a 0.02 mag systematic error, chosen to produce a Gaussian distribution with σ = 1. In both panels, smooth Gaussian fits to each distribution are overlaid.

7. COMPLETENESS AND PURITY OF THE DATA SAMPLE

To check the overall quality of the Yale data processing pipeline, we carried out further comparisons with the SDSS Data Release 4 (Adelman-McCarthy et al. 2006) in the area where the two surveys overlap. To estimate the completeness of the Palomar QUEST (PQ) data set, we selected all objects in the SDSS catalog in the SDSS-QUEST overlap areas and looked for a match in the PQ data within 2'' in R.A. and decl. The completeness of PQ was taken to be the fraction of the SDSS objects that were found in PQ. This completeness, as a function of SDSS r magnitudes, is shown as the upper solid line in Figure 7. We see from this plot that the overall completeness of the Palomar QUEST data set is around 97% up to r magnitude of 21, and falls to 50% around 22 mag.

Fig. 7.— Completeness of the Palomar QUEST data set compared to the SDSS survey. The upper solid line is without any cuts; the lower dashed line is with the selection criteria described in the text.

To check the purity of the PQ data set, we took objects in the PQ Master Catalog in the SDSS-QUEST overlap area and asked what fraction of the PQ objects have a match (within 2'') in the SDSS catalog. Because SDSS is presumably complete at the magnitude limit of the Palomar QUEST survey (SDSS goes ∼1 mag deeper), objects found by PQ that are not in SDSS are false events or else transients (such as asteroids, variable objects, supernovae, or high proper-motion stars). Hence, the fraction of the detections that are found by SDSS is a lower limit to the fraction that are real detections—the purity.

For this comparison the standard quality (QA) cuts were made on PQ objects in the ANALYSIS program:

  • 1.  
    Astrometry residuals ≤1''.
  • 2.  
    Sky level ≤6 times normal.
  • 3.  
    Extinction correction ≤0.5 mag.
  • 4.  
    Image FWHM ≤4''.
  • 5.  
    Objects were removed if the image was saturated, near the edge of the CCD, or had a PSF fitting error flag.

The first four cuts are made at levels far from their normal values. The intention is to eliminate unusual outliers, noise artifacts, and objects detected in unusually bad observing circumstances.

Two further selection criteria were used to eliminate false detections:

  • 1.  
    To remove cosmic rays and other artifacts not rejected by the PHOTOMETRY program, only objects with either detections in two different passbands of one scan or detections in a single passband but in at least two different scans were kept.
  • 2.  
    To remove false detections near saturated bright stars, detections near bright USNO stars and detections with more than 10 neighboring detections in a 10'' radius were removed.

After these selection criteria were applied, the purity of the PQ sample was found to be around 98% up to SDSS r magnitude 21, as shown in Figure 8.

Fig. 8.— Purity of the Palomar QUEST data set compared to the SDSS survey (i.e., the fraction of the Palomar QUEST observations also found by SDSS), with the quality cuts described in the text.

These quality cuts and selection criteria that we used to clean up the PQ data set reduce the completeness of the data set. Some legitimate objects are close to bright stars, or are lost because of high sky background due to the moon, large extinction due to clouds, or other similar effects. The completeness with all of the selection criteria applied is shown as the lower, dashed curve in Figure 7. It should be pointed out, however, that for any particular science analysis the severity of these selection criteria can be varied, thus optimizing purity at the expense of completeness or vice versa.

8. CONCLUSIONS

The Yale photometry pipeline described in this paper has been in place essentially in its present final form since the beginning of science-quality observations for the Palomar QUEST survey in the fall of 2003. The pipeline has been run regularly every day during the course of the survey and has kept up with the data on a daily basis. The program has not been changed in any significant way since the beginning to ensure a uniform quality to the survey data sample. The calibration procedures described in this paper were initiated after data taking began, and these calibrations have been completed just recently. A cataloging program has also been put in place to analyze the data, to place cuts on the measurements, and to produce catalogs of scientific interest. A number of scientific studies are ongoing, including measurements characterizing the timescale of blazar variability, and searches for lensed and highly redshifted quasars. A program of rapid detection and reporting of transients is also under way. As one of the largest full-sky variability surveys undertaken to date, the Palomar QUEST survey is a test bed for search strategies and analysis techniques to be used on future large-scale synoptic surveys. Some of the lessons learned in the processing, analysis, detection, and characterization of the variable objects in the Palomar QUEST survey will be of interest in the design of these future surveys.

This work has been supported by the Department of Energy Grant DEFG-02-92ER40704 and the National Science Foundation Grant AST-0407297. We thank Anne Sommer for final proofing.
