Further Analysis of "C"-source Completeness

To think more carefully about the completeness issues, I examined how Rae's analysis of the repeated calibration-field observations provides a measure of completeness when completeness is defined as:

 given multiple observations of the same field, 
      the repeated recovery fraction of sources flagged as "C" in 
      the 1/2 magnitude bin above the level 1 specification 
      for sensitivity in each band.

Using a calibration field, Rae created a database that records, for each source, the number of times it received the "C" designation out of 3692 opportunities.

The query below calculates the ratio of the number of times all sources in some limited magnitude range received "C" vs. the number of opportunities they had to receive a "C".

wsdb=> select count(*) as rows, sum(c_rows) as sum_c_rows, count(*)*3692 as chances,sum(c_rows)::float/(count(*)*3692.) as prob from countcatsum where cat='C' and k_m between 13.8 and 14.3 and cc_flg='000' and prox>8.0;
 rows | sum_c_rows | chances |       prob       
------+------------+---------+------------------
   40 |     145783 |  147680 | 0.98715465872156
(1 row)

The example above evaluates the recovered fraction for sources with Ks-band magnitudes between 13.8 and 14.3 (i.e. the half-magnitude bin above the level 1 specification). By this measure the completeness is 98.7%. The corresponding values for all three bands are:

 j_m between 15.3 and 15.8    98.4%
 h_m between 14.6 and 15.1    97.4%
 k_m between 13.8 and 14.3    98.7%
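
The arithmetic behind the "prob" column of the query above can be reproduced outside the database. The following is a minimal Python sketch, not part of the original analysis; the function name pooled_completeness is a label introduced here, and the inputs are the totals printed by the query (40 sources, 145783 "C" rows, 3692 opportunities per source).

```python
# Pooled "C" recovery fraction, mirroring sum(c_rows) / (count(*) * 3692)
# from the SQL query. The function name is hypothetical (introduced here).

N_SCANS = 3692  # opportunities per source, as stated in the text


def pooled_completeness(n_sources, sum_c_rows, n_scans=N_SCANS):
    """Return the fraction of all opportunities that produced a "C"."""
    if n_sources <= 0 or n_scans <= 0:
        raise ValueError("need at least one source and one scan")
    chances = n_sources * n_scans
    return sum_c_rows / chances


# Totals taken from the Ks-band query output above.
ks_completeness = pooled_completeness(n_sources=40, sum_c_rows=145783)
print(f"Ks-band pooled completeness: {100 * ks_completeness:.1f}%")  # 98.7%
```

Note that this is a pooled ratio: every source contributes the same number of opportunities, so a single poorly recovered source can shift the result noticeably when only 40 sources fall in the bin.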

In fact, these numbers are a bit deceptive. Below is a list of the contributions of individual sources to the Ks-band completeness measure, sorted by each source's individual completeness fraction. Note that the 98.7% completeness value is driven almost entirely by the first source on the list. It turns out this source should not have been included in the analysis: on south-going scans it hit a persistence ghost and got a "P", while on north-going scans it looked like a good source (and thus was included in this sample). With this source excluded, the Ks-band completeness by this measure is in excess of 99.9%.

wsdb=> select ra,decl,c_rows,f_rows,trunc(1000.*c_rows::float/3692.)/10. as prob,j_m,h_m,k_m from countcatsum where cat='C' and k_m between 13.8 and 14.3 and cc_flg='000' and prox>8.0 order by prob;
     ra     |   decl    | c_rows | f_rows | prob |  j_m   |  h_m   |  k_m   
------------+-----------+--------+--------+------+--------+--------+--------
 132.821376 | 11.868126 |   1845 |      0 | 49.9 | 15.069 |  14.47 | 14.202
 132.779506 | 11.976994 |   3686 |      0 | 99.8 | 14.818 | 14.187 | 14.025
 132.801895 |  11.78515 |   3688 |      0 | 99.8 | 15.057 | 14.424 | 14.226
 132.798146 | 12.298159 |   3689 |      1 | 99.9 | 14.666 | 14.344 | 14.194
 132.819967 | 12.215964 |   3690 |      0 | 99.9 | 15.077 | 14.399 | 14.268
 132.851304 | 12.211119 |   3690 |      0 | 99.9 | 14.411 | 13.947 | 13.813
 132.809108 | 12.197284 |   3691 |      0 | 99.9 | 14.846 |  14.24 | 14.124
  132.85019 | 12.143058 |   3690 |      0 | 99.9 | 15.069 | 14.372 | 14.169
 132.836876 | 12.092064 |   3689 |      1 | 99.9 |  14.52 | 14.226 |  14.08
 132.821561 |  12.08109 |   3689 |      0 | 99.9 | 14.637 | 14.108 | 13.954
 132.772214 | 12.030453 |   3690 |      1 | 99.9 | 14.932 | 14.263 | 14.046
 132.779205 | 11.996083 |   3690 |      0 | 99.9 | 14.755 |  14.15 | 14.032
 132.810839 | 11.986065 |   3691 |      0 | 99.9 | 14.828 | 14.118 | 14.023
 132.805369 | 11.964724 |   3691 |      0 | 99.9 | 14.848 | 14.245 | 14.098
 132.782394 | 11.941971 |   3691 |      1 | 99.9 | 14.535 | 14.108 | 14.106
 132.838723 | 11.913996 |   3691 |      0 | 99.9 | 14.795 | 14.238 | 13.893
  132.84806 |  11.81837 |   3691 |      0 | 99.9 | 14.466 | 13.906 | 13.899
 132.856325 | 11.815358 |   3690 |      0 | 99.9 | 14.776 | 14.146 | 13.987
 132.797656 |  11.78646 |   3691 |      0 | 99.9 | 15.043 | 14.287 | 14.128
 132.779954 | 11.762146 |   3690 |      0 | 99.9 | 14.389 | 13.966 | 13.861
 132.844233 | 11.682869 |   3691 |      1 | 99.9 | 15.101 | 14.456 | 14.278
 132.771745 | 11.681249 |   3691 |      0 | 99.9 | 14.968 | 14.346 | 14.092
 132.812069 | 11.659136 |   3691 |      0 | 99.9 | 14.707 | 14.026 | 13.922
 132.819881 | 11.581359 |   3690 |      0 | 99.9 | 14.511 | 14.078 |   13.9
 132.805291 | 11.540347 |   3691 |      0 | 99.9 |  15.11 | 14.452 | 14.234
  132.78385 | 11.527624 |   3690 |      0 | 99.9 | 15.035 | 14.443 | 14.239
 132.821604 | 11.475908 |   3691 |      0 | 99.9 | 14.476 | 13.958 | 13.898
  132.80827 | 11.439159 |   3691 |      0 | 99.9 | 14.607 | 14.293 |  14.21
 132.817241 | 12.181568 |   3692 |      0 |  100 | 14.835 | 14.262 | 14.144
 132.792901 | 12.025516 |   3692 |      0 |  100 | 14.332 | 13.914 | 13.835
 132.841019 | 11.980089 |   3692 |      0 |  100 | 14.956 | 14.438 | 14.069
 132.794569 | 11.909554 |   3692 |      0 |  100 | 14.457 | 13.946 | 13.835
 132.776643 |  11.90529 |   3692 |      0 |  100 | 14.895 | 14.273 | 14.034
 132.788671 | 11.894362 |   3692 |      0 |  100 | 14.816 |  14.29 | 14.207
 132.794747 | 11.890795 |   3692 |      0 |  100 | 14.657 |  13.95 | 13.954
 132.776957 | 11.791928 |   3692 |      0 |  100 | 14.538 | 13.931 | 13.893
 132.842975 | 11.735915 |   3692 |      0 |  100 | 14.503 | 13.984 | 13.861
 132.841751 | 11.729288 |   3692 |      0 |  100 | 14.698 | 14.133 | 13.903
 132.821946 |  11.58698 |   3692 |      0 |  100 | 14.767 |  14.18 | 14.027
 132.785526 | 11.428737 |   3692 |      0 |  100 | 14.347 | 13.942 | 13.928
(40 rows)
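
The effect of the first source in the list above can be checked directly from the numbers already shown. The sketch below is illustrative only; the helper name completeness_without is introduced here, and the inputs are the query totals (145783 "C" rows over 40 sources) and the first source's c_rows value (1845).

```python
# Recompute the Ks-band pooled completeness with the persistence-affected
# source removed. Helper name is hypothetical; numbers come from the tables.

N_SCANS = 3692          # opportunities per source
N_SOURCES = 40          # sources in the Ks-band bin
SUM_C_ROWS = 145783     # total "C" rows for those sources
GHOST_C_ROWS = 1845     # c_rows of the first (persistence-affected) source


def completeness_without(sum_c_rows, n_sources, dropped_c_rows, n_scans=N_SCANS):
    """Pooled completeness after dropping the sources listed in dropped_c_rows."""
    remaining = n_sources - len(dropped_c_rows)
    if remaining <= 0:
        raise ValueError("cannot drop every source")
    return (sum_c_rows - sum(dropped_c_rows)) / (remaining * n_scans)


with_ghost = SUM_C_ROWS / (N_SOURCES * N_SCANS)
without_ghost = completeness_without(SUM_C_ROWS, N_SOURCES, [GHOST_C_ROWS])

# Share of all missed "C" opportunities that belong to the dropped source.
missed_total = N_SOURCES * N_SCANS - SUM_C_ROWS
missed_ghost = N_SCANS - GHOST_C_ROWS
ghost_share = missed_ghost / missed_total

print(f"with first source:    {100 * with_ghost:.2f}%")     # 98.72%
print(f"without first source: {100 * without_ghost:.2f}%")  # 99.97%
print(f"share of misses from first source: {100 * ghost_share:.1f}%")  # 97.4%
```

The remaining 39 sources together miss only 50 of their 143988 opportunities, which is where the "in excess of 99.9%" figure comes from.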

As noted above, the first source drives down the completeness in this band. Below are some of its individual detections from the calibration database. In half of the scans the star fell on a persistence ghost (cc_flg of "PPP" in the last column).

  132.821372   11.868121  15.010  14.291  13.871  0.061  0.068   null     0.07   18.50   16.50 226    PPP    
  132.821368   11.868116  14.965  14.307  13.713  0.068  0.070   null     0.08   18.50   16.50 226    PPP    
  132.821374   11.868131  14.979  14.305  14.135  0.072  0.072  0.096     0.09   18.50   16.50 222    PPP    
  132.821376   11.868110  15.033  14.344  14.082  0.052  0.066  0.078     0.05   18.50   16.50 222    000    
  132.821391   11.868143  14.963  14.329  14.183  0.074  0.063  0.101     0.12   18.50   16.50 222    PPP    
  132.821371   11.868145  14.482  14.331  14.188   null  0.070  0.080     0.14   18.50   16.50 622    PPP    
  132.821403   11.868102  14.978  14.350  14.312  0.052  0.066  0.094     0.06   18.50   16.50 222    000    
  132.821363   11.868113  14.978  14.291  14.155  0.072  0.067  0.106     0.09   18.50   16.50 222    PPP    
  132.821363   11.868114  15.048  14.326  14.235  0.059  0.058  0.095     0.09   18.50   16.50 222    000    
  132.821374   11.868128  14.887  14.378  14.354  0.067  0.076  0.093     0.08   18.50   16.50 222    PPP    
  132.821386   11.868114  14.965  14.351  14.216  0.052  0.063  0.074     0.02   18.50   16.50 222    000    
  132.821395   11.868101  14.995  14.297  14.205  0.075  0.082  0.103     0.04   18.50   16.50 222    PPP    
  132.821381   11.868143  15.000  14.317  14.343  0.062  0.075  0.101     0.12   18.50   16.50 222    000    
  132.821398   11.868107  14.969  14.333  14.144  0.057  0.070  0.094     0.04   18.50   16.50 222    000    
  132.821372   11.868141  15.021  14.237  14.157  0.054  0.068  0.093     0.12   18.50   16.50 222    000    
  132.821397   11.868108  15.074  14.403  14.231  0.052  0.076  0.101     0.03   18.50   16.50 222    000    
  132.821375   11.868114  14.977  14.414  14.235  0.049  0.077  0.096     0.05   18.50   16.50 222    000    
  132.821379   11.868151  14.991  14.228  14.067  0.060  0.074   null     0.15   18.50   16.50 226    PPP    
  132.821378   11.868121  14.983  14.320  14.193  0.047  0.050  0.058     0.05   18.50   16.50 222    000    
  132.821384   11.868153  15.060  14.340  14.269  0.048  0.049  0.069     0.15   18.50   16.50 222    000    
  132.821363   11.868150  15.031  14.308  14.138  0.045  0.043  0.053     0.17   18.50   16.50 222    000    

So, the Ks-band completeness measured this way is close to 100%. Why is the Ks band so complete? Because Ks is the least sensitive band for sources with typical high-latitude colors. If a source is detected at Ks at good SNR then (at least for this high-latitude cal field) its detection at J and H at high SNR is assured. The J-band SNR will be so high that there is little chance the source will toggle across the "F"/"C" flagging boundary.

Now consider the same issues for the 97.4% "completeness" measured in the H-band. Below are the source-by-source completeness values:

wsdb=> select ra,decl,c_rows,f_rows,trunc(1000.*c_rows::float/3692.)/10. as prob,j_m,h_m,k_m from countcatsum where cat='C' and h_m between 14.6 and 15.1 and cc_flg='000' and prox>8.0 order by prob;
     ra     |   decl    | c_rows | f_rows | prob |  j_m   |  h_m   |  k_m   
------------+-----------+--------+--------+------+--------+--------+--------
 132.845409 | 11.939528 |   2198 |   1220 | 59.5 | 15.938 | 15.083 | 14.566
 132.846515 | 12.031677 |   2682 |    682 | 72.6 | 15.886 | 15.074 | 15.462
 132.841729 | 11.921727 |   3408 |    226 | 92.3 | 15.749 | 15.074 | 14.839
 132.818919 | 11.489823 |   3462 |    186 | 93.7 | 15.772 | 15.034 |  15.03
 132.825953 | 11.935231 |   3487 |    169 | 94.4 | 15.788 | 15.075 | 14.952
 132.814392 | 11.454962 |   3623 |     55 | 98.1 | 15.704 | 15.074 | 14.938
 132.768251 | 11.846685 |   3638 |     27 | 98.5 | 15.544 | 14.958 | 14.841
 132.851644 | 11.423502 |   3641 |     42 | 98.6 | 15.742 | 14.937 | 14.733
 132.840189 | 12.185337 |   3649 |     31 | 98.8 | 15.626 | 14.977 | 14.872
 132.848493 | 11.957039 |   3648 |     37 | 98.8 | 15.646 | 14.938 | 14.996
 132.823666 | 11.636212 |   3653 |     33 | 98.9 | 15.757 | 15.031 | 14.696
 132.768842 | 11.697128 |   3662 |      5 | 99.1 | 15.353 | 14.656 | 14.454
 132.811161 | 12.178091 |   3664 |     19 | 99.2 | 15.492 | 14.938 | 14.915
 132.772348 | 11.785291 |   3665 |      0 | 99.2 | 15.381 | 14.848 | 14.723
 132.832543 |  11.63244 |   3666 |     20 | 99.2 | 15.711 |  14.91 | 14.869
 132.780489 | 11.567529 |   3669 |     21 | 99.3 | 15.586 | 15.007 | 14.646
 132.784569 | 11.905586 |   3671 |      6 | 99.4 | 15.562 | 14.929 | 15.205
 132.801621 | 11.431091 |   3671 |     11 | 99.4 | 15.509 | 14.995 |  15.26
 132.801079 | 11.980979 |   3675 |     15 | 99.5 |  15.54 | 15.002 | 14.941
 132.815887 | 11.403999 |   3674 |      4 | 99.5 | 15.414 |  14.71 | 14.462
 132.788986 | 12.104757 |   3678 |      8 | 99.6 | 15.548 | 14.906 | 14.553
 132.786869 | 11.840988 |   3679 |      5 | 99.6 | 15.347 | 14.828 | 14.652
 132.773284 |  11.57869 |   3680 |      4 | 99.6 | 15.137 | 14.857 | 14.945
 132.817918 | 12.106892 |   3683 |      4 | 99.7 | 15.438 | 14.693 | 14.482
 132.841568 | 11.914836 |   3684 |      4 | 99.7 | 15.483 | 14.838 | 14.518
 132.774684 | 11.896566 |   3684 |      3 | 99.7 | 15.391 |  14.69 | 14.553
 132.777049 | 11.692842 |   3684 |      4 | 99.7 | 15.455 | 14.715 | 14.448
 132.792845 | 11.647357 |   3682 |      6 | 99.7 |  15.31 | 14.855 | 15.069
 132.850904 | 11.407166 |   3682 |      0 | 99.7 |  15.37 | 14.731 | 14.481
 132.849609 | 12.184153 |   3687 |      2 | 99.8 | 15.322 | 14.828 | 14.553
 132.805224 | 12.124827 |   3685 |      3 | 99.8 | 15.512 | 14.803 | 14.668
 132.856307 | 11.927506 |   3685 |      3 | 99.8 | 15.451 | 14.773 |  14.41
 132.801319 | 11.893528 |   3686 |      2 | 99.8 | 15.124 | 14.681 | 14.588
 132.839369 | 11.857634 |   3685 |      1 | 99.8 | 15.329 | 14.653 | 14.346
 132.836382 | 11.808805 |   3686 |      5 | 99.8 | 15.429 |  14.72 | 14.446
 132.802969 | 11.599224 |   3687 |      2 | 99.8 | 15.336 | 14.678 | 14.364
 132.810058 | 12.138291 |   3689 |      3 | 99.9 |   15.3 | 14.734 | 14.498
 132.818782 | 11.816794 |   3689 |      2 | 99.9 | 15.307 | 14.667 | 14.468
 132.802888 | 11.801827 |   3689 |      1 | 99.9 | 15.409 |  14.84 | 14.817
 132.828263 | 11.686715 |   3689 |      1 | 99.9 | 14.883 | 14.666 | 14.348
(40 rows)
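
The c_rows and f_rows columns in the table above can be turned into a per-source breakdown of where the 3692 opportunities went. This is a rough sketch rather than part of the original analysis: the function name scan_breakdown is introduced here, and the "other" category is simply whatever is left after "C" and "F" rows are subtracted (its interpretation, e.g. contaminated or missing detections, is an assumption not established by the table itself).

```python
# Split one source's opportunities into "C", "F", and everything else.
# Function name is hypothetical; inputs are c_rows and f_rows from the table.

N_SCANS = 3692  # opportunities per source


def scan_breakdown(c_rows, f_rows, n_scans=N_SCANS):
    """Return the fraction of scans flagged "C", flagged "F", or neither."""
    other = n_scans - c_rows - f_rows
    if other < 0:
        raise ValueError("c_rows + f_rows exceeds the number of scans")
    return {
        "C": c_rows / n_scans,
        "F": f_rows / n_scans,
        "other": other / n_scans,  # neither "C" nor "F"; meaning is an assumption
    }


# First source in the H-band list: c_rows = 2198, f_rows = 1220.
first = scan_breakdown(2198, 1220)
for label, frac in first.items():
    print(f"{label:>5}: {100 * frac:.1f}%")  # C: 59.5%, F: 33.0%, other: 7.4%

# Fraction of the missed "C" opportunities that were recorded as "F" instead.
f_share_of_misses = 1220 / (N_SCANS - 2198)
print(f"misses recorded as F: {100 * f_share_of_misses:.1f}%")  # 81.7%
```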

Once again, the first couple of sources contribute most of the incompleteness. The population of the "f_rows" column suggests that most of this incompleteness is now due to sources toggling between the "F" and "C" flags.

Note that the completeness results can be biased (for better or worse) by small-number statistics. Suppose there are 100 observations of each of 50 sources. Suppose 49 are 100% complete, but 1 source was so faint that 99 of its 100 observations were flagged as "F" yet, by bad luck, the 1 "C" observation was chosen as the fiducial for this analysis. Now the bulk completeness is 98% instead of close to 100%.

Below are some of the individual records for the first source. Note the occurrence of "p" and "c" in the cc_flg. About 8% of the incompleteness for this source can be attributed to this contamination (due once again to a persistence ghost seen only in the north-going scans).

  132.845446   11.939525  16.036  15.170  14.623  0.097  0.098  0.127     0.49   18.80   17.10 222    000    
  132.845408   11.939567  16.118  15.121  14.542  0.096  0.099  0.097     0.36   18.80   17.10 222    000    
  132.845375   11.939601  16.065  15.243  14.620  0.121  0.130  0.147     0.32   18.80   17.10 222    000    
  132.845399   11.939562  16.300  15.236  14.726  0.141  0.118  0.160     0.33   18.80   17.10 222    000    
  132.845440   11.939605  15.971  15.192  14.401  0.108  0.113  0.114     0.52   18.80   17.10 222    000    
  132.845427   11.939551  16.147  15.069  14.537  0.112  0.114  0.105     0.42   18.80   17.10 222    000    
  132.845428   11.939549  16.022  15.238  14.777  0.125  0.133  0.158     0.42   18.80   17.10 222    000    
  132.845406   11.939609  15.940  15.126  14.338  0.115  0.104  0.118     0.42   18.80   17.10 222    000    
  132.845440   11.939597  16.154  15.148  14.749  0.135  0.140  0.146     0.50   18.80   17.10 222    000    
  132.845427   11.939517  15.974  15.251  14.439  0.110  0.131  0.117     0.43   18.80   17.10 222    000    
  132.845475   11.939570  16.126  15.133  14.555  0.130  0.108  0.124     0.60   18.80   17.10 222    000    
  132.845513   11.939429  16.000  15.172  14.623  0.130  0.120  0.152     0.83   18.80   17.10 222    000    
  132.845444   11.939598  15.993  15.265  14.373  0.128  0.126  0.119     0.52   18.80   17.10 222    000    
  132.845455   11.939584  16.104  15.183  14.752  0.128  0.130  0.141     0.54   18.80   17.10 222    000    
  132.845428   11.939553  16.071  15.222  14.666  0.121  0.144  0.124     0.42   18.80   17.10 222    000    
  132.845433   11.939548  16.042  15.185  14.444  0.116  0.151  0.129     0.44   18.80   17.10 222    000    
  132.845419   11.939509  16.038  15.123  14.696  0.128  0.128  0.145     0.41   18.80   17.10 222    0p0    
  132.845413   11.939553  16.032  15.125  14.812  0.104  0.098  0.150     0.37   18.80   17.10 222    000    
  132.845428   11.939505  16.031  15.007  14.635  0.125  0.112  0.139     0.44   18.80   17.10 222    000    
  132.845421   11.939488  15.924  15.300  14.814  0.112  0.146  0.166     0.44   18.80   17.10 222    000    
  132.845454   11.939556  15.852  15.152  14.582  0.097  0.139  0.120     0.52   18.80   17.10 222    000    
  132.845366   11.939460  15.633  14.445  14.753   null   null  0.156     0.36   18.80   17.10 662    00c    
  132.845401   11.939639  15.986  15.389  14.812  0.105  0.120  0.137     0.48   18.80   17.10 222    000    
  132.845445   11.939599  16.022  15.200  14.723  0.124  0.102  0.127     0.52   18.80   17.10 222    000    
  132.845425   11.939577  15.963  15.275  14.712  0.094  0.128  0.100     0.43   18.80   17.10 222    000    
  132.845420   11.939582  15.928  15.171  14.496  0.085  0.108  0.085     0.42   18.80   17.10 222    pp0    
  132.845395   11.939606  15.898  15.107  14.512  0.097  0.107  0.095     0.38   18.80   17.10 222    pp0    
  132.845390   11.939583  16.087  15.112  14.567  0.103  0.102  0.090     0.33   18.80   17.10 222    000    
  132.845423   11.939544  15.969  15.047  14.549  0.087  0.106  0.091     0.41   18.80   17.10 222    pp0    
  132.845425   11.939620  16.012  14.998  14.699  0.095  0.087  0.116     0.50   18.80   17.10 222    pp0    
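
The small-number-statistics caveat described above is easy to verify numerically. The sketch below uses the hypothetical sample from that paragraph (50 sources, 100 observations each, one faint source with a single "C"); the function name bulk_completeness is introduced here for illustration.

```python
# Hypothetical sample from the text: 49 fully recovered sources plus one
# faint source that received "C" in only 1 of its 100 observations.

N_OBS = 100  # observations per source in the hypothetical example


def bulk_completeness(c_counts, n_obs=N_OBS):
    """Pooled fraction of observations flagged "C" across all sources."""
    if not c_counts:
        raise ValueError("need at least one source")
    return sum(c_counts) / (len(c_counts) * n_obs)


good_sources = [N_OBS] * 49   # 100% complete
faint_source = [1]            # 1 "C" out of 100

biased = bulk_completeness(good_sources + faint_source)
clean = bulk_completeness(good_sources)

print(f"with the faint source:    {100 * biased:.2f}%")  # 98.02%
print(f"without the faint source: {100 * clean:.2f}%")   # 100.00%
```

One source out of 50 is enough to move the pooled value by two percentage points, which is comparable to the band-to-band differences quoted earlier.
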

What does it all mean?

  1. This discussion reinforces the view that the "C" vs. "F" vs. "whatever" flagging must be done on a band-by-band basis in order to provide meaningful overall completeness statistics. The interplay of the bands (particularly the swap in dominance from the J-band at high latitude to the Ks-band at low latitude) rules out a one-letter-fits-all-bands approach. Why should the J-band flux drive how complete the source/catalog is when evaluated in the Ks-band according to the "C" flag?

  2. For those who adhere to the definition of completeness above (half-mag bins above the level 1 sensitivities), the current approach of separating "C" from "F", but now applied at SNR=10 in each band independently, comes close to yielding a satisfying level of completeness on a band-by-band basis. This result is illuminated by the J- and H-band results above.

  3. As Rae has pointed out, this sort of completeness analysis illuminates uniformity, with the calibration fields providing statistical exposure to the range of conditions (airglow, seeing, etc.). Since uniformity is what is really being measured, it may be more practical to provide cumulative percent-uniform-sky-coverage vs. magnitude statistics (as well as band-by-band all-sky "99%-completeness-magnitude-threshold" maps) than to have a one-uniformity-fits-all band-by-band flag. The availability of these products can lead to thoughtful and flexible approaches to selecting uniform samples (e.g. some people might be thrilled to find 1000 sq. deg of contiguous sky with 99% completeness 0.2 mags fainter than the rest of the survey). The question remains whether PSP, photometric offset, seeing, etc. can be turned into good completeness-magnitude proxies for individual tiles.

  4. Recommendation:


    May 10, 2002