A set of tools to verify the operation of the j2ms2 and tConvert programs

compare-ms-idi.py

Use this script to verify tConvert's export of all MeasurementSet data to one or more FITS-IDI files, or to verify j2ms2 by comparing different MeasurementSets. The script collects the integrated weights for all (baseline, source) combinations (“key” hereafter) and counts how many times each integration is present.
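The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the `summarize` function and the `(baseline, source, timestamp, weight)` row layout are assumptions made for the example, and the numbers are made up.

```python
from collections import defaultdict


def summarize(rows):
    """Accumulate weight and count integrations per (baseline, source) key.

    `rows` is a hypothetical iterable of (baseline, source, timestamp, weight)
    tuples; the real script reads its data from a MeasurementSet or FITS-IDI.
    """
    totals = defaultdict(lambda: {"weight": 0.0, "times": defaultdict(int)})
    for baseline, source, timestamp, weight in rows:
        entry = totals[(baseline, source)]
        entry["weight"] += weight          # integrated weight for this key
        entry["times"][timestamp] += 1     # how often this integration occurs
    return totals


# Made-up rows, for illustration only.
rows = [
    ("EfTr", "J1427+2632", 0.0, 16.0),
    ("EfTr", "J1427+2632", 2.0, 16.0),
    ("EfYs", "J1419+2706", 0.0, 32.0),
]
summary = summarize(rows)
for key, entry in sorted(summary.items()):
    print(key, "wgt=%.2f" % entry["weight"], len(entry["times"]), "integrations")
```

Keeping a per-timestamp counter next to the running weight is one way to make duplicated integrations visible later; the script itself may organize this differently.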

If more than one data set is specified it will report:

  • extra keys that are not common to all data sets, and in which data set these occur
  • for each key common to all given data sets, the tool compares the integrated weights and reports differences. If the numbers are not equal, data is missing from one or more of the data sets; the report lists which value was found in which data set
  • if a key has duplicated time stamps, the same data is present in the data set more than once; these occurrences are reported too
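The three checks in the list above could be sketched like this. It is a hedged illustration only: `multiway_diff`, `duplicated`, the `{name: {key: (weight, count)}}` layout and all numbers are invented for the example, and the exact-equality comparison is an assumption about how the real script decides that two values differ.

```python
def multiway_diff(datasets):
    """Compare {name: {key: (weight, count)}} summaries across data sets."""
    key_sets = {name: set(summary) for name, summary in datasets.items()}
    common = set.intersection(*key_sets.values())
    # Keys that are missing from at least one data set, listed per data set.
    extra = {name: sorted(keys - common)
             for name, keys in key_sets.items() if keys - common}
    # Common keys whose (weight, count) is not identical everywhere.
    differing = {}
    for key in sorted(common):
        values = {name: summary[key] for name, summary in datasets.items()}
        if len(set(values.values())) > 1:
            differing[key] = values
    return common, extra, differing


def duplicated(times):
    """Return the time stamps that occur more than once for one key."""
    return sorted(t for t, n in times.items() if n > 1)


# Made-up numbers and names, for illustration only.
datasets = {
    "full.ms": {("EfTr", "SRC1"): (64.0, 4), ("EfYs", "SRC1"): (32.0, 2),
                ("JbTr", "SRC2"): (16.0, 1)},
    "partial.ms": {("EfTr", "SRC1"): (32.0, 2), ("EfYs", "SRC1"): (32.0, 2)},
}
common, extra, differing = multiway_diff(datasets)
print(len(common), "common keys")
print("extra:", extra)
print("differing:", differing)
print("duplicates:", duplicated({0.0: 1, 2.0: 2}))
```

The function expects at least one data set; with exactly one, every key is common and nothing is reported as extra or differing.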

Usage:

    > compare-ms-idi.py --ms /path/to/input.ms
                        --idi /path/to/output.idi*

Note: the wildcards should not be escaped in this case.

Multiple --idi and/or --ms options are supported. The script effectively performs a multiway diff across all datasets given on the command line. It is also possible to compare only MeasurementSets, or only FITS-IDI files - the script doesn’t care, even when they’re (partially) disjoint.

If only one data set is passed on the command line, no diff is computed; instead the keys found are displayed, effectively summarizing the exposure time per baseline per source for that data set.

The script has -h/--help.

Example output

The uninteresting case

This run compares two MeasurementSets that end up comparing equal, despite being created by two different versions of j2ms2 - the changes in j2ms2 have not affected the data in the produced MeasurementSet:

    > compare-ms-idi.py --ms rsm02-prod-j2ms2.ms
                        --ms rsm02-antdiam-j2ms2.ms
    Successful readonly open of default-locked table rsm02-prod-j2ms2.ms: 22 columns, 94752 rows
    Successful readonly open of default-locked table rsm02-antdiam-j2ms2.ms: 22 columns, 94752 rows
    Checked 2 data sets, 90 common keys                                  
    >

This is a more interesting case - an instrumented one.

For one experiment a subset of the data was converted to a second (much) smaller MeasurementSet, which was subsequently converted to FITS-IDI as well.

The comparison covered the full MeasurementSet, the partial MeasurementSet and the FITS-IDI file produced from the partial MeasurementSet.

    > compare-ms-idi.py --ms rsm02-dev.ms
                        --ms rsm02-prod-j2ms2.ms
                        --idi RSM02-PRODJ2MS2.IDI

It produces a lot of output, which can be summarized in three sections:

The extra keys

The full data set had some stations coming in later and/or sources that were only observed later than those present in the partial MeasurementSet. As such there are keys that are only present in the full MeasurementSet.

These types of keys - those that cannot be compared - are listed first:

    ==== Problem report ====
       MS: rsm02-dev.ms
     Extra keys:
        ('T6T6', 'J1310+3220') found     89 times
        ('JbTr', 'J1310+3220') found     89 times
        ('McYs', 'J1310+3220') found     89 times
        ...

The common keys that have different values

For keys whose integrated weight differs between any of the data sets, the following is displayed:

    ('EfTr', 'J1427+2632') :
          6656.00s wgt= 53246.27   3328 times in   MS: rsm02-dev.ms
           416.00s wgt=  3328.00    208 times in   MS: rsm02-prod-j2ms2.ms
           416.00s wgt=  3328.00    208 times in  IDI: RSM02-PRODJ2MS2.IDI*
    ('EfYs', 'J1419+2706') :
          2550.00s wgt= 40775.69   1275 times in   MS: rsm02-dev.ms
           152.00s wgt=  2432.00     76 times in   MS: rsm02-prod-j2ms2.ms
           152.00s wgt=  2432.00     76 times in  IDI: RSM02-PRODJ2MS2.IDI*
    ('EfYs', 'J1427+2632') :
          6448.00s wgt=103158.98   3224 times in   MS: rsm02-dev.ms
           208.00s wgt=  3328.00    104 times in   MS: rsm02-prod-j2ms2.ms
           208.00s wgt=  3328.00    104 times in  IDI: RSM02-PRODJ2MS2.IDI*

This output shows that the data in the partial MeasurementSet and the corresponding FITS-IDI file are consistent, but that a significant amount of data is missing compared to the full MeasurementSet.

The summary line

The tool always displays a one-line summary:

    Checked 3 data sets, 90 common keys with 90 problems identified and 108 non-common keys in 1 formats
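When the tool is run from an automated check, the summary line could be picked apart with a small helper like the one below. This is only a sketch: `parse_summary` is a hypothetical helper, not part of the script, and the pattern is inferred solely from the two summary lines shown in this document, so other wordings may exist that it does not match.

```python
import re

# Pattern inferred from the two example summaries in this README (assumption).
SUMMARY = re.compile(
    r"Checked\s+(?P<datasets>\d+)\s+data sets,\s+(?P<common>\d+)\s+common keys"
    r"(?:\s+with\s+(?P<problems>\d+)\s+problems identified)?"
    r"(?:\s+and\s+(?P<noncommon>\d+)\s+non-common keys in\s+(?P<formats>\d+)\s+formats)?"
)


def parse_summary(text):
    """Return the numbers from a summary line as a dict, or None if absent."""
    match = SUMMARY.search(text)
    if match is None:
        return None
    # Parts that do not appear in the line are reported as 0.
    return {name: int(value) if value is not None else 0
            for name, value in match.groupdict().items()}


clean = parse_summary("Checked 2 data sets, 90 common keys")
noisy = parse_summary(
    "Checked 3 data sets, 90 common keys with 90 problems identified and 108\n"
    "non-common keys in 1 formats"
)
print(clean)
print(noisy)
```

A wrapper could then treat any non-zero `problems` or `noncommon` value as a failed comparison.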