compare-ms-idi.py
Use this script to verify tConvert's operation when exporting all MeasurementSet data to a (set of) FITS-IDI file(s), or to verify j2ms2's operation by comparing different MeasurementSets. The script collects the integrated weights for all (baseline, source) combinations (“key” hereafter) and counts how many times each integration is present.
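If more than one data set is specified, it reports the keys that are not common to all data sets, the keys whose integrated weight differs between data sets, and a one-line summary.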
Usage:
> compare-ms-idi.py --ms /path/to/input.ms --idi /path/to/output.idi*
Note: the wildcards should not be escaped in this case.
Multiple --idi and/or --ms options are supported. The script effectively performs a multiway diff across all data sets given on the command line. It is also possible to compare only MeasurementSets, or only FITS-IDI files; the script doesn’t care, even when they’re (partially) disjoint.
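How repeated options of this kind might be collected can be sketched with Python's argparse. This is only an illustration, not the script's actual option handling: the function name build_parser, the help texts, and the choice to treat every --idi occurrence as one data set made up of the files the shell expanded are all assumptions.

```python
import argparse


def build_parser():
    # Hypothetical parser: only the option names --ms and --idi are taken
    # from the usage text; everything else is an assumption.
    parser = argparse.ArgumentParser(prog="compare-ms-idi.py")
    # Each --ms occurrence adds one MeasurementSet path to the list.
    parser.add_argument("--ms", action="append", default=[], metavar="MS",
                        help="path to a MeasurementSet (may be repeated)")
    # Each --idi occurrence adds one list of files: an unescaped wildcard is
    # expanded by the shell into several arguments, which nargs="+" gathers.
    parser.add_argument("--idi", action="append", nargs="+", default=[],
                        metavar="FILE",
                        help="FITS-IDI file(s) forming one data set (may be repeated)")
    return parser


# Simulate the command line after the shell has expanded output.idi*
args = build_parser().parse_args(
    ["--ms", "input.ms", "--idi", "output.idi1", "output.idi2", "--ms", "other.ms"]
)
print(args.ms)
print(args.idi)
```

With this layout every entry in args.idi is itself a list of files, so a multi-file FITS-IDI export stays grouped as a single data set.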
If only one data set is passed on the command line, no diff is computed; instead the keys found are displayed, effectively summarizing the data set as exposure time per baseline per source.
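The per-key bookkeeping described above can be sketched as follows. This is a minimal illustration rather than the script's own code: the function name summarize and the (baseline, source, exposure, weight) row layout are assumptions, and the sample rows are made up.

```python
from collections import defaultdict


def summarize(rows):
    """Accumulate exposure, weight and integration count per (baseline, source).

    rows is an iterable of (baseline, source, exposure_s, weight) tuples;
    this layout is hypothetical and chosen only for the example.
    """
    stats = defaultdict(lambda: [0.0, 0.0, 0])
    for baseline, source, exposure, weight in rows:
        entry = stats[(baseline, source)]
        entry[0] += exposure   # total exposure time in seconds
        entry[1] += weight     # integrated weight
        entry[2] += 1          # number of integrations seen for this key
    return {key: tuple(value) for key, value in stats.items()}


# Made-up sample rows: two integrations on one key, one on another.
rows = [
    ("EfTr", "J1427+2632", 2.0, 16.0),
    ("EfTr", "J1427+2632", 2.0, 16.0),
    ("EfYs", "J1419+2706", 2.0, 32.0),
]
summary = summarize(rows)
for key, (exposure, weight, count) in sorted(summary.items()):
    print(f"{key} : {exposure:.2f}s wgt={weight:.2f} {count} times")
```

Keeping exposure, weight and count together per key makes the single-data-set summary a straightforward loop over the resulting dictionary.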
The script supports -h/--help.
The uninteresting case
This run compares two MeasurementSets that compare equal despite being created by two different versions of j2ms2, showing that the changes in j2ms2 have not affected the data in the produced MeasurementSet:
> compare-ms-idi.py --ms rsm02-prod-j2ms2.ms --ms rsm02-antdiam-j2ms2.ms
Successful readonly open of default-locked table rsm02-prod-j2ms2.ms: 22 columns, 94752 rows
Successful readonly open of default-locked table rsm02-antdiam-j2ms2.ms: 22 columns, 94752 rows
Checked 2 data sets, 90 common keys
>
This is a more interesting case - an instrumented one.
For one experiment a subset of the data was converted to a second (much) smaller MeasurementSet, which was subsequently converted to FITS-IDI as well.
The comparison covered the full MeasurementSet, the partial MeasurementSet and the FITS-IDI file produced from the partial MeasurementSet.
> compare-ms-idi.py --ms rsm02-dev.ms --ms rsm02-prod-j2ms2.ms --idi RSM02-PRODJ2MS2.IDI
It produces a lot of output, which can be summarized in three sections:
The full data set had some stations coming in later and/or sources that were only observed later than those present in the partial MeasurementSet. As such there are keys that are only present in the full MeasurementSet.
These types of keys - those that cannot be compared - are listed first:
==== Problem report ====
MS: rsm02-dev.ms
Extra keys:
('T6T6', 'J1310+3220') found 89 times
('JbTr', 'J1310+3220') found 89 times
('McYs', 'J1310+3220') found 89 times
...
For keys whose integrated weight differs between any of the data sets, the following is displayed:
('EfTr', 'J1427+2632') :
6656.00s wgt= 53246.27 3328 times in MS: rsm02-dev.ms
416.00s wgt= 3328.00 208 times in MS: rsm02-prod-j2ms2.ms
416.00s wgt= 3328.00 208 times in IDI: RSM02-PRODJ2MS2.IDI*
('EfYs', 'J1419+2706') :
2550.00s wgt= 40775.69 1275 times in MS: rsm02-dev.ms
152.00s wgt= 2432.00 76 times in MS: rsm02-prod-j2ms2.ms
152.00s wgt= 2432.00 76 times in IDI: RSM02-PRODJ2MS2.IDI*
('EfYs', 'J1427+2632') :
6448.00s wgt=103158.98 3224 times in MS: rsm02-dev.ms
208.00s wgt= 3328.00 104 times in MS: rsm02-prod-j2ms2.ms
208.00s wgt= 3328.00 104 times in IDI: RSM02-PRODJ2MS2.IDI*
The output also shows that the data from the partial MeasurementSet and the corresponding FITS-IDI file are consistent, but that a significant amount of data is missing compared to the full MeasurementSet.
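The kind of multiway comparison shown in these reports can be sketched as below. This is a sketch under stated assumptions, not the tool's implementation: the function name multiway_diff, the tolerance, the reduction of each key to a single integrated weight, and all data set names and values are hypothetical.

```python
import math


def multiway_diff(datasets, rel_tol=1e-6):
    """Compare {name: {key: integrated_weight}} mappings (hypothetical layout).

    Returns the keys common to all data sets, the extra keys per data set,
    and the common keys whose integrated weight differs somewhere.
    """
    key_sets = [set(summary) for summary in datasets.values()]
    common = set.intersection(*key_sets) if key_sets else set()
    # Keys that cannot be compared because not every data set has them.
    extra = {name: set(summary) - common for name, summary in datasets.items()}
    problems = {}
    for key in sorted(common):
        weights = [summary[key] for summary in datasets.values()]
        if any(not math.isclose(w, weights[0], rel_tol=rel_tol) for w in weights[1:]):
            problems[key] = {name: summary[key] for name, summary in datasets.items()}
    return common, extra, problems


# Made-up data: a full data set and two consistent partial ones.
datasets = {
    "full.ms": {("AB", "SRC1"): 64.0, ("AB", "SRC2"): 32.0, ("AC", "SRC1"): 16.0},
    "partial.ms": {("AB", "SRC1"): 16.0, ("AB", "SRC2"): 32.0},
    "partial.idi": {("AB", "SRC1"): 16.0, ("AB", "SRC2"): 32.0},
}
common, extra, problems = multiway_diff(datasets)
print(f"Checked {len(datasets)} data sets, {len(common)} common keys, "
      f"{len(problems)} with differing weights")
```

In this toy input the key ("AC", "SRC1") exists only in full.ms and so ends up among the extra keys, while ("AB", "SRC1") is common to all three but carries a different weight in full.ms.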
The tool always displays a one-line summary:
Checked 3 data sets, 90 common keys with 90 problems identified and 108 non-common keys in 1 formats