PIN1/PIN2 (shared phone line) Archiving/Processing Mishap
------------------------------------------------------------
Period : May 7 (127) of 2002 through August 1 (213) of 2002
Archive: SOPAC
Authors: SOPAC Staff
Date: August 6, 2002
****** ALERT : DO NOT use any data related to PIN2 from the time period
of 2002-137 through 2002-213! It is simply a copy of
data for PIN1!
****** ALERT : DO NOT use data for PIN1 for 2002-127! It is a copy of
data for PIN2!
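
The day-of-year numbers used throughout this notice can be converted to
calendar dates with a few lines of Python. This is only a convenience
sketch; the function names are illustrative and are not part of any SOPAC
tool.

```python
from datetime import date, timedelta


def doy_to_date(year: int, doy: int) -> date:
    """Convert a year and a 1-based day-of-year into a calendar date."""
    return date(year, 1, 1) + timedelta(days=doy - 1)


def date_to_doy(d: date) -> int:
    """Return the 1-based day-of-year for a calendar date."""
    return d.timetuple().tm_yday


if __name__ == "__main__":
    # Days quoted in the alerts and in the PROBLEM section.
    for doy in (137, 200, 213):
        print(f"2002-{doy:03d} -> {doy_to_date(2002, doy).isoformat()}")
    # 2002-137 -> 2002-05-17
    # 2002-200 -> 2002-07-19
    # 2002-213 -> 2002-08-01
```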
PROBLEM:
SOPAC staff were processing pin1 and pin2 data in GAMIT's baseline mode for
data collected on day 200, 2002. The GAMIT solution indicated that the two
sites were only 1-2 meters apart, although in actuality they are ~50 m
apart. After extensive investigation, they discovered that the pin2 data
file actually contained PIN1 data. The small nonzero baseline reported by
GAMIT was due to the different antenna heights for pin1 and pin2.
What caused the archiving of pin2 as an identical copy of pin1 from May 8
through August 2 (days 128-214), and the subsequent loss of PIN2 data for
this period?
a. Script from new Egads/Schedg upgrade sent hard-coded value of
into newly-created
MasterSites.xml file during translation/conversion of previous
Egads configuration file, MasterSites.lst. Local administrators
at SOPAC responsible for the Egads/Schedg software did not notice
this occurrence.
b. As a result of (a), and because pin1 and pin2 share the same phone
number, the new Egads/Schedg configuration dialed and downloaded both
sites (pin1 and pin2) shortly after the UTC day boundary each day (since
they are both single-session sites).
c. Because the two sites share the same phone number, they must be
downloaded through two different "time windows" on a daily basis
(facilitated by a "splitter" on site). Instead, Egads/Schedg was
essentially downloading pin1 twice, and writing the resultant raw
files with site identifier codes pin1 and pin2 (respectively) each
day - resulting in identical copies of the same file with two
different names.
ADM.pl (SOPAC's archive data manager), in turn, took each Ashtech raw
file at face value (based on the file name) and "rinexed" each file
with information from the database based on the site code in the file
name. As a result, pin1 raw files (though named pin2) were rinexed with
pin2 metadata in the header and pin1 observations in the body, and
archived as pin2 RINEX files. Note that the "Marker Name" in the RINEX
header still read 'PIN1': that field is entered by the operator at the
receiver, stored in the raw data file, and left untouched by ADM.pl.
It is interesting to note that PIN2 data were missing from the archive
on days 198-201 and 204-209. An investigation of these gaps might have
led to earlier discovery of the mishap.
d. During the regional analysis of pin1 and pin2 at SOPAC, some of the
'pin2' data (actually pin1 observations) may have slipped through the
cleaning procedure during the prefit solution, since those two sites
are relatively close to one another (~50 m). This resulted in
position estimates with large adjustments. In subsequent postfit
solutions, the cleaning procedure may have included more 'valid' data
since pin2's a priori position had been updated, resulting in 'pin2'
site position moving towards pin1. Subsequently, these erroneous
positions were published on the Web site
(e.g. http://sopac.ucsd.edu/scripts/coordIndex.cgi) for PIN2.
e. During the timeseries generation in post-processing for 'pin2' at
SOPAC, the newly estimated 'pin2' position differed significantly from
the correct position obtained from previous processing. A 'jump' of
this magnitude is automatically excluded from the published time
series, even though the day is still marked as 'processed'. This
abnormal jump escaped our operator's notice, which delayed the
discovery of the mishap.
**f. On day 217 of 2002, two days after correcting the MasterSites.xml
configuration for pin1 and pin2 at SOPAC (and two days after
retrieving old files from the pin2 receiver and deleting them manually
to free up memory), PIN2 was downloaded again, just after the UTC day
boundary (roughly 14 hours before the intended "window" set in the tag
for pin2), as PIN1. In turn, Michael (Scharber) had to delete the
incorrect data again and revert to downloading pin2 manually each
morning. The new Egads/Schedg (at least SOPAC's installed version)
does not appear to take the parameter into account. As a result, even
if SOPAC had had the correct parameters in MasterSites.xml from the
start, the same problem would have resulted.
** Michael Scharber and Keith Stark are currently working to resolve
this problem and prevent future problems of this sort from arising.
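
The mismatch described in item (c), a file named pin2 whose RINEX header
carries the marker name PIN1, is the kind of inconsistency a script can
catch. The following is a minimal sketch, not the ADM.pl implementation:
it assumes RINEX version 2 headers (value in columns 1-60, label in
columns 61-80) and the conventional file name whose first four characters
are the site code. The function names are hypothetical, and a check on the
raw Ashtech file itself would need that format's own reader.

```python
import os


def rinex_marker_name(path):
    """Return the MARKER NAME value from a RINEX 2 header, or None."""
    with open(path, "r", errors="replace") as fh:
        for line in fh:
            label = line[60:80].strip()  # header label, columns 61-80
            if label == "MARKER NAME":
                return line[:60].strip()  # header value, columns 1-60
            if label == "END OF HEADER":
                break
    return None


def site_code_from_filename(path):
    """Assume the first four characters of the file name are the site code."""
    return os.path.basename(path)[:4].lower()


def identity_matches(path):
    """True only if the header marker name agrees with the file name."""
    marker = rinex_marker_name(path)
    if marker is None:
        return False  # no marker name found: treat as unverified
    return marker[:4].lower() == site_code_from_filename(path)
```

A file for which identity_matches() returns False would be held back for
inspection rather than published under the name it arrived with.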
SOLUTION(S):
- What can we do to avoid this sort of problem in the future and improve
our overall data archiving, analysis and post-processing integrity at
SOPAC?
a. Egads/Schedg should verify (through an upgrade to sharc for providing
"anticipated" site identification for any given raw file) that the
file(s) it is downloading and naming with a given site code/identifier
are indeed for the same site, and not for another site being dialed
with incorrect metadata information.
*b. In ADM.pl, a simple step prior to archiving a RAW Ashtech "r-file" can
be included to essentially verify the "identity" of the raw file. If
that "identity" is different from that implied by the name of the file,
then an email should be sent to SOPAC administrators and the file
should not be rinexed, though it should still be archived and inspected
ASAP.
c. During SOPAC's regional analysis, any large adjustment (20 mm
horizontal, 40 mm vertical) at the prefit stage should trigger a warning
message sent to the operator.
d. During SOPAC's timeseries generation (post-processing), any large jumps
(5 mm horizontal, 10 mm vertical) should be investigated by the
operator.
* Already completed.
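
The thresholds in items (c) and (d) could be checked along the following
lines. This is a sketch under stated assumptions: the notice does not say
how the horizontal component is formed, so it is taken here as the
combined north/east magnitude, and the class and function names are
hypothetical.

```python
import math
from dataclasses import dataclass

# Limits in millimetres as quoted in items (c) and (d): (horizontal, vertical).
PREFIT_LIMITS_MM = (20.0, 40.0)
JUMP_LIMITS_MM = (5.0, 10.0)


@dataclass
class Offset:
    """A position change in local north/east/up components, in mm."""
    north_mm: float
    east_mm: float
    up_mm: float

    @property
    def horizontal_mm(self) -> float:
        # Assumption: horizontal size is the north/east vector magnitude.
        return math.hypot(self.north_mm, self.east_mm)


def exceeds(offset: Offset, limits) -> bool:
    """True if either the horizontal or the vertical limit is exceeded."""
    horizontal_limit, vertical_limit = limits
    return (offset.horizontal_mm > horizontal_limit
            or abs(offset.up_mm) > vertical_limit)


if __name__ == "__main__":
    adjustment = Offset(north_mm=30.0, east_mm=0.0, up_mm=5.0)
    if exceeds(adjustment, PREFIT_LIMITS_MM):
        print("WARNING: large prefit adjustment, notify operator")
```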
CLEANUP:
- What steps were taken to clean up SOPAC's archive of this mess?
a. Both raw and rinex files for pin1 day 127 of 2002 were recalled
from SOPAC's public archive, as they were copies of pin2 (an artifact
of the Egads/Schedg upgrade taking place "in-between" downloads: pin2
happened to be available, using the phone number shared with pin1,
after the upgrade completed).
b. Both raw and rinex files for pin2 days 128-136 of 2002 were reposted
after the actual raw files for the same period were collected from
pin2 by hand and rinexed, once the problem was detected.
c. Both raw and rinex files for pin2 days 137-213 of 2002 were recalled
from SOPAC's public archive, permanently. Data from pin2 during this
period are/were irretrievably lost as no communication with the site
was occurring, and the receiver filled up its local memory and stopped
recording data.
d. The proper "allowedtimes" values were added to MasterSites.xml for pin1
and pin2.
e. Incorrect posted site coordinates for pin2 were removed.
f. SOPAC and SCIGN newsletters were corrected.
g. A comment documenting the problem was added to the database.
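
Item (d) depends on each site sharing the phone line being dialed only
inside its own time window. The sketch below shows one way such a gate
could work. It does not reproduce the MasterSites.xml syntax or the
Egads/Schedg logic: the window values are purely illustrative, the names
are hypothetical, and windows crossing midnight are not handled.

```python
from datetime import time

# Hypothetical UTC download windows (start, end); values are illustrative.
WINDOWS = {
    "pin1": (time(0, 10), time(1, 0)),
    "pin2": (time(14, 0), time(15, 0)),
}


def in_window(site: str, now: time) -> bool:
    """True if 'now' falls inside the site's half-open window [start, end)."""
    start, end = WINDOWS[site]
    return start <= now < end


def windows_overlap(a, b) -> bool:
    """True if two (start, end) windows share any time."""
    return a[0] < b[1] and b[0] < a[1]


def site_on_line(now: time):
    """Return the single site expected to answer at 'now', else None."""
    hits = [site for site in WINDOWS if in_window(site, now)]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    # Sites sharing one phone number must never have overlapping windows.
    assert not windows_overlap(WINDOWS["pin1"], WINDOWS["pin2"])
    print(site_on_line(time(0, 15)))   # pin1
    print(site_on_line(time(14, 30)))  # pin2
    print(site_on_line(time(6, 0)))    # None
```

A download attempted for pin2 while site_on_line() reports pin1 (or None)
would be refused rather than stored under the pin2 name.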