GBT Archive Documentation



How to get GBT Data from the Archive

The NRAO Archive Access Tool does not work (see the next section for background). If the data are not stored under /home/archive or /lustre, and cannot be retrieved from /home/gbdata or /home/sdfits, one currently needs to use the new tool from Thomas to get data from the archive. The data can then be staged locally for users with GB accounts or placed on the anonymous FTP site for users without GB accounts.
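Before requesting a retrieval from the archive, it can help to check the local areas listed above. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes (without confirmation from this page) that each area holds one subdirectory per project, named after the project ID.

```python
from pathlib import Path

# Local areas named in the text above. The per-project subdirectory
# layout is an assumption made for illustration only.
LOCAL_AREAS = ("/home/archive", "/lustre", "/home/gbdata", "/home/sdfits")


def find_local_copies(project_id, areas=LOCAL_AREAS):
    """Return the paths under `areas` that hold a directory named `project_id`."""
    hits = []
    for area in areas:
        candidate = Path(area) / project_id
        if candidate.is_dir():
            hits.append(str(candidate))
    return hits


def needs_archive_retrieval(project_id, areas=LOCAL_AREAS):
    """True when no local copy was found, so the archive tool is needed."""
    return not find_local_copies(project_id, areas)
```

If `needs_archive_retrieval` returns True for a project, the data would have to come from the archive as described above. A project stored under a deeper path (for example, somewhere below /lustre) would not be found by this simple check.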

Data Retrieval Process

Background of the GBT Archive Systems

As of the end of 2020, we still do not have a public archive for access by outside users. The NRAO archive system did work with the old spectrometer data, but after it was updated to handle GBT VEGAS data, the process fell apart (due to retirements and loss of staff). See the following documentation for background information about the current GB archiving "system":

Definitions: The meta-data for all GBT observations are archived in the GB meta-data archive database. The testdata are not archived by NRAO. All spectral-line science data are archived in the NRAO archive. The goal was to also archive all "folded" pulsar data in the NRAO archive. This did not happen in all cases, although the meta-data for pulsar data have been saved in the NRAO archive (at least in most cases). The GB and CV pulsar teams have their own methods for distributing data to the science community.

Currently, VEGAS data are written to /lustre/gbtdata. GUPPI data were written to beef and then moved to /lustre/pulsar. The VLB and radar backend data are not written to /home/gbdata and are not archived; these data are shipped directly to their customers.
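The per-backend routing in the paragraph above can be summarized as a small lookup table. This is only an illustrative sketch: the table restates what the paragraph says, while the dictionary keys and the `where_to_look` helper are hypothetical names introduced here.

```python
# Routing as described in the paragraph above. Field names are
# illustrative assumptions, not part of any GBO software.
ROUTES = {
    "VEGAS": {"written_to": "/lustre/gbtdata", "moved_to": None, "shipped_to_customer": False},
    "GUPPI": {"written_to": "beef", "moved_to": "/lustre/pulsar", "shipped_to_customer": False},
    "VLB": {"written_to": None, "moved_to": None, "shipped_to_customer": True},
    "RADAR": {"written_to": None, "moved_to": None, "shipped_to_customer": True},
}


def where_to_look(backend):
    """Return the final on-disk location for a backend, or None if the
    data are shipped to the customer and never archived."""
    route = ROUTES.get(backend.upper())
    if route is None:
        raise KeyError(f"no routing information for backend {backend!r}")
    if route["shipped_to_customer"]:
        return None
    # Prefer the destination the data are moved to, if there is one.
    return route["moved_to"] or route["written_to"]
```

For example, `where_to_look("GUPPI")` gives /lustre/pulsar, while `where_to_look("radar")` gives None because those data are shipped directly and not archived.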

All non-VEGAS data are saved locally in GB in the /home/archive area, which is important for staff engineering work such as PTCS long-term performance trending (e.g., OOF and pointing solutions).

When GBT observations are collected, the non-VEGAS data are written to /home/gbtdata, while the VEGAS data are written to /lustre. The spectral-line data are filled in real time into SDFITS and saved in /home/sdfits. The GB meta-data archive is also populated in semi-real time during the observations. On a daily basis, /home/gbtdata is copied to the GB data archive area (/home/archive) and to the CV archive staging area. When the GB lustre fills up, projects are archived from the CV staging area into the NGAS system and then removed from the GB /lustre. As part of this process, the meta-data information from the GB meta-data archive is copied to the NRAO meta-data archive database.

Currently, NRAO-SOC is still working to ingest the GBT data into the new NGAS system, and not all data are available. The target date for finishing this process has slipped repeatedly: first the end of December 2019, then December 2020, then March 2021, and most recently September 2021.


Lost Data

The state of the GBT data archive is a mess. Not all of the data that should be in the CV archive staging area are there, so even after SOC completes the ingestion of GBT data, some GBT data will be missing from the archive. Some data that are missing from the staging area are also not on tape. The requirement/policy of having all data saved in at least two places was not met. The data sessions listed below are LOST. These data are missing from the archive, are not on disk in GB, are not in the CV staging area, and are not on tape. Our current data management systems are not reliable, and I would maintain copies of any data you wish to preserve. This list will grow as we find more missing/lost data:

David T. Frayer (GBO)