The NRAO Archive Access Tool does not work (see the background section
below), so if the data are not stored under /home/archive or /lustre,
and cannot be retrieved from /home/gbtdata or /home/sdfits, one needs
to use the new tool from Thomas to get data from the archive. The data
can then be staged locally for users with GB accounts or placed on the
anonymous FTP site for users without GB accounts.
Data Retrieval Process
Staff should first check the DSS system to verify that data are
no longer proprietary before supplying data to a non-PI of the
project. The GBT proprietary period is 1 year after the last session
of a project; e.g., if someone wants data from AGBT19B_999_01 and the
project was closed on 2020.02.01 with the last session observed on
2019.12.15, then the data for AGBT19B_999_01 would be available on
2020.12.15. If the project is still open and/or if you have
questions, please contact Toney Minter before releasing archival data.
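The release-date rule above can be sketched as a small calculation. This
is only an illustration: the function names are hypothetical, the
handling of a 29 February session date is an assumption, and the DSS
system remains the authoritative source (the sketch does not check
whether a project is still open).

```python
from datetime import date


def release_date(last_session: date, years: int = 1) -> date:
    """Return the date on which data leave the proprietary period.

    The period is counted from the last observed session of the project,
    not from the date the project was closed.
    """
    try:
        return last_session.replace(year=last_session.year + years)
    except ValueError:
        # last_session is 29 February and the target year is not a leap
        # year; falling back to 28 February is an assumption.
        return last_session.replace(year=last_session.year + years, day=28)


def is_public(last_session: date, today: date) -> bool:
    """True if the data are no longer proprietary on the given day."""
    return today >= release_date(last_session)


# Example from the text: last session observed on 2019.12.15
print(release_date(date(2019, 12, 15)).isoformat())  # 2020-12-15
```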
For the most recent data (taken within roughly the last 1 to 2 years),
check /home/sdfits, /home/gbtdata, and /lustre/gbtdata first.
Data taken before VEGAS (before about 2012) can be retrieved from
/home/archive/science-data.
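A minimal sketch of the local-disk check, assuming that each of the
directories named above holds one top-level entry per project session
whose name begins with the project ID. That layout, and the function
name, are assumptions for illustration; adjust the search if the data
are nested more deeply.

```python
from pathlib import Path

# Search roots taken from the text; the layout beneath them is assumed.
SEARCH_ROOTS = [
    "/home/sdfits",
    "/home/gbtdata",
    "/lustre/gbtdata",
    "/home/archive/science-data",
]


def find_project(project_id, roots=SEARCH_ROOTS):
    """Return top-level entries under each root whose name starts with project_id."""
    hits = []
    for root in roots:
        root_path = Path(root)
        if not root_path.is_dir():
            continue  # root is not mounted on this machine
        try:
            entries = sorted(root_path.iterdir())
        except PermissionError:
            continue  # no read access to this root
        hits.extend(p for p in entries if p.name.startswith(project_id))
    return hits


if __name__ == "__main__":
    for path in find_project("AGBT19B_999"):
        print(path)
```

If nothing is found on local disk, continue with the archive steps.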
For archived VEGAS data, use the web interface "Alda" (Beta-archive)
from a computer on the internal network, logging in with your
my.nrao.edu credentials.
To search by project ID: click the Archive tab, enter the project ID,
and click filter.
Scroll down to the "NGAS URLs" and download the data to your
/home/scratch area (avoid your home area or you may fill your home
quota).
Run SDFITS for the user if needed/requested. The sdfits program
needs to be run on a GB machine.
If users have an active GB account, have users copy the data from your
scratch area to their area. If users do not have an active GB
account, put the data on the anonymous GB FTP site.
Anonymous FTP Instructions for Staff
Copy data to the FTP area directly using the linux file system,
e.g., cp *data /home/ftp/pub/dfrayer/. (or use a directory under your
own NRAO username). The /home/ftp/pub/dfrayer directory is group
writable (gbstaff), but files in the FTP area count against your
limited "home" quota, so contact David Frayer if the files are larger
than your quota allows (CIS temporarily increased David's home quota
for FTP data transfers until the archive is fixed/ready). You can check
your current home quota limit and usage by typing "myquota" at the
linux prompt.
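Before copying, it can help to compare the total size of the files with
the space left in your quota. The sketch below only adds up file sizes;
the function names are hypothetical, and the free-space figure must be
read by hand from the "myquota" output, whose format is not assumed
here.

```python
from pathlib import Path


def total_size_bytes(paths):
    """Sum the sizes of regular files in paths, recursing into directories."""
    total = 0
    for p in map(Path, paths):
        if p.is_dir():
            total += sum(f.stat().st_size for f in p.rglob("*") if f.is_file())
        elif p.is_file():
            total += p.stat().st_size
    return total


def fits_in_quota(paths, quota_free_bytes):
    """True if the files would fit in the remaining quota (supplied by the caller)."""
    return total_size_bytes(paths) <= quota_free_bytes
```

For example, fits_in_quota(["/home/scratch/<user>/AGBT19B_999_01"],
5 * 1024**3) asks whether that directory fits in 5 GiB of free quota;
both the path and the figure are placeholders.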
Provide users with the instructions below so they can retrieve the data:
ftp> cd pub/dfrayer (or your own staff username if stored there)
ftp> ls
ftp> bin (change to binary for fits files)
ftp> mget AGBT19B_999* (where AGBT19B_999 is the project ID)
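Users who prefer a script to an interactive session could use something
like the following sketch, built on Python's standard ftplib module. It
mirrors the cd / ls / bin / mget steps above. The host name is not given
in this document, so the one shown in the usage comment is a
placeholder, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase
from ftplib import FTP


def select_files(names, pattern):
    """Return the names matching a shell-style pattern (case-sensitive)."""
    return [n for n in names if fnmatchcase(n, pattern)]


def fetch_project(host, remote_dir, pattern):
    """Download every file in remote_dir that matches pattern via anonymous FTP."""
    with FTP(host) as ftp:
        ftp.login()          # no arguments: anonymous login
        ftp.cwd(remote_dir)  # equivalent of "cd pub/dfrayer"
        for name in select_files(ftp.nlst(), pattern):
            with open(name, "wb") as out:
                # retrbinary transfers in binary mode, as needed for fits files
                ftp.retrbinary(f"RETR {name}", out.write)


# Usage (placeholder host; substitute the real anonymous FTP server):
# fetch_project("ftp.example.org", "pub/dfrayer", "AGBT19B_999*")
```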
Background of the GBT Archive Systems
As of the end of 2020, we still do not have a public archive for
access by outside users. The NRAO archive system did work with the
old spectrometer data, but after it was updated to handle GBT VEGAS
data, the process fell apart (due to retirements and loss of staff).
See the following documentation for background information about the
current GB archiving "system":
NGAS data archive: NRAO archive containing GBT data
NRAO archive: NRAO data and meta-data archives
The meta-data for all GBT observations are archived in the GB
meta-data archive database. Test data are not archived by NRAO.
All spectral-line science data are archived in the NRAO archive. The
goal was to also archive all "folded" pulsar data, but this did not
happen in all cases; the meta-data for pulsar data have nevertheless
been saved in the NRAO archive (at least in most cases).
The GB and CV pulsar teams have their own methods for distributing
data to the science community.
Currently, VEGAS data are written to /lustre/gbtdata, while GUPPI data
were written to beef and then moved to /lustre/pulsar. The VLB and
radar backend data are not written to /home/gbtdata and are not
archived; these data are shipped to their customers directly.
All non-VEGAS data are saved locally in GB in the /home/archive area,
which is important for staff engineering work such as PTCS long-term
performance trending (e.g., OOF and pointing solutions).
When GBT observations are collected, the non-VEGAS data are written to
/home/gbtdata, while the VEGAS data are written to /lustre. The
spectral-line data are filled in real time into SDFITS and saved in
/home/sdfits. The GB meta-data archive is also populated in
semi-real time during the observations. On a daily basis,
/home/gbtdata is copied to the GB data archive area (/home/archive)
and to the CV archive staging area. When the GB lustre fills up,
projects are archived from the CV staging area into the NGAS system
and then removed from the GB /lustre. As part of this process, the
meta-data information from the GB meta-data archive is copied to the
NRAO meta-data archive database.
Currently, NRAO-SOC is still working to ingest the GBT data
into the new NGAS system, and not all data are available. The target
date for finishing this process has slipped repeatedly: from the end
of December 2019 to December 2020, then March 2021, and now
September 2021.
Lost Data
The state of the GBT data archive is a mess. Not all data that should
be in the CV archive staging area are there, so even after SOC
completes the ingestion of GBT data, some GBT data will be missing
from the archive. Some data that are missing from the staging area are
also not on tape. The requirement/policy of having all data saved in
at least two places was not met. The data sessions listed below are
LOST: they are missing from the archive, are not on disk in GB, are
not in the CV staging area, and are not on tape. Our current data
management systems are not reliable, and I would maintain copies of
any data you wish to preserve.
This list will grow as we find more missing/lost data: