The lags computed by the spectrometer are 16-bit values. However, most pulsar observations do not need this much precision and can use 4- or even 2-bit values without significant loss of signal-to-noise. Therefore, the Spigot cards can truncate the lags to a lower precision before sending them to the acquisition computer. This enables faster sampling, more lags, and/or more IFs to be recorded while maintaining the same data rate.
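To preserve as much signal-to-noise as possible when truncating the data, each individual lag has its own scale and offset, which are applied to it prior to truncation. Writing the scale as α_{i} and the offset as M_{i}, each lag is transformed as lag_{i} → α_{i} (lag_{i} + M_{i}).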
We first determine the statistics of the 16-bit signal robustly (using an iterative σ-clipping algorithm), assuming that the signal is Gaussian with mean µ_{i} and standard deviation σ_{i}, i.e. lag_{i, 16-bit} ~ N(µ_{i}, σ_{i}^{2}).
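The robust estimate described above can be sketched as follows. This is a minimal illustration, not the Spigot software itself: the function name `sigma_clipped_stats`, the 3σ threshold, and the iteration cap are all assumptions chosen for the example.

```python
import numpy as np

def sigma_clipped_stats(x, nsigma=3.0, max_iter=10):
    """Estimate the mean and standard deviation of x, rejecting outliers.

    nsigma and max_iter are illustrative defaults, not values from the text.
    """
    x = np.asarray(x, dtype=float)
    keep = np.ones(x.shape, dtype=bool)
    for _ in range(max_iter):
        mu = x[keep].mean()
        sigma = x[keep].std()
        # Keep only samples within nsigma standard deviations of the mean.
        new_keep = np.abs(x - mu) <= nsigma * sigma
        if np.array_equal(new_keep, keep):
            break  # the set of accepted samples has stopped changing
        keep = new_keep
    return x[keep].mean(), x[keep].std()
```

Note that clipping a truly Gaussian signal at 3σ biases the estimated standard deviation low by a percent or two, which is small compared with the width of the acceptable ranges discussed below.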
To measure how good the lower-precision representation is, we define an error function E(µ_{i},σ_{i}) = (1/N) Σ (lag_{in} - lag_{out})^{2}, where the sum runs over the N samples. We must scale this by 1/σ_{i}^{2} for the results to be comparable. Therefore, plotting E(µ_{i},σ_{i})/σ_{i}^{2} vs. (µ_{i},σ_{i}) should allow us to determine the optimal range of (µ_{i},σ_{i}) where the degradation is minimized. E_{i}/σ_{i}^{2} is then related to the more standard signal-to-distortion ratio (SDR) by SDR = -10 log_{10}(E_{i}/σ_{i}^{2}), where SDR is in dB.
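A small simulation of this error function might look like the sketch below. The quantizer model is an assumption: values are truncated downward to integers and clipped to the signed range of the given number of bits. The function names, sample count, and seed are likewise illustrative.

```python
import numpy as np

def quantize(x, nbits):
    """Truncate to integers and clip to the signed nbits range (assumed model)."""
    lo, hi = -(2 ** (nbits - 1)), 2 ** (nbits - 1) - 1
    return np.clip(np.floor(x), lo, hi)

def normalized_error(mu, sigma, nbits, n=200_000, seed=0):
    """Monte Carlo estimate of E(mu, sigma) / sigma^2 for a Gaussian input."""
    rng = np.random.default_rng(seed)
    lag_in = rng.normal(mu, sigma, n)
    lag_out = quantize(lag_in, nbits)
    return np.mean((lag_in - lag_out) ** 2) / sigma ** 2

def sdr_db(e_norm):
    """Signal-to-distortion ratio in dB from the normalized error."""
    return -10.0 * np.log10(e_norm)
```

Scanning `normalized_error` over a grid of `mu` and `sigma` for each precision is one way to reproduce the kind of curves described in the following paragraphs.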
We plot the results of simulations of the degradation in the Figures (below). With 16-bit data, the finite precision (i.e. converting 2.5 to 2) has very little effect on E_{i} as long as σ_{i} ≳ 100. The largest effect is the limited range of the data, which causes misrepresentations when σ_{i} ≳ 7000. The optimal range (for µ_{i}=0) is 100 ≲ σ_{i} ≲ 7000. Changing µ_{i} has very little effect on E(µ_{i},σ_{i}) as long as |µ_{i}| ≲ 5000; above this, it begins to push data on one side of the Gaussian towards a limit. The optimal region has E_{i}/σ_{i}^{2} of order 10^{-6}.
For 8 bits the behavior is similar to the 16-bit case: the results are good for 20 ≲ σ_{i} ≲ 50 and |µ_{i}| ≲ 10, and the optimal region has E_{i}/σ_{i}^{2} of order 10^{-3}.
In the 4-bit case the effects of truncation become more apparent, as the results are good only when 1.7 ≲ σ_{i} ≲ 3 and |µ_{i}| ≲ 2. The optimal region has E_{i}/σ_{i}^{2} ~ 10^{-2} - 10^{-1}.
The 2-bit case has the smallest region with acceptable degradation. For 0.6 ≲ σ_{i} ≲ 1 the results are good, but µ_{i}=0 is required. The optimal region has E_{i}/σ_{i}^{2} ~ 0.8 - 1.0.
The results, summarized in the Table (below), show that there is a range of σ_{i} for which the lag_{out} have acceptable deviation from the lag_{in} (µ_{i}=0 is the best value in all cases). The values of σ_{i} that minimize E_{i}/σ_{i}^{2} give SDRs comparable to the best values obtainable (e.g. http://www.data-compression.com/vq.html).
To have the least degradation when truncating our input signal, we then set the offset M_{i} = -µ_{i} and the scale α_{i} = σ_{opt}/σ_{i}, where σ_{opt} is the optimal value of σ for a given precision, listed in the Table. Prior to truncation the signal will then be ~N(0,σ_{opt}^{2}), as desired.
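The normalization can be sketched as below, assuming the offset is added before the scale is applied (which is what makes the result N(0, σ_opt²) when the offset is -µ). The names `SIGMA_OPT`, `scale_and_offset`, and `truncate_lags` are hypothetical. The 2-bit and 4-bit targets are the values quoted in the figure caption; the 8-bit and 16-bit targets are illustrative picks from inside the tabulated ranges. The downward truncation and clipping are an assumed quantizer model.

```python
import numpy as np

# Target standard deviations per precision. 0.8 and 2.1 come from the figure
# caption; 30 and 1000 are illustrative choices within the tabulated ranges.
SIGMA_OPT = {2: 0.8, 4: 2.1, 8: 30.0, 16: 1000.0}

def scale_and_offset(mu, sigma, nbits):
    """Per-lag scale and offset from the measured per-lag mean and std."""
    offset = -np.asarray(mu, dtype=float)                      # M_i = -mu_i
    scale = SIGMA_OPT[nbits] / np.asarray(sigma, dtype=float)  # sigma_opt / sigma_i
    return scale, offset

def truncate_lags(lags, scale, offset, nbits):
    """Apply offset and scale, then truncate to signed nbits integers."""
    lo, hi = -(2 ** (nbits - 1)), 2 ** (nbits - 1) - 1
    scaled = scale * (np.asarray(lags, dtype=float) + offset)
    return np.clip(np.floor(scaled), lo, hi).astype(np.int16)
```

Here `lags` is taken to have shape (time samples, lags), so that the per-lag `scale` and `offset` arrays broadcast along the time axis.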
That there is a somewhat broad range for σ_{opt} is crucial, as we have assumed that the statistics of the incoming signal are stationary. Strong, variable sources or interference can alter the statistics. We therefore normalize the signal to the middle of the optimal range (perhaps slightly to the low side of it), so that variations in signal strength will not significantly change the quality of the data unless Δσ ~ σ, which should only happen if the antenna temperature T fluctuates by ΔT ~ T. This may happen due to significant RFI, but such RFI will generally wash out the data as well, so it will not be what limits our precision.
In the discussion above, we assumed that the data could be characterized properly. When operating in a 16-bit mode this is easily done, but at the same time it is not crucial to set the scales and offsets precisely, as a broad range of values will allow operation with negligible degradation. For modes with < 16 bits, and especially those with 4 bits, the scales and offsets are more important, but low-precision data are hard to characterize (i.e. it is hard to determine their mean and standard deviation). Therefore, the Spigot computer can direct the Spigot cards over the serial connection to enter a temporary 16-bit mode. This mode is not meant for normal operation and does not supply data for every time sample (if it did, it would violate the 25 Mb/s constraint on the system). For example, in mode 5 the data have 4 bits per lag, so the temporary 16-bit mode only supplies data every fourth time sample. In this way the data rate is held constant. While losing three out of four samples is not useful for observing, it does not degrade the data in a statistical sense, so these data can be used to determine the means and standard deviations of every lag. The appropriate scales and offsets are then computed and sent over the serial connection, and finally the Spigot cards are directed to return to the normal 4-bit mode.
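The constant-data-rate argument can be illustrated with a short sketch. Only the 4-bit case (every fourth sample) is stated above; the general ratio of 16 to the normal bit depth is an extrapolation of the same argument. The function names are hypothetical, and plain means and standard deviations are used here where a clipped estimate could be substituted.

```python
import numpy as np

def calibration_stride(normal_bits, calib_bits=16):
    """Time samples per delivered 16-bit sample at a fixed data rate."""
    if calib_bits % normal_bits:
        raise ValueError("calib_bits must be a multiple of normal_bits")
    return calib_bits // normal_bits

def calibration_stats(lags16, normal_bits):
    """Per-lag mean and std from the decimated 16-bit stream.

    lags16 has shape (time samples, lags); only every stride-th time
    sample is kept, mimicking the temporary 16-bit mode.
    """
    stride = calibration_stride(normal_bits)
    kept = np.asarray(lags16, dtype=float)[::stride]
    return kept.mean(axis=0), kept.std(axis=0)
```

Because the decimation discards samples without regard to their values, the statistics of the kept samples match those of the full stream, only with fewer samples contributing to the estimate.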
Optimal σ_{i} for reduced-precision Gaussians

Bits | Range for σ_{i} | E(0, σ_{i})/σ_{i}^{2} |
---|---|---|
2 | 0.6 - 1 | 1 |
4 | 1.7 - 2.5 | 10^{-1} |
8 | 20 - 40 | 10^{-3} |
16 | 100 - 7000 | 10^{-6} |
Error in estimating a Gaussian with 4-bit precision (right) and 2-bit
precision (left). The solid curves and lower axes are for µ_{i}=0,
while the dashed curves and upper axes are for σ_{i}=2.1
(4-bit) and σ_{i}=0.8 (2-bit).