YGOR Fundamentals


YGOR is the name of the Monitor and Control system software. The following, mostly written by Mark Clark and John Ford, are the basics of YGOR that you will need to know to make effective use of a CLEO application.

Managers:

The setup and operation of all GBT devices are defined through a software module called a manager. Managers are used to control and coordinate everything on the telescope from I.F. electromechanical switches to the antenna servo system. Understanding how a manager works is understanding how software controls the GBT.

An Analogy:

Each manager contains a set of values called parameters which may be set directly by the user, computed from other parameters, and/or loaded into the device itself. A useful analogy is a spreadsheet program where a manager is the spreadsheet and the parameters are the cells of the spreadsheet. A user defines the setup or use of a device by filling in values in the "higher-level" cells and the spreadsheet computes or translates these into the values needed by the device's various registers, RAMs, controllers, or other device interfaces. Control of the device is a matter of filling in values of the various cells. As far as user control is concerned, the only difference between devices is the set of cells or parameters defined within its spreadsheet or manager.
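
The spreadsheet analogy can be sketched in a few lines of Python. This is an illustration only, not YGOR code: the class ToyManager, the parameter names centerFreqMHz, loMultiplier, and synthFreqMHz, and the formula relating them are all hypothetical.

```python
class ToyManager:
    """Toy 'spreadsheet': parameters are cells, some computed from others."""

    def __init__(self):
        self.values = {}      # cell name -> current value
        self.formulas = {}    # derived cell name -> (input names, function)

    def define(self, name, inputs, func):
        # Register a derived cell that is recomputed from its inputs.
        self.formulas[name] = (inputs, func)

    def set(self, name, value):
        # The user fills in a higher-level cell; dependents are recomputed.
        self.values[name] = value
        self._recompute()

    def _recompute(self):
        # One pass suffices here because this toy has no chained formulas.
        for name, (inputs, func) in self.formulas.items():
            if all(i in self.values for i in inputs):
                self.values[name] = func(*(self.values[i] for i in inputs))


mgr = ToyManager()
# Hypothetical low-level cell derived from two user-facing cells.
mgr.define("synthFreqMHz", ["centerFreqMHz", "loMultiplier"],
           lambda freq, mult: freq / mult)
mgr.set("loMultiplier", 4)
mgr.set("centerFreqMHz", 8400.0)
print(mgr.values["synthFreqMHz"])
```

The user only ever touches the higher-level cells; the derived value is what would eventually be loaded into the device.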

Common Parameters:

There exists a subset of parameters which are built into all managers.  The most interesting common parameters are:

startTime
specifies when a scan starts. This may be entered manually, but usually is computed by the system as the "earliest guaranteed start time."
scanLength
specifies the length of the scan.
scanNumber
an integer which is incremented on every scan.
projectId
a character string as specified by the observing schedule.
scanId
an arbitrary character string defined by the observer.
state
the current state of the manager, which is one of the following: Off, Standby, Operating, Ready, Activating, Committed, Running, Stopping, or Aborting.
status
Summarizes the severity of the messages for the manager and any managers it coordinates. Can be one of the following: Clear, Info, Notice, Warning, Error, Fault, or Fatal.
 

Synchronous and Non-synchronous Managers:

The various device subsystems that make up the GBT may be classified into two broad categories. First there are devices which sequence through a set of events or actions as a scan proceeds. These synchronous devices must respond to the onset of a scan and synchronize their activity to the scan as it progresses. Examples are backends, the tracking LO, and the antenna. Second are non-synchronous devices which merely have to be in the correct state by the beginning of a scan and remain so. Examples are electromechanical switches, static LOs, and receivers. Nominally, synchronous managers and their associated software may be thought of as "real-time" systems, while non-synchronous managers are not.

Manager Commands:

Managers differ in their sets of parameters, but there exists a set of commands common to all managers, which may be accessed via widgets or menus in CLEO. Most of these control scans and thus the sequencing of manager states. A manager may be placed in the Off, Standby, or Ready state by the commands off, standby, and on, respectively. Off causes the manager, and indeed all software, to disconnect completely from the hardware. So, to turn off the hardware without triggering a stream of error messages as the software detects that it is not getting the correct feedback from its digital interfaces, one would issue off before turning the power off. Standby allows the software to continue monitoring the hardware, but not to start scans or respond to any commands other than on or off. Ready denotes that the system is ready to start a scan.

The commands start and prepare are the manager's activate commands, which cause the software to actually command the hardware. The command start causes a scan to begin, while prepare causes the manager to do everything possible to get ready for a scan (as specified by its parameters) short of actually starting the scan. The command start will cause a synchronous manager to pass from the state Ready through Activating and Committed to Running, and a non-synchronous manager to pass from the state Ready through Activating back to Ready. The values specified in the parameters are not loaded or used by the system until one of the activate commands is invoked. In fact, one can return the parameters to the values they had when the last activate command was given with the command revert. A manager's activate commands can be invoked only from the Ready state and only if none of its parameters contains an illegal value (see parameter attributes below). The commands stop and abort cause a manager to return to Ready. The command abort is the much stronger of the two and should not be used unless there are operational problems. Another command of interest is conform, which forces the re-computation of all parameters, as opposed to only those whose antecedent cells have been modified.
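
The state sequencing described above can be summarized in a small Python sketch. It encodes only the transitions stated in the text; the function and variable names are hypothetical, and the real managers certainly involve more states and checks than this.

```python
# Command -> resulting state for the three simple commands named in the text.
SIMPLE_COMMANDS = {"off": "Off", "standby": "Standby", "on": "Ready"}


def start_path(synchronous):
    """States a manager passes through after a start command."""
    if synchronous:
        return ["Ready", "Activating", "Committed", "Running"]
    # A non-synchronous manager returns to Ready once it has activated.
    return ["Ready", "Activating", "Ready"]


def accepts_activate(state, any_illegal_parameter):
    # start and prepare are accepted only from Ready, and only when
    # no parameter holds an illegal value.
    return state == "Ready" and not any_illegal_parameter


print(" -> ".join(start_path(synchronous=True)))
print(" -> ".join(start_path(synchronous=False)))
```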

Parameter Characteristics:

As commands are sent to a manager, and parameter values are set, computed, and loaded into the system (activated), each parameter acquires various attributes. The following two lists provide short descriptions of parameter types and attributes.

There are three types of parameters: control, feedback, and auto.

control
Control parameters are most analogous to the cells of a spreadsheet described above. They are either set directly by the user or are computed from the values of other control parameters. They may be passed into the system upon activate commands. They are the basic working values of the system.
feedback
Feedback parameters exist merely as a means for the manager to pass values back to the user. Their values may be set at any time by the system, but may not be modified by the user.
auto
Control parameter values are only held in memory until an activate command is issued. Auto parameters, on the other hand, are activated immediately upon any change of their value. In addition, their values are never computed from the values of other parameters, but may only be set directly by the user. These parameters exist for controlling aspects of devices that are scan independent, for example the definition or periodicity of monitor points.

An example of how these types are used is the parameter used for setting the attenuators that control the input levels to the A/Ds in the downconverter drawers of the Spectral Processor. The value of the attenuator parameter is not dependent on any other parameters and the attenuators themselves are set during balancing. However, we would like the user to be able to set the attenuators manually. In all states, except Ready, the attenuator parameter is defined as a feedback parameter, so whenever the system changes the value of the attenuator, that value may be reflected back to any user-interface program. Therefore, the parameter performs only monitoring duties. In Ready however, the attenuator parameter becomes an auto parameter so the user can directly set the values of the attenuators independent of the operation of balancing. This scheme works because balancing takes place during Activating.
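
The difference between the three parameter types comes down to what happens when a user sets a value. The following Python sketch illustrates that under simplifying assumptions; ToyParameter and its attribute names are hypothetical and do not correspond to actual YGOR classes.

```python
class ToyParameter:
    def __init__(self, kind, value):
        self.kind = kind          # "control", "feedback", or "auto"
        self.proposed = value     # value held in memory
        self.active = value       # value last loaded into the (pretend) device

    def user_set(self, value):
        if self.kind == "feedback":
            # Feedback values are set by the system, never by the user.
            raise PermissionError("feedback parameters are read-only for users")
        self.proposed = value
        if self.kind == "auto":
            # Auto parameters are activated immediately on any change.
            self.active = value

    def activate(self):
        # Control parameters reach the device only on start or prepare.
        self.active = self.proposed


control = ToyParameter("control", 10)
control.user_set(20)
print(control.active)   # still the old value until an activate command
control.activate()
print(control.active)
```

In the attenuator example above, the same parameter would behave as "feedback" in most states and as "auto" in Ready.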

Some of the important attributes of parameters are:

touched
As previously mentioned, a Control parameter, when modified, is held in memory until an activate command is issued. Setting a Control parameter's value sets the touched attribute, and an activate command clears it. In other words, touched indicates that the parameter is holding a proposed value which will either be used on the next activate command or dropped on the next revert command.
manual
This is an additional attribute to touched which indicates the new value was set directly, i.e., not computed from other parameters.
activated
This is a transient attribute that occurs during Activating for only those parameters which are directly used by the device's various registers, RAMs, controllers, or other device interfaces.
primary
Primary parameters are akin to primary colors. This is the minimum set of parameters which fully define the state of the manager, i.e., all other parameters may be computed from them.
illegal
Parameter values cannot take on just any value. The manager protects its associated device by checking all values and marking those having bad values as illegal. A parameter value cannot be computed if its formula depends on a parameter with an illegal value, nor will a manager accept either activate command if any of its control parameters have an illegal value.
fault
If an individual parameter cannot be successfully used because of interface problems during Activating, it acquires a fault attribute until the interface operation succeeds.
off
This attribute is set when the manager cannot determine the value of a parameter because the manager is in the Off state.
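
The interplay of the touched, manual, and illegal attributes with the activate and revert commands can be sketched as follows. This is a simplified model, not YGOR code: the class name is hypothetical, the legality check is assumed to be a simple range test, and clearing manual together with touched is an assumption.

```python
class ControlParameter:
    def __init__(self, value, low, high):
        self.active = value       # value in force since the last activate
        self.proposed = value     # value waiting for the next activate
        self.low, self.high = low, high   # assumed legality range
        self.touched = False
        self.manual = False

    @property
    def illegal(self):
        return not (self.low <= self.proposed <= self.high)

    def set_by_user(self, value):
        self.proposed = value
        self.touched = True       # holding a proposed value
        self.manual = True        # set directly, not computed

    def activate(self):
        if self.illegal:
            raise ValueError("activate refused: parameter has an illegal value")
        self.active = self.proposed
        self.touched = self.manual = False

    def revert(self):
        # Drop the proposed value and return to the last activated one.
        self.proposed = self.active
        self.touched = self.manual = False


p = ControlParameter(5, low=0, high=10)
p.set_by_user(50)
print(p.touched, p.illegal)
p.revert()
print(p.proposed, p.touched)
```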
 

Coordination:

Each manager basically works independently except for the computation of a common start time for a scan. Every synchronous manager is able to compute its earliest guaranteed start time. This time may be used by another manager, one designated to coordinate other managers, which then computes the start time for the entire set of managers it is managing. That time is then commanded to all of its managed managers as the actual start time. The analogy is the old movie motif of planning a surprise attack, where the soldiers get together beforehand and are given their individual assignments, synchronize their watches, report when they can be in position to "attack", and then agree when the attack will begin. The same principle is used to coordinate the beginning of a scan. Notice that once everyone has agreed on what takes place and when, there is no need for further communication between the individuals; likewise, there is no software interaction between managers during a scan.
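
One plausible way to picture the start-time negotiation is that the coordinating manager takes the latest of the earliest guaranteed start times it receives, since no scan can begin before the slowest participant is ready. The text does not spell out the actual rule, so the max() below is an assumption, and the manager names and times (in seconds from now) are invented for illustration.

```python
def coordinated_start_time(earliest_start_times):
    """Pick a common start time that every managed manager can meet.

    Assumes the rule is simply 'wait for the slowest manager'.
    """
    return max(earliest_start_times.values())


# Hypothetical reports from three synchronous managers.
reports = {"antenna": 12.0, "trackingLO": 3.5, "backend": 8.0}
common = coordinated_start_time(reports)

# The same time is then commanded back to every managed manager.
commanded = {name: common for name in reports}
print(commanded)
```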

Accessor, Registries, and Samplers:

Throughout the YGOR programs are software "test points" or samplers which periodically read and time-tag values which can be tapped into by one of two programs: registries or the accessor. Registries accept sampler values continuously for purposes of creating log files. For example, the weather station software exists primarily to provide samplers for the WeatherRegistryMgr which keeps a perpetual log of recorded weather readings. The accessor, on the other hand, provides "on-demand" access to all samplers. For example, whenever a CLEO application displays a monitor component, it establishes a connection to the accessor to get a stream of sampled values which are displayed in the widget on the screen. 
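
The relationship between samplers, registries, and the accessor resembles a simple publish/subscribe arrangement. The Python sketch below illustrates the idea only; the Sampler class, the listener mechanism, and the sampler name windSpeed are hypothetical and say nothing about how YGOR actually transports sampled values.

```python
import time


class Sampler:
    """A software 'test point' that time-tags values for its listeners."""

    def __init__(self, name):
        self.name = name
        self.listeners = []

    def sample(self, value, timestamp=None):
        # Time-tag the value and hand it to every connected listener.
        tagged = (time.time() if timestamp is None else timestamp, value)
        for listener in self.listeners:
            listener(self.name, tagged)


log = []        # stands in for a registry's log file
display = []    # stands in for a monitor widget fed by the accessor

wind = Sampler("windSpeed")

# A registry is connected continuously and logs everything.
wind.listeners.append(lambda name, s: log.append((name, s)))
wind.sample(3.2, timestamp=0.0)

# The accessor connects on demand, e.g. when a display is opened.
wind.listeners.append(lambda name, s: display.append(s))
wind.sample(3.5, timestamp=1.0)

print(len(log), len(display))
```

The registry sees every sample, while the on-demand listener sees only those taken after it connected.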

Messages, Manager Status, and messageWindow program:

Messages -- strings of text -- are generated throughout the YGOR programs to describe all types of expected and unexpected events. All messages from all YGOR programs are sent via Ethernet to a single program (currently messageMux). Every message has associated with it a severity level, the device that produced it, the time the event occurred, and the time it was cleared.

A user starts a YGOR program, messageWindow, to view the messages. To start messageWindow, you must specify the computer on which messageMux is running. For example:

sample% messageWindow -h vega

will use the messageMux found on vega. The messageWindow currently in use is a curses program.

In addition to messages being displayed in a messageWindow, the level of the most severe message from a manager is stored in the manager's status parameter. There are six levels of messages and manager status parameters:

Info -
Merely provides parenthetical text for expected events or for debugging purposes. In many cases these should be viewed as temporary mechanisms until appropriate means are developed for displaying system status or confidence in the system is built up over time. A small set of these will be retained for the final system to provide context for the message log.

Examples:

  • The reporting of the generation of a data record for a backend.
  • A scan is in progress.
Notice -
This level reports any unexpected events that are not reported at the more severe levels. This level includes illegal actions by users.

Examples:

  • The user enters an illegal value for a control parameter.
  • The one pps in the timing center has drifted past a threshold value. It is expected this event will occur from time to time even when the device is in good working order.
Warning -
This level provides a description of an unexpected or possible problem-causing event which requires more careful monitoring or investigation by the operator.

Examples:

  • An antenna surface panel's temperature has an unreasonable reading.
  • The cryo temperatures have climbed beyond a specified threshold.
  • The antenna is pointing within a specified distance from the sun.
Error -
An event has occurred which requires a specific action by the operator, such as notifying the contact engineer.

Examples:

  • An off-line receiver is warming up.
  • A cryo compressor has failed and a spare should be connected.
  • The A Rack of the Spectral Processor fails to start a scan on time.
Fault -
This level describes events which cause the system, at least in part, to generate bad observational data or which prevent the completion of actions.

Examples:

  • The cryo temperatures of the active receiver have climbed to the point there is little hope of detecting a signal.
  • The tracking LO has gotten out of sync with the sig/ref signal.
  • The antenna is more than N beamwidths off the commanded track.
  • A failure of a digital interface.
  • The A Rack of the Spectral Processor fails to start.
  • An LO repeatedly fails to stay in lock.
Fatal -
This level describes an unexpected, problem-causing event to which the system has responded with some action, or which requires action by telescope personnel because equipment or personnel are in danger. It also covers events from which the software cannot recover without being restarted.

Examples:

  • The commanded track has driven the telescope into a limit.
  • The active surface is shutting down because an emergency stop was initiated.
  • The temperature in a rack of equipment is high enough that the equipment needs to be powered down.
  • The wind speed is high enough to cause the automatic stowing of the antenna.
  • Some sub-task of a system has crashed.
  • An "impossible" software condition occurs, a la assertion.

There are two types of events described by messages: transient and state. Transient events are indicated merely by providing the time of the event followed by the string (TRANS). State events are indicated by two times: asserted and cleared.
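
Because a manager's status holds the level of its most severe message, the levels can be modeled as an ordered enumeration. The Python sketch below assumes the ordering in which the levels are listed above, with Clear meaning that no messages are outstanding; the names Severity and manager_status are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    CLEAR = 0     # assumed to mean: no outstanding messages
    INFO = 1
    NOTICE = 2
    WARNING = 3
    ERROR = 4
    FAULT = 5
    FATAL = 6


def manager_status(outstanding):
    """Return the level of the most severe outstanding message."""
    return max(outstanding, default=Severity.CLEAR)


print(manager_status([Severity.INFO, Severity.FAULT, Severity.WARNING]).name)
print(manager_status([]).name)
```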

Glish and segeste:

The Glish and segeste interpreters are the command-line and GUI interfaces to the managers and the Accessor. Glish and its GUI counterpart, Glish/Tk, are the software behind not only the astronomer's interface developed by Rick Fisher, but Roger Norrod's Mockup test utilities and AIPS++ as well. Glish is solely an NRAO product.

Segeste is an NRAO extension of the popular Tcl/Tk programming language. CLEO, the engineers' and operators' interface to YGOR, is based on segeste and Tcl/Tk.

Engineering Programs:

TaskMaster:

The TaskMaster program controls all of the YGOR programs running on a specified computer. These programs run continuously and provide a number of services from monitoring receiver temperatures, to logging the weather station readings, or to running the Spectral Processor. One can use TaskMaster to query the status of all or a specific process by passing it the query event, e.g.,

sample% TaskMaster gemini query

For each process you will get output like the following:

Process:      messageMux
Arguments:
Path: bin
Notify: jbrandt
Description: Message system server process
MaxFileSize: 102400
ProcessNum: 1
ProcessID: 11534
LastStatus: 0
EventCount: 0
LastAction: Start
LastStartTime: 50188 15:18:13
LastEventTime: 0 0:00:00

Some fields of interest are:

Process
name of the program
Notify
Whenever a process is terminated by any means, this person is notified by mail. Also, this is the person responsible for this process if you have questions or comments.
ProcessID
The process identifier (PID) as reported by the ps(1) command.
EventCount
Each TaskMaster command (except query) for this process generates an event. This field counts the number of events.
LastAction
Each TaskMaster command (except query) for this process generates an event. This field displays the last event received.
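
If the query output needs to be inspected from a script, each record can be read as simple "Field: value" lines. The following Python sketch assumes that layout, based only on the sample shown above; parse_query_record is a hypothetical helper, not part of TaskMaster.

```python
def parse_query_record(text):
    """Turn one 'Field: value' record into a dictionary of strings."""
    record = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so times like 15:18:13 stay intact.
        key, _, value = line.partition(":")
        record[key.strip()] = value.strip()
    return record


sample = """\
Process:      messageMux
Arguments:
Notify: jbrandt
ProcessID: 11534
LastAction: Start
LastStartTime: 50188 15:18:13
"""

record = parse_query_record(sample)
print(record["Process"], record["ProcessID"], record["LastAction"])
```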

TaskMaster processes may be set up to restart automatically, which is indicated by the "LastAction" field containing the word "Start", or to restart only on command, which is indicated by the "LastAction" field containing the word "RunOnce". Processes whose state must be redefined by the user after an unexpected exit need to be restarted manually so the user can properly initialize the program. Processes which require no input from the user can restart automatically.

The commands or events that may be sent via TaskMaster to a process are query, stop, start, suspend (like control-Z), and continue (the opposite of suspend). For example, to suspend and then continue the logging of weather data:

sample% TaskMaster gemini suspend WeatherRegistryMgr
sample% TaskMaster gemini continue WeatherRegistryMgr

or, to stop and restart the Spectral Processor control and data acquisition process:

sample% TaskMaster gemini stop spcda
sample% TaskMaster gemini start spcda

Debug Monitor and remoteWindow:

The Debug Monitor is another access method that may be employed to monitor the values in a device. The Debug Monitor is accessed from a remote computer using the remoteWindow command. 

 
The remoteWindow program is run with the following arguments:

  sample% remoteWindow WindowName ControllerName Rate

For example:

sample% remoteWindow Rcvr8_10PowerSupply receivers 10

Where:
remoteWindow
is the command name
Rcvr8_10PowerSupply
is the debug window to monitor
receivers
is the internet name of the single-board controller
10
is the sample rate, an integer from 1 to 100 in units of 100 ms: 1 = 100 ms and 100 = 10 s, so the value 10 used here gives one sample per second.

This will produce a display on the terminal window of the 3 power supply voltages of the X band receiver.
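
Reading the Rate argument as a count of 100 ms ticks between samples (an interpretation of the description above, not a documented formula), the update period can be worked out as in this Python sketch. The function name is hypothetical.

```python
def sample_period_seconds(rate):
    """Convert a remoteWindow Rate value (1 to 100) to seconds per sample.

    Assumes one Rate unit equals 100 ms.
    """
    if not 1 <= rate <= 100:
        raise ValueError("Rate must be between 1 and 100")
    return rate / 10.0


for rate in (1, 10, 100):
    print(rate, sample_period_seconds(rate))
```

Under this reading, the example command above, with a Rate of 10, refreshes the display once per second.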


 


Copyright © 2000 Associated Universities, Inc. Washington D.C., USA
Modified: 18 September, 2002 by Ronald J. Maddalena