4.2. Server Configuration

4.2.1. Configuration File Format

For simplicity, the QOS Server configuration file uses Python's data structure declaration format. All of the entries are either elements in a Python dictionary or a Python list of dictionaries. Dictionaries are delimited by curly braces with name: value pairs. Lists are delimited by square brackets. In both cases, elements are separated by commas. As the configuration is an executable Python script, the config can be loaded into Python to test for validity.
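One way to run the validity test mentioned above is to parse the file as a Python literal. The following is a minimal sketch, assuming the file contains nothing but a single dictionary literal (with optional comments). The function name check_config and the sample text are illustrative and not part of the QOS Server.

```python
import ast

def check_config(text):
    """Parse config text as a Python literal and return the top-level dict.

    Raises SyntaxError or ValueError if the text is not a valid literal,
    and TypeError if the top-level object is not a dictionary.
    """
    config = ast.literal_eval(text.strip())
    if not isinstance(config, dict):
        raise TypeError('top-level configuration must be a dictionary')
    return config

# Illustrative sample only; a real file carries the full set of sections.
sample = """
{
  'Qos': { 'version': '1.0' },   # hypothetical value
  'Notify': { 'default': { 'normal': ['noc@example.com'] } },
}
"""

config = check_config(sample)
print(sorted(config))
```

Using ast.literal_eval rather than executing the file means a malformed or unexpected expression is rejected instead of being run.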

Note that the entire configuration file is one large dictionary with five primary keys, one each for the core server settings, notifications, groups, schedules, and hosts.

Each key is described in its own section below.

4.2.2. Core Server Configuration

The 'Qos' section describes basic server configuration items.

Basic Server Configuration

version

Sets the displayed version for the Server. Cosmetic.

defaultDomain

Sets the domain used to send out notifications. Notifications will be sent from qos@defaultDomain.

userpass

Sets the username & password combinations for the administration webpage and the report webpage. The value should be a dictionary of username: password combinations.

sendmailPath

Sets the path and filename of the sendmail interface to the local email system. This is required to send email notifications.
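As an illustration, a 'Qos' section using the four keys above might look like the following sketch. All values are placeholders chosen for this example (the version string, the domain, the account, and the sendmail location are assumptions), and the surrounding variable name exists only so the snippet can be loaded on its own.

```python
# Hypothetical values throughout; only the key names come from this section.
example_config = {
    'Qos': {
        'version': '1.0',                       # cosmetic display string
        'defaultDomain': 'example.com',         # mail is sent from qos@example.com
        'userpass': {'admin': 'changeme'},      # username: password pairs
        'sendmailPath': '/usr/sbin/sendmail',   # assumed sendmail location
    },
}

# The notification sender address is built from defaultDomain.
sender = 'qos@' + example_config['Qos']['defaultDomain']
print(sender)
```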

4.2.3. Notification Configuration

The Notification Configuration defines a set of groups to which the Server sends notifications when problematic conditions exist. A group lists the chain of escalation to follow when an alert has been sent and has not been handled within a set amount of time. A special group name, default, sets the group to use if no other group is specified in the Entity configuration.

To set an alternate notification group for a particular Host, add an entry of type Host with a notify key whose value is the desired group. To change an Entity, add a notify key to the Entity's dictionary, and likewise for SubEntities and for individual Triggers. notify keys propagate downward automatically and can be overridden at any lower level as needed.

An example of setting a default notify group for an entire Agent:

'omourov' :
[
{ 'type'    : 'Host' ,
  'notify'  : 'admins',
},
# rest of host config ...
],

An example of setting a notify group for a particular Entity:

'omourov' :
[
{ 'type'    : 'Cpu' ,
  'status'  : 'on' ,
  'notify'  : 'admins' ,
  'triggers':  [{ 'level' : 'info', 'trigger' : '10'}],
},
# rest of host config ...
],

An example of setting a notify group for a specific SubEntity:

{ 'type'    : 'Tcp' ,
  'subents' : [
                {
                  'status'     : 'on' ,
                  'name'       : 'mx1.example.com',
                  'tcpHost'    : '12.34.56.78',
                  'tcpPort'    : '25',
                  'goodPhrase' : '220 example.com ESMTP',
                  'notify'     : 'admins',
                  'triggers'   : [{ 'level' : 'warn', 'trigger' : '5' }, ],
                }
              ]
}

And finally, an example of setting multiple Triggers with different notification groups:

{ 'type'    : 'Disk' ,
  'status'  : 'on' ,
  'triggers': [{ 'level' : 'info',     'notify': 'noc', 'trigger' : '90'}, 
               { 'level' : 'critical', 'notify': 'admins', 'trigger' : '95'}, ],
},
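The override order illustrated by the examples above (Host, then Entity, then SubEntity, then Trigger) can be sketched as a simple lookup. This is an illustration of the propagation rule only, not the Server's actual code. The function name effective_notify and its arguments are hypothetical.

```python
def effective_notify(host_entries, entity, subent=None, trigger=None):
    """Return the notify group that applies, falling back level by level."""
    group = 'default'  # used when nothing more specific is set
    # Host-level setting: an entry whose type is 'Host'.
    for entry in host_entries:
        if entry.get('type') == 'Host' and 'notify' in entry:
            group = entry['notify']
    # More specific levels override less specific ones, in this order.
    for level in (entity, subent, trigger):
        if level and 'notify' in level:
            group = level['notify']
    return group

disk = {'type': 'Disk', 'status': 'on',
        'triggers': [{'level': 'info', 'notify': 'noc', 'trigger': '90'},
                     {'level': 'critical', 'trigger': '95'}]}
host = [{'type': 'Host', 'notify': 'admins'}, disk]

for trig in disk['triggers']:
    print(trig['level'], effective_notify(host, disk, trigger=trig))
```

In this sample the info Trigger names its own group, while the critical Trigger has no notify key and so inherits the Host-level group.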

An example Notification Group configuration might look like this:

 'Notify': {
    'default': { # Off-hours, when non-noc people aren't around
        'normal': ['noc@example.com'],
        'escalate': ['opspager@example.com', 'boss@example.com', 'bosspager@example.com'],
    },
    'weekday': { # when everyone's around
        'normal': ['noc@example.com'],
        'escalate': ['ops@example.com', 'nocmanager@example.com', 'boss@example.com',
                     'exec@example.com'],
    },
    'weekend': { # when noc people aren't around
        'normal': ['ops@example.com'],
        'escalate': ['opspager@example.com', 'bosspager@example.com']
    },
 }
 

Notification Group Variables

escalate

The value is a list of addresses that are sent notifications if an event remains in a paging state. The list advances by one entry for every escalationWait * 5 minutes that the event remains in a trouble state. Escalation emails are "roll-ups" of all Entities in a bad state, so only one mail is sent per interval, listing the number of Entities in a bad state and the message from the first Entity only.

escalationWait

Default group only. The value is an integer number of 5-minute periods that must pass before the next address in the escalate list is added to the recipient list for notifications.

normal

The value is a list of email address(es) to contact when an alert is generated for triggers set to the info notify level, or on the first failure for triggers set to the warn notify level. Ideally this should point to the admin's standard inbox.

page

The value is a list of email address(es) to contact when an alert is generated for triggers set to the critical notify level, or on the second failure for triggers set to the warn notify level. Ideally this should point to a pager or other notification device, as the Entity in question requires attention. Addresses listed under this key get an email for each Entity reporting problems, so they can receive a high volume of emails. Not recommended for most paging services.

post

The value is a list with two items. The first item is a URL to send an HTTP POST request to. The second item, which must be double-quoted, is a dictionary of variable-value pairs to send as the body of the POST request. The values may reference the report dictionary, which contains the information from the Entity that is reporting problems.
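The routing and escalation rules described for normal, page, escalate, and escalationWait can be sketched as follows. This is a rough illustration under stated assumptions, not the Server's implementation: it assumes that warn triggers page on the second and any later failure, and that no escalate address is added until the first full escalationWait * 5 minute interval has passed. Both function names are hypothetical.

```python
def initial_recipients(group, level, failures):
    """Pick the 'page' or 'normal' list for an alert.

    critical, or warn on its second (assumed: or later) failure, goes to
    'page'; info and the first warn failure go to 'normal'.
    """
    if level == 'critical' or (level == 'warn' and failures >= 2):
        return list(group.get('page', []))
    return list(group.get('normal', []))

def escalation_recipients(group, minutes_in_trouble, escalation_wait):
    """Return the escalate addresses reached after the given time."""
    step = escalation_wait * 5            # minutes per escalation step
    reached = minutes_in_trouble // step  # completed steps so far (assumption)
    return group.get('escalate', [])[:reached]

group = {
    'normal': ['noc@example.com'],
    'page': ['opspager@example.com'],
    'escalate': ['opspager@example.com', 'boss@example.com',
                 'bosspager@example.com'],
}

print(initial_recipients(group, 'warn', 1))
print(escalation_recipients(group, 25, 2))
```

With escalationWait set to 2, each step lasts 10 minutes, so an event that has been in trouble for 25 minutes has reached the first two escalate addresses in this sketch.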

4.2.4. Group Configuration

The Groups section contains lists of Entities to consider as a group in the Report module. The group name is the key and the group members are a Python list of strings, each the full name of an Entity.

4.2.5. Schedule Configuration

Related to the Notify section, the Schedule section defines when various groups in the Notify configuration are active. Using this configuration section, you can automatically transfer pages from being sent locally at first to being sent to an oncall pager, or rotate oncall pagers throughout the week.

An example Schedule configuration might look like this:

'Schedules' :
 {
 'default' : [
    [ [0800,1700], 'weekday'], # Mon
    [ [0800,1700], 'weekday'], # Tue
    [ [0800,1700], 'weekday'], # Wed
    [ [0800,1700], 'weekday'], # Thu
    [ [0800,1700], 'weekday'], # Fri
    [ [], 'weekend'], # Sat
    [ [], 'weekend'], # Sun
    ],
 },

The configuration consists of a list of 7 items, each corresponding to a day of the week, starting with Monday. Each day is a list of two items. The first item is a list of the starting and ending times of the schedule, in 24-hour (military) format. If the list is empty, then the schedule is set for the entire day. The second item is the name of the Notify group that is active during that time frame. Outside of the time frame, the default Notify group is used.
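The lookup described above can be sketched as a small function. This is a minimal illustration, not the Server's code: the name active_group is hypothetical, the end time is assumed to be exclusive, and times are written as plain integers such as 800 because a leading zero is not a valid integer literal in current Python. How the Server itself parses the time values is not specified in this section.

```python
from datetime import datetime

# One entry per day, Monday first, matching datetime.weekday().
schedule = [
    [[800, 1700], 'weekday'],  # Mon
    [[800, 1700], 'weekday'],  # Tue
    [[800, 1700], 'weekday'],  # Wed
    [[800, 1700], 'weekday'],  # Thu
    [[800, 1700], 'weekday'],  # Fri
    [[], 'weekend'],           # Sat
    [[], 'weekend'],           # Sun
]

def active_group(schedule, when):
    """Return the Notify group name in effect at the datetime 'when'."""
    window, group = schedule[when.weekday()]
    if not window:
        return group  # an empty window covers the entire day
    start, end = window
    now = when.hour * 100 + when.minute  # e.g. 09:30 becomes 930
    # Assumption: the start is inclusive and the end is exclusive.
    return group if start <= now < end else 'default'

print(active_group(schedule, datetime(2024, 1, 1, 9, 0)))   # a Monday morning
print(active_group(schedule, datetime(2024, 1, 6, 3, 0)))   # a Saturday night
```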

4.2.6. Host Configuration

The Host section defines the configuration for the Agents. Each Agent that is to report data must have a key in this section. The key is the shortened hostname as reported by the Agent; therefore make sure the system hostname is set correctly or unpredictable results may occur. Each key's value is a list of dictionaries, one per Entity, each holding the configuration for that Entity.

A sample Host configuration section looks like the following:

'omourov' :
[
{ 'type'    : 'Cpu' ,
  'status'  : 'on' ,
  'triggers':  [{ 'level' : 'info', 'trigger' : '10'}],
},
{ 'type'    : 'Disk' ,
  'status'  : 'on' ,
  'triggers': [{ 'level' : 'info',     'trigger' : '90'}, 
               { 'level' : 'critical', 'trigger' : '95'}, ],
},
{ 'type'    : 'Mem' ,
  'status'  : 'on' ,
  'triggers': [{ 'level' : 'info',     'trigger' : '80000'}, 
               { 'level' : 'critical', 'trigger' : '100000'}, ],
},
{ 'type'    : 'Proc' ,
  'status'  : 'on',
  'triggers': [{'level':'info', 'trigger':'automount'}],
},
## end of omourov
]

Each list entry in the Host value has three required keys:

type

The name of the Entity described in this entry.

status

Sets whether the Entity is enabled (on) or disabled (off).

triggers

Sets the Triggers and Notify config for this Entity. The value set is dependent on the Entity type.
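A quick check for the three required keys can be sketched as follows. This is an illustrative helper under stated assumptions, not part of the QOS Server: the name entry_problems is hypothetical, and entries that follow a different shape elsewhere in this chapter (for example the Host type entry, or entries that use subents) would need separate handling.

```python
REQUIRED_KEYS = ('type', 'status', 'triggers')

def entry_problems(entry):
    """Return a list of problems found in one Entity entry (empty if none)."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in entry:
            problems.append('missing key: ' + key)
    # status, when present, must be one of the two documented values.
    if 'status' in entry and entry['status'] not in ('on', 'off'):
        problems.append('status must be on or off')
    return problems

good = {'type': 'Cpu', 'status': 'on',
        'triggers': [{'level': 'info', 'trigger': '10'}]}
bad = {'type': 'Cpu', 'status': 'maybe'}

print(entry_problems(good))
print(entry_problems(bad))
```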