9.7. Service of the Database objects
In the initial phase of the module's work, when the Network Information Database
does not yet contain any objects discovered by dnmmsd, the discovery procedure can be started in two ways:
by polling a particular device specified as one of the run-time options
(--node-2-discover option);
by discovering the local network in which the management station works.
The discovery procedure is skipped for addresses listed in the
configuration of addresses excluded from discovery.
Devices are discovered by sending SNMP requests
to particular IP addresses. These are the successive IP addresses belonging
to a network selected by the operator as a managed one.
Note: apart from the local network, no other network is scanned automatically
without a decision of the system administrator.
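The selection of addresses to poll can be sketched as follows. This is only an
illustration, not the dnmmsd implementation: the function name
discovery_candidates, the example network and the format of the exclusion list
(single addresses or CIDR ranges) are assumptions, and the SNMP request itself
is not shown.

```python
import ipaddress

def discovery_candidates(network, excluded):
    """Yield the host addresses of `network` that are not excluded from discovery.

    `excluded` is a hypothetical list of single addresses or CIDR ranges.
    """
    net = ipaddress.ip_network(network)
    excluded_nets = [ipaddress.ip_network(entry) for entry in excluded]
    for host in net.hosts():
        # Skip addresses listed in the exclusion configuration.
        if any(host in ex for ex in excluded_nets):
            continue
        yield str(host)

# Example network and exclusions (documentation address range, assumed values).
candidates = list(discovery_candidates("192.0.2.0/29", ["192.0.2.2", "192.0.2.4/31"]))
print(candidates)
```

Each address returned here would then be polled with an SNMP request.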
SNMP requests, both to newly discovered devices and to devices already included
in the Database, are sent with a community taken from the
configuration of SNMP communities. For each discovered device, a process similar
to the one performed at the moment of its discovery is repeated at a specified
interval. This interval is one of the attributes of the state group to which a given object belongs.
During this process all knowledge about the device configuration stored in the Database is updated.
By default the process is performed every hour. It may occur earlier if dnmmsd detects
any error in responses from a given device.
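The timing rule above can be expressed as a small check. This is a sketch under
assumptions: the name update_due and its parameters are hypothetical, and it is
assumed that an error in a response makes the update due immediately, which the
text does not state precisely.

```python
DEFAULT_UPDATE_PERIOD = 3600.0  # seconds; the default of one hour

def update_due(now, last_update, period=DEFAULT_UPDATE_PERIOD, error_seen=False):
    """Return True when the device configuration should be read again."""
    if error_seen:
        # Assumption: an error in a device response triggers an early update.
        return True
    return now - last_update >= period
```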
When a newly discovered device is added, all its network interfaces and
BGP peers (if it has any) are added to the Database. The device is also scanned
for all managed item groups that are active and permitted for its IP address. The IP address of one of
the device's network interfaces is also added to the ping objects (an IP address
corresponding to the device name given as sysName is preferred).
When information about a device is updated, all its network interfaces, BGP peers
and managed items discovered using managed item groups are checked. Newly discovered
interfaces, BGP peers and managed items are added, and those that can no longer be
found are marked as removed.
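The comparison made during an update can be sketched as a set difference. The
function name reconcile and the use of plain item names as keys are assumptions
made for this example only.

```python
def reconcile(stored, discovered):
    """Compare the items stored in the Database with those found on the device."""
    stored, discovered = set(stored), set(discovered)
    return {
        "added": sorted(discovered - stored),    # newly discovered items
        "removed": sorted(stored - discovered),  # items to be marked as removed
        "kept": sorted(stored & discovered),     # items that are only checked
    }

result = reconcile(["eth0", "eth1", "eth2"], ["eth1", "eth2", "eth3"])
print(result)
```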
Besides the periodic update of each monitored device's configuration,
the states of network interfaces, BGP peers, managed items and ping objects
are checked more often. That period is specified as one of the fields of
the state group to which a given item belongs. Network interfaces and
BGP peers are queried for their operational and administrative states.
For managed items, the variables indicated by the user in the
parent managed item group definition for a given item are checked.
Ping objects are monitored by sending five short ICMP-ECHO packets to a
given IP address at 1-second intervals. The state of a given item is
calculated from the received responses. For network interfaces and
BGP peers, an UP (ESTABLISHED)
state is treated as correct when the administrative state is
UP too. If the administrative state is DOWN, the item state is
always positive. For managed items, which answer values are treated
as positive depends on the definition of the managed item group
from which the items originate.
The health of ping objects is measured as the percentage of
responses received for the ICMP-ECHO packets sent. The situation is positive
when this percentage equals 100%.
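The rules for network interfaces, BGP peers and ping objects can be summarised
in a short sketch. The function names and the string values used for the states
are assumptions. How a non-positive result is mapped to a particular fault
level is not covered here.

```python
def link_state_ok(admin_state, oper_state):
    """Rule for network interfaces and BGP peers."""
    if admin_state.upper() == "DOWN":
        # An administratively disabled item is always treated as positive.
        return True
    return oper_state.upper() in ("UP", "ESTABLISHED")

def ping_health(sent, received):
    """Percentage of responses received for the ICMP-ECHO packets sent."""
    return 100.0 * received / sent

def ping_ok(sent, received):
    """The situation is positive only when every packet was answered."""
    return received == sent

print(link_state_ok("UP", "DOWN"), ping_health(5, 4), ping_ok(5, 4))
```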
If, after the item state is calculated, it differs from
the state just before the procedure, the change is propagated to the upper
(parent) levels, provided the item is a network interface or a BGP peer.
Similar calculations continue on higher levels of parent objects
for as long as states keep changing. The following states are permissible:
Not managed - an item is not managed;
Ok - an item state is correct;
Minor fault - an item state is not perfectly correct;
Major fault - an item fault is considerable;
Critical - an item is in a critical state;
Was minor fault - an item state was not perfectly correct but it is better now;
Was major fault - an item fault was considerable but it is in a better state now;
Was critical - an item was in a critical state but it is in a better state now;
Unknown - an item state is unknown;
Error - an error occurred for a particular item (this state should not last long);
Deleted - an item was not found in a current device configuration.
The state of a given item is not modified and does not affect its higher-level items
(its parent or owner) when the item's Is passive flag is set.
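Upward propagation and the effect of the Is passive flag can be sketched as
follows. This is an illustration under stated assumptions, not the dnmmsd
algorithm: the Item class and set_state function are hypothetical, only four of
the states are modelled, and the rule that a parent takes the worst state of
its non-passive children is an assumption, since the text does not define how a
parent state is derived.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed ordering of a subset of the states, from best to worst.
SEVERITY = {"Ok": 0, "Minor fault": 1, "Major fault": 2, "Critical": 3}

@dataclass
class Item:
    name: str
    state: str = "Ok"
    is_passive: bool = False
    parent: Optional["Item"] = field(default=None, repr=False, compare=False)
    children: List["Item"] = field(default_factory=list)

    def add_child(self, child):
        child.parent = self
        self.children.append(child)

def set_state(item, new_state):
    """Set a state and propagate it upward; return names of changed items."""
    changed = []
    if item.is_passive or item.state == new_state:
        return changed  # passive items are neither modified nor propagated
    item.state = new_state
    changed.append(item.name)
    node = item.parent
    while node is not None and not node.is_passive:
        # Assumed rule: a parent takes the worst state of its non-passive children.
        states = [c.state for c in node.children if not c.is_passive]
        worst = max(states, key=SEVERITY.__getitem__, default="Ok")
        if worst == node.state:
            break  # the state no longer changes, so propagation stops
        node.state = worst
        changed.append(node.name)
        node = node.parent
    return changed

network, device = Item("network"), Item("device")
if0, if1 = Item("if0"), Item("if1", is_passive=True)
network.add_child(device)
device.add_child(if0)
device.add_child(if1)
print(set_state(if0, "Critical"))
print(set_state(if1, "Critical"))
```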
A log message is written to a particular table in response to each change
of a network interface state, BGP peer state, managed item state or ping object state.
An additional way of signalling alarm situations exists for each item.
For each network interface, BGP peer and managed item that is in the Critical
state, the following script is run every 5 minutes:
netinterface-alarm.sh for network interfaces, bgppeer-alarm.sh
for BGP peers and mitem-alarm.sh for managed items.
For a ping object, when the percentage of answers is less than 100%, the program
ping-alarm.sh is run.
It is run periodically at a specified interval until
the state returns to OK. The interval is one of the attributes of the ping object
group to which the ping object belongs. How to configure ping object
alarms is described in Network Management Map (xdnmm).
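The choice of alarm script can be sketched as a simple lookup. The script names
come from the description above, while the function names, the item kind
strings and the parameters are assumptions. The sketch only selects a script
and its repeat interval; it does not run anything, and the arguments passed to
the scripts are not covered.

```python
ALARM_SCRIPTS = {
    "network interface": "netinterface-alarm.sh",
    "bgp peer": "bgppeer-alarm.sh",
    "managed item": "mitem-alarm.sh",
    "ping object": "ping-alarm.sh",
}

CRITICAL_REPEAT = 300  # seconds; scripts for Critical items run every 5 minutes

def alarm_script(kind, state="Ok", health=100.0):
    """Return the name of the script to run for an item, or None."""
    if kind == "ping object":
        # Ping objects raise an alarm whenever some answers are missing.
        return ALARM_SCRIPTS[kind] if health < 100.0 else None
    return ALARM_SCRIPTS[kind] if state == "Critical" else None

def repeat_interval(kind, ping_group_period):
    """Ping objects use the period of their ping object group."""
    return ping_group_period if kind == "ping object" else CRITICAL_REPEAT

print(alarm_script("bgp peer", state="Critical"))
print(alarm_script("ping object", health=80.0))
```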