Welcome to the official website of Guangzhou Yi'an Electronics Co., Ltd.!

Aug 12, 2024

Monitoring construction in practice: what is learned on paper always feels shallow

We know the problems described above that arise when building a monitoring system. How, then, do we solve them?


1. Lower the barrier to use: out of the box


On server devices, the Agent is installed by default once the system is initialized, so host monitoring data is collected without any configuration. If process information is configured in the CMDB, process status data is collected as well.


2. Say goodbye to the alarm storm


When a host is connected, the default alarm policies are added to it automatically, and an alarm is generated once a threshold is reached. Several default alarm policies are provided.


To keep excessive alarms from disturbing users, we use four tricks to make sure the alarms users receive are meaningful.


Tip one: alarm convergence. Convergence rules effectively reduce the number of notifications produced by alarm events: the ratio of generated events to notifications can reach 100:1 or even higher. Because alarm convergence is natively supported by the design, it effectively prevents alarm storms.
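
The idea of convergence can be illustrated with a minimal sketch. It assumes a rule of the form "send at most one notification per key within a fixed time window"; the class name Converger, the window_seconds parameter and the (host, policy) key are hypothetical and are not taken from the product described here.

```python
from dataclasses import dataclass, field


@dataclass
class Converger:
    """Hypothetical convergence rule: one notification per key per window."""

    window_seconds: float = 300.0
    # Maps a convergence key to the time of the last notification sent for it.
    _last_sent: dict = field(default_factory=dict)

    def should_notify(self, key: tuple, timestamp: float) -> bool:
        """Return True if this event should produce a notification."""
        last = self._last_sent.get(key)
        if last is not None and timestamp - last < self.window_seconds:
            return False  # converged: a notification was already sent in this window
        self._last_sent[key] = timestamp
        return True


def count_notifications(events, window_seconds=300.0):
    """Count notifications for (key, timestamp) events given in time order."""
    converger = Converger(window_seconds)
    return sum(converger.should_notify(key, ts) for key, ts in events)


# 100 events for the same host and policy, one per second.
events = [(("host-1", "cpu"), float(i)) for i in range(100)]
notifications = count_notifications(events)
```

With a 300-second window, the 100 events in this example collapse into a single notification, which matches the order of magnitude mentioned in the text.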


Tip two: alarm suppression. Users often configure several thresholds for the same alarm content because their needs differ. For example, a disk space usage alarm may have policies at 80% (warning), 90% (warning) and 95% (serious). How do these three policies work together? If disk space usage has reached 96%, are three alarm notifications generated? If the monitoring system did send three alarms, users would consider it unintelligent. Therefore, for policies on the same dimension, we only send the alarm with the highest level; in this case only the 95% alarm notification is generated.
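
The disk space example can be sketched as follows. This is only an illustration under stated assumptions: the Policy class, the numeric severity ordering and the policy names are hypothetical, and the real system may decide "highest level" differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    threshold: float  # policy triggers when usage >= threshold (percent)
    severity: int     # assumed ordering: a higher number is more severe


def select_alarm(policies, usage):
    """Return the single most severe policy triggered by `usage`, or None."""
    triggered = [p for p in policies if usage >= p.threshold]
    if not triggered:
        return None
    # Suppression: lower-severity policies on the same dimension stay silent.
    return max(triggered, key=lambda p: p.severity)


DISK_POLICIES = [
    Policy("disk-80", 80.0, 1),
    Policy("disk-90", 90.0, 2),
    Policy("disk-95", 95.0, 3),
]

alarm = select_alarm(DISK_POLICIES, 96.0)
```

At 96% usage all three policies are triggered, but only the policy with the 95% threshold is selected for notification.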


Tip three: alarm summary. Even with alarm convergence and alarm suppression, a large number of alarms can still be sent at the same time. For example, if multiple alarm rules are met simultaneously, an alarm storm may still occur. An alarm summary function is therefore necessary. When many alarms arrive at the same time, alarms of the same dimension are summarized and alarms from different policies are merged. With the alarm summary function, we can receive alarms with confidence.
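
A minimal sketch of summarizing simultaneous alarms, assuming the dimension is the host and that each alarm is a plain dictionary with "host" and "policy" fields. Both the field names and the message format are hypothetical.

```python
from collections import defaultdict


def summarize(alarms):
    """Merge alarms that share a dimension (here: host) into one message each."""
    grouped = defaultdict(list)
    for alarm in alarms:
        grouped[alarm["host"]].append(alarm["policy"])
    return {
        host: f"{host}: {len(policies)} alarm(s): " + ", ".join(sorted(policies))
        for host, policies in grouped.items()
    }


incoming = [
    {"host": "host-1", "policy": "cpu"},
    {"host": "host-1", "policy": "disk"},
    {"host": "host-2", "policy": "memory"},
]
summary = summarize(incoming)
```

Three incoming alarms become two messages, one per host, each listing the policies that fired.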


Tip four: alarm analysis. By mining and analyzing historical alarm data, we can also find abnormal alarms and better evaluate the raw data and alarm thresholds, which provides data support for alarm configuration and for threshold-free alarms.
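
One very simple form of such analysis is counting how often each alarm fired in the history and flagging the most frequent ones as candidates for threshold review. The sketch below assumes alarm records are dictionaries with "host" and "policy" fields and uses a hypothetical min_count cutoff; real analysis would likely be more sophisticated.

```python
from collections import Counter


def noisy_alarms(history, min_count=10):
    """Return (key, count) pairs for alarms that fired at least `min_count` times."""
    counts = Counter((a["host"], a["policy"]) for a in history)
    # most_common() yields keys ordered from most to least frequent.
    return [(key, n) for key, n in counts.most_common() if n >= min_count]


history = (
    [{"host": "host-1", "policy": "cpu"}] * 12
    + [{"host": "host-2", "policy": "disk"}] * 2
)
candidates = noisy_alarms(history)
```

Here only the CPU alarm on host-1 is flagged, suggesting its threshold may be worth revisiting.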


3. Easy function expansion: plug and play


4. User permission control: on-demand authorization


The functions provided by monitoring fall into two basic categories: viewing and management. The typical user scenarios are as follows:


Receivers of alarm notifications: operation and maintenance, development, testing, product staff and so on. They need application-level viewing and alarm-shielding operations.

Configurators of monitoring: operation and maintenance staff, for example. They need application-level viewing and management operations.

Managers of the monitoring platform: they need the functions of the platform as a whole.
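
The three scenarios can be sketched as a role-to-action mapping. The role names, the Action enum and the exact action sets are assumptions chosen to mirror the list above; they are not the platform's actual permission model.

```python
from enum import Enum, auto


class Action(Enum):
    VIEW = auto()            # view application monitoring data
    SHIELD = auto()          # shield (mute) alarms
    MANAGE = auto()          # configure collection and alarm policies
    PLATFORM_ADMIN = auto()  # platform-wide administration


# Hypothetical mapping derived from the three user scenarios.
ROLE_ACTIONS = {
    "receiver": {Action.VIEW, Action.SHIELD},
    "configurator": {Action.VIEW, Action.MANAGE},
    "platform_manager": set(Action),
}


def is_allowed(role, action):
    """Return True if `role` is authorized for `action`; unknown roles get nothing."""
    return action in ROLE_ACTIONS.get(role, set())
```

For example, is_allowed("receiver", Action.SHIELD) is True, while is_allowed("receiver", Action.MANAGE) is False, so each user is granted only what the scenario requires.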


5. Building the cornerstone of automation: efficiency first


Unlike other monitoring systems, collection can be defined directly in the system, without relying on third-party control systems or logging in to servers for deployment. All configuration is done in the interface, including plug-in writing: we only need to open the page to create a plug-in.


When hosts are added to or removed from a module, the plug-in is automatically distributed to the target machines, so the collection plug-in does not need to be deployed manually.


Similarly, the target range of an alarm policy is matched automatically: the policy takes effect for newly added hosts or modules without being edited.
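
Automatic matching can be pictured as resolving a policy's targets at evaluation time rather than storing a fixed host list. In the sketch below the CMDB is modeled as a plain dictionary from module name to host list, and the function name hosts_in_scope is hypothetical.

```python
def hosts_in_scope(cmdb, target_modules):
    """Resolve the hosts a policy currently applies to from its target modules."""
    return sorted({host for module in target_modules for host in cmdb.get(module, ())})


# Hypothetical CMDB content: module name -> hosts in that module.
cmdb = {"web": ["host-1", "host-2"], "db": ["host-9"]}
policy_targets = ["web"]  # the policy targets a module, not individual hosts

before = hosts_in_scope(cmdb, policy_targets)
cmdb["web"].append("host-3")  # a new host joins the module
after = hosts_in_scope(cmdb, policy_targets)
```

The policy itself is never edited; the new host is covered because the scope is recomputed from the module membership.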


6. Summary of monitoring construction: lessons learned from experience


As the monitoring platform is built, its functions gradually mature through continuous practice and experience. However, if building the monitoring platform is the only core goal, problems arise that are not about monitoring itself. How should releases and monitoring be combined, and how should alarms be shielded during a release? How should monitoring be linked with the CMDB? How should the monitoring system be connected with the operation and maintenance automation system, and with the pipeline release system? How should monitoring be connected with every aspect of operations? These problems cannot be solved by a standalone monitoring system; they need a scalable, customizable operation and maintenance platform.
