Auto-Balanced Collector Groups
Last updated on 04 January, 2021
Overview
Traditionally, the assignment of devices to Collectors in LogicMonitor has always represented a unidirectional one-to-many relationship: a Collector could have multiple devices assigned to it, but devices could not be assigned to multiple Collectors. This presents some challenges not only in the assignment and management of devices among Collectors, but also in ensuring that Collector load is appropriately scaled and balanced.
While you can choose to maintain this one-to-many relationship if it’s well suited for your environment, LogicMonitor also offers the ability to create Auto-Balanced Collector Groups (ABCGs). By allowing devices to be assigned to more than one Collector, ABCGs address the aforementioned challenges by:
- Dynamically moving device(s) from one Collector to another within the ABCG in order to prevent any individual Collector from becoming over-subscribed.
- Improving scalability for device failover. When a Collector in an ABCG goes down, the devices that were being monitored by that Collector will be distributed among the other Collectors in the ABCG.
- Streamlining the creation process for devices and allowing for simplified capacity management within a group of Collectors.
How Do Auto-Balanced Collector Groups Work?
Monitoring thresholds for the Auto-Balanced Collector Group (ABCG) are based on raw instance counts and not on the current load or weighted instances. Every 30 minutes, LogicMonitor analyzes the total number of DataSource instances being monitored per Collector in an ABCG.
If a Collector’s monitoring threshold is exceeded:
- LogicMonitor will attempt to rebalance the load by moving the devices with the highest instance counts from that Collector to the Collector with the lowest load in the ABCG.
- If moving a device would result in the target Collector exceeding its threshold, the device is not moved and the next largest device on the Collector is attempted.
- This process repeats through all the devices monitored by that Collector until the instance count is below the threshold of the target Collector—or until all devices have been attempted.
It is possible that no moves will be made if rebalancing (moving devices from one Collector to another) would put other Collectors in the ABCG over their respective limits. If all of your Collectors are carrying too many instances, auto-balancing cannot take place and Collector performance may degrade unless you adjust the Rebalancing Thresholds.
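The rebalancing pass described above can be sketched in a few lines of Python. This is only an illustration of the steps as written in this article, not LogicMonitor's actual implementation. The `Collector` class, the `rebalance` function, and all field names are hypothetical, and the sketch assumes that a Collector counts as over its threshold only when its instance count is strictly greater than the threshold.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    """Hypothetical model of a Collector in an ABCG."""
    name: str
    threshold: int  # instance count threshold for this Collector
    devices: dict[str, int] = field(default_factory=dict)  # device -> instance count

    @property
    def load(self) -> int:
        # Raw instance count across all devices on this Collector.
        return sum(self.devices.values())


def rebalance(group: list[Collector]) -> list[tuple[str, str, str]]:
    """Run one rebalancing pass; return moves as (device, source, target)."""
    moves = []
    for source in group:
        if source.load <= source.threshold:
            continue  # threshold not exceeded, nothing to do
        others = [c for c in group if c is not source]
        if not others:
            continue
        # Attempt the highest instance count devices first.
        by_size = sorted(source.devices.items(), key=lambda kv: kv[1], reverse=True)
        for device, count in by_size:
            if source.load <= source.threshold:
                break  # source is back within its threshold
            target = min(others, key=lambda c: c.load)  # lowest-load Collector
            if target.load + count > target.threshold:
                continue  # move would overload the target; try the next device
            del source.devices[device]
            target.devices[device] = count
            moves.append((device, source.name, target.name))
    return moves
```

For example, with a Collector `a` (threshold 100) holding devices of 60, 50, and 10 instances and a Collector `b` (threshold 100) holding 30 instances, one pass moves only the 60-instance device to `b`. If `b` already held 95 instances, no device could be moved and the pass would return an empty list.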
Note: If an ABCG is rebalanced, an entry is created in the audit logs under the user of “System:AutoBalancedCollectorGroupCheck”.
When a Collector in an ABCG fails, its devices are moved to the other active Collector(s) in the group. There is no one-to-one manual designation of a failover Collector; rather, a rebalance algorithm is triggered and devices are balanced across the remaining Collectors in the ABCG as efficiently as possible. When the failed Collector comes back online, the devices will remain on their new Collector(s), assuming they are not over their threshold limits.
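This article does not describe the failover algorithm in detail. As a rough mental model only, the sketch below assumes a simple greedy strategy: each device from the failed Collector is placed, largest first, on whichever remaining Collector currently has the fewest instances. The `fail_over` function and its data layout are hypothetical, and the real algorithm may differ.

```python
def fail_over(collectors: dict[str, dict[str, int]], failed: str) -> dict[str, str]:
    """Redistribute the failed Collector's devices among the remaining ones.

    `collectors` maps Collector name -> {device name: instance count}.
    Returns a mapping of device name -> new Collector name.
    Assumption: greedy least-loaded placement, not the vendor's algorithm.
    """
    remaining = [name for name in collectors if name != failed]
    if not remaining:
        raise ValueError("no active Collectors remain in the group")

    orphaned = collectors.pop(failed)
    assignments = {}
    # Place the largest devices first to keep the final loads even.
    for device, count in sorted(orphaned.items(), key=lambda kv: kv[1], reverse=True):
        target = min(remaining, key=lambda name: sum(collectors[name].values()))
        collectors[target][device] = count
        assignments[device] = target
    return assignments
```

Note that the sketch does not move devices back when the failed Collector returns, which matches the behavior described above: devices stay on their new Collector(s).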
After enabling auto-balancing for a Collector Group (detailed in the Configuring an Auto-Balanced Collector Group section of this support article), a Collector’s count of instances across all the devices it is currently monitoring is displayed in the Collector list. If necessary, instance count thresholds can be tuned, as discussed in the Rebalancing Thresholds section of this support article.
Note: Protocols that send data to Collectors (such as Syslog, SNMP traps, and NetFlow) are not part of auto-balancing. These types of solutions require configuration on the endpoint devices and would need a balancing solution at the transport layer rather than the application layer.
Collector Considerations for Inclusion in an Auto-Balanced Collector Group
Since devices will be dynamically moving among the Collectors in an Auto-Balanced Collector Group (ABCG), we recommend that the Collectors making up the group are as similar as possible. Collectors within an ABCG must specifically share the following characteristics:
- Network accessibility. Collectors must have the same network accessibility to all devices monitored by the ABCG.
- Operating system. Collectors must be on the same operating system. This is a hard requirement enforced by the LogicMonitor UI. Windows Collectors and Linux Collectors have different capabilities when it comes to monitoring devices. To maintain continuity of your metrics, make sure that all Collectors in an ABCG run the same OS.
- Collector version. Although it’s not required, we recommend that all Collectors in an ABCG are on the same version.
- Collector monitoring. Each Collector that is part of an ABCG should continue to monitor itself. Avoid setting the ABCG as the preferred Collector. See Monitoring your Collectors.
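Before converting a group, it can help to check these characteristics against your own Collector inventory. The following is a minimal sketch of such a pre-check. The `CollectorInfo` record and the `check_group` function are hypothetical and do not call any LogicMonitor API. Only the operating system and version rules are covered, since network accessibility cannot be verified from inventory data alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectorInfo:
    """Hypothetical inventory record for one Collector."""
    name: str
    os: str       # for example "linux" or "windows"
    version: str  # Collector version string


def check_group(collectors: list[CollectorInfo]) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for a candidate ABCG."""
    errors, warnings = [], []

    # Hard requirement: every Collector must be on the same operating system.
    operating_systems = sorted({c.os for c in collectors})
    if len(operating_systems) > 1:
        errors.append(f"mixed operating systems: {', '.join(operating_systems)}")

    # Recommendation only: Collectors should be on the same version.
    versions = sorted({c.version for c in collectors})
    if len(versions) > 1:
        warnings.append(f"mixed Collector versions: {', '.join(versions)}")

    return errors, warnings
```

A group that mixes Windows and Linux Collectors produces an error, while a group that differs only in Collector version produces a warning.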
Configuring an Auto-Balanced Collector Group
To configure an Auto-Balanced Collector Group (ABCG), perform the following steps:
- Navigate to Settings | Collectors and locate the Collector group for which you would like to enable auto balancing.
- If the Collector group doesn’t yet exist, click Add | Collector Group to begin the creation of a new group. For information on creating Collector groups and adding Collectors as group members, see Collector Groups.
- If the Collector group already exists, click the down arrow to the right of its name and select “Manage Group” from the dropdown menu that appears.
- Toggle the Auto Balanced Collector Group slider, located at the very top of the dialog, to the right to enable auto balancing for the Collector group.
- If Collectors are currently assigned to your new ABCG (i.e. you are converting an existing Collector group to an ABCG), the Do not auto balance monitored devices option appears. By default, this option is unchecked, allowing all devices to dynamically move among all Collectors.
However, if you prefer to leave devices assigned to their current Collectors and manually enable devices for auto balancing on a case-by-case basis (as discussed in the Assigning Devices to Auto-Balanced Collector Groups section of this support article), check this option. This approach is not recommended. Conversely, if you leave this option unchecked but have a device that must be monitored by a specific Collector (or that should not move among Collectors), you can manually remove it from auto balancing in the same way you would manually add it.
- Click Save.
Rebalancing Thresholds
The instance count threshold for a Collector in an Auto-Balanced Collector Group (ABCG) is auto-calculated using the ABCG’s assigned threshold value and the RAM on the Collector machine. By default, this threshold is set to 10,000 instances, which represents the instance count threshold for a medium-sized Collector that uses 2 GB of RAM. (See the calculation below.)
You can adjust this threshold value in the ABCG’s configuration using the following table as a reference point. Set the limit to the value in the Medium column that correlates to the approximate number of instances you would like on each Collector. For any additional guidance, contact Support.
The number of instances that a Collector can handle is calculated with:
Number of instances = (Target_Collector_mem / Medium_mem)^(1/2) * Medium_Threshold
For example, if a user sets a Medium (2 GB) Collector’s threshold to 10,000, the threshold for a Large (4 GB) Collector will be scaled to: (4/2)^(1/2) * 10000 ≈ 14140 instances
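The calculation above can be reproduced with a short Python function, which is handy for estimating thresholds for other memory sizes. The function name and parameter names are illustrative. How LogicMonitor rounds the result is not stated in this article, so truncating to a whole number here is an assumption.

```python
import math


def scaled_threshold(
    target_mem_gb: float,
    medium_threshold: int = 10_000,
    medium_mem_gb: float = 2.0,
) -> int:
    """Scale the Medium threshold by the square root of the memory ratio."""
    ratio = target_mem_gb / medium_mem_gb
    # Assumption: truncate to a whole number of instances.
    return int(math.sqrt(ratio) * medium_threshold)


for mem_gb in (2, 4, 8):
    print(f"{mem_gb} GB Collector: {scaled_threshold(mem_gb)} instances")
```

For a 4 GB Collector this returns 14142, which matches the example figure of 14140 once the square root of 2 is rounded to 1.414. Because the scaling uses a square root, doubling the memory raises the threshold by only about 41 percent, and an 8 GB Collector is needed to double the Medium threshold to 20000.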
Assigning Devices to Auto-Balanced Collector Groups
Devices are assigned to Collectors from their configuration settings.
To assign a device to an Auto-Balanced Collector Group (ABCG), specify the ABCG in the Collector Group field. By default, the Preferred Collector field will dynamically update to “Auto Balanced.” This indicates that the device will participate in auto-balancing activities.
There may be situations in which a device must be monitored by a specific Collector (or in which it is not ideal for the device to move among Collectors). In these cases, you can designate a specific Collector in the Preferred Collector field that will be dedicated to that device, effectively removing the device from auto-balancing activities.