Legal claims defining the scope of protection. Each claim is shown in both the original legal language and a plain English translation.
1. A method for monitoring health of applications, the method comprising the steps of: determining that a local application is comprised in a service group, the local application being a single instance of an application; determining a service environment to which the local application belongs, the service environment being a group of service groups, each service group comprising one or more local applications; instantiating objects for representing the service environment and at least another service environment in a health map; adding objects to the health map that represent external resources; monitoring health indicators for the local applications, the service groups, the service environments and the external resources; attributing the health indicators to either health of the local application or health of one or more of the external resources; and raising a root cause alert to indicate a fault, the fault being attributable to either the local application or the one or more external resources.
A method for monitoring application health determines which service group (a collection of applications) each local application, meaning a single running instance of an application, belongs to, and which service environment (a collection of service groups) contains that group. The method creates objects in a "health map" for that service environment and at least one other service environment, and adds objects for the external resources the applications depend on (for example, databases or web services). It monitors health indicators for the local applications, service groups, service environments, and external resources, and attributes each indicator to the health of either the local application or one or more external resources. When a fault is detected, it raises a root cause alert that identifies the local application or external resource responsible.
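The hierarchy and the attribution step can be illustrated with a minimal sketch. The `Node` class, its field names, and the rule that an unhealthy external dependency is blamed before the application itself are all assumptions made for illustration; the claim does not specify a data structure or an attribution rule.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One object in the health map (hypothetical structure)."""
    name: str
    kind: str  # "environment", "group", "application", or "external"
    healthy: bool = True
    children: List[Node] = field(default_factory=list)
    depends_on: List[Node] = field(default_factory=list)


def attribute_fault(app: Node) -> Optional[Node]:
    """Return the node the fault is attributed to, or None if nothing is wrong.

    Assumed rule: an unhealthy external dependency is blamed first;
    otherwise an unhealthy application is blamed itself.
    """
    for resource in app.depends_on:
        if not resource.healthy:
            return resource
    return None if app.healthy else app


# Build a tiny map: environment -> group -> application -> external resource.
db = Node("orders-db", "external", healthy=False)
app = Node("web-01", "application", healthy=False, depends_on=[db])
group = Node("web", "group", children=[app])
prod = Node("production", "environment", children=[group])

culprit = attribute_fault(app)
print(f"ROOT CAUSE: {culprit.name} ({culprit.kind})")
```

Here the application looks unhealthy, but the alert names the database, because the sketch attributes the indicator to the external resource rather than to the local application.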
2. The method of claim 1, wherein the service environment to which the discovered local application belongs is a production service environment and the other service environment comprises a test service environment that duplicates the production service environment.
In the method of claim 1, the service environment containing the discovered local application is a production environment, and the other service environment is a test environment that duplicates production. A duplicate of this kind allows testing and validation without affecting live services.
3. The method of claim 2, further comprising: triggering one or more actions within the test service environment in response to the root cause alert indicating the fault.
The method for monitoring application health also triggers actions within the test service environment, in response to a root cause alert indicating a fault. This allows for investigating the fault in a safe, isolated environment before applying fixes to production.
4. The method of claim 2, wherein triggering one or more actions within the test service environment further comprises: triggering one or more of tests, diagnostics, duplication of problems causing the fault and testing potential solutions and workarounds within the test service environment.
The method for monitoring application health triggers actions within the test service environment, and those actions include running tests, diagnostics, reproducing the fault condition, and evaluating potential solutions or workarounds, enabling comprehensive problem resolution without affecting the production environment.
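As a rough sketch of how a root cause alert might fan out into test-environment actions, the dispatcher below maps action names to handlers. The action names, the handler behavior (returning a description instead of starting real jobs), and the function `on_root_cause_alert` are hypothetical and are not taken from the patent.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical handlers, one per kind of action named in the claim.
# Each just returns a description; a real system would start actual jobs.
TEST_ENV_ACTIONS: Dict[str, Callable[[str], str]] = {
    "tests": lambda fault: f"run test suite against '{fault}'",
    "diagnostics": lambda fault: f"collect diagnostics for '{fault}'",
    "reproduce": lambda fault: f"reproduce '{fault}' in test environment",
    "try_fix": lambda fault: f"trial candidate fix for '{fault}'",
}


def on_root_cause_alert(fault: str, actions: Iterable[str]) -> List[str]:
    """Run the chosen actions in the test environment, in the order given."""
    return [TEST_ENV_ACTIONS[name](fault) for name in actions]


log = on_root_cause_alert("orders-db timeout", ["diagnostics", "reproduce"])
for entry in log:
    print(entry)
```

Passing the action list explicitly reflects the claim's "one or more of" wording: any subset of the four kinds can be triggered for a given alert.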
5. The method of claim 1, wherein adding the external resources to the health map further comprises: adding one or more of partitioned and non-partitioned external resources to the health map.
The method for monitoring application health adds external resources to the health map to reflect the application's dependencies. These resources may be partitioned (e.g., sharded databases where the application is aware of the shards), non-partitioned (e.g., load balancers where the application sees a single endpoint), or both.
6. The method of claim 5, wherein adding non-partitioned resources to the health map further comprises: adding services utilized by the local application that are outside of the local application such that each of the services appears as a single entity to the local application.
The method for monitoring application health adds non-partitioned external resources to the health map. These are services that the local application uses but that sit outside it, and each one appears to the local application as a single entity.
7. The method of claim 5, wherein adding partitioned resources to the health map further comprises: adding services utilized by the local application that are outside of the local application such that partitions of the services are visible to the local application.
The method for monitoring application health adds partitioned external resources to the health map. These are services that the local application uses but that sit outside it. Unlike non-partitioned resources, the individual partitions of the external service are visible to the local application.
8. The method of claim 7, wherein adding partitioned resources to the health map further comprises: dynamically adding partitions to and removing partitions from the health map as partitions are dynamically added to or removed from the partitioned resources.
The method for monitoring application health dynamically adds or removes partitions from the health map as the underlying partitioned resources (like databases) are scaled up or down, reflecting the current state of the service environment.
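One way to picture the dynamic tracking of partitions is a reconcile step that compares the partitions currently in service with those already in the health map. The class `PartitionedResource`, its `sync` method, and the shard names are illustrative assumptions, not details from the patent.

```python
from typing import Dict, Set


class PartitionedResource:
    """Health-map entry for an external resource whose partitions are visible."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.partitions: Dict[str, bool] = {}  # partition id -> healthy?

    def sync(self, live: Set[str]) -> None:
        """Reconcile tracked partitions with the set currently in service."""
        known = set(self.partitions)
        for pid in live - known:   # newly added partitions start as healthy
            self.partitions[pid] = True
        for pid in known - live:   # partitions removed from the resource
            del self.partitions[pid]


shards = PartitionedResource("orders-db")
shards.sync({"shard-0", "shard-1"})
shards.partitions["shard-0"] = False   # a monitor marks shard-0 unhealthy
shards.sync({"shard-0", "shard-2"})    # shard-1 removed, shard-2 added
print(sorted(shards.partitions.items()))
```

Partitions that survive a sync keep their recorded health state, so the reconcile step does not erase what monitoring has already learned.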
9. The method of claim 1, wherein monitoring health indicators further comprises: monitoring one or more of events, performance counters, synthetic transactions, and service events.
The method for monitoring application health monitors one or more kinds of health indicator: events, performance counters (for example, CPU usage or memory consumption), synthetic transactions (simulated user actions that test availability), and service events (for example, Syslog or SNMP messages).
10. The method of claim 1, wherein raising a root cause alert further comprises: indicating within the root cause alert, parts of the service environment that are affected by the fault.
When the method for monitoring application health raises a root cause alert, the alert indicates which parts of the service environment are affected by the fault, which supports faster incident response and impact assessment.
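A minimal sketch of how an alert could list the affected parts: start from the applications that use the faulty component and walk up through their service groups to the service environment. The `used_by` and `parent_of` mappings and the `build_alert` function are hypothetical, and the sketch assumes `parent_of` contains no cycles.

```python
from typing import Dict, List


def build_alert(
    root_cause: str,
    used_by: Dict[str, List[str]],
    parent_of: Dict[str, str],
) -> dict:
    """Build an alert naming the root cause and every affected part above it."""
    affected = set()
    # If the root cause is an external resource, start from its consumers;
    # otherwise start from the faulty component itself.
    for start in used_by.get(root_cause, [root_cause]):
        node = start
        while node is not None:
            affected.add(node)
            node = parent_of.get(node)  # None once the environment is reached
    return {"root_cause": root_cause, "affected": sorted(affected)}


used_by = {"orders-db": ["web-01", "api-01"]}
parent_of = {
    "web-01": "web", "api-01": "api",
    "web": "production", "api": "production",
}
alert = build_alert("orders-db", used_by, parent_of)
print(alert)
```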
11. A system for monitoring health of an enterprise-scale service environment comprising: at least one processor; a non-transitory processor-readable storage medium storing instructions that cause the processor to: discover objects to be included in a health map, the objects representing at least the service environment that comprises at least a service group, the service group further comprising at least one local application; instantiate the objects into the health map; monitor health indicators for the local application represented in the health map; generate a root cause alert when a fault is detected within the service environment; and dynamically update the health map to delete objects representing one or more of the service environment, the service group and the local application that are removed from service; and dynamically update the health map to add new objects representing one or more of other service environment, other service group or another local application that are added to the service.
A system for monitoring the health of an enterprise-scale service environment uses at least one processor and stored instructions to: discover the objects to include in a "health map", namely the service environment, its service groups (collections of applications), and their local applications; instantiate those objects in the health map; monitor health indicators for the local applications represented there; and generate a root cause alert when a fault is detected. The system also dynamically updates the health map, deleting objects for environments, groups, or applications that are removed from service and adding objects for new ones, so the map stays current.
12. The system of claim 11, wherein the instructions for discovering the objects further comprise instructions that cause the processor to: make a determination on whether or not the local application is present on a server in the service environment.
The system, as described above, determines whether a local application is present on a server in the service environment as part of discovering the objects to be included in the health map.
13. The system of claim 12, wherein the instructions to make the determination regarding the local application further comprise instructions that cause the processor to: query the service environment if a specific registry key is present and if a specific file exists on a local file system or if a specific environment variable is set.
To determine whether a local application is present on a server, the system queries the service environment for three indicators: a specific registry key, a specific file on the local file system, and a specific environment variable. As the claim is worded, the application is treated as present if the registry key is present and the file exists, or if the environment variable is set; the claim does not spell out how the "and" and "or" are grouped.
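The presence check can be sketched as a single boolean expression. This sketch assumes one reading of the claim, (registry key AND file) OR environment variable, and it takes the registry lookup as an injected function because registry access is platform-specific. The function name `app_is_present` and all example keys, paths, and variable names are hypothetical.

```python
import os
from typing import Callable, Mapping


def app_is_present(
    registry_key_exists: Callable[[str], bool],
    registry_key: str,
    file_path: str,
    env_var: str,
    environ: Mapping[str, str] = os.environ,
) -> bool:
    """Return True if the indicators suggest the application is installed.

    Assumed grouping: (registry key present AND file exists)
    OR environment variable set.
    """
    key_and_file = registry_key_exists(registry_key) and os.path.exists(file_path)
    return bool(key_and_file or env_var in environ)


# Example: no registry key, but the (hypothetical) APP_HOME variable is set.
found = app_is_present(
    registry_key_exists=lambda key: False,
    registry_key=r"SOFTWARE\ExampleVendor\ExampleApp",
    file_path="/opt/example/app.bin",
    env_var="APP_HOME",
    environ={"APP_HOME": "/opt/example"},
)
print(found)
```

Injecting `registry_key_exists` and `environ` keeps the check testable on any platform; on Windows the lookup could be backed by the standard `winreg` module.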
14. The system of claim 11, wherein instructions to instantiate the objects into the health map further comprise instructions that cause the processor to: add the service group to the health map if the service group associated with the local application is not currently in the health map; and add the service environment to the health map if the service environment associated with the local application and the service group is not currently in the health map.
To instantiate the objects in the health map, the system adds the service group associated with a discovered local application if that service group is not already in the map, and also adds the service environment if it's not already present.
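The "add only if missing" behavior maps naturally onto a nested dictionary. The sketch below assumes the health map is a plain dict of environments to groups to sets of application names; that layout and the function `register_application` are illustrative choices, not something the patent prescribes.

```python
from typing import Dict, Set

HealthMap = Dict[str, Dict[str, Set[str]]]  # environment -> group -> applications


def register_application(
    health_map: HealthMap, environment: str, group: str, app: str
) -> None:
    """Add an application, creating its environment and group only if absent."""
    groups = health_map.setdefault(environment, {})  # add environment if missing
    apps = groups.setdefault(group, set())           # add service group if missing
    apps.add(app)


health_map: HealthMap = {}
register_application(health_map, "production", "web", "web-01")
register_application(health_map, "production", "web", "web-02")  # reuses both parents
register_application(health_map, "production", "api", "api-01")  # new group only
print(health_map)
```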
15. The system of claim 11, wherein instructions to instantiate the objects into the health map further comprise instructions that cause the processor to: instantiating external resources used by the local application into the health map.
To instantiate objects, the system creates representations of external resources used by the local application in the health map, mapping the application's dependencies for comprehensive monitoring.
16. The system of claim 11, wherein the service environment is a production service environment.
The service environment monitored by the system is a production service environment where live applications are running.
17. The system of claim 11, wherein the service environment is a test service environment.
The service environment monitored by the system is a test service environment, such as one used for testing or staging rather than for live traffic.
18. A non-transitory, processor-readable, storage medium comprising processor-readable instructions for: determining that a local application is present in a service group, the local application being a single instance of an application; determining a service environment to which the local application belongs, the service environment being a group of service groups, each service group comprising one or more local applications serving a common consumer; instantiating the service environment and other service environments as objects in a health map; adding further objects to the health map for representing external resources; monitoring health indicators for the local applications, the service groups, the service environments and the external resources; attributing the health indicators to either health of the local application or health of one or more of the external resources; and raising a root cause alert to indicate a fault, the fault being attributable to either the local application or the one or more external resources.
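A storage medium contains instructions to: determine that a local application (a single instance of an application) is present in a service group, and determine the service environment it belongs to, where each service group's applications serve a common consumer; represent that service environment and other service environments as objects in a health map, along with objects for external resources; monitor health indicators for the local applications, service groups, service environments, and external resources; attribute those indicators to the health of either the local application or one or more external resources; and raise a root cause alert that identifies which of them is the source of a fault.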
19. The storage medium of claim 18, wherein the instructions for determining the service environment to which the local application belongs further comprise instructions to determine if the local application belongs in one of a production service environment or a test service environment that duplicates the production service environment.
In the storage medium described above, the instructions for determining the local application's service environment also determine whether the application belongs to a production service environment or to a test service environment that duplicates production.
20. The storage medium of claim 18, wherein the instructions for raising a root cause alert to indicate a fault further comprise instructions for triggering one or more actions within the test service environment in response to the root cause alert indicating the fault within the production service environment.
The storage medium, as described above, triggers one or more actions within the test service environment in response to a root cause alert indicating a fault in the production service environment, facilitating investigation and resolution in a safe environment.
September 26, 2017