Observing the Health of Critical Paths
Monitoring the performance of a distributed, serverless application at a given point in time can be challenging. The sheer number of parts operating independently across services, stacks, regions, and accounts can be overwhelming. Rather than trying to monitor everything, focus on the most critical parts of your application; the […]
Author: Margarette Freeman
Metrics, Alarms, and Alerts – Operating Serverless
Metrics are the data that provides insights into the performance and health of your system. The metrics emitted by managed services give you a window into your utilization of the provided resources. You can also emit custom metrics from your Lambda functions (see Chapter 6 for information on using the open […]
Static analysis – Testing Serverless Applications
If you are using a language and runtime that support static typing, you can leverage static analysis as a method to verify the requests sent to a third party and your handling of their responses. Static analysis is the process of verifying software without executing it. Third parties will often provide official […]
Critical Health Dashboard – Operating Serverless
Just as you saw with testing in Chapter 7, operations can benefit from a focus on your critical paths. But even your critical paths will have aspects that are more important than others when it comes to assessing operational health and performance at scale. You can apply the RED method to ascertain […]
Capability Alerting – Operating Serverless
Take a moment to imagine the following scenario. The day has finally arrived: you have spent weeks designing and building your beautiful serverless architecture, and now it is time to release it to your expectant users. But you recognize that a diligent serverless engineer never operates an application in production without alerts. How […]
Service level objectives – Operating Serverless
Service level objectives (SLOs) are targets for performance that provide an indication of how often your service can fail before the experience of your users is significantly degraded. SLOs are based on the realization that you cannot operate your product at 100% success all the time, and at some point, your users […]
Event-Driven Logging – Operating Serverless
Application logs, combined with traces (see the next section), are essential for debugging and troubleshooting issues, either in real time or retrospectively. However, logs are easily overused and overly relied upon in serverless applications. This can result in an exponential increase in the time and knowledge it takes to debug an issue, […]
Using Distributed Tracing to Understand the Whole System – Operating Serverless
The most common challenge with understanding the health of a serverless system is the inherent distribution of compute across microservices and managed services. This is also true when that health is diminished and the system is experiencing an issue that requires debugging and remediation. The traditional means […]