Promoting Serverless Observability
The primary goal of observability is to provide engineers with maximum visibility of their software. Observability is all about optimizing your software engineering practice with a view toward building observable systems: that is, systems that can be inspected and analyzed to understand their behavior, performance, and health. For a deep dive into observability, […]
Category: Identifying the Critical Paths
Critical Health Dashboard – Operating Serverless
Just as you saw with testing in Chapter 7, operations can benefit from a focus on your critical paths. But even your critical paths will have aspects that are more important than others when it comes to assessing operational health and performance at scale. You can apply the RED method to ascertain […]
Capability Alerting – Operating Serverless
Take a moment to imagine the following scenario. The day has finally arrived: you have spent weeks designing and building your beautiful serverless architecture, and now it is time to release it to your expectant users. But you recognize that a diligent serverless engineer never operates an application in production without alerts. How […]
Event-Driven Logging – Operating Serverless
Application logs, combined with traces (see the next section), are essential for debugging and troubleshooting issues, either in real time or retrospectively. However, logs are easily overused and overly relied upon in serverless applications. This can result in an exponential increase in the time and knowledge it takes to debug an issue, […]
Using Distributed Tracing to Understand the Whole System – Operating Serverless
The most common challenge with understanding the health of a serverless system is the inherent distribution of compute across microservices and managed services. This is also true when that health is diminished and the system is experiencing an issue that requires debugging and remediation. The traditional means […]