Promoting Serverless Observability – Operating Serverless

The primary goal of observability is to give engineers maximum visibility into their software. In practice, this means shaping your software engineering practice around building observable systems: systems that can be inspected and analyzed to understand their behavior, performance, and health.

For a deep dive into observability, we highly recommend the book Observability Engineering by Charity Majors, Liz Fong-Jones, and George Miranda (O’Reilly). One resource that can be helpful when getting started with observability is the Observability Maturity Model (OMM) introduced in Chapter 21 of that book. The OMM should give you an idea of where you are now and how you can improve your general observability practice.

Observability is more a sociotechnical concept than a matter of deploying a particular tool or using only open source standards. It’s about ensuring the data is there when you need to answer questions about your application’s behavior. Whereas monitoring can be seen as an active pursuit, with engineers watching dashboards and trying to spot anomalies, observability is very much a passive process. After releasing a feature to users, your engineers should be able to get back to building the next feature and improving the product, rather than worrying about support issues or analyzing performance. You should be able to trust your alarms to alert you to potential problems, and your metrics, logs, and traces to support efficient debugging when required.
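To make the idea of trusting an alarm more concrete, here is a minimal sketch of how an alarm might decide when to fire. It assumes a simple "M of the last N datapoints breach a threshold" rule, which is a common pattern in managed alarm services but is not taken from any particular vendor's API. The names `AlarmRule` and `evaluate` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlarmRule:
    """Hypothetical rule: fire when enough recent datapoints breach a threshold."""
    threshold: float
    evaluation_periods: int   # how many of the most recent datapoints to inspect (assumed > 0)
    datapoints_to_alarm: int  # how many of those must breach the threshold


def evaluate(rule: AlarmRule, datapoints: Sequence[float]) -> str:
    """Return "ALARM", "OK", or "INSUFFICIENT_DATA" for a series of metric values."""
    # Only the most recent window matters; older datapoints are ignored.
    recent = datapoints[-rule.evaluation_periods:]
    if len(recent) < rule.evaluation_periods:
        return "INSUFFICIENT_DATA"
    breaches = sum(1 for value in recent if value > rule.threshold)
    return "ALARM" if breaches >= rule.datapoints_to_alarm else "OK"
```

Requiring several breaching datapoints rather than one is a design choice that trades a little detection latency for fewer false alarms. That trade-off is part of what lets engineers rely on alarms instead of watching dashboards.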

With serverless, monitoring of infrastructure (such as network and hardware performance and failures) is delegated to the cloud vendor. Application monitoring becomes your sole focus. However, the application is now a highly distributed, ephemeral, event-driven mix of your business logic and the vendor’s managed services. This can make failure modes difficult to predict and comprehend. In turn, this means it is more important than ever to be able to view and understand your application’s behavior in production. The issue here is that traditional monitoring and alerting tools and strategies are inadequate for the task.

Adopting a culture and practice of observability in your team is crucial to the smooth operation of your serverless application. As you saw in Chapter 7, the distributed, asynchronous, decoupled, and event-driven nature of serverless applications raises special challenges with regard to testing. These same properties make serverless systems and microservices inherently difficult to monitor using traditional methods, such as “monitor all the things,” dashboards, and logs. You are no longer monitoring a single process running on a few machines, and operational status can no longer be understood via a few key metrics on a dashboard.

Observability cannot be a post-deployment afterthought or the responsibility of operations teams. Effective observability relies on the data and information about the system under observation being readily available. This means you cannot simply observe a system; you must first build a system that is capable of being observed. You must make observability a concern at every stage in your software delivery lifecycle, from design and development to operation and monitoring.
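As one illustration of building a system that is capable of being observed, the sketch below shows a function that emits structured (JSON) log events carrying a correlation ID, an outcome, and a duration. It is a minimal example under stated assumptions, not a prescribed implementation. The handler `handle_order`, the helper `make_log_event`, and the field names are all hypothetical, and the `emit` parameter defaults to `print` on the assumption that the platform captures standard output as logs.

```python
import json
import time
import uuid


def make_log_event(level, message, correlation_id, **fields):
    """Build one structured log event as a JSON string."""
    event = {
        "timestamp": time.time(),
        "level": level,
        "message": message,
        "correlation_id": correlation_id,
    }
    event.update(fields)  # extra context such as order_id or duration_ms
    return json.dumps(event, sort_keys=True)


def handle_order(event, emit=print):
    """Hypothetical event handler that logs its start, outcome, and duration."""
    # Reuse the caller's correlation ID if present so one request can be
    # followed across services; otherwise start a new one.
    correlation_id = event.get("correlation_id") or str(uuid.uuid4())
    started = time.perf_counter()
    outcome = "unknown"
    emit(make_log_event("INFO", "order received", correlation_id,
                        order_id=event.get("order_id")))
    try:
        if "order_id" not in event:
            raise ValueError("missing order_id")
        outcome = "success"
        return {"status": "accepted", "correlation_id": correlation_id}
    except ValueError as exc:
        outcome = "failure"
        emit(make_log_event("ERROR", str(exc), correlation_id))
        return {"status": "rejected", "correlation_id": correlation_id}
    finally:
        # Always record how the invocation ended and how long it took.
        duration_ms = (time.perf_counter() - started) * 1000
        emit(make_log_event("INFO", "order handled", correlation_id,
                            outcome=outcome, duration_ms=round(duration_ms, 3)))
```

Because every event shares the same fields, the data needed to answer a question later (which request failed, how long it took, what it was handling) is already present when the code ships, rather than being added after an incident.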
