App Monitoring and Alerting — A Practical Prometheus + Spring Boot Tutorial

This tutorial starts from scratch and shows the what, how and why of application monitoring and alerting with Prometheus and Spring Boot.

TL;DR

Download tutorial code here

Intro

There are a number of tutorials available on monitoring and alerting with Prometheus, and this is another one. I wrote it because of how long it took me to set up and configure Prometheus locally. Resources are plentiful, but I couldn't find a single step-by-step tutorial that shared a practical, code-first approach.

This tutorial takes you through an experiment using Prometheus with the Java Spring Boot framework.

This tutorial is suited for people having intermediate skills in below areas

THIS IS NOT AN INFRA SETUP OR DEEP-DIVE TUTORIAL FOR PROMETHEUS. It targets code integration only.

What is Prometheus?

An app monitoring and alerting tool.

Why do you need it?

Building awesome software products is one part of the job; maintaining them is another. As you grow as a software professional, you realise that one of the most important parts of running software is gathering metrics and gauging the health of your services, so that customers get an uninterrupted experience.

How do you do it?

Step 1:

The tutorial assumes you have a Java/Spring framework background. First of all, let's set up a simple Spring Boot application. Download the complete code here. It exposes three APIs: one simulating a simple 2xx response, one simulating a simple 5xx response, and /alert-hook, which receives a callback when an alert fires (more details in Step 4).

The metrics settings and the server port (8082) can be configured in the application.properties file.

Gist of Spring Boot App integrated with Prometheus
Sample Responses when you start Spring Boot App and hit 2xx and 5xx api

The code uses a library called Micrometer to create metrics in a format that Prometheus understands. These metrics are available at http://localhost:8082/actuator/prometheus
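To make the idea concrete, here is a minimal stdlib-only sketch of what a metrics library does for you: it keeps named counters and renders them in the plain-text format that Prometheus reads. This is not Micrometer's API and not the tutorial's code; the class `CounterRegistrySketch` is hypothetical, and the counter names are the ones queried later in this tutorial.

```java
import java.util.Map;
import java.util.TreeMap;

/** Stdlib-only stand-in for a metrics registry. Hypothetical; NOT Micrometer's API. */
final class CounterRegistrySketch {
    // Sorted map so the rendered output has a stable order.
    private final Map<String, Long> counters = new TreeMap<>();

    /** Increments the named counter, creating it on first use. */
    synchronized void increment(String name) {
        counters.merge(name, 1L, Long::sum);
    }

    /** Renders all counters in the Prometheus text exposition format. */
    synchronized String scrape() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Long> entry : counters.entrySet()) {
            out.append("# TYPE ").append(entry.getKey()).append(" counter\n");
            out.append(entry.getKey()).append(' ')
               .append(entry.getValue().doubleValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        CounterRegistrySketch registry = new CounterRegistrySketch();
        registry.increment("orders_2xx_total"); // e.g. called by the 2xx API
        registry.increment("orders_2xx_total");
        registry.increment("orders_5xx_total"); // e.g. called by the 5xx API
        System.out.print(registry.scrape());
    }
}
```

In the real app, Micrometer and Spring Boot Actuator handle both the counting and the rendering; the sketch only shows the shape of the text you should expect at the endpoint.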

Step 2:

When your app starts misbehaving and throwing 5xx errors, you want to act proactively, before your customers have to tell you about it. That's where a tool like Prometheus comes in.

So let's install Prometheus (on a Mac, in my case).

Download the Prometheus binary here, unzip it and apply the following bare-minimum configuration.

Open prometheus.yml in the downloaded folder and add your Spring Boot application's details so that Prometheus can pull in the metrics. See line 30, where we have added a new target telling Prometheus where to pull metrics from.
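Because Prometheus pulls metrics, a scrape is essentially an HTTP GET against the endpoint the app exposes. As a rough sketch of what Prometheus sees on each pull, the snippet below fetches the actuator URL from Step 1 and extracts the samples. The class `ScrapeSketch` and its parser are hypothetical and deliberately simplified (they assume one sample per line with no trailing timestamp); this is not how Prometheus itself is implemented.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that mimics a single scrape of the Spring Boot app. */
final class ScrapeSketch {

    /** Parses "name value" sample lines, skipping comments and blank lines. */
    static Map<String, Double> parseSamples(String text) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String rawLine : text.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // "# HELP" and "# TYPE" lines carry no sample
            }
            int split = line.lastIndexOf(' ');
            if (split < 0) {
                continue;
            }
            try {
                samples.put(line.substring(0, split),
                        Double.parseDouble(line.substring(split + 1)));
            } catch (NumberFormatException ignored) {
                // Simplification: skip values this parser does not understand.
            }
        }
        return samples;
    }

    public static void main(String[] args) throws Exception {
        // URL taken from Step 1; the app must be running for this to work.
        HttpRequest request = HttpRequest
                .newBuilder(URI.create("http://localhost:8082/actuator/prometheus"))
                .GET()
                .build();
        String body = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
        parseSamples(body).forEach((name, value) -> {
            if (name.startsWith("orders_")) {
                System.out.println(name + " = " + value);
            }
        });
    }
}
```

Running this while the Boot app is up is a quick way to confirm that the target you added to prometheus.yml actually serves metrics.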

From your command prompt, cd into the Prometheus directory and start the Prometheus process. (It automatically reads the config file above if the process is started from the Prometheus directory.)

>>> ./prometheus

This should start the Prometheus server with some default settings, such as the interval at which Prometheus pulls in the metrics.

Once Prometheus has started, visit http://localhost:9090/graph to open the Prometheus panel, where you can query and view the metrics exposed by our Boot app and pulled in by Prometheus.

Try querying the metrics as shown in the screenshot below.

orders_5xx_total and orders_2xx_total are the metrics that our app exposes.

Step 3:

So far you have collected the data; now it's time to set alerts on top of it. One very important alert is the error rate breaching a threshold.

For this example, we will set an alert that triggers if the 5xx rate exceeds 1%.
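The arithmetic behind such a condition can be sketched as follows. This assumes one plausible reading of "5xx rate": the share of 5xx responses among all responses counted in a time window. Counters only ever go up, so the inputs here are the increases of each counter over that window. The class and method names are hypothetical, and the actual rule in the tutorial's alert-rules.yml may be written differently.

```java
/** Hypothetical sketch of the alert condition; not the tutorial's actual rule. */
final class ErrorRateSketch {
    /** 1% threshold, as stated in the tutorial. */
    static final double THRESHOLD = 0.01;

    /**
     * Share of 5xx responses among all responses in a window.
     * Both arguments are counter increases over the same window.
     */
    static double errorRatio(long increase2xx, long increase5xx) {
        long total = increase2xx + increase5xx;
        // No traffic means no errors; this also avoids dividing by zero.
        return total == 0 ? 0.0 : (double) increase5xx / total;
    }

    /** True when the error ratio is strictly above the threshold. */
    static boolean breaches(long increase2xx, long increase5xx) {
        return errorRatio(increase2xx, increase5xx) > THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(breaches(1000, 5)); // 5 errors in 1005 responses
        System.out.println(breaches(98, 2));   // 2 errors in 100 responses
    }
}
```

With these inputs, the first window stays under the threshold (about 0.5%) and the second one breaches it (2%).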

Let's create a rules file that holds our condition, expressed in the Prometheus query language (PromQL), along with some information we want to send when the alert is triggered.

alert-rules.yml is our sample rules file, which looks like the one below. (Gist available at the end.)

Update prometheus.yml to reference this file, using the config below.

Restart the Prometheus server. You should see the alert info below when you visit the Alerts section of the panel.

Now, to simulate the alert scenario, let's hit a few 5xx URLs and a few 2xx URLs in our Spring Boot app. Make sure there are more 5xx hits than 2xx hits.

After a few seconds (depending on the rule evaluation and scrape intervals), you will see the alert status go to Pending and then to Firing, which means the alert has been triggered.
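The Pending and Firing states can be pictured as a small state machine: an alert becomes Pending when its condition first evaluates to true, and Firing once the condition has stayed true for the rule's hold duration. The sketch below models that idea only; it is not Prometheus code, the class is hypothetical, and the 30-second hold and 15-second evaluation steps in `main` are illustrative values rather than the tutorial's settings.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical model of an alert's lifecycle; not Prometheus internals. */
final class AlertStateSketch {
    enum State { INACTIVE, PENDING, FIRING }

    private final Duration holdFor; // how long the condition must stay true
    private Instant activeSince;    // null while the condition is false

    AlertStateSketch(Duration holdFor) {
        this.holdFor = holdFor;
    }

    /** Evaluates the rule once and returns the resulting state. */
    State evaluate(boolean conditionTrue, Instant now) {
        if (!conditionTrue) {
            activeSince = null; // any false evaluation resets the timer
            return State.INACTIVE;
        }
        if (activeSince == null) {
            activeSince = now;
        }
        boolean heldLongEnough =
                Duration.between(activeSince, now).compareTo(holdFor) >= 0;
        return heldLongEnough ? State.FIRING : State.PENDING;
    }

    public static void main(String[] args) {
        AlertStateSketch alert = new AlertStateSketch(Duration.ofSeconds(30));
        Instant start = Instant.now();
        // Simulated evaluations every 15 seconds with the condition always true.
        for (int i = 0; i <= 3; i++) {
            Instant at = start.plusSeconds(15L * i);
            System.out.println("t+" + (15 * i) + "s: " + alert.evaluate(true, at));
        }
    }
}
```

This is why the alert does not fire the instant you send a 5xx request: the condition has to be observed as true across evaluations before the state moves on.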

Step 4:

To receive this alert over channels such as email, webhooks or Slack, we need another tool called Alertmanager.

Download the Alertmanager binary here. Connecting everything takes two config changes: first alertmanager.yml, then prometheus.yml.

In alertmanager.yml, update line 13 (see the screenshot below) to point to the /alert-hook API in our Spring Boot app. Whenever an alert arrives, Alertmanager will notify us at this location.

Second, update prometheus.yml so that Prometheus connects to the Alertmanager server. See line 12 of prometheus.yml in the screenshot below.

Now run alertmanager from your CLI

>>> ./alertmanager

and restart the Prometheus server

>>> ./prometheus

Now, if you hit the 5xx API a few more times, your Spring Boot application console should show that, after a few seconds of evaluation, an alert was triggered and a callback was received at the /alert-hook API.
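Alertmanager delivers a webhook notification as an HTTP POST with a JSON body. If you want to see that callback without the Spring Boot app, a minimal stand-in receiver can be sketched with the JDK's built-in HTTP server. The class `AlertHookSketch` is hypothetical and is not the tutorial's controller; it reuses port 8082 and the /alert-hook path from this tutorial, so stop the Boot app before running it.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the app's /alert-hook endpoint. */
final class AlertHookSketch {

    /** Logs the payload of a POST and returns the HTTP status to send back. */
    static int handle(String method, String body) {
        if (!"POST".equalsIgnoreCase(method)) {
            return 405; // only POST callbacks are accepted
        }
        System.out.println("Alert received: " + body);
        return 200;
    }

    /** Starts a server on the given port that answers on /alert-hook. */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/alert-hook", AlertHookSketch::serve);
        server.start();
        return server;
    }

    private static void serve(HttpExchange exchange) throws IOException {
        String body = new String(
                exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
        int status = handle(exchange.getRequestMethod(), body);
        exchange.sendResponseHeaders(status, -1); // -1 means no response body
        exchange.close();
    }

    public static void main(String[] args) throws IOException {
        start(8082); // same port as the tutorial's app; runs until stopped
    }
}
```

The sketch prints the raw JSON as-is; parsing it into alert names and labels is left to whatever JSON library your app already uses.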

You can also validate this in the Alertmanager panel, available at

http://localhost:9093/#/alerts

And the callback printed in your Boot console logs

Further Reading

This tutorial has covered monitoring and alerting from a high level. You should now dive deeper to understand the architecture in detail, the Prometheus query language, aggregate functions and more. If you were able to connect the dots and share the vision, start going through the official docs here.

Downloads

Code used in the tutorial is open-sourced and is available at Github. Download here

The complete prometheus.yml, alertmanager.yml and alert-rules.yml are available as Gists below.

Director Of Engineering @Paytm | Entrepreneur | Strategic leader