Get up and running with Zentral on Google Cloud Platform — Chapter 3

Introduction


Welcome back to our third tutorial in a series on Zentral. In this chapter we peek into internal settings, run a few Linux systemd commands, and inspect server-side processes and workers on the command line as well as in the Prometheus 2.0 interface.

“When I call Extra links → Kibana, I just get an HTTP 502. It looks like the service didn’t start correctly.”

From the previous chapter we already know: if we see an HTTP 502 status code instead of the Kibana 6 interface, the systemctl commands below will restart the services for us:

sudo systemctl restart elasticsearch
sudo systemctl restart kibana
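
After the restart, you can verify that both services are active again (systemctl status accepts several units at once):

systemctl status elasticsearch kibana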

Prometheus 2.0

But surely we don’t want to restart such services blindly. For a general health check on the status of the internal workers, we’ve included Prometheus 2.0 in “Zentral-all-in-one”.

So let’s inspect Prometheus right now. Just go to the Extra links section in Zentral, then select Prometheus.

Next, in Prometheus, navigate to the Status > Targets section. Here you can quickly see the health status of the built-in running services. Please take note of the Inventory worker dummy (1/1 up) entry here. We will work on that service next.
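
If you prefer the command line, the same target health is also exposed over the Prometheus HTTP API (a sketch assuming Prometheus listens on its default port 9090 on the instance; adjust host and port to your setup):

curl -s http://localhost:9090/api/v1/targets | python -m json.tool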

Let’s edit out the showroom dummies

The main configuration for Zentral is read from the base.json file, found inside the /home/zentral/conf directory. By default, Zentral ships with entries for “dummy” devices, and an Inventory worker dummy process is also running. The Inventory worker dummy generates recurring events: “inventory heartbeat” events written into Elasticsearch. For productive use we don’t need the dummy entries any more, so as our exercise here we will edit out the dummies. This is a good lesson in how to adjust and edit the Zentral base.json configuration.

For this task, either open up the GCP terminal window or log in to the running Zentral instance with ssh (see Chapter 2 for details).

Begin editing base.json

Just in case you happen to make any errors while editing the base.json file, let’s be safe by default and create a backup first:

sudo cp /home/zentral/conf/base.json /home/zentral/conf/base.json.bk

Next we edit the base.json config file with the vim text editor (basic vim editing skills required). Our goal is to disable the Inventory worker dummy entry.

sudo vim /home/zentral/conf/base.json

Just in case you don’t like vim, feel free to use nano instead:

sudo nano /home/zentral/conf/base.json

The Inventory dummy worker is found in the base.json config file under zentral.contrib.inventory > clients, as the zentral.contrib.inventory.clients.dummy object. Once it is removed from base.json, the inventory.clients.dummy worker will also be removed from the runtime as soon as all Zentral workers are restarted.

It is critical that we don’t mess up the JSON formatting here.

Take your time and edit out the JSON zentral.contrib.inventory.clients.dummy object; you need to include the surrounding curly brackets. As a result of our editing, an empty "clients": [] array remains in the base.json file. Now save (overwrite) the edited file.
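
For orientation, here is a rough sketch of the relevant part of base.json before the edit (an illustrative excerpt only; the exact attributes of the dummy client object in your file may differ):

"zentral.contrib.inventory": {
  "clients": [
    {
      "backend": "zentral.contrib.inventory.clients.dummy"
    }
  ]
}

And after the edit, with the dummy object removed:

"zentral.contrib.inventory": {
  "clients": []
}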

Next we can validate that the JSON formatting is not messed up by using python -m json.tool. Run the following command:

python -m json.tool /home/zentral/conf/base.json

We should see a successful JSON format validation result; then we can restart some services. If validation throws an error, we must correct it (remember, we can consult our original backup file) until the edit is correct.
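
If you can’t get the file valid again, you can always restore the backup we created earlier and start over:

sudo cp /home/zentral/conf/base.json.bk /home/zentral/conf/base.json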

Next we restart the workers with just this command:

sudo systemctl restart zentral_workers

As a successful result of the base.json edit and the restart of zentral_workers, we can see that the inventory dummy worker is no longer running and no longer present in the Prometheus 2.0 Targets section. This is the expected behaviour.

Restarting services

A service restart usually takes just a few seconds. Here is an additional reference on the service restart commands and their options. For systemctl, the common start, stop and restart commands all apply.

To restart installed software components:

sudo systemctl restart elasticsearch 
sudo systemctl restart kibana

To restart zentral workers:

sudo systemctl restart zentral_workers

To restart the Zentral web app (behind the Nginx reverse proxy):

sudo systemctl restart zentral_web_app

Finally, inside /home/zentral/app/utils we provide a tool to run a general restart of all services:

sudo /home/zentral/app/utils/reload_restart.sh

Run a status check

We can check the status of the services with Ubuntu’s systemd via the systemctl command. We can make use of a wildcard as well, e.g. to see all services with the “zentral” prefix:

systemctl status zentral*

Here is an example of the output with all services running well.
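
As a rough sketch of the shape of that output (illustrative only; unit descriptions, timestamps and paths will differ on your instance):

● zentral_workers.service - Zentral workers
   Loaded: loaded (/etc/systemd/system/zentral_workers.service; enabled)
   Active: active (running) since Sun 2018-02-25 10:00:00 UTC; 2h ago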

Inspect an error

Let’s see another example below; this time we look into a minor error message in the zentral_web_app:

The error from systemctl status:

exception ERROR Invalid HTTP_HOST header

This indicates the Gunicorn process started, but it refers to a previous IP that was set for the VM. An error like this can happen if you’ve stopped an instance for a while: unless you reserved the IP permanently, a later restart will assign a new external IP (an external IP can be reserved in GCP).
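
To avoid this, you can promote the ephemeral external IP to a reserved static address (a sketch using gcloud; the address name, IP and region here are placeholders, substitute the values of your instance):

gcloud compute addresses create zentral-ip --addresses=203.0.113.10 --region=europe-west1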

From the log we see this minor error happens in the zentral_web_app service. Keep in mind that we don’t need to restart the full VM; we can just restart the single service:

sudo systemctl restart zentral_web_app


Basic Log Viewing with journalctl

Next it’s nice to see some system logs; for this we can reach for the journalctl tool on Ubuntu/systemd.

Inspect the zentral_web_app service:

journalctl -u zentral_web_app.service

or Nginx and Prometheus (optionally use some extra flags):

journalctl -u nginx
journalctl -u nginx.service --since today
journalctl -u prometheus
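
To follow a log live, e.g. while restarting a service, journalctl can tail the output with the -f flag (-n limits how many recent lines are shown first):

journalctl -u zentral_workers -f -n 50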

Back on the Prometheus 2.0 status page, we see the inventory dummy worker is no longer present.

Archive devices in Zentral

We still see the dummy device entries. You can archive them by simply clicking the “Archive” button. See the animation below for the simple steps to archive a device.

Note: OK, you may ask why the dummy devices exist at all. Yes, we do see this requires an extra hop to get rid of. The reasoning is that we build the open source code for Zentral with tests. For this it’s just good to have an initial device entry present, and the dummy heartbeats help to test that the worker code doesn’t get broken… and… ¯\_(ツ)_/¯

Now let’s check back on the heartbeats in the Kibana UI: we can see that the stream of “dummy” heartbeats has stopped.

Wrap up

This is the end of our third chapter. We have learned some background on server-side monitoring and basic systemd and journalctl tasks. Knowing these in some detail helps to inspect server-side errors. Next, you can read in Chapter 4 (link provided soon) how to enable a single sign-on configuration for Zentral.


zentral

We’re the developers behind Zentral. We operate a consultancy business and provide expertise and services around Mac management. Contact: https://zentral.com