Simple Centralized Logging with Fluentd and S3

This post describes a simple centralized logging system for anyone who doesn’t have strict logging performance requirements and would like an easy way to collect logs from multiple microservices in a single place.

In this first post of a three-part series, we will create a stack with Fluentd and Minio, which serves as our local S3 server for testing purposes. We will also use an httpd server as our test log producer.

In the second post we will add a (tricky) way to visualize our logs, using grafana and the simple-json plugin to retrieve our JSON-formatted logs from S3. This requires some time, as I’ll have to write a proxy in front of grafana to ease our life (WIP - Post II).

Requirements

  • Docker
  • docker-compose

Architecture overview

This is a very simple centralized logging system, so the architecture is simple too. A fluentd daemon running on a host ships our JSON-formatted logs to S3. For testing purposes all of this happens on our local host, and we will use an S3 API compatible system named minio, which provides an image suitable for local development.

tl;dr

  • Log Producers (microservices or anything else)
  • Fluentd
  • S3

Final folder structure

You might want to create this folder structure before you start writing any files.

📦fluentd-s3
 ┣ 📂fluentd
 ┃ ┣ 📜Dockerfile
 ┃ ┗ 📜fluentd.conf
 ┣ 📂logs
 ┣ 📜.gitignore
 ┗ 📜docker-compose.yml
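
If you would rather script the folder structure above than create it by hand, a minimal sketch could look like this. It only creates empty placeholder files in the current directory; their contents come from the rest of the post.

```shell
# Create the project skeleton shown above.
# The files are created empty and filled in later in the post.
mkdir -p fluentd-s3/fluentd fluentd-s3/logs
touch fluentd-s3/fluentd/Dockerfile \
      fluentd-s3/fluentd/fluentd.conf \
      fluentd-s3/.gitignore \
      fluentd-s3/docker-compose.yml
```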

Creating our stack

Fluentd

We already know that the stack for this post is composed of two services: one for shipping (fluentd) and one for storage (S3, minio here). We will start by building our own fluentd image so we can install the s3 output plugin. For that we will use a Dockerfile.

Let’s create our Dockerfile under the fluentd folder, with the following contents:

FROM fluentd:v1.9.1-1.0
USER root
RUN fluent-gem install fluent-plugin-s3
COPY fluentd.conf /fluentd/etc/fluent.conf

We need to switch to the root user to install the required plugin and to copy the config file. You shouldn’t run as root, but for the purpose of this post let’s leave it as it is.

Before we can build our Docker image, we must first add the configuration file referenced in the Dockerfile.

Let’s create our fluentd configuration file (fluentd.conf in the fluentd folder), where we set the sources we accept and where to send the output for a given tag pattern. In this case the output goes to our S3-like container.

<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
<match **>
  @type s3

  aws_key_id "#{ENV['AWS_KEY_ID']}"
  aws_sec_key "#{ENV['AWS_SECRET_KEY']}"
  s3_bucket "#{ENV['S3_BUCKET']}"
  s3_endpoint "#{ENV['S3_ENDPOINT']}"
  path logs/
  buffer_path /var/log/fluent/s3

  time_slice_format %Y%m%d%H
  time_slice_wait "#{ENV['UPLOAD_INTERVAL']}"
  utc
  store_as json
  buffer_chunk_limit 256m
</match>

Our source block indicates that we will receive logs on port 24224, the default fluentd forward port, over tcp and udp, and that we accept connections from everywhere (this is for simplicity).

Our match block indicates that logs with any fluentd tag are accepted and sent to an S3 endpoint and bucket using our own credentials. All of these values are set when the daemon starts, through environment variables.

The line s3_endpoint "#{ENV['S3_ENDPOINT']}" sets the value of s3_endpoint from an environment variable; this is fluentd’s syntax for embedding Ruby code in a double-quoted value. You may name these variables as you wish, as long as you keep the "#{ENV['name_of_var']}" format.

The line @type s3 selects the s3 output plugin we installed.
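
The match block also decides how the uploaded objects are named. The sketch below rebuilds the key you should expect for the current hour, assuming the plugin’s default key format %{path}%{time_slice}_%{index}.%{file_extension}. The function name s3_object_key is made up for this example, and the index is a counter the plugin manages itself, so treat the output as an illustration rather than a guarantee.

```shell
# Assumption: fluent-plugin-s3 uses its default object key format,
#   %{path}%{time_slice}_%{index}.%{file_extension}
# s3_object_key is a hypothetical helper, not part of fluentd.
s3_object_key() {
  path="$1"        # "path" from the match block, e.g. logs/
  index="$2"       # per-slice counter kept by the plugin, starting at 0
  extension="$3"   # json for store_as json, gz for store_as gzip
  # Same pattern as time_slice_format %Y%m%d%H, in UTC because of "utc"
  time_slice="$(date -u +%Y%m%d%H)"
  printf '%s%s_%s.%s\n' "$path" "$time_slice" "$index" "$extension"
}
```

Calling s3_object_key logs/ 0 json prints a key made of the logs/ prefix, the current UTC hour as ten digits, then _0.json.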

You may configure multiple sources and matches to output to different places.

We have now created our fluentd Dockerfile, and later our compose file will build the image for us directly.
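
If you want to check the image before wiring it into compose, something along these lines should do, run from the fluentd-s3 folder. The function name is made up for this example. It overrides the image entrypoint so that fluent-gem runs directly instead of the fluentd daemon.

```shell
# Hypothetical helper: build the fluentd image by hand and confirm the
# s3 plugin gem is installed. Assumes docker is available on the host.
build_and_check_fluentd_image() {
  image="${1:-fluentd-with-s3}"
  # Build from the fluentd/ folder, which holds the Dockerfile and fluentd.conf
  docker build -t "$image" fluentd || return 1
  # Run fluent-gem instead of the daemon and list gems matching the plugin name
  docker run --rm --entrypoint fluent-gem "$image" list fluent-plugin-s3
}
```

Running build_and_check_fluentd_image should end with a line naming fluent-plugin-s3 and its installed version.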

Docker-compose

Now that we have a way to ship our logs with fluentd, we need a place to store them. Since we will not be building our own minio image, let’s go straight to our docker-compose file.

version: "3.3"
services:
  fluentd:
    image: fluentd-with-s3
    build:
      context: fluentd
    environment:
      AWS_KEY_ID: "admin"
      AWS_SECRET_KEY: "supersecret"
      # The plugin marks s3_endpoint as deprecated for AWS S3 and asks for a region instead. It is still what we need for the many non-AWS services with S3-compatible APIs (e.g. scaleway, digitalocean)
      S3_ENDPOINT: "http://s3:9000/"
      S3_BUCKET: "my_logging_bucket"
      UPLOAD_INTERVAL: 10s
      UPLOAD_TYPE: json
    ports:
      - "24224:24224"
      - "24224:24224/udp"
  
  #In our scenario s3 will be served from minio as an example as there are multiple S3 alternatives that also use the same api - Check minio image and docs at https://hub.docker.com/r/minio/minio/ 
  s3:
    image: minio/minio
    environment:
      MINIO_ACCESS_KEY: "admin"
      MINIO_SECRET_KEY: "supersecret"
    volumes:
      - ./logs:/data
    # Override the entrypoint so we can create the bucket folder before starting the server
    entrypoint: sh
    command: -c 'mkdir -p /data/my_logging_bucket && minio server /data'
    ports:
      - "9000:9000"

Now we have two services in our stack: fluentd, whose image we build from the fluentd folder context and name fluentd-with-s3, and the minio image, running as the service named s3.

Since minio mimics the S3 API, it takes MINIO_ACCESS_KEY and MINIO_SECRET_KEY instead of an AWS access key and secret. The behaviour will be the same if you wish to use minio cloud, S3, or even scaleway object storage, as all of them use S3-compatible APIs.

In the command for the minio service we create a folder that will hold our “bucket”, named my_logging_bucket, which is the bucket referenced in our fluentd container. In minio server /data, /data is the directory minio serves its buckets from.

Ok, so now we may start our stack with

docker-compose up -d

Minio and fluentd will launch. As you might have noticed, we also share a volume with our host, so we can check our files locally as well.
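
If you prefer a quick check from the command line, here is a sketch that polls minio with curl until it answers. It assumes the minio version you pull exposes the liveness endpoint /minio/health/live (the compose file does not pin a version). The function name and the retry count are made up for this example.

```shell
# Hypothetical helper: wait until minio answers on its liveness endpoint.
# Assumption: the running minio version serves /minio/health/live.
wait_for_minio() {
  url="${1:-http://127.0.0.1:9000}"
  tries="${2:-10}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl fail on HTTP errors, -sS hides progress but keeps errors
    if curl -fsS -o /dev/null "$url/minio/health/live"; then
      echo "minio is up"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "minio did not answer after $tries attempts" >&2
  return 1
}
```

Call it as wait_for_minio, or pass another base URL and number of attempts as the first and second arguments.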

To make sure your stack is up and running, let’s check the minio user interface that comes bundled with this image and is served on port 9000.

Navigate to http://127.0.0.1:9000

You should be greeted with something similar to:

When we log in with our awesome credentials

  • username: admin
  • password: supersecret

We should be greeted with our minio dashboard with our buckets

When you first start, you shouldn’t have a logs folder like the one that appears in the above image. It will be created by our fluentd daemon.

But for that to happen we must first generate logs, so we will use the httpd sample image to produce some. For simplicity we can add it to our current stack in the same compose file.

  web:
    image: httpd
    ports:
      - "80:80"
    links:
      - fluentd
    logging:
      driver: "fluentd"
      options:
        fluentd-address: localhost:24224
        tag: httpd.access

Notice the logging definition, and that it references localhost. Note: logging is done by the Docker daemon here, using the host’s network, which is why the address is localhost and why fluentd publishes its ports to the host. If you set the fluentd address the way you would with Docker’s internal DNS, in this case fluentd:24224, that name would not be known to your host unless you added it to /etc/hosts, defeating the purpose.

You may also notice the tag option, which is attached to the generated logs to identify the service.
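
The same logging options also work outside of compose. As a sketch, this is roughly the docker run equivalent of the web service definition, wrapped in a function whose name is made up for this example. The address is again resolved by the Docker daemon on the host, so localhost:24224 is the default.

```shell
# Hypothetical helper: start httpd with the fluentd logging driver,
# mirroring the "logging" section of the compose service.
run_web_with_fluentd_logging() {
  fluentd_address="${1:-localhost:24224}"   # resolved by the Docker daemon on the host
  tag="${2:-httpd.access}"                  # tag attached to every log record
  docker run -d -p 80:80 \
    --log-driver=fluentd \
    --log-opt "fluentd-address=$fluentd_address" \
    --log-opt "tag=$tag" \
    httpd
}
```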

Let’s start our sample container.

docker-compose up -d web

Now that our sample container is running, try to reach it with a browser, curl, etc., including pages that return 404, to generate richer logs.

  • http://localhost
  • http://localhost/doesnt-matter
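
To avoid clicking around, a small loop can hit both a page that exists and pages that don’t. This is only a sketch: the function name and the doesnt-matter-N paths are made up for this example, and it prints the HTTP status code of each request.

```shell
# Hypothetical helper: produce a mix of successful and 404 requests
# so the httpd container has something to log.
generate_traffic() {
  base_url="${1:-http://localhost}"
  count="${2:-5}"
  i=1
  while [ "$i" -le "$count" ]; do
    # The index page should exist and answer 200
    curl -s -o /dev/null -w '%{http_code}\n' "$base_url/"
    # A made-up path that httpd should answer with 404
    curl -s -o /dev/null -w '%{http_code}\n' "$base_url/doesnt-matter-$i"
    i=$((i + 1))
  done
}
```

For example, generate_traffic http://localhost 10 sends twenty requests in total.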

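After waiting for fluentd to ship the logs, we can browse our log files in the minio UI dashboard. Keep in mind that UPLOAD_INTERVAL: 10s sets time_slice_wait, which is how long fluentd waits after a time slice closes before uploading it; with the hourly time_slice_format above, a slice may only show up once its hour has ended.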

You might see something similar to the image below:

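You might also notice from the image above that my log files are in .gz format. That is the default format of the fluentd s3 output plugin (store_as gzip). The format is controlled by the store_as setting in fluentd.conf, which accepts values such as gzip, json and text. Note that the UPLOAD_TYPE variable in the compose file only has an effect if fluentd.conf reads it, for example with store_as "#{ENV['UPLOAD_TYPE']}".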

Now you can play around with this stack, adding more services and centralizing their logs in one location.

For now, though, the logs are not searchable in the UI: you have to download a file and search it with your favourite editor.
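
Until then, because the logs folder is shared with the host, a rough search from the shell is possible. The sketch below assumes minio keeps each object as a plain file under ./logs/my_logging_bucket/logs, which is what creating the bucket with mkdir suggests but may not hold for every minio version. The function name is made up for this example.

```shell
# Hypothetical helper: grep through the shipped log files on the host.
# Assumption: objects are stored as plain files under the bucket folder.
search_logs() {
  pattern="$1"
  dir="${2:-./logs/my_logging_bucket/logs}"
  find "$dir" -type f | while IFS= read -r f; do
    case "$f" in
      *.gz) gzip -dc "$f" ;;   # store_as gzip objects
      *)    cat "$f" ;;        # store_as json or text objects
    esac
  done | grep -- "$pattern"
}
```

For example, search_logs 404 prints every log line that contains 404, whether the file is compressed or not.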

Thanks for reading.

If you found this post useful, please share. You may as well follow me on Twitter.

This post is licensed under CC BY 4.0 by the author.