Easy log forwarding on K8s using Filebeat CRDs

Roberto Javier Yudice Monico
Oct 16, 2020

Log forwarding is an essential part of any Kubernetes cluster. Due to the ephemeral nature of pods, you want to persist all logging data somewhere outside the pod, so that it can be viewed beyond the pod’s lifetime, and also outside your worker nodes, as they can die too.

Filebeat is a well-established log shipper. In the Kubernetes ecosystem you have other options such as Fluent Bit and Fluentd (the CNCF-backed option), but I found Filebeat easier to set up than either of them, at least to get it running, and it is just as feature-rich.

To make the setup easier we are going to use Elastic’s ECK, which is a set of CRDs for the ELK stack on Kubernetes. I won’t go into too much detail on CRDs, but in short they allow you to extend Kubernetes beyond the typical resources such as Pods, Services, etc., so that you can define your own custom resources in a declarative way. This streamlines your cluster setup, since you can have everything centralized in one place.

Creating a test cluster

In case you haven’t heard about k3d, this would be a good time to use it. K3d can help you set up a lightweight Kubernetes cluster on your computer within seconds, and it supports multiple platforms, including Windows.
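For example, something like the following creates a throwaway cluster (the cluster name and agent count here are just illustrative choices, not requirements of this setup):

```shell
# Create a small local cluster with one server and two agent nodes
k3d cluster create logging-demo --agents 2

# Verify that kubectl is now pointing at the new cluster
kubectl cluster-info
```

When you are done experimenting, `k3d cluster delete logging-demo` tears everything down again.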

Creating the CRDs

The first step is to create the ECK CRDs in your Kubernetes cluster. To do that, run the following command, taken from ECK’s documentation. For more information, refer to the ECK docs.

kubectl apply -f https://download.elastic.co/downloads/eck/1.2.1/all-in-one.yaml

Creating the Role and Service Account

Next we need the ClusterRole and ServiceAccount, plus the ClusterRoleBinding that ties them together, which will be used by Filebeat.

First let's create the namespace where we will create all of our filebeat resources:

kubectl create ns elastic-system

Now onto the YAML. Before applying it, ensure that you are in the “elastic-system” namespace we created earlier, or pass the namespace to kubectl with the “-n elastic-system” flag.
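Either approach works; as a sketch (the file name `filebeat-rbac.yaml` is just a placeholder for wherever you saved the manifest below):

```shell
# Option 1: switch your current context's default namespace once…
kubectl config set-context --current --namespace=elastic-system

# Option 2: …or pass the namespace explicitly on each command
kubectl apply -f filebeat-rbac.yaml -n elastic-system
```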

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources:
      - namespaces
      - pods
    verbs:
      - get
      - watch
      - list
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: elastic-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
  - kind: ServiceAccount
    name: filebeat
    namespace: elastic-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io

Creating the filebeat resource

Now that we have the CRDs and the service account created, we can define a Beat resource using the following YAML. Again, remember this should be applied in the “elastic-system” namespace:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: filebeat
spec:
  type: filebeat
  version: 7.9.1
  image: docker.elastic.co/beats/filebeat-oss:7.9.1
  config:
    output.elasticsearch:
      hosts: ["https://yourelasticsearch"]
      index: "k8s-%{[agent.version]}-%{+yyyy.MM.dd}"
    setup.template.name: "k8s"
    setup.template.pattern: "k8s-*"
    setup.ilm.enabled: false
    filebeat:
      autodiscover:
        providers:
          - type: kubernetes
            node: ${NODE_NAME}
            hints:
              enabled: true
              default_config:
                type: container
                paths:
                  - /var/log/containers/*${data.kubernetes.container.id}.log
    processors:
      - add_cloud_metadata: {}
      - add_host_metadata: {}
  daemonSet:
    podTemplate:
      spec:
        serviceAccountName: filebeat
        automountServiceAccountToken: true
        terminationGracePeriodSeconds: 30
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true # Allows to provide richer host metadata
        containers:
          - name: filebeat
            securityContext:
              runAsUser: 0
              # If using Red Hat OpenShift uncomment this:
              #privileged: true
            volumeMounts:
              - name: varlogcontainers
                mountPath: /var/log/containers
              - name: varlogpods
                mountPath: /var/log/pods
              - name: varlibdockercontainers
                mountPath: /var/lib/docker/containers
            env:
              - name: NODE_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: spec.nodeName
        volumes:
          - name: varlogcontainers
            hostPath:
              path: /var/log/containers
          - name: varlogpods
            hostPath:
              path: /var/log/pods
          - name: varlibdockercontainers
            hostPath:
              path: /var/lib/docker/containers

A few things to highlight about the YAML:

  1. I changed the image to the “filebeat-oss” image. This is the image of the open-source distribution of Filebeat; the default image (which doesn’t end in “-oss”) will look for an Elasticsearch license and fail if you don’t have one. For a list of images you can go to http://docker.elastic.co/.
  2. Replace “https://yourelasticsearch” with your Elasticsearch endpoint.
  3. All logs will go into a new Elasticsearch index matching the index pattern “k8s-*”.

About Autodiscovery

In the Filebeat configuration that is part of the Beat resource we created above, you’ll notice we are using autodiscover. This simplifies things for us: Filebeat automatically finds all pods and forwards their logs without any per-pod setup, and it also enriches the log data with Kubernetes metadata, such as the pod’s namespace and labels, which makes the logs much easier to search and filter.
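Because we enabled hints (`hints.enabled: true`), individual pods can also override the default behavior through `co.elastic.logs/*` annotations. A sketch, using a hypothetical pod name and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app          # hypothetical example pod
  annotations:
    # Treat this pod's stdout as JSON and lift the parsed keys to the top level
    co.elastic.logs/json.keys_under_root: "true"
    # Or opt a noisy pod out of log collection entirely:
    # co.elastic.logs/enabled: "false"
spec:
  containers:
    - name: my-app
      image: my-app:latest
```

This lets application teams tune log collection per workload without touching the central Filebeat configuration.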

Troubleshooting

If you aren’t seeing any logs in Elasticsearch, check the logs of the DaemonSet that was created when you applied the Beat resource YAML. It should be named “filebeat-beat-filebeat”; you can use the command below to view its logs and look for errors.

kubectl logs ds/filebeat-beat-filebeat -n elastic-system

Wrap up

So now you have a basic log forwarding setup that should serve as a starting point for something more robust. This setup is missing log parsing: you will often want to enrich your log data as much as you can, which means parsing the messages themselves, for example to extract which logger emitted the message in the case of a web application. You can look at Elasticsearch Ingest Node for that.
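As a rough sketch of that direction: you could define an ingest pipeline in Elasticsearch and point Filebeat’s output at it. The pipeline name, endpoint, and grok pattern below are all assumptions to adapt to your own log format:

```shell
# Create a hypothetical "k8s-logs" pipeline that parses lines shaped like
# "2020-10-16T12:00:00Z INFO my.app.Logger something happened"
curl -X PUT "https://yourelasticsearch/_ingest/pipeline/k8s-logs" \
  -H 'Content-Type: application/json' -d'
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{TIMESTAMP_ISO8601:log.timestamp} %{LOGLEVEL:log.level} %{NOTSPACE:log.logger} %{GREEDYDATA:log.message}"]
      }
    }
  ]
}'
```

Then reference it from the Beat resource by adding `pipeline: "k8s-logs"` under `output.elasticsearch`, so every document is run through the pipeline before it is indexed.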
