
Thursday, March 30, 2023

Sending Kibana (free/open source) Alerts via Webhook Using Fluent-Bit (free)



Background

This is a case where we helped a customer save quite a bit of money by using software they already owned rather than paying a large upcharge for additional licenses that they didn't need.

Suppose that, for any number of good reasons, your use case only calls for the free version of Elastic. You also want to integrate alerts with your ticketing system. The challenge is that the free version of Kibana does not include a webhook connector for alerts: only the Server log connector is available with the free license, while the Webhook connector (and others) require a paid license.

I have a customer in the above situation. An application they purchased is bundled in an appliance running a packaged Kubernetes distribution. The application also includes Fluent Bit for log collection into Elasticsearch. The initial challenge was to send alerts to their on-prem Netcool environment when certain log messages were written. We helped them meet this challenge using the webhook output of Fluent Bit to send the appropriate messages to the Netcool message bus probe, which would then create an incident in their ticketing system for each of these alerts.

Their next requirement was to only create incidents based on some aggregation of log messages. Specifically, they obtained several Elasticsearch queries from the vendor that should be used to generate incidents. This is really straightforward when using one of the paid Elastic licenses because you can simply write a rule with the Elasticsearch query as a condition and the built-in webhook connector to define an action that sends a message. With the free license of Kibana, that connector isn't available. 

My Solution

The trick to the solution in this case is to use the Server Log connector in Kibana to write a specifically formatted message to the log when the Elasticsearch query condition is met. The message can be something like:

CREATE_INCIDENT Vendor Query X has breached the prescribed threshold. Take action Y to correct.

This message is written to the log file for the Kibana pod, which is already being tailed by Fluent Bit. So we just needed to create a FILTER in Fluent Bit to match this log message and route it to the message bus probe.
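
A minimal sketch of what that configuration could look like, using the same rewrite_tag approach described in the Fluent Bit post below (the log field name, the INC tag, and the probe host and URI are all placeholders for your environment):

[FILTER]
    Name        rewrite_tag
    Match_Regex ^(?!INC).*
    # Copy any record whose log field contains the marker to a new INC tag;
    # the trailing "true" keeps the original record in the pipeline.
    Rule        $log  ^.*CREATE_INCIDENT.*  INC  true

[OUTPUT]
    Name    http
    Match   INC
    Host    probehost
    Port    80
    URI     /probe/webhook/fluentbit
    Format  json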

Wednesday, March 29, 2023

Modify kibana.yml after deploying Kibana with Helm

If you deploy Kibana using the Elastic helm chart with default values, you'll find that you don't have any obvious way to modify the kibana.yml file. For example, if you log into the Kibana pod with

kubectl exec --stdin --tty kibana_podname -- /bin/bash

you'll find that there's no editor available (like vi or even ed). You can cat config/kibana.yml, but the comments state that it is auto-generated. So what are you supposed to do to add a setting to the file? For example, you might need to add a value for xpack.encryptedSavedObjects.encryptionKey so you can configure alerting.

The solution I came up with is a multi-step process:

1. Get the default values.yaml file for the chart and store that in a file with the command:

helm show values elastic/kibana > /tmp/kibana.yaml

2. Edit that file to add a section for kibana.yml under kibanaConfig. Originally, kibanaConfig is empty (set to {}). You need to change it to be something like:

kibanaConfig:
  kibana.yml: |
    xpack.encryptedSavedObjects.encryptionKey: "xxxxxxxxxxxxxxxxxxxx"
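
Note that Kibana requires this encryption key to be at least 32 characters long. If you need to generate a suitable random value and have openssl available, something like this works:

openssl rand -hex 32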


3. Now (unintuitively at least to me) uninstall the helm chart with:

helm uninstall kibana

4. Then install the helm chart again with:

helm install kibana elastic/kibana -f /tmp/kibana.yaml

And that's it. Your changes will be applied and you're good to go.
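
If you want to confirm that the setting actually landed, you can check the values Helm applied, or cat the generated file in the new pod (substituting your actual pod name):

helm get values kibana
kubectl exec --stdin --tty kibana_podname -- cat config/kibana.yml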

I'm pretty sure there's a way to create a ConfigMap and reference it, which would then allow you to just delete the pod to have it re-read the ConfigMap, but I haven't figured out those exact details. Maybe in another post.

Wednesday, February 8, 2023

Configuring Fluent Bit to send messages to the Netcool Message Bus probe

Background

Fluent Bit is an open source, multi-platform log processor that aims to be a generic Swiss Army knife for log processing and distribution.

It is included with several distributions of Kubernetes, and is used to pull log messages from multiple sources, modify them as needed, and send the records to one or more output destinations. It is amazingly customizable, so you can do just about any processing you want, with a couple of idiosyncrasies, one of which I'll describe here.

The Challenge

What if you have a log message that you want to handle in two different ways:

1. Normalize the fields in the log message for storage in Elasticsearch (or Splunk, etc.).

2. Modify the log message so it has all of the appropriate fields needed for processing by your Netcool environment (fields that you don't necessarily want in your log storage system).

The Solution

Because Fluent Bit routes records by tag, and each record carries exactly one tag, what you need to do is create a new copy of the log message, preserving the original so that it can go through your "standard" processing while the new message is processed according to your needs in Netcool.

The specifics of this solution are to use a rewrite_tag FILTER to create a new, distinct copy of the message with a custom tag within the Fluent Bit pipeline, and then configure the appropriate additional FILTERs and OUTPUTs that only Match this new, custom tag. You also need to modify any existing OUTPUTs to exclude this new tag.

Here's a high-level graphic showing what we're going to do:

[Diagram: the rewrite_tag FILTER splits the pipeline, with the original records continuing through the existing FILTERs to the Elasticsearch OUTPUT, while the new INC-tagged copies flow through their own FILTERs to the http OUTPUT for the Message Bus probe]

Our rewrite_tag FILTER is going to match all tags beginning with "kub". This will exclude our new tag, which will be "INC". So after the rewrite_tag filter, there will be two messages in the pipeline: the original plus our new one with our custom "INC" tag. We can then specify the appropriate Match statements in later FILTERs to only match the appropriate tag. So in the ES output above, the Match_Regex statement is:

Match_Regex  ^(?!INC).*

This pattern is a regex negative lookahead. Go ahead and try it out at regex101.com if you want. It will match any tag that does NOT begin with "INC", which is the custom tag for our new messages that we want to send to our HTTP Message Bus probe.

The rewrite_tag FILTER will be custom for your environment, but the following may be close in many cases. In my case, I want to match any message that has a log field containing the string "Error writing to" (note that the Rule regex below is case-sensitive). You'll have to analyze your current messages to find the appropriate field and string that you're interested in. But here's my rewrite_tag FILTER stanza:

[FILTER]
    Name rewrite_tag
    Match_Regex ^(?!INC).*
    Rule    $log  ^.*Error\swriting\sto.* INC true

The "Rule" statement is the tricky part here. This statement consists of 4 parts, separated by whitespace:

Rule - the literal string "Rule"
$log - the name of the field you want to search to create a new message, preceded by "$". In this case, we want to search the field named log.
^.*Error\swriting\sto.* - the regular expression we want to match in the specified field. This regular expression CANNOT CONTAIN SPACES. That's why I'm using "\s".
INC - this is the name of the tag to set on the new message. This tag is ONLY used within the Fluent Bit pipeline, so it can literally be anything you want. I chose "INC" because these messages will be sent to the Message Bus probe to eventually create incidents in ServiceNow.
true - this specifies that we want to KEEP the original message. This allows it to continue to be processed as needed.

After you have the rewrite_tag FILTER in place, you will have at least one additional FILTER of type "modify" in your pipeline that matches the new tag, allowing you to add fields, rename fields, etc.
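
As a rough sketch of such a FILTER (the field names and values here are made-up placeholders; your Netcool environment dictates what the probe actually expects):

[FILTER]
    Name    modify
    Match   INC
    # Add fields the probe/Netcool rules expect and rename the raw log field.
    Add     AlertGroup FluentBit
    Add     Severity 5
    Rename  log Summary

You'll then have an OUTPUT stanza of type "http" to specify the location of the Message Bus probe. Something like the following: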

[OUTPUT]
    Name    http
    Match   INC
    Host    probehost
    Port    80
    URI     /probe/webhook/fluentbit
    Format  json
    json_date_format epoch

The above specifies that the URL that these messages will be sent to is 

http://probehost:80/probe/webhook/fluentbit

In the JSON that's sent in the body of the POST request, there will be a field named date, and it will be in Unix "epoch" format: an integer number of seconds since January 1, 1970 UTC (a "normal" Unix/Linux timestamp).
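
If you want to see roughly what the probe will receive, here's a hedged approximation as a curl command (with format json, Fluent Bit POSTs an array of records; the Summary and Severity fields assume a modify FILTER like the sketch above, and the values are made up):

curl -X POST http://probehost:80/probe/webhook/fluentbit \
  -H "Content-Type: application/json" \
  -d '[{"date":1675861200,"Summary":"Error writing to disk","Severity":"5"}]'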

That's it. That's all of the basic configuration needed on the Fluent Bit side.

Extra Credit/TLS Config

If your Message Bus probe is using TLS, you just need to add the following two lines to the above OUTPUT stanza:

    tls On
    tls.verify Off

The first line enables TLS encryption. The second line is a shortcut that allows the connection to succeed without having to add the appropriate certificates to Fluent Bit: it will accept any certificate presented by the Message Bus probe, even a self-signed one, which also means certificate validation is disabled.

Tuesday, February 5, 2019

A great video on deploying and operating Kubernetes at scale

Here's a video from Chick-fil-A's IT team describing exactly how they use Kubernetes clusters at the edge (in each restaurant). The problems and their solutions are really intriguing.

https://www.youtube.com/watch?v=8edDcy3oeUo


Tuesday, December 4, 2018

If you run Kubernetes in the cloud, the first major vulnerability found isn't a huge issue

The first major Kubernetes (aka K8s) vulnerability was found yesterday:

https://www.zdnet.com/article/kubernetes-first-major-security-hole-discovered/

It's a pretty big deal and quite scary, but patches were immediately available upon disclosure. What's even better is that the managed Kubernetes services running on AWS, Azure, and Google Cloud Platform have all been patched already. If you're managing your own K8s clusters, however, you need to patch them yourself, which takes time and know-how.

In my eyes, this is another data point that shows how proper use of cloud resources can be extremely beneficial to a company. Specifically, the big cloud players, especially AWS, are very similar to a highly competent and agile outsourced IT department. They have offerings that are years ahead of what most companies could stand up onsite, and they've got testing methodologies in place to ensure those offerings are available 99.9% of the time.

It's true that there can be some issues in moving to the cloud, but many of the problems of the past now have very robust solutions that are included in the offerings. And those offerings are available on a pay-as-you-go basis in many cases, so you can easily keep tabs on exactly how much you're spending, even on a per-application basis.

To ensure a successful digital transformation, contact us to get the experienced help that will put you on the right path.

Monday, November 26, 2018

Every enterprise is already using serverless applications in some form or another

If you have an application that makes a call to an external application, then you're on the calling side of a serverless application. Here's a high-level graphic to illustrate my point:

[Diagram: your application sends a request to a service known only by an IP address or hostname and gets Results back, with everything behind that service hidden from view]

You essentially have no insight into how the Results are generated by the "cloud" you're accessing via IP address or hostname. So you're accessing a service, but the actual server part of that interaction is abstracted from you.

Here's a great article on the concept of "Servicefull Serverless" to go into more detail about this:

https://www.infoq.com/articles/serverless-sea-change

Now, the current definition of "serverless" leverages all kinds of technologies like AWS Lambda, Apache OpenWhisk, or even Cloudflare Workers (built on V8 isolates), on top of containers and Kubernetes running in VMs (or bare metal in Cloudflare's case). So it's extremely important for you to understand those components at some point, but from your view as a consumer, you're already using serverless technology.