Thursday, January 26, 2017

Learning / Using jsonpath with OpenShift and Kubernetes

I work as a support engineer on OpenShift, and I spend a lot of time trying to get information out of the cluster. Often I have to use that information as the input to some other task I am doing.

Because the OpenShift / Kubernetes CLIs have such a wide range of output formats:
-o, --output='': Output format. One of: json|yaml|wide|name|go-template=...|go-template-file=...|jsonpath=...|jsonpath-file=... See golang template [] and jsonpath template [].
I have plenty of options to choose from when trying to complete my tasks. However, one is far superior to all the others (IMO): jsonpath.
  • Note: Many places on the web reference: as the source for where jsonpath started. 
Why is this better than the others?
  1. Well for starters it lets me get just the output I need. 
  2. It lets me format the data as I see fit. 
  3. It is simple to understand and use (compared to go-template).
The best way to show you how easy it is to use is to show you some examples.

Let's start with something simple, like getting the names of pods.
So if you have the following:
# [oc|kubectl] get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-11-57g56   1/1       Running   0          ---
registry-console-2-49irw   1/1       Running   0          ---
router-1-vinzu             1/1       Running   0          ---
One might just do something like:
# [oc|kubectl] get pods --no-headers | awk '{print $1}'
Simple enough, but if you had to pair this with another command, you might start to see issues with command nesting.

A simpler way (IMO) to get the same information is to do something like:
# [oc|kubectl] get pods -o jsonpath='{.items[*].metadata.name}{"\n"}'
  • Note: the '{"\n"}' in this command adds a new line at the end of the output, so that your terminal prompt does not end up on the same line as the output.
This lets me pull exactly the data I want out of the system and, in this case, return it the way I want it: as a list of names.
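To make the traversal concrete, here is a rough Python sketch of what a path like {.items[*].metadata.name} does once the JSON is parsed. The sample document is trimmed by hand and only reuses the pod names from the listing above; pod_names is a hypothetical helper of mine, not part of oc or kubectl.

```python
import json

# Hand-trimmed stand-in for `get pods -o json`; only the fields used here.
SAMPLE = json.loads("""
{
  "kind": "List",
  "apiVersion": "v1",
  "metadata": {},
  "items": [
    {"kind": "Pod", "metadata": {"name": "docker-registry-11-57g56"}},
    {"kind": "Pod", "metadata": {"name": "registry-console-2-49irw"}},
    {"kind": "Pod", "metadata": {"name": "router-1-vinzu"}}
  ]
}
""")


def pod_names(doc):
    """Rough equivalent of {.items[*].metadata.name}: one name per item."""
    return [item["metadata"]["name"] for item in doc["items"]]


# Join the matches with spaces to get them all on one line.
print(" ".join(pod_names(SAMPLE)))
```

The list comprehension plays the role of [*]: it visits every member of items and reads the same key from each one.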

If I wanted to do something more complicated, such as getting a specific service name and IP from the system, I could easily filter a set of objects down to one specific object and format the output.
# [oc|kubectl] get services -o jsonpath='{.items[?(@.metadata.name=="registry-console")].metadata.name}: {.items[?(@.metadata.name=="registry-console")].spec.clusterIP}{"\n"}'
The output is a single line of the form <service name>: <cluster IP>.
With output like this, I get the name of the service as well as the IP that corresponds to it. I also get it formatted exactly how I want it, without having to string together complex commands in bash.

One of the only downsides to using jsonpath is the documentation around the syntax.

If you read over it, there is just enough there to do damage and shoot yourself in the foot (only to find out you're just doing it wrong).

So to help simplify that, I'll look at the same examples above, but with the json they provide.

Whenever you are starting out, it's best to just run:
# [oc|kubectl] get <object> -o json > file.json
Simply put, it's easier to read the json first and then traverse through it to work out your jsonpath syntax.

Example JSON
(This is a Fedora pastebin and may no longer be around, sorry. # [oc|kubectl] get pods -o json > example.json is how to create it.)
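If reading the raw JSON by eye gets tedious, one option is to print every path in it and pick the one you need. The sketch below is my own helper, not something oc or kubectl provides; the inline document is a hand-written stand-in (the container name "router" is made up), and with a real dump you would load file.json with json.load instead.

```python
import json


def list_paths(node, prefix=""):
    """Yield a jsonpath-style path for every leaf value under node."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from list_paths(value, f"{prefix}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from list_paths(value, f"{prefix}[{index}]")
    else:
        yield prefix


# Hypothetical, heavily trimmed stand-in for the contents of file.json.
doc = json.loads("""
{
  "kind": "List",
  "items": [
    {
      "metadata": {"name": "router-1-vinzu"},
      "spec": {"containers": [{"name": "router"}]}
    }
  ]
}
""")

paths = list(list_paths(doc))
for path in paths:
    print(path)
```

Swapping a numeric index such as [0] for [*] in one of the printed paths gives a jsonpath expression that covers every member of that array.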

If you look at my example json, you can see in both examples that, because the CLI gives me information on multiple objects, it creates and returns a List (in Python terms, a dictionary with a key named "items" whose value is an array):
     "kind": "List",
     "apiVersion": "v1",
     "metadata": {},
     "items": [
Because of this, you often have to start your jsonpath expressions with .items[*], as you're going to be accessing the items array from this output. However, you might not need to do this, for example if you ask for the output of a specific pod:
# [oc|kubectl] get pod registry-console-2-49irw -o jsonpath='{.spec.containers[*].name}'
Because this command returns a single Pod, I don't have to go through the members of a list (or array) first.
    "kind": "Pod",
    "apiVersion": "v1",
    "metadata": {
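The List versus single-object difference can be sketched in Python as follows. This assumes the top-level "kind" is "List" for multi-object output, as in the JSON shown above; container_names is a hypothetical helper and the container name is invented for illustration.

```python
def container_names(doc):
    """Return container names whether doc is a List or a single Pod."""
    # A List wraps the pods in "items"; a single Pod is the object itself.
    pods = doc["items"] if doc.get("kind") == "List" else [doc]
    return [c["name"] for pod in pods for c in pod["spec"]["containers"]]


# Hypothetical trimmed objects; the container name is made up.
single_pod = {
    "kind": "Pod",
    "apiVersion": "v1",
    "metadata": {"name": "registry-console-2-49irw"},
    "spec": {"containers": [{"name": "registry-console"}]},
}
pod_list = {
    "kind": "List",
    "apiVersion": "v1",
    "metadata": {},
    "items": [single_pod],
}

print(container_names(single_pod))  # like {.spec.containers[*].name}
print(container_names(pod_list))    # like {.items[*].spec.containers[*].name}
```

Both calls reach the same data; only the first step of the path changes.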
So the first major tip I have with jsonpath is to learn how [*] and ['index'] work. ['index'] is exactly what you would expect if you were working with a Python dictionary and accessing a key to get its value. * is a special character that, inside [], lets you traverse all the members (indexes) of the array.

In 90% of what I do, [*] works; as shown above, .items is one of the most important things I have to traverse. Understanding that whatever follows the '.' after [*] is simply a key looked up in each member of the array makes the syntax much simpler to follow.
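Here is a toy path walker that shows the difference between the two steps. It is only a sketch of the idea, not how oc or kubectl implement jsonpath; walk and the "*" marker are my own hypothetical names, and "*" is only meant to be applied to arrays.

```python
def walk(doc, steps):
    """Follow steps through doc; "*" fans out over every member of an array."""
    current = [doc]
    for step in steps:
        following = []
        for node in current:
            if step == "*":
                following.extend(node)        # every member of the array
            else:
                following.append(node[step])  # one key, or one numeric index
        current = following
    return current


# Trimmed sample reusing two pod names from the listing earlier in the post.
doc = {
    "items": [
        {"metadata": {"name": "docker-registry-11-57g56"}},
        {"metadata": {"name": "router-1-vinzu"}},
    ]
}

all_names = walk(doc, ["items", "*", "metadata", "name"])  # .items[*].metadata.name
first_name = walk(doc, ["items", 0, "metadata", "name"])   # .items[0].metadata.name
print(all_names)
print(first_name)
```

After the "*" step, every later step is applied to each member in turn, which is why one expression can return several values.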

The next most complicated concepts to learn with jsonpath are filters and the 'current object' operator. As shown in the example above, ?() is used to create a filter, a simple 'if TRUE' type of syntax. In my example I used it within an array's [] to identify a service by its name. The only way this works is if I can access the 'current object' as I pass through the array. This is what the @ operator does.
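In Python terms, a filter such as [?(@.metadata.name=="registry-console")] behaves roughly like the sketch below, where the lambda argument stands in for @. The service names and cluster IPs are made up for illustration, and select is a hypothetical helper.

```python
# Hypothetical trimmed stand-in for `get services -o json`; IPs are invented.
SERVICES = {
    "kind": "List",
    "items": [
        {"metadata": {"name": "docker-registry"},
         "spec": {"clusterIP": "172.30.0.10"}},
        {"metadata": {"name": "registry-console"},
         "spec": {"clusterIP": "172.30.0.20"}},
    ],
}


def select(items, predicate):
    """Keep only the items for which predicate(current) is true."""
    return [current for current in items if predicate(current)]


# `current` plays the role of @: the array member being looked at right now.
matches = select(
    SERVICES["items"],
    lambda current: current["metadata"]["name"] == "registry-console",
)

for svc in matches:
    print(f'{svc["metadata"]["name"]}: {svc["spec"]["clusterIP"]}')
```

The predicate is evaluated once per array member, and only the members that pass continue on to the rest of the path.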

With these concepts, you're almost certain to understand jsonpath, and you'll be able to get almost any data out of OpenShift or Kubernetes.

However, there is one thing I have not shown an example of: another way to traverse arrays. This method uses the range and end keywords.

If we go back to the services example, you can traverse the services and create output (that looks like arrays) with something like:
# [oc|kubectl] get services -o jsonpath='{range .items[*]}[{.metadata.name},{.spec.clusterIP}] {end}{"\n"}'
You might find this more natural when traversing an array. However, because of the length of the syntax, I often don't use this method.
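For comparison, a template such as {range .items[*]}[{.metadata.name},{.spec.clusterIP}] {end}{"\n"} maps almost line for line onto a plain loop. This is only a sketch; the service names and IPs below are made up.

```python
# Hypothetical trimmed stand-in for `get services -o json`; IPs are invented.
services = {
    "items": [
        {"metadata": {"name": "docker-registry"},
         "spec": {"clusterIP": "172.30.0.10"}},
        {"metadata": {"name": "registry-console"},
         "spec": {"clusterIP": "172.30.0.20"}},
    ]
}

pieces = []
for item in services["items"]:            # {range .items[*]}
    name = item["metadata"]["name"]       # {.metadata.name}
    ip = item["spec"]["clusterIP"]        # {.spec.clusterIP}
    pieces.append(f"[{name},{ip}] ")      # literal text between the braces
output = "".join(pieces) + "\n"           # {end}{"\n"}

print(output, end="")
```

Everything between range and end is the loop body, so literal characters such as the brackets, the comma and the trailing space are repeated for each item.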
