Quick Notes on Kubernetes with Go

In this blog post, we talk about the various levels of abstraction available for working with Kubernetes (K8s) in Go. We begin with the fundamental building blocks for interacting with Kubernetes resources, starting with the REST API, and move through the Go API Library, API Machinery, the high-level client-go library, controller-runtime, and kubebuilder. Each section covers the utilities and patterns for creating, updating, and managing resources, as well as details on working with custom resources and building operators.

Quick Notes Series

This post is part of the Quick Notes series, where I share concise sets of notes on topics that stood out to me, rather than polished, in-depth blog articles. These posts are most valuable if you're already familiar with the subject and need a brief review of the key points.

REST API

You can find the API description here. At a high level, each resource type belongs to a specific Group and Version and is either namespace-scoped or cluster-scoped. You can access it like this:
  • /apis/GROUP/VERSION/*
  • /apis/GROUP/VERSION/namespaces/NAMESPACE/*
The objects of each resource type are of a certain Kind. For example, objects of resource type pods are of Kind Pod (with a capital P).

With each resource, you can use various HTTP verbs to get, watch, create, update, or delete objects. When you access a resource, you transfer objects of the associated Kind. For example, when you use the /pods resource, you deal with objects of Kind: Pod.
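The path rules above can be sketched as a small helper. This is only an illustration using the standard library: resourcePath is a hypothetical function, not part of any Kubernetes library. It also assumes that the core group (the one with an empty group name, which contains Pods) is served under /api rather than /apis.

```go
package main

import (
	"fmt"
	"path"
)

// resourcePath builds the REST path of a resource type.
// An empty namespace means the resource is accessed at cluster scope.
func resourcePath(group, version, namespace, resource string) string {
	// Assumption: named groups live under /apis, the core group under /api.
	prefix := path.Join("/apis", group, version)
	if group == "" {
		prefix = path.Join("/api", version)
	}
	if namespace == "" {
		return path.Join(prefix, resource)
	}
	return path.Join(prefix, "namespaces", namespace, resource)
}

func main() {
	fmt.Println(resourcePath("apps", "v1", "default", "deployments"))
	fmt.Println(resourcePath("", "v1", "default", "pods"))
	fmt.Println(resourcePath("rbac.authorization.k8s.io", "v1", "", "clusterroles"))
}
```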

Go API Library

This library is a collection of Go structs for the built-in Kubernetes types. The structs carry the tags needed to serialize them to and from JSON/YAML.

You can import it like this:
import "k8s.io/api/<group>/<version>"
Example:
pod := corev1.Pod{
    Spec: corev1.PodSpec{
        Containers: []corev1.Container{
            {
                Name:  "runtime",
                Image: "nginx",
            },
        },
    },
}
pod.SetName("my-pod")
pod.SetLabels(map[string]string{
    "component": "my-component",
})

Go API Machinery Library

This library provides some utilities for any object that follows Kubernetes API conventions, e.g., the objects must embed TypeMeta and ObjectMeta, and they must implement the runtime.Object interface with GetObjectKind() and DeepCopyObject() methods.

- TypeMeta:
  • APIVersion: Contains group and version
  • Kind
- ObjectMeta:
  • Name
  • Namespace
  • UID: The unique identifier of an object. It is unique during the lifetime of the cluster.
  • ResourceVersion: It is used for concurrency control. It changes after any change to the object, and clients should treat it as an opaque value.
  • Generation: It is updated when we update the Spec part of the object. When the status.ObservedGeneration matches Generation, we know that Spec has been realized by the controller.
  • Labels
  • Annotations
  • OwnerReferences:
    • Used for garbage collection: when reconciling an object, we may create another object, and we want the child object to be deleted when the parent object is deleted.
    • An object can be owned by multiple owners. In that case, the object is garbage collected when ALL of its owners are deleted.
    • Only one of the OwnerReferences can be the Controller.
    • BlockOwnerDeletion: when set to true, it blocks the deletion of the owner until the owned object is deleted first (unless the deletion propagation policy is Orphan or Background).
    • OwnerReference is a struct having following fields: APIVersion, Kind, Name, UID, Controller (bool), BlockOwnerDeletion (bool).
    • When a child object changes, the reconciler of the owner's controller is executed, and the owner that the child references as its Controller is passed to the reconcile function.
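The ownership rules above can be illustrated with a small sketch. The OwnerReference struct here is a trimmed, local stand-in for the real one, and controllerOf and collectable are hypothetical helpers written for this note. They only mirror the rules as stated: at most one owner is the Controller, and a child is collected only when all of its owners are gone.

```go
package main

import "fmt"

// OwnerReference is a simplified stand-in for the real struct.
type OwnerReference struct {
	Kind       string
	Name       string
	UID        string
	Controller bool
}

// controllerOf returns the owner marked as Controller, or nil if there is none.
func controllerOf(refs []OwnerReference) *OwnerReference {
	for i := range refs {
		if refs[i].Controller {
			return &refs[i]
		}
	}
	return nil
}

// collectable reports whether a child can be garbage collected:
// it must have owners, and none of them may still exist.
func collectable(refs []OwnerReference, liveUIDs map[string]bool) bool {
	if len(refs) == 0 {
		return false
	}
	for _, r := range refs {
		if liveUIDs[r.UID] {
			return false
		}
	}
	return true
}

func main() {
	refs := []OwnerReference{
		{Kind: "Database", Name: "db1", UID: "uid-1", Controller: true},
		{Kind: "Backup", Name: "nightly", UID: "uid-2"},
	}
	fmt.Println(controllerOf(refs).Name)
	// One owner is still alive, so the child must be kept.
	fmt.Println(collectable(refs, map[string]bool{"uid-2": true}))
	// All owners are gone, so the child can be collected.
	fmt.Println(collectable(refs, map[string]bool{}))
}
```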
Utilities that API Machinery provides:
  • Scheme:
    • Has a mapping between GVK and the Go struct. Thus, you can use the scheme to instantiate a Go struct instance of your desired GVK.
    • Serialize Go objects to JSON/Protobuf and deserialize them back.
    • Convert between versions.
  • RESTMapper:
    • To convert between Resource and Kind.
Scheme: You can add known types and their conversions to a scheme, then use the scheme to create an object of a given type or to convert an object to another type. We usually add types and conversions in an init function.

var Scheme = runtime.NewScheme()

To add a type:
func (s *Scheme) AddKnownTypes(gv schema.GroupVersion, types ...Object)

Example:
func init() {
    Scheme.AddKnownTypes(
        schema.GroupVersion{
            Group:   "MyGroup",
            Version: "v1",
        },
        &MyType{},
    )
}

Now, you can use it like this:
obj, err := Scheme.New(schema.GroupVersionKind{
    Group:   "MyGroup",
    Version: "v1",
    Kind:    "MyType",
})
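To make the GVK-to-struct mapping more concrete, here is a toy registry built with the standard reflect package. It is not how runtime.Scheme is implemented: ToyScheme, its methods, and the local GroupVersionKind type are made up for this note, and the sketch assumes the Kind equals the Go struct name.

```go
package main

import (
	"fmt"
	"reflect"
)

// GroupVersionKind is a local stand-in for schema.GroupVersionKind.
type GroupVersionKind struct{ Group, Version, Kind string }

// MyType is the example Kind we register.
type MyType struct{ Name string }

// ToyScheme maps a GVK to a Go type.
type ToyScheme struct {
	types map[GroupVersionKind]reflect.Type
}

func NewToyScheme() *ToyScheme {
	return &ToyScheme{types: map[GroupVersionKind]reflect.Type{}}
}

// AddKnownTypes registers pointer-to-struct values, using the struct name as Kind.
func (s *ToyScheme) AddKnownTypes(group, version string, objs ...interface{}) {
	for _, obj := range objs {
		t := reflect.TypeOf(obj).Elem()
		s.types[GroupVersionKind{group, version, t.Name()}] = t
	}
}

// New returns a pointer to a fresh zero value of the type registered for gvk.
func (s *ToyScheme) New(gvk GroupVersionKind) (interface{}, error) {
	t, ok := s.types[gvk]
	if !ok {
		return nil, fmt.Errorf("no kind %q registered for %s/%s", gvk.Kind, gvk.Group, gvk.Version)
	}
	return reflect.New(t).Interface(), nil
}

func main() {
	s := NewToyScheme()
	s.AddKnownTypes("MyGroup", "v1", &MyType{})

	obj, err := s.New(GroupVersionKind{"MyGroup", "v1", "MyType"})
	fmt.Printf("%T %v\n", obj, err)

	_, err = s.New(GroupVersionKind{"MyGroup", "v2", "MyType"})
	fmt.Println(err)
}
```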

To add a conversion function:
err := Scheme.AddConversionFunc(
    (*MyType1)(nil),
    (*MyType2)(nil),
    func(a, b interface{}, scope conversion.Scope) error {
        src := a.(*MyType1)
        dst := b.(*MyType2)
        // Copy the fields of src into dst here.
        _, _ = src, dst
        return nil
    },
)

Now, you can use it like this. Suppose we have an instance of MyType1 named type1:

var type2 MyType2
err := Scheme.Convert(&type1, &type2, nil)

You can also pass the Scheme object as both the creater and the typer to jsonserializer.NewSerializerWithOptions and protobuf.NewSerializer to encode or decode objects to/from JSON/YAML or protobuf.

- Resource vs. Kind:
  • GroupVersionResource: It is the REST path, e.g., apps/v1/deployments is the resource.
  • GroupVersionKind: It is the data format that is exchanged on that path, e.g., the Deployment Kind.
- API machinery provides RESTMapper to get the kind for a resource and vice versa. The DefaultRESTMapper guesses the resource name based on convention. You can use AddSpecific to override this behavior and explicitly provide the resource for a kind.
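As a rough illustration of guessing "based on convention", the sketch below lowercases the Kind and pluralizes it with a few English rules. guessResource is a hypothetical helper and only an approximation written for this note. The real mapper may handle more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// guessResource derives a plural resource name from a Kind using naive rules.
func guessResource(kind string) string {
	singular := strings.ToLower(kind)
	switch {
	case singular == "":
		return ""
	case strings.HasSuffix(singular, "s"):
		return singular + "es"
	case strings.HasSuffix(singular, "y"):
		return strings.TrimSuffix(singular, "y") + "ies"
	default:
		return singular + "s"
	}
}

func main() {
	for _, kind := range []string{"Deployment", "Ingress", "NetworkPolicy", "Endpoints"} {
		fmt.Printf("%s -> %s\n", kind, guessResource(kind))
	}
}
```

With these naive rules, Endpoints turns into endpointses, which is the kind of irregular name where an explicit override is needed.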

- The API library has core K8s types, like Deployment, that follow the API machinery conventions.

- Optional values are defined as pointers in Go, where nil means no value. You can use the pointer package to deal with these fields.

Example:
Replicas: pointer.Int32(3)

or to read:
replicas := pointer.Int32Deref(spec.Replicas, 1) // where 1 is the default value when spec.Replicas is missing.

Note that to compare two pointers, use Int32Equal and similar functions.
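The reason for these helpers is easy to see with plain Go. The functions below are local stand-ins written for this note (they are not the pointer package itself), and they assume the behavior described above: deref with a default, and equality by value rather than by address.

```go
package main

import "fmt"

// int32Ptr returns a pointer to a copy of v.
func int32Ptr(v int32) *int32 { return &v }

// int32Deref returns *p, or def when p is nil.
func int32Deref(p *int32, def int32) int32 {
	if p == nil {
		return def
	}
	return *p
}

// int32Equal compares the pointed-to values; two nil pointers are equal.
func int32Equal(a, b *int32) bool {
	if a == nil || b == nil {
		return a == b
	}
	return *a == *b
}

func main() {
	a, b := int32Ptr(3), int32Ptr(3)
	fmt.Println(a == b)              // compares addresses, not values
	fmt.Println(int32Equal(a, b))    // compares the values
	fmt.Println(int32Deref(nil, 1))  // falls back to the default
	fmt.Println(int32Equal(nil, a))  // set vs. unset
}
```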

- Other types in API machinery:
  • Quantity: For memory, CPU, etc.
  • IntOrString: For fields that can be either a string or an int
  • Time

Go client-go Library

client-go is a high-level library to talk to the Kubernetes API server:
  • Create a config specifying the necessary information to connect to the API server.
    • A container running in a Pod usually has this information. Use the InClusterConfig function.
    • Or you can use a kubeconfig that can be created from memory or from a file on disk.
  • Get a Clientset for the config.
    • You can provide your custom HTTP client to talk to the API server (using the NewForConfigAndClient function) or use the default HTTP client.
  • Use the Clientset to perform operations (e.g., create, get, delete, update, patch, or watch) on different resources.
Example:
podList, err := clientset.CoreV1().Pods("myNamespace").List(ctx, metav1.ListOptions{})

You can filter Pods by labels or fields using LabelSelector and FieldSelector in ListOptions.

As another example, let's watch Deployments:

watcher, err := clientset.AppsV1().Deployments("myNamespace").Watch(ctx, metav1.ListOptions{})

for event := range watcher.ResultChan() {
    // read the events
}

Note that if there is no error, event.Object is of the watched type, i.e., event.Object.(type) is *appsv1.Deployment in this example. Otherwise, it will be *metav1.Status. This Status mirrors the HTTP status of the failure, so you can check its code to see what went wrong.

You can also get the underlying RESTClient and use it directly.
restClient := clientset.CoreV1().RESTClient()

Create, Update, Patch, and Apply

We have various operations for mutating objects:
  • Create: Create a brand new object. If an object of the same type with the same name already exists in the namespace, you get an error.
  • Update: Update an existing object by providing a complete new manifest. We may get conflict errors.
  • Patch: We have different types. Note that for production, we usually should not use patch, as we want to keep track of the lineage of changes. Patch is ad hoc and we lose track of what we did. Instead, it is better to use Apply with complete data.
    • JSON: JSON Patch, as defined in RFC 6902. This specifies a list of operations to apply to the JSON document. Its test operation lets you make the patch conditional.
    • merge: JSON Merge Patch, as defined in RFC 7386. This is simply a partial JSON document. If a key exists in the patch data, its value is updated (or removed when the patch value is null). Otherwise, it is unchanged.
    • strategic-merge: It is a K8s-aware JSON merge. For example, it merges arrays based on name.
  • Apply: Create a brand new object if an object with the same name of the same resource does not exist in the namespace. Otherwise, update it on the server side. Thus, we do not get ResourceVersion conflict errors, although server-side apply can still report field-ownership conflicts unless the apply is forced.
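To show what "a partial JSON" means for JSON Merge Patch, here is a minimal sketch of the RFC 7386 algorithm for object documents, using only encoding/json. mergePatch and applyMergePatch are names chosen for this note, not client-go functions, and the sample documents are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergePatch applies an RFC 7386 merge patch to target and returns the result.
// A null value removes the key; nested objects are merged recursively;
// every other value replaces the existing one.
func mergePatch(target, patch map[string]interface{}) map[string]interface{} {
	if target == nil {
		target = map[string]interface{}{}
	}
	for key, value := range patch {
		if value == nil {
			delete(target, key)
			continue
		}
		patchObj, isObj := value.(map[string]interface{})
		if !isObj {
			target[key] = value
			continue
		}
		// If the current value is not an object, it is replaced by a new one.
		targetObj, _ := target[key].(map[string]interface{})
		target[key] = mergePatch(targetObj, patchObj)
	}
	return target
}

// applyMergePatch is a convenience wrapper for JSON object documents.
func applyMergePatch(doc, patch []byte) ([]byte, error) {
	var d, p map[string]interface{}
	if err := json.Unmarshal(doc, &d); err != nil {
		return nil, err
	}
	if err := json.Unmarshal(patch, &p); err != nil {
		return nil, err
	}
	return json.Marshal(mergePatch(d, p))
}

func main() {
	doc := []byte(`{"metadata":{"labels":{"app":"web","tier":"frontend"}},"spec":{"replicas":1}}`)
	patch := []byte(`{"metadata":{"labels":{"tier":null,"env":"prod"}},"spec":{"replicas":3}}`)

	out, err := applyMergePatch(doc, patch)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Note that a merge patch replaces arrays as a whole, which is one reason the strategic merge variant exists.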

Conflicts

Conflicts happen when you want to update an existing resource, but before your update is committed, someone else has changed the object.

Conflicts are detected using the ResourceVersion. Specifically, when you send a complete manifest containing a ResourceVersion to the API server, and that ResourceVersion does not match what is stored in K8s, the API server rejects your request with:

 error: Operation cannot be fulfilled on [...]: the object has been modified; please apply your changes to the latest version and try again.

Writing Controllers

K8s allows you to extend the API by defining your own Custom Resources (CRs) and their Operators.

Custom Resource Definitions

To define a new resource, you should create a new object of Kind CustomResourceDefinition. When you create a new resource, you create its corresponding Kind as well. Then, you can create new instances of your new Kind, and access them via the resource you created. See a CRD example.

For example, suppose I have defined a new resource called myresources with the Kind MyKind. Now, I can create new objects of type MyKind and access them via /myresources. Note that usually we use the same name for the resource and the Kind. For example, if my custom Kind is Database, we use database (plural databases) as the resource name.

When you list the objects of the Kind corresponding to your resource, by default you just see their name and age. You can customize what is printed using additionalPrinterColumns.

- To create a Clientset for your custom resource:
  • First define the Go struct for its corresponding Kind. You define the Kind struct in types.go in the pkg/apis/MyGroup/MyVersion folder.
  • Add +k8s:deepcopy-gen annotations to your Kind definition and use deepcopy-gen to generate zz_generated.deepcopy.go next to types.go.
  • Add +genclient annotations to your Kind definition.
  • Add a register.go file next to your types.go that defines an AddToScheme function which, given a runtime.Scheme, adds your Kind to the scheme.
  • Run client-gen to generate the clientset in pkg/clientset/clientset.
This clientset is similar to what client-go provides. Just as client-go uses the API library and API machinery to talk to the API server for Kubernetes resources, this clientset uses API machinery to talk to the API server for your new custom resource.

controller-runtime

The controller-runtime is a Go library for writing controllers for your custom resources. The main abstractions provided by the controller-runtime library are:
  • Manager
  • Controller
Manager has the following:
  • Kubernetes clients for reading/writing resources
  • A cache of resources for reading
  • A scheme
mgr, err := manager.New(config.GetConfigOrDie(),
    manager.Options{
        Scheme: scheme,
    },
)

We usually create scheme as follows:
scheme := runtime.NewScheme()
clientgoscheme.AddToScheme(scheme)
mygroupversion.AddToScheme(scheme)

Controller: For each controller
  • You specify the manager and register a reconcile function when you create it.
  • Use the controller instance to watch events.
Now, when you start the manager, the reconcile function will be called when there is an event.

A controller can watch events related to changes to a certain Kind, or watch a generic channel. By writing to this channel, we can trigger a reconciliation.

Usually, there are two cases where a controller wants to watch a Kind:
  1. The Kind of the primary resource that this controller is supposed to control. For example, we have defined a resource called Database, and this is the controller (a.k.a. operator) that reconciles Database.
  2. The Kind of the objects that are created for the primary resource. For example, if I create a Pod to reconcile my Database resource, I want to watch the Pods created for any Database by the Database controller (to update the status of the Database, for example).
In the first case, we watch like this:

controller.Watch(&source.Kind{Type: &mygroupversion.Database{},}, &handler.EnqueueRequestForObject{})

In the second case, we watch like this:
controller.Watch(&source.Kind{Type: &corev1.Pod{},}, &handler.EnqueueRequestForOwner{
    OwnerType: &mygroupversion.Database{},
    IsController: true,
})

This means: watch Pods, and when a Pod changes, look through its OwnerReferences for an owner of type mygroupversion.Database whose OwnerReference is marked as Controller (see above), then put that Database object on the work queue of the reconciler.
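The mapping from a changed child to the owner that gets reconciled can be sketched without any Kubernetes dependency. Everything below (the trimmed OwnerReference and Request structs and the requestsForOwner function) is a hypothetical model of the behavior described above, not controller-runtime code. It assumes the owner lives in the same namespace as the child.

```go
package main

import "fmt"

// OwnerReference is a trimmed stand-in for the real struct.
type OwnerReference struct {
	APIVersion string
	Kind       string
	Name       string
	Controller bool
}

// Request identifies the object to reconcile.
type Request struct {
	Namespace string
	Name      string
}

// requestsForOwner returns one request per owner of the given type.
// When isController is true, only the owner marked as Controller qualifies.
func requestsForOwner(namespace string, refs []OwnerReference, apiVersion, kind string, isController bool) []Request {
	var reqs []Request
	for _, ref := range refs {
		if ref.APIVersion != apiVersion || ref.Kind != kind {
			continue
		}
		if isController && !ref.Controller {
			continue
		}
		reqs = append(reqs, Request{Namespace: namespace, Name: ref.Name})
	}
	return reqs
}

func main() {
	podOwners := []OwnerReference{
		{APIVersion: "mygroup/v1", Kind: "Database", Name: "db1", Controller: true},
		{APIVersion: "mygroup/v1", Kind: "Database", Name: "db2"},
		{APIVersion: "apps/v1", Kind: "ReplicaSet", Name: "rs1"},
	}
	fmt.Println(requestsForOwner("default", podOwners, "mygroup/v1", "Database", true))
	fmt.Println(requestsForOwner("default", podOwners, "mygroup/v1", "Database", false))
}
```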

Instead of creating a controller with controller.New and calling its Watch function, you can use builder.ControllerManagedBy, which provides a fluent API to define watches.

err = builder.
    ControllerManagedBy(mgr).
    For(&mygroupversion.Database{}).
    Owns(&corev1.Pod{}).
    Complete(&MyReconciler{})

MyReconciler is a struct that implements the Reconciler interface with the Reconcile function.

The manager provides a client that is similar to the client-go client. You can use it to create, get, delete, update, patch, etc. resources.

The manager also provides an event recorder. You can use it to publish events that will be shown with kubectl describe. You can get this event recorder with GetEventRecorderFor, and use its Event, Eventf, or AnnotatedEventf functions.

The typical structure of a Reconcile function:
func (m *MyReconciler) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
    // 1. Get the object using req.NamespacedName. If the object is not found, log and return with no error.
    // 2. Create lower-level resources and set their OwnerReference to the object specified by req.
    // 3. Patch lower-level resources with server-side apply, which takes care of: 1) creating the resources if they don't exist, 2) resolving conflicts automatically.
    // 4. Compute the status for the req object by reading the status of the lower-level resources. For this, we usually need to list them.
    // 5. Update the status of the req object using Status().Update.
    return reconcile.Result{}, nil
}

Kubebuilder

Kubebuilder is a tool that generates code for implementing operators with controller-runtime. It is a single binary. Just download it and put it in your PATH.

- Start a new project:
kubebuilder init --domain my.domain --repo my.domain/myproject

- After this, you get:
  • Go code for the manager
  • Dockerfile
  • Manifests to deploy the manager
  • Makefile
- You can add scaffolding for a new custom resource and its controller this way:
kubebuilder create api --group mygroup --version myversion --kind MyResource

- Using make manifests, you can get the yaml definitions of your CRDs.

- Some useful kubebuilder annotations:
  • //+kubebuilder:object:root=true: The struct is a Kind. This will cause kubebuilder to generate the DeepCopyObject() method for the struct.
  • //+kubebuilder:subresource:status: Enables status for the Kind. Note that some resources like ConfigMap don't have status.
  • //+kubebuilder:printcolumn:name="MyField",type=string,JSONPath=".spec.MyField": Print this field as an extra column when doing kubectl get.
  • //+kubebuilder:rbac:groups=mygroup,resources=myresources,verbs=get;list...: kubebuilder creates RBAC rules that give the reconciler the necessary access.
