Resize a TKGm cluster in CSE

When trying to resize a TKGm cluster with CSE, you might encounter this error below:

Cluster resize request failed. Please contact your provider if this problem persists. (Error: Unknown error)

This post shows how you can use the vcd cse cli to workaround this problem.

When trying to resize a TKGm cluster with CSE in the VCD UI, you might encounter this error below:

Cluster resize request failed. Please contact your provider if this problem persists. (Error: Unknown error)

Checking the logs in ~/.cse-logs there are no logs that show what the error is. It appears to be an issue with the Container UI Plugin for CSE 3.1.0.

If you review the console messages in Chrome’s developer tools you might see something like the following:

TypeError: Cannot read properties of null (reading 'length')
    at getFullSpec (https://vcd.vmwire.com/tenant/tenant1/uiPlugins/80134fc9-86e1-41db-9d02-b02d5e9e1e3c/ca5642fa-7186-4da2-b273-2dbd3451fd50/bundle.js:1:170675)
    at resizeCseCluster

This post shows how you can use the vcd cse cli to workaround this problem.

Using the vcd cse cli to resize a TKGm cluster

  1. First log into the CSE appliance or somewhere with vcd cse cli installed
  2. Then log into the VCD Org that has the cluster that you want to resize with a user with the role with the cse:nativecluster rights bundle.
    • vcd login vcd.vmwire.com tenant1 tenant1-admin -p Vmware1!
  3. Lets list the clusters using this command
    • vcd cse cluster list
  4. CSE should show you the clusters belonging to this organization
  5. Now lets obtain the details of the cluster that we want to resize
    • vcd cse cluster info hugo-tkg
    • copy the entire output of that command and paste it into Notepad++
  6. Delete everything from the status: line below so you only end up with the apiVersion, kind, metadata and spec sections. Like this:
apiVersion: cse.vmware.com/v2.0
kind: TKGm
metadata:
  name: hugo-tkg
  orgName: tenant1
  site: https://vcd.vmwire.com
  virtualDataCenterName: tenant1-vdc
spec:
  distribution:
    templateName: ubuntu-2004-kube-v1.20.5-vmware.2-tkg.1-6700972457122900687
    templateRevision: 1
  settings:
    network:
      cni: null
      expose: true
      pods:
        cidrBlocks:
        - 100.96.0.0/11
      services:
        cidrBlocks:
        - 100.64.0.0/13
    ovdcNetwork: default-organization-network
    rollbackOnFailure: true
    sshKey: ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEAhcw67bz3xRjyhPLysMhUHJPhmatJkmPUdMUEZre+MeiDhC602jkRUNVu43Nk8iD/I07kLxdAdVPZNoZuWE7WBjmn13xf0Ki2hSH/47z3ObXrd8Vleq0CXa+qRnCeYM3FiKb4D5IfL4XkHW83qwp8PuX8FHJrXY8RacVaOWXrESCnl3cSC0tA3eVxWoJ1kwHxhSTfJ9xBtKyCqkoulqyqFYU2A1oMazaK9TYWKmtcYRn27CC1Jrwawt2zfbNsQbHx1jlDoIO6FLz8Dfkm0DToanw0GoHs2Q+uXJ8ve/oBs0VJZFYPquBmcyfny4WIh4L0lwzsiAVWJ6PvzF5HMuNcwQ==
      rsa-key-20210508
  topology:
    controlPlane:
      count: 1
      cpu: null
      memory: null
      sizingClass: small
      storageProfile: iscsi
    nfs:
      count: 0
      sizingClass: null
      storageProfile: null
    workers:
      count: 3
      cpu: null
      memory: null
      sizingClass: medium
      storageProfile: iscsi

Prepare a cluster config file

  1. Change the workers: count to your new desired number of workers.
  2. Save this file as update_my_cluster.yaml
  3. Update the cluster with this command
    • vcd cse cluster apply update_my_cluster.yaml
  4. You’ll notice that CSE will deploy another worker node into the same vApp and after a few minutes your TKGm cluster will have another node added to it.
root@photon-manager [ ~/.kube ]# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
mstr-zcn7   Ready    control-plane,master   14m   v1.20.5+vmware.2
node-7swy   Ready    <none>                 10m   v1.20.5+vmware.2
node-90sb   Ready    <none>                 12m   v1.20.5+vmware.2
root@photon-manager [ ~/.kube ]# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
mstr-zcn7   Ready    control-plane,master   22m   v1.20.5+vmware.2
node-7swy   Ready    <none>                 17m   v1.20.5+vmware.2
node-90sb   Ready    <none>                 19m   v1.20.5+vmware.2
node-rbmz   Ready    <none>                 43s   v1.20.5+vmware.2

Viewing client logs

The vcd cse cli commands are client side, to enable logging for this do the following

  1. Run this command in the CSE appliance or on your workstation that has the vcd cse cli installed.
    • CSE_CLIENT_WIRE_LOGGING=True
  2. View the logs by using this command
    • tail -f cse-client-debug.log

A couple of notes

The vcd cse cluster resize command is not enabled if your CSE server is using legacy_mode: false. You can read up on this in this link.

Therefore, the only way to resize a cluster is to update it using the vcd cse cluster apply command. The apply command supports the following:

apply a configuration to a cluster resource by filename. The
resource will be created if it does not exist. (The command
can be used to create the cluster, scale-up/down worker count,
scale-up NFS nodes, upgrade the cluster to a new K8s version.

CSE 3.1.1 can only scale-up a TKGm cluster, it does not support scale-down yet.

Using a Statefulset to demo VCD cloud and storage providers.

This post uses a statefulset to deploy nginx with pvc and load balancer services into a Kubernetes cluster running in VMware Cloud Director enabled with Container Service Extension.

VCD has a cloud provider named vmware-cloud-director-ccm-0 and a CSI provider named csi-vcd-controllerplugin-0.

This post uses a statefulset to deploy nginx with pvc and load balancer services into a Kubernetes cluster running in VMware Cloud Director enabled with Container Service Extension.

VCD has a cloud provider named vmware-cloud-director-ccm-0 and a CSI provider named csi-vcd-controllerplugin-0.

If you sent the following command to a Kubernetes cluster

kubectl get po -n kube-system

You would see output like this

csi-vcd-controllerplugin-0                  3/3     Running   0          11h
csi-vcd-nodeplugin-lh2gs                    2/2     Running   0          13h
vmware-cloud-director-ccm-99fd59464-79z8r   1/1     Running   0          11h

Contents of web-statefulset.yaml, available on my GitHub here.

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: web-statefulset
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-statefulset
  namespace: web-statefulset
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx-service"
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: nginx
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "vcd-disk-dev"
      resources:
        requests:
          storage: 1Gi

Lets deploy into a new namespace, for that we create a new namespace first.

kubectl create ns web-statefulset

Deploy the statefulset with the following command

kubectl apply -f web-statefulset.yaml

You’ll see named disks and ingress services create in VCD and Avi respectively.

If you tried to access the nginx webpage using the service IP address, you wouldn’t see any web page, although the connection is working. This is because the nginx app using the /usr/share/nginx/html mount point to an empty PVC. We need to copy a basic index.html into that directory to get a webpage.

We can do that by logging into the pod and downloading a sample index.html for nginx.

kubectl exec -it web-statefulset-0 -n web-statefulset -- bash

curl https://raw.githubusercontent.com/hugopow/cse/main/index.html -o /usr/share/nginx/html/index.html

Now when you connect to the external IP you would get a very simple webpage.

The index.html file is stored on /usr/share/nginx/html/index.html, which is mounted to /sdb1 backed by the PVC.