Resize a TKGm cluster in CSE

When trying to resize a TKGm cluster with CSE in the VCD UI, you might encounter the following error:

Cluster resize request failed. Please contact your provider if this problem persists. (Error: Unknown error)

The logs in ~/.cse-logs do not show what the error is. It appears to be an issue with the Container UI Plugin for CSE 3.1.0.

If you review the console messages in Chrome’s developer tools you might see something like the following:

TypeError: Cannot read properties of null (reading 'length')
    at getFullSpec (https://vcd.vmwire.com/tenant/tenant1/uiPlugins/80134fc9-86e1-41db-9d02-b02d5e9e1e3c/ca5642fa-7186-4da2-b273-2dbd3451fd50/bundle.js:1:170675)
    at resizeCseCluster

This post shows how you can use the vcd cse cli to work around this problem.

Using the vcd cse cli to resize a TKGm cluster

  1. First, log in to the CSE appliance or any machine with the vcd cse cli installed.
  2. Then log in to the VCD Org that contains the cluster you want to resize, using a user whose role includes the cse:nativecluster rights bundle.
    • vcd login vcd.vmwire.com tenant1 tenant1-admin -p Vmware1!
  3. List the clusters with this command:
    • vcd cse cluster list
  4. CSE should show you the clusters belonging to this organization.
  5. Now obtain the details of the cluster that you want to resize:
    • vcd cse cluster info hugo-tkg
    • Copy the entire output of that command and paste it into a text editor such as Notepad++.
  6. Delete everything from the status: line downward, so that you end up with only the apiVersion, kind, metadata and spec sections, like this:
apiVersion: cse.vmware.com/v2.0
kind: TKGm
metadata:
  name: hugo-tkg
  orgName: tenant1
  site: https://vcd.vmwire.com
  virtualDataCenterName: tenant1-vdc
spec:
  distribution:
    templateName: ubuntu-2004-kube-v1.20.5-vmware.2-tkg.1-6700972457122900687
    templateRevision: 1
  settings:
    network:
      cni: null
      expose: true
      pods:
        cidrBlocks:
        - 100.96.0.0/11
      services:
        cidrBlocks:
        - 100.64.0.0/13
    ovdcNetwork: default-organization-network
    rollbackOnFailure: true
    sshKey: ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEAhcw67bz3xRjyhPLysMhUHJPhmatJkmPUdMUEZre+MeiDhC602jkRUNVu43Nk8iD/I07kLxdAdVPZNoZuWE7WBjmn13xf0Ki2hSH/47z3ObXrd8Vleq0CXa+qRnCeYM3FiKb4D5IfL4XkHW83qwp8PuX8FHJrXY8RacVaOWXrESCnl3cSC0tA3eVxWoJ1kwHxhSTfJ9xBtKyCqkoulqyqFYU2A1oMazaK9TYWKmtcYRn27CC1Jrwawt2zfbNsQbHx1jlDoIO6FLz8Dfkm0DToanw0GoHs2Q+uXJ8ve/oBs0VJZFYPquBmcyfny4WIh4L0lwzsiAVWJ6PvzF5HMuNcwQ==
      rsa-key-20210508
  topology:
    controlPlane:
      count: 1
      cpu: null
      memory: null
      sizingClass: small
      storageProfile: iscsi
    nfs:
      count: 0
      sizingClass: null
      storageProfile: null
    workers:
      count: 3
      cpu: null
      memory: null
      sizingClass: medium
      storageProfile: iscsi
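
The manual copy-and-trim in steps 5 and 6 can also be done in the shell. The following is a minimal sketch, assuming that vcd cse cluster info prints the cluster YAML to standard output and that status: is a top-level key that comes after the spec section, as in the output above. strip_status is a hypothetical helper name, not part of the vcd cse cli.

```shell
#!/bin/sh
# strip_status: hypothetical helper (not part of vcd cse).
# Reads cluster YAML on stdin and drops everything from the
# top-level "status:" line to the end of the input.
strip_status() {
  sed '/^status:/,$d'
}

# Intended use (assumes you are already logged in with vcd login):
#   vcd cse cluster info hugo-tkg | strip_status > update_my_cluster.yaml
```

Check the resulting file before applying it; if the CLI prints anything other than YAML, the trimmed file will need manual cleanup.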

Prepare a cluster config file

  1. Change the workers: count to your new desired number of workers.
  2. Save this file as update_my_cluster.yaml
  3. Update the cluster with this command
    • vcd cse cluster apply update_my_cluster.yaml
  4. CSE deploys another worker node into the same vApp, and after a few minutes the new node joins your TKGm cluster, as the before and after kubectl get nodes output shows:
root@photon-manager [ ~/.kube ]# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
mstr-zcn7   Ready    control-plane,master   14m   v1.20.5+vmware.2
node-7swy   Ready    <none>                 10m   v1.20.5+vmware.2
node-90sb   Ready    <none>                 12m   v1.20.5+vmware.2
root@photon-manager [ ~/.kube ]# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
mstr-zcn7   Ready    control-plane,master   22m   v1.20.5+vmware.2
node-7swy   Ready    <none>                 17m   v1.20.5+vmware.2
node-90sb   Ready    <none>                 19m   v1.20.5+vmware.2
node-rbmz   Ready    <none>                 43s   v1.20.5+vmware.2
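
If you resize often, the edit to the workers: count can be scripted instead of done by hand. This is only a sketch: set_worker_count is a hypothetical helper, and it assumes the exact indentation shown in this post (four spaces before workers:, six before count:). It changes the first count: line after workers: and leaves the controlPlane and nfs counts alone.

```shell
#!/bin/sh
# set_worker_count: hypothetical helper (not part of vcd cse).
# $1 is the new worker count; the cluster YAML is read from stdin.
# The sed range starts at the "    workers:" line and ends at the
# first "      count:" line after it, so only that count is rewritten.
set_worker_count() {
  sed "/^    workers:\$/,/^      count:/ s/^\(      count:\) .*/\1 $1/"
}

# Intended use (cluster.yaml is an assumed file name):
#   set_worker_count 4 < cluster.yaml > update_my_cluster.yaml
#   vcd cse cluster apply update_my_cluster.yaml
```

A YAML-aware tool would be more robust than sed if the layout of the file ever differs from the output shown here.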

Viewing client logs

The vcd cse cli commands run client side. To enable logging for them, do the following:

  1. Run this command on the CSE appliance or on the workstation that has the vcd cse cli installed.
    • export CSE_CLIENT_WIRE_LOGGING=True
  2. View the logs by using this command:
    • tail -f cse-client-debug.log
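
One detail worth noting: in a POSIX shell, a plain NAME=value assignment creates a shell variable that child processes such as vcd do not see. The variable has to be exported (or set inline on the command line) to reach the CLI. The sketch below illustrates the difference using a stand-in child process rather than the real CLI; show_from_child is a hypothetical helper.

```shell
#!/bin/sh
# show_from_child: hypothetical helper that prints the value a child
# process sees for CSE_CLIENT_WIRE_LOGGING, or "unset" if it sees none.
show_from_child() {
  sh -c 'echo "${CSE_CLIENT_WIRE_LOGGING:-unset}"'
}

unset CSE_CLIENT_WIRE_LOGGING
CSE_CLIENT_WIRE_LOGGING=True          # shell variable only, not exported
plain=$(show_from_child)

export CSE_CLIENT_WIRE_LOGGING=True   # now part of the environment
exported=$(show_from_child)

echo "plain assignment: $plain"
echo "after export:     $exported"
```

The first line printed reports unset and the second reports True, which is why the export form is the one to use before running vcd cse commands.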

A couple of notes

The vcd cse cluster resize command is not enabled if your CSE server is using legacy_mode: false. You can read up on this in this link.

Therefore, the only way to resize a cluster is to update it using the vcd cse cluster apply command. The apply command supports the following:

apply a configuration to a cluster resource by filename. The
resource will be created if it does not exist. (The command
can be used to create the cluster, scale-up/down worker count,
scale-up NFS nodes, upgrade the cluster to a new K8s version.)

CSE 3.1.1 can only scale up a TKGm cluster; it does not support scale-down yet.
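
Because scale-down is not supported for TKGm, it can be useful to check the numbers before running apply. Here is a minimal sketch of such a guard; is_scale_up is a hypothetical helper, and the current and desired counts are assumed to be supplied by you (for example, from the cluster info output and your edited file).

```shell
#!/bin/sh
# is_scale_up: hypothetical helper (not part of vcd cse).
# $1 = current worker count, $2 = desired worker count.
# Succeeds only when the desired count is strictly greater.
is_scale_up() {
  current=$1
  desired=$2
  [ "$desired" -gt "$current" ]
}

# Example: going from 3 workers to 2 is rejected.
if is_scale_up 3 2; then
  echo "ok to apply"
else
  echo "refusing: not a scale-up"
fi
```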

Author: Hugo Phan

@hugophan
