kubernetes – Page 2

Deploy Harbor Registry with Tanzu Packages and expose with Ingress

In the previous post, I described how to install Harbor using Helm to utilize ChartMuseum for running Harbor as a Helm chart repository.

The Harbor registry that comes shipped with TKG 1.5.1 uses Tanzu Packages to deploy Harbor into a TKG cluster. This version of Harbor does not support Helm Charts using ChartMuseum. VMware dropped support for ChartMuseum in TKG and are adopting OCI registries instead. This post describes how to deploy Harbor using the Tanzu Packages (KApp) and use Harbor as an OCI registry that fully supports Helm charts. This is the preferred way to use chart and image registries.

The latest versions as of TKG 1.5.1 packages, February 2022.

Package	Version
cert-manager	1.5.3+vmware.2-tkg.1
contour	1.18.2+vmware.1-tkg.1
harbor	2.3.3+vmware.1-tkg.1

Or run the following to see the latest available versions.

tanzu package available list harbor.tanzu.vmware.com -A

Pre-requisites

Before installing Harbor, you need to install Cert Manager and Contour. You can follow this other guide here to get started. This post uses Ingress, which requires NSX Advanced Load Balancer (Avi). The previous post will show you how to install these pre-requisites.

Deploy Harbor

Create a configuration file named harbor-data-values.yaml. This file configures the Harbor package. Follow the steps below to obtain a template file.

image_url=$(kubectl -n tanzu-package-repo-global get packages harbor.tanzu.vmware.com.2.3.3+vmware.1-tkg.1 -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}')

imgpkg pull -b $image_url -o /tmp/harbor-package-2.3.3+vmware.1-tkg.1

cp /tmp/harbor-package-2.3.3+vmware.1-tkg.1/config/values.yaml harbor-data-values.yaml

Set the mandatory passwords and secrets in the harbor-data-values.yaml file by automatically generating random passwords and secrets:

bash /tmp/harbor-package-2.3.3+vmware.1-tkg.1/config/scripts/generate-passwords.sh harbor-data-values.yaml

Specify other settings in the harbor-data-values.yaml file.

Set the hostname setting to the hostname you want to use to access Harbor via ingress. For example, harbor.yourdomain.com.

To use your own certificates, update the tls.crt, tls.key, and ca.crt settings with the contents of your certificate, key, and CA certificate. The certificate can be signed by a trusted authority or be self-signed. If you leave these blank, Tanzu Kubernetes Grid automatically generates a self-signed certificate.

The format of the tls.crt and tls.key looks like this:

tlsCertificate:
  tls.crt: |
    -----BEGIN CERTIFICATE-----
    ---snipped---
    -----END CERTIFICATE-----
  tls.key: |
    -----BEGIN PRIVATE KEY-----
    ---snipped---
    -----END PRIVATE KEY-----

If you used the generate-passwords.sh script, optionally update the harborAdminPassword with something that is easier to remember.

Optionally update other persistence settings to specify how Harbor stores data.

If you need to store a large quantity of container images in Harbor, set persistence.persistentVolumeClaim.registry.size to a larger number.

If you do not update the storageClass under persistence settings, Harbor uses the cluster’s default storageClass.

Remove all comments in the harbor-data-values.yaml file:

yq -i eval '... comments=""' harbor-data-values.yaml

Install the Harbor package:

tanzu package install harbor \
--package-name harbor.tanzu.vmware.com \
--version 2.3.3+vmware.1-tkg.1 \
--values-file harbor-data-values.yaml \
--namespace my-packages

Obtain the address of the Envoy service load balancer.

kubectl get svc envoy -n tanzu-system-ingress -o jsonpath='{.status.loadBalancer.ingress[0]}'

Update your DNS record to point the hostname to the IP address above.

Update Harbor

Update the Harbor installation in any way, such as updating the TLS certificate, make your changes to the harbor-data-values.yaml file then run the following to update Harbor.

tanzu package installed update harbor --version 2.3.3+vmware.1-tkg.1 --values-file harbor-data-values.yaml --namespace my-packages

Using Harbor as an OCI Registry for Helm Charts

helm registry login -u admin harbor2.vmwire.com

Package a helm chart if you haven’t got one already packaged

helm package buildachart

Upload a chart to the registry

helm push buildachart-0.1.0.tgz oci://harbor2.vmwire.com/chartrepo

The chart can now be seen in the Harbor UI in the view as where normal Docker images are.

Notice that this is an OCI registry and not a Helm repository that is based on ChartMuseum, thats why you won’t see the ‘Helm Charts’ tab next to the ‘Repositories’ tab.

Deploy an application with Helm

Let’s deploy the buildachart application, this is a simple nginx application that can use TLS so we have a secure site with HTTPS.

Create a new namespace and the TLS secret for the application. Copy the tls.crt and tls.key files in pem format to $HOME/certs/

# Create a new namespace for cherry
k create ns cherry

# Create a TLS secret with the contents of tls.key and tls.crt in the cherry namespace
kubectl create secret tls cherry-tls --key $HOME/certs/tls.key --cert $HOME/certs/tls.crt -n cherry

Deploy the app using Harbor as the Helm chart repository

helm install buildachart oci://harbor2.vmwire.com/chartrepo/buildachart --version 0.1.0 -n cherry

If you need to install Helm

Follow this link here.

Useful links

https://helm.sh/docs/topics/registries/

https://opensource.com/article/20/5/helm-charts

https://itnext.io/helm-3-8-0-oci-registry-support-b050ff218911

Deploy Harbor Registry with Helm and expose with Ingress

The Harbor registry that comes shipped with TKG 1.5.1 uses Tanzu Packages to deploy Harbor into a TKG cluster. This version of Harbor does not support Helm Charts using ChartMuseum. VMware dropped support for ChartMuseum in TKG and are adopting OCI registries instead. This post describes how to deploy the upstream Harbor distribution that supports ChartMuseum for a helm repository. Follow this other post here to deploy Harbor with Tanzu Packages (Kapp) with support for OCI.

Intro

The Harbor registry that comes shipped with TKG 1.5.1 uses Tanzu Packages to deploy Harbor into a TKG cluster. This version of Harbor does not support Helm Charts using ChartMuseum. VMware dropped support for ChartMuseum in TKG and are adopting OCI registries instead. This post describes how to deploy the upstream Harbor distribution that supports ChartMuseum for a helm repository. Follow this other post here to deploy Harbor with Tanzu Packages (Kapp) with support for OCI.

The example below uses the following components:

TKG 1.5.1
AKO 1.6.1
Contour 1.18.2
Helm 3.8.0

Use the previous post to deploy the per-requisites.

Step 1 – Download the harbor helm chart

helm repo add harbor https://helm.goharbor.io
helm fetch harbor/harbor --untar

Step 2 – Edit the values.yaml file

You only need to change the following lines.

Line Number	Specification
5	loadBalancer or ingress (contour)
13	use TLS certificate
30 & 35	secret name (created in Step 3.)
38 & 39	FQDN of your harbor and notary DNS A records
215, 221 etc	A storage class if you don’t have a default storage class. Leave blank to use your default storage class.
355	admin password

expose:
  # Set the way how to expose the service. Set the type as "ingress",
  # "clusterIP", "nodePort" or "loadBalancer" and fill the information
  # in the corresponding section
  type: ingress
  tls:
    # Enable the tls or not.
    # Delete the "ssl-redirect" annotations in "expose.ingress.annotations" when TLS is disabled and "expose.type" is "ingress"
    # Note: if the "expose.type" is "ingress" and the tls
    # is disabled, the port must be included in the command when pull/push
    # images. Refer to https://github.com/goharbor/harbor/issues/5291
    # for the detail.
    enabled: true
    # The source of the tls certificate. Set it as "auto", "secret"
    # or "none" and fill the information in the corresponding section
    # 1) auto: generate the tls certificate automatically
    # 2) secret: read the tls certificate from the specified secret.
    # The tls certificate can be generated manually or by cert manager
    # 3) none: configure no tls certificate for the ingress. If the default
    # tls certificate is configured in the ingress controller, choose this option
    certSource: secret
    auto:
      # The common name used to generate the certificate, it's necessary
      # when the type isn't "ingress"
      commonName: ""
    secret:
      # The name of secret which contains keys named:
      # "tls.crt" - the certificate
      # "tls.key" - the private key
      secretName: "harbor-cert"
      # The name of secret which contains keys named:
      # "tls.crt" - the certificate
      # "tls.key" - the private key
      # Only needed when the "expose.type" is "ingress".
      notarySecretName: "harbor-cert"
  ingress:
    hosts:
      core: harbor.vmwire.com
      notary: notary.harbor.vmwire.com
   
---snipped---

Step 3 – Create a TLS secret for ingress

Copy the tls.crt and tls.key files in pem format to $HOME/certs/

# Create a new namespace for harbor
k create ns harbor

# Create a TLS secret with the contents of tls.key and tls.crt in the harbor namespace
kubectl create secret tls harbor-cert --key $HOME/certs/tls.key --cert $HOME/certs/tls.crt -n harbor

Step 4 – Install Harbor

Ensure you’re in the directory that you ran Step 2 in.

helm install harbor . -n harbor

Monitor deployment with

kubectl get po -n harbor

Log in

Use admin and the password you set on line 355 of the values.yaml file. The default password is Harbor12345.

Quick guide to install cert-manager, contour, prometheus and grafana into TKG using Tanzu Packages (Kapp)

Intro

For an overview of Kapp, please see this link here.

The latest versions as of TKG 1.5.1, February 2022.

Package	Version
cert-manager	1.5.3+vmware.2-tkg.1
contour	1.18.2+vmware.1-tkg.1
prometheus	2.27.0+vmware.2-tkg.1
grafana	7.5.7+vmware.2-tkg.1

Or run the following to see the latest available versions.

tanzu package available list cert-manager.tanzu.vmware.com -A
tanzu package available list contour.tanzu.vmware.com -A
tanzu package available list prometheus.tanzu.vmware.com -A
tanzu package available list grafana.tanzu.vmware.com -A

Install Cert Manager

tanzu package install cert-manager \
--package-name cert-manager.tanzu.vmware.com \
--namespace my-packages \
--version 1.5.3+vmware.2-tkg.1 \
--create-namespace

I’m using ingress with Contour which needs a load balancer to expose the ingress services. Install AKO and NSX Advanced Load Balancer (Avi) by following this previous post.

Install Contour

Create a file named contour-data-values.yaml, this example uses NSX Advanced Load Balancer (Avi)

---
infrastructure_provider: vsphere
namespace: tanzu-system-ingress
contour:
 configFileContents: {}
 useProxyProtocol: false
 replicas: 2
 pspNames: "vmware-system-restricted"
 logLevel: info
envoy:
 service:
   type: LoadBalancer
   annotations: {}
   nodePorts:
     http: null
     https: null
   externalTrafficPolicy: Cluster
   disableWait: false
 hostPorts:
   enable: true
   http: 80
   https: 443
 hostNetwork: false
 terminationGracePeriodSeconds: 300
 logLevel: info
 pspNames: null
certificates:
 duration: 8760h
 renewBefore: 360h

Remove comments in the contour-data-values.yaml file.

yq -i eval '... comments=""' contour-data-values.yaml

Deploy contour

tanzu package install contour \
--package-name contour.tanzu.vmware.com \
--version 1.18.2+vmware.1-tkg.1 \
--values-file contour-data-values.yaml \
--namespace my-packages

Install Prometheus

Download the prometheus-data-values.yaml file to use custom values to use ingress.

image_url=$(kubectl -n tanzu-package-repo-global get packages prometheus.tanzu.vmware.com.2.27.0+vmware.2-tkg.1 -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}')

imgpkg pull -b $image_url -o /tmp/prometheus-package-2.27.0+vmware.2-tkg.1

cp /tmp/prometheus-package-2.27.0+vmware.2-tkg.1/config/values.yaml prometheus-data-values.yaml

Edit the file and change any settings you need such as adding the TLS certificate and private key for ingress. It’ll look something like this.

ingress:
  enabled: true
  virtual_host_fqdn: "prometheus-tkg-mgmt.vmwire.com"
  prometheus_prefix: "/"
  alertmanager_prefix: "/alertmanager/"
  prometheusServicePort: 80
  alertmanagerServicePort: 80
  tlsCertificate:
    tls.crt: |
      -----BEGIN CERTIFICATE-----
      --- snipped---
      -----END CERTIFICATE-----
    tls.key: |
      -----BEGIN PRIVATE KEY-----
      --- snipped---
      -----END PRIVATE KEY-----

Remove comments in the prometheus-data-values.yaml file.

yq -i eval '... comments=""' prometheus-data-values.yaml

Deploy prometheus

tanzu package install prometheus \
--package-name prometheus.tanzu.vmware.com \
--version 2.27.0+vmware.2-tkg.1 \
--values-file prometheus-data-values.yaml \
--namespace my-packages

Install Grafana

Download the grafana-data-values.yaml file.

image_url=$(kubectl -n tanzu-package-repo-global get packages grafana.tanzu.vmware.com.7.5.7+vmware.2-tkg.1 -o jsonpath='{.spec.template.spec.fetch[0].imgpkgBundle.image}')

imgpkg pull -b $image_url -o /tmp/grafana-package-7.5.7+vmware.2-tkg.1

cp /tmp/grafana-package-7.5.7+vmware.2-tkg.1/config/values.yaml grafana-data-values.yaml

Generate a Base64 password and edit the grafana-data-values.yaml file to update the default admin password.

echo -n 'Vmware1!' | base64

Also update the TLS configuration to use signed certificates for ingress. It will look something like this.

  secret:
    type: "Opaque"
    admin_user: "YWRtaW4="
    admin_password: "Vm13YXJlMSE="

ingress:
  enabled: true
  virtual_host_fqdn: "grafana-tkg-mgmt.vmwire.com"
  prefix: "/"
  servicePort: 80
  #! [Optional] The certificate for the ingress if you want to use your own TLS certificate.
  #! We will issue the certificate by cert-manager when it's empty.
  tlsCertificate:
    #! [Required] the certificate
    tls.crt: |
      -----BEGIN CERTIFICATE-----
      ---snipped---
      -----END CERTIFICATE-----
    #! [Required] the private key
    tls.key: |
      -----BEGIN PRIVATE KEY-----
      ---snipped---
      -----END PRIVATE KEY-----

Since I’m using ingress to expose the Grafana service, also change line 33, from LoadBalancer to ClusterIP. This prevents Kapp from creating an unnecessary service that will consume an IP address.

#! Grafana service configuration
   service:
     type: ClusterIP
     port: 80
     targetPort: 3000
     labels: {}
     annotations: {}

Remove comments in the grafana-data-values.yaml file.

yq -i eval '... comments=""' grafana-data-values.yaml

Deploy Grafana

tanzu package install grafana \
--package-name grafana.tanzu.vmware.com \
--version 7.5.7+vmware.2-tkg.1 \
--values-file grafana-data-values.yaml \
--namespace my-packages

Accessing Grafana

Since I’m using ingress and I set the ingress FQDN as grafana-tkg-mgmt.vmwire.com and I also used TLS. I can now access the Grafana UI using https://grafana-tkg-mgmt.vmwire.com and enjoy a secure connection.

Listing all installed packages

tanzu package installed list -A

Making changes to Contour, Prometheus or Grafana

If you need to make changes to any of the configuration files, you can then update the deployment with the tanzu package installed update command.

tanzu package installed update contour \
--version 1.18.2+vmware.1-tkg.1 \
--values-file contour-data-values.yaml \
--namespace my-packages

tanzu package installed update prometheus \
--version 2.27.0+vmware.2-tkg.1 \
--values-file prometheus-data-values.yaml \
--namespace my-packages

tanzu package installed update grafana \
--version 7.5.7+vmware.2-tkg.1 \
--values-file grafana-data-values.yaml \
--namespace my-packages

Removing Cert Manager, Contour, Prometheus or Grafana

tanzu package installed delete cert-manager -n my-packages

tanzu package installed delete contour -n my-packages

tanzu package installed delete prometheus -n my-packages

tanzu package installed delete grafana -n my-packages

Copypasta for doing this again on another cluster

Place all your completed data-values files into a directory and just run the entire code block below to set everything up in one go.

# Deploy cert-manager
tanzu package install cert-manager \
--package-name cert-manager.tanzu.vmware.com \
--namespace my-packages \
--version 1.5.3+vmware.2-tkg.1 \
--create-namespace

# Deploy contour
yq -i eval '... comments=""' contour-data-values.yaml
tanzu package install contour \
--package-name contour.tanzu.vmware.com \
--version 1.18.2+vmware.1-tkg.1 \
--values-file contour-data-values.yaml \
--namespace my-packages

# Deploy prometheus
yq -i eval '... comments=""' prometheus-data-values.yaml
tanzu package install prometheus \
--package-name prometheus.tanzu.vmware.com \
--version 2.27.0+vmware.2-tkg.1 \
--values-file prometheus-data-values.yaml \
--namespace my-packages

# Deploy grafana
yq -i eval '... comments=""' grafana-data-values.yaml
tanzu package install grafana \
--package-name grafana.tanzu.vmware.com \
--version 7.5.7+vmware.2-tkg.1 \
--values-file grafana-data-values.yaml \
--namespace my-packages

Run a replicated stateful application using local storage in Kubernetes

This post shows how to run a replicated stateful application on local storage using a StatefulSet controller. This application is a replicated MySQL database. The example topology has a single primary server and multiple replicas, using asynchronous row-based replication. The MySQL data is using a storage class backed by local SSD storage provided by the vSphere CSI driver performing the dynamic PersistentVolume provisioner.

This post continues from the previous post where I described how to setup multi-AZ topology aware volume provisioning with local storage.

I used this example here to setup a StatefulSet with MySQL to get an example application up and running.

However, I did not use the default storage class, but added one line to the mysql-statefulset.yaml file to use the storage class that is backed by local SSDs instead.

from:

    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

to:

    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: k8s-local-nvme
      resources:
        requests:
          storage: 10Gi

I also appended the StatefulSet to include the spec.template.spec.affinity and spec.template.spec.podAntiAffinity settings to make use of the three AZs for pod scheduling.

spec:
  selector:
    matchLabels:
      app: mysql
  serviceName: mysql
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.csi.vmware.com/k8s-zone
                operator: In
                values:
                - az-1
                - az-2
                - az-3
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - mysql
            topologyKey: topology.csi.vmware.com/k8s-zone

Everything else stayed the same. Please spend some time reading the example from kubernetes.io as I will be performing the same steps but using local storage instead to test the behavior of MySQL replication.

Architecture

I am using the same setup, with three replicas in the StatefulSet to match with the three AZs that I have setup in my lab.

My AZ layout is the following.

AZ	ESX host	TKG worker
az-1	esx1.vcd.lab	tkg-hugo-md-0-7d455b7488-g28bl
az-2	esx2.vcd.lab	tkg-hugo-md-1-7bbd55cdb8-996×2
az-3	esx3.vcd.lab	tkg-hugo-md-2-6c6c49dc67-xbpg7

We can see which pod runs on which worker using the following command:

k get po -o wide

NAME                READY   STATUS    RESTARTS   AGE     IP               NODE                             NOMINATED NODE   READINESS GATES
mysql-0             2/2     Running   0          3h24m   100.120.135.67   tkg-hugo-md-1-7bbd55cdb8-996x2   <none>           <none>
mysql-1             2/2     Running   0          3h22m   100.127.29.3     tkg-hugo-md-0-7d455b7488-g28bl   <none>           <none>
mysql-2             2/2     Running   0          113m    100.109.206.65   tkg-hugo-md-2-6c6c49dc67-xbpg7   <none>           <none>

To see which PVCs are using which AZs using the CSI driver’s node affinity we can use this command.

kubectl get pv -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.claimRef.name}{"\t"}{.spec.nodeAffinity}{"\n"}{end}'

pvc-06f9a40c-9fdf-48e3-9f49-b31ca2faf5a5        data-mysql-1    {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-1"]}]}]}}
pvc-1f586db7-12bb-474c-adb6-2f92d44789bb        data-mysql-2    {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-3"]}]}]}}
pvc-7108f915-e0e4-4028-8d45-6770b4d5be20        data-mysql-0    {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-2"]}]}]}}

We can see that each PV has been allocated to each AZ.

PV	Claim Name	AZ
pvc-06f9a40c-9fdf-48e3-9f49-b31ca2faf5a5	data-mysql-1	az-1
pvc-1f586db7-12bb-474c-adb6-2f92d44789bb	data-mysql-2	az-3
pvc-7108f915-e0e4-4028-8d45-6770b4d5be20	data-mysql-0	az-2

So we know which pod and which PV are on which worker node and on which ESX host.

Pod	PVC	Worker	ESX Host	AZ
mysql-0	data-mysql-0	tkg-hugo-md-1-7bbd55cdb8-996×2	esx2.vcd.lab	az-2
mysql-1	data-mysql-1	tkg-hugo-md-0-7d455b7488-g28bl	esx1.vcd.lab	az-1
mysql-2	data-mysql-2	tkg-hugo-md-2-6c6c49dc67-xbpg7	esx3.vcd.lab	az-3

For the rest of this exercise, I will perform the tests on mysql-2, tkg-hugo-md-2 and esx3.vcd.lab, which are all members of az-3.

Show data using mysql-client-loop pod

When all the pods are running we can run the following pod to constantly query the MySQL clusters.

kubectl run mysql-client-loop --image=mysql:5.7 -i -t --rm --restart=Never --\
  bash -ic "while sleep 1; do mysql -h mysql-read -e 'SELECT @@server_id,NOW()'; done"

Which would give us the following result:

+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2022-02-06 14:12:57 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         101 | 2022-02-06 14:12:58 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         101 | 2022-02-06 14:12:59 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         100 | 2022-02-06 14:13:00 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         102 | 2022-02-06 14:13:01 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         101 | 2022-02-06 14:13:02 |
+-------------+---------------------+
+-------------+---------------------+
| @@server_id | NOW()               |
+-------------+---------------------+
|         102 | 2022-02-06 14:13:03 |
+-------------+---------------------+

The server_id’s are either 100, 101, or 102, referencing either mysql-0, mysql-1 or mysql-2 respectively. We can see that we can read data from all three of the pods which means our MySQL service is running well across all three AZs.

Simulating Pod and Node downtime

To demonstrate the increased availability of reading from the pool of replicas instead of a single server, keep the SELECT @@server_id loop from above running while you force a Pod out of the Ready state.

Delete Pods

The StatefulSet also recreates Pods if they’re deleted, similar to what a ReplicaSet does for stateless Pods.

kubectl delete pod mysql-2

The StatefulSet controller notices that no mysql-2 Pod exists anymore, and creates a new one with the same name and linked to the same PersistentVolumeClaim. You should see server ID 102 disappear from the loop output for a while and then return on its own.

Drain a Node

If your Kubernetes cluster has multiple Nodes, you can simulate Node downtime (such as when Nodes are upgraded) by issuing a drain.

We already know that mysql-2 is running on worker tkg-hugo-md-2. Then drain the Node by running the following command, which cordons it so no new Pods may schedule there, and then evicts any existing Pods.

kubectl drain tkg-hugo-md-2-6c6c49dc67-xbpg7 --force --delete-emptydir-data --ignore-daemonsets

What happens now is the pod mysql-2 will be evicted, it will also have its PVC unattached. Because we only have one worker per AZ, mysql-2 won’t be able to be scheduled on another node in another AZ.

The mysql-client-loop pod would show that 102 (mysql-2) is no longer serving MySQL requests. The pod mysql-2 will stay with a status as pending until a worker is available in AZ2 again.

Perform maintenance on ESX

After draining the worker node, we can now go ahead and perform maintenance operations on the ESX host by placing it into maintenance mode. Doing so will VMotion any VMs that are not using shared storage. You will find that because the worker node is still powered on and has locally attached VMDKs, this will prevent the ESX host from going into maintenance mode.

We know that the worker node is already drained and the MySQL application has two other replicas that are running in two other AZs, so we can safely power off this worker and enable the ESX host to complete going into maintenance mode. Yes, power off instead of gracefully shutting down. Kubernetes worker nodes are cattle and not pets and Kubernetes will destroy it anyway.

Operations with local storage

Consider the following when using local storage with Tanzu Kubernetes Grid.

TKG worker nodes that have been tagged with a k8s-zone and have attached PVs will not be able to VMotion.
TKG worker nodes that have been tagged with a k8s-zone and do not have attached PVs will also not be able to VMotion as they have the affinity rule set to “Must run on this host”.
Placing a ESX host into maintenance mode will not complete until the TKG worker node running on that host has been powered off.

However, do not be alarmed by any of this, as this is normal behavior. Kubernetes workers can be replaced very often and since we have a stateful application with more than one replica, we can do this with no consequences.

The following section shows why this is the case.

How do TKG clusters with local storage handle ESX maintenance?

To perform maintenance on an ESX host that requires a host reboot perform the following.

Drain the TKG worker node of the host that you want to place into maintenance mode

kubectl drain <node-name> --force --delete-emptydir-data --ignore-daemonsets

What this does is it evicts all pods but daemonsets, it will also evict the MySQL pod running on this node, including removing the volume mount. In our example here, we still have the other two MySQL pods running on two other worker nodes.

Now place the ESX host into maintenance mode.
Power off the TKG worker node on this ESX host to allow the host to go into maintenance mode.
You might notice that TKG will try to delete that worker node and clone a new worker node on this host, but it cannot due to the host being in maintenance mode. This is normal behavior as any Kubernetes clusters will try to replace a worker that is no longer accessible. This of course is the case as we have powered ours off.
You will notice that Kubernetes does not try to create a worker node on any other ESX host. This is because the powered-off worker is labelled with one of the AZs therefore Kubernetes tries to place a new worker in the same AZ.
Perform ESX maintenance as normal and when complete exit the host from maintenance mode.
When the host exits maintenance mode, you’ll notice that Kubernetes can now delete the powered-off worker and replace it with a new one.
When the new worker node powers on and becomes ready, you will notice that the previous PV that was attached to the now deleted worker node is now attached to the new worker node.
The MySQL pod will then claim the PV and the pod will start and come out of pending status into ready status.
All three MySQL pods are now up and running and we have a healthy MySQL cluster again. Any MySQL data that was changed during this maintenance window will be replicated to the MySQL pod.

Summary

Using local storage backed storage classes with TKG is a viable alternative to using shared storage when your applications can perform data protection and replication at a higher level. Applications such as databases like the MySQL example that I used can benefit from using cheaper locally attached fast solid state media such as SSD or NVMe without the need to create hyperconverged storage environments. Applications that can replicated data at the application level, can avoid using SAN and NAS completely and benefit from simpler infrastructures and lower costs as well as benefiting from faster storage and lower latencies.

Using local storage with Tanzu Kubernetes Grid Topology Aware Volume Provisioning

With the vSphere CSI driver, it is now possible to use local storage with TKG clusters. This is enabled by TKG’s Topology Aware Volume Provisioning capability.

With this model, it is possible to present individual SSDs or NVMe drives attached to an ESXi host and configure a local datastore for use with topology aware volume provisioning. Kubernetes can then create persistent volumes and schedule pods that are deployed onto the worker nodes that are on the same ESXi host as the volume. This enables Kubernetes pods to have direct local access to the underlying storage.

With the vSphere CSI driver version 2.4.1, it is now possible to use local storage with TKG clusters. This is enabled by TKG’s Topology Aware Volume Provisioning capability.

Using local storage has distinct advantages over shared storage, especially when it comes to supporting faster and cheaper storage media for applications that do not benefit from or require the added complexity of having their data replicated by the storage layer. Examples of applications that do not require storage protection (RAID or failures to tolerate) are applications that can achieve data protection at the application level.

To setup such an environment, it is necessary to go over some of the requirements first.

Deploy Tanzu Kubernetes Clusters to Multiple Availability Zones on vSphere – link
Spread Nodes Across Multiple Hosts in a Single Compute Cluster
Configure Tanzu Kubernetes Plans and Clusters with an overlay that is topology-aware – link
Deploy TKG clusters into a multi-AZ topology
Deploy the k8s-local-ssd storage class
Deploy Workloads with WaitForFirstConsumer Mode in Topology-Aware Environment – link

Before you start

Note that only the CSI driver for vSphere version 2.4.1 supports local storage topology in a multi-AZ topology. To check if you have the correct version in your TKG cluster, run the following.

tanzu package installed get vsphere-csi -n tkg-system

- Retrieving installation details for vsphere-csi... I0224 19:20:29.397702  317993 request.go:665] Waited for 1.03368201s due to client-side throttling, not priority and fairness, request: GET:https://172.16.3.94:6443/apis/secretgen.k14s.io/v1alpha1?timeout=32s
\ Retrieving installation details for vsphere-csi...
NAME:                    vsphere-csi
PACKAGE-NAME:            vsphere-csi.tanzu.vmware.com
PACKAGE-VERSION:         2.4.1+vmware.1-tkg.1
STATUS:                  Reconcile succeeded
CONDITIONS:              [{ReconcileSucceeded True  }]

Deploy Tanzu Kubernetes Clusters to Multiple Availibility Zones on vSphere

In my example, I am using the Spread Nodes Across Multiple Hosts in a Single Compute Cluster example, each ESXi host is an availability zone (AZ) and the vSphere cluster is the Region.

Figure 1. shows a TKG cluster with three worker nodes, each node is running on a separate ESXi host. Each ESXi host has a local SSD drive formatted with VMFS 6. The topology aware volume provisioner would always place pods and their replicas on separate worker nodes and also any persistent volume claims (PVC) on separate ESXi hosts.

Parameter	Specification	vSphere object	Datastore
Region	tagCategory: k8s-region	cluster*
Zone az-1 az-2 az-3	tagCategory: k8s-zone host-group-1 host-group-2 host-group-3	esx1.vcd.lab esx2.vcd.lab esx3.vcd.lab	esx1-ssd-1 esx2-ssd-1 esx3-ssd-1
Storage Policy	k8s-local-ssd		esx1-ssd-1 esx2-ssd-1 esx3-ssd-1
Tags	tagCategory: k8s-storage tag: k8s-local-ssd		esx1-ssd-1 esx2-ssd-1 esx3-ssd-1

*Note that “cluster” is the name of my vSphere cluster.

Ensure that you’ve set up the correct rules that enforce worker nodes to their respective ESXi hosts. Always use “Must run on hosts in group“, this is very important for local storage topology to work. This is because the worker nodes will be labelled for topology awareness, and if a worker node is vMotion’d accidentally then the CSI driver will not be able to bind the PVC to the worker node.

Below is my vsphere-zones.yaml file.

Note that autoConfigure is set to true. Which means that you do not have to tag the cluster or the ESX hosts yourself, you would only need to setup up the affinity rules under Cluster, Configure, VM/Host Groups and VM/Host Rules. The setting autoConfigure: true, would then make CAPV automatically configure the tags and tag categories for you.

---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
 name: az-1
spec:
 region:
   name: cluster
   type: ComputeCluster
   tagCategory: k8s-region
   autoConfigure: true
 zone:
   name: az-1
   type: HostGroup
   tagCategory: k8s-zone
   autoConfigure: true
 topology:
   datacenter: home.local
   computeCluster: cluster
   hosts:
     vmGroupName: workers-group-1
     hostGroupName: host-group-1
   datastore: lun01
   networks:
   - tkg-workload
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
 name: az-2
spec:
 region:
   name: cluster
   type: ComputeCluster
   tagCategory: k8s-region
   autoConfigure: true
 zone:
   name: az-2
   type: HostGroup
   tagCategory: k8s-zone
   autoConfigure: true
 topology:
   datacenter: home.local
   computeCluster: cluster
   hosts:
     vmGroupName: workers-group-2
     hostGroupName: host-group-2
   datastore: lun01
   networks:
   - tkg-workload
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereFailureDomain
metadata:
 name: az-3
spec:
 region:
   name: cluster
   type: ComputeCluster
   tagCategory: k8s-region
   autoConfigure: true
 zone:
   name: az-3
   type: HostGroup
   tagCategory: k8s-zone
   autoConfigure: true
 topology:
   datacenter: home.local
   computeCluster: cluster
   hosts:
     vmGroupName: workers-group-3
     hostGroupName: host-group-3
   datastore: lun01
   networks:
   - tkg-workload
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
 name: az-1
spec:
 server: vcenter.vmwire.com
 failureDomain: az-1
 placementConstraint:
   resourcePool: tkg-vsphere-workload
   folder: tkg-vsphere-workload
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
 name: az-2
spec:
 server: vcenter.vmwire.com
 failureDomain: az-2
 placementConstraint:
   resourcePool: tkg-vsphere-workload
   folder: tkg-vsphere-workload
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereDeploymentZone
metadata:
 name: az-3
spec:
 server: vcenter.vmwire.com
 failureDomain: az-3
 placementConstraint:
   resourcePool: tkg-vsphere-workload
   folder: tkg-vsphere-workload

Note that Kubernetes does not like using parameter names that are not standard, I suggest for your vmGroupName and hostGroupName parameters, use lowercase and dashes instead of periods. For example host-group-3, instead of Host.Group.3. The latter will be rejected.

Configure Tanzu Kubernetes Plans and Clusters with an overlay that is topology-aware

To ensure that this topology can be built by TKG, we first need to create a TKG cluster plan overlay that tells Tanzu how what to do when creating worker nodes in a multi-availability zone topology.

Lets take a look at my az-overlay.yaml file.

Since I have three AZs, I need to create an overlay file that includes the cluster plan for all three AZs.

Parameter	Specification
Zone az-1 az-2 az-3	VSphereMachineTemplate -worker-0 -worker-1 -worker-2	KubeadmConfigTemplate -md-0 -md-1 -md-2

#! Please add any overlays specific to vSphere provider under this file.

#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")

#@ load("lib/helpers.star", "get_bom_data_for_tkr_name", "get_default_tkg_bom_data", "kubeadm_image_repo", "get_image_repo_for_component", "get_vsphere_thumbprint")

#@ load("lib/validate.star", "validate_configuration")
#@ load("@ytt:yaml", "yaml")
#@ validate_configuration("vsphere")

#@ bomDataForK8sVersion = get_bom_data_for_tkr_name()

#@ if data.values.CLUSTER_PLAN == "dev" and not data.values.IS_WINDOWS_WORKLOAD_CLUSTER:
#@overlay/match by=overlay.subset({"kind":"VSphereCluster"})
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereCluster
metadata:
  name: #@ data.values.CLUSTER_NAME
spec:
  thumbprint: #@ get_vsphere_thumbprint()
  server: #@ data.values.VSPHERE_SERVER
  identityRef:
    kind: Secret
    name: #@ data.values.CLUSTER_NAME

#@overlay/match by=overlay.subset({"kind":"MachineDeployment", "metadata":{"name": "{}-md-0".format(data.values.CLUSTER_NAME)}})
---
spec:
  template:
    spec:
      #@overlay/match missing_ok=True
      #@ if data.values.VSPHERE_AZ_0:
      failureDomain: #@ data.values.VSPHERE_AZ_0
      #@ end
      infrastructureRef:
        name: #@ "{}-worker-0".format(data.values.CLUSTER_NAME)

#@overlay/match by=overlay.subset({"kind":"VSphereMachineTemplate", "metadata":{"name": "{}-worker".format(data.values.CLUSTER_NAME)}})
---
metadata:
  name: #@ "{}-worker-0".format(data.values.CLUSTER_NAME)
spec:
  template:
    spec:
      #@overlay/match missing_ok=True
      #@ if data.values.VSPHERE_AZ_0:
      failureDomain: #@ data.values.VSPHERE_AZ_0
      #@ end
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
  name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
  #@overlay/match missing_ok=True
  annotations:
    vmTemplateMoid: #@ data.values.VSPHERE_TEMPLATE_MOID
spec:
  template:
    spec:
      cloneMode:  #@ data.values.VSPHERE_CLONE_MODE
      datacenter: #@ data.values.VSPHERE_DATACENTER
      datastore: #@ data.values.VSPHERE_DATASTORE
      storagePolicyName: #@ data.values.VSPHERE_STORAGE_POLICY_ID
      diskGiB: #@ data.values.VSPHERE_WORKER_DISK_GIB
      folder: #@ data.values.VSPHERE_FOLDER
      memoryMiB: #@ data.values.VSPHERE_WORKER_MEM_MIB
      network:
        devices:
          #@overlay/match by=overlay.index(0)
          #@overlay/replace
          - networkName: #@ data.values.VSPHERE_NETWORK
            #@ if data.values.WORKER_NODE_NAMESERVERS:
            nameservers: #@ data.values.WORKER_NODE_NAMESERVERS.replace(" ", "").split(",")
            #@ end
            #@ if data.values.TKG_IP_FAMILY == "ipv6":
            dhcp6: true
            #@ elif data.values.TKG_IP_FAMILY in ["ipv4,ipv6", "ipv6,ipv4"]:
            dhcp4: true
            dhcp6: true
            #@ else:
            dhcp4: true
            #@ end
      numCPUs: #@ data.values.VSPHERE_WORKER_NUM_CPUS
      resourcePool: #@ data.values.VSPHERE_RESOURCE_POOL
      server: #@ data.values.VSPHERE_SERVER
      template: #@ data.values.VSPHERE_TEMPLATE
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereMachineTemplate
metadata:
  name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
  #@overlay/match missing_ok=True
  annotations:
    vmTemplateMoid: #@ data.values.VSPHERE_TEMPLATE_MOID
spec:
  template:
    spec:
      cloneMode:  #@ data.values.VSPHERE_CLONE_MODE
      datacenter: #@ data.values.VSPHERE_DATACENTER
      datastore: #@ data.values.VSPHERE_DATASTORE
      storagePolicyName: #@ data.values.VSPHERE_STORAGE_POLICY_ID
      diskGiB: #@ data.values.VSPHERE_WORKER_DISK_GIB
      folder: #@ data.values.VSPHERE_FOLDER
      memoryMiB: #@ data.values.VSPHERE_WORKER_MEM_MIB
      network:
        devices:
          #@overlay/match by=overlay.index(0)
          #@overlay/replace
          - networkName: #@ data.values.VSPHERE_NETWORK
            #@ if data.values.WORKER_NODE_NAMESERVERS:
            nameservers: #@ data.values.WORKER_NODE_NAMESERVERS.replace(" ", "").split(",")
            #@ end
            #@ if data.values.TKG_IP_FAMILY == "ipv6":
            dhcp6: true
            #@ elif data.values.TKG_IP_FAMILY in ["ipv4,ipv6", "ipv6,ipv4"]:
            dhcp4: true
            dhcp6: true
            #@ else:
            dhcp4: true
            #@ end
      numCPUs: #@ data.values.VSPHERE_WORKER_NUM_CPUS
      resourcePool: #@ data.values.VSPHERE_RESOURCE_POOL
      server: #@ data.values.VSPHERE_SERVER
      template: #@ data.values.VSPHERE_TEMPLATE
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
  name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
spec:
  clusterName: #@ data.values.CLUSTER_NAME
  replicas: #@ data.values.WORKER_MACHINE_COUNT_1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
        node-pool: #@ "{}-worker-pool".format(data.values.CLUSTER_NAME)
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
      clusterName: #@ data.values.CLUSTER_NAME
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: VSphereMachineTemplate
        name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
      version: #@ data.values.KUBERNETES_VERSION
      #@ if data.values.VSPHERE_AZ_1:
      failureDomain: #@ data.values.VSPHERE_AZ_1
      #@ end
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  labels:
    cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
  name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
spec:
  clusterName: #@ data.values.CLUSTER_NAME
  replicas: #@ data.values.WORKER_MACHINE_COUNT_2
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME
        node-pool: #@ "{}-worker-pool".format(data.values.CLUSTER_NAME)
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
      clusterName: #@ data.values.CLUSTER_NAME
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: VSphereMachineTemplate
        name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
      version: #@ data.values.KUBERNETES_VERSION
      #@ if data.values.VSPHERE_AZ_2:
      failureDomain: #@ data.values.VSPHERE_AZ_2
      #@ end
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: #@ "{}-md-1".format(data.values.CLUSTER_NAME)
  namespace: '${ NAMESPACE }'
spec:
  template:
    spec:
      useExperimentalRetryJoin: true
      joinConfiguration:
        nodeRegistration:
          criSocket: /var/run/containerd/containerd.sock
          kubeletExtraArgs:
            cloud-provider: external
            tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
          name: '{{ ds.meta_data.hostname }}'
      preKubeadmCommands:
        - hostname "{{ ds.meta_data.hostname }}"
        - echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
        - echo "127.0.0.1   localhost" >>/etc/hosts
        - echo "127.0.0.1   {{ ds.meta_data.hostname }}" >>/etc/hosts
        - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
      files: []
      users:
        - name: capv
          sshAuthorizedKeys:
            - #@ data.values.VSPHERE_SSH_AUTHORIZED_KEY
          sudo: ALL=(ALL) NOPASSWD:ALL
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: #@ "{}-md-2".format(data.values.CLUSTER_NAME)
  namespace: '${ NAMESPACE }'
spec:
  template:
    spec:
      useExperimentalRetryJoin: true
      joinConfiguration:
        nodeRegistration:
          criSocket: /var/run/containerd/containerd.sock
          kubeletExtraArgs:
            cloud-provider: external
            tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
          name: '{{ ds.meta_data.hostname }}'
      preKubeadmCommands:
        - hostname "{{ ds.meta_data.hostname }}"
        - echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
        - echo "127.0.0.1   localhost" >>/etc/hosts
        - echo "127.0.0.1   {{ ds.meta_data.hostname }}" >>/etc/hosts
        - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
      files: []
      users:
        - name: capv
          sshAuthorizedKeys:
            - #@ data.values.VSPHERE_SSH_AUTHORIZED_KEY
          sudo: ALL=(ALL) NOPASSWD:ALL
#@ end

Deploy a TKG cluster into a multi-AZ topology

To deploy a TKG cluster that spreads its worker nodes over multiple AZs, we need to add some key value pairs into the cluster config file.

Below is an example for my cluster config file – tkg-hugo.yaml.

The new key value pairs are described in the table below.

Parameter	Specification	Details
VSPHERE_REGION	k8s-region	Must be the same as the configuration in the vsphere-zones.yaml file
VSPHERE_ZONE	k8s-zone	Must be the same as the configuration in the vsphere-zones.yaml file
VSPHERE_AZ_0 VSPHERE_AZ_1 VSPHERE_AZ_2	az-1 az-2 az-3	Must be the same as the configuration in the vsphere-zones.yaml file
WORKER_MACHINE_COUNT	3	This is the number of worker nodes for the cluster. The total number of workers are distributed in a round-robin fashion across the number of AZs specified.
A note on WORKER_MACHINE_COUNT when using CLUSTER_PLAN: dev instead of prod. If you change the az-overlay.yaml @ if data.values.CLUSTER_PLAN == “prod” to @ if data.values.CLUSTER_PLAN == “dev”		Then the WORKER_MACHINE_COUNT reverts to the number of workers for each AZ. So if you set this number to 3, in a three AZ topology, you would end up with a TKG cluster with nine workers!

CLUSTER_CIDR: 100.96.0.0/11
CLUSTER_NAME: tkg-hugo
CLUSTER_PLAN: prod
ENABLE_CEIP_PARTICIPATION: 'false'
ENABLE_MHC: 'true'
IDENTITY_MANAGEMENT_TYPE: none
INFRASTRUCTURE_PROVIDER: vsphere
SERVICE_CIDR: 100.64.0.0/13
TKG_HTTP_PROXY_ENABLED: false
DEPLOY_TKG_ON_VSPHERE7: 'true'
VSPHERE_DATACENTER: /home.local
VSPHERE_DATASTORE: lun02
VSPHERE_FOLDER: /home.local/vm/tkg-vsphere-workload
VSPHERE_NETWORK: /home.local/network/tkg-workload
VSPHERE_PASSWORD: <encoded:snipped>
VSPHERE_RESOURCE_POOL: /home.local/host/cluster/Resources/tkg-vsphere-workload
VSPHERE_SERVER: vcenter.vmwire.com
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa <snipped> administrator@vsphere.local
VSPHERE_USERNAME: administrator@vsphere.local
CONTROLPLANE_SIZE: small
WORKER_MACHINE_COUNT: 3
WORKER_SIZE: small
VSPHERE_INSECURE: 'true'
ENABLE_AUDIT_LOGGING: 'true'
ENABLE_DEFAULT_STORAGE_CLASS: 'false'
ENABLE_AUTOSCALER: 'false'
AVI_CONTROL_PLANE_HA_PROVIDER: 'true'
VSPHERE_REGION: k8s-region
VSPHERE_ZONE: k8s-zone
VSPHERE_AZ_0: az-1
VSPHERE_AZ_1: az-2
VSPHERE_AZ_2: az-3

Deploy the k8s-local-ssd Storage Class

Below is my storageclass-k8s-local-ssd.yaml.

Note that parameters.storagePolicyName: k8s-local-ssd, which is the same as the name of the storage policy for the local storage. All three of the local VMFS datastores that are backed by the local SSD drives are members of this storage policy.

Note that the volumeBindingMode is set to WaitForFirstConsumer.

Instead of creating a volume immediately, the WaitForFirstConsumer setting instructs the volume provisioner to wait until a pod using the associated PVC runs through scheduling. In contrast with the Immediate volume binding mode, when the WaitForFirstConsumer setting is used, the Kubernetes scheduler drives the decision of which failure domain to use for volume provisioning using the pod policies.

This guarantees the pod at its volume is always on the same AZ (ESXi host).

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: k8s-local-ssd
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  storagePolicyName: k8s-local-ssd

Deploy a workload that uses Topology Aware Volume Provisioning

Below is a statefulset that deploys three pods running nginx. It configures two persistent volumes, one for www and another for log. Both of these volumes are going to be provisioned onto the same ESXi host where the pod is running. The statefulset also runs an initContainer that will download a simple html file from my repo and copy it to the www mount point (/user/share/nginx/html).

You can see under spec.affinity.nodeAffinity how the statefulset uses the topology.

The statefulset then exposes the nginx app using the nginx-service which uses the Gateway API, that I wrote about in a previous blog post.

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: default
  labels:
    ako.vmware.com/gateway-name: gateway-tkg-workload-vip
    ako.vmware.com/gateway-namespace: default
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  type: ClusterIP
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  serviceName: nginx-service
  template:
    metadata:
      labels:
        app: nginx
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.csi.vmware.com/k8s-zone
                operator: In
                values:
                - az-1
                - az-2
                - az-3
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - nginx
            topologyKey: topology.csi.vmware.com/k8s-zone
      terminationGracePeriodSeconds: 10
      initContainers:
      - name: install
        image: busybox
        command:
        - wget
        - "-O"
        - "/www/index.html"
        - https://raw.githubusercontent.com/hugopow/cse/main/index.html
        volumeMounts:
        - name: www
          mountPath: "/www"
      containers:
        - name: nginx
          image: k8s.gcr.io/nginx-slim:0.8
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: www
              mountPath: /usr/share/nginx/html
            - name: logs
              mountPath: /logs
  volumeClaimTemplates:
    - metadata:
        name: www
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: k8s-local-ssd
        resources:
          requests:
            storage: 2Gi
    - metadata:
        name: logs
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: k8s-local-ssd
        resources:
          requests:
            storage: 1Gi

What if you wanted to use more than three availability zones?

Some notes here on what I experienced during my testing.

The TKG cluster config has the following three lines to specify the names of the AZs that you want to use which will be passed onto the Tanzu CLI to use to deploy your TKG cluster using the ytt overlay file. However, the Tanzu CLI only supports a total of three AZs.

VSPHERE_AZ_0: az-1
VSPHERE_AZ_1: az-2
VSPHERE_AZ_2: az-3

If you wanted to use more than three AZs, then you would have to remove these three lines from the TKG cluster config and change the ytt overlay to not use the VSPHERE_AZ_# variables but to hard code the AZs into the ytt overlay file instead.

To do this replace the following:

      #@ if data.values.VSPHERE_AZ_2:
      failureDomain: #@ data.values.VSPHERE_AZ_0
      #@ end

with the following:

      failureDomain: az-2

and create an additional block of MachineDeployment and KubeadmConfigTemplate for each additional AZ that you need.

Summary

Below are screenshots and the resulting deployed objects after running kubectl apply -f to the above.

kubectl get nodes

NAME                             STATUS   ROLES                  AGE     VERSION
tkg-hugo-md-0-7d455b7488-d6jrl   Ready    <none>                 3h23m   v1.22.5+vmware.1
tkg-hugo-md-1-bc76659f7-cntn4    Ready    <none>                 3h23m   v1.22.5+vmware.1
tkg-hugo-md-2-6bb75968c4-mnrk5   Ready    <none>                 3h23m   v1.22.5+vmware.1

You can see that the worker nodes are distributed across the ESXi hosts as per our vsphere-zones.yaml and also our az-overlay.yaml files.

kubectl get po -o wide

NAME    READY   STATUS    RESTARTS   AGE     IP                NODE                             NOMINATED NODE   READINESS GATES
web-0   1/1     Running   0          3h14m   100.124.232.195   tkg-hugo-md-2-6bb75968c4-mnrk5   <none>           <none>
web-1   1/1     Running   0          3h13m   100.122.148.67    tkg-hugo-md-1-bc76659f7-cntn4    <none>           <none>
web-2   1/1     Running   0          3h12m   100.108.145.68    tkg-hugo-md-0-7d455b7488-d6jrl   <none>           <none>

You can see that each pod is placed on a separate worker node.

kubectl get csinodes -o jsonpath='{range .items[*]}{.metadata.name} {.spec}{"\n"}{end}'

tkg-hugo-md-0-7d455b7488-d6jrl {"drivers":[{"allocatable":{"count":59},"name":"csi.vsphere.vmware.com","nodeID":"tkg-hugo-md-0-7d455b7488-d6jrl","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
tkg-hugo-md-1-bc76659f7-cntn4 {"drivers":[{"allocatable":{"count":59},"name":"csi.vsphere.vmware.com","nodeID":"tkg-hugo-md-1-bc76659f7-cntn4","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}
tkg-hugo-md-2-6bb75968c4-mnrk5 {"drivers":[{"allocatable":{"count":59},"name":"csi.vsphere.vmware.com","nodeID":"tkg-hugo-md-2-6bb75968c4-mnrk5","topologyKeys":["topology.csi.vmware.com/k8s-region","topology.csi.vmware.com/k8s-zone"]}]}

We can see that the CSI driver has correctly configured the worker nodes with the topologyKeys that enables the topology aware volume provisioning.

kubectl get pvc -o wide

NAME         STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS    AGE     VOLUMEMODE
logs-web-0   Bound    pvc-13cf4150-db60-4c13-9ee2-cbc092dba782   1Gi        RWO            k8s-local-ssd   3h18m   Filesystem
logs-web-1   Bound    pvc-e99cfe33-9fa4-46d8-95f8-8a71f4535b15   1Gi        RWO            k8s-local-ssd   3h17m   Filesystem
logs-web-2   Bound    pvc-6bd51eed-e0aa-4489-ac0a-f546dadcee16   1Gi        RWO            k8s-local-ssd   3h17m   Filesystem
www-web-0    Bound    pvc-8f46420a-41c4-4ad3-97d4-5becb9c45c94   2Gi        RWO            k8s-local-ssd   3h18m   Filesystem
www-web-1    Bound    pvc-c3c9f551-1837-41aa-b24f-f9dc6fdb9063   2Gi        RWO            k8s-local-ssd   3h17m   Filesystem
www-web-2    Bound    pvc-632a9f81-3e9d-492b-847a-9316043a2d47   2Gi        RWO            k8s-local-ssd   3h17m   Filesystem

kubectl get pv -o=jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.claimRef.name}{"\t"}{.spec.nodeAffinity}{"\n"}{end}'

pvc-13cf4150-db60-4c13-9ee2-cbc092dba782        logs-web-0      {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-3"]}]}]}}
pvc-632a9f81-3e9d-492b-847a-9316043a2d47        www-web-2       {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-1"]}]}]}}
pvc-6bd51eed-e0aa-4489-ac0a-f546dadcee16        logs-web-2      {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-1"]}]}]}}
pvc-8f46420a-41c4-4ad3-97d4-5becb9c45c94        www-web-0       {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-3"]}]}]}}
pvc-c3c9f551-1837-41aa-b24f-f9dc6fdb9063        www-web-1       {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]},{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-2"]}]}]}}
pvc-e99cfe33-9fa4-46d8-95f8-8a71f4535b15        logs-web-1      {"required":{"nodeSelectorTerms":[{"matchExpressions":[{"key":"topology.csi.vmware.com/k8s-zone","operator":"In","values":["az-2"]},{"key":"topology.csi.vmware.com/k8s-region","operator":"In","values":["cluster"]}]}]}}

Here we see the placement for the persistent volumes within the AZs and they also align to the right worker node.

k get no tkg-hugo-md-0-7d455b7488-d6jrl -o yaml | grep topology.kubernetes.io/zone:

topology.kubernetes.io/zone: az-1

k get no tkg-hugo-md-1-bc76659f7-cntn4 -o yaml | grep topology.kubernetes.io/zone:

topology.kubernetes.io/zone: az-2

k get no tkg-hugo-md-2-6bb75968c4-mnrk5 -o yaml | grep topology.kubernetes.io/zone:

topology.kubernetes.io/zone: az-3

k get volumeattachments.storage.k8s.io

NAME                                                                   ATTACHER                 PV                                         NODE                             ATTACHED   AGE
csi-476b244713205d0d4d4e13da1a6bd2beec49ac90fbd4b64c090ffba8468f6479   csi.vsphere.vmware.com   pvc-c3c9f551-1837-41aa-b24f-f9dc6fdb9063   tkg-hugo-md-1-bc76659f7-cntn4    true       9h
csi-5a759811557125917e3b627993061912386f4d2e8fb709e85fc407117138b178   csi.vsphere.vmware.com   pvc-8f46420a-41c4-4ad3-97d4-5becb9c45c94   tkg-hugo-md-2-6bb75968c4-mnrk5   true       9h
csi-6016904b0ac4ac936184e95c8ff0b3b8bebabb861a99b822e6473c5ee1caf388   csi.vsphere.vmware.com   pvc-6bd51eed-e0aa-4489-ac0a-f546dadcee16   tkg-hugo-md-0-7d455b7488-d6jrl   true       9h
csi-c5b9abcc05d7db5348493952107405b557d7eaa0341aa4e952457cf36f90a26d   csi.vsphere.vmware.com   pvc-13cf4150-db60-4c13-9ee2-cbc092dba782   tkg-hugo-md-2-6bb75968c4-mnrk5   true       9h
csi-df68754411ab34a5af1c4014db9e9ba41ee216d0f4ec191a0d191f07f99e3039   csi.vsphere.vmware.com   pvc-e99cfe33-9fa4-46d8-95f8-8a71f4535b15   tkg-hugo-md-1-bc76659f7-cntn4    true       9h
csi-f48a7db32aafb2c76cc22b1b533d15d331cd14c2896b20cfb4d659621fd60fbc   csi.vsphere.vmware.com   pvc-632a9f81-3e9d-492b-847a-9316043a2d47   tkg-hugo-md-0-7d455b7488-d6jrl   true       9h

And finally, some other screenshots to show the PVCs in vSphere.

ESX1

ESX2

ESX3

Running Kubernetes Dashboard with signed certificates

In a previous post I went through how to deploy the Kubernetes Dashboard into a Kubernetes cluster with default settings, running with a self-signed certificate. This post covers how to update the configuration to use a signed certificate. I’m a fan of Let’s Encrypt so will be using a signed wildcard certificate from Let’s Encrypt for this post.

You can prepare Let’s Encrypt by referring to a previous post here.

Step 1. Create a new namespace

Create a new namespace for Kubernetes Dashboard

kubectl create ns kubernetes-dashboard

Step 2. Upload certificates

Upload your certificate and private key to $HOME/certs in pem format. Let’s Encrypt just happens to issue certificates in pem format with the following names:

cert.pem and privkey.pem

All we need to do is to rename these to:

tls.crt and tls.key

And then upload them to $HOME/certs where our kubectl tool is installed.

Step 3. Create secret

Create the secret for the custom certificate by running this command.

kubectl create secret generic kubernetes-dashboard-certs --from-file=$HOME/certs -n kubernetes-dashboard

You can check that that secret is ready by issuing the following command:

kubectl describe secret -n kubernetes-dashboard kubernetes-dashboard-certs

Name:         kubernetes-dashboard-certs
Namespace:    kubernetes-dashboard
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
tls.key:  1705 bytes
tls.crt:  1835 bytes

Step 4. Edit the deployment

We need to download the deployment yaml and then edit it to ensure that it uses the Let’s Encrypt signed certificates.

Run the following command to download the Kubernetes Dashboard deployment yaml file

wget https://raw.githubusercontent.com/kubernetes/dashboard/v2.5.0/aio/deploy/recommended.yaml

Now edit it by using your favorite editor and add in the following two lines

        - --tls-cert-file=/tls.crt
        - --tls-key-file=/tls.key

under the following

Deployment – kubernetes-dashboard – spec.template.spec.containers.args

kind: Deployment
apiVersion: apps/v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: kubernetes-dashboard
          image: kubernetesui/dashboard:v2.5.0
          imagePullPolicy: Always
          ports:
            - containerPort: 8443
              protocol: TCP
          args:
            - --tls-cert-file=/tls.crt
            - --tls-key-file=/tls.key
            - --auto-generate-certificates

Step 5. Expose Kubernetes Dashboard using a load balancer

Let’s expose the app using a load balancer, I’m using NSX ALB (Avi) but the code below can be used with any load balancer.

Continue editing the recommended.yaml file with the following contents from line 32:

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  type: LoadBalancer

Save changes to the file. Now we’re ready to deploy.

If you want to use the Avi Services API (K8s Gateway API). Then add labels to the service, like this. This will ensure that the service uses the Avi gateway.

kind: Service
apiVersion: v1
metadata:
  labels:
    k8s-app: kubernetes-dashboard
    ako.vmware.com/gateway-name: gateway-tkg-workload-vip
    ako.vmware.com/gateway-namespace: default
  name: kubernetes-dashboard
  namespace: kubernetes-dashboard
spec:
  ports:
    - port: 443
      targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  type: LoadBalancer

Step 6. Deploy Kubernetes Dashboard

Deploy the app with the following command:

kubectl apply -f recommended.yaml

Step 7. Get full access to the cluster for Kubernetes Dashboard

To get full cluster access to the kubernetes-dashboard account run the following

kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kubernetes-dashboard:kubernetes-dashboard

To login we’ll need to obtain a token with the following code:

kubectl describe -n kubernetes-dashboard secret kubernetes-dashboard-token

Copy just the token and paste it into the browser to login. Enjoy a secure connection to Kubernetes Dashboard. Enjoy!

Deploying Harbor onto Photon OS for Air-gapped Environments

This post describes how to setup Harbor to run on a standalone VM. There are times when you want to do this, such as occasions where your environment does not have internet access or you want to have a local repository running close to your environment.

I found that I was running a lot of TKG deployments against TKG staging builds in my lab and wanted to speed up cluster creation times, so building a local Harbor repository would make things a bit quicker and more reliable.

This post describes how you can setup a Harbor repository on a Photon VM.

Step 1: Setup a static IP

See the documentation https://vmware.github.io/photon/assets/files/html/3.0/photon_admin/setting-a-static-ip-address.html, and https://vmware.github.io/photon/assets/files/html/3.0/photon_admin/adding-a-dns-server.html

vi /etc/systemd/network/10-static-en.network

chmod 644 /etc/systemd/network/10-static-en.network
systemctl restart systemd-networkd

vi /etc/hostname

reboot

Step 2: Enable pings to the VM

iptables -A INPUT -p ICMP -j ACCEPT
iptables -A OUTPUT -p ICMP -j ACCEPT

Step 3: Update Photon repositories and perform updates

cd /etc/yum.repos.d/
sed  -i 's/dl.bintray.com\/vmware/packages.vmware.com\/photon\/$releasever/g' photon.repo photon-updates.repo photon-extras.repo photon-debuginfo.repo
tdnf --assumeyes update
tdnf updateinfo
tdnf -y distro-sync
tdnf install -y bindutils tar parted
reboot

Step 4: Install docker-compose

curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
docker-compose --version
systemctl start docker
systemctl enable docker
docker version

Step 5: Add a data disk for Harbor

Add another vmdk file to the VM then run the below

fdisk -l
parted /dev/sdb mklabel gpt mkpart ext4 0% 100%
mkfs -t ext4 /dev/sdb1
mkdir /data
vim /etc/fstab

Append the following line to the end of the file

/dev/sdb1 /data ext4 defaults 0 0

mount /data
df -h

Step 6: Setup Harbor

mkdir -p /harbor /etc/docker/certs.d/harbor.vmwire.com
cd /harbor
curl -sLO https://github.com/goharbor/harbor/releases/download/v2.4.1/harbor-offline-installer-v2.4.1.tgz
tar xvf harbor-offline-installer-v2.4.1.tgz --strip-components=1

Step 7: Prepare SSL certificates

I use Let’s Encrypt and have the following three files renamed from the original Let’s Encrypt filenames:

harbor.cert

harbor_key.key and

ca.crt

harbor.cert is the wildcard certificate issued for my domain by Let’s Encrypt, it is originally named cert.pem.

harbor_key.key is orginally named privkey.pem.

ca.crt is chain.pem.

Copy all three certificate files to /etc/docker/certs.d/harbor.vmwire.com

cp harbor.cert harbor_key.key ca.crt /etc/docker/certs.d/harbor.vmwire.com/

Step 8: Edit the harbor.yml file

# Configuration file of Harbor

# The IP address or hostname to access admin UI and registry service.
# DO NOT use localhost or 127.0.0.1, because Harbor needs to be accessed by external clients.
hostname: harbor.vmwire.com

# http related config
http:
  # port for http, default is 80. If https enabled, this port will redirect to https port
  port: 80

# https related config
https:
  # https port for harbor, default is 443
  port: 443
  # The path of cert and key files for nginx
  certificate: /etc/docker/certs.d/harbor.vmwire.com/harbor.cert
  private_key: /etc/docker/certs.d/harbor.vmwire.com/harbor_key.key

[snipped]

Update line 5 with your harbor instance’s FQDN.

Update lines 17 and 18 with the certificate and private key.

You can leave all the other lines on default.

Install Harbor with the following command:

./install.sh

Check to see if services are running

docker-compose ps

Step 9: Add harbor FQDN to your DNS servers and connect to Harbor.

To upgrade, download the new offline installer and run

install.sh

Using Avi’s Support for Gateway API

Avi (NSX Advanced Load Balancer) supports Kubernetes Gateway API. This post shows how to install and use the Gateway API to expose applications using this custom resource definition (CRD).

Introduction

Avi (NSX Advanced Load Balancer) supports Kubernetes Gateway API. This post shows how to install and use the Gateway API to expose applications using this custom resource definition (CRD).

Gateway API is an open source project managed by the SIG-NETWORK community. It is a collection of resources that model service networking in Kubernetes. These resources – GatewayClass,Gateway, HTTPRoute, TCPRoute, Service, etc – aim to evolve Kubernetes service networking through expressive, extensible, and role-oriented interfaces that are implemented by many vendors and have broad industry support.
https://gateway-api.sigs.k8s.io/

For a quick introduction to the Kubernetes Gateway API, read this link and this link from the Avi documentation.

Why use Gateway API?

You would want to use the Gateway API if you had the following requirements:

Network segmentation – exposing applications from the same Kubernetes cluster to different network segments
Shared IP – exposing multiple services that use both TCP and UDP ports on the same IP address

NSX Advanced Load Balancer supports both of these requirements through the use of the Gateway API. The following section describes how this is implemented.

The Gateway API introduces a few new resource types:

GatewayClasses are cluster-scoped resources that act as templates to explicitly define behavior for Gateways derived from them. This is similar in concept to StorageClasses, but for networking data-planes.

Gateways are the deployed instances of GatewayClasses. They are the logical representation of the data-plane which performs routing, which may be in-cluster proxies, hardware LBs, or cloud LBs.

AVI Infra Setting

Aviinfrasetting provides a way to segregate Layer-4/Layer-7 virtual services to have properties based on different underlying infrastructure components, like Service Engine Group, intended VIP Network etc.

A sample Avi Infra Setting is as shown below:

apiVersion: ako.vmware.com/v1alpha1
kind: AviInfraSetting
metadata:
  name: aviinfrasetting-tkg-workload-vip
spec:
  seGroup:
    name: tkgvsphere-tkgworkload-group10
  network:
    vipNetworks:
      - networkName: tkg-workload-vip
        cidr: 172.16.4.64/27
    enableRhi: false

Avi Infra Setting is a cluster scoped CRD and can be attached to the intended Services. Avi Infra setting resources can be attached to Services using Gateway APIs.

GatewayClass

Gateway APIs provide interfaces to structure Kubernetes service networking.

AKO supports Gateway APIs via the servicesAPI flag in the values.yaml.

The Avi Infra Setting resource can be attached to a Gateway Class object, via the .spec.parametersRef as shown below:

apiVersion: networking.x-k8s.io/v1alpha1
kind: GatewayClass
metadata:
  name: gatewayclass-tkg-workload-vip
spec:
  controller: ako.vmware.com/avi-lb
  parametersRef:
    group: ako.vmware.com
    kind: AviInfraSetting
    name: aviinfrasetting-tkg-workload-vip

Gateway

The Gateway object provides a way to configure multiple Services as backends to the Gateway using label matching. The labels are specified as constant key-value pairs, the keys being ako.vmware.com/gateway-namespace and ako.vmware.com/gateway-name. The values corresponding to these keys must match the Gateway namespace and name respectively, for AKO to consider the Gateway valid. In case any one of the label keys are not provided as part of matchLabels OR the namespace/name provided in the label values do no match the actual Gateway namespace/name, AKO will consider the Gateway invalid.

Please see https://avinetworks.com/docs/ako/1.5/gateway/.

kind: Gateway
apiVersion: networking.x-k8s.io/v1alpha1
metadata:
  name: gateway-tkg-workload-vip
  namespace: default
spec:
  gatewayClassName: gatewayclass-tkg-workload-vip
  listeners:
  - protocol: TCP
    port: 80
    routes:
      selector:
        matchLabels:
          ako.vmware.com/gateway-name: gateway-tkg-workload-vip
          ako.vmware.com/gateway-namespace: default
      group: v1
      kind: Service
- protocol: TCP
    port: 443
    routes:
      selector:
        matchLabels:
          ako.vmware.com/gateway-name: gateway-tkg-workload-vip
          ako.vmware.com/gateway-namespace: default
      group: v1
      kind: Service

How to use GatewayAPI

Tying all of these CRDs together.

A Gateway uses a GatewayClass, which in turn uses an AviInfraSetting. Therefore when a Gateway is used by a Service using the relevant labels, that particular service will be exposed on a network that is referenced by the AviInfraSetting via the .spec.network.vipNetworks

https://github.com/vmware/load-balancer-and-ingress-services-for-kubernetes/blob/master/docs/crds/avinfrasetting.md#aviinfrasetting-with-servicesingressroutes

In your helm charts, for any service that needs a LoadBalancer service. You would now want to use ClusterIP instead of LoadBalancer and use Labels such as the following:

apiVersion: v1
kind: Service
metadata:
  name: web-statefulset-service-1
  namespace: default
  labels:
    ako.vmware.com/gateway-name: gateway-tkg-workload-vip
    ako.vmware.com/gateway-namespace: default
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  type: ClusterIP

The Labels

ako.vmware.com/gateway-name: gateway-tkg-workload-vip
ako.vmware.com/gateway-namespace: default

and the ClusterIP type tells the AKO operator to use the gateways, each gateway is on a separate network segment for traffic separation via the spec.gatewayClassName and conversely the gatewayclass via the spec.parametersRef.name for the AviInfraSetting.

Scaling TKGm control plane nodes vertically

This post describes how to change TKGm control plane nodes resources, such as vCPU and RAM. In the previous post, I described how to increase resources for a worker node. This process was quite simple and straightforward and initially I had a tough time finding the right resource to edit as the control plane nodes use a different resource to provision the virtual machines.

Step 1. Change to the TKG management cluster context

kubectl config use-context tkg-mgmt

Step 2. List VSphereMachineTemplate

kubectl get VSphereMachineTemplate

Step 4. Make a copy of the current control plane VsphereMachineTemplate to a new file

kubectl get vspheremachinetemplates tkg-ssc-control-plane -o yaml > tkg-ssc-control-plane-new.yaml

Step 5. Edit the new file and make the changes

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"VSphereMachineTemplate","metadata":{"annotations":{},"name":"tkg-ssc-control-plane-new","namespace":"default"},"spec":{"template":{"spec":{"cloneMode":"fullClone","datacenter":"/home.local","datastore":"lun01","diskGiB":40,"folder":"/home.local/vm/tkg-vsphere-shared-services","memoryMiB":4096,"network":{"devices":[{"dhcp4":true,"networkName":"/home.local/network/tkg-mgmt"}]},"numCPUs":2,"resourcePool":"/home.local/host/cluster/Resources/tkg-vsphere-shared-services","server":"vcenter.vmwire.com","storagePolicyName":"","template":"/home.local/vm/Templates/ubuntu-2004-kube-v1.21.2+vmware.1"}}}}
  creationTimestamp: "2021-11-11T07:33:37Z"
  generation: 1
  name: tkg-ssc-control-plane-new
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    name: tkg-ssc
    uid: 9bd41852-38df-4d12-bb81-7b2bb35fdfa5
  resourceVersion: "198053"
  uid: 09b62ee7-6532-4bf6-8939-ef70a28bc65f
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /home.local
      datastore: lun01
      diskGiB: 40
      folder: /home.local/vm/tkg-vsphere-shared-services
      memoryMiB: 4096
      network:
        devices:
        - dhcp4: true
          networkName: /home.local/network/tkg-mgmt
      numCPUs: 2
      resourcePool: /home.local/host/cluster/Resources/tkg-vsphere-shared-services
      server: vcenter.vmwire.com
      storagePolicyName: ""
      template: /home.local/vm/Templates/ubuntu-2004-kube-v1.21.2+vmware.1

I made changes to lines 6, 9, 26 and 31. I want to reduce the vCPU and RAM of the control plane nodes as these were over-provisioned by mistake.

Step 6. Apply the new VsphereMachineTemplate

kubectl apply -f tkg-ssc-control-plane-new.yaml

Step 7. List all KubeadmControlPlane

kubectl get KubeadmControlPlane

NAME                    INITIALIZED   API SERVER AVAILABLE   VERSION            REPLICAS   READY   UPDATED   UNAVAILABLE
tkg-ssc-control-plane   true          true                   v1.21.2+vmware.1   1          1       1

Step 8. Edit the KubeadmControlPlane for the cluster

kubectl edit KubeadmControlPlane tkg-ssc-control-plane

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"controlplane.cluster.x-k8s.io/v1alpha3","kind":"KubeadmControlPlane","metadata":{"annotations":{},"name":"tkg-ssc-control-plane","namespace":"default"},"spec":{"infrastructureTemplate":{"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"VSphereMachineTemplate","name":"tkg-ssc-control-plane"},"kubeadmConfigSpec":{"clusterConfiguration":{"apiServer":{"extraArgs":{"audit-log-maxage":"30","audit-log-maxbackup":"10","audit-log-maxsize":"100","audit-log-path":"/var/log/kubernetes/audit.log","audit-policy-file":"/etc/kubernetes/audit-policy.yaml","cloud-provider":"external","tls-cipher-suites":"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"},"extraVolumes":[{"hostPath":"/etc/kubernetes/audit-policy.yaml","mountPath":"/etc/kubernetes/audit-policy.yaml","name":"audit-policy"},{"hostPath":"/var/log/kubernetes","mountPath":"/var/log/kubernetes","name":"audit-logs"}],"timeoutForControlPlane":"8m0s"},"controllerManager":{"extraArgs":{"cloud-provider":"external","tls-cipher-suites":"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"}},"dns":{"imageRepository":"projects.registry.vmware.com/tkg","imageTag":"v1.8.0_vmware.5","type":"CoreDNS"},"etcd":{"local":{"dataDir":"/var/lib/etcd","extraArgs":{"cipher-suites":"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"},"imageRepository":"projects.registry.vmware.com/tkg","imageTag":"v3.4.13_vmware.15"}},"imageRepository":"projects.registry.vmware.com/tkg","scheduler":{"extraArgs":{"tls-cipher-suites":"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"}}},"files":[{"content":"---snip---","encoding":"base64","owner":"root:root","path":"/etc/kubernetes/audit-policy.yaml","permissions":"0600"}],"initConfiguration":{"nodeRegistration":{"criSocket":"/var/run/containerd/containerd.sock","kubeletExtraArgs":{"cloud-provider":"external","tls-cipher-suites":"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"},"name":"{{ ds.meta_data.hostname }}"}},"joinConfiguration":{"nodeRegistration":{"criSocket":"/var/run/containerd/containerd.sock","kubeletExtraArgs":{"cloud-provider":"external","tls-cipher-suites":"TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384"},"name":"{{ ds.meta_data.hostname }}"}},"preKubeadmCommands":["hostname \"{{ ds.meta_data.hostname }}\"","echo \"::1         ipv6-localhost ipv6-loopback\" \u003e/etc/hosts","echo \"127.0.0.1   localhost\" \u003e\u003e/etc/hosts","echo \"127.0.0.1   {{ ds.meta_data.hostname }}\" \u003e\u003e/etc/hosts","echo \"{{ ds.meta_data.hostname }}\" \u003e/etc/hostname"],"useExperimentalRetryJoin":true,"users":[{"name":"capv","sshAuthorizedKeys":["---snip---"],"sudo":"ALL=(ALL) NOPASSWD:ALL"}]},"replicas":1,"version":"v1.21.2+vmware.1"}}
  creationTimestamp: "2021-11-11T07:33:37Z"
  finalizers:
  - kubeadm.controlplane.cluster.x-k8s.io
  generation: 2
  labels:
    cluster.x-k8s.io/cluster-name: tkg-ssc
  name: tkg-ssc-control-plane
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    blockOwnerDeletion: true
    controller: true
    kind: Cluster
    name: tkg-ssc
    uid: 9bd41852-38df-4d12-bb81-7b2bb35fdfa5
  resourceVersion: "5787872"
  uid: 1532ce9b-2d7e-45f7-b8ab-2d5bd4fe6b7f
spec:
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: VSphereMachineTemplate
    name: tkg-ssc-control-plane-new
    namespace: default
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          audit-log-maxage: "30"
          audit-log-maxbackup: "10"
          audit-log-maxsize: "100"
          audit-log-path: /var/log/kubernetes/audit.log
          audit-policy-file: /etc/kubernetes/audit-policy.yaml
          cloud-provider: external
tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
      dns:
        imageRepository: projects.registry.vmware.com/tkg
        imageTag: v1.8.0_vmware.5
        type: CoreDNS
      etcd:
        local:
          dataDir: /var/lib/etcd
          extraArgs:
            cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
          imageRepository: projects.registry.vmware.com/tkg
          imageTag: v3.4.13_vmware.15
      imageRepository: projects.registry.vmware.com/tkg
      networking: {}
      scheduler:
        extraArgs:
          tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    files:
    - content: ---snip---
      encoding: base64
      owner: root:root
      path: /etc/kubernetes/audit-policy.yaml
      permissions: "0600"
    initConfiguration:
      localAPIEndpoint:
        advertiseAddress: ""
        bindPort: 0
      nodeRegistration:
        criSocket: /var/run/containerd/containerd.sock
        kubeletExtraArgs:
          cloud-provider: external
          tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
        name: '{{ ds.meta_data.hostname }}'
    joinConfiguration:
      discovery: {}
      nodeRegistration:
        criSocket: /var/run/containerd/containerd.sock
        kubeletExtraArgs:
          cloud-provider: external
          tls-cipher-suites: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
        name: '{{ ds.meta_data.hostname }}'
    preKubeadmCommands:
    - hostname "{{ ds.meta_data.hostname }}"
    - echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
    - echo "127.0.0.1   localhost" >>/etc/hosts
    - echo "127.0.0.1   {{ ds.meta_data.hostname }}" >>/etc/hosts
    - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
    useExperimentalRetryJoin: true
    users:
    - name: capv
      sshAuthorizedKeys:
      - ssh-rsa ---snip---
      sudo: ALL=(ALL) NOPASSWD:ALL
  replicas: 1
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
    type: RollingUpdate
  version: v1.21.2+vmware.1
status:
  conditions:
  - lastTransitionTime: "2021-11-22T14:40:53Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2021-11-11T07:35:55Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2021-11-11T07:33:39Z"
    status: "True"
    type: CertificatesAvailable
  - lastTransitionTime: "2021-11-22T14:39:32Z"
    status: "True"
    type: ControlPlaneComponentsHealthy
  - lastTransitionTime: "2021-11-22T14:40:53Z"
    status: "True"
    type: EtcdClusterHealthyCondition
  - lastTransitionTime: "2021-11-22T14:40:53Z"
    status: "True"
    type: MachinesReady
  - lastTransitionTime: "2021-11-22T14:39:57Z"
    status: "True"
    type: MachinesSpecUpToDate
  - lastTransitionTime: "2021-11-22T14:40:53Z"
    status: "True"
    type: Resized
  initialized: true
  observedGeneration: 2
  ready: true
  readyReplicas: 1
  replicas: 1
  selector: cluster.x-k8s.io/cluster-name=tkg-ssc,cluster.x-k8s.io/control-plane
  updatedReplicas: 1

Change line 32, to use the new VsphereMachineTemplate called tkg-ssc-control-plane-new. Once you save and quit with :wq! the control plane nodes will be re-deployed.

Installing Contour and Envoy with Container Service Extension and VMware Cloud Director

This post summarizes how to get Contour and Envoy up and running with Kubernetes clusters running in VMware Cloud Director.

Pre-requisites

A Kubernetes cluster deployed by Container Service Extension in VCD
NSX Advanced Load Balancer setup for the Kubernetes cluster
A VMware Cloud Director deployment

Step 1. Upload an SSL certificate for Contour to VCD

Obtain the cluster ID from Kubernetes Container Clusters, copy the entire Cluster ID.

Navigate to Certificate Management, Certificates Library, and click on Import

I used a Let’s Encrypt signed certificate. I wrote about using Let’s Encrypt in a previous post.

For the friendly name, paste in the Cluster ID and append to the end “-cert”

Continue the wizard by uploading the certificate and the private key for the certificate.

Step 2. Install Helm

helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm fetch bitnami/contour

tar xvf contour-<version>.tgz

Step 3. Running Envoy as a non-root user

Envoy is configured to run as a non-root user by default. This is much more secure but we won’t be able to use any ports that are lower than 1024. Therefore we must change the values.yaml file for contour.

Edit the values.yaml file located in the directory that you untar the tkz file into and search for

envoy.containerPorts.http

Change the http port to 8080 and the https port to 8443.

It should end up looking like this:

  containerPorts:
    http: 8080
    https: 8443

Step 4. Installing Contour (and Envoy)

Install Contour by running the following command

helm install ingress <path-to-contour-directory>

You should get one daemonset named ingress-contour-envoy and deployment named ingress-contour-contour. These spin up two pods.

You will also see two services starting, one called ingress-contour with a service type of ClusterIP and another called ingress-contour-envoy with a service type LoadBalancer. Wait for NSX ALB to assign an external IP for the envoy service from your Organization network IP pool.

This IP is now your Kubernetes cluster IP for ingress services. Make a note of this IP address. My example uses 10.149.1.116 as the external IP.

Step 5. Setup DNS

The next step to do is to setup DNS, I’m using Windows DNS in my lab so what I’ve done is setup a sub domain called apps.vmwire.com and also setup an A record pointing to *.apps.vmwire.com.

*.apps.vmwire.com 10.149.1.116

DNS is now setup to point *.apps.vmwire.com to the external IP assigned to Envoy. From this point forward, any DNS request that hits *.apps.vmwire.com will be redirected to Contour.

Testing ingress with some apps

Download the following files from my Github.

https://raw.githubusercontent.com/hugopow/cse/main/shapes.yaml

https://raw.githubusercontent.com/hugopow/cse/main/shapes-ingress.yaml

They are two yaml files that deploys a sample web application and then exposes the applications using Contour and Envoy.

You don’t have to edit the shapes.yaml file, but you will need to edit the shapes-ingress.yaml file and change lines 9 and 16 to your desired DNS settings.

In this example, Contour will use circles.apps.vmwire.com to expose the circles application and triangles.apps.vmwire.com to expose the triangles application. Note that we are not adding circles. or triangles. A records into the DNS server.

Lets deploy the circles and triangles apps.

kubectl apply -f shapes.yaml

And then expose the applications with Contour

kubectl apply -f shapes-ingress.yaml

Now open up a web browser and navigate to http://circles.<your-domain> or http://triangles.<your-domain> and see the apps being exposed by Contour. If you don’t get a connection, its probably because you haven’t enabled port 80 through your Edge Gateway.

Scaling up a TKGm cluster vertically

It is very simple to scale-out a TKGm cluster. The command

tanzu cluster scale cluster_name --controlplane-machine-count 5 --worker-machine-count 10

will easily do this for you, this is known as horizontal scale-out. But have you thought of how to scale-up control plane or worker nodes with more CPU or memory?

This post discusses how you can scale up a TKGm worker node, tl;dr how to increase or decrease worker node CPU, RAM, disk.

Getting started

It is not a simple process to scale-up as it is to scale-out. Follow the steps below to scale-up your TKGm cluster.

Step 1.

Run the following command to obtain the list of vSphere machine templates that TKGm uses to deploy control plane and worker nodes.

kubectl get vspheremachinetemplate

NAME                             AGE
tkg-ssc-control-plane            3d1h
tkg-ssc-worker                   3d1h
tkg-workload-01-control-plane    3d
tkg-workload-01-worker           3d

You can see that there are four machine templates.

Lets say we want to increase the size of the worker nodes in the tkg-workload-01 cluster.

Lets describe the tkg-workload-01-worker machine template.

kubectl describe vspheremachinetemplate tkg-workload-01-worker

Name:         tkg-workload-01-worker
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  infrastructure.cluster.x-k8s.io/v1alpha3
Kind:         VSphereMachineTemplate
Metadata:
  Creation Timestamp:  2021-10-29T14:11:25Z
  Generation:          1
  Managed Fields:
    API Version:  infrastructure.cluster.x-k8s.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:template:
          .:
          f:spec:
            .:
            f:cloneMode:
            f:datacenter:
            f:datastore:
            f:diskGiB:
            f:folder:
            f:memoryMiB:
            f:network:
              .:
              f:devices:
            f:numCPUs:
            f:resourcePool:
            f:server:
            f:storagePolicyName:
            f:template:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2021-10-29T14:11:25Z
    API Version:  infrastructure.cluster.x-k8s.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:ownerReferences:
          .:
          k:{"uid":"be507594-0c05-4d30-8ed6-56811733df23"}:
            .:
            f:apiVersion:
            f:kind:
            f:name:
            f:uid:
    Manager:    manager
    Operation:  Update
    Time:       2021-10-29T14:11:25Z
  Owner References:
    API Version:     cluster.x-k8s.io/v1alpha3
    Kind:            Cluster
    Name:            tkg-workload-01
    UID:             be507594-0c05-4d30-8ed6-56811733df23
  Resource Version:  45814
  UID:               fc1f3d9f-078f-4282-b93f-e46593a760a5
Spec:
  Template:
    Spec:
      Clone Mode:   fullClone
      Datacenter:   /TanzuPOC
      Datastore:    tanzu_ssd_02
      Disk Gi B:    40
      Folder:       /TanzuPOC/vm/tkg-vsphere-workload
      Memory Mi B:  16384
      Network:
        Devices:
          dhcp4:            true
          Network Name:     /TanzuPOC/network/TKG-wkld
      Num CP Us:            4
      Resource Pool:        /TanzuPOC/host/Workload Cluster 1/Resources/tkg-vsphere-workload
      Server:               vcenter.vmwire.com
      Storage Policy Name:
      Template:             /TanzuPOC/vm/ubuntu-2004-kube-v1.21.2+vmware.1
Events:                     <none>

You can see that this machine template has 16GB of RAM and 4 vCPUs. Lets say we want to increase workers to 120GB of RAM and 24 vCPUs, how would we do this?

Step 2.

We need to clone the currently in use machine template into a new one and then apply it.

kubectl get vspheremachinetemplate tkg-workload-01-worker -o yaml > new-machine-template.yaml

Step 3.

Now that we have exported the current machine template into a new yaml file, we can edit it to suit our needs. Edi the file and make the changes to the file.

apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"VSphereMachineTemplate","metadata":{"annotations":{},"creationTimestamp":"2021-10-29T14:11:25Z","generation":1,"name":"tkg-workload-01-worker-scale","namespace":"default","ownerReferences":[{"apiVersion":"cluster.x-k8s.io/v1alpha3","kind":"Cluster","name":"tkg-workload-01","uid":"be507594-0c05-4d30-8ed6-56811733df23"}],"resourceVersion":"45814","uid":"fc1f3d9f-078f-4282-b93f-e46593a760a5"},"spec":{"template":{"spec":{"cloneMode":"fullClone","datacenter":"/TanzuPOC","datastore":"tanzu_ssd_02","diskGiB":40,"folder":"/TanzuPOC/vm/tkg-vsphere-workload","memoryMiB":122880,"network":{"devices":[{"dhcp4":true,"networkName":"/TanzuPOC/network/TKG-wkld"}]},"numCPUs":24,"resourcePool":"/TanzuPOC/host/Workload Cluster 1/Resources/tkg-vsphere-workload","server":"tanzuvcenter01.ete.ka.sw.ericsson.se","storagePolicyName":"","template":"/TanzuPOC/vm/ubuntu-2004-kube-v1.21.2+vmware.1"}}}}
  creationTimestamp: "2021-11-01T11:18:08Z"
  generation: 1
  name: tkg-workload-01-worker-scale
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    name: tkg-workload-01
    uid: be507594-0c05-4d30-8ed6-56811733df23
  resourceVersion: "1590589"
  uid: 8697ec4c-7118-4ff0-b4cd-a456cb090f58
spec:
  template:
    spec:
      cloneMode: fullClone
      datacenter: /TanzuPOC
      datastore: tanzu_ssd_02
      diskGiB: 40
      folder: /TanzuPOC/vm/tkg-vsphere-workload
      memoryMiB: 122880
      network:
        devices:
        - dhcp4: true
          networkName: /TanzuPOC/network/TKG-wkld
      numCPUs: 24
      resourcePool: /TanzuPOC/host/Workload Cluster 1/Resources/tkg-vsphere-workload
      server: vcenter.vmwire.com
      storagePolicyName: ""
      template: /TanzuPOC/vm/ubuntu-2004-kube-v1.21.2+vmware.1

Change lines 6 and 9 by appending a new name to the machine template, you’ll notice that the original name was tkg-workload-01-worker, I appended “scale” to it so the new name of this new machine template is tkg-workload-01-worker-scale.

Step 4.

We can now apply the new machine template with this command

kubectl apply –f new-machine-template.yaml

We can check that the new machine template exists by running this command

kubectl get vspheremachinetemplate

NAME                             AGE
tkg-ssc-control-plane            3d1h
tkg-ssc-worker                   3d1h
tkg-workload-01-control-plane    3d
tkg-workload-01-worker           3d
tkg-workload-01-worker-scale     10s

Step 5.

Now we can apply the new machine template to our cluster.

Before doing that, we need to obtain the machine deployment details for the tkg-workload-01 cluster, we can get this information by running these commands

kubectl get MachineDeployment

NAME                   PHASE     REPLICAS   READY   UPDATED   UNAVAILABLE
tkg-ssc-md-0           Running   3          3       3
tkg-workload-01-md-0   Running   4          4       4

We are interested in the tkg-workload-01-md-0 machine deployment so lets describe it.

kubectl describe MachineDeployment tkg-workload-01-md-0

Name:         tkg-workload-01-md-0
Namespace:    default
Labels:       cluster.x-k8s.io/cluster-name=tkg-workload-01
Annotations:  machinedeployment.clusters.x-k8s.io/revision: 3
API Version:  cluster.x-k8s.io/v1alpha3
Kind:         MachineDeployment
Metadata:
  Creation Timestamp:  2021-10-29T14:11:25Z
  Generation:          7
  Managed Fields:
    API Version:  cluster.x-k8s.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
        f:labels:
          .:
          f:cluster.x-k8s.io/cluster-name:
      f:spec:
        .:
        f:clusterName:
        f:selector:
          .:
          f:matchLabels:
            .:
            f:cluster.x-k8s.io/cluster-name:
        f:template:
          .:
          f:metadata:
            .:
            f:labels:
              .:
              f:cluster.x-k8s.io/cluster-name:
              f:node-pool:
          f:spec:
            .:
            f:bootstrap:
              .:
              f:configRef:
                .:
                f:apiVersion:
                f:kind:
                f:name:
            f:clusterName:
            f:infrastructureRef:
              .:
              f:apiVersion:
              f:kind:
            f:version:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2021-10-29T14:11:25Z
    API Version:  cluster.x-k8s.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:template:
          f:spec:
            f:infrastructureRef:
              f:name:
    Manager:      kubectl-edit
    Operation:    Update
    Time:         2021-11-01T11:25:51Z
    API Version:  cluster.x-k8s.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:spec:
        f:replicas:
    Manager:      tanzu-plugin-cluster
    Operation:    Update
    Time:         2021-11-01T12:33:35Z
    API Version:  cluster.x-k8s.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:machinedeployment.clusters.x-k8s.io/revision:
        f:ownerReferences:
          .:
          k:{"uid":"be507594-0c05-4d30-8ed6-56811733df23"}:
            .:
            f:apiVersion:
            f:kind:
            f:name:
            f:uid:
      f:status:
        .:
        f:availableReplicas:
        f:observedGeneration:
        f:phase:
        f:readyReplicas:
        f:replicas:
        f:selector:
        f:updatedReplicas:
    Manager:    manager
    Operation:  Update
    Time:       2021-11-01T14:30:38Z
  Owner References:
    API Version:     cluster.x-k8s.io/v1alpha3
    Kind:            Cluster
    Name:            tkg-workload-01
    UID:             be507594-0c05-4d30-8ed6-56811733df23
  Resource Version:  1665423
  UID:               5148e564-cf66-4581-8941-c3024c58967e
Spec:
  Cluster Name:               tkg-workload-01
  Min Ready Seconds:          0
  Progress Deadline Seconds:  600
  Replicas:                   4
  Revision History Limit:     1
  Selector:
    Match Labels:
      cluster.x-k8s.io/cluster-name:  tkg-workload-01
  Strategy:
    Rolling Update:
      Max Surge:        1
      Max Unavailable:  0
    Type:               RollingUpdate
  Template:
    Metadata:
      Labels:
        cluster.x-k8s.io/cluster-name:  tkg-workload-01
        Node - Pool:                    tkg-workload-01-worker-pool
    Spec:
      Bootstrap:
        Config Ref:
          API Version:  bootstrap.cluster.x-k8s.io/v1alpha3
          Kind:         KubeadmConfigTemplate
          Name:         tkg-workload-01-md-0
      Cluster Name:     tkg-workload-01
      Infrastructure Ref:
        API Version:  infrastructure.cluster.x-k8s.io/v1alpha3
        Kind:         VSphereMachineTemplate
        Name:         tkg-workload-01-worker
      Version:        v1.21.2+vmware.1
Status:
  Available Replicas:   4
  Observed Generation:  7
  Phase:                Running
  Ready Replicas:       4
  Replicas:             4
  Selector:             cluster.x-k8s.io/cluster-name=tkg-workload-01
  Updated Replicas:     4
Events:
  Type    Reason           Age                 From                          Message
  ----    ------           ----                ----                          -------
  Normal  SuccessfulScale  90s (x2 over 114m)  machinedeployment-controller  Scaled down MachineSet "tkg-workload-01-md-0-647645ddcd" to 4

The line that we are interested in is line 38. This is the current machine template that this cluster is using, you’ll notice that it is of course using the original spec, what we need to do is change it to the new spec that we created earlier. If you remember, we named that one tkg-workload-01-worker-scale.

Step 6.

kubectl edit MachineDeployment tkg-workload-01-md-0

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: cluster.x-k8s.io/v1alpha3
kind: MachineDeployment
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"cluster.x-k8s.io/v1alpha3","kind":"MachineDeployment","metadata":{"annotations":{},"labels":{"cluster.x-k8s.io/cluster-name":"tkg-workload-01"},"name":"tkg-workload-01-md-0","namespace":"default"},"spec":{"clusterName":"tkg-workload-01","replicas":4,"selector":{"matchLabels":{"cluster.x-k8s.io/cluster-name":"tkg-workload-01"}},"template":{"metadata":{"labels":{"cluster.x-k8s.io/cluster-name":"tkg-workload-01","node-pool":"tkg-workload-01-worker-pool"}},"spec":{"bootstrap":{"configRef":{"apiVersion":"bootstrap.cluster.x-k8s.io/v1alpha3","kind":"KubeadmConfigTemplate","name":"tkg-workload-01-md-0"}},"clusterName":"tkg-workload-01","infrastructureRef":{"apiVersion":"infrastructure.cluster.x-k8s.io/v1alpha3","kind":"VSphereMachineTemplate","name":"tkg-workload-01-worker"},"version":"v1.21.2+vmware.1"}}}}
    machinedeployment.clusters.x-k8s.io/revision: "3"
  creationTimestamp: "2021-10-29T14:11:25Z"
  generation: 7
  labels:
    cluster.x-k8s.io/cluster-name: tkg-workload-01
  name: tkg-workload-01-md-0
  namespace: default
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1alpha3
    kind: Cluster
    name: tkg-workload-01
    uid: be507594-0c05-4d30-8ed6-56811733df23
  resourceVersion: "1665423"
  uid: 5148e564-cf66-4581-8941-c3024c58967e
spec:
  clusterName: tkg-workload-01
  minReadySeconds: 0
  progressDeadlineSeconds: 600
  replicas: 4
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: tkg-workload-01
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        cluster.x-k8s.io/cluster-name: tkg-workload-01
        node-pool: tkg-workload-01-worker-pool
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
          kind: KubeadmConfigTemplate
          name: tkg-workload-01-md-0
      clusterName: tkg-workload-01
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
        kind: VSphereMachineTemplate
        name: tkg-workload-01-worker-scale
      version: v1.21.2+vmware.1
status:
  availableReplicas: 4
  observedGeneration: 7
  phase: Running
  readyReplicas: 4
  replicas: 4
  selector: cluster.x-k8s.io/cluster-name=tkg-workload-01
  updatedReplicas: 4

The line that we are interested in is line 54. We need to change the machine template from that old one to our new one.

Lets make that change by going down to line 54 and adding “-scale” to the end of that line. Once you save and quit using “:wq!”. Kubernetes will make do a rolling update of your TKGm cluster for you.

Finishing off

Once the rolling update is done, you can check vSphere Web Client for new VMs being cloned and old ones being deleted. You can also run the command below to see the status of the rolling updates.

kubectl get MachineDeployment

You’ll then see that your new worker nodes have been resized without interrupting any of the running pods in the cluster.

Resize a TKGm cluster in CSE

When trying to resize a TKGm cluster with CSE, you might encounter this error below:

Cluster resize request failed. Please contact your provider if this problem persists. (Error: Unknown error)

This post shows how you can use the vcd cse cli to workaround this problem.

When trying to resize a TKGm cluster with CSE in the VCD UI, you might encounter this error below:

Cluster resize request failed. Please contact your provider if this problem persists. (Error: Unknown error)

Checking the logs in ~/.cse-logs there are no logs that show what the error is. It appears to be an issue with the Container UI Plugin for CSE 3.1.0.

If you review the console messages in Chrome’s developer tools you might see something like the following:

TypeError: Cannot read properties of null (reading 'length')
    at getFullSpec (https://vcd.vmwire.com/tenant/tenant1/uiPlugins/80134fc9-86e1-41db-9d02-b02d5e9e1e3c/ca5642fa-7186-4da2-b273-2dbd3451fd50/bundle.js:1:170675)
    at resizeCseCluster

This post shows how you can use the vcd cse cli to workaround this problem.

Using the vcd cse cli to resize a TKGm cluster

First log into the CSE appliance or somewhere with vcd cse cli installed
Then log into the VCD Org that has the cluster that you want to resize with a user with the role with the cse:nativecluster rights bundle.
- vcd login vcd.vmwire.com tenant1 tenant1-admin -p Vmware1!
Lets list the clusters using this command
- vcd cse cluster list
CSE should show you the clusters belonging to this organization
Now lets obtain the details of the cluster that we want to resize
- vcd cse cluster info hugo-tkg
- copy the entire output of that command and paste it into Notepad++
Delete everything from the status: line below so you only end up with the apiVersion, kind, metadata and spec sections. Like this:

apiVersion: cse.vmware.com/v2.0
kind: TKGm
metadata:
  name: hugo-tkg
  orgName: tenant1
  site: https://vcd.vmwire.com
  virtualDataCenterName: tenant1-vdc
spec:
  distribution:
    templateName: ubuntu-2004-kube-v1.20.5-vmware.2-tkg.1-6700972457122900687
    templateRevision: 1
  settings:
    network:
      cni: null
      expose: true
      pods:
        cidrBlocks:
        - 100.96.0.0/11
      services:
        cidrBlocks:
        - 100.64.0.0/13
    ovdcNetwork: default-organization-network
    rollbackOnFailure: true
    sshKey: ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEAhcw67bz3xRjyhPLysMhUHJPhmatJkmPUdMUEZre+MeiDhC602jkRUNVu43Nk8iD/I07kLxdAdVPZNoZuWE7WBjmn13xf0Ki2hSH/47z3ObXrd8Vleq0CXa+qRnCeYM3FiKb4D5IfL4XkHW83qwp8PuX8FHJrXY8RacVaOWXrESCnl3cSC0tA3eVxWoJ1kwHxhSTfJ9xBtKyCqkoulqyqFYU2A1oMazaK9TYWKmtcYRn27CC1Jrwawt2zfbNsQbHx1jlDoIO6FLz8Dfkm0DToanw0GoHs2Q+uXJ8ve/oBs0VJZFYPquBmcyfny4WIh4L0lwzsiAVWJ6PvzF5HMuNcwQ==
      rsa-key-20210508
  topology:
    controlPlane:
      count: 1
      cpu: null
      memory: null
      sizingClass: small
      storageProfile: iscsi
    nfs:
      count: 0
      sizingClass: null
      storageProfile: null
    workers:
      count: 3
      cpu: null
      memory: null
      sizingClass: medium
      storageProfile: iscsi

Prepare a cluster config file

Change the workers: count to your new desired number of workers.
Save this file as update_my_cluster.yaml
Update the cluster with this command
- vcd cse cluster apply update_my_cluster.yaml
You’ll notice that CSE will deploy another worker node into the same vApp and after a few minutes your TKGm cluster will have another node added to it.

root@photon-manager [ ~/.kube ]# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
mstr-zcn7   Ready    control-plane,master   14m   v1.20.5+vmware.2
node-7swy   Ready    <none>                 10m   v1.20.5+vmware.2
node-90sb   Ready    <none>                 12m   v1.20.5+vmware.2
root@photon-manager [ ~/.kube ]# kubectl get nodes
NAME        STATUS   ROLES                  AGE   VERSION
mstr-zcn7   Ready    control-plane,master   22m   v1.20.5+vmware.2
node-7swy   Ready    <none>                 17m   v1.20.5+vmware.2
node-90sb   Ready    <none>                 19m   v1.20.5+vmware.2
node-rbmz   Ready    <none>                 43s   v1.20.5+vmware.2

Viewing client logs

The vcd cse cli commands are client side, to enable logging for this do the following

Run this command in the CSE appliance or on your workstation that has the vcd cse cli installed.
- CSE_CLIENT_WIRE_LOGGING=True
View the logs by using this command
- tail -f cse-client-debug.log

A couple of notes

The vcd cse cluster resize command is not enabled if your CSE server is using legacy_mode: false. You can read up on this in this link.

Therefore, the only way to resize a cluster is to update it using the vcd cse cluster apply command. The apply command supports the following:

apply a configuration to a cluster resource by filename. The
resource will be created if it does not exist. (The command
can be used to create the cluster, scale-up/down worker count,
scale-up NFS nodes, upgrade the cluster to a new K8s version.

CSE 3.1.1 can only scale-up a TKGm cluster, it does not support scale-down yet.

Using a Statefulset to demo VCD cloud and storage providers.

This post uses a statefulset to deploy nginx with pvc and load balancer services into a Kubernetes cluster running in VMware Cloud Director enabled with Container Service Extension.

VCD has a cloud provider named vmware-cloud-director-ccm-0 and a CSI provider named csi-vcd-controllerplugin-0.

This post uses a statefulset to deploy nginx with pvc and load balancer services into a Kubernetes cluster running in VMware Cloud Director enabled with Container Service Extension.

VCD has a cloud provider named vmware-cloud-director-ccm-0 and a CSI provider named csi-vcd-controllerplugin-0.

If you sent the following command to a Kubernetes cluster

kubectl get po -n kube-system

You would see output like this

csi-vcd-controllerplugin-0                  3/3     Running   0          11h
csi-vcd-nodeplugin-lh2gs                    2/2     Running   0          13h
vmware-cloud-director-ccm-99fd59464-79z8r   1/1     Running   0          11h

Contents of web-statefulset.yaml, available on my GitHub here.

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: web-statefulset
spec:
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80
  type: LoadBalancer
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-statefulset
  namespace: web-statefulset
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx-service"
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: nginx
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "vcd-disk-dev"
      resources:
        requests:
          storage: 1Gi

Lets deploy into a new namespace, for that we create a new namespace first.

kubectl create ns web-statefulset

Deploy the statefulset with the following command

kubectl apply -f web-statefulset.yaml

You’ll see named disks and ingress services create in VCD and Avi respectively.

If you tried to access the nginx webpage using the service IP address, you wouldn’t see any web page, although the connection is working. This is because the nginx app using the /usr/share/nginx/html mount point to an empty PVC. We need to copy a basic index.html into that directory to get a webpage.

We can do that by logging into the pod and downloading a sample index.html for nginx.

kubectl exec -it web-statefulset-0 -n web-statefulset -- bash

curl https://raw.githubusercontent.com/hugopow/cse/main/index.html -o /usr/share/nginx/html/index.html

Now when you connect to the external IP you would get a very simple webpage.

The index.html file is stored on /usr/share/nginx/html/index.html, which is mounted to /sdb1 backed by the PVC.

Enable kubectl bash completion and fast context switching on Photon OS

Bash auto completion is very useful, it’ll save you time and avoids unnecessary typos. This quick guide shows you how to setup bash auto completion for Photon OS so that you can use kubectl commands and complete them using the [TAB] key on your keyboard. It also works to auto complete kubernetes resources too. For example you could type kubectl describe ns <first-couple-of letters-of-namespace> and press [TAB], bash auto completion will then complete the rest for you.

Additionally, we will install kubectx to enable fast context switching between contexts. To use kubectl, just type kubectl and press enter, you can then use your cursor to move between contexts.

The Linux package bash-completion should already be installed in Photon.

Enabling kubectl auto completion

echo 'source /usr/share/bash-completion/bash_completion' >>~/.bashrc
echo 'source <(kubectl completion bash)' >>~/.bashrc

Photon does not have the /etc/bash_completion.d directory, so we will need to create it.

mkdir /etc/bash_completion.d

Then we add the completion script to the following file.

kubectl completion bash >/etc/bash_completion.d/kubectl

Enabling alias’ to save you typing kubectl and kubectx

echo 'alias k=kubectl' >>~/.bashrc
echo 'complete -F __start_kubectl k' >>~/.bashrc
echo 'alias kx=kubectx' >>~/.bashrc
echo 'complete -F __start_kubectx kx' >>~/.bashrc
. ~/.bashrc

Enabling kubectx for fast context switching

sudo git clone https://github.com/ahmetb/kubectx /opt/kubectx
sudo ln -s /opt/kubectx/kubectx /usr/local/bin/kubectx
sudo ln -s /opt/kubectx/kubens /usr/local/bin/kubens

git clone --depth 1 https://github.com/junegunn/fzf.git ~/.fzf
~/.fzf/install

. ~/.bashrc

Thats it! kubectl and kubectx will be ready to use.

Complete code block for copypasta

echo 'source /usr/share/bash-completion/bash_completion' >>~/.bashrc
echo 'source <(kubectl completion bash)' >>~/.bashrc
mkdir /etc/bash_completion.d
kubectl completion bash >/etc/bash_completion.d/kubectl
echo 'alias k=kubectl' >>~/.bashrc
echo 'complete -F __start_kubectl k' >>~/.bashrc
sudo git clone https://github.com/ahmetb/kubectx /opt/kubectx
sudo ln -s /opt/kubectx/kubectx /usr/local/bin/kubectx
sudo ln -s /opt/kubectx/kubens /usr/local/bin/kubens
git clone --depth 1 https://github.com/junegunn/fzf.git ~/.fzf
~/.fzf/install
echo 'alias kx=kubectx' >>~/.bashrc
echo 'complete -F __start_kubectx kx' >>~/.bashrc
. ~/.bashrc

Setting this up for non-root users

echo 'source /usr/share/bash-completion/bash_completion' >>~/.bashrc
echo 'source <(kubectl completion bash)' >>~/.bashrc
sudo chmod 755 /etc/bash_completion.d
kubectl completion bash >/etc/bash_completion.d/kubectl
echo 'alias k=kubectl' >>~/.bashrc
echo 'complete -F __start_kubectl k' >>~/.bashrc
sudo chmod 755 /opt/kubectx
sudo chmod 755 /usr/local/bin/kubectx
sudo chmod 755 /usr/local/bin/kubens
echo 'alias kx=kubectx' >>~/.bashrc
echo 'complete -F __start_kubectx kx' >>~/.bashrc
. ~/.bashrc

Install Container Service Extension 3.1.1 with VCD 10.3.1

Prepare the Photon OS 3 VM

Deploy the OVA using this link.

Photon OS 3 does not support Linux guest customization unfortunately, so we will use the links below to manually setup the OS with a hostname and static IP address.

Boot the VM, the default credentials are root with password changeme. Change the default password.

Set host name by changing the /etc/hostname file.

Configure a static IP using this guide.

Add DNS server using this guide.

Reboot.

Photon 3 has the older repositories, so we will need to update to newer repositories as detailed in this KB article. I’ve included this in the instructions below.

Copypasta or use create a bash script.

# Update Photon repositories
cd /etc/yum.repos.d/
sed  -i 's/dl.bintray.com\/vmware/packages.vmware.com\/photon\/$releasever/g' photon.repo photon-updates.repo photon-extras.repo photon-debuginfo.repo

# If you get errors with the above command, then copy the command from the KB article.

# Update Photon
tdnf --assumeyes update

# Install dependencies
tdnf --assumeyes install build-essential python3-devel python3-pip git

# Update python3, cse supports python3 version 3.7.3 or greater, it does not support python 3.8 or above.
tdnf --assumeyes update python3

# Prepare cse user and application directories
mkdir -p /opt/vmware/cse
chmod 775 -R /opt
chmod 777 /
groupadd cse
useradd cse -g cse -m -p Vmware1! -d /opt/vmware/cse
chown cse:cse -R /opt

# Run as cse user, add your public ssh key to CSE server
su - cse
mkdir -p ~/.ssh
cat >> ~/.ssh/authorized_keys << EOF
ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEAhcw67bz3xRjyhPLysMhUHJPhmatJkmPUdMUEZre+MeiDhC602jkRUNVu43Nk8iD/I07kLxdAdVPZNoZuWE7WBjmn13xf0Ki2hSH/47z3ObXrd8Vleq0CXa+qRnCeYM3FiKb4D5IfL4XkHW83qwp8PuX8FHJrXY8RacVaOWXrESCnl3cSC0tA3eVxWoJ1kwHxhSTfJ9xBtKyCqkoulqyqFYU2A1oMazaK9TYWKmtcYRn27CC1Jrwawt2zfbNsQbHx1jlDoIO6FLz8Dfkm0DToanw0GoHs2Q+uXJ8ve/oBs0VJZFYPquBmcyfny4WIh4L0lwzsiAVWJ6PvzF5HMuNcwQ== rsa-key-20210508
EOF

cat >> ~/.bash_profile << EOF
# For Container Service Extension
export CSE_CONFIG=/opt/vmware/cse/config/config.yaml
export CSE_CONFIG_PASSWORD=Vmware1!
source /opt/vmware/cse/python/bin/activate
EOF

# Install CSE in virtual environment
python3 -m venv /opt/vmware/cse/python
source /opt/vmware/cse/python/bin/activate
pip3 install container-service-extension==3.1.1

cse version

source ~/.bash_profile

# Prepare vcd-cli
mkdir -p ~/.vcd-cli
cat >  ~/.vcd-cli/profiles.yaml << EOF
extensions:
- container_service_extension.client.cse
EOF

vcd cse version

# Add my Let's Encrypt intermediate and root certs. Use your certificates issued by your CA to enable verify=true with CSE.
cat >> /opt/vmware/cse/python/lib/python3.7/site-packages/certifi/cacert.pem << EOF
-----BEGIN CERTIFICATE-----
MIIFFjCCAv6gAwIBAgIRAJErCErPDBinU/bWLiWnX1owDQYJKoZIhvcNAQELBQAw
TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMjAwOTA0MDAwMDAw
WhcNMjUwOTE1MTYwMDAwWjAyMQswCQYDVQQGEwJVUzEWMBQGA1UEChMNTGV0J3Mg
RW5jcnlwdDELMAkGA1UEAxMCUjMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQC7AhUozPaglNMPEuyNVZLD+ILxmaZ6QoinXSaqtSu5xUyxr45r+XXIo9cP
R5QUVTVXjJ6oojkZ9YI8QqlObvU7wy7bjcCwXPNZOOftz2nwWgsbvsCUJCWH+jdx
sxPnHKzhm+/b5DtFUkWWqcFTzjTIUu61ru2P3mBw4qVUq7ZtDpelQDRrK9O8Zutm
NHz6a4uPVymZ+DAXXbpyb/uBxa3Shlg9F8fnCbvxK/eG3MHacV3URuPMrSXBiLxg
Z3Vms/EY96Jc5lP/Ooi2R6X/ExjqmAl3P51T+c8B5fWmcBcUr2Ok/5mzk53cU6cG
/kiFHaFpriV1uxPMUgP17VGhi9sVAgMBAAGjggEIMIIBBDAOBgNVHQ8BAf8EBAMC
AYYwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMBMBIGA1UdEwEB/wQIMAYB
Af8CAQAwHQYDVR0OBBYEFBQusxe3WFbLrlAJQOYfr52LFMLGMB8GA1UdIwQYMBaA
FHm0WeZ7tuXkAXOACIjIGlj26ZtuMDIGCCsGAQUFBwEBBCYwJDAiBggrBgEFBQcw
AoYWaHR0cDovL3gxLmkubGVuY3Iub3JnLzAnBgNVHR8EIDAeMBygGqAYhhZodHRw
Oi8veDEuYy5sZW5jci5vcmcvMCIGA1UdIAQbMBkwCAYGZ4EMAQIBMA0GCysGAQQB
gt8TAQEBMA0GCSqGSIb3DQEBCwUAA4ICAQCFyk5HPqP3hUSFvNVneLKYY611TR6W
PTNlclQtgaDqw+34IL9fzLdwALduO/ZelN7kIJ+m74uyA+eitRY8kc607TkC53wl
ikfmZW4/RvTZ8M6UK+5UzhK8jCdLuMGYL6KvzXGRSgi3yLgjewQtCPkIVz6D2QQz
CkcheAmCJ8MqyJu5zlzyZMjAvnnAT45tRAxekrsu94sQ4egdRCnbWSDtY7kh+BIm
lJNXoB1lBMEKIq4QDUOXoRgffuDghje1WrG9ML+Hbisq/yFOGwXD9RiX8F6sw6W4
avAuvDszue5L3sz85K+EC4Y/wFVDNvZo4TYXao6Z0f+lQKc0t8DQYzk1OXVu8rp2
yJMC6alLbBfODALZvYH7n7do1AZls4I9d1P4jnkDrQoxB3UqQ9hVl3LEKQ73xF1O
yK5GhDDX8oVfGKF5u+decIsH4YaTw7mP3GFxJSqv3+0lUFJoi5Lc5da149p90Ids
hCExroL1+7mryIkXPeFM5TgO9r0rvZaBFOvV2z0gp35Z0+L4WPlbuEjN/lxPFin+
HlUjr8gRsI3qfJOQFy/9rKIJR0Y/8Omwt/8oTWgy1mdeHmmjk7j1nYsvC9JSQ6Zv
MldlTTKB3zhThV1+XWYp6rjd5JW1zbVWEkLNxE7GJThEUG3szgBVGP7pSWTUTsqX
nLRbwHOoq7hHwg==
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIFazCCA1OgAwIBAgIRAIIQz7DSQONZRGPgu2OCiwAwDQYJKoZIhvcNAQELBQAw
TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMTUwNjA0MTEwNDM4
WhcNMzUwNjA0MTEwNDM4WjBPMQswCQYDVQQGEwJVUzEpMCcGA1UEChMgSW50ZXJu
ZXQgU2VjdXJpdHkgUmVzZWFyY2ggR3JvdXAxFTATBgNVBAMTDElTUkcgUm9vdCBY
MTCCAiIwDQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAK3oJHP0FDfzm54rVygc
h77ct984kIxuPOZXoHj3dcKi/vVqbvYATyjb3miGbESTtrFj/RQSa78f0uoxmyF+
0TM8ukj13Xnfs7j/EvEhmkvBioZxaUpmZmyPfjxwv60pIgbz5MDmgK7iS4+3mX6U
A5/TR5d8mUgjU+g4rk8Kb4Mu0UlXjIB0ttov0DiNewNwIRt18jA8+o+u3dpjq+sW
T8KOEUt+zwvo/7V3LvSye0rgTBIlDHCNAymg4VMk7BPZ7hm/ELNKjD+Jo2FR3qyH
B5T0Y3HsLuJvW5iB4YlcNHlsdu87kGJ55tukmi8mxdAQ4Q7e2RCOFvu396j3x+UC
B5iPNgiV5+I3lg02dZ77DnKxHZu8A/lJBdiB3QW0KtZB6awBdpUKD9jf1b0SHzUv
KBds0pjBqAlkd25HN7rOrFleaJ1/ctaJxQZBKT5ZPt0m9STJEadao0xAH0ahmbWn
OlFuhjuefXKnEgV4We0+UXgVCwOPjdAvBbI+e0ocS3MFEvzG6uBQE3xDk3SzynTn
jh8BCNAw1FtxNrQHusEwMFxIt4I7mKZ9YIqioymCzLq9gwQbooMDQaHWBfEbwrbw
qHyGO0aoSCqI3Haadr8faqU9GY/rOPNk3sgrDQoo//fb4hVC1CLQJ13hef4Y53CI
rU7m2Ys6xt0nUW7/vGT1M0NPAgMBAAGjQjBAMA4GA1UdDwEB/wQEAwIBBjAPBgNV
HRMBAf8EBTADAQH/MB0GA1UdDgQWBBR5tFnme7bl5AFzgAiIyBpY9umbbjANBgkq
hkiG9w0BAQsFAAOCAgEAVR9YqbyyqFDQDLHYGmkgJykIrGF1XIpu+ILlaS/V9lZL
ubhzEFnTIZd+50xx+7LSYK05qAvqFyFWhfFQDlnrzuBZ6brJFe+GnY+EgPbk6ZGQ
3BebYhtF8GaV0nxvwuo77x/Py9auJ/GpsMiu/X1+mvoiBOv/2X/qkSsisRcOj/KK
NFtY2PwByVS5uCbMiogziUwthDyC3+6WVwW6LLv3xLfHTjuCvjHIInNzktHCgKQ5
ORAzI4JMPJ+GslWYHb4phowim57iaztXOoJwTdwJx4nLCgdNbOhdjsnvzqvHu7Ur
TkXWStAmzOVyyghqpZXjFaH3pO3JLF+l+/+sKAIuvtd7u+Nxe5AW0wdeRlN8NwdC
jNPElpzVmbUq4JUagEiuTDkHzsxHpFKVK7q4+63SM1N95R1NbdWhscdCb+ZAJzVc
oyi3B43njTOQ5yOf+1CceWxG1bQVs5ZufpsMljq4Ui0/1lvh+wjChP4kqKOJ2qxq
4RgqsahDYVvTH9w7jXbyLeiNdd8XM2w9U/t7y0Ff/9yi0GE44Za4rF2LN9d11TPA
mRGunUHBcnWEvgJBQl9nJEiU0Zsnvgc/ubhPgXRR4Xq37Z0j4r7g1SgEEzwxA57d
emyPxgcYxn/eR44/KJ4EBs+lVDR3veyJm+kXQ99b21/+jh5Xos1AnX5iItreGCc=
-----END CERTIFICATE-----
EOF

# Create service account
vcd login vcd.vmwire.com system administrator -p Vmware1!
cse create-service-role vcd.vmwire.com
# Enter system administrator username and password

# Create VCD service account for CSE
vcd user create --enabled svc-cse Vmware1! "CSE Service Role"

# Create config file
mkdir -p /opt/vmware/cse/config

cat > /opt/vmware/cse/config/config-not-encrypted.conf << EOF
mqtt:
  verify_ssl: false

vcd:
  host: vcd.vmwire.com
  log: true
  password: Vmware1!
  port: 443
  username: administrator
  verify: true

vcs:
- name: vcenter.vmwire.com
  password: Vmware1!
  username: administrator@vsphere.local
  verify: true

service:
  enforce_authorization: false
  legacy_mode: false
  log_wire: false
  no_vc_communication_mode: false
  processors: 15
  telemetry:
    enable: true

broker:
  catalog: cse-catalog
  ip_allocation_mode: pool
  network: default-organization-network
  org: cse
  remote_template_cookbook_url: https://raw.githubusercontent.com/vmware/container-service-extension-templates/master/template_v2.yaml
  storage_profile: 'iscsi'
  vdc: cse-vdc
EOF

cse encrypt /opt/vmware/cse/config/config-not-encrypted.conf --output /opt/vmware/cse/config/config.yaml
chmod 600 /opt/vmware/cse/config/config.yaml
cse check /opt/vmware/cse/config/config.yaml

cse template list

# Import TKGm ova with this command
# Copy the ova to /tmp/ first, the ova can be obtained from my.vmware.com, ensure that it has chmod 644 permissions.
cse template import -F /tmp/ubuntu-2004-kube-v1.20.5-vmware.2-tkg.1-6700972457122900687.ova

# You may need to enable 644 permissions on the file if cse complains that the file is not readable.

# Install CSE
cse install -k ~/.ssh/authorized_keys

# Or use this if you've already installed and want to skip template creation again
cse upgrade --skip-template-creation -k ~/.ssh/authorized_keys

# Register the cse extension with vcd if it did not already register
vcd system extension create cse cse cse vcdext '/api/cse, /api/cse/.*, /api/cse/.*/.*'

# Setup cse.sh
cat > /opt/vmware/cse/cse.sh << EOF
#!/usr/bin/env bash
source /opt/vmware/cse/python/bin/activate
export CSE_CONFIG=/opt/vmware/cse/config/config.yaml
export CSE_CONFIG_PASSWORD=Vmware1!
cse run
EOF

# Make cse.sh executable
chmod +x /opt/vmware/cse/cse.sh

# Deactivate the python virtual environment and go back to root
deactivate
exit

# Setup cse.service, use MQTT and not RabbitMQ
cat > /etc/systemd/system/cse.service << EOF
[Unit]
Description=Container Service Extension for VMware Cloud Director

[Service]
ExecStart=/opt/vmware/cse/cse.sh
User=cse
WorkingDirectory=/opt/vmware/cse
Type=simple
Restart=always

[Install]
WantedBy=default.target
EOF

systemctl enable cse.service
systemctl start cse.service

systemctl status cse.service

Enable the CSE UI Plugin for VCD

The new CSE UI extension is bundled with VCD 10.3.1.

Enable it for the tenants that you want or for all tenants.

Enable the rights bundles

Follow the instructions in this other post.

For 3.1.1 you will also need to edit the cse:nativeCluster Entitlement Rights Bundle and add the two following rights:

ACCESS CONTROL, User, Manage user’s own API token

COMPUTE, Organization VDC, Create a Shared Disk

Then publish the Rights Bundle to all tenants.

Enable Global Roles to use CSE or Configure Rights Bundles

The quickest way to get CSE working is to add the relevant rights to the Organization Administrator role. You can create a custom rights bundle and create a custom role for the k8s admin tenant persona if you like. I won’t cover that in this post.

Edit the Organization Administrator role and scroll all the way down to the bottom and click both the View 8/8 and Manage 12/12, then Save.

Setting up VCD CSI and CPI Operators

You may notice that when the cluster is up you might not be able to deploy any pods, this is because the cluster is not ready and is in a tainted state due to the CSI and CPI Operators not having the credentials.

kubectl get pods -A
NAMESPACE     NAME                                         READY   STATUS    RESTARTS   AGE
kube-system   antrea-agent-lhsxv                           2/2     Running   0          10h
kube-system   antrea-agent-pjwtp                           2/2     Running   0          10h
kube-system   antrea-controller-5cd95c574d-4qb7p           0/1     Pending   0          10h
kube-system   coredns-6598d898cd-9vbzv                     0/1     Pending   0          10h
kube-system   coredns-6598d898cd-wwpk9                     0/1     Pending   0          10h
kube-system   csi-vcd-controllerplugin-0                   0/3     Pending   0          37s
kube-system   etcd-mstr-h8mg                               1/1     Running   0          10h
kube-system   kube-apiserver-mstr-h8mg                     1/1     Running   0          10h
kube-system   kube-controller-manager-mstr-h8mg            1/1     Running   0          10h
kube-system   kube-proxy-2dzwh                             1/1     Running   0          10h
kube-system   kube-proxy-wd7tf                             1/1     Running   0          10h
kube-system   kube-scheduler-mstr-h8mg                     1/1     Running   0          10h
kube-system   vmware-cloud-director-ccm-5489b6788c-kgtsn   1/1     Running   0          13s

To bring up the pods to a ready state, you will need to follow this previous post.

Useful links

https://github.com/vmware/container-service-extension/commit/5d2a60b5eeb164547aef39602f9871c06726863e

https://vmware.github.io/container-service-extension/cse3_1/RELEASE_NOTES.html

Kubernetes Load Balancer Service for CSE on Cloud Director

This article describes how to setup vCenter, VCD, NSX-T and NSX Advanced Load Balancer to support exposing Kubernetes applications in Kubernetes clusters provisioned into VCD.

At the end of this post, you would be able to run this command:

kubectl expose deployment webserver –port=80 –type=LoadBalancer

… and have NSX ALB together with VCD and NSX-T automate the provisioning and setup of everything that allows you to expose that application to the outside world using a Kubernetes service of type LoadBalancer.

This article describes how to setup vCenter, VCD, NSX-T and NSX Advanced Load Balancer to support exposing Kubernetes applications in Kubernetes clusters provisioned into VCD.

At the end of this post, you would be able to run this command:

kubectl expose deployment webserver --port=80 --type=LoadBalancer

Create a Content Library for NSX ALB

In vCenter (Resource vCenter managing VCD PVDCs), create a Content Library for NSX Advanced Load Balancer to use to upload the service engine ova.

Create T1 for Avi Service Engine management network

Create T1 for Avi Service Engine management network. You can either attach this T1 to the default T0 or create a new T0.

enable DHCP server for the T1
enable All Static Routes and All Connected Segments & Service Ports under Route Advertisement

Create a network segment for Service Engine management network

Create a network segment for Avi Service Engine management network. Attach the segment to the T1 the was created in the previous step.

Ensure you enable DHCP, this will assign IP addresses to the service engines automatically and you won’t need to setup IPAM profiles in Avi Vantage.

NSX Advanced Load Balancer Settings

A couple of things to setup here.

You do not need to create any tenants in NSX ALB, just use the default admin context.
No IPAM/DNS Profiles are required as we will use DHCP from NSX-T for all networks.
Use FQDNs instead of IP addresses
Use the same FQDN in all systems for consistency and to ensure that registration between the systems work
- NSX ALB
- VCD
- NSX-T

Navigate to Administration, User Credentials and setup user credentials for NSX-T controller and vCenter server
Navigate to Administration, Settings, Tenant Settings and ensure that the settings are as follows

Setup an NSX-T Cloud

Navigate to Infrastructure, Clouds. Setup your cloud similar to mine, I have valled my NSX-T cloud nsx.vmwire.com (which is the FQDN of my NSX-T Controller).

Lets go through these settings from the top.

use the FQDN of your NSX-T manager for the name
click the DHCP option, we will be using NSX-T’s DHCP server so we can ignore IPAM/DNS later
enter something for the Object Name Prefix, this will give the SE VM name a prefix so they can be identified in vCenter. I used avi here, so it will look like this in vCenter

type the FQDN of the NSX-T manager into the NSX-T Manager Address
choose the NSX-T Manager Credentials that you configured earlier
select the Transport Zone that you are using in VCD for your tenants
- under Management Network Segment, select the T1 that you created earlier for SE management networking
- under Segment ID, select the network segment that you created earlier for the SE management network
click ADD under the Data Network Segment(s)
- select the T1 that is used by the tenant in VCD
- select the tenant organization routed network that is attached to the t1 in the previous task
the two previous settings tell NSX ALB where to place the data/vip network for front-end load balancing use. NSX-ALB will create a new segment for this in NSX-T automatically, and VCD will automatically create DNAT rules when a virtual service is requested in NSX ALB
the last step is to add the vCenter server, this would be the vCenter server that is managing the PVDCs used in VCD.

Now wait for a while until the status icon turns green and shows Complete.

Setup a Service Engine Group

Decide whether you want to use a shared service engine group for all VCD tenants or dedicated a service engine group for each Tenant.

I use the dedicated model.

navigate to Infrastructure, Service Engine Group
change the cloud to the NSX-T cloud that you setup earlier
create a new service engine group with your preferred settings, you can read about the options here.

Setup Avi in VCD

Log into VCD as a Provider and navigate to Resources, Infrastructure Resources, NSX-ALB, Controllers and click on the ADD link.

Wait for a while for Avi to sync with VCD. Then continue to add the NSX-T Cloud.

Navigate to Resources, Infrastructure Resources, NSX-ALB, NSX-T Clouds and click on the ADD link.

Proceed when you can see the status is healthy.

Navigate to Resources, Infrastructure Resources, NSX-ALB, Service Engine Groups and click on the ADD link.

Staying logged in as a Provider, navigate to the tenant that you wish to enable NSX ALB load balancing services and navigate to Networking, Edge Gateways, Load Balancer, Service Engine Groups. Then add the service engine group to this tenant.

This will enable this tenant to use NSX ALB load balancing services.

Deploy a new Kubernetes cluster in VCD with Container Service Extension

Deploy a new Kubernetes cluster using Container Service Extension in VCD as normal.

Once the cluster is ready, download the kube config file and log into the cluster.

Check that all the nodes and pods are up as normal.

kubectl get nodes -A

kubectl get pods -A
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE
kube-system   antrea-agent-7nlqs                          2/2     Running   0          21m
kube-system   antrea-agent-q5qc8                          2/2     Running   0          24m
kube-system   antrea-controller-5cd95c574d-r4q2z          0/1     Pending   0          8m38s
kube-system   coredns-6598d898cd-qswn8                    0/1     Pending   0          24m
kube-system   coredns-6598d898cd-s4p5m                    0/1     Pending   0          24m
kube-system   csi-vcd-controllerplugin-0                  0/3     Pending   0          4m29s
kube-system   etcd-mstr-zj9p                              1/1     Running   0          24m
kube-system   kube-apiserver-mstr-zj9p                    1/1     Running   0          24m
kube-system   kube-controller-manager-mstr-zj9p           1/1     Running   0          24m
kube-system   kube-proxy-76m4h                            1/1     Running   0          24m
kube-system   kube-proxy-9229x                            1/1     Running   0          21m
kube-system   kube-scheduler-mstr-zj9p                    1/1     Running   0          24m
kube-system   vmware-cloud-director-ccm-99fd59464-qjj7n   1/1     Running   0          24m

You might see that the following pods in the kube-system namespace are in a pending state. If everything is already working then move onto the next section.

kube-system   coredns-6598d898cd-qswn8     0/1     Pending
kube-system   coredns-6598d898cd-s4p5m     0/1     Pending
kube-system   csi-vcd-controllerplugin-0   0/3     Pending

This is due to the cluster waiting for the csi-vcd-controllerplugin-0 to start.

To get this working, we just need to configure the csi-vcd-controllerplugin-0 with the instructions in this previous post.

Once done, you’ll see that the pods are all now healthy.

kubectl get pods -A
NAMESPACE     NAME                                        READY   STATUS    RESTARTS   AGE
kube-system   antrea-agent-7nlqs                          2/2     Running   0          23m
kube-system   antrea-agent-q5qc8                          2/2     Running   0          26m
kube-system   antrea-controller-5cd95c574d-r4q2z          1/1     Running   0          10m
kube-system   coredns-6598d898cd-qswn8                    1/1     Running   0          26m
kube-system   coredns-6598d898cd-s4p5m                    1/1     Running   0          26m
kube-system   csi-vcd-controllerplugin-0                  3/3     Running   0          60s
kube-system   csi-vcd-nodeplugin-twr4w                    2/2     Running   0          49s
kube-system   etcd-mstr-zj9p                              1/1     Running   0          26m
kube-system   kube-apiserver-mstr-zj9p                    1/1     Running   0          26m
kube-system   kube-controller-manager-mstr-zj9p           1/1     Running   0          26m
kube-system   kube-proxy-76m4h                            1/1     Running   0          26m
kube-system   kube-proxy-9229x                            1/1     Running   0          23m
kube-system   kube-scheduler-mstr-zj9p                    1/1     Running   0          26m
kube-system   vmware-cloud-director-ccm-99fd59464-qjj7n   1/1     Running   0          26m

Testing the Load Balancer service

Lets deploy a nginx webserver and expose it using all of the infrastructure that we setup above.

kubectl create deployment webserver --image nginx

Wait for the deployment to start and the pod to go into a running state. You can use this command to check

kubectl get deploy webserver
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
webserver   1/1     1            1           7h47m

Now we can’t access the nginx default web page yet until we expose it using the load balancer service.

kubectl expose deployment webserver --port=80 --type=LoadBalancer

Wait for the load balancer service to start and the pod to go into a running state. During this time, you’ll see the service engines being provisioned automatically by NSX ALB. It’ll take 10 minutes or so to get everything up and running.

You can use this command to check when the load balancer service has completed and check the EXTERNAL-IP.

kubectl get service webserver
NAME        TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
webserver   LoadBalancer   100.71.45.194   10.149.1.114   80:32495/TCP   7h48m

You can see that NSX ALB, VCD and NSX-T all worked together to expose the nginx applicationto the outside world.

The external IP of 10.149.1.114 in my environment is an uplink segment on a T0 that I have configured for VCD tenants to use as egress and ingress into their organization VDC. It is the external network for their VDCs.

Paste the external IP into a web browser and you should see the nginx web page.

In the next post, I’ll go over the end to end network flow to show how this all connects NSX ALB, VCD, NSX-T and Kubernetes together.

VMware Cloud Director CSI Driver for Kubernetes

Container Service Extension (CSE) 3.1.1 now supports persistent volumes that are backed by VCD’s Named Disk feature.

Setting up the VCD CSI driver on your Kubernetes cluster

Container Service Extension (CSE) 3.1.1 now supports persistent volumes that are backed by VCD’s Named Disk feature. These now appear under Storage – Named disks in VCD. To use this functionality today (28 September 2021), you’ll need to deploy CSE 3.1.1 beta with VCD 10.3. See this previous post for details.

Ideally, you want to deploy the CSI driver using the same user that also deployed the Kubernetes cluster into VCD. In my environment, I used a user named tenant1-admin, this user has the Organization Administrator role with the added right:

Compute – Organization VDC – Create a Shared Disk.

Create the vcloud-basic-auth.yaml

Before you can create persistent volumes you have to setup the Kubernetes cluster with the VCD CSI driver.

Ensure you can log into the cluster by downloading the kube config and logging into it using the correct context.

kubectl config get-contexts
CURRENT   NAME                          CLUSTER      AUTHINFO           NAMESPACE
*         kubernetes-admin@kubernetes   kubernetes   kubernetes-admin

Create the vcloud-basic-auth.yaml file which is used to setup the VCD CSI driver for this Kubernetes cluster.

VCDUSER=$(echo -n 'tenant1-admin' | base64)
PASSWORD=$(echo -n 'Vmware1!' | base64)

cat > vcloud-basic-auth.yaml << END
---
apiVersion: v1
kind: Secret
metadata:
 name: vcloud-basic-auth
 namespace: kube-system
data:
 username: "$VCDUSER"
 password: "$PASSWORD"
END

Install the CSI driver into the Kubernetes cluster.

kubectl apply  -f vcloud-basic-auth.yaml

You should see three new pods starting in the kube-system namespace.

kube-system   csi-vcd-controllerplugin-0                  3/3     Running   0          43m     100.96.1.10     node-xgsw   <none>           <none>
kube-system   csi-vcd-nodeplugin-bckqx                    2/2     Running   0          43m     192.168.0.101   node-xgsw   <none>           <none>
kube-system   vmware-cloud-director-ccm-99fd59464-swh29   1/1     Running   0          43m     192.168.0.100   mstr-31jt   <none>           <none>

Setup a Storage Class

Here’s my storage-class.yaml file, which is used to setup the storage class for my Kubernetes cluster.

apiVersion: v1
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "false"
  name: vcd-disk-dev
provisioner: named-disk.csi.cloud-director.vmware.com
reclaimPolicy: Delete
parameters:
  storageProfile: "truenas-iscsi-luns"
  filesystem: "ext4"

Notice that the storageProfile needs to be set to either “*” for any storage policy or the name of a storage policy that you has access to in your Organization VDC.

Create the storage class by applying that file.

kubectl apply -f storage-class.yaml

You can see if that was successful by getting all storage classes.

kubectl get storageclass
NAME           PROVISIONER                                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
vcd-disk-dev   named-disk.csi.cloud-director.vmware.com   Delete          Immediate           false                  43h

Make the storage class the default

kubectl patch storageclass vcd-disk-dev -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Using the VCD CSI driver

Now that we’ve got a storage class and the driver installed, we can now deploy a persistent volume claim and attach it to a pod. Lets create a persistent volume claim first.

Creating a persistent volume claim

We will need to prepare another file, I’ve called my my-pvc.yaml, and it looks like this.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: "vcd-disk-dev"

Lets deploy it

kubectl apply -f my-pvc.yaml

We can check that it deployed with this command

kubectl get pvc
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-pvc   Bound    pvc-2ddeccd0-e092-4aca-a090-dff9694e2f04   1Gi        RWO            vcd-disk-dev   36m

Attaching the persistent volume to a pod

Lets deploy an nginx pod that will attach the PV and use it for nginx.

My pod.yaml looks like this.

apiVersion: v1
kind: Pod
metadata:
  name: pod
  labels:
    app : nginx
spec:
  volumes:
    - name: my-pod-storage
      persistentVolumeClaim:
        claimName: my-pvc
  containers:
    - name: my-pod-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: my-pod-storage

You can see that the persistentVolumeClaim, claimName: my-pvc, this aligns to the name of the PVC. I’ve also mounted it to /usr/share/nginx/html within the nginx pod.

Lets attach the PV.

kubectl apply -f pod.yaml

You’ll see a few things happen in the Recent Tasks pane when you run this. You can see that Kubernetes has attached the PV to the nginx pod using the CSI driver, the driver informs VCD to attach the disk to the worker node.

If you open up vSphere Web Client, you can see that the disk is now attached to the worker node.

You can also see the CSI driver doing its thing if you take a look at the logs with this command.

kubectl logs csi-vcd-controllerplugin-0 -n kube-system -c csi-attacher

Checking the mount in the pod

You can log into the nginx pod using this command.

kubectl exec -it pod -- bash

Then type mount and df to see the mount is present and the size of the mount point.

df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/sdb          999320    1288    929220   1% /usr/share/nginx/html

mount
/dev/sdb on /usr/share/nginx/html type ext4 (rw,relatime)

The size is correct, being 1GB and the disk is mounted.

Describing the pod gives us more information.

kubectl describe po pod
Name:         pod
Namespace:    default
Priority:     0
Node:         node-xgsw/192.168.0.101
Start Time:   Sun, 26 Sep 2021 12:43:15 +0300
Labels:       app=nginx
Annotations:  <none>
Status:       Running
IP:           100.96.1.12
IPs:
  IP:  100.96.1.12
Containers:
  my-pod-container:
    Container ID:   containerd://6a194ac30dab7dc5a5127180af139e531e650bedbb140e4dc378c21869bd570f
    Image:          nginx
    Image ID:       docker.io/library/nginx@sha256:853b221d3341add7aaadf5f81dd088ea943ab9c918766e295321294b035f3f3e
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sun, 26 Sep 2021 12:43:34 +0300
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /usr/share/nginx/html from my-pod-storage (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xm4gd (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  my-pod-storage:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-pvc
    ReadOnly:   false
  default-token-xm4gd:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-xm4gd
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>

Useful commands

Show storage classes

kubectl get storageclass

Show persistent volumes and persistent volume claims

kubectl get pv,pvc

Show all pods running in the cluster

kubectl get po -A -o wide

Describe the nginx pod

kubectl describe po pod

Show logs for the CSI driver

kubectl logs csi-vcd-controllerplugin-0 -n kube-system -c csi-attacher

kubectl logs csi-vcd-controllerplugin-0 -n kube-system -c csi-provisioner

kubectl logs csi-vcd-controllerplugin-0 -n kube-system -c vcd-csi-plugin

kubectl logs vmware-cloud-director-ccm-99fd59464-swh29 -n kube-system

Useful links

https://github.com/vmware/cloud-director-named-disk-csi-driver/blob/0.1.0-beta/README.md

Install Container Service Extension 3.1.1 beta with VCD 10.3

Prepare the Photon OS 3 VM

Deploy the OVA using this link.

Photon OS 3 does not support Linux guest customization unfortunately, so we will use the links below to manually setup the OS with a hostname and static IP address.

Boot the VM, the default credentials are root with password changeme. Change the default password.

Set host name by changing the /etc/hostname file.

Configure a static IP using this guide.

Add DNS server using this guide.

Reboot.

Photon 3 has the older repositories, so we will need to update to newer repositories as detailed in this KB article. I’ve included this in the instructions below.

Copypasta or use create a bash script.

# Update Photon repositories
cd /etc/yum.repos.d/
sed  -i 's/dl.bintray.com\/vmware/packages.vmware.com\/photon\/$releasever/g' photon.repo photon-updates.repo photon-extras.repo photon-debuginfo.repo

# Update Photon
tdnf --assumeyes update

# Install dependencies
tdnf --assumeyes install build-essential python3-devel python3-pip git

# Prepare cse user and application directories
mkdir -p /opt/vmware/cse
chmod 775 -R /opt
chmod 777 /
groupadd cse
useradd cse -g cse -m -p Vmware1! -d /opt/vmware/cse
chown cse:cse -R /opt

# Run as cse user
su - cse
mkdir -p ~/.ssh
cat >> ~/.ssh/authorized_keys << EOF
ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEAhcw67bz3xRjyhPLysMhUHJPhmatJkmPUdMUEZre+MeiDhC602jkRUNVu43Nk8iD/I07kLxdAdVPZNoZuWE7WBjmn13xf0Ki2hSH/47z3ObXrd8Vleq0CXa+qRnCeYM3FiKb4D5IfL4XkHW83qwp8PuX8FHJrXY8RacVaOWXrESCnl3cSC0tA3eVxWoJ1kwHxhSTfJ9xBtKyCqkoulqyqFYU2A1oMazaK9TYWKmtcYRn27CC1Jrwawt2zfbNsQbHx1jlDoIO6FLz8Dfkm0DToanw0GoHs2Q+uXJ8ve/oBs0VJZFYPquBmcyfny4WIh4L0lwzsiAVWJ6PvzF5HMuNcwQ== rsa-key-20210508
EOF

cat >> ~/.bash_profile << EOF
# For Container Service Extension
export CSE_CONFIG=/opt/vmware/cse/config/config.yaml
export CSE_CONFIG_PASSWORD=Vmware1!
source /opt/vmware/cse/python/bin/activate
EOF

# Install CSE in virtual environment
python3 -m venv /opt/vmware/cse/python
source /opt/vmware/cse/python/bin/activate
pip3 install git+https://github.com/vmware/container-service-extension.git@3.1.1.0b2

cse version

source ~/.bash_profile

# Prepare vcd-cli
mkdir -p ~/.vcd-cli
cat >  ~/.vcd-cli/profiles.yaml << EOF
extensions:
- container_service_extension.client.cse
EOF

vcd cse version

# Add my Let's Encrypt intermediate and root certs. Use your certificates issued by your CA to enable verify=true with CSE.
cat >> /opt/vmware/cse/python/lib/python3.7/site-packages/certifi/cacert.pem << EOF #ok
-----BEGIN CERTIFICATE-----
MIIFFjCCAv6gAwIBAgIRAJErCErPDBinU/bWLiWnX1owDQYJKoZIhvcNAQELBQAw
TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMjAwOTA0MDAwMDAw
WhcNMjUwOTE1MTYwMDAwWjAyMQswCQYDVQQGEwJVUzEWMBQGA1UEChMNTGV0J3Mg
RW5jcnlwdDELMAkGA1UEAxMCUjMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK
AoIBAQC7AhUozPaglNMPEuyNVZLD+ILxmaZ6QoinXSaqtSu5xUyxr45r+XXIo9cP
R5QUVTVXjJ6oojkZ9YI8QqlObvU7wy7bjcCwXPNZOOftz2nwWgsbvsCUJCWH+jdx
sxPnHKzhm+/b5DtFUkWWqcFTzjTIUu61ru2P3mBw4qVUq7ZtDpelQDRrK9O8Zutm
NHz6a4uPVymZ+DAXXbpyb/uBxa3Shlg9F8fnCbvxK/eG3MHacV3URuPMrSXBiLxg
Z3Vms/EY96Jc5lP/Ooi2R6X/ExjqmAl3P51T+c8B5fWmcBcUr2Ok/5mzk53cU6cG
/kiFHaFpriV1uxPMUgP17VGhi9sVAgMBAAGjggEIMIIBBDAOBgNVHQ8BAf8EBAMC
AYYwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMBMBIGA1UdEwEB/wQIMAYB
Af8CAQAwHQYDVR0OBBYEFBQusxe3WFbLrlAJQOYfr52LFMLGMB8GA1UdIwQYMBaA
FHm0WeZ7tuXkAXOACIjIGlj26ZtuMDIGCCsGAQUFBwEBBCYwJDAiBggrBgEFBQcw
AoYWaHR0cDovL3gxLmkubGVuY3Iub3JnLzAnBgNVHR8EIDAeMBygGqAYhhZodHRw
Oi8veDEuYy5sZW5jci5vcmcvMCIGA1UdIAQbMBkwCAYGZ4EMAQIBMA0GCysGAQQB
gt8TAQEBMA0GCSqGSIb3DQEBCwUAA4ICAQCFyk5HPqP3hUSFvNVneLKYY611TR6W
PTNlclQtgaDqw+34IL9fzLdwALduO/ZelN7kIJ+m74uyA+eitRY8kc607TkC53wl
ikfmZW4/RvTZ8M6UK+5UzhK8jCdLuMGYL6KvzXGRSgi3yLgjewQtCPkIVz6D2QQz
CkcheAmCJ8MqyJu5zlzyZMjAvnnAT45tRAxekrsu94sQ4egdRCnbWSDtY7kh+BIm
lJNXoB1lBMEKIq4QDUOXoRgffuDghje1WrG9ML+Hbisq/yFOGwXD9RiX8F6sw6W4
avAuvDszue5L3sz85K+EC4Y/wFVDNvZo4TYXao6Z0f+lQKc0t8DQYzk1OXVu8rp2
yJMC6alLbBfODALZvYH7n7do1AZls4I9d1P4jnkDrQoxB3UqQ9hVl3LEKQ73xF1O
yK5GhDDX8oVfGKF5u+decIsH4YaTw7mP3GFxJSqv3+0lUFJoi5Lc5da149p90Ids
hCExroL1+7mryIkXPeFM5TgO9r0rvZaBFOvV2z0gp35Z0+L4WPlbuEjN/lxPFin+
HlUjr8gRsI3qfJOQFy/9rKIJR0Y/8Omwt/8oTWgy1mdeHmmjk7j1nYsvC9JSQ6Zv
MldlTTKB3zhThV1+XWYp6rjd5JW1zbVWEkLNxE7GJThEUG3szgBVGP7pSWTUTsqX
nLRbwHOoq7hHwg==
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIFYDCCBEigAwIBAgIQQAF3ITfU6UK47naqPGQKtzANBgkqhkiG9w0BAQsFADA/
MSQwIgYDVQQKExtEaWdpdGFsIFNpZ25hdHVyZSBUcnVzdCBDby4xFzAVBgNVBAMT
DkRTVCBSb290IENBIFgzMB4XDTIxMDEyMDE5MTQwM1oXDTI0MDkzMDE4MTQwM1ow
TzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh
cmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwggIiMA0GCSqGSIb3DQEB
AQUAA4ICDwAwggIKAoICAQCt6CRz9BQ385ueK1coHIe+3LffOJCMbjzmV6B493XC
ov71am72AE8o295ohmxEk7axY/0UEmu/H9LqMZshftEzPLpI9d1537O4/xLxIZpL
wYqGcWlKZmZsj348cL+tKSIG8+TA5oCu4kuPt5l+lAOf00eXfJlII1PoOK5PCm+D
LtFJV4yAdLbaL9A4jXsDcCEbdfIwPPqPrt3aY6vrFk/CjhFLfs8L6P+1dy70sntK
4EwSJQxwjQMpoOFTJOwT2e4ZvxCzSow/iaNhUd6shweU9GNx7C7ib1uYgeGJXDR5
bHbvO5BieebbpJovJsXQEOEO3tkQjhb7t/eo98flAgeYjzYIlefiN5YNNnWe+w5y
sR2bvAP5SQXYgd0FtCrWQemsAXaVCg/Y39W9Eh81LygXbNKYwagJZHduRze6zqxZ
Xmidf3LWicUGQSk+WT7dJvUkyRGnWqNMQB9GoZm1pzpRboY7nn1ypxIFeFntPlF4
FQsDj43QLwWyPntKHEtzBRL8xurgUBN8Q5N0s8p0544fAQjQMNRbcTa0B7rBMDBc
SLeCO5imfWCKoqMpgsy6vYMEG6KDA0Gh1gXxG8K28Kh8hjtGqEgqiNx2mna/H2ql
PRmP6zjzZN7IKw0KKP/32+IVQtQi0Cdd4Xn+GOdwiK1O5tmLOsbdJ1Fu/7xk9TND
TwIDAQABo4IBRjCCAUIwDwYDVR0TAQH/BAUwAwEB/zAOBgNVHQ8BAf8EBAMCAQYw
SwYIKwYBBQUHAQEEPzA9MDsGCCsGAQUFBzAChi9odHRwOi8vYXBwcy5pZGVudHJ1
c3QuY29tL3Jvb3RzL2RzdHJvb3RjYXgzLnA3YzAfBgNVHSMEGDAWgBTEp7Gkeyxx
+tvhS5B1/8QVYIWJEDBUBgNVHSAETTBLMAgGBmeBDAECATA/BgsrBgEEAYLfEwEB
ATAwMC4GCCsGAQUFBwIBFiJodHRwOi8vY3BzLnJvb3QteDEubGV0c2VuY3J5cHQu
b3JnMDwGA1UdHwQ1MDMwMaAvoC2GK2h0dHA6Ly9jcmwuaWRlbnRydXN0LmNvbS9E
U1RST09UQ0FYM0NSTC5jcmwwHQYDVR0OBBYEFHm0WeZ7tuXkAXOACIjIGlj26Ztu
MA0GCSqGSIb3DQEBCwUAA4IBAQAKcwBslm7/DlLQrt2M51oGrS+o44+/yQoDFVDC
5WxCu2+b9LRPwkSICHXM6webFGJueN7sJ7o5XPWioW5WlHAQU7G75K/QosMrAdSW
9MUgNTP52GE24HGNtLi1qoJFlcDyqSMo59ahy2cI2qBDLKobkx/J3vWraV0T9VuG
WCLKTVXkcGdtwlfFRjlBz4pYg1htmf5X6DYO8A4jqv2Il9DjXA6USbW1FzXSLr9O
he8Y4IWS6wY7bCkjCWDcRQJMEhg76fsO3txE+FiYruq9RUWhiF1myv4Q6W+CyBFC
Dfvp7OOGAN6dEOM4+qR9sdjoSYKEBpsr6GtPAQw4dy753ec5
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIDSjCCAjKgAwIBAgIQRK+wgNajJ7qJMDmGLvhAazANBgkqhkiG9w0BAQUFADA/
MSQwIgYDVQQKExtEaWdpdGFsIFNpZ25hdHVyZSBUcnVzdCBDby4xFzAVBgNVBAMT
DkRTVCBSb290IENBIFgzMB4XDTAwMDkzMDIxMTIxOVoXDTIxMDkzMDE0MDExNVow
PzEkMCIGA1UEChMbRGlnaXRhbCBTaWduYXR1cmUgVHJ1c3QgQ28uMRcwFQYDVQQD
Ew5EU1QgUm9vdCBDQSBYMzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEB
AN+v6ZdQCINXtMxiZfaQguzH0yxrMMpb7NnDfcdAwRgUi+DoM3ZJKuM/IUmTrE4O
rz5Iy2Xu/NMhD2XSKtkyj4zl93ewEnu1lcCJo6m67XMuegwGMoOifooUMM0RoOEq
OLl5CjH9UL2AZd+3UWODyOKIYepLYYHsUmu5ouJLGiifSKOeDNoJjj4XLh7dIN9b
xiqKqy69cK3FCxolkHRyxXtqqzTWMIn/5WgTe1QLyNau7Fqckh49ZLOMxt+/yUFw
7BZy1SbsOFU5Q9D8/RhcQPGX69Wam40dutolucbY38EVAjqr2m7xPi71XAicPNaD
aeQQmxkqtilX4+U9m5/wAl0CAwEAAaNCMEAwDwYDVR0TAQH/BAUwAwEB/zAOBgNV
HQ8BAf8EBAMCAQYwHQYDVR0OBBYEFMSnsaR7LHH62+FLkHX/xBVghYkQMA0GCSqG
SIb3DQEBBQUAA4IBAQCjGiybFwBcqR7uKGY3Or+Dxz9LwwmglSBd49lZRNI+DT69
ikugdB/OEIKcdBodfpga3csTS7MgROSR6cz8faXbauX+5v3gTt23ADq1cEmv8uXr
AvHRAosZy5Q6XkjEGB5YGV8eAlrwDPGxrancWYaLbumR9YbK+rlmM6pZW87ipxZz
R8srzJmwN0jP41ZL9c8PDHIyh8bwRLtTcm1D9SZImlJnt1ir/md2cXjbDaJWFBM5
JDGFoqgCWjBH4d1QB7wCCZAA62RjYJsWvIjJEubSfZGL+T0yjWW06XyxV3bqxbYo
Ob8VZRzI9neWagqNdwvYkQsEjgfbKbYK7p2CNTUQ
-----END CERTIFICATE-----
EOF

# Create service account
vcd login vcd.vmwire.com system administrator -p Vmware1!
cse create-service-role vcd.vmwire.com
# Enter system administrator username and password

# Create VCD service account for CSE
vcd user create --enabled svc-cse Vmware1! "CSE Service Role"

# Create config file
mkdir -p /opt/vmware/cse/config

cat > /opt/vmware/cse/config/config-not-encrypted.conf << EOF
mqtt:
  verify_ssl: false

vcd:
  host: vcd.vmwire.com
  log: true
  password: Vmware1!
  port: 443
  username: administrator
  verify: true

vcs:
- name: vcenter.vmwire.com
  password: Vmware1!
  username: administrator@vsphere.local
  verify: true

service:
  enforce_authorization: false
  legacy_mode: false
  log_wire: false
  processors: 15
  telemetry:
    enable: true

broker:
  catalog: cse-catalog
  ip_allocation_mode: pool
  network: default-organization-network
  org: cse
  remote_template_cookbook_url: https://raw.githubusercontent.com/vmware/container-service-extension-templates/master/template_v2.yaml
  storage_profile: 'truenas-iscsi-luns'
  vdc: cse-vdc
EOF

cse encrypt /opt/vmware/cse/config/config-not-encrypted.conf --output /opt/vmware/cse/config/config.yaml
chmod 600 /opt/vmware/cse/config/config.yaml
cse check /opt/vmware/cse/config/config.yaml

cse template list

mkdir -p ~/.ssh

# Add your public key(s) here
cat >> ~/.ssh/authorized_keys << EOF
ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEAhcw67bz3xRjyhPLysMhUHJPhmatJkmPUdMUEZre+MeiDhC602jkRUNVu43Nk8iD/I07kLxdAdVPZNoZuWE7WBjmn13xf0Ki2hSH/47z3ObXrd8Vleq0CXa+qRnCeYM3FiKb4D5IfL4XkHW83qwp8PuX8FHJrXY8RacVaOWXrESCnl3cSC0tA3eVxWoJ1kwHxhSTfJ9xBtKyCqkoulqyqFYU2A1oMazaK9TYWKmtcYRn27CC1Jrwawt2zfbNsQbHx1jlDoIO6FLz8Dfkm0DToanw0GoHs2Q+uXJ8ve/oBs0VJZFYPquBmcyfny4WIh4L0lwzsiAVWJ6PvzF5HMuNcwQ== rsa-key-20210508
EOF

# Import TKGm ova with this command
# Copy the ova to /home/ first, the ova can be obtained from my.vmware.com, ensure that it has chmod 644 permissions.
cse template import -F /home/ubuntu-2004-kube-v1.20.5-vmware.2-tkg.1-6700972457122900687.ova

# Install CSE
cse install -k ~/.ssh/authorized_keys

# Or use this if you've already installed and want to skip template creation again
cse upgrade --skip-template-creation -k ~/.ssh/authorized_keys

# Setup cse.sh
cat > /opt/vmware/cse/cse.sh << EOF
#!/usr/bin/env bash
source /opt/vmware/cse/python/bin/activate
export CSE_CONFIG=/opt/vmware/cse/config/config.yaml
export CSE_CONFIG_PASSWORD=Vmware1!
cse run
EOF

# Make cse.sh executable
chmod +x /opt/vmware/cse/cse.sh

# Deactivate the python virtual environment and go back to root
deactivate
exit

# Setup cse.service, use MQTT and not RabbitMQ
cat > /etc/systemd/system/cse.service << EOF
[Unit]
Description=Container Service Extension for VMware Cloud Director

[Service]
ExecStart=/opt/vmware/cse/cse.sh
User=cse
WorkingDirectory=/opt/vmware/cse
Type=simple
Restart=always

[Install]
WantedBy=default.target
EOF

systemctl enable cse.service
systemctl start cse.service

systemctl status cse.service

Install and enable the CSE UI Plugin for VCD

Download the latest version from https://github.com/vmware/container-service-extension/raw/master/cse_ui/3.0.4/container-ui-plugin.zip.

Enable it for the tenants that you want or for all tenants.

Enable the rights bundles

Follow the instructions in this other post.

Enable Global Roles to use CSE or Configure Rights Bundles

Edit the Organization Administrator role and scroll all the way down to the bottom and click both the View 8/8 and Manage 12/12, then Save.

Useful links

https://github.com/vmware/container-service-extension/commit/5d2a60b5eeb164547aef39602f9871c06726863e

https://vmware.github.io/container-service-extension/cse3_1/RELEASE_NOTES.html

Rights Bundles for Container Service Extension

A quick note on the Rights Bundles for Container Service Extension when enabling native, TKGm or TKGs clusters.

The rights bundle named vmware:tkgcluster Entitlement are for TKGs clusters and NOT for TKGm.

The rights bundle named cse:nativeCluster Entitlement are for native clusters AND also for TKGm clusters.

Yes, this is very confusing and will be fixed in an upcoming release.

You can see a brief note about this on the release notes here.

Users deploying VMware Tanzu Kubernetes Grid clusters should have the rights required to deploy exposed native clusters and additionally the right Full Control: CSE:NATIVECLUSTER. This right is crucial for VCD CPI to work properly.

So in summary, for a user to be able to deploy TKGm clusters they will need to have the cse:nativeCluster Entitlement rights.

To publish these rights, go to the Provider portal and navigate to Administration, Rights Bundles.

Click on the radio button next to cse:nativeCluster Entitlement and click on Publish, then publish to the desired tenant or to all tenants.

Container Service Extension Operational Tips

A short post on some operational tips for CSE 3.0.4. This post covers recommendations for sizing the CSE server, how to protect it from failure, finding the important log files and other tips and tricks.

Important files

Backup the following files. Its a good idea to perform image level backups of the VM too.

All file locations below assume you’re using the automated method to deploy CSE.

File	Why?
/opt/vmware/cse/config/config.yaml, unecrypted.conf	Contains the configuration for CSE server. Ensure you keep a safe backup of both the unecrypted file, so you can make changes and keep the encrypted file in case you lose the CSE server for whatever reason.
/opt/vmware/cse/.cse_scripts/*	Here you’ll find a bunch of directories that hold the Kubernetes templates runtimes for all of the supported Kubernetes versions. The supported templates are the TKGm ones and the native ones. Take a backup of this entire directory. You will need this if you want to save time when you redeploy CSE into a new VM but you’ve already prepared the templates and the templates are ready in the VCD catalog. Saving these directories and copying them to the new CSE VM will enable you to run the command: `sudo -u cse -i cse upgrade --skip-template-creation -k /opt/vmware/cse/.ssh/authorized_keys` This will skip the long process of template creation again but allow you to setup CSE on the new VM.

If you didn’t take a backup of the .cse_scripts directory and redeployed CSE with the –skip-template-creation flag and already have the templates in catalog – when you go to deploy a Kubernetes cluster with VCD you’ll see an error such as:

FileNotFoundError: [Errno 2] No such file or directory: '/opt/vmware/cse/.cse_scripts/ubuntu-16.04_k8-1.18_weave-2.6.5_rev2/mstr.sh'

How to install both native and TKGm templates

There are two cookbooks that can be used to install CSE and enable template creation into VCD. The two are

native and

TKGm

When you install CSE you can only configure one entry into the broker section of the config.yaml file.

broker:
  catalog: cse-catalog
  default_template_name: ubuntu-16.04_k8-1.21_weave-2.8.1
  default_template_revision: 1
  ip_allocation_mode: pool
  network: default-organization-network
  org: cse
  remote_template_cookbook_url: https://raw.githubusercontent.com/vmware/container-service-extension-templates/master/template.yaml
  storage_profile: 'truenas-iscsi-luns'
  vdc: cse-vdc

The lines 3, 4 and 8 are what we care about in the above code snippet. This code tells CSE to use the native template cookbook.

When you perform a completely fresh install of CSE you will need to run the installation without the –skip-template-creation flag.

sudo -u cse -i cse install -k /opt/vmware/cse/.ssh/authorized_keys

You’ll then get this option in VCD

How do you also enable TKGm templates in addition to native templates?

Well you would either update the config.yaml file or create a new one and use this code in the broker section instead.

broker:
  catalog: cse-catalog
  default_template_name: ubuntu-20.04_tkgm-1.20_antrea-0.11
  default_template_revision: 1
  ip_allocation_mode: pool
  network: default-organization-network
  org: cse
  remote_template_cookbook_url: https://raw.githubusercontent.com/vmware/container-service-extension-templates/tkgm/template.yaml
  storage_profile: 'truenas-iscsi-luns'
  vdc: cse-vd

However, this time you would not use cse install command, but rather cse upgrade instead.

sudo -u cse -i cse upgrade -k /opt/vmware/cse/.ssh/authorized_keys

You’ll then see two options in VCD

For a really easy end to end automated deployment of both native and TKGm templates, use the bash script I developed in my GitHub repository.

Use vSphere HA for the CSE server

The CSE server can not support its own high availability through multiple VMs and sharing state. In fact, CSE is designed not to hold any state and communicates entirely with VCD through the message bus either with MQTT or RabbitMQ.

Use vSphere HA with high priority to ensure that the CSE server is started quickly in the event of a loss of an ESXi host.

The following is unsupported – I’ve tested running two CSE servers using the same config.yaml file on two separate VMs and this does in fact work without any obvious errors. Since CSE is stateless and uses a message bus to function and to provide the extension capability for container service with VCD. However this is totally unsupported by VMware GSS, so don’t do this.

Sizing CSE server

Consider the following sizing for the CSE server

Configuration	Specification
vCPU	2 vCPUs
Memory	2 GB
Disk	18 GB * from Photon 3 OVA

This configuration will support up to 50 concurrent operations. Doubling the resource will not double the number of concurrent operations as there are many variables to consider. The bottleneck would be the ability for VCD to place messages on MQTT or RabbitMQ and also VCD’s operations concurrency.

Log files

Log file location	Why?
/opt/vmware/cse/.cse-logs/cse-server-debug.log	More detailed debug logs, use this one if something fails.
/opt/vmware/cse/.cse-logs/cse-server-info.log	CSE server logs and message bus messages

File Permissions for a healthy CSE server installation

I spent some time scratching my head with this when I wrote the bash script. The script ran as root but used sudo -u cse -i to run a Python virtual environment to install CSE as the cse user, this cause some issues initially but were resolved with the following chown and chmod settings.

File	Specification
entire /opt/vmware/cse directory	chown cse:cse -R chmod 775 -R
/opt/vmware/cse/config/config.yaml	chmod 600 chown cse:cse
/opt/vmware/cse/cse.sh	cse user execute permissions

CSE server service operations


systemctl start cse.service	Start the CSE service
systemctl stop cse.service	Stop the CSE service
systemctl status cse.service	Show current status `systemctl status cse.service ● cse.service - Container Service Extension for VMware Cloud Director Loaded: loaded (/etc/systemd/system/cse.service; enabled; vendor preset: enabled) Active: active (running) since Tue 2021-08-24 12:47:43 UTC; 7h ago Main PID: 4154 (bash) Tasks: 19 (limit: 2368) Memory: 73.6M CGroup: /system.slice/cse.service ├─4154 bash /opt/vmware/cse/cse.sh └─4155 /opt/vmware/cse/python/bin/python3 /opt/vmware/cse/python/bin/cse run`

Use CA signed certificates

Use CA signed certificates for VCD, vCenter. In your production environments you should! Even in your test environments or home labs it is very easy to obtain CA signed certs to use from a provider such as Let’s Encrypt. I’ve in fact written about this in some of my previous posts. Here for vCD and here for the rest.

Using CA signed certs allows you to set the key verify to true in the config.yaml file.

verify=true

Doing so makes you CSE server much more secure. This also allows you to use the vcd and cse CLIs without using the -i -w flags which is logging in without verifying certs and to disable warnings respectively. This is of course unsafe.

In order to ensure end to end security between CSE server, VCD and vCenter, import the certificate chain consisting of the INTERMEDIATE and ROOT certs from the certificate authority into the certs store on the CSE server.

sudo -u cse -i cat >> /opt/vmware/cse/python/lib/python3.7/site-packages/certifi/cacert.pem << EOF
-----BEGIN CERTIFICATE-----
[snipped]
-----END CERTIFICATE-----
EOF

Please see my example here starting on line 71.

Monitoring with Octant

Yes, Kubernetes clusters deployed by CSE into VCD can be monitored with Octant. I wrote about it previously here.

All you need to do is update your local kubeconfig file with the kubconfig that you downloaded from CSE in VCD.

As long as the workstation where Octant is running can route to the Control Plane endpoint for the Kubernetes cluster, Octant can then see and provided you with its great dashboards. You can use the CSE expose feature for this if your workstation is not inside the VCD cloud.

Removing clusters that failed to deploy

Obtain the cluster UID,

On CSE run this command to obtain the UID vcd cse cluster info, look for the uid parameter, it is all the way at the bottom, copy it to your clipboard.
Open up Postman or something with curl installed.
GET https://{{vcd_public_address}}/cloudapi/1.0.0/entities/urn:vcloud:entity:cse:nativeCluster:577b8c6c-bee4-49fb-8c03-2a22390f2783
POST https://{{vcd_public_address}}/cloudapi/1.0.0/entities/urn:vcloud:entity:cse:nativeCluster:577b8c6c-bee4-49fb-8c03-2a22390f2783/resolve
DEL https://{{vcd_public_address}}/cloudapi/1.0.0/entities/urn:vcloud:entity:cse:nativeCluster:577b8c6c-bee4-49fb-8c03-2a22390f2783
If that did not work use this DEL https://{{vcd_public_address}}/cloudapi/1.0.0/entities/urn:vcloud:entity:cse:nativeCluster:577b8c6c-bee4-49fb-8c03-2a22390f2783?invokeHooks=false

Known issues

Cannot deploy TKGm runtimes with expose set to true.

If you tried to use the expose feature when deploying a TKGm runtime it would fail. This is a known issue with CSE 3.0.4 and is being fixed, I’ll update this post when a fix is released.

Pre-requisites

Deploy Harbor

Update Harbor

Using Harbor as an OCI Registry for Helm Charts

Deploy an application with Helm

If you need to install Helm

Useful links

Intro

Step 1 – Download the harbor helm chart

Step 2 – Edit the values.yaml file

Step 3 – Create a TLS secret for ingress

Step 4 – Install Harbor

Log in

Intro

Install Cert Manager

Install Contour

Install Prometheus

Install Grafana

Accessing Grafana

Listing all installed packages

Making changes to Contour, Prometheus or Grafana

Removing Cert Manager, Contour, Prometheus or Grafana

Copypasta for doing this again on another cluster

Architecture

Show data using mysql-client-loop pod

Simulating Pod and Node downtime

Delete Pods

Drain a Node

Perform maintenance on ESX

Operations with local storage

How do TKG clusters with local storage handle ESX maintenance?

Summary

Before you start

Deploy Tanzu Kubernetes Clusters to Multiple Availibility Zones on vSphere

Configure Tanzu Kubernetes Plans and Clusters with an overlay that is topology-aware

Deploy a TKG cluster into a multi-AZ topology

Deploy the k8s-local-ssd Storage Class

Deploy a workload that uses Topology Aware Volume Provisioning

What if you wanted to use more than three availability zones?

Summary

Step 1. Create a new namespace

Step 2. Upload certificates

Step 3. Create secret

Step 4. Edit the deployment

Step 5. Expose Kubernetes Dashboard using a load balancer

Step 6. Deploy Kubernetes Dashboard

Step 7. Get full access to the cluster for Kubernetes Dashboard

Step 8. Obtain the login token

Step 1: Setup a static IP

Step 2: Enable pings to the VM

Step 3: Update Photon repositories and perform updates

Step 4: Install docker-compose

Step 5: Add a data disk for Harbor

Step 6: Setup Harbor

Step 7: Prepare SSL certificates

Step 8: Edit the harbor.yml file

Introduction

Why use Gateway API?

AVI Infra Setting

GatewayClass

Gateway

How to use GatewayAPI

Pre-requisites

Step 1. Upload an SSL certificate for Contour to VCD

Step 2. Install Helm

Step 3. Running Envoy as a non-root user

Step 4. Installing Contour (and Envoy)

Step 5. Setup DNS

Testing ingress with some apps

Getting started

Step 1.

Step 2.

Step 3.

Step 4.

Step 5.

Step 6.

Finishing off

Using the vcd cse cli to resize a TKGm cluster

Prepare a cluster config file

Viewing client logs