Best practices for installing CSE 4.0

Container Service Extension 4 was released recently. This post aims to ease the setup of CSE 4.0, which has a different deployment model: it uses the Solutions framework instead of deploying the CSE appliance into the traditional Management cluster used by service providers to run VMware management components such as vCenter, NSX-T Managers, Avi Controllers and other management systems.

Step 1 – Create a CSE Service Account

Perform these steps using the administrator@system account or an equivalent system administrator role.

Set up a Service Account in the Provider (system) organization with the CSE Admin Role.

In my environment I created a user to use as a service account named svc-cse. You’ll notice that this user has been assigned the CSE Admin Role.

The CSE Admin Role is created automatically by CSE when you use the CSE Management UI as a Provider administrator; just perform those steps using the administrator@system account.

Step 2 – Create a token for the Service Account

Log out of VCD and log back into the Provider organization as the service account you created in Step 1 above. Once logged in, it should look like the following screenshot; notice that the svc-cse user is logged into the Provider organization.

Click on the downward arrow at the top right of the screen, next to the user svc-cse and select User Preferences.

Under Access Tokens, create a new token and copy the token to a safe place. This is what you use to deploy the CSE appliance later.

Log out of VCD and log back in as administrator@system to the Provider organization.

Step 3 – Deploy CSE appliance

Create a new tenant Organization where you will run CSE. This new organization is dedicated to VCD extensions such as CSE and is managed by the service provider.

For example, you can name this new organization something like “solutions-org“. Create an Org VDC within this organization, together with the necessary network infrastructure such as a T1 router and an organization network with internet access.

Still logged into the Provider organization, open another tab by clicking on the Open in Tenant Portal link to your “solutions-org” organization. You must deploy the CSE vApp as a Provider.

Now you can deploy the CSE vApp.

Use the Add vApp From Catalog workflow.

Accept the EULA and continue with the workflow.

When you get to Step 8 of the Create vApp from Template workflow, ensure that you set up the OVF properties as in my screenshot below:

The important thing to note is to ensure that you use the correct service account username and the token from Step 2 above.

Also, since the service account must be in the Provider organization, leave the CSE service account’s org set to the default, system.

The last value is very important: it must be set to the tenant organization that will run the CSE appliance, in our case the “solutions-org” org.

Once the OVA is deployed you can boot it up; if you want to customize the root password, do so before you start the vApp. Otherwise, the default credentials are root and vmware.

Rights required for deploying TKG clusters

Ensure that the user that is logged into a tenant organization has the correct rights to deploy a TKG cluster. This user must have at a minimum the rights in the Kubernetes Cluster Author Global Role.

App LaunchPad

You’ll also need to upgrade App Launchpad to the latest version alp-2.1.2-20764259 to support CSE 4.0 deployed clusters.

Also ensure that the App-Launchpad-Service role has the rights to manage CAPVCD clusters.

Otherwise you may encounter the following issue:

VCD API Protected by Web Application Firewalls

If you are using a web application firewall (WAF) in front of your VCD cells and you are blocking access to the provider side APIs, you will need to add the SNAT IP address of the solutions-org T1 to the WAF whitelist.

The CSE appliance will need access to the VCD provider side APIs.

I wrote about using a WAF in front of VCD in the past to protect provider side APIs. You can read those posts here and here.

Cleaning up CSE 4.0 beta

For those partners that have been testing the beta, you’ll need to remove all traces of it before you can install the GA version. VMware does not support upgrading or migrating from beta builds to GA builds.

This is a post to help you clean up your VMware Cloud Director environment in preparation for the GA build of CSE 4.0.

If you don’t clean up, when you try to configure CSE again with the CSE Management wizard, you’ll see the message below:

“Server configuration entity already exists.”

Delete CSE Roles

First delete all the CSE Roles that the beta has setup, the GA version of CSE will recreate these for you when you use the CSE management wizard. Don’t forget to assign the new role to your CSE service account when you deploy the CSE GA OVA.

Use the Postman Collection to clean up

I’ve included a Postman collection on my Github account, available here.

Hopefully it is self-explanatory: authenticate against the VCD API, then run each API request in order, making sure you obtain the entity and entityType IDs before you delete anything.

If you’re unable to delete the entity or entityTypes, you may need to delete all of the CSE clusters first; that means cleaning up all PVCs, PVs and deployments, and then the clusters themselves.
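
If you prefer to script the clean-up rather than use Postman, the requests in the collection map to VCD’s Defined Entity API and can be driven with curl. The sketch below is illustrative only; the hostname, API version header, credentials and IDs are placeholders to substitute with values from your own environment.

# Authenticate as a provider administrator and grab the bearer token from the response headers
curl -sk -D - -o /dev/null -X POST -u 'administrator@system:<password>' \
  -H 'Accept: application/json;version=37.0' \
  https://vcd.vmwire.com/cloudapi/1.0.0/sessions/provider | grep -i x-vmware-vcloud-access-token

# List the defined entity types and note the IDs of the CSE related ones
curl -sk -H 'Accept: application/json;version=37.0' -H 'Authorization: Bearer <token>' \
  https://vcd.vmwire.com/cloudapi/1.0.0/entityTypes

# Delete an entity first, then its entityType
curl -sk -X DELETE -H 'Accept: application/json;version=37.0' -H 'Authorization: Bearer <token>' \
  https://vcd.vmwire.com/cloudapi/1.0.0/entities/<entity-id>
curl -sk -X DELETE -H 'Accept: application/json;version=37.0' -H 'Authorization: Bearer <token>' \
  https://vcd.vmwire.com/cloudapi/1.0.0/entityTypes/<entityType-id>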

Deploy CSE GA Normally

You’ll now be able to use the Configure Management wizard and deploy CSE 4.0 GA as normal.

Known Issues

If you’re unable to delete any of these entities then run a POST using /resolve.

For example, https://vcd.vmwire.com/api-explorer/provider#/definedEntity/resolveDefinedEntity

Once it is resolved, you can go ahead and delete the entity.
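
As a rough curl equivalent of the api-explorer call above (same placeholders as the clean-up sketch earlier):

curl -sk -X POST -H 'Accept: application/json;version=37.0' -H 'Authorization: Bearer <token>' \
  https://vcd.vmwire.com/cloudapi/1.0.0/entities/<entity-id>/resolve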

Using Velero with Restic for Kubernetes Data Protection

Overview

Velero (formerly Heptio Ark) gives you tools to back up and restore your Kubernetes cluster resources and persistent volumes. You can run Velero with a cloud provider or on-premises. Velero lets you:

  • Take backups of your cluster and restore in case of loss.
  • Migrate cluster resources to other clusters.
  • Replicate your production cluster to development and testing clusters.

Velero consists of:

  • A server that runs on your Kubernetes cluster
  • A command-line client that runs locally

Velero works with any Kubernetes cluster, including Tanzu Kubernetes Grid and Kubernetes clusters deployed using Container Service Extension with VMware Cloud Director.

This solution can be used for air-gapped environments where the Kubernetes clusters do not have Internet access and cannot use public services such as Amazon S3, or Tanzu Mission Control Data Protection. These services are SaaS services which are pretty much out of bounds in air-gapped environments.

Install Velero onto your workstation

Download the latest Velero release for your preferred operating system; this is usually the workstation where you have your kubectl tools.

https://github.com/vmware-tanzu/velero/releases
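
For example, to pull the v1.8.1 Linux build used in this post (adjust the version and platform to suit your environment):

wget https://github.com/vmware-tanzu/velero/releases/download/v1.8.1/velero-v1.8.1-linux-amd64.tar.gz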

Extract the contents.

tar zxvf velero-v1.8.1-linux-amd64.tar.gz

You’ll see a folder structure like the following.

ls -l
total 70252
-rw-r----- 1 phanh users    10255 Mar 10 09:45 LICENSE
drwxr-x--- 4 phanh users     4096 Apr 11 08:40 examples
-rw-r----- 1 phanh users    15557 Apr 11 08:52 values.yaml
-rwxr-x--- 1 phanh users 71899684 Mar 15 02:07 velero

Copy the velero binary to the /usr/local/bin location so it is usable from anywhere.

sudo cp velero /usr/local/bin/velero

sudo chmod 755 /usr/local/bin/velero

If you want to enable bash auto completion, please follow this guide.
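
In short, assuming the bash-completion package is already installed on your workstation, something like this enables it persistently:

velero completion bash | sudo tee /etc/bash_completion.d/velero > /dev/null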

Setup an S3 service and bucket

I’m using TrueNAS’ S3 compatible storage in my lab. TrueNAS is an S3 compliant object storage system and is incredibly easy to set up. You can use other S3 compatible object stores such as Amazon S3. A full list of supported providers can be found here.

Follow these instructions to setup S3 on TrueNAS.

  1. Add certificate, go to System, Certificates
  2. Add, Import Certificate, copy and paste cert.pem and cert.key
  3. Storage, Pools, click on the three dots next to the Pools that will hold the S3 root bucket.
  4. Add a Dataset, give it a name such as s3-storage
  5. Services, S3, click on pencil icon.
  6. Setup like the example below.

Setup the access key and secret key for this configuration.

access key: AKIAIOSFODNN7EXAMPLE
secret key: wJalrXUtnFEMIK7MDENGbPxRfiCYEXAMPLEKEY

Update DNS so that s3.vmwire.com points to 10.92.124.20 (the IP of TrueNAS). Note that this FQDN and IP address need to be accessible from the Kubernetes worker nodes. For example, if you are installing Velero onto Kubernetes clusters in VCD, the worker nodes on the Organization network need to be able to route to your S3 service. If you are a service provider, you can place your S3 service on the services network that is accessible by all tenants in VCD.

Test access

Download and install the S3 browser tool https://s3-browser.en.uptodown.com/windows

Setup the connection to your S3 service using the access key and secret key.

Create a new bucket to store some backups. If you are using Container Service Extension with VCD, create a new bucket for each Tenant organization. This ensures multi-tenancy is maintained. I’ve created a new bucket named tenant1, which corresponds to one of my tenant organizations in my VCD environment.
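
If you prefer the command line, the AWS CLI works against any S3 compatible endpoint; a quick sketch against my TrueNAS endpoint looks like this (assuming the access key and secret key have already been set with aws configure):

aws --endpoint-url https://s3.vmwire.com:9000 s3 mb s3://tenant1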

Install Velero into the Kubernetes cluster

You can use the velero-plugin-for-aws and the AWS provider with any S3 API compatible system, this includes TrueNAS, Cloudian Hyperstore etc.

Set up a file named credentials-velero with your access key and secret key details.

vi credentials-velero
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMIK7MDENGbPxRfiCYEXAMPLEKEY

Change your Kubernetes context to the cluster that you want to enable for Velero backups. The Velero CLI will connect to your Kubernetes cluster and deploy all the resources for Velero.

velero install \
    --use-restic \
    --default-volumes-to-restic \
    --use-volume-snapshots=false \
    --provider aws \
    --plugins velero/velero-plugin-for-aws:v1.4.0 \
    --bucket tenant1 \
    --backup-location-config region=default,s3ForcePathStyle="true",s3Url=https://s3.vmwire.com:9000 \
    --secret-file ./credentials-velero

To install Restic, use the --use-restic flag in the velero install command. See the install overview for more details on other flags for the install command.

velero install --use-restic

When using Restic on a storage provider that doesn’t have Velero support for snapshots, the --use-volume-snapshots=false flag prevents an unused VolumeSnapshotLocation from being created on installation. The VCD CSI provider does not provide native snapshot capability, which is why using Restic is a good option here.

I’ve enabled the default behavior of including all persistent volumes in pod backups by running the velero install command with the --default-volumes-to-restic flag. Refer to the install overview for details.

Specify the bucket with the --bucket flag, I’m using tenant1 here to correspond to a VCD tenant that will have its own bucket for storing backups in the Kubernetes cluster.

For the --backup-location-config flag, configure your settings like mine, and use the s3Url key to point to your S3 object store; if you don’t set this, Velero will use AWS’ public S3 URIs.

A working deployment looks like this

time="2022-04-11T19:24:22Z" level=info msg="Starting Controller" logSource="/go/pkg/mod/github.com/bombsimon/logrusr@v1.1.0/logrusr.go:111" logger=controller.downloadrequest reconciler group=velero.io reconciler kind=DownloadRequest
time="2022-04-11T19:24:22Z" level=info msg="Starting controller" controller=restore logSource="pkg/controller/generic_controller.go:76"
time="2022-04-11T19:24:22Z" level=info msg="Starting controller" controller=backup logSource="pkg/controller/generic_controller.go:76"
time="2022-04-11T19:24:22Z" level=info msg="Starting controller" controller=restic-repo logSource="pkg/controller/generic_controller.go:76"
time="2022-04-11T19:24:22Z" level=info msg="Starting controller" controller=backup-sync logSource="pkg/controller/generic_controller.go:76"
time="2022-04-11T19:24:22Z" level=info msg="Starting workers" logSource="/go/pkg/mod/github.com/bombsimon/logrusr@v1.1.0/logrusr.go:111" logger=controller.backupstoragelocation reconciler group=velero.io reconciler kind=BackupStorageLocation worker count=1
time="2022-04-11T19:24:22Z" level=info msg="Starting workers" logSource="/go/pkg/mod/github.com/bombsimon/logrusr@v1.1.0/logrusr.go:111" logger=controller.downloadrequest reconciler group=velero.io reconciler kind=DownloadRequest worker count=1
time="2022-04-11T19:24:22Z" level=info msg="Starting workers" logSource="/go/pkg/mod/github.com/bombsimon/logrusr@v1.1.0/logrusr.go:111" logger=controller.serverstatusrequest reconciler group=velero.io reconciler kind=ServerStatusRequest worker count=10
time="2022-04-11T19:24:22Z" level=info msg="Validating backup storage location" backup-storage-location=default controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:114"
time="2022-04-11T19:24:22Z" level=info msg="Backup storage location valid, marking as available" backup-storage-location=default controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:121"
time="2022-04-11T19:25:22Z" level=info msg="Validating backup storage location" backup-storage-location=default controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:114"
time="2022-04-11T19:25:22Z" level=info msg="Backup storage location valid, marking as available" backup-storage-location=default controller=backup-storage-location logSource="pkg/controller/backup_storage_location_controller.go:121"

To see all resources deployed, use this command.

k get all -n velero
NAME                          READY   STATUS    RESTARTS   AGE
pod/restic-x6r69              1/1     Running   0          49m
pod/velero-7bc4b5cd46-k46hj   1/1     Running   0          49m

NAME                    DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/restic   1         1         1       1            1           <none>          49m

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/velero   1/1     1            1           49m

NAME                                DESIRED   CURRENT   READY   AGE
replicaset.apps/velero-7bc4b5cd46   1         1         1       49m

Example to test Velero and Restic integration

Please use this link here: https://velero.io/docs/v1.5/examples/#snapshot-example-with-persistentvolumes

You may need to edit the with-pv.yaml manifest if you don’t have a default storage class.
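
If you’d rather set a default storage class than edit the manifest, something like the following does it; replace <storage-class-name> with one of the classes listed by kubectl get sc:

kubectl patch storageclass <storage-class-name> \
  -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'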

Useful commands

velero get backup-locations
NAME      PROVIDER   BUCKET/PREFIX   PHASE       LAST VALIDATED                  ACCESS MODE   DEFAULT
default   aws        tenant1          Available   2022-04-11 19:26:22 +0000 UTC   ReadWrite     true

Create a backup example

velero backup create nginx-backup --selector app=nginx

Show backup logs

velero backup logs nginx-backup

Delete a backup

velero delete backup nginx-backup

Show all backups

velero backup get

Back up the VCD PostgreSQL database (see this previous blog post).

velero backup create postgresql --ordered-resources 'statefulsets=vmware-cloud-director/postgresql-primary' --include-namespaces=vmware-cloud-director

Show logs for this backup

velero backup logs postgresql

Describe the postgresql backup

velero backup describe postgresql

Describe volume backups

kubectl -n velero get podvolumebackups -l velero.io/backup-name=nginx-backup -o yaml

apiVersion: v1
items:
- apiVersion: velero.io/v1
  kind: PodVolumeBackup
  metadata:
    annotations:
      velero.io/pvc-name: nginx-logs
    creationTimestamp: "2022-04-13T17:55:04Z"
    generateName: nginx-backup-
    generation: 4
    labels:
      velero.io/backup-name: nginx-backup
      velero.io/backup-uid: c92d306a-bc76-47ba-ac81-5b4dae92c677
      velero.io/pvc-uid: cf3bdb2f-714b-47ee-876c-5ed1bbea8263
    name: nginx-backup-vgqjf
    namespace: velero
    ownerReferences:
    - apiVersion: velero.io/v1
      controller: true
      kind: Backup
      name: nginx-backup
      uid: c92d306a-bc76-47ba-ac81-5b4dae92c677
    resourceVersion: "8425774"
    uid: 1fcdfec5-9854-4e43-8bc2-97a8733ee38f
  spec:
    backupStorageLocation: default
    node: node-7n43
    pod:
      kind: Pod
      name: nginx-deployment-66689547d-kwbzn
      namespace: nginx-example
      uid: 05afa981-a6ac-4caf-963b-95750c7a31af
    repoIdentifier: s3:https://s3.vmwire.com:9000/tenant1/restic/nginx-example
    tags:
      backup: nginx-backup
      backup-uid: c92d306a-bc76-47ba-ac81-5b4dae92c677
      ns: nginx-example
      pod: nginx-deployment-66689547d-kwbzn
      pod-uid: 05afa981-a6ac-4caf-963b-95750c7a31af
      pvc-uid: cf3bdb2f-714b-47ee-876c-5ed1bbea8263
      volume: nginx-logs
    volume: nginx-logs
  status:
    completionTimestamp: "2022-04-13T17:55:06Z"
    path: /host_pods/05afa981-a6ac-4caf-963b-95750c7a31af/volumes/kubernetes.io~csi/pvc-cf3bdb2f-714b-47ee-876c-5ed1bbea8263/mount
    phase: Completed
    progress:
      bytesDone: 618
      totalBytes: 618
    snapshotID: 8aa5e473
    startTimestamp: "2022-04-13T17:55:04Z"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Rights Bundles for Container Service Extension

A quick note on the Rights Bundles for Container Service Extension when enabling native, TKGm or TKGs clusters.

The rights bundle named vmware:tkgcluster Entitlement is for TKGs clusters and NOT for TKGm.

The rights bundle named cse:nativeCluster Entitlement is for native clusters AND also for TKGm clusters.

Yes, this is very confusing and will be fixed in an upcoming release.

You can see a brief note about this on the release notes here.

Users deploying VMware Tanzu Kubernetes Grid clusters should have the rights required to deploy exposed native clusters and additionally the right Full Control: CSE:NATIVECLUSTER. This right is crucial for VCD CPI to work properly.

So in summary, for a user to be able to deploy TKGm clusters they will need to have the cse:nativeCluster Entitlement rights.

To publish these rights, go to the Provider portal and navigate to Administration, Rights Bundles.

Click on the radio button next to cse:nativeCluster Entitlement and click on Publish, then publish to the desired tenant or to all tenants.
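
If you prefer to script the publishing, rights bundles are also exposed through the VCD API. A rough sketch is below; the endpoints and API version header are from memory and may vary with your VCD release, so treat them as assumptions and check the API explorer first.

# Find the ID of the cse:nativeCluster Entitlement rights bundle
curl -sk -H 'Accept: application/json;version=37.0' -H 'Authorization: Bearer <token>' \
  https://vcd.vmwire.com/cloudapi/1.0.0/rightsBundles

# Publish it to all tenants
curl -sk -X POST -H 'Accept: application/json;version=37.0' -H 'Authorization: Bearer <token>' \
  https://vcd.vmwire.com/cloudapi/1.0.0/rightsBundles/<rights-bundle-id>/tenants/publishAll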

Load Balancing and Protecting Cloud Director with Avi Networks

This article covers protecting and load balancing the Cloud Director application with Avi Networks. It covers SSL termination, health monitoring and layer 7 HTTP filtering. It can also be used as a reference for other load balancer products such as F5 LTM or NGINX.

Overview

The Avi Vantage platform is built on software-defined principles, enabling a next generation architecture to deliver the flexibility and simplicity expected by IT and lines of business. The Avi Vantage architecture separates the data and control planes to deliver application services beyond load balancing, such as application analytics, predictive autoscaling, micro-segmentation, and self-service for app owners in both on-premises or cloud environments. The platform provides a centrally managed, dynamic pool of load balancing resources on commodity x86 servers, VMs or containers, to deliver granular services close to individual applications. This allows network services to scale near infinitely without the added complexity of managing hundreds of disparate appliances.

https://avinetworks.com/docs/18.2/architectural-overview/

Avi components

Controllers – these are the management appliances responsible for state data; Service Engines are deployed by the controllers. The controllers run in a management network.

Service Engines – these run the load balancing services. They generally run in a DMZ network and can have one or more network adaptors connected to multiple networks: at least one network with routing to the controllers, and the remaining networks as data networks.

Deployment modes

Avi can be installed in a variety of deployment types. For VMware Cloud on AWS, it is not currently possible to deploy using ‘write access’ as vCenter is locked-down in VMC and it also has a different API from vSphere 6.7 vCenter Server. You’ll also find that other tools may not work with vCenter in a VMware Cloud on AWS SDDC, such as govc.

Instead Avi needs to be deployed using ‘No Access’ mode.

You can refer to this link for instructions to deploy Avi Controllers in ‘No Access’ mode.

Since it is only possible to use ‘No Access’ mode with VMC based SDDCs, it’s also a requirement to deploy the service engines manually. To do this follow the guide in this link, and start at the section titled Downloading Avi Service Engine on OVA.

If you’re using Avi with on-premises deployments of vCenter, then ‘Write Mode’ can be used to automate the provisioning of service engines. Refer to this link for more information on the different modes.

Deploying Avi Controller with govc

You can deploy the Avi Controller onto non VMware Cloud on AWS vCenter servers using the govc tool. Refer to this other post on how to do so. I’ve copied the JSON for the controller.ova for your convenience below.

{
    "DiskProvisioning": "flat",
    "IPAllocationPolicy": "dhcpPolicy",
    "IPProtocol": "IPv4",
    "PropertyMapping": [
        {
            "Key": "avi.mgmt-ip.CONTROLLER",
            "Value": ""
        },
        {
            "Key": "avi.mgmt-mask.CONTROLLER",
            "Value": ""
        },
        {
            "Key": "avi.default-gw.CONTROLLER",
            "Value": ""
        },
        {
            "Key": "avi.sysadmin-public-key.CONTROLLER",
            "Value": ""
        }
    ],
    "NetworkMapping": [
        {
            "Name": "Management",
            "Network": ""
        }
    ],
    "MarkAsTemplate": false,
    "PowerOn": false,
    "InjectOvfEnv": false,
    "WaitForIP": false,
    "Name": null
}
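
Once you’ve filled in the PropertyMapping and NetworkMapping values, the deployment with govc looks something like the commands below. The vCenter URL, datastore and resource pool names are just lab examples, so change them to match your environment.

export GOVC_URL=vcenter.vmwire.lab GOVC_USERNAME=administrator@vsphere.local GOVC_PASSWORD='<password>' GOVC_INSECURE=true
govc import.ova -name=avi-controller -ds=vsanDatastore -pool=*/Resources -options=controller.json controller.ova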

Architecture

For a high-level architecture overview, this link provides a great starting point.

Figure 1. Avi architecture

Service Engine Typical Deployment Architecture

Generally, in legacy deployments where BGP is not used, the service engines tend to have three network interfaces, typically used for frontend, backend and management networks. This is typical of traditional deployments with F5 LTM, for example.

For our example here, I will use three networks for the SEs as laid out below.

Network name        Gateway CIDR      Purpose
sddc-cgw-vcd-dmz1   10.104.125.1/24   Management
sddc-cgw-vcd-dmz2   10.104.126.1/24   Backend
sddc-cgw-vcd-dmz3   10.104.127.1/24   Frontend

The service engines are configured with the following details. It is important to make a note of the MAC addresses in ‘No access’ mode as you will need this information later.

Service Engine   avi-se1                         avi-se2
Management       IP Address 10.104.125.11        IP Address 10.104.125.12
                 Mac Address 00:50:56:8d:c0:2e   Mac Address 00:50:56:8d:38:33
Backend          IP Address 10.104.126.11        IP Address 10.104.126.12
                 Mac Address 00:50:56:8d:8e:41   Mac Address 00:50:56:8d:53:f6
Frontend         IP Address 10.104.127.11        IP Address 10.104.127.12
                 Mac Address 00:50:56:8d:89:b4   Mac Address 00:50:56:8d:80:41

The Management network is used for communications between the SEs and the Avi controllers. For the port requirements, please refer to this link.

The Backend network is used for communications between the SEs and the application that is being load balanced and protected by Avi.

The Frontend network is used for upstream communications to the clients, in this case the northbound router or firewall towards the Internet.

Sample Application

Let’s use VMware Cloud Director as the sample application for configuring Avi. vCD, as it is more commonly known (since renamed VMware Cloud Director), is a cloud platform that is deployed with an Internet facing portal. Because of this, it is always best to protect the portal from malicious attacks by employing a number of methods.

Some of these include SSL termination and web application filtering. The following two documents explain this in more detail.

vCloud Director Security and VMware vCloud Director Security Hardening Guide.

The vCD application is configured as below:

                  vCD Appliance 1                vCD Appliance 2
name              vcd-fe1                        vcd-fe2
eth0 ip address   10.104.123.21                  10.104.123.22
static route      10.104.123.1 10.104.126.0/24   10.104.123.1 10.104.126.0/24
eth1 ip address   10.104.124.21                  10.104.124.22

You’ll notice that the eth0 and eth1 interfaces are connected to two different management networks 10.104.123.0/24 and 10.104.124.0/24 respectively. For vCD, it is generally good practice to separate the two interfaces into separate networks.

Network name          Gateway CIDR      Purpose
sddc-cgw-vcd-mgmt-1   10.104.123.1/24   vCD Frontend (UI/API/VM Remote Console)
sddc-cgw-vcd-mgmt-2   10.104.124.1/24   vCD Backend (PostgreSQL, SSH etc.)

For simplicity, I also deployed my Avi controllers onto the sddc-cgw-vcd-mgmt-2 network.

The diagram below summarises the above architecture for the HTTP interface for vCD. For this guide, I’ve used VMware Cloud on AWS together with Avi Networks to protect vCD running as an appliance inside the SDDC. This is not a typical deployment model, as Cloud Director Service will be able to use VMware Cloud on AWS SDDC resources soon, but I wanted to showcase the possibilities and constraints when using Avi with VMC based SDDCs.

Figure 2 . vCD HTTP Diagram

Configuring Avi for Cloud Director

After you have deployed the Avi Controllers and the Service Engines, there are a few more steps needed before vCD is fully up and operational. They can be summarised as follows:

  1. Setup networking for the service engines by assigning the right IP address to the correct MAC addresses for the data networks
  2. Configure the network subnets for the service engines
  3. Configure static routes for the service engines to reach vCD
  4. Setup Legacy HA mode for the service engine group
  5. Setup the SSL certificate for the HTTP service
  6. Setup the Virtual Services for HTTP and Remote Console (VMRC)
  7. Setup the server pools
  8. Setup health monitors
  9. Setup HTTP security policies

Map Service Engine interfaces

Using the Avi Vantage Controller, navigate to Infrastructure > Service Engine, select one of the Service Engines then click on the little pencil icon. Then map the MAC addresses to the correct IP addresses.

Configure the network subnets for the service engines

Navigate to Infrastructure > Networks and create the subnets.

Configure static routes

Navigate to Infrastructure > Routing and set up any static routes. You’ll notice from figure 2 that since the service engine has three network interfaces on different networks, we need to create a static route on the interface that does not have the default gateway. This is so the service engines know which gateway to use for particular traffic types; in this case, the gateway used to route HTTP and Remote Console traffic southbound to the vCD cells.

Setup Legacy HA mode for the service engine group

Navigate to Infrastructure > Service Engine Group.

Set up the HA mode as Legacy HA. This is the simplest configuration; you can use Elastic HA if you wish.

Configure the HTTP and Remote Console Virtual Services

Navigate to Applications > Virtual Services.

Creating a Virtual Service has a few sub-tasks, including the creation of the downstream server pools and SSL certificates.

Create a new Virtual Service for the HTTP service, this is for the Cloud Director UI and API. Please use this example to create another Virtual Service for the Remote Console.

For the Remote Console service, you will need to accept TCP 443 on the load balancer but connect southbound to the Cloud Director appliances on port TCP 8443. TCP 8443 is the port that VMRC uses as it shares the same IP addresses as the HTTP service.

You may notice that the screenshot is for an already configured Virtual Service for the vCD HTTP service. The server pool and SSL certificate are already configured. Below are the screenshots for those.

Certificate Management

You may already have a signed HTTP certificate that you wish to use with the load balancer for SSL termination. To do so, you will need to use the Java keytool to manipulate the HTTP certificate, obtaining the private key and converting the keystore from JCEKS to PKCS12. The Java keytool is available in the vCD appliance at /opt/vmware/vcloud-director/jre/bin/.

Figure 3. SSL termination on load balancer

For detailed instructions on creating a signed certificate for vCD, please follow this guide.

Convert the keystore file certificates.ks from JCEKS to PKCS12

keytool -importkeystore -srcstoretype JCEKS -srckeystore certificates.ks -destkeystore certificates_pkcs12.ks -deststoretype PKCS12

Export private key for the HTTP certificate from the certificates_pkcs12.ks file

keytool -importkeystore -srckeystore certificates_pkcs12.ks -srcalias http -destalias http -destkeystore httpcert.p12 -deststoretype PKCS12

Now that you have the private key for the HTTP certificate, you can go ahead and configure the HTTP certificate on the load balancer.

For the certificate file, you can either paste the text or upload the certificate file (.cer, .crt) from the certificate authority for the HTTP certificate.

For the Key (PEM) or PKCS12 file, you can use the httpcert.p12 file that you extracted from the certificates_pkcs12.ks file above.

The Key Passphrase is the password that you used to secure the httpcert.p12 file earlier.
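
If your workflow calls for the key in PEM format rather than the PKCS12 bundle, you can extract it with openssl (you’ll be prompted for the keystore password):

openssl pkcs12 -in httpcert.p12 -nocerts -nodes -out http.key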

Note that the vCD Remote Console (VMRC) must use pass-through for SSL termination, e.g., termination of the VMRC session must happen on the Cloud Director cell. Therefore, the above certificate management activities on Avi are not required for the VMRC.

Health Monitors

Navigate to Applications > Pools.

Edit the HTTP pool using the pencil icon and click on the Add Active Monitor green button.

Health monitoring of the HTTP service uses

GET /cloud/server_status HTTP/1.0

With an expected server response of

Service is up.

And a response code of 200.
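
You can check the health monitor target manually from a machine that can reach the cells directly, for example against the first appliance’s eth0 address from the table above. A healthy cell returns a 200 response code with the body Service is up.

curl -ik https://10.104.123.21/cloud/server_status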

The vCD Remote Console Health monitor is a lot simpler as you can see below.

Layer 7 HTTP Security

Layer 7 HTTP Security is very important and is highly recommended for any application exposed to the Internet. Layer 3 firewalling and SSL certificates alone are never enough to protect and secure applications.

Navigate to Applications > Virtual Services.

Click on the pencil icon for the HTTP virtual service and then click on the Policies tab. Then click on the HTTP Security policy. Add a new policy with the following settings. You can read more about Layer 7 HTTP policies here.

Allowed Strings                      Required by
/tenant                              Tenant use
/login                               Login
/network                             Access to networking
/tenant-networking                   Access to networking
/cloud                               For SAML/SSO logins
/transfer                            Uploads/Downloads of ISO and templates
/api                                 General API access
/cloudapi                            General API access
/docs                                Swagger API browser

Blocked Strings
/cloudapi/1.0.0/sessions/provider    Specifically block admin APIs from the Internet

This will drop all provider side services when accessed from the Internet. To access provider side services, such as /provider or admin APIs, use an internal connection to the Cloud Director cells.

Change Cloud Director public addresses

If not already done so, you should also change the public address settings in Cloud Director.

Testing the Cloud Director portal

Try to access https://vcloud.vmwire.com/provider

You won’t be able to access it as /provider is not on the list of allowed URI strings that we configured in the L7 HTTPS Security settings.

However, if you try to access https://vcloud.vmwire.com/tenant/vmwire, you will be able to reach the tenant portal for the organisation named VMwire.
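
From a command line outside your network, a quick check looks something like this; the exact status returned for the blocked URI depends on the action you configured in the HTTP security policy.

curl -I https://vcloud.vmwire.com/provider      # blocked by the L7 policy
curl -I https://vcloud.vmwire.com/tenant/vmwire # reaches the tenant portal for VMwire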

Many thanks to Mikael Steding, our Avi Network Systems Engineer for helping me with setting this up.

Please reach out to me if you have any questions.