Learning to Troubleshoot Crossplane

January 9, 2023

Craig D Wilhite

Reading time: 6 min read


Many Crossplane users just beginning their journey must eventually venture off the beaten path of reference docs and Crossplane tutorials and begin crafting APIs of their own. They may find themselves submitting a claim for a new, custom API they just defined to their control plane and wondering, “...wait, is the control plane doing anything?” Knowing how to debug issues with Crossplane is a crucial skill to have on your Crossplane journey. Here are three tips for Crossplane users who want to understand how to begin debugging their control planes.

Anatomy of Crossplane resources

It’s helpful to familiarize yourself with the moving parts involved in making your APIs work. Assuming users create claim objects to express the desired end state when consuming your control plane’s API, several objects get created in response to each claim:

  • Claims generate a Composite Resource (XR)
  • The Composite Resource generates one or more Managed Resources, nested Composite Resources, or a combination of both
  • The Managed Resources are ultimately backed by controllers provided by the Crossplane Provider they originate from, which must be correctly configured with a ProviderConfig object to talk to the API of the external service you’re building on top of

It’s worth highlighting that Composite Resources can themselves be composed of nested Composite Resources, so the same pattern repeats until there are no more nested Composite Resources to unpack. Here is a diagram to illustrate:

Walking the tree of the diagram, we see that a Crossplane Claim generates a Composite Resource (XR). That Composite Resource will generate one or more Managed Resources, OR nested Composite Resources, OR both. The faded boxes are meant to illustrate the optionality here; it ultimately depends on how a given Composite Resource is defined. When you finally unwind a Composite Resource and its nested XRs (if any) to its foundations, you end up with only Managed Resources, and each of those Managed Resources is defined and implemented by a Crossplane Provider.
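
To make the tree concrete, here is a minimal sketch of walking it with kubectl. It relies on the references Crossplane records on these objects (spec.resourceRef on a claim, spec.resourceRefs on an XR); the function names and the kind and name arguments are hypothetical placeholders for your own API.

```shell
# Sketch: walk from a claim down to the resources it composes.
# xr_for_claim and children_of_xr are hypothetical helper names.

# Print the name of the Composite Resource (XR) that a claim is bound to.
# Usage: xr_for_claim <claim-kind> <claim-name> <namespace>
xr_for_claim() {
  kubectl get "$1" "$2" --namespace "$3" \
    --output jsonpath='{.spec.resourceRef.name}'
}

# Print one "Kind/name" line per resource composed by an XR.
# A child may be a Managed Resource or a nested XR; rerun on nested XRs.
# Usage: children_of_xr <xr-kind> <xr-name>
children_of_xr() {
  kubectl get "$1" "$2" \
    --output jsonpath='{range .spec.resourceRefs[*]}{.kind}/{.name}{"\n"}{end}'
}
```

Crossplane also registers the claim, composite, and managed categories, so "kubectl get managed" lists every Managed Resource on the control plane in one call.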

Having a grasp of this anatomy will pay dividends when it comes to debugging complex Crossplane resources (which Composite Resources are). For now, let’s stick to surface-level troubleshooting and look at some tips for debugging simpler Crossplane resources: ProviderConfigs and Managed Resources.

Tip #1: Verify you have a valid ProviderConfig

The first problem area users might encounter when they submit a claim and find that nothing deploys is the ProviderConfig, so make sure it is correctly configured. If the ProviderConfig is misconfigured, the control plane will never be able to successfully call the external cloud service. Here are some ways ProviderConfigs can end up in a faulty state:

  • If you are using a Service Account, it may have insufficient permissions
  • The secret associated with the ProviderConfig has the wrong name or is in the wrong namespace
  • The secret may be misconfigured or formatted incorrectly


In this blog post, I demonstrate debugging a ProviderConfig on a control plane that has platform-ref-gcp installed on it. The first thing to note: Configurations and Providers can report a “healthy” and “installed” status of “true” even if the ProviderConfig is absent or misconfigured.
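
A quick way to see that gap is to list the packages next to the ProviderConfigs that actually exist. This is a sketch only: the function name is hypothetical, and on a control plane with several providers installed "providerconfigs" may resolve to just one provider's API group.

```shell
# Sketch: compare package health with the ProviderConfigs that exist.
# show_provider_setup is a hypothetical helper name.
show_provider_setup() {
  # Installed packages. The INSTALLED and HEALTHY columns describe the
  # package itself, not the credentials it will use.
  kubectl get providers.pkg.crossplane.io
  kubectl get configurations.pkg.crossplane.io
  # ProviderConfigs known to the control plane. If none is listed,
  # Managed Resources have nothing to authenticate with yet.
  kubectl get providerconfigs
}
```

Healthy packages alongside a missing or wrong ProviderConfig is exactly the situation described above.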


For now, the best approach to knowing whether or not you have your Provider configured correctly is to try creating a simple Managed Resource (MR) from that Provider and see if it succeeds. We’ll use Tips #2 and #3 to help us accomplish Tip #1.
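
As a sketch of such a test resource, the function below applies a single Bucket. Several details are assumptions rather than something taken from this walkthrough: the apiVersion is the one I would expect from Upbound's GCP provider, "US" is just an example location, and the ProviderConfig is assumed to be named "default". The function name is hypothetical.

```shell
# Sketch: create one throwaway Bucket to exercise the ProviderConfig.
# ASSUMPTIONS: apiVersion matches Upbound's GCP provider, "US" is an
# acceptable location, and the ProviderConfig is named "default".
# Usage: create_test_bucket <bucket-name>
create_test_bucket() {
  local name="$1"
  kubectl apply --filename - <<EOF
apiVersion: storage.gcp.upbound.io/v1beta1
kind: Bucket
metadata:
  name: ${name}
spec:
  forProvider:
    location: US
  providerConfigRef:
    name: default
EOF
}
```

GCP bucket names are globally unique, so choose a name that is unlikely to collide, and delete the Bucket once the check is done.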

Tip #2: Use Kubernetes events to debug

To demonstrate the first troubleshooting tip, I purposefully misconfigured my ProviderConfig and then tried creating a GCP Bucket. Notice in the screenshot below that my bucket does not show a “ready” status. Now I can demonstrate the second tip: when in doubt, grab the Kubernetes events for the Crossplane object you need to debug! I can use the following kubectl command to grab events associated with that bucket:

kubectl get events --field-selector involvedObject.name=upbound-bucket-9269b6d08
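
When several objects share a name, the same query can be narrowed by kind and sorted so the newest event comes last. A small sketch, with a hypothetical function name:

```shell
# Sketch: events for one object, filtered by kind and name, oldest first.
# events_for is a hypothetical helper name.
# Usage: events_for <Kind> <name>
events_for() {
  kubectl get events \
    --field-selector "involvedObject.kind=$1,involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}

# Example: events_for Bucket upbound-bucket-9269b6d08
```

Managed Resources are cluster-scoped, and events for cluster-scoped objects are recorded in the default namespace, so add "--namespace default" if your kubectl context points somewhere else.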


The events contain a helpful hint: they report that my secret (which is referenced by my ProviderConfig) is not found. I am using UXP, Upbound’s distro of Crossplane, so I check which secrets exist in the upbound-system namespace, because Upbound’s Get Started docs reference this namespace:

If you are using upstream Crossplane, you would probably want to check the “crossplane-system” namespace instead.

We have confirmed the secret exists.
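
The check behind that confirmation is a plain listing of Secrets in whichever namespace Crossplane runs in. Here is a sketch that tries both namespaces mentioned above; the function names are hypothetical, and a custom install may use a different namespace entirely.

```shell
# Sketch: find the namespace Crossplane runs in and list its Secrets.
# crossplane_namespace and list_crossplane_secrets are hypothetical names.

# Print the first of the two well-known namespaces that exists.
crossplane_namespace() {
  local ns
  for ns in upbound-system crossplane-system; do
    if kubectl get namespace "$ns" >/dev/null 2>&1; then
      printf '%s\n' "$ns"
      return 0
    fi
  done
  return 1
}

# List the Secrets in that namespace.
list_crossplane_secrets() {
  local ns
  ns="$(crossplane_namespace)" || {
    echo "no Crossplane namespace found" >&2
    return 1
  }
  kubectl get secrets --namespace "$ns"
}
```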

Tip #3: Use kubectl describe to see rendered configs

The next thing to do is describe my ProviderConfig and see if that reveals anything. Here is another powerful troubleshooting tip: you can use “kubectl describe” on Crossplane objects to get events and a rendering of the submitted YAML config. In the screenshot below, we use this command (and trim out the middle of the output for brevity):

kubectl describe ProviderConfig default

💡Bonus tip: ‘kubectl describe’ automatically breaks up field names; it renders camelCase as "Camel Case". In the screenshot above, “secretRef” is rendered as “Secret Ref”.
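
The same comparison can be scripted instead of read off the describe output. The sketch below assumes the ProviderConfig stores its credentials reference under spec.credentials.secretRef with namespace and name fields, which is the layout the GCP provider uses; the function name is hypothetical.

```shell
# Sketch: check that the Secret a ProviderConfig points at really exists.
# ASSUMPTION: the reference lives at spec.credentials.secretRef.
# check_providerconfig_secret is a hypothetical helper name.
# Usage: check_providerconfig_secret [providerconfig-name]
check_providerconfig_secret() {
  local pc="${1:-default}" ns name
  ns="$(kubectl get providerconfig "$pc" \
    --output jsonpath='{.spec.credentials.secretRef.namespace}')"
  name="$(kubectl get providerconfig "$pc" \
    --output jsonpath='{.spec.credentials.secretRef.name}')"
  # Look for the Secret in exactly the namespace the ProviderConfig names.
  if kubectl get secret "$name" --namespace "$ns" >/dev/null 2>&1; then
    echo "found: Secret $ns/$name"
  else
    echo "missing: Secret $ns/$name"
    return 1
  fi
}
```

A "missing" result for a Secret you know exists elsewhere points at the namespace or name in the ProviderConfig rather than at the Secret itself.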

The culprit has been found: the ProviderConfig thinks the Secret exists in crossplane-system, but we know this is not the case (UXP is installed into upbound-system). We can fix this by updating our ProviderConfig. After doing that and getting the events of the Bucket now, we see this:


I had also purposefully misconfigured the private key in the Secret associated with my ProviderConfig. This error complains about an issue with the private key, which matches our expectation. If I fix the Secret and check the status of the bucket afterwards, I now see this:


Success! The resource is now showing a ready status and the event log shows the resource creation was requested successfully. If we wait a few seconds, the Bucket will be provisioned and will show up in the GCP console. This test using a simple MR revealed two separate errors. We successfully troubleshot both and have now confirmed the Provider is correctly configured.
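
Rather than polling by hand, you can block until the resource reports the Ready condition. A minimal sketch with a hypothetical function name and an arbitrary default timeout of 120 seconds:

```shell
# Sketch: block until a resource reports condition Ready, or time out.
# wait_until_ready is a hypothetical helper name.
# Usage: wait_until_ready <kind> <name> [timeout]
wait_until_ready() {
  kubectl wait "$1/$2" --for=condition=Ready --timeout="${3:-120s}"
}

# Example: wait_until_ready bucket upbound-bucket-9269b6d08
```

If more than one installed provider defines a Bucket kind, pass the fully qualified resource name as the first argument so kubectl picks the right one.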

Closing

These three tips will set you on the path to troubleshooting Crossplane and squashing those pesky bugs. In a future post, we’ll explore how to use these same tips to debug more complex objects like Composite Resources. For now, check out the troubleshooting resources in the Crossplane docs. Stay tuned!

Cheers,

Craig D Wilhite
