Last week, we looked at setting up a Kubernetes cluster on three Jetson Nanos to prepare them for application deployment. This week we tackled that deployment, and in today’s blog post we will look at the obstacles we encountered along the way.
For these examples, we have been trying to deploy a test application using the instructions from the Rancher docs, which rely on a pre-built container provided by Rancher. To recreate my deployment, do the following:
- Create a file called testdeploy.yaml and paste the following inside:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: mysite
      labels:
        app: mysite
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: mysite
      template:
        metadata:
          labels:
            app: mysite
        spec:
          containers:
            - name: mysite
              image: kellygriffin/hello:v1
              ports:
                - containerPort: 80
- Once you’ve saved the file, use the following command to run it.
- View your deployment using:
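For the last two steps, here is a minimal sketch that assumes the standard kubectl workflow and a kubectl that is already pointed at the cluster. The function name `deploy_and_view` is my own hypothetical helper and does not come from the Rancher docs.

```shell
# Hypothetical helper: apply a manifest, then list what it created.
# Assumes kubectl is installed and configured for the Jetson cluster.
deploy_and_view() {
  manifest="$1"
  kubectl apply -f "$manifest" || return 1  # create or update the Deployment
  kubectl get deployments                    # one row per Deployment, with a READY column
  kubectl get pods -o wide                   # STATUS column shows Running, CrashLoopBackOff, etc.
}
```

Calling `deploy_and_view testdeploy.yaml` runs the three commands in order. Running each kubectl line by hand works just as well.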
The CrashLoopBackOff error:
If you are running the above steps on a Jetson Nano cluster as well, you may see the common CrashLoopBackOff error.
This error is well documented for Kubernetes in general, and is often caused by insufficient resources or by an application trying to access a locked file or a locked database. However, in the case of the Jetson Nano cluster, I believe the reason for the CrashLoopBackOff error is the 64-bit ARM architecture (arm64, also called AArch64) that the Nano runs on, the same architecture family used by the new M1 MacBook Pro. A lot of pre-made containers are built only for x86-64 (amd64) or even 32-bit x86, which makes them incompatible with arm64.
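One way to check the architecture theory, sketched here with hypothetical helper names, is to compare what the nodes report with what the crashed container logged. When a binary built for another architecture is started, the container logs typically contain an `exec format error` message. I am assuming kubectl access to the cluster. The pod name is whatever `kubectl get pods` shows.

```shell
# Print each node's name and the CPU architecture reported by its kubelet.
node_arch() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.architecture}{"\n"}{end}'
}

# Show the logs of the previous (crashed) container in the given pod.
crash_logs() {
  kubectl logs "$1" --previous
}

# Translate `uname -m` output into the architecture names used by image registries.
registry_arch() {
  case "$1" in
    aarch64|arm64) echo arm64 ;;
    x86_64|amd64)  echo amd64 ;;
    armv7l)        echo arm ;;
    i386|i686)     echo 386 ;;
    *)             echo "$1" ;;   # unknown values are passed through unchanged
  esac
}
```

For example, `registry_arch "$(uname -m)"` run on a node gives the name to look for in an image's list of supported platforms. If the image only lists amd64 and the node reports arm64, the mismatch is a likely culprit.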
To test this theory, I tried to deploy Nginx on the cluster. Nginx should work because its official image is published for arm64 as well as x86-64. I ran the following to create a deployment of Nginx using the Nginx image.
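A minimal sketch of such a deployment, assuming the official `nginx` image from Docker Hub and a deployment simply named `nginx` (the helper name is hypothetical):

```shell
# Hypothetical helper: create a Deployment called "nginx" from the official image,
# then list its pods. `kubectl create deployment` labels the pods app=<name>.
deploy_nginx() {
  kubectl create deployment nginx --image=nginx || return 1
  kubectl get pods -l app=nginx
}
```

Running `deploy_nginx` once is enough. A second call fails at the create step because the deployment already exists.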
After running kubectl get pods, we can see that the Nginx pod is running fine on the cluster, while the mysite pod from testdeploy.yaml continues to crash and restart over and over.
Many pre-made containers available for testing are built for architectures other than arm64, and that is what appears to have caused the frustrating CrashLoopBackOff error here. As long as your Nginx deployment works, your cluster should be working. Deploying other images properly will just take some learning and a fair amount of trial and error.
The ImagePullBackOff error:
The ImagePullBackOff error is a very finicky error. I encountered it the first few times trying to deploy the Nginx pod using the steps outlined above.
At first I had no idea how to get around it, and I assumed it was a problem with the place it was pulling the image from. Maybe the image was locked or corrupted, I thought. But after retrying the deployment multiple times, I let it run for 8 or so minutes. It retried pulling several times on its own, and then one time it just worked. If you’re encountering this error, I recommend letting it run for at least 10 or so minutes before trying something else.
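ImagePullBackOff means the kubelet is retrying the pull with an increasing delay between attempts, which fits the behaviour described above. A slow or flaky connection is one possible explanation, though I never confirmed the cause. As a sketch with hypothetical helper names, the following shows how to read the pull errors and how to wait out the retries without watching the pod list by hand:

```shell
# Print only the Events section for a pod; pull failures record their error message there.
pull_events() {
  kubectl describe pod "$1" | sed -n '/^Events:/,$p'
}

# Block until the named Deployment finishes rolling out,
# or give up after the timeout (second argument, default 10 minutes).
wait_for_rollout() {
  kubectl rollout status "deployment/$1" --timeout="${2:-10m}"
}
```

For instance, `wait_for_rollout nginx` returns as soon as the pod is ready, and `pull_events` with the pod's name shows what the registry reported if it never gets there.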
Those are a few of the more problematic errors I encountered when trying to get to the first deployment. The next step will be to deploy a machine learning model on the cluster. This will require a lot more in terms of containerization, and we will explore that containerization and the creation of the model itself next week.