Commit 72ecb24

h4x3rotab and claude committed
replace test.sh with deploy.sh — full one-command setup
deploy.sh automates the entire flow:
- Deploy CVM via phala CLI
- Wait for SSH, extract kubeconfig
- Wait for k3s node Ready
- Wait for wildcard TLS certificate
- Deploy nginx test workload with IngressRoute
- Run a quick smoke test

README restructured: Quick Start now points to deploy.sh, manual steps moved into a collapsible details section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ea27712 commit 72ecb24

3 files changed

Lines changed: 303 additions & 186 deletions

k3s/README.md

Lines changed: 114 additions & 50 deletions
@@ -17,12 +17,97 @@ Run a single-node Kubernetes cluster inside an Intel TDX Confidential VM with wi
 npm install -g phala
 phala auth login
 ```
-- `kubectl` installed ([install guide](https://kubernetes.io/docs/tasks/tools/))
+- `kubectl` and `jq` installed ([kubectl install guide](https://kubernetes.io/docs/tasks/tools/))
 - A domain you control (for the wildcard certificate)
 - Cloudflare API token with **Zone:Read** and **DNS:Edit** permissions (see [DNS_PROVIDERS.md](../custom-domain/dstack-ingress/DNS_PROVIDERS.md) for other DNS providers)
 
 ## Quick Start
 
+The deploy script handles everything — CVM provisioning, kubeconfig extraction, certificate waiting, and a test workload:
+
+```bash
+export CLOUDFLARE_API_TOKEN=your-cloudflare-token
+export CERTBOT_EMAIL=you@example.com
+
+./deploy.sh k3s.example.com
+```
+
+Replace `k3s.example.com` with your actual domain. The script takes ~10 minutes (mostly waiting for the wildcard certificate). When done, it prints:
+
+```
+============================================
+k3s on dstack is ready!
+============================================
+
+Kubeconfig: export KUBECONFIG=/path/to/k3s.yaml
+kubectl: kubectl get nodes
+Test URL: https://nginx.k3s.example.com/
+Evidence: https://nginx.k3s.example.com/evidences/quote
+```
+
+You can then deploy your own services:
+
+```bash
+export KUBECONFIG=k3s.yaml
+kubectl run my-app --image=my-app:latest --port=8080
+kubectl expose pod my-app --port=8080
+kubectl apply -f - <<EOF
+apiVersion: traefik.io/v1alpha1
+kind: IngressRoute
+metadata:
+  name: my-app
+spec:
+  entryPoints: [web]
+  routes:
+    - match: Host(\`my-app.k3s.example.com\`)
+      kind: Rule
+      services:
+        - name: my-app
+          port: 8080
+EOF
+```
+
+### Clean Up
+
+Remove the test workload:
+
+```bash
+kubectl delete ingressroute.traefik.io nginx
+kubectl delete svc nginx
+kubectl delete pod nginx
+```
+
+Delete the CVM entirely:
+
+```bash
+echo y | phala cvms delete my-k3s
+rm k3s.yaml
+```
+
+### Configuration
+
+You can customize the deployment with environment variables:
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `CVM_NAME` | `my-k3s` | CVM name |
+| `INSTANCE_TYPE` | `tdx.medium` | Instance type |
+| `DISK_SIZE` | `50G` | Disk size |
+| `KUBECONFIG_FILE` | `k3s.yaml` | Output kubeconfig path |
+
+Example:
+
+```bash
+CVM_NAME=prod-k3s INSTANCE_TYPE=tdx.4xlarge DISK_SIZE=100G ./deploy.sh k3s.example.com
+```
+
+## Step-by-Step Guide
+
+If you prefer to run each step manually instead of using the deploy script:
+
+<details>
+<summary>Click to expand manual steps</summary>
+
 ### 1. Deploy the CVM
 
 ```bash
@@ -38,8 +123,6 @@ phala deploy \
   --wait
 ```
 
-Replace `k3s.example.com` with your actual domain. All subdomains under it (e.g., `nginx.k3s.example.com`) will get TLS automatically.
-
 The `--dev-os` flag enables SSH access (needed to extract the kubeconfig). The `--disk-size 50G` gives enough room for k3s images and workloads.
 
 The deploy command outputs an **App ID** and gateway info. Save the App ID (a 40-character hex string):
@@ -107,54 +190,37 @@ Retry until you see an HTTP response (a 404 is fine — it means TLS works but n
 HTTP/1.1 404 Not Found
 ```
 
-### 5. Deploy and Test a Workload
-
-Run the included test script to deploy an nginx pod, verify HTTPS, check evidence endpoints, and confirm kubectl access — all in one command:
-
-```bash
-./test.sh k3s.example.com
-```
-
-Expected output:
-
-```
-==> Deploying test workload...
-==> Waiting for pod to be ready...
-==> Running smoke tests...
-
-PASS: https://nginx.k3s.example.com/ returned 200
-PASS: TLS cert CN matches *.k3s.example.com
-PASS: /evidences/quote returned 200
-PASS: /evidences/cc_eventlog returned 200
-PASS: /evidences/raw_quote returned 200
-PASS: k3s API /version returned 200
-PASS: kubectl reports node Ready
-
-==> Results: 7/7 passed
-```
-
-The test workload stays running so you can try it yourself:
-
-```bash
-curl "https://nginx.k3s.example.com/"
-```
-
-### 6. Clean Up
-
-Remove the test workload:
+### 5. Deploy a Test Workload
 
 ```bash
-kubectl delete ingressroute.traefik.io nginx
-kubectl delete svc nginx
-kubectl delete pod nginx
+CLUSTER_DOMAIN=k3s.example.com
+
+kubectl run nginx --image=nginx:alpine --port=80
+kubectl expose pod nginx --port=80 --target-port=80 --name=nginx
+kubectl wait --for=condition=Ready pod/nginx --timeout=120s
+
+kubectl apply -f - <<EOF
+apiVersion: traefik.io/v1alpha1
+kind: IngressRoute
+metadata:
+  name: nginx
+spec:
+  entryPoints: [web]
+  routes:
+    - match: Host(\`nginx.${CLUSTER_DOMAIN}\`)
+      kind: Rule
+      services:
+        - name: nginx
+          port: 80
+EOF
+
+sleep 10
+curl -s "https://nginx.${CLUSTER_DOMAIN}/"
 ```
 
-Delete the CVM entirely:
+You should see the nginx welcome page served over HTTPS with a valid Let's Encrypt certificate.
 
-```bash
-echo y | phala cvms delete my-k3s
-rm k3s.yaml
-```
+</details>
 
 ## How It Works
 
@@ -203,9 +269,7 @@ External HTTPS traffic hits dstack-ingress, which terminates TLS using the wildc
 4. The decrypted HTTP traffic is forwarded to Traefik on port 80
 5. Traefik matches the `Host` header against IngressRoute rules and routes to the right pod
 
-## Configuration
-
-### Environment Variables
+## Environment Variables
 
 | Variable | Required | Description |
 |----------|----------|-------------|
@@ -310,7 +374,7 @@ phala ssh <app-id> -- "docker logs dstack-k3s-1 2>&1 | tail -30"
 ```
 k3s/
 ├── docker-compose.yaml    # k3s + kmod-installer + dstack-ingress
-├── test.sh                # One-command smoke test
+├── deploy.sh              # One-command deploy + setup
 ├── README.md
 └── manifests/
     ├── rbac.yaml          # Optional: scoped service account
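
Review note on the README's "deploy your own services" snippet: the IngressRoute is applied from an unquoted heredoc, so the backticks around the host name have to be escaped or the shell would treat them as command substitution. A minimal sketch of that pattern as a reusable function follows. `render_ingressroute` is a hypothetical helper name, not something this commit adds, and the default port of 80 is an assumption.

```bash
#!/usr/bin/env bash
set -euo pipefail

# render_ingressroute is a hypothetical helper, not part of the commit.
# It prints a Traefik IngressRoute for <app>.<cluster-domain> on stdout.
render_ingressroute() {
  local app="$1" domain="$2" port="${3:-80}"
  # Unquoted heredoc: ${...} is expanded, so the backticks around the
  # host are escaped to keep them literal in the rendered YAML.
  cat <<EOF
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: ${app}
spec:
  entryPoints: [web]
  routes:
    - match: Host(\`${app}.${domain}\`)
      kind: Rule
      services:
        - name: ${app}
          port: ${port}
EOF
}

render_ingressroute my-app k3s.example.com 8080
```

Piping the output into `kubectl apply -f -` would match what the README does inline; printing to stdout first makes the rendered manifest easy to inspect before applying it.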

k3s/deploy.sh

Lines changed: 189 additions & 0 deletions
@@ -0,0 +1,189 @@
+#!/usr/bin/env bash
+# Deploy k3s on dstack and set up kubectl access.
+#
+# Usage:
+#   export CLOUDFLARE_API_TOKEN=xxx
+#   export CERTBOT_EMAIL=you@example.com
+#   ./deploy.sh k3s.example.com
+#
+# Prerequisites:
+#   - phala CLI installed and authenticated (phala auth login)
+#   - kubectl and jq installed
+
+set -euo pipefail
+
+CLUSTER_DOMAIN="${1:-${CLUSTER_DOMAIN:-}}"
+CVM_NAME="${CVM_NAME:-my-k3s}"
+INSTANCE_TYPE="${INSTANCE_TYPE:-tdx.medium}"
+DISK_SIZE="${DISK_SIZE:-50G}"
+KUBECONFIG_FILE="${KUBECONFIG_FILE:-k3s.yaml}"
+
+if [[ -z "$CLUSTER_DOMAIN" ]]; then
+  echo "Usage: $0 <cluster-domain>"
+  echo " e.g. $0 k3s.example.com"
+  echo ""
+  echo "Required env vars:"
+  echo " CLOUDFLARE_API_TOKEN Cloudflare API token (Zone:Read + DNS:Edit)"
+  echo " CERTBOT_EMAIL Email for Let's Encrypt registration"
+  echo ""
+  echo "Optional env vars:"
+  echo " CVM_NAME CVM name (default: my-k3s)"
+  echo " INSTANCE_TYPE Instance type (default: tdx.medium)"
+  echo " DISK_SIZE Disk size (default: 50G)"
+  echo " KUBECONFIG_FILE Output kubeconfig path (default: k3s.yaml)"
+  exit 1
+fi
+
+: "${CLOUDFLARE_API_TOKEN:?CLOUDFLARE_API_TOKEN is required}"
+: "${CERTBOT_EMAIL:?CERTBOT_EMAIL is required}"
+
+for cmd in phala kubectl jq; do
+  command -v "$cmd" >/dev/null 2>&1 || { echo "Error: $cmd is required but not found"; exit 1; }
+done
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+# ---------- Step 1: Deploy the CVM ----------
+
+echo "==> Deploying CVM '${CVM_NAME}'..."
+phala deploy \
+  -n "$CVM_NAME" \
+  -c "${SCRIPT_DIR}/docker-compose.yaml" \
+  -t "$INSTANCE_TYPE" \
+  --disk-size "$DISK_SIZE" \
+  --dev-os \
+  -e "CLOUDFLARE_API_TOKEN=${CLOUDFLARE_API_TOKEN}" \
+  -e "CERTBOT_EMAIL=${CERTBOT_EMAIL}" \
+  -e "CLUSTER_DOMAIN=${CLUSTER_DOMAIN}" \
+  --wait
+
+# ---------- Step 2: Extract APP_ID and GATEWAY_DOMAIN ----------
+
+echo "==> Fetching CVM info..."
+CVM_JSON=$(phala cvms get "$CVM_NAME" --json 2>/dev/null)
+APP_ID=$(echo "$CVM_JSON" | jq -r '.app_id')
+GATEWAY_DOMAIN=$(echo "$CVM_JSON" | jq -r '.gateway.base_domain')
+
+echo " App ID: ${APP_ID}"
+echo " Gateway domain: ${GATEWAY_DOMAIN}"
+
+# ---------- Step 3: Wait for SSH and extract kubeconfig ----------
+
+echo "==> Waiting for CVM to boot (this takes 2-3 minutes)..."
+for i in $(seq 1 30); do
+  if phala ssh "$APP_ID" -- "echo ok" >/dev/null 2>&1; then
+    break
+  fi
+  if [[ $i -eq 30 ]]; then
+    echo "Error: SSH not available after 5 minutes"
+    exit 1
+  fi
+  sleep 10
+done
+
+echo "==> Extracting kubeconfig..."
+for i in $(seq 1 12); do
+  if phala ssh "$APP_ID" -- \
+      "docker exec dstack-k3s-1 cat /etc/rancher/k3s/k3s.yaml" \
+      2>/dev/null > "$KUBECONFIG_FILE" && [[ -s "$KUBECONFIG_FILE" ]]; then
+    break
+  fi
+  if [[ $i -eq 12 ]]; then
+    echo "Error: could not extract kubeconfig after 2 minutes"
+    exit 1
+  fi
+  sleep 10
+done
+
+# Rewrite API server URL to use the gateway TLS passthrough endpoint (-i.bak works with both GNU and BSD sed)
+sed -i.bak "s|https://127.0.0.1:6443|https://${APP_ID}-6443s.${GATEWAY_DOMAIN}|" "$KUBECONFIG_FILE"; rm -f "${KUBECONFIG_FILE}.bak"
+
+export KUBECONFIG="${KUBECONFIG_FILE}"
+
+# ---------- Step 4: Wait for node Ready ----------
+
+echo "==> Waiting for k3s node to become Ready..."
+for i in $(seq 1 30); do
+  STATUS=$(kubectl get nodes -o jsonpath='{.items[0].status.conditions[?(@.type=="Ready")].status}' 2>/dev/null || echo "")
+  if [[ "$STATUS" == "True" ]]; then
+    break
+  fi
+  if [[ $i -eq 30 ]]; then
+    echo "Error: node not Ready after 5 minutes"
+    exit 1
+  fi
+  sleep 10
+done
+
+echo "==> Node is Ready:"
+kubectl get nodes
+
+# ---------- Step 5: Wait for wildcard certificate ----------
+
+echo "==> Waiting for wildcard TLS certificate (this takes 5-8 minutes)..."
+CERT_TEST_URL="https://test.${CLUSTER_DOMAIN}/"
+for i in $(seq 1 60); do
+  HTTP_CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$CERT_TEST_URL" 2>/dev/null || true)  # curl -w already prints 000 on failure
+  if [[ "$HTTP_CODE" != "000" ]]; then
+    echo " Certificate is ready (got HTTP ${HTTP_CODE})"
+    break
+  fi
+  if [[ $i -eq 60 ]]; then
+    echo "Warning: certificate not ready after 10 minutes, continuing anyway"
+    break
+  fi
+  sleep 10
+done
+
+# ---------- Step 6: Deploy test workload ----------
+
+echo "==> Deploying nginx test workload..."
+NGINX_HOST="nginx.${CLUSTER_DOMAIN}"
+
+kubectl run nginx --image=nginx:alpine --port=80 --restart=Never 2>/dev/null || true
+kubectl expose pod nginx --port=80 --target-port=80 --name=nginx 2>/dev/null || true
+
+kubectl apply -f - <<EOF
+apiVersion: traefik.io/v1alpha1
+kind: IngressRoute
+metadata:
+  name: nginx
+spec:
+  entryPoints: [web]
+  routes:
+    - match: Host(\`${NGINX_HOST}\`)
+      kind: Rule
+      services:
+        - name: nginx
+          port: 80
+EOF
+
+kubectl wait --for=condition=Ready pod/nginx --timeout=120s
+sleep 10
+
+# ---------- Smoke test ----------
+
+echo ""
+echo "==> Smoke test..."
+HTTP_CODE=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "https://${NGINX_HOST}/" 2>/dev/null || true)
+if [[ "$HTTP_CODE" == "200" ]]; then
+  echo " PASS: https://${NGINX_HOST}/ returned 200"
+else
+  echo " WARN: https://${NGINX_HOST}/ returned ${HTTP_CODE} (cert may still be propagating)"
+fi
+
+# ---------- Done ----------
+
+echo ""
+echo "============================================"
+echo " k3s on dstack is ready!"
+echo "============================================"
+echo ""
+echo " Kubeconfig: export KUBECONFIG=$(pwd)/${KUBECONFIG_FILE}"
+echo " kubectl: kubectl get nodes"
+echo " Test URL: https://${NGINX_HOST}/"
+echo " Evidence: https://${NGINX_HOST}/evidences/quote"
+echo ""
+echo " To clean up:"
+echo " kubectl delete ingressroute.traefik.io nginx && kubectl delete svc nginx && kubectl delete pod nginx"
+echo " echo y | phala cvms delete ${CVM_NAME} && rm ${KUBECONFIG_FILE}"
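
Review note on the polling loops: the SSH, kubeconfig, node-Ready and certificate waits in deploy.sh all follow the same shape (try a command, sleep 10 seconds, give up after N attempts). One possible way to factor that out is sketched below. `retry` and `flaky` are hypothetical names used only for illustration and are not part of this commit; the demo uses a zero-second delay and a stand-in command so it runs without a CVM.

```bash
#!/usr/bin/env bash
set -euo pipefail

# retry is a hypothetical helper, not part of the commit.
# Usage: retry <max-attempts> <delay-seconds> <command> [args...]
# Runs the command until it succeeds; returns 1 once attempts run out.
retry() {
  local max="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= max; i++)); do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final failed attempt.
    if ((i < max)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Stand-in command for the demo: fails twice, succeeds on the third call.
calls=0
flaky() {
  calls=$((calls + 1))
  ((calls >= 3))
}

if retry 5 0 flaky; then
  echo "succeeded after ${calls} attempts"
fi
```

Under these assumptions, the SSH wait could be written as `retry 30 10 phala ssh "$APP_ID" -- "echo ok" >/dev/null 2>&1 || { echo "Error: SSH not available after 5 minutes"; exit 1; }`. The certificate wait would need a small wrapper function, since it compares the captured HTTP code rather than relying on an exit status.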
