docs(self-edit): document synchronous /api/v1/self/deploy

Update API docs, recipes, design doc, deploy-pipeline architecture,
and deploy-logs ops doc to match the new synchronous behaviour
(commit 8505981). The endpoint now returns 200/500 with status,
durationMs, exitCode, errorSummary, and an inline logTail (last
~8KB) — no polling, no companion GET endpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliot M 2026-04-29 14:56:48 +00:00
parent 8505981889
commit e223ba45ec
5 changed files with 104 additions and 43 deletions


@ -87,4 +87,4 @@ The deploy service broadcasts these to the requesting user's connected tabs:
- `deploy.done` — { deployId, ok, exitCode, durationMs, errorSummary }
- `deploy.cancelled` — { deployId }
Self-edit calls use `requestedByUserId = '__self__'`, so WS broadcasts go nowhere — the response payload + log polling is enough.
Self-edit calls use `requestedByUserId = '__self__'`, so WS broadcasts go nowhere. Instead, `POST /api/v1/self/deploy` is **synchronous**: the route handler subscribes to `DeployService` events (via `service.waitForDone(deployId)`), awaits the terminal `done`/`cancelled` event, then responds once with the final row + a tail of the log file. The agent gets a single blocking answer — no polling, no WS subscription needed.


@ -133,24 +133,46 @@ Both endpoints live on **core-back** at `/api/v1/self/*` and are reachable from
`scope=auto` → `all` if `CIAL_UNRESTRICTED=1`, else `platform`. Explicit `scope=all` when `CIAL_UNRESTRICTED=0` returns 403 `unrestricted_required`.
**Success (202 Accepted)**:
**Synchronous** — handler awaits `DeployService.waitForDone(deployId)`
and responds once the build (and post-build restart) reach a terminal
state. No 202 / poll loop.
**Success (200 OK)**:
```json
{
"ok": true,
"deployId": "uuid",
"status": "queued" | "building",
"status": "ok",
"scope": "platform" | "all",
"logPath": "/cial/data/deploy-logs/<uuid>.log",
"buildFilter": ["@cial/platform-front", "@cial/platform-back"]
"durationMs": 42103,
"exitCode": 0,
"errorSummary": null,
"logTail": "<last ~8KB of /cial/data/deploy-logs/<deployId>.log>"
}
```
**Failure (500)**:
```json
{
"ok": false,
"deployId": "uuid",
"status": "error" | "cancelled" | "unknown",
"scope": "platform" | "all",
"durationMs": 12011,
"exitCode": 1,
"errorSummary": "error TS2345: …",
"logTail": "<last ~8KB of log>"
}
```
`status: "unknown"` occurs only when `waitForDone` hits its 10-minute
timeout; the build itself keeps running in the background.
**Errors**:
- `403 not_localhost` — request not from loopback.
- `403 unrestricted_required` — `scope=all` but flag off.
- `409 build_in_progress` — only when caller passes `?wait=false`; otherwise the runner's coalesce/queue applies.
Streaming logs continue to flow over the existing WS bridge (`DeployWsType.DeployLog`). The skill polls for completion (see SKILL.md outline).
Streaming logs still flow over the existing WS bridge (`DeployWsType.DeployLog`) for human-driven deploys, but self-edit calls bypass that — they get the `logTail` inline in the response.
### `POST /api/v1/self/restart`
@ -327,10 +349,8 @@ Trigger a build via the self-edit endpoint and wait for completion.
- Before invoking `cial:restart` on freshly-edited code.
## Steps
1. POST http://127.0.0.1:8080/api/v1/self/deploy with `{ "scope": "auto", "sessionId": "$CIAL_SESSION_ID" }`.
2. Capture the returned `deployId` and `logPath`.
3. Poll GET /api/v1/deploy/{deployId} every 2s until `status` ∈ {`ok`, `error`, `cancelled`}.
4. On error, read the last ~50 lines of `logPath` and surface the failure summary.
1. POST http://127.0.0.1:4000/api/v1/self/deploy with `{ "scope": "auto", "sessionId": "$CIAL_SESSION_ID" }` (use `--max-time 600`; the call blocks until the build is done).
2. Read the response. 200 → success. 500 → failure; the body's `errorSummary` + `logTail` carry the diagnostic; fix and re-run.
## Auth
None — endpoint is localhost-gated. Just curl it.


@ -32,6 +32,12 @@ tail -f /cial/data/deploy-logs/$DEPLOY_ID.log | grep '^ERR '
## Programmatic access
The synchronous `POST /api/v1/self/deploy` already returns the final
row + a `logTail` (last ~8KB) inline — for builds you trigger from
the agent shell, no extra call is needed.
For historical inspection (auth-gated, instance_admin):
```sh
# Snapshot of the deploy row (status, exitCode, errorSummary, logPath)
curl -sf http://127.0.0.1:4000/deploy/$DEPLOY_ID | jq .deploy
@ -46,7 +52,7 @@ Every failed deploy stores the last 20 lines of stderr in the `errorSummary` col
## Streaming via WebSocket
The deploy controller broadcasts `deploy.log` events to the requesting user's connected tabs. For the self-edit endpoints, `requestedByUserId = '__self__'`, so WS broadcasts go nowhere — use the file tail or REST polling instead.
The deploy controller broadcasts `deploy.log` events to the requesting user's connected tabs. For the self-edit endpoints, `requestedByUserId = '__self__'`, so WS broadcasts go nowhere — the synchronous response carries `logTail` (last ~8KB) inline, and the full log lives at `/cial/data/deploy-logs/<deployId>.log`.
## Cleanup


@ -11,7 +11,10 @@ No auth header. The localhost-only middleware (`core/back/src/modules/self/local
## `POST /api/v1/self/deploy`
Build and (on success) restart the matching scope.
Build and (on success) restart the matching scope. **Synchronous** — the
request blocks until the build (and post-build restart) reach a terminal
state, then returns one response with the verdict + log tail. No polling
required, no companion GET endpoint.
**Body** (all optional):
```json
@ -27,17 +30,45 @@ Build and (on success) restart the matching scope.
- `mode` controls the deploy mode persisted in the deploy table; defaults to whatever was last set via `PATCH /deploy/mode`.
- `sessionId` is recorded for log correlation only — never used for auth.
**Success — 202 Accepted**:
Use `--max-time 600` on curl: a full `scope=all` build can run several
minutes.
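A minimal sketch of one way to capture both the HTTP status and the JSON body from the single blocking call. The helper name `self_deploy` and the `SELF_DEPLOY_URL` override are hypothetical, not part of the project; the URL, headers, and 600-second ceiling are taken from this document.

```sh
# Hypothetical helper (not part of the project): POST to the self-deploy
# endpoint, write the JSON body to the file named by $1, and print only the
# HTTP status code on stdout.
self_deploy() {
  body_file=$1
  # SELF_DEPLOY_URL is an illustrative override; the default is the documented URL.
  url=${SELF_DEPLOY_URL:-http://127.0.0.1:4000/api/v1/self/deploy}
  # -o sends the response body to a file; -w prints the status code afterwards.
  curl -sS --max-time 600 -X POST "$url" \
    -H 'content-type: application/json' \
    -d '{}' \
    -o "$body_file" \
    -w '%{http_code}'
}
```

Called as `code=$(self_deploy /tmp/deploy.json)`, this leaves the response body on disk for later inspection while `$code` holds `200` or `500`. `-sf` is deliberately avoided here: with `-f`, curl would discard the body of a 500 response, which is exactly where `errorSummary` and `logTail` live.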
**Success — 200 OK**:
```json
{
"ok": true,
"deployId": "8b1a3c4e-…",
"status": "queued" | "building",
"scope": "platform" | "all"
"status": "ok",
"scope": "platform" | "all",
"durationMs": 42103,
"exitCode": 0,
"errorSummary": null,
"logTail": "OUT [@cial/platform-back] ✓ done in 1.2s\n…"
}
```
The build runs asynchronously. Poll completion at `GET /deploy/<deployId>` or subscribe to the WS deploy events. When the build completes, the matching restart fires automatically (no second call needed).
**Failure — 500**:
```json
{
"ok": false,
"deployId": "8b1a3c4e-…",
"status": "error" | "cancelled" | "unknown",
"scope": "platform" | "all",
"durationMs": 12011,
"exitCode": 1,
"errorSummary": "error TS2345: …",
"logTail": "<last ~8KB of /cial/data/deploy-logs/<deployId>.log>"
}
```
`logTail` is the last ~8KB of the deploy log file, prefixed by stream
(`OUT `/`ERR `). Use it to diagnose the failure inline; for the full
log, read `/cial/data/deploy-logs/<deployId>.log`.
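Because every line in `logTail` carries a stream prefix, the stderr lines can be separated with a plain `grep`. A minimal sketch, assuming the tail has already been decoded from JSON into real newlines; the function name `err_lines` is hypothetical.

```sh
# Hypothetical filter: read a decoded logTail on stdin and print only the
# lines that came from stderr (those prefixed with "ERR ").
err_lines() {
  # grep exits 1 when nothing matches; "|| true" keeps a clean build
  # (no ERR lines) from being reported as a command failure.
  grep '^ERR ' || true
}
```

For example, if the response body was saved to a file (the path is illustrative), `jq -r .logTail /tmp/deploy.json | err_lines` prints just the error lines.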
If the build doesn't reach a terminal state within 10 minutes the
endpoint returns 500 with `status: "unknown"` and `errorSummary:
"deploy did not reach terminal state"` — the build itself keeps
running in the background.
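The `status` values above could drive a caller's next move roughly as follows. This is only a sketch: the function name `next_step` and the action labels are made up for illustration, and only the status strings come from this document.

```sh
# Hypothetical mapping from the response's "status" field to a next step.
# The labels printed here are illustrative, not values the API returns.
next_step() {
  case "$1" in
    ok)        echo "done" ;;
    error)     echo "fix-and-rerun" ;;
    cancelled) echo "rerun" ;;
    unknown)   echo "watch-log" ;;   # timeout: the build is still running
    *)         echo "unexpected" ;;
  esac
}
```

The `unknown` branch is the one worth handling explicitly: the 500 in that case says nothing about the build's outcome, so re-triggering immediately would race a build that is still in flight.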
**Errors**:
- `403 not_localhost` — request not from loopback.
@ -78,14 +109,20 @@ When `scope=all` (and unrestricted), `edgeRestart` is `true` — the response is
- `403 unrestricted_required`.
- `503 supervisor_unreachable` — IPC socket missing or timed out.
## Polling for build completion
## Inspecting an old deploy
The synchronous self-edit response includes everything you need for
the deploy you just triggered. To inspect a previous deploy:
```sh
curl -sf http://127.0.0.1:4000/deploy/$DEPLOY_ID | jq .deploy.status
# → "queued" | "building" | "restarting" | "ok" | "error" | "cancelled"
curl -sf http://127.0.0.1:4000/deploy/$DEPLOY_ID | jq .deploy
# → { id, status, mode, exitCode, errorSummary, durationMs, … }
```
`ok` and `error` are terminal. The `errorSummary` field on the deploy row contains the last ~20 lines of stderr when `status=error`.
This endpoint is on the auth-gated admin router (`instance_admin`
Better-Auth session required) — not reachable via plain localhost
curl from the agent shell. Use `tail` on the log file instead, or
re-trigger the synchronous deploy.
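For the `tail` route, something like the following approximates the inline `logTail` for an older deploy. It is a sketch under assumptions: the helper name `deploy_log_tail` is hypothetical, and 8192 bytes is an assumed stand-in for "~8KB"; the log directory is the one named in this document.

```sh
# Hypothetical helper: print roughly the last 8 KB of one deploy's log file.
# $1 = deploy id, $2 = log directory (defaults to the path used in this doc).
deploy_log_tail() {
  log_dir=${2:-/cial/data/deploy-logs}
  # tail -c counts bytes, not lines; 8192 is an assumed value for "~8KB".
  tail -c 8192 "$log_dir/$1.log"
}
```

Usage: `deploy_log_tail "$DEPLOY_ID" | grep '^ERR '`. Since the cut is by byte count, the first line of the output may be partial.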
## Reading build logs


@ -8,21 +8,17 @@ Concrete copy-pasteable flows for the things you'll do often.
# 1. Edit the file
$EDITOR /cial/platform/front/src/components/Foo.tsx
# 2. Build + auto-restart
curl -sf -X POST http://127.0.0.1:4000/api/v1/self/deploy \
# 2. Build + auto-restart (synchronous: blocks until done)
curl -sS --max-time 600 -X POST http://127.0.0.1:4000/api/v1/self/deploy \
-H 'content-type: application/json' \
-d '{}'
# → { "ok": true, "deployId": "...", "status": "building", "scope": "platform" }
# Success → 200
# { "ok": true, "status": "ok", "scope": "platform", "durationMs": 42103,
# "exitCode": 0, "errorSummary": null, "logTail": "..." }
# Failure → 500
# { "ok": false, "status": "error", "errorSummary": "...", "logTail": "<8KB>" }
# 3. Poll completion
while true; do
status=$(curl -sf http://127.0.0.1:4000/deploy/$DEPLOY_ID | jq -r .deploy.status)
echo "$status"
case "$status" in ok|error|cancelled) break;; esac
sleep 2
done
# 4. Open the page — change is live
# 3. Open the page — change is live
```
Note: in dev mode, `next dev --turbopack` and `tsx watch` already auto-reload. The build is still useful for type-checking and for protocol/sdk consumers, but the restart is mostly a no-op for hot-reloaded code.
@ -38,12 +34,12 @@ Note: in dev mode, `next dev --turbopack` and `tsx watch` already auto-reload. T
$EDITOR /cial/core/ui/src/components/Sidebar/Sidebar.tsx
# 2. Build + restart everything (auto resolves to scope=all)
curl -sf -X POST http://127.0.0.1:4000/api/v1/self/deploy \
curl -sS --max-time 600 -X POST http://127.0.0.1:4000/api/v1/self/deploy \
-H 'content-type: application/json' \
-d '{}'
```
When the build completes, the deploy service automatically restarts platform + core + edge. The edge restart bounces the container — the curl polling will lose its connection. Wait ~10s and re-check `/healthz`.
When the build completes, the deploy service automatically restarts platform + core + edge. The edge restart bounces the container — the synchronous curl will lose its connection right after the response is sent. Wait ~10s and re-check `/healthz`.
## Restart only (no build)
@ -61,31 +57,33 @@ curl -sf -X POST http://127.0.0.1:4000/api/v1/self/restart \
$EDITOR /cial/core/protocol/src/sessions.ts # add field
$EDITOR /cial/core/back/src/... # use it
$EDITOR /cial/platform/front/src/... # use it
curl -sf -X POST http://127.0.0.1:4000/api/v1/self/deploy -d '{}'
curl -sS --max-time 600 -X POST http://127.0.0.1:4000/api/v1/self/deploy -d '{}'
```
In `--unrestricted`, the build filter graph picks up `@cial/protocol` first, then anything that depends on it (`@cial/back`, `@cial/front`, `@cial/sdk`, `@cial/platform-*`). One call rebuilds the chain.
## Cancel an in-flight build
The synchronous `/api/v1/self/deploy` blocks the caller, so the agent
won't normally need to cancel. Interrupting the curl (Ctrl-C) only drops
the connection; the build keeps running to completion in the background.
To force-cancel:
```sh
curl -sf -X POST http://127.0.0.1:4000/deploy/$DEPLOY_ID/cancel
```
(This goes through the existing `/deploy/*` admin endpoints, which require an `instance_admin` Better-Auth session — the dev autologin gives you one as `dev@local.test`.)
(This goes through the existing `/deploy/*` admin endpoints, which require an `instance_admin` Better-Auth session — the dev autologin gives you one as `dev@local.test`. Not reachable from the localhost-only agent shell.)
## Inspect the last build's stderr
The synchronous `/api/v1/self/deploy` response already includes
`logTail` (last ~8KB) and `errorSummary`. To pull the recent stderr
lines from the full log of any deploy:
```sh
tail -n 50 /cial/data/deploy-logs/$DEPLOY_ID.log | grep '^ERR '
```
Or the structured tail:
```sh
curl -sf http://127.0.0.1:4000/deploy/$DEPLOY_ID | jq .deploy.errorSummary
```
## Health check after edge restart
```sh