Commit Graph

12 Commits

Author SHA1 Message Date
7b2f2fae30 Accept Tessera (Keycloak-compatible) OIDC tokens as API bearer
Adds an additive bearer-verification path: verify RS256 access tokens against
Tessera's JWKS (iss/aud/exp), map sub/preferred_username/email + roles
(realm_access.roles, resource_access.<audience>.roles) to the app's identity.
Existing auth (API keys / app JWTs / sessions) is unchanged. Issuer + audience
are env-configurable. Validated end-to-end against the local sim.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 15:11:30 +01:00
b02b1706b6 fix(oidc): clamp post_login_redirect to CORS allow-list (open-redirect)
Closes the open-redirect risk surfaced by the v0.3.x security audit:
the OIDC callback handler was using oidc_config.post_login_redirect
verbatim as the redirect base on both error and success paths. An
attacker with admin-key compromise (or a misconfigured operator) could
set that field to an external domain and turn /api/auth/oidc/callback
into a phishing redirector ('dialectic login failed → re-enter
password at evil.com/login').

Fix:
  - New AuthHandler.safeRedirectBase(raw) validator:
      * empty → '/'
      * relative path starting with '/' (but not '//') → keep as-is
      * absolute URL whose host is in the allow-list → keep as-is
      * everything else → '/'
  - allow-list sourced from cfg.CORSAllowOrigins (the same set we
    already trust for browser CORS), threaded through NewAuthHandler.
  - Applied on BOTH the error branch and success branch of Callback.

Also: drop redundant newline on cmd/dialectic-cli usage Fprintln so
go test ./... passes.

No behavior change for happy path: prod's PostLoginRedirect is
'https://dialectic.hangman-lab.top/oidc/callback', which matches the
CORS allow-list ('https://dialectic.hangman-lab.top'), so the
validator returns it unmodified.
2026-05-24 10:03:53 +01:00
2463129dbd feat(oidc): backend-mediated OIDC login + session cookies + cli config
Adds the OpenID Connect login flow Dialectic.Frontend will drive. Pattern
mirrors Fabric.Backend.Center: SPA → /api/auth/oidc/start → IdP →
/api/auth/oidc/callback → 302 to SPA with one-time ticket in URL fragment
→ SPA POST /api/auth/oidc/exchange → HttpOnly session cookie set.

What's added:

  - internal/oidc/service.go — runtime OIDC service:
    * BuildAuthorizeURL (PKCE S256 + random state, 10min ttl)
    * HandleCallback (token exchange + ID token verify + ticket mint, 60s ttl)
    * ExchangeTicket (ticket → session JWT, HS256 24h)
    * VerifySession (cookie validation)
    * GetConfig/SetConfig with sync.Map-backed state/ticket stores
    * SweepExpired (call from background goroutine; clears stale entries)
  - internal/db/migrations/004_oidc_config.sql — single-row oidc_config
    table (issuer/client_id/client_secret/redirect_uri/post_login_redirect/
    scopes/enabled). Runtime-mutable via dialectic-cli.
  - internal/httpapi/handlers/auth.go — 5 endpoints:
    GET  /api/auth/oidc/status   — { enabled }
    GET  /api/auth/oidc/start    — 302 to IdP
    GET  /api/auth/oidc/callback — IdP returns; we 302 to SPA with ticket
    POST /api/auth/oidc/exchange — ticket → cookie + user
    GET  /api/auth/me            — current session user (401 if anon)
    POST /api/auth/logout        — clears cookie
  - internal/auth: replaces the OIDCBrowser Phase-2C stub with one that
    reads the session cookie via SessionVerifier; keeps dev-bypass
    behind cfg.OIDCOnly gate (set OIDC_ONLY=true in prod to disable
    dev-bypass entirely)
  - cmd/dialectic-cli/main.go — new binary; subcommand
    'config oidc [--issuer ... --client-id ... --client-secret ...
                  --callback-url ... --enabled true|false]'
    Runs against same DB the backend uses; reachable via
    'docker exec dialectic-backend dialectic-cli config oidc ...'
  - Dockerfile: build both binaries; put on PATH for docker exec

Config:

  - SESSION_SIGNING_KEY env: required in prod, ephemeral random in dev.
    HS256 secret for session JWTs. Stable across restarts (rotation
    invalidates every session — kill switch).
  - OIDC_ONLY env: 'true' disables the dev-bypass path entirely; use
    in prod once OIDC is configured.
  - OIDC_ISSUER + OIDC_CLIENT_ID env are no longer required at boot —
    they're advisory bootstrap values for the oidc_config DB row.

Deps:
  - github.com/coreos/go-oidc/v3 (discovery + JWKS verify)
  - golang.org/x/oauth2 (token exchange + PKCE)
  - github.com/golang-jwt/jwt/v5 (session JWT)
  - Bumped go.mod toolchain to 1.25.

Pairs with Dialectic.Frontend (next commit) which removes the
/agents/:id admin page and adds the login button + /oidc/callback
SPA route + AuthProvider that talks to these new endpoints.
2026-05-24 01:40:36 +01:00
0b16b52ee7 feat(admin): GET /api/admin/agents/{id} activity summary
New endpoint for operator diagnostics (used by the Dialectic.Frontend
AgentActivity page). Same x-dialectic-admin-key gate as
ProvisionAgentKey. Returns:

  - key_provisioned (bool) + last_used_at if available
  - signups_count / arguments_count / verdicts_count
  - recent_topics[]: up to 20 topics the agent touched in any role
    (volunteer → camp-allocated → pro/con poster → judge),
    deduped by (topic_id, role), most recent action_at first

Implementation: 3 small COUNT queries + one UNION-ALL across signups +
camps + arguments + verdicts joined to topics. Caps at 20 rows; bounded
by per-table indexes on agent_id / posted_at / created_at. <50ms at
current sim row counts.

No new tables / migrations. Roll out: re-deploy backend; frontend
prompts for the admin key on first visit and stores in localStorage.
2026-05-24 00:14:18 +01:00
5cf4302d50 refactor(backend): drop backend-driven Fabric broadcast — agent-driven model
The backend no longer broadcasts topic lifecycle events to Fabric. The
new model (per design discussion 2026-05-23 evening):

  - Proposing agent posts a single recruitment fabric-send-message
    immediately after creating a topic (carries topic_id + signup
    window + debate window + title).
  - Downstream agents that decide to participate book a HF on_call
    slot covering the debate window via `hf calendar schedule on_call
    <time> <duration> --job DEBATE-<topic_id>`.
  - HF wakes the agent naturally at slot start; the wake payload
    carries event_data with the DEBATE-<topic_id> code so the agent
    knows why it was woken.
  - The backend stays a pure data + state-machine service and doesn't
    know about Fabric.

Code removed:

  - internal/fabric/announce.go (entire file + empty dir)
  - ticker.go: broadcastLifecycle + broadcastAnnouncement + topicTarget
    helpers; announcer field on Ticker; announce field/arg on NewTicker
  - models/topic.go: AnnounceGuildBaseURL + AnnounceChannelID fields
  - store/topic_store.go: same fields on CreateTopicInput + INSERT
  - handlers/topics.go: same fields on createTopicBody + validation +
    parameter passing to store
  - handlers/verdict.go: announcer field + lifecycle broadcast on
    verdict submit
  - config/config.go: FabricSystemAPIKey field + DIALECTIC_FABRIC_SYSTEM_API_KEY
    env read
  - main.go + routes.go: announcer wiring

Database:

  - migrations/003_drop_topic_announce_target.sql drops the two columns
    added by migration 002. Counterpart commit on the deployment side
    needs DIALECTIC_FABRIC_SYSTEM_API_KEY env removed from
    docker-compose.yml; harmless if left as the backend no longer
    reads it.

Pairs with:
  - Dialectic.OpenclawPlugin: rip announce_* params from
    dialectic_propose_topic (next commit)
  - Fabric.Backend.Center: rip serviceEndpoint field + cli
  - Fabric.Backend.Guild: rip system-key bypass on ApiKeyGuard and
    announce-only-system limit on messaging.controller
  - ClawSkills: rewrite participate-debate + analyze-intel step 4 +
    delete rotate-fabric-system-key workflow
2026-05-23 23:45:22 +01:00
22d9fb7ed5 feat(topics): GET /api/topics/{id} returns camps array
Adds the camps allocation array to topic_detail responses so an agent
can locate which camp they're in (pro/con/judge) in a single round-trip
instead of needing a separate endpoint call. Camps are 0 rows
pre-signup_close, exactly 3 rows after — small enough to inline always.

Backward-compatible: the existing Topic fields remain top-level on the
response; `camps` is a sibling array. Callers reading e.g. response.title
or response.status continue to work unchanged.

Arguments are deliberately NOT inlined here — they can grow to many KB
per topic, and most callers (list view, status check, signup intent
resolution) don't need them. Use the new `dialectic_list_arguments`
plugin tool against GET /api/topics/{id}/arguments when you actually
need the transcript.

E2e verified on sim: judge agent successfully called topic_detail to
get camps + list_arguments to get transcript + submit_verdict citing
the actual pro/con argument content (no more 'tie because I saw no
arguments' false readings).
2026-05-23 22:03:49 +01:00
a43ff2de62 feat: per-topic announce target (move guild+channel from env to topic row)
Operator decision: backend env hard-coding a single guild/channel was
wrong because (a) one Center can host many guilds and (b) one guild
can have many announce channels for different purposes. The
proposing agent now chooses where this topic's lifecycle events go,
passed as create-topic params and stored on the topic row.

Schema migration 002:
- ALTER topics ADD announce_guild_base_url VARCHAR(255) NULL,
                  announce_channel_id     VARCHAR(64)  NULL.
- Both nullable; one-of-two is rejected at POST time; both null =
  topic creator opted out of broadcasts (announcer skips with log).

handlers/topics.go: createTopicBody adds announce_guild_base_url +
announce_channel_id; validates both-or-neither.

fabric/announce.go: rewritten signature. NewAnnouncer takes only
the system api key. PostTopicAnnouncement + PostLifecycleEvent take
a Target {GuildBaseURL, ChannelID} per call. Zero-value Target -> skip.

orchestrator/ticker.go: new helper topicTarget(topic) extracts the
target from the topic row; all broadcasts route through it.

verdict.go: same per-topic target extraction at completion.

config: removed FabricGuildBaseURL, FabricAnnounceChannelID,
FabricBotBearerToken from the Config struct + env reads.
FabricSystemAPIKey env renamed to DIALECTIC_FABRIC_SYSTEM_API_KEY
to disambiguate from the Fabric backend's own
FABRIC_BACKEND_GUILD_SYSTEM_API_KEY (operator: paste the same value
into both - one says "I am the system caller", the other says "I
accept this caller as system").

FABRIC_BOT_BEARER_TOKEN is gone entirely. The upgraded Guild
ApiKeyGuard accepts x-fabric-system-key alone for announce posts;
no per-user Bearer needed. Pairs with the matching change on
nav/Fabric.Backend.Guild commit 985b06a.
2026-05-23 17:53:30 +01:00
b2a0cac460 feat: lifecycle broadcasts on signup_closed / cancelled / debating / completed
Phase 3 push-wakeup mechanism without adding a new push channel.
Topic state transitions now post short messages to the same Fabric
announce channel used for the initial signup announcement. Agents
subscribed to announce + not currently busy get woken via the
existing Phase 1 inbound path; busy-discard already filters
appropriately. No SSE, no per-agent DM fanout, no plugin changes —
reuses existing infra end-to-end.

Changes:
- ticker.go: after signup_close transition, broadcasts signup_closed
  (with pro/con/judge agent IDs + debate-start time) OR cancelled
  (with reason). After debate_start transition, broadcasts debating
  with debate-end time.
- announce.go: new PostLifecycleEvent helper - same headers/auth as
  PostTopicAnnouncement, different format.
- verdict.go: after successful judge submission, broadcasts completed
  with the judge id. Best-effort + async so a slow Fabric does not
  slow the judge response.
- routes.go: instantiates the announcer once + passes to VerdictHandler.

Workflow participate-debate step 5 should be updated to expect
wakeups instead of polling - separate follow-up edit on lyn/ClawSkills.
2026-05-23 15:02:58 +01:00
15bb942d9b feat: POST /api/admin/agent-keys — system-keyed raw key minting
New admin endpoint for provisioning per-agent dialectic API keys
during recruitment. Auth via separate x-dialectic-admin-key header
matching env DIALECTIC_ADMIN_API_KEY (not bearer — admin lifecycle
is independent of agent identity).

Behavior:
- Body {agent_id, force?}; generates 32-byte hex raw key; stores
  sha256-peppered hash in agent_keys; returns raw key (ONLY time
  exposed — caller stores in agent secret-mgr)
- 409 on existing agent_id unless force:true (rotates the hash,
  clears last_used_at + revoked_at)
- Closed-by-default: if DIALECTIC_ADMIN_API_KEY env is empty, every
  request 401s

Caller pattern: skills/dialectic-hangman-lab/scripts/dialectic-ctrl
(to be added) reads admin key from
/root/.openclaw/system-secrets/dialectic-admin-key on the openclaw
host, POSTs to admin endpoint, stores returned raw key in the proxy-
for agent secret-mgr (inherits the proxy-pcexec context from
recruitment/onboard).

Unblocks Phase 3.5 plan to provision all prod agents and integrate
into recruitment skill.
2026-05-23 14:53:39 +01:00
03b89a547c fix(db,topics): time.Time params for TIMESTAMP + comment-aware SQL split
Two fixes surfaced by sim e2e test (which otherwise passed full
lifecycle: created → signup_open → 3 signups → allocator → debating
→ arguments → verdict gate (409 early, 201 after debate_end_at) →
completed).

1) MySQL TIMESTAMP rejects RFC3339-with-Z strings — passing those as
   sqlx parameters fails with "Incorrect datetime value". Changed
   CreateTopicInput lifecycle fields from string to time.Time; the
   handler parses+UTCs in validateLifecycleTimes (which now returns
   the parsed array along with the validation result) and passes
   time.Time to the store. The mysql driver formats correctly.

2) splitSQL was naive `strings.Split(s, ";")` which split inside
   comments — the 001 migration had a few `--` lines containing `;`
   ("signup_close_at; immutable", etc) which broke. Migration text
   tidied to not use `;` inside comments, AND splitSQL upgraded to
   skip both `-- ...` and `/* ... */` comment regions before splitting.

Sim verified — clean apply on fresh MySQL.
2026-05-23 12:24:13 +01:00
57a1fa1b33 feat: Phase 2D — orchestrator, arguments/verdict endpoints, fabric announce
State machine driver + camp allocator + judge-submitted verdicts +
broadcast hook to Fabric announce channel.

internal/orchestrator/
- allocator.go: pure function implementing the 3-camp rule from the
  2026-05-23 design session — for each camp (pro/con/judge), random
  pick from volunteers; backfill unfilled camps from remaining
  unallocated signups if pool is large enough; <3 final → cancel
  with diagnostic reason. rng injected for test determinism.
- allocator_test.go: 7 tests covering empty/insufficient/single-volunteer
  /multi-volunteer-no-dup/backfill/insufficient-backfill/large-pool
  distinctness invariants. All pass.
- ticker.go: scans every 15s (configurable via ORCHESTRATOR_TICK_INTERVAL),
  drives 3 state transitions atomically:
    created → signup_open (post fabric announcement async)
    signup_open → signup_closed | cancelled (run allocator, write camps)
    signup_closed → debating (open round 0)
  debating → completed is driven by the verdict POST handler (the
  implicit "judging" sub-state is captured by the gate
  status==debating AND now>=debate_end_at). Per-topic transitions
  use SELECT FOR UPDATE so concurrent ticker instances are safe.

internal/fabric/announce.go: HTTP client posting to a Guild announce
channel using x-fabric-system-key header (the Phase 1 gate). Wraps
the formatted topic announcement (title/summary/timing/schema). All
4 config fields required to enable; any missing → no-op with log
(orchestrator runs fine without Fabric coupling for dev).

internal/store/{round,camp,argument,verdict}_store.go: CRUD layer
for the remaining v2 entities. CampStore.WriteAllocation accepts a
tx so the orchestrator can wrap allocator+camps+status into one
atomic transition.

internal/httpapi/handlers/arguments.go:
- POST /api/topics/{id}/arguments — agent posts during debate. Gates:
  agent must be in a camp on this topic; status==debating; content
  nonempty and <=32KB; attached to latest open round.
- GET /api/topics/{id}/arguments — full transcript, visibility-gated.

internal/httpapi/handlers/verdict.go:
- POST /api/topics/{id}/verdict — judge submits. Gates: caller==judge
  camp; status==debating AND now>=debate_end_at; verdict valid JSON;
  rationale required. On success: writes verdicts row (unique on
  topic_id → 409 on dup) and flips topic.status to completed.
- GET /api/topics/{id}/verdict — visibility-gated.

config: 5 new env vars — FABRIC_GUILD_BASE_URL,
FABRIC_ANNOUNCE_CHANNEL_ID, FABRIC_SYSTEM_API_KEY,
FABRIC_BOT_BEARER_TOKEN, ORCHESTRATOR_TICK_INTERVAL.

routes.go: wired new handlers — POST signups/arguments/verdict gated
on agent bearer; GET arguments/verdict on optional-auth chain
(public topics readable anonymously).

main.go: instantiates announcer + ticker; ticker.Run in a goroutine
sharing the lifetime ctx.

go vet + gofmt clean; 7/7 allocator tests pass; 12M static binary.

Next: Phase 2E (deploy to t3 with nginx + CF origin cert) or
Phase 2D.5 (SSE stream for live transcript subscribers).
2026-05-23 12:02:27 +01:00
e706f3d6ef feat: greenfield Go rewrite (Phase 2A + 2B + 2C core)
Replaces the Python v1 (preserved on archive/python-v1 branch).

Stack: Go 1.23 + chi router + sqlx + MySQL 8. Distroless static
container. 12-factor config from env. Embedded SQL migrations.

Schema (internal/db/migrations/001_init.sql):
- topics: 议题 with 4-timestamp lifecycle (signup_open/close +
  debate_start/end), visibility (default private), status state machine,
  verdict_schema FK
- signups: agent self-enrollment with willing_camps (JSON array of
  pro|con|judge), pre_validated audit flag, (topic,agent) unique
- camps: post-allocation lock (one row per topic+camp) — written by
  Phase 2D allocator
- rounds + arguments: chronological debate transcript
- verdicts: judge structured output, one per topic, with token-cost
  trail for future budgeting
- agent_keys + system_keys: peppered sha256 hashes, never raw
- verdict_schemas: seeded with binary, claim-resolution (for
  analyze-intel), policy-recommendation, free-form

Auth (internal/auth):
- AgentAPIKey: real bearer-token middleware against agent_keys;
  best-effort last_used_at touch on success
- OIDCBrowser: Phase 2 stub. Dev mode accepts x-dev-bypass header
  (constant-time compare); prod 401s with a Phase-4-pending hint.
  Real Keycloak JWKS verification lands with the frontend rewrite.

HTTP API (internal/httpapi):
- /api/healthz — db ping + version + uptime
- GET /api/topics — list with status/visibility/limit/offset filters;
  anonymous callers see public only
- GET /api/topics/{id} — visibility-gated (private → 404 hide)
- POST /api/topics — create with RFC3339 lifecycle validation
  (signup_open < signup_close <= debate_start < debate_end)
- PUT /api/topics/{id}/visibility — dialectic-admin role gate
- POST /api/topics/{id}/signups — agent self-enroll; rejects when
  topic.status != signup_open OR outside signup window; idempotent
  upsert per (topic, agent)
- GET /api/topics/{id}/signups — list (any authed caller)

Auth chains:
- optionalAuth: try bearer → try oidc → fall through anonymous
  (handlers branch on Caller.Kind == ""). Uses captureWriter to demote
  inner 401s to "try next" without leaking response bytes.
- requireAnyAuth: chain that 401s if neither succeeds.
- requireAgent: strict bearer-only (signup POST).

Run: `docker compose -f docker-compose.dev.yml up --build`. Migrations
auto-apply on first connect; idempotent on reboot. README documents
env vars, dev bypass usage, agent-key provisioning SQL, and the
Phase 2D/E/3/4/5 roadmap.

go vet clean, gofmt clean, single 11M static binary.
2026-05-23 11:51:55 +01:00