Pattern — WebSocket¶
Type: Pattern Long-lived connection lifecycle — WebSocket / SSE / WebRTC peer — modelled as a state machine that owns the socket. Convention, not Spec.
Code samples are in ClojureScript (the CLJS reference). The pattern itself is host-agnostic.
:rf.ws/*(WebSocket connections) is a managed external effect — per Managed-Effects, the surface MUST satisfy the eight properties (effect-as-data, framework-owned socket-actor lifecycle, structured failure taxonomy under:rf.ws/*, trace-bus observability,:sensitive?/:large?composition, built-in retry / abort / teardown via the connection state machine, in-flight socket-actor registry, per-frame interceptor scoping).
Role¶
A named pattern, not a Spec. WebSockets do not fit Pattern-AsyncEffect: they are state-machine-shaped — a long-lived connection with retry, exponential backoff, server-pushed events, heartbeat, subscription management, message correlation, queued sends when disconnected, and re-auth on reconnect. The natural canonical answer is a state machine that owns the connection lifecycle.
This doc names that machine's standard shape so per-app instances cite a single canonical description rather than re-deriving the lifecycle each time.
Why WebSocket is not Pattern-AsyncEffect¶
Pattern-AsyncEffect is "post work, await reply, dispatch result, commit." It is a one-shot interaction. WebSocket has none of those bounds:
- The connection itself has phases —
:disconnected,:connecting,:authenticating,:connected,:reconnecting,:failed. Each is a distinct state with distinct allowed transitions. - The connection lasts longer than any single message; messages flow in both directions while in
:connected. - Retry-with-backoff requires a timer mechanism and a counter; the canonical answer is
:after+:always+ machine-scoped guards (per 005). - Subscription state — which topics the app is subscribed to — must survive reconnects. The machine carries it in
:data. - Server-pushed events arrive without a corresponding request; they are dispatched events landing in the running-app machinery.
Treat individual messages over an open WebSocket as Pattern-AsyncEffect interactions when they are request-reply (correlation-id keyed). Treat the connection as a state machine.
This pattern applies equally to Server-Sent Events (EventSource) and WebRTC peer connections — they share the long-lived-connection-with-lifecycle shape. Differences are mostly in the wire format and server-pushed-vs-bidirectional message semantics; the state machine shape is identical.
The connection state machine¶
The canonical states form a hierarchical machine. :connecting, :authenticating, and :connected sit under a single compound parent :active because they share one critical invariant: the live socket actor must outlive all three. Anchoring the :invoke on the parent — not on :connecting — keeps the actor alive across the success-path transitions (:connecting → :authenticating → :connected) without re-spawning a fresh socket each time the leaf changes.
| State | Meaning |
|---|---|
:disconnected |
No socket; not yet attempted, or destroyed cleanly. |
:active |
Compound parent; owns the :websocket/socket :invoke. Leaves: :connecting, :authenticating, :connected. |
:reconnecting |
Connection lost; waiting on :after backoff before re-attempt. Socket actor has been destroyed. |
:failed |
Max retries exceeded; manual recovery only. Terminal until external [:ws/connect ...] dispatched. |
Standard transitions¶
:disconnected --:ws/connect--> :active / :connecting
:active / :connecting --:ws/opened--> :active / :authenticating
:active / :authenticating --:ws/auth-ok--> :active / :connected
:active / :authenticating --:ws/auth-failed--> :failed
:active / * --:ws/closed--> :reconnecting
:active / * --:ws/fatal--> :failed
:reconnecting --:after backoff--> :active / :connecting
:reconnecting --:always max-retries--> :failed
:failed --:ws/connect--> :active / :connecting
Per 005 §Transition resolution — deepest-wins with parent fallthrough, :ws/closed and :ws/fatal are declared on :active once and inherited by every leaf — every :connecting-error, :authenticating-error, and :connected-error path routes through the same parent-level transition.
The connection machine composes the locked substrate:
- Hierarchical states (005 §Hierarchical compound states) —
:activeis the parent of three connection-leaves; the parent owns the socket actor. :after(005 §Delayed:aftertransitions) — exponential backoff timer in:reconnecting, expressed as a fn-form delay(fn [snap] ms)that reads the current:retriesand:base-msfrom:data. The:after-epoch invariant (005 §Epoch-based stale detection) guarantees stale timers from prior:reconnectingvisits are silently dropped on transitions away.:always(005 §Eventless:alwaystransitions) — max-retries guard fires immediately on entry to:reconnectingif:retriesexceeds the limit, transitioning straight to:failed. Also used to flush queued messages on entry to:connected.- Machine-scoped
:guards/:actions(005 §Registration — the machine IS the event handler) — for:max-retries-exceeded?,:has-queued-messages?,:bump-retry-count,:flush-queue,:current-socket?, etc. :invoke(005 §Declarative:invoke) —:activeinvokes a:websocket/socketactor that owns the actualWebSocketobject; the actor's lifetime is bound to the:activeparent. Any transition that exits:active(to:reconnecting, to:failed, or to:disconnected) destroys the actor; re-entering:activeafter:afterbackoff spawns a fresh socket.- Pattern-StaleDetection — the connection epoch is the socket-actor's own gensym'd id. Every event the socket actor dispatches into the parent carries its
:socket-id; the parent's actions check that the carried id matches the live(:socket-id data)before committing. Replies from a previous connection epoch —:ws/receivedfrom a socket that has since been replaced — fail the check and are dropped via a:rf.ws/stale-sockettrace. The same idiom that:afteralready uses internally, applied to socket-actor identity.
Worked example — connection machine¶
(rf/reg-event-fx :ws/connection
{:doc "WebSocket connection lifecycle: disconnected → active{:connecting →
:authenticating → :connected} → reconnecting (with backoff) → failed."}
(rf/create-machine-handler
{:initial :disconnected
:data {:url nil ;; supplied by :ws/connect; refreshed by :ws/refresh-token
:auth-token nil ;; supplied by :ws/connect (or refreshed at runtime)
:retries 0
:max-retries 8
:base-ms 1000 ;; initial backoff
:max-backoff-ms 30000
:socket-id nil ;; address of the currently-live socket actor
:subscriptions #{} ;; topics to (re-)subscribe on :connected entry
:queue [] ;; messages buffered while disconnected
:in-flight {} ;; {request-id → {:reply-event ... :timeout-ms ...}}
:error nil}
:guards
{:max-retries-exceeded?
(fn [data _]
(>= (:retries data) (:max-retries data)))
:has-queued-messages?
(fn [data _]
(seq (:queue data)))
:current-socket?
;; True iff the incoming socket-stamped event came from the actor
;; this machine currently owns. Connection-epoch staleness check
;; (Pattern-StaleDetection): replies from a prior socket-id are
;; suppressed without disturbing the live snapshot.
(fn [data [_ {:keys [source-socket-id]}]]
(= source-socket-id (:socket-id data)))}
:actions
{:record-connection-opts
;; Caller passes URL + token on :ws/connect; opts land in :data and
;; every subsequent reconnect re-reads them via :invoke's :data fn.
(fn [data [_ {:keys [url auth-token]}]]
{:data (assoc data :url url :auth-token auth-token)})
:refresh-token
;; The auth machine calls this after an out-of-band refresh; the
;; next :active entry's :invoke :data fn picks up the fresh token.
(fn [data [_ token]]
{:data (assoc data :auth-token token)})
:bump-retry (fn [data _] {:data (update data :retries inc)})
:clear-socket-id
(fn [data _] {:data (assoc data :socket-id nil)})
:send-auth
;; Route an :auth message into the live socket actor.
(fn [data _]
{:fx [[:dispatch [(:socket-id data) [:send {:type :auth
:token (:auth-token data)}]]]]})
:on-connected
;; Compound entry action for :connected — :reset-retry + :resubscribe
;; in one fn. (Per [005 §State nodes] :entry takes one fn or one
;; registered id, never a vector.)
(fn [data _]
{:data (assoc data :retries 0)
:fx (mapv (fn [topic]
[:dispatch [(:socket-id data)
[:send {:type :subscribe :topic topic}]]])
(:subscriptions data))})
:flush-queue
;; Send everything buffered while disconnected; clear the queue.
(fn [data _]
{:data (assoc data :queue [])
:fx (mapv (fn [msg] [:dispatch [(:socket-id data) [:send msg]]])
(:queue data))})
:enqueue-message
;; Buffer a send while the connection is not yet :connected.
(fn [data [_ msg]]
{:data (update data :queue conj msg)})
:register-request
;; Caller: [:ws/request {:request-id ..., :body ..., :reply ...}].
;; Record the in-flight entry, forward to the socket, schedule a timeout.
(fn [data [_ {:keys [request-id body reply timeout-ms]
:or {timeout-ms 30000}}]]
{:data (assoc-in data [:in-flight request-id]
{:reply-event reply :timeout-ms timeout-ms})
:fx [[:dispatch [(:socket-id data)
[:send (assoc body :request-id request-id)]]]
[:dispatch-later
{:ms timeout-ms
:dispatch [:ws/connection
[:ws/request-timeout
{:request-id request-id
:source-socket-id (:socket-id data)}]]}]]})
:clear-request
(fn [data [_ {:keys [request-id]}]]
{:data (update data :in-flight dissoc request-id)})
:record-and-reset
;; Compound action — record fresh opts AND reset the retry counter.
;; Used on manual :ws/connect from :reconnecting / :failed (the
;; running app has refreshed the token; reconnect immediately).
(fn [data [_ {:keys [url auth-token]}]]
{:data (-> data
(assoc :url url :auth-token auth-token)
(assoc :retries 0))})
:record-error
(fn [data [_ err]] {:data (assoc data :error err)})}
:states
{:disconnected
{:on {:ws/connect {:target [:active]
:action :record-connection-opts}
:ws/send {:action :enqueue-message}}}
:active
{;; The socket actor is invoked at the parent level — its lifetime
;; spans :connecting, :authenticating, and :connected. Any transition
;; that exits :active (to :reconnecting, :failed, or :disconnected)
;; destroys it; re-entering :active spawns a fresh one.
:invoke {:machine-id :websocket/socket
;; Mechanism 2 from Pattern-AsyncEffect §Parameter passing
;; across the boundary — the child reads URL + auth-token
;; from the parent's :data at spawn time. Every re-entry to
;; :active picks up whatever the parent's :data currently
;; holds, so a :refresh-token between reconnects flows in
;; without any extra wiring.
:data (fn [snap _] {:url (-> snap :data :url)
:auth-token (-> snap :data :auth-token)})
;; Record the spawned actor id so subsequent dispatches
;; and :current-socket? checks have a value to compare.
:on-spawn (fn [data id] (assoc data :socket-id id))}
;; Exit cascade — on any transition that leaves :active, clear the
;; stale socket-id from :data. The runtime destroys the actor
;; automatically (per :invoke's desugared exit, [005 §Desugaring rules]);
;; this just keeps the parent's :data tidy so :current-socket?'s
;; comparison against `nil` correctly rejects late events.
:exit :clear-socket-id
;; Parent-level transitions inherited by every leaf
;; (per [005 §Transition resolution]). Any transport-level error
;; or close during :connecting / :authenticating / :connected
;; routes through one of these.
:on {:ws/closed {:target :reconnecting
:action :bump-retry}
:ws/error {:target :reconnecting
:action :bump-retry}
:ws/fatal {:target :failed
:action :record-error}
:ws/send {:action :enqueue-message}
:ws/refresh-token {:action :refresh-token}
;; A request issued before the connection is :connected is
;; queued like any other send (the request-id is preserved
;; in the queued body); the active leaf overrides for the
;; :connected case below.
:ws/request {:action :enqueue-message}}
:initial :connecting
:states
{:connecting
{:on {:ws/opened {:target :authenticating}}}
:authenticating
{:entry :send-auth
:on {:ws/auth-ok {:target :connected}
:ws/auth-failed {:target [:failed]
:action :record-error}}}
:connected
{:entry :on-connected
:always [{:guard :has-queued-messages? :action :flush-queue}]
:on {;; Pushed server event with no correlation id — forward.
;; The :current-socket? guard suppresses messages dispatched
;; in-flight from a prior socket whose destroy hadn't
;; flushed by the time the dispatch landed.
:ws/received {:guard :current-socket?
:action (fn [data [_ {:keys [body] :as ev}]]
(if-let [rid (:request-id body)]
;; Correlated reply — look up
;; the in-flight entry and
;; dispatch the registered
;; reply event; clear the slot.
(let [{:keys [reply-event]}
(get-in data [:in-flight rid])]
{:data (update data :in-flight dissoc rid)
:fx (when reply-event
[[:dispatch (conj reply-event body)]])})
;; Server push — translate to a
;; named running-app event.
{:fx [[:dispatch [:ws/handle-message body]]]}))}
;; Override the parent's :ws/send: while :connected the
;; message goes straight to the wire instead of queueing.
:ws/send {:action (fn [data [_ msg]]
{:fx [[:dispatch [(:socket-id data)
[:send msg]]]]})}
;; Override the parent's :ws/request: while :connected
;; the request is registered + sent immediately.
:ws/request {:action :register-request}
:ws/request-timeout
{:guard :current-socket?
:action :clear-request}}}}}
:reconnecting
{:always [{:guard :max-retries-exceeded? :target :failed}]
;; Exponential backoff, computed at state entry from the current
;; retry count. Per [005 §Value shape] the fn-form delay is called
;; once at entry against the entering snapshot; the :after epoch
;; carries through the synthetic timer event so a transition out
;; of :reconnecting (e.g., a manual :ws/connect) makes the in-flight
;; backoff timer stale. Add a jitter term in production.
:after {(fn [snap]
(let [{:keys [retries base-ms max-backoff-ms]} (:data snap)]
(min (* base-ms (Math/pow 2 retries))
max-backoff-ms)))
{:target [:active]}}
:on {;; Manual reconnect (e.g., after the auth machine refreshes
;; the token) — short-circuit the backoff, record fresh opts,
;; zero the retry counter, re-enter :active.
:ws/connect {:target [:active]
:action :record-and-reset}
:ws/send {:action :enqueue-message}
:ws/request {:action :enqueue-message}
:ws/refresh-token {:action :refresh-token}}}
:failed
{:on {:ws/connect {:target [:active]
:action :record-and-reset}
:ws/refresh-token {:action :refresh-token}}}}}))
The :websocket/socket invoked actor is itself a small machine (or fx-backed event handler) that owns the JS WebSocket instance and translates :open, :message, :error, :close events into dispatches back to the parent connection machine. Every outgoing dispatch carries :source-socket-id (the actor's :rf/self-id, per 005 §Runtime stamps on the spawned actor's :data) so the parent's :current-socket? guard can suppress messages from a prior socket if one happens to dispatch in flight as the cascade tears it down. The actor's lifetime is bound to :active — leaving :active (whether to :reconnecting on error or :failed fatally) destroys it; re-entering :active creates a fresh socket.
Parameters¶
The connection's :url and :auth-token arrive on the :ws/connect event:
(rf/dispatch [:ws/connection [:ws/connect {:url "wss://api.example.com/ws"
:auth-token (some-token)}]])
:record-connection-opts persists them into :data; the :active state's :invoke :data fn reads them out at spawn time and threads them into the child :websocket/socket actor. Every reconnect re-reads :data at the new :active entry, so a refreshed token (via [:ws/connection [:ws/refresh-token new-token]]) automatically flows into the next socket without re-dispatching :ws/connect. A full re-target (different URL) is a fresh :ws/connect that records the new opts and forces an :active re-entry.
For the canonical menu of mechanisms — event payload (used here for caller-supplied URL/token), spawn-spec :data fn (used between this machine and the child socket actor), and boot-time host config (when the URL is fixed by build-time config and threaded in by the boot machine) — see Pattern-AsyncEffect §Parameter passing across the boundary.
Subscription protocol¶
The connection machine tracks subscribed topics in :data :subscriptions (a set). On entry to :connected, the :resubscribe action re-issues subscribe messages for every topic — guaranteeing subscriptions survive reconnects.
To subscribe / unsubscribe at runtime, the running app dispatches sub/unsub events the connection machine handles by updating :subscriptions and forwarding the wire-message:
;; Subscribe to a topic — pure :data update + send.
:ws/subscribe
{:action (fn [data [_ topic]]
{:data (update data :subscriptions conj topic)
:fx [[:dispatch [(:socket-id data) [:send {:type :subscribe :topic topic}]]]]})}
(Wire the slot into :connected's :on map alongside :ws/received and :ws/send.) The exact subscribe-message wire format is application-specific; the pattern is "track in :data, re-issue on :connected entry."
Message correlation for request-reply¶
Request-reply protocols carry a correlation id on every request and matching reply. The pattern, fully implemented in the worked example above:
- Caller dispatches
[:ws/connection [:ws/request {:request-id ..., :body ..., :reply [::handler ...], :timeout-ms 10000}]]. :register-requestaction records the in-flight entry —(:in-flight data)gains{request-id {:reply-event ... :timeout-ms ...}}— forwards the body (with:request-idstamped) to the socket actor, and schedules a:dispatch-laterfor the timeout.:ws/receivedarrives with{:body {:request-id ... :result ...}}. The:connectedstate's handler checks the connection-epoch guard (:current-socket?) and then branches: a body carrying:request-idlooks up the in-flight entry and dispatches the registered reply event; a body without:request-idis a server push and routes to:ws/handle-message. Either branch clears the in-flight slot when correlated.:ws/request-timeoutfires if no reply arrives within the timeout window. The:clear-requestaction removes the in-flight entry; the caller's reply event never fires. (Apps that want to surface "request timed out" to the caller can do so by dispatching a per-feature error event from:clear-requestinstead.)
The correlation id can be any =-comparable value — a (random-uuid) is the canonical default, but per-feature [:feature/load slug] vectors compose with Spec 014 §:request-id (internal)'s precedent. Each request-reply over the open socket is a Pattern-AsyncEffect interaction; the connection machine is the long-lived host that performs the correlation step Pattern-AsyncEffect leaves to the caller.
Heartbeat / keepalive¶
Use :after on :connected to schedule a periodic ping: :after {30000 {:target :connected :action :send-ping}} self-loops externally, re-arming the timer. If the pong does not arrive within a window, transition to :reconnecting. A child heartbeat machine invoked from :connected is cleaner for non-trivial cases.
Server-pushed events¶
Server pushes (:ws/received events with no :request-id) are translated into named dispatched events the running-app handlers consume. The connection machine's role is mechanical — receive, validate the socket-id, translate, dispatch. The semantic interpretation lives in the per-feature event handlers:
(rf/reg-event-fx :ws/handle-message
(fn [_ [_ {:keys [type] :as msg}]]
(case type
:note/created {:fx [[:dispatch [:notes/append msg]]]}
:user/typing {:fx [[:dispatch [:chat/typing msg]]]}
...)))
Re-authentication on reconnect¶
Token expiry across reconnects has two recovery paths, both supported by the worked machine. Proactive: the auth machine refreshes the token and dispatches [:ws/connection [:ws/refresh-token new-token]]; the :refresh-token action updates :data :auth-token; the next :active entry's :invoke :data fn picks up the fresh value. Reactive: a reconnect into :authenticating fails with :ws/auth-failed and the machine transitions to :failed; the auth machine observes via sub-machine (per 005 §Subscribing to machines via sub-machine), runs its refresh, and dispatches [:ws/connection [:ws/connect {:url ... :auth-token new-token}]] to re-target. Either way, refreshed credentials land in :data and the next :active entry threads them through :invoke :data.
SSR¶
The connection machine no-ops in SSR mode — :invoke's spawn fx is :platforms #{:client} (the WebSocket API doesn't exist server-side); :after timers do not schedule under SSR (per 011 §:after is no-op under SSR). The server renders the machine's current state (typically :disconnected) statically; the client hydrates and starts the connection on its own.
This mirrors the rule for any client-only fx: the :platforms metadata gates execution; the server's fx resolver silently no-ops it.
Anti-patterns¶
- Implementing reconnect logic in
setTimeoutfrom inside the fx-handler. Bypasses the machine; bypasses tracing; bypasses stale-detection. Use:afterfor the backoff timer. - Mutating
app-dbfrom theonmessagecallback directly. The fx-handler must dispatch a named event; the event handler does the write. Same rule as Pattern-AsyncEffect. - Per-message machine-spawn-and-destroy. The connection machine is long-lived. Spawning a new machine per outgoing message is structural overkill — use a single connection machine with
:in-flightcorrelation tracking instead. - Treating WebSocket as Pattern-AsyncEffect. A connection that retries, reconnects, and survives across message boundaries is state-machine-shaped. Use this pattern.
- Storing the
WebSocketobject inapp-db. The JSWebSocketis not a value; it cannot serialise; it cannot survive Tool-Pair epoch replay. The:websocket/socketactor owns it via a host-side reference; only its id appears in:data. - Anchoring the
:invokeon:connectinginstead of the:activeparent. A socket actor scoped to:connectingis destroyed the moment the leaf transitions to:authenticating— every dispatch from:authenticatingand:connectedthen addresses a dead actor. The actor's lifetime must outlive every leaf that dispatches through it; the hierarchical parent is the natural anchor. - Forgetting to re-thread connection opts on reconnect. Recording
:urland:auth-tokenonly in:disconnected's:ws/connecthandler — and never refreshing them on the:reconnecting→:activepath — means a token expiry mid-session can never recover. Either store opts in:data(where the:invoke:datafn re-reads them on every:activeentry — the worked example's approach) or provide an explicit:ws/refresh-tokenslot at the parent level. - Skipping the connection-epoch check on
:ws/received. Without:current-socket?(or equivalent), a slow:messageevent from a torn-down socket can land in:connectedafter a reconnect and be processed against the new connection's:in-flightmap — at best a wrong-reply dispatch; at worst an in-flight slot cleared by a stale correlation id. The check is one line; skipping it is the websocket equivalent of 012 §Navigation tokens's nav-token bug. - Hardcoding the wire format in the pattern. EDN, JSON, MessagePack, Protobuf — the connection machine doesn't care. The
:websocket/socketactor serialises on send and deserialises on receive; the machine sees plain Clojure values.
Composition with related patterns¶
- Pattern-AsyncEffect — distinct but adjacent. Individual request-reply messages over the open socket fit Pattern-AsyncEffect (the open connection acts as the fx); the connection lifecycle itself does not. The request-reply correlation step the connection machine performs is what Pattern-AsyncEffect leaves to the caller — Pattern-WebSocket's
:in-flightmap is the worked example of that step. - Pattern-StaleDetection — composes twice. First for the
:afterbackoff timer (the runtime's built-in epoch handles it). Second for the connection-epoch: the live socket-id IS the epoch;:current-socket?is the guard;:rf.ws/stale-socketis the trace. - Pattern-Boot — "establish real-time connection" is often a late boot phase; the boot machine's
:routingor a dedicated:connecting-realtimestate dispatches[:ws/connection [:ws/connect ...]]to kick the connection machine into:active. :after/:always/:invoke/ hierarchical states (005) — the locked machine substrate. This pattern is the canonical worked example exercising all four together.- No Suspense (Principles.md) — connection state is explicit (
:disconnected,:active / :connecting,:active / :connected,:reconnecting,:failed), not implicit "loading"; views render against the snapshot's:state.
Cross-references¶
- 005-StateMachines.md — the substrate; this pattern is a worked example exercising hierarchical states,
:after,:always, and:invoketogether. - Pattern-AsyncEffect.md — sibling pattern for one-shot async work.
- Pattern-StaleDetection.md — epoch idiom; this pattern reuses it twice (backoff timer + connection epoch).
- Pattern-Boot.md — boot may include connection establishment as a phase.
- 011-SSR §
:afteris no-op under SSR — the server-side rule for the connection machine's timers.