remote-http-launcher

remote-http-launcher is a command-line tool and Python library for launching, supervising, and reconnecting to long-running HTTP services — either locally or on remote hosts via SSH. It is driven by a single YAML configuration file that describes the service to launch, and it manages all process lifecycle state through JSON files on the client and server side.

Although it originated in the Seamless scientific-workflow framework, remote-http-launcher is fully generic: it can launch any HTTP service that follows its status-file protocol.

How it works

Given a YAML config, the launcher follows a deterministic sequence:

1. Check for an existing local connection file (~/.remote-http-launcher/client/<key>.json). If one exists and the service responds to the configured handshake, the launcher exits immediately: the service is already running.
2. Check the remote (or local) server directory (~/.remote-http-launcher/server/<key>.json):
   - If the remote JSON shows "status": "running", the launcher verifies the port and handshake, then writes the local connection file.
   - If it shows "status": "starting", the launcher monitors until the service comes up (or times out).
   - If the JSON file is absent or stale, the launcher starts the service.
3. Launch the service as a daemonic subprocess (locally or over SSH), writing a server-side JSON file with the PID, workdir, and "status": "starting". The path to this JSON file is passed to the service via the {status_file} template variable in the command field.
4. Optionally create an SSH tunnel forwarding the remote port to a local port, with a background monitor that tears down the tunnel when the remote process exits.
5. Write the local connection file containing the hostname and port (local or tunneled) for consumption by client code.
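
The branching in the sequence above can be condensed into a small pure function. This is only an illustrative sketch: the function name, its arguments, and the returned labels are hypothetical and are not part of the remote-http-launcher API, and the handling of statuses other than "running" and "starting" is an assumption.

```python
from typing import Optional


def next_action(client_state: Optional[dict],
                server_state: Optional[dict],
                handshake_ok: bool,
                server_state_is_stale: bool = False) -> str:
    """Return a label for the step the sequence above would take next.

    All names and labels here are illustrative, not launcher internals.
    """
    # A local connection file plus a successful handshake: nothing to do.
    if client_state is not None and handshake_ok:
        return "reuse-connection"
    # No usable server-side state: the service has to be started.
    if server_state is None or server_state_is_stale:
        return "launch"
    status = server_state.get("status")
    if status == "running":
        # Verify port and handshake, then write the local connection file.
        return "verify-and-connect"
    if status == "starting":
        # Monitor until the service comes up or the wait times out.
        return "wait"
    # Assumption: any other status (for example "failed") leads to a fresh
    # launch. The real launcher may handle this case differently.
    return "launch"
```

Keeping the decision free of I/O makes each branch easy to check in isolation; the actual probing (handshake, staleness detection) would happen before the call.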

Status-file protocol

remote-http-launcher imposes a specific contract on any service it launches. The service must:

Read the JSON status file whose path was passed to it (via the {status_file} template variable in the command).
Acquire a free port and start listening on it.
Update that same JSON file: set "status" to "running" and add a "port" field with the chosen port number.

If the service fails to start, it should set "status" to "failed".

The launcher monitors the status file after starting the process. If the file is not updated within roughly one minute (and the process has exited), the launcher treats the launch as failed.

This is the only requirement remote-http-launcher places on the launched service — beyond this protocol, the service can be anything that speaks HTTP.
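
For a service written in Python, the service side of the protocol might look like the sketch below. The function names are hypothetical. Binding to port 0 (letting the operating system choose a port) and replacing the file atomically are choices made for this example, not requirements stated by the protocol.

```python
import json
import os
import socket
import tempfile


def bind_free_port(host: str = "127.0.0.1") -> socket.socket:
    """Bind a listening socket to a port chosen by the operating system."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS for any free port
    sock.listen()
    return sock


def update_status_file(status_file: str, **fields) -> dict:
    """Merge fields into the JSON status file, replacing it atomically."""
    with open(status_file) as f:
        state = json.load(f)
    state.update(fields)
    directory = os.path.dirname(os.path.abspath(status_file))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, status_file)  # readers never see a half-written file
    return state


def report_startup(status_file: str, host: str = "127.0.0.1") -> socket.socket:
    """Read the file, acquire a port, then report "running" or "failed"."""
    try:
        sock = bind_free_port(host)
    except OSError:
        update_status_file(status_file, status="failed")
        raise
    port = sock.getsockname()[1]
    update_status_file(status_file, status="running", port=port)
    return sock
```

The returned socket is already listening, so it can be handed to whatever HTTP server the service uses. Existing fields in the status file (such as the PID written by the launcher) are preserved because the update merges into the loaded JSON.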

Configuration

The YAML configuration file supports the following fields:

Field	Required	Description
`workdir`	yes	Working directory on the (remote) host
`key`	yes	Template string evaluated to a filename used for the JSON state files
`command`	yes	Template string for the bash command to launch the service
`hostname`	no	Target HTTP hostname or IP; omit to run locally
`ssh_hostname`	no	SSH host to connect to (defaults to `hostname`)
`network_interface`	no	Interface the service binds to (default: `localhost`)
`tunnel`	no	Create an SSH tunnel for the service port (default: `false`)
`handshake`	no	HTTP path (and optional query parameters) for a health-check GET request
`conda`	no	Conda environment to activate before launching the command
`file_parameters`	no	Arbitrary parameters written into the server-side JSON file
`meta`	no	Opaque caller-provided JSON metadata copied into client/server state

The key and command fields are Python f-string templates evaluated against the full config namespace.

Example

workdir: /home/user/my-service-data
hostname: my-server.example.com
key: 'myservice-{workdir.strip("/").replace("/", "--")}'
command: >-
  myservice --port-range {port_start} {port_end}
  --status-file {status_file}
  --host {config['network_interface']}
  --timeout {timeout}
  {workdir}
network_interface: "0.0.0.0"
handshake: healthcheck
conda: myservice-env
timeout: 600
port_start: 10000
port_end: 19999

A JSON schema is included at config.schema.yaml for validation and editor support.
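
To see what the key template in the example resolves to, the f-string evaluation can be approximated in a few lines of Python. This is a sketch and not the launcher's actual implementation: render_template is a hypothetical helper, and the assumption that the whole mapping is also visible as config is inferred from the {config['network_interface']} reference in the example command.

```python
def render_template(template: str, config: dict) -> str:
    """Evaluate template as an f-string against the config fields.

    Assumption: top-level fields are visible by name, and the whole
    mapping is also visible as `config`.
    """
    namespace = dict(config)
    namespace["config"] = config
    # repr() re-quotes the template so that it can be prefixed with "f".
    # eval() runs arbitrary expressions: only use trusted config files.
    return eval("f" + repr(template), {}, namespace)


config = {
    "workdir": "/home/user/my-service-data",
    "network_interface": "0.0.0.0",
}
key = render_template('myservice-{workdir.strip("/").replace("/", "--")}', config)
print(key)  # myservice-home--user--my-service-data
host_arg = render_template("--host {config['network_interface']}", config)
print(host_arg)  # --host 0.0.0.0
```

Because templates are evaluated as Python expressions, a config file should be treated with the same care as executable code.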

Installation

pip install remote-http-launcher

The only runtime dependency is PyYAML.

Usage

# Launch (or reconnect to) the service described in config.yaml
remote-http-launcher config.yaml

# Override the client connection directory
remote-http-launcher config.yaml --connection-dir /tmp/my-connections

# Print the evaluated command without launching
remote-http-launcher config.yaml --dry-run

# Log detailed launcher activity, including helper commands
remote-http-launcher config.yaml --log-level debug --debug-log-file rhl-debug.log

Logging

remote-http-launcher exposes all launcher-side probes and helper dispatches through the observer/logging interface. Use --log-level debug or --debug-log-file PATH to log the commands sent to the local or SSH target, including rhl-* helper calls such as rhl-cache-conda, rhl-conda-info, rhl-inspect, and rhl-launch-service.

Debug logs intentionally show the target executor (LocalExecutor for local services, SSHExecutor[host] for SSH targets). This matters for conda: a local target runs its helper calls locally, so they should not be read as remote SSH probes.

--dry-run prints the evaluated launch command and does not perform conda setup or cache refreshes.

Python API

from remote_http_launcher import run

result = run({
    "workdir": "/home/user/data",
    "key": "my-service",
    "command": "myservice --port-range 10000 19999 --status-file {status_file} {workdir}",
    "handshake": "healthcheck",
})
print(result["hostname"], result["port"])
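
Client code that runs in a different process from the launcher could read the local connection file instead of the return value of run. The sketch below assumes that the file uses the same "hostname" and "port" field names as the result above and that the service is reached over plain HTTP; connection_url is a hypothetical helper, not part of the package.

```python
import json
import os
from typing import Optional


def connection_url(key: str, path: str = "",
                   connection_dir: Optional[str] = None) -> str:
    """Build a service URL from the client connection file for key."""
    if connection_dir is None:
        # Default client directory described under "How it works".
        connection_dir = os.path.expanduser("~/.remote-http-launcher/client")
    with open(os.path.join(connection_dir, f"{key}.json")) as f:
        conn = json.load(f)
    # Assumed field names: "hostname" and "port".
    return f"http://{conn['hostname']}:{conn['port']}/{path.lstrip('/')}"
```

With the config above, connection_url("my-service", "healthcheck") would yield the URL of the configured handshake path.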

SSH Guard

remote-http-launcher ships an SSH guard (rhl-guard) that restricts what commands can be run on the remote server under the service user account. When installed, only named rhl-* helper programs are permitted — naked shell commands such as pkill, rm -rf, bash -lc ..., or arbitrary python3 -c are rejected.

How it works

The SSH command= option in authorized_keys forces every incoming SSH session through rhl-guard. The guard reads SSH_ORIGINAL_COMMAND, validates it against a whitelist, and either execs the command or exits with an error. Interactive sessions (no SSH_ORIGINAL_COMMAND) are always rejected.

The guard whitelist covers only top-level rhl-* helper commands installed by this package (see below). The helpers perform state inspection, process checks, HTTP handshakes, port verification, conda cache reads, and service launch using structured argv/data instead of remote shell or Python source.

The guard silently strips any leading VAR=value shell variable assignments before the whitelist check. The standard client PATH prepend (PATH=$HOME/miniforge3/bin:$HOME/miniconda3/bin:$PATH rhl-ps) is therefore accepted by rhl-guard without any client-side changes: the assignments are dropped and only the rhl-* command and its arguments are exec'd.

Direct process-management commands such as kill -1 <pid> are rejected by the guard. Use helpers such as rhl-stop <key> instead.
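
The validation described in this section can be sketched as follows. This is a simplified illustration, not the code of rhl-guard: the ALLOWED_HELPERS set is a hypothetical subset of the real whitelist, and the use of shlex for tokenizing is an assumption.

```python
import re
import shlex

# Hypothetical subset of the whitelist; the real list ships with the package.
ALLOWED_HELPERS = {"rhl-ps", "rhl-stop", "rhl-logs", "rhl-inspect"}
ASSIGNMENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*=")


def guard_argv(original_command: str) -> list:
    """Return the argv to exec, or raise PermissionError if it is rejected."""
    # An empty SSH_ORIGINAL_COMMAND means an interactive session.
    if not original_command.strip():
        raise PermissionError("interactive sessions are not allowed")
    argv = shlex.split(original_command)
    # Drop leading VAR=value assignments such as the client PATH prepend.
    while argv and ASSIGNMENT.match(argv[0]):
        argv.pop(0)
    if not argv or argv[0] not in ALLOWED_HELPERS:
        raise PermissionError(f"command not allowed: {original_command!r}")
    return argv
```

Since the resulting argv would be exec'd directly rather than passed to a shell, characters such as ; or | arrive at the helper as plain arguments and are never interpreted.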

Path policy

Three of the helpers — rhl-clear, rhl-ps-persistent, and rhl-launch-service (the --workdir argument) — accept filesystem paths chosen by the SSH client. Without bounding which paths the client may supply, a stolen SSH key with command="rhl-guard" would still reach helpers that destroy or probe arbitrary directories.

Path policy is enforced entirely by rhl-guard. The rhl-* helpers are not modified. They behave identically to previous versions when invoked directly — locally on the server, via tests, or in any client-side context (rhl-ps, rhl-rm --client, etc.). The guard is the single point that parses SSH_ORIGINAL_COMMAND, identifies any path arguments destined for the three helpers above, validates those paths against the configured policy, and only then execs the helper. Direct invocations bypass rhl-guard entirely and are unaffected by the policy.

The policy is declared via one of the following mutually-exclusive flags on rhl-guard in the authorized_keys command= directive:

Flag	Effect	When to use
`--data-roots /path/to/roots-file`	Strict allowlist. Each path the guard extracts from the SSH command must, after `~`-expansion and symlink resolution, equal or live under one of the absolute paths listed in the file. One path per line; blank lines and `#` comments are ignored. The same list governs `rhl-clear`, `rhl-ps-persistent`, and `rhl-launch-service --workdir`.	Recommended for production. Fully explicit; no heuristics.
`--clear-policy seamless`	Marker preset for Seamless. The guard accepts a target directory for `rhl-clear` and `rhl-ps-persistent` only if it contains either a `seamless.db` file or a `.HASHSERVER_PREFIX` file. `rhl-launch-service --workdir` is checked against the always-on heuristics only (since a workdir that has not yet been created cannot contain a marker). No allowlist file required.	Seamless deployments.
`--clear-policy marker:NAME`	Generic marker. Same shape as the `seamless` preset, but the marker filename is configurable. Drop `touch <dir>/<NAME>` into each directory you wish to authorise.	Non-Seamless deployments that prefer per-directory markers over a central allowlist.
`--permissive-paths`	Disables the policy. The guard accepts any path that passes the always-on heuristics below.	Compatibility for deployments that cannot configure any of the modes above. Not recommended. Mutually exclusive with `--data-roots` and `--clear-policy`; combining them is a fatal `rhl-guard` configuration error.

If none of these flags is present, rhl-guard refuses to dispatch rhl-clear, rhl-ps-persistent, or rhl-launch-service (any invocation of those three, not only those that include a path argument) and exits with an error pointing the operator at this section. The other rhl-* helpers do not accept client-chosen paths and continue to be dispatched normally.
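
As an illustration of the --data-roots mode, the allowlist check could be written like this. The sketch follows the description in the table (one path per line, blank lines and # comments ignored, ~ expansion and symlink resolution before comparison); the function names are hypothetical and the real guard may differ in detail.

```python
import os


def parse_data_roots(text: str) -> list:
    """Parse a roots file: one path per line, blanks and # comments skipped."""
    roots = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            roots.append(os.path.realpath(os.path.expanduser(line)))
    return roots


def under_data_roots(path: str, roots: list) -> bool:
    """True if path equals or lives under a root after ~ and symlink resolution."""
    resolved = os.path.realpath(os.path.expanduser(path))
    # commonpath compares whole path segments, so /data-other is not
    # mistaken for a child of /data.
    return any(os.path.commonpath([resolved, root]) == root for root in roots)
```

Resolving symlinks on both sides before comparing means that a link inside an allowed root cannot be used to reach a directory outside it.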

Always-on heuristics

In every mode (including --permissive-paths), rhl-guard rejects the dispatch if any path it extracts from the SSH command is:

not absolute after ~ expansion;
equal to $HOME, or an ancestor of $HOME (e.g. /, /home);
containing any segment that begins with . (e.g. ~/.ssh, /tmp/.cache/foo); or
resolved to one of the system-root directories (/, /etc, /usr, /bin, /sbin, /lib, /lib64, /boot, /sys, /proc, /run, /var, /opt).

These are guard-side checks applied before the helper is exec'd. They are independent of, and additional to, the helper's own existing checks; nothing in the helper changes.
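
The four heuristics can be expressed compactly in Python. The sketch below is an approximation for illustration: heuristic_violation and its reason strings are hypothetical, the home directory is passed in explicitly to keep the function testable, and the order in which the checks run is an assumption.

```python
import os

SYSTEM_ROOTS = {"/", "/etc", "/usr", "/bin", "/sbin", "/lib", "/lib64",
                "/boot", "/sys", "/proc", "/run", "/var", "/opt"}


def heuristic_violation(path: str, home: str):
    """Return a reason string if path breaks a heuristic, else None."""
    # Expand a leading ~ against the given home directory.
    if path == "~" or path.startswith("~/"):
        path = home + path[1:]
    if not os.path.isabs(path):
        return "not absolute"
    normalized = os.path.normpath(path)
    home = os.path.normpath(home)
    # commonpath(...) == normalized means: normalized is home or an ancestor.
    if os.path.commonpath([normalized, home]) == normalized:
        return "home or ancestor of home"
    if any(segment.startswith(".") for segment in normalized.split(os.sep)):
        return "hidden segment"
    # Check the path both as written and after symlink resolution.
    if normalized in SYSTEM_ROOTS or os.path.realpath(normalized) in SYSTEM_ROOTS:
        return "system root"
    return None
```

For example, with home set to /home/svc, the inputs ~/.ssh, /home, and /etc are each rejected, while /home/svc/seamless-buffers passes.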

Installation

On the remote server, add to ~/.ssh/authorized_keys a command= directive that includes one of the path-policy flags. The recommended form pins an explicit allowlist file:

command="rhl-guard --data-roots /home/svc/.config/rhl/data-roots" ssh-rsa AAAA... your-key-comment

with ~/.config/rhl/data-roots containing, e.g.:

# Persistent workdirs mirroring clusters.yaml
~/seamless-buffers
~/hashserver-bufferdir
~/seamless-databasedir

Alternative forms for deployments that prefer markers over a central allowlist:

# Seamless preset — no allowlist file needed
command="rhl-guard --clear-policy seamless" ssh-rsa AAAA... your-key-comment

# Generic marker — drop ".rhl-clearable" into each authorised workdir
command="rhl-guard --clear-policy marker:.rhl-clearable" ssh-rsa AAAA... your-key-comment

Compatibility form for deployments that cannot configure either of the above (not recommended):

command="rhl-guard --permissive-paths" ssh-rsa AAAA... your-key-comment

Running rhl-guard directly prints an installation-oriented error explaining that it must be invoked by SSH. To test one guarded command locally on the server:

SSH_ORIGINAL_COMMAND="rhl-ps" rhl-guard

Conda cache

The launcher reads conda configuration via rhl-conda-info automatically. You can prime the cache once on the remote server before using conda environments:

ssh <remote_host> rhl-cache-conda

The cache is also refreshed automatically in the specific case where a cached conda setup exists but does not list the requested environment. In that case the launcher runs rhl-cache-conda on the same target, reloads ~/.remote-http-launcher/conda-setup.json, and checks the environment list again before failing. If the environment is still absent, the error names the target and says that rhl-cache-conda has already been run.

For guarded SSH servers this automatic refresh is an allowed rhl-* helper command, not an inline shell probe. It is therefore visible in debug logs just like manual helper calls. For local targets the same refresh runs through LocalExecutor.

Re-run rhl-cache-conda manually if you want to update the cache ahead of time after changing the conda installation or environments. On hosts where no rhl-* helpers are installed, the launcher falls back to inline heredoc probes for its own conda discovery automatically.

Reaching the `rhl-*` helpers over SSH

The launcher's fallback covers only its own conda discovery. Anything that depends directly on the helpers (the seamless-service-* layer in seamless-config, agents shelling out to rhl-ps / rhl-stop / rhl-logs, etc.) requires remote-http-launcher to be installed on the remote server. Two install paths are supported:

System-wide install (recommended when you have root): pip install remote-http-launcher into the system Python. The helpers land under /usr/local/bin.
Conda base env install (no root): pip install remote-http-launcher into the conda base environment on the remote host. The helpers land in $HOME/miniforge3/bin or $HOME/miniconda3/bin.

Clients (seamless-service-* and agents) must prepend $HOME/miniforge3/bin:$HOME/miniconda3/bin to PATH on every SSH invocation of an rhl-* helper, which covers conda-base installs without any shell startup changes:

ssh <host> 'PATH=$HOME/miniforge3/bin:$HOME/miniconda3/bin:$PATH' rhl-ps --json

Use the command above to verify reachability. If it returns command not found even with the PATH prepend, remote-http-launcher is not installed on the server.
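
A client written in Python might build the same invocation as shown below. This is a sketch: ssh_helper_command is a hypothetical name, and passing the remote command as a single string is one way to do it, not necessarily how seamless-config does it.

```python
import shlex

CONDA_PATH_PREFIX = "PATH=$HOME/miniforge3/bin:$HOME/miniconda3/bin:$PATH"


def ssh_helper_command(host: str, helper_argv: list) -> list:
    """Build an ssh argv that runs an rhl-* helper with the PATH prepend."""
    # The prefix is left unquoted so that the remote shell expands $HOME
    # and $PATH; the helper arguments are quoted individually.
    remote_command = CONDA_PATH_PREFIX + " " + shlex.join(helper_argv)
    return ["ssh", host, remote_command]
```

The resulting list can be passed to subprocess.run(..., capture_output=True, text=True). Because the local call uses a list and no local shell, $HOME and $PATH reach the remote side unexpanded.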

Lifecycle States

Launcher state is normally one of the following:

State	Meaning	Next action
`absent`	No launcher state or persistent directory exists	—
`starting`	Process launched, waiting for it to report a port	Wait, or check logs if it hangs
`running`	Service healthy and port known	—
`failed`	Startup failed or timed out	Read log with `rhl-logs`, fix, restart
`stale`	Server JSON exists but process is dead	Read log with `rhl-logs`, then `rhl-rm` to clear
`persistent`	Workdir contains data even when no process is running	`rhl-ps-persistent` to inspect; `rhl-clear` to wipe

Persistent state can cause false-pass test results. A service launched against a populated buffer directory or seamless.db returns cached results without exercising the underlying computation. When a test passes suspiciously after changes that should affect results, use rhl-ps-persistent (or seamless-service-ps --persistent from seamless-config) to inspect cached data, and rhl-clear (or seamless-service-clear) to wipe it before re-testing on a cold cache.

The stale state is the post-mortem window for non-persistent services. jobserver, daskserver, and pure-daskserver have no persistent data; the log file is the only post-mortem artefact, and it is reachable via rhl-logs only while the server JSON exists. Read the log before running rhl-rm — the log file itself survives JSON removal but is no longer addressable by key through the helper.

JSON State Schema

Server JSON files live under ~/.remote-http-launcher/server/<key>.json and client files under ~/.remote-http-launcher/client/<key>.json. Readers must tolerate rows without meta — older launcher versions and callers that do not populate it still write valid state files.

When written by seamless-config, the meta block contains:

{
  "meta": {
    "service":    "hashserver",
    "cluster":    "MYCLUSTER",
    "mode":       "rw",
    "project":    "myproject",
    "subproject": null,
    "stage":      "fingertip",
    "substage":   null,
    "queue":      null
  }
}

rhl-ps --json emits this block verbatim alongside the process-state fields (key, status, port, pid, workdir, log). The launcher treats meta as opaque and does not validate its contents — Seamless-specific semantics live entirely in seamless-config, not in remote-http-launcher.
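
A reader that tolerates rows without meta could look like the following sketch. It assumes that rhl-ps --json prints a JSON array of row objects carrying the fields listed above; the exact output shape is not specified here, and summarize_rows is a hypothetical helper.

```python
import json


def summarize_rows(ps_json: str) -> list:
    """Return (key, status, service) tuples, tolerating rows without meta."""
    summary = []
    for row in json.loads(ps_json):
        # A missing or null meta block is treated as empty.
        meta = row.get("meta") or {}
        summary.append((row["key"], row["status"], meta.get("service")))
    return summary
```

Rows written by older launcher versions then simply report None for the service instead of raising a KeyError.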

Helper commands (`rhl-*`)

Installing remote-http-launcher adds a set of server-side helper programs that perform specific, safe operations on launcher state. These are the commands agents and operators should use instead of raw kill, rm, or shell loops.

Command	Runs on	Purpose
`rhl-guard`	server	SSH guard entry point; validates `SSH_ORIGINAL_COMMAND` before exec
`rhl-cache-conda`	server	Discover conda setup and write `~/.remote-http-launcher/conda-setup.json`
`rhl-conda-info`	server	Print the cached conda-setup JSON; exit 1 if cache is absent
`rhl-launch-service --key K --workdir D [--conda-env E] [--network-interface I] [--parameters J] [--meta J] -- BINARY ARG...`	server	Launch a whitelisted service binary as a daemon; write server-side JSON
`rhl-inspect <key> [--with-mtime]`	server	Pretty-print the server state JSON; `--with-mtime` emits `{"mtime": …, "data": …}`
`rhl-logs <key> [--tail N]`	server	Print the stdout/stderr log for a service
`rhl-stop <key>`	server	Stop service processes without deleting JSON state
`rhl-pid-alive PID`	server	Exit 0 if the process is alive, 1 if it is not
`rhl-verify-port HOST PORT`	server	TCP-connect to HOST:PORT (3 retries); exit 0 on success, 1 on failure
`rhl-handshake URL`	server	GET URL; exit 0 on 2xx, 1 on network error, 2 on non-2xx status
`rhl-ps [--client]`	client/server	List process state rows; client mode lists local connection state
`rhl-ps-persistent <path> [--marker FILENAME]`	server	Report absent, empty, or populated filesystem-backed state; optionally report only directories containing a marker file
`rhl-rm <key> [--client] [--server]`	client/server	Remove launcher JSON state files while leaving logs on disk
`rhl-clear <path>`	server	Remove direct children of a validated persistent directory
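
To make the exit-code convention of a helper such as rhl-verify-port concrete, here is a rough Python equivalent. It is a sketch, not the helper's source: reading "3 retries" as three connection attempts, and the delay and timeout defaults, are assumptions.

```python
import socket
import time


def verify_port(host: str, port: int, attempts: int = 3,
                delay: float = 1.0, timeout: float = 2.0) -> int:
    """Return 0 if a TCP connection succeeds within the attempts, else 1."""
    for attempt in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return 0
        except OSError:
            # Pause before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(delay)
    return 1
```

Returning the exit code rather than calling sys.exit keeps the function usable both from a command-line wrapper and from tests.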

When dispatched through rhl-guard, calls to rhl-clear, rhl-ps-persistent, and rhl-launch-service are subject to the path policy declared on the guard itself (see Path policy); the guard validates the SSH-supplied paths and refuses to dispatch if no policy was declared. The helpers themselves are not modified — direct local invocation (e.g. on the server, in tests, or for client-side helpers such as rhl-ps and rhl-rm --client) bypasses the guard and behaves exactly as in earlier versions.

All state-file helpers respect the REMOTE_HTTP_LAUNCHER_DIR environment variable (same as the launcher).

Cluster-wide Seamless operations are intentionally not implemented by rhl-* helpers. Use seamless-service-stop, seamless-service-rm, and seamless-service-ps from seamless-config; those tools resolve Seamless cluster/project/stage semantics on the client side and then dispatch safe rhl-* operations.

CLI scripts

Installing remote-http-launcher also provides:

remote-http-launcher — main launcher CLI
All rhl-* helpers listed in the table above
