websocket-relay/AGENTS.md
savinmax 905c241daa
Improve reliability, testing, and documentation
- Fix metrics: change MessagesTotal, ConnectionsTotal, DisconnectionsTotal
  from Gauge to Counter with proper _total naming convention
- Fix broadcast write-error handling: failed clients now get properly
  removed with accurate metrics updates
- Add graceful shutdown: SIGINT/SIGTERM handling with 10s timeout,
  CloseGoingAway frame sent to clients before disconnect
- Add integration tests: 11 tests using real WebSocket connections
  covering connect, broadcast, disconnect, concurrency, and shutdown
- Fix example client port: changed from 8000 to 8443 to match config
- Rewrite README.md to reflect current features and usage
- Add AGENTS.md and .agents/summary/ documentation for AI assistants
2026-06-11 19:14:19 +02:00

# AGENTS.md — AI Assistant Guide for websocket-relay
> This file provides context for AI coding assistants working on this project. It focuses on information not found in README.md and is optimized for quick comprehension.
## Project Identity
**websocket-relay** is a minimal Go WebSocket relay server that broadcasts every incoming message to all connected clients (hub-and-spoke / fan-out pattern). It supports TLS, Prometheus metrics, and graceful shutdown.
---
## Directory Structure
```
websocket-relay/
├── main.go                     # Entry point: config loading, signal handling, graceful shutdown
├── internal/
│   ├── config/config.go        # YAML config loader (server port, TLS, metrics)
│   ├── hub/hub.go              # Core logic: WebSocket hub, connection mgmt, broadcast
│   └── metrics/metrics.go      # Prometheus counter/gauge definitions
├── example/index.html          # Browser P2P chat demo client
├── config.yaml                 # Runtime config (edit for local dev)
├── config.example.yaml         # Reference config with TLS enabled
├── Makefile                    # build, test, release, run, deps, clean
├── .gitea/workflows/
│   ├── ci.yml                  # Push/PR → test + lint
│   └── release.yml             # Tag v* → cross-compile + Gitea release
└── .agents/summary/            # Generated documentation (see index.md)
```
---
## Architecture at a Glance
```mermaid
graph LR
C1[Client] -->|ws| HUB[Hub goroutine]
C2[Client] -->|ws| HUB
HUB -->|broadcast| C1
HUB -->|broadcast| C2
HUB --> MET[Prometheus :9090]
```
- **Single Hub goroutine** runs a `select` loop on 4 channels: `register`, `unregister`, `broadcast`, `stop`
- **Per-client reader goroutine** reads messages and pushes to `broadcast` channel
- **No write goroutine** — writes happen inline during broadcast (under RLock)
- **Thread safety** via `sync.RWMutex` on the clients map
- **Graceful shutdown** via SIGINT/SIGTERM → HTTP server shutdown → Hub shutdown → clean exit
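
The hub loop described above can be sketched with the standard library alone. This is a simplified illustration, not the code in `internal/hub/hub.go`: the `client` type, the lowercase names, and the `done` channel are assumptions, and a plain slice append stands in for the WebSocket write.

```go
package main

import (
	"fmt"
	"sync"
)

// client stands in for a WebSocket connection (hypothetical type).
type client struct {
	id  int
	out []string // messages "written" to this client
}

// hub is a simplified, hypothetical version of the relay's Hub.
type hub struct {
	mu         sync.RWMutex
	clients    map[*client]bool
	register   chan *client
	unregister chan *client
	broadcast  chan string
	stop       chan struct{}
	done       chan struct{} // closed when run returns (added for the demo)
}

func newHub() *hub {
	return &hub{
		clients:    make(map[*client]bool),
		register:   make(chan *client),
		unregister: make(chan *client),
		broadcast:  make(chan string),
		stop:       make(chan struct{}),
		done:       make(chan struct{}),
	}
}

// run is the single coordination point: one select over four channels.
func (h *hub) run() {
	defer close(h.done)
	for {
		select {
		case c := <-h.register:
			h.mu.Lock()
			h.clients[c] = true
			h.mu.Unlock()
		case c := <-h.unregister:
			h.mu.Lock()
			delete(h.clients, c)
			h.mu.Unlock()
		case msg := <-h.broadcast:
			h.mu.RLock()
			for c := range h.clients {
				c.out = append(c.out, msg) // inline write, no writer goroutine
			}
			h.mu.RUnlock()
		case <-h.stop:
			return
		}
	}
}

func main() {
	h := newHub()
	go h.run()

	a, b := &client{id: 1}, &client{id: 2}
	h.register <- a
	h.register <- b
	h.broadcast <- "hello" // fan-out: every registered client receives it

	close(h.stop)
	<-h.done
	fmt.Println(len(a.out), len(b.out)) // prints "1 1"
}
```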
---
## Coding Patterns
### Package Layout
- All internal packages live under `internal/` (Go internal package convention — cannot be imported externally)
- Flat package structure — each package has one primary `.go` file
- Tests use `_test.go` suffix in the same package (white-box testing)
### Naming Conventions
- Exported functions/types: `PascalCase` (e.g., `New`, `Run`, `HandleWebSocket`, `Shutdown`)
- Config struct uses nested anonymous structs with `yaml` tags
- Metrics use package-level `var` block with `promauto` for auto-registration
### Error Handling
- Fatal errors at startup → `log.Fatal()`
- WebSocket upgrade errors → logged and returned (no panic)
- Write errors during broadcast → client removed with proper metrics update
- Config load errors → fatal (server won't start without valid config)
- Shutdown errors → logged but not fatal (best-effort cleanup)
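
A minimal sketch of the "write error → client removed" rule, assuming writes happen under a read lock so removal must wait for the write lock. The `conn` interface, `fakeConn`, and the plain `disconnections` integer (standing in for the Prometheus counter) are hypothetical; the real hub may structure this differently.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// conn is the only behaviour the sketch needs from a connection.
type conn interface {
	WriteMessage(msg []byte) error
}

// fakeConn is a test double: it either records messages or fails every write.
type fakeConn struct {
	fail bool
	got  [][]byte
}

func (c *fakeConn) WriteMessage(msg []byte) error {
	if c.fail {
		return errors.New("write failed")
	}
	c.got = append(c.got, msg)
	return nil
}

type hub struct {
	mu             sync.RWMutex
	clients        map[conn]bool
	disconnections int // stand-in for a Prometheus counter
}

// broadcast writes to every client, then drops the ones whose write failed.
func (h *hub) broadcast(msg []byte) {
	var failed []conn

	h.mu.RLock()
	for c := range h.clients {
		if err := c.WriteMessage(msg); err != nil {
			failed = append(failed, c) // the map cannot be modified under RLock
		}
	}
	h.mu.RUnlock()

	if len(failed) == 0 {
		return
	}
	h.mu.Lock()
	for _, c := range failed {
		if h.clients[c] { // skip clients already removed elsewhere
			delete(h.clients, c)
			h.disconnections++
		}
	}
	h.mu.Unlock()
}

func main() {
	good, bad := &fakeConn{}, &fakeConn{fail: true}
	h := &hub{clients: map[conn]bool{good: true, bad: true}}
	h.broadcast([]byte("hello"))
	fmt.Println(len(h.clients), h.disconnections, len(good.got)) // prints "1 1 1"
}
```

Checking `h.clients[c]` before deleting keeps the counter accurate if the same client was already unregistered between the two lock phases.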
### Concurrency Pattern
- CSP via channels (not mutexes for coordination)
- The Hub `select` loop is the single coordination point
- A `sync.RWMutex` additionally guards the clients map while a broadcast iterates over it
- `stop` channel (closed on shutdown) signals the Hub to terminate
### Graceful Shutdown Pattern
- `main()` listens for SIGINT/SIGTERM via `os/signal`
- On signal: stops accepting new HTTP connections → shuts down Hub → exits
- 10-second timeout ensures the process doesn't hang indefinitely
- Clients receive a WebSocket Close frame (`CloseGoingAway`) before disconnection
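
The shutdown sequence above could look roughly like the sketch below, which uses only the standard library. `newServer`, `shutdown`, and the `stopHub` hook are hypothetical names, not the functions in `main.go`. The one-second context timeout in `main` exists only so the demo terminates on its own. Because `http.Server.Shutdown` does not close hijacked connections such as WebSockets, the hub has to close its clients itself.

```go
package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// shutdownTimeout mirrors the 10-second limit described above.
const shutdownTimeout = 10 * time.Second

// newServer builds an HTTP server on an ephemeral local port (illustrative address).
func newServer() *http.Server {
	return &http.Server{Addr: "127.0.0.1:0", Handler: http.NewServeMux()}
}

// shutdown stops the HTTP server first, then the hub, under one deadline.
// stopHub is a hypothetical hook standing in for the hub's Shutdown method.
func shutdown(srv *http.Server, stopHub func()) error {
	ctx, cancel := context.WithTimeout(context.Background(), shutdownTimeout)
	defer cancel()
	err := srv.Shutdown(ctx) // stop accepting new HTTP connections
	stopHub()                // then let the hub send Close frames and exit
	return err
}

func main() {
	// ctx is cancelled when SIGINT or SIGTERM arrives.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()
	// Demo only: also cancel after one second so this sketch exits by itself.
	ctx, cancel := context.WithTimeout(ctx, time.Second)
	defer cancel()

	srv := newServer()
	go func() {
		if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
			log.Printf("listen: %v", err)
		}
	}()

	<-ctx.Done()
	if err := shutdown(srv, func() { log.Println("hub stopped") }); err != nil {
		log.Printf("shutdown: %v", err) // logged, not fatal
	}
}
```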
---
## How to Write & Run Tests
### Running Tests
```bash
make test # Runs: go test -v ./...
go test ./internal/hub/ # Single package
go test -run TestNew ./internal/hub/ # Single test
```
### Test Conventions
- Test files: `*_test.go` in same package
- Use standard `testing.T` — no test framework
- Table-driven tests not yet adopted (tests are simple)
- Temp files for config tests (`os.CreateTemp`)
- Hub tests start `go h.Run()` and use `defer h.Shutdown()` for cleanup
- Integration tests use `httptest.Server` + real WebSocket dials
### Adding New Tests
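```go
// File: internal/<pkg>/<pkg>_test.go
// Shown for the hub package; change the package name and setup for other packages.
package hub

import "testing"

func TestFeature(t *testing.T) {
	// Setup: start the hub loop and stop it when the test ends.
	h := New()
	go h.Run()
	defer h.Shutdown()

	// Assert: a freshly started hub has no clients.
	if got := h.ClientCount(); got != 0 {
		t.Errorf("expected 0 clients, got %d", got)
	}
}
```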
### Missing Test Coverage
- `internal/metrics`: no tests (metrics are auto-registered, so tests would mostly exercise the Prometheus client library)
- No benchmarks
---
## Configuration
Config is loaded from YAML (default: `config.yaml`, override with `--config-file` flag):
```yaml
server:
  port: 8443
  tls:
    enabled: false # Set true + provide cert/key for wss://
    cert_file: cert.pem
    key_file: key.pem
metrics:
  enabled: true
  port: 9090
```
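
A minimal sketch of the `--config-file` override using the standard `flag` package. The `configPath` helper and the flag-set name are assumptions for illustration; how `main.go` actually parses the flag is not shown here.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// configPath returns the config file path from args, defaulting to "config.yaml".
func configPath(args []string) (string, error) {
	fs := flag.NewFlagSet("websocket-relay", flag.ContinueOnError)
	path := fs.String("config-file", "config.yaml", "path to the YAML config file")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *path, nil
}

func main() {
	path, err := configPath(os.Args[1:])
	if err != nil {
		os.Exit(2) // flag has already printed the error and usage
	}
	fmt.Println("loading config from", path)
}
```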
---
## Build & Deployment
```bash
make build # → build/websocket-relay
make release # → build/websocket-relay-linux-amd64, build/websocket-relay-darwin-arm64
make run # → go run .
make deps # → go mod tidy
make clean # → rm build artifacts
```
### Release Process
1. Tag with `v*` prefix (e.g., `git tag v1.2.0`)
2. Push tag → Gitea Actions builds linux/amd64 + darwin/arm64
3. Binaries uploaded as Gitea Release assets
---
## Known Issues & Technical Debt
| Issue | Severity | Location |
|-------|----------|----------|
| No message size limits (`ReadLimit`) | Security | `internal/hub/hub.go` |
| No connection count limits | Security | `internal/hub/hub.go` |
| `gorilla/websocket` is archived | Debt | `go.mod` |
---
## Adding Features — Quick Guide
### Adding a new config field
1. Add field to `Config` struct in `internal/config/config.go` with `yaml` tag
2. Add to `config.yaml` and `config.example.yaml`
3. Use in `main.go` via `cfg.YourSection.YourField`
### Adding a new metric
1. Add a `var` to `internal/metrics/metrics.go` using `promauto.NewGauge`, `promauto.NewCounter`, or `promauto.NewHistogram` (counter names end in `_total`)
2. Call `metrics.YourMetric.Inc()` (or `.Set()` for gauges, `.Observe()` for histograms) where needed
### Adding a new HTTP endpoint
1. Add handler method to Hub or create new handler
2. Register in `main.go`: `mux.HandleFunc("/path", handler)`
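
As a rough illustration of these two steps, the sketch below registers a hypothetical `/healthz` handler on a `http.ServeMux` and exercises it in memory with `httptest`. The endpoint, `newMux`, and `probe` are invented for the example; the project does not define them.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthz is a hypothetical handler that always answers "ok".
func healthz(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; charset=utf-8")
	fmt.Fprintln(w, "ok")
}

// newMux registers the handler the same way main.go would.
func newMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", healthz)
	return mux
}

// probe sends an in-memory GET through the mux and returns status and body.
func probe(mux *http.ServeMux, path string) (int, string) {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe(newMux(), "/healthz")
	fmt.Printf("%d %q\n", code, body) // prints: 200 "ok\n"
}
```

The same `probe` pattern works in a `_test.go` file, so a new endpoint can be covered without opening a real listener.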
### Adding a new internal package
1. Create `internal/<name>/<name>.go`
2. Import as `websocket-relay/internal/<name>`
---
## Detailed Documentation
For deeper analysis, see `.agents/summary/index.md`, which provides a complete knowledge base with:
- Architecture diagrams and concurrency model
- Component-level documentation with all structs/methods
- Complete interface specifications
- Data model definitions and state machines
- Workflow sequence diagrams
- Dependency analysis and security notes
- Prioritized improvement recommendations