Introducing Pilum: A Recipe-Driven Deployment Orchestrator
Today I’m open-sourcing Pilum, a multi-service deployment orchestrator written in Go. This post covers the context, architecture decisions, and technical implementation details.
Context
At SID Technologies, we run services across multiple cloud providers and distribution channels. A typical release involves:
- API services deployed to GCP Cloud Run
- CLI tools distributed via Homebrew
- Background workers on Cloud Run with different scaling configs
- Static assets synced to S3
Each platform has its own deployment CLI, authentication model, and configuration format. The cognitive overhead compounds: remembering gcloud run deploy flags versus brew tap semantics versus aws s3 sync options. Deployments became a copy-paste ritual of shell scripts, and the scripts inevitably drifted out of sync.
I wanted a single command that could deploy any service to any target, with the deployment logic defined declaratively rather than imperatively.
Inspiration: The Roman Pilum
The pilum was the javelin of the Roman legions. Its design was elegant in its specificity: a long iron shank connected to a wooden shaft, with a weighted pyramidal tip. It was engineered for a single purpose - to be thrown once and penetrate the target.
The soft iron shank would bend on impact, preventing the enemy from throwing it back and rendering their shield useless. One weapon. One throw. Mission accomplished.
This resonated with what I wanted from a deployment tool: define the target once, execute once, hit precisely.
The Problem with Existing Tools
Why not Terraform? Or Pulumi? Or just shell scripts?
Terraform/Pulumi are infrastructure-as-code tools optimized for provisioning resources. They’re declarative about what exists, not how to deploy code to those resources. Deploying a new version of a Cloud Run service requires running gcloud commands, not HCL.
Shell scripts work until they don’t. They’re imperative, hard to test, and the “deployment logic” gets scattered across Makefiles, CI configs, and random bash files. Adding a new provider means duplicating logic.
Platform-specific CLIs (goreleaser, ko, etc.) are excellent but single-purpose. You still need orchestration to coordinate multiple services across multiple platforms.
What I wanted:
- Declarative service configuration (YAML)
- Reusable deployment workflows (recipes)
- Parallel execution with step barriers
- Provider-agnostic core with pluggable handlers
Architecture
Pilum’s architecture separates three concerns:
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│ Service Config  │     │     Recipe      │     │    Handlers     │
│ (service.yaml)  │────▶│  (recipe.yaml)  │────▶│ (Go functions)  │
└─────────────────┘     └─────────────────┘     └─────────────────┘
       WHAT                     HOW               IMPLEMENTATION
Service configs declare what you’re deploying: name, provider, region, build settings. These live in your repo alongside your code.
Recipes define how to deploy to a provider: the ordered sequence of steps, required fields, timeouts. These are YAML files that ship with Pilum.
Handlers implement the actual commands: building Docker images, pushing to registries, calling cloud CLIs. These are Go functions registered at startup.
Service Configuration
A minimal service.yaml:
name: api-gateway
provider: gcp
project: my-project
region: us-central1
build:
  language: go
  version: "1.23"
The provider field determines which recipe is used. All other fields are validated against that recipe’s requirements.
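To make that lookup concrete, here is a minimal sketch of loading a service.yaml and resolving a recipe from its provider field. The struct mirrors the example above, but the helper names and the use of gopkg.in/yaml.v3 are my assumptions for illustration, not necessarily Pilum's internals.
// Sketch only; assumes imports: "fmt", "os", "gopkg.in/yaml.v3".
type ServiceConfig struct {
	Name     string `yaml:"name"`
	Provider string `yaml:"provider"`
	Project  string `yaml:"project"`
	Region   string `yaml:"region"`
}

// loadService reads and parses a service.yaml file.
func loadService(path string) (*ServiceConfig, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg ServiceConfig
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

// resolveRecipeName maps a provider to the recipe that handles it,
// e.g. "gcp" → "gcp-cloud-run" for the recipe shown below.
func resolveRecipeName(cfg *ServiceConfig, recipesByProvider map[string]string) (string, error) {
	name, ok := recipesByProvider[cfg.Provider]
	if !ok {
		return "", fmt.Errorf("no recipe registered for provider %q", cfg.Provider)
	}
	return name, nil
}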
Recipe System
Recipes are the core abstraction. Here’s the GCP Cloud Run recipe:
name: gcp-cloud-run
description: Deploy to Google Cloud Run
provider: gcp
service: cloud_run
required_fields:
  - name: project
    description: GCP project ID
    type: string
  - name: region
    description: GCP region to deploy to
    type: string
steps:
  - name: build binary
    execution_mode: service_dir
    timeout: 300
  - name: build docker image
    execution_mode: service_dir
    timeout: 300
  - name: publish to registry
    execution_mode: root
    timeout: 120
  - name: deploy to cloud run
    execution_mode: root
    timeout: 180
default_retries: 2
The recipe declares:
- Required fields: Validated before any execution. If your service.yaml is missing `project`, you get an error immediately, not 10 minutes into a build.
- Steps: Ordered sequence of operations. Each step has a name, execution mode, and timeout.
- Execution mode: `service_dir` runs in the service’s directory, `root` runs from the project root (see the struct sketch below).
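For reference, here is one way the recipe YAML could be modeled as Go structs. The field and type names are my own assumptions for illustration; Pilum's actual definitions may differ.
// Illustrative structs for the recipe YAML above; names are assumptions.
type RequiredField struct {
	Name        string `yaml:"name"`
	Description string `yaml:"description"`
	Type        string `yaml:"type"`
	Default     string `yaml:"default,omitempty"`
}

type Step struct {
	Name          string `yaml:"name"`
	ExecutionMode string `yaml:"execution_mode"` // "service_dir" or "root"
	Timeout       int    `yaml:"timeout"`        // seconds
}

type Recipe struct {
	Name           string          `yaml:"name"`
	Description    string          `yaml:"description"`
	Provider       string          `yaml:"provider"`
	Service        string          `yaml:"service"`
	RequiredFields []RequiredField `yaml:"required_fields"`
	Steps          []Step          `yaml:"steps"`
	DefaultRetries int             `yaml:"default_retries"`
}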
Recipe Validation
The validation logic uses reflection to check service configs against recipe requirements:
func (r *Recipe) ValidateService(svc *serviceinfo.ServiceInfo) error {
	for _, field := range r.RequiredFields {
		value := getServiceField(svc, field.Name)
		if value == "" && field.Default == "" {
			return fmt.Errorf("recipe '%s' requires field '%s': %s",
				r.Name, field.Name, field.Description)
		}
	}
	return nil
}
The getServiceField function first checks a hardcoded map of common fields, then falls back to the raw config map, and finally uses reflection as a last resort. This gives us type safety for known fields while remaining flexible for custom recipe requirements.
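A sketch of what that lookup order could look like, under the assumption that ServiceInfo exposes a RawConfig map (map[string]any) alongside its typed fields; the actual implementation may differ.
// Sketch only; assumes imports: "reflect", "strings".
// The RawConfig map and exact field set are assumptions for illustration.
func getServiceField(svc *serviceinfo.ServiceInfo, name string) string {
	// 1. Hardcoded map of common, strongly typed fields.
	known := map[string]string{
		"name":     svc.Name,
		"provider": svc.Provider,
		"project":  svc.Project,
		"region":   svc.Region,
	}
	if v, ok := known[name]; ok && v != "" {
		return v
	}

	// 2. Fall back to the raw config map for custom recipe fields.
	if v, ok := svc.RawConfig[name]; ok {
		if s, ok := v.(string); ok {
			return s
		}
	}

	// 3. Last resort: reflection over exported struct fields.
	val := reflect.ValueOf(*svc)
	f := val.FieldByNameFunc(func(n string) bool {
		return strings.EqualFold(n, name)
	})
	if f.IsValid() && f.Kind() == reflect.String {
		return f.String()
	}
	return ""
}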
Command Registry
Steps map to handlers via a pattern-matching registry:
type StepHandler func(ctx StepContext) any

type CommandRegistry struct {
	handlers map[string]StepHandler
}

func (cr *CommandRegistry) Register(pattern string, provider string, handler StepHandler) {
	key := cr.buildKey(pattern, provider)
	cr.handlers[key] = handler
}

func (cr *CommandRegistry) GetHandler(stepName string, provider string) (StepHandler, bool) {
	// Try provider-specific first, fall back to generic
	// Pattern matching is case-insensitive with partial match support
}
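The lookup body is elided above; here is one plausible implementation of the provider-specific-then-generic, case-insensitive partial matching it describes. Treat it as a sketch rather than Pilum's exact code; splitKey is an assumed helper that inverts buildKey.
// Sketch only; assumes import: "strings". splitKey is an assumed helper.
func (cr *CommandRegistry) GetHandler(stepName string, provider string) (StepHandler, bool) {
	step := strings.ToLower(stepName)

	lookup := func(prov string) (StepHandler, bool) {
		for key, handler := range cr.handlers {
			pattern, keyProv := cr.splitKey(key) // inverse of buildKey
			if keyProv != prov {
				continue
			}
			// Case-insensitive partial match: the step name contains the pattern.
			if strings.Contains(step, strings.ToLower(pattern)) {
				return handler, true
			}
		}
		return nil, false
	}

	// Provider-specific registrations take precedence...
	if h, ok := lookup(provider); ok {
		return h, true
	}
	// ...then fall back to a generic (provider-less) handler.
	return lookup("")
}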
This design allows:
- Generic handlers (`build` works for any provider)
- Provider-specific overrides (`deploy:gcp` vs `deploy:aws`)
- Partial matching (`build binary` matches the `build` handler)
Handlers return `any` because commands can be strings or string slices:
func buildHandler(ctx StepContext) any {
	return []string{
		"go", "build",
		"-ldflags", fmt.Sprintf("-X main.version=%s", ctx.Tag),
		"-o", fmt.Sprintf("dist/%s", ctx.Service.Name),
		".",
	}
}
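Handler registration at startup might then look like this; buildHandler is the function above, while the GCP-specific deploy handler is assumed purely for the sake of the example.
// Illustrative wiring at startup; gcpDeployHandler is an assumed handler.
func registerDefaultHandlers(cr *CommandRegistry) {
	cr.Register("build", "", buildHandler)         // generic: "build binary" matches via partial matching
	cr.Register("deploy", "gcp", gcpDeployHandler) // provider-specific override for GCP deploys
}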
Parallel Execution with Step Barriers
The orchestrator executes services in parallel within each step, but steps execute sequentially:
func (r *Runner) Run() error {
	maxSteps := r.findMaxSteps()
	for stepIdx := 0; stepIdx < maxSteps; stepIdx++ {
		err := r.executeStep(stepIdx)
		if err != nil {
			return err // Fail fast
		}
	}
	return nil
}
Within each step, a worker pool processes services concurrently:
func (r *Runner) executeTasksParallel(tasks []stepTask) error {
	var wg sync.WaitGroup
	semaphore := make(chan struct{}, r.getWorkerCount())
	for _, t := range tasks {
		task := t // capture the loop variable for the goroutine
		wg.Add(1)
		go func() {
			defer wg.Done()
			semaphore <- struct{}{}        // acquire a worker slot
			defer func() { <-semaphore }() // release it
			result := r.executeTask(task.service, task.step)
			_ = result // ... collect results; errors surface after Wait
		}()
	}
	wg.Wait()
	return nil // ... error aggregation elided
}
This means:
- All services build in parallel (step 1)
- Once all builds complete, all pushes happen in parallel (step 2)
- Once all pushes complete, all deploys happen in parallel (step 3)
The barrier between steps ensures dependencies are satisfied. You can’t push an image that hasn’t been built.
Variable Substitution
Recipe commands support variable substitution:
func (r *Runner) substituteVars(cmd any, svc serviceinfo.ServiceInfo) any {
	replacer := strings.NewReplacer(
		"${name}", svc.Name,
		"${service.name}", svc.Name,
		"${provider}", svc.Provider,
		"${region}", svc.Region,
		"${project}", svc.Project,
		"${tag}", r.options.Tag,
	)
	// Handle string, []string, and []any
}
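The type handling hinted at by that trailing comment could look roughly like this helper; it's a sketch, not necessarily Pilum's exact code.
// Sketch only; assumes import: "strings".
// substitute applies the replacer to a command that may be a single
// string, a []string, or a []any of mixed values.
func substitute(replacer *strings.Replacer, cmd any) any {
	switch c := cmd.(type) {
	case string:
		return replacer.Replace(c)
	case []string:
		out := make([]string, len(c))
		for i, s := range c {
			out[i] = replacer.Replace(s)
		}
		return out
	case []any:
		out := make([]any, len(c))
		for i, v := range c {
			if s, ok := v.(string); ok {
				out[i] = replacer.Replace(s)
			} else {
				out[i] = v
			}
		}
		return out
	default:
		return cmd
	}
}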
This allows recipes to use service-specific values without hardcoding:
steps:
  - name: deploy
    command: gcloud run deploy ${name} --region=${region} --project=${project}
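For the api-gateway service.yaml shown earlier, that command expands to:
gcloud run deploy api-gateway --region=us-central1 --project=my-project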
Why Open Source
A few reasons:
Dogfooding: Pilum deploys itself via Homebrew. Every release exercises the tool’s own recipe system. If it breaks, we feel it immediately.
Extensibility: The recipe system is designed for customization. You can add new providers by creating a YAML file - no Go code required for basic workflows. For complex providers, you register handlers.
Validation: Open source means more eyes on the code. Deployment tools have security implications. I’d rather have the community find issues than discover them in production.
Getting Started
Install via Homebrew:
brew tap sid-technologies/pilum
brew install pilum
Create a service.yaml in your project:
name: my-service
provider: gcp
project: my-gcp-project
region: us-central1
Validate your configuration:
pilum check
Deploy:
pilum deploy --tag=v1.0.0
Preview without executing:
pilum deploy --dry-run --tag=v1.0.0
What’s Next
Current providers: GCP Cloud Run, Homebrew, AWS Lambda (in progress).
On the roadmap:
- Kubernetes (kubectl apply workflows)
- AWS ECS
- GitHub Releases integration
- Parallel recipe discovery across monorepos