Illegal States Won't Compile
Before Orbit can place a workload or reclaim a core, it has to know how to say what a workload is. That sounds like the boring part. It isn't. The shape of the object model is where Borg and Kubernetes both left scars, in opposite directions, and getting it wrong taxes everything built on top.
The two ways to get grouping wrong
Borg had one way to group work: the job. A job was a set of identical tasks, and that was the only structure the system understood. If your service was actually five jobs that belonged together, Borg had no place to record that. So people wrote the relationship into the job names and enforced it with documentation. The system was blind to the thing the humans cared about most.
Kubernetes saw that and went the other way. Everything is a flat object, and you relate objects with labels: arbitrary key-value pairs you attach and then select on. It's flexible, and the flexibility is the problem. There's no built-in notion of "this service is these deployments." You assemble that relationship out of label conventions, and a typo in a selector is a silent outage rather than an error. Borg made you smuggle structure through strings. Kubernetes made you build all structure out of strings.
I wanted both halves: a real hierarchy that the system understands, and arbitrary tags for the ad-hoc grouping that hierarchies can't anticipate.
The hierarchy
Four nouns, top to bottom. A Mission is a whole application. It owns Modules, each a set of identical replicas of one binary. A Module schedules as Capsules, each a resource envelope on a single machine. Inside a Capsule run Payloads, the actual containers.
pub struct Mission {
    pub id: MissionId,
    pub owner: Principal,
    pub modules: Vec<Module>,
    pub tags: Tags,            // arbitrary key/value: the queryable layer
    pub intent: ServiceIntent, // what you want, not how to size it
    pub ingress: Vec<Ingress>,
}
The Mission is the thing Borg couldn't name: the unit a human thinks in. It has an owner, a set of modules, an intent (the SLOs you care about, which a later post hands to the autoscaler), and a tags field. The tags are the escape hatch. The hierarchy gives you structure the system enforces; the tags give you grouping the system doesn't have to know about in advance. You don't have to choose.
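To make the "queryable layer" concrete, here is a minimal sketch of selecting missions by tag. It assumes `Tags` is a string-to-string map and uses a cut-down `Mission` with a plain `String` id; the post does not show either definition, and `matches`, `select`, and `mission` are hypothetical helpers.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in: the post does not show how `Tags` is defined.
type Tags = BTreeMap<String, String>;

// Cut-down mission: only the fields this sketch needs.
struct Mission {
    id: String,
    tags: Tags,
}

/// True when every (key, value) pair in `selector` is present in `tags`.
fn matches(tags: &Tags, selector: &[(&str, &str)]) -> bool {
    selector
        .iter()
        .all(|(k, v)| tags.get(*k).map(String::as_str) == Some(*v))
}

/// Ids of the missions whose tags satisfy the whole selector.
fn select<'a>(missions: &'a [Mission], selector: &[(&str, &str)]) -> Vec<&'a str> {
    missions
        .iter()
        .filter(|m| matches(&m.tags, selector))
        .map(|m| m.id.as_str())
        .collect()
}

/// Convenience constructor for examples and tests.
fn mission(id: &str, tags: &[(&str, &str)]) -> Mission {
    Mission {
        id: id.to_string(),
        tags: tags
            .iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect(),
    }
}
```

Selecting on `[("team", "payments"), ("tier", "prod")]` returns only the missions carrying both pairs. A selector that matches nothing returns an empty list, which is the failure mode the hierarchy is there to avoid for relationships the system itself depends on.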
A Module carries the things that actually drive scheduling:
pub struct Module {
    pub name: ModuleName,
    pub payload: PayloadSpec,
    pub replicas: ReplicaPolicy, // Fixed(n) | Managed
    pub appclass: AppClass,
    pub placement: PlacementConstraints,
    pub gang: GangPolicy,        // Independent | Gang { .. }
    pub checkpoint: Option<CheckpointPolicy>,
    pub preemptible: bool,
    pub tags: Tags,
}
I'll come back to most of these fields in their own posts: `appclass` and reclamation, `gang` and ML, `replicas: Managed` and the autoscaler. For now the point is that they're typed fields on a struct, not free-form annotations. The constraints are split into hard and soft, which is a distinction Borg drew and Kubernetes mostly blurred:
pub struct PlacementConstraints {
    pub require: Vec<Predicate>,       // hard: must hold
    pub prefer: Vec<(Predicate, f64)>, // soft: weighted preference
    pub spread: Vec<FailureDomain>,    // anti-affinity: Host | Rack | Power | Zone
}
A require predicate that doesn't hold makes a node infeasible. A prefer predicate just moves the score. That's the whole hard-versus-soft story, and it's in the types, so a scheduler can't accidentally treat a preference as a requirement.
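A sketch of how a scheduler might consume that split: hard predicates filter, soft predicates score. The `Predicate` variants, the `Node` type, and the additive scoring rule are all assumptions made for illustration; the post defines none of them, and `spread` is left out.

```rust
// Hypothetical stand-ins: the post does not define `Predicate` or a node type.
enum Predicate {
    HasLabel(&'static str),
    MinFreeCores(u32),
}

struct Node {
    name: &'static str,
    labels: Vec<&'static str>,
    free_cores: u32,
}

struct PlacementConstraints {
    require: Vec<Predicate>,       // hard: must hold
    prefer: Vec<(Predicate, f64)>, // soft: weighted preference
}

impl Predicate {
    fn holds(&self, node: &Node) -> bool {
        match self {
            Predicate::HasLabel(l) => node.labels.contains(l),
            Predicate::MinFreeCores(n) => node.free_cores >= *n,
        }
    }
}

/// `None` means infeasible: some hard constraint failed.
/// `Some(score)` is the sum of the weights of the soft preferences that hold.
fn score(c: &PlacementConstraints, node: &Node) -> Option<f64> {
    if !c.require.iter().all(|p| p.holds(node)) {
        return None;
    }
    Some(
        c.prefer
            .iter()
            .filter(|(p, _)| p.holds(node))
            .map(|(_, w)| *w)
            .sum::<f64>(),
    )
}

/// Highest-scoring feasible node, if any node is feasible at all.
fn best<'a>(c: &PlacementConstraints, nodes: &'a [Node]) -> Option<&'a Node> {
    nodes
        .iter()
        .filter_map(|n| score(c, n).map(|s| (n, s)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(n, _)| n)
}
```

Returning `Option<f64>` keeps the two kinds of constraint from collapsing into one number: an infeasible node has no score at all, so no weight, however large, can make it win.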
A Mission isn't a Mission until it validates
Here's the part that matters more than the field list. Every type in the model has a validate method, and it returns a real error enum, not a bool:
pub enum ModelValidationError {
    EmptyMissionId,
    EmptyModules,
    DuplicateModule(ModuleName),
    EmptyImage { module: ModuleName },
    DuplicatePortName { module: ModuleName, port: String },
    InvalidIngressHost { host: String },
    UnknownIngressModule { ingress: String, module: ModuleName },
    IngressPortMissingTarget { ingress: String, module: ModuleName, port: String },
    FixedReplicasZero { module: ModuleName },
    GangMembersZero { module: ModuleName },
    InvalidAvailability,
    // ... about thirty of these
}
There are around thirty variants, and each one names a specific way a mission can be malformed. A module with no image. Two ports with the same name. An ingress route pointing at a module that doesn't exist, or at a port with no target, or a host that isn't a valid DNS name. An availability target of 1.5. A gang with zero members.
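A few of those checks are small enough to sketch in full. This is a cut-down illustration, not Orbit's code: plain strings and numbers stand in for `ModuleName` and `ServiceIntent`, and the accepted availability range of (0, 1] is an assumption.

```rust
use std::collections::HashSet;

// Cut-down, hypothetical versions of the post's types.
#[derive(Debug, PartialEq)]
enum ModelValidationError {
    EmptyModules,
    DuplicateModule(String),
    FixedReplicasZero { module: String },
    InvalidAvailability,
}

enum ReplicaPolicy {
    Fixed(u32),
    Managed,
}

struct Module {
    name: String,
    replicas: ReplicaPolicy,
}

struct Mission {
    modules: Vec<Module>,
    availability: f64, // assumed to be a fraction in (0, 1]
}

impl Mission {
    /// Checks the intent first, then each module, and stops at the first problem.
    fn validate(&self) -> Result<(), ModelValidationError> {
        use ModelValidationError::*;
        // Written as a negated range test so that NaN is rejected too.
        if !(self.availability > 0.0 && self.availability <= 1.0) {
            return Err(InvalidAvailability);
        }
        if self.modules.is_empty() {
            return Err(EmptyModules);
        }
        let mut seen = HashSet::new();
        for m in &self.modules {
            // `insert` returns false when the name was already present.
            if !seen.insert(m.name.as_str()) {
                return Err(DuplicateModule(m.name.clone()));
            }
            if let ReplicaPolicy::Fixed(0) = m.replicas {
                return Err(FixedReplicasZero { module: m.name.clone() });
            }
        }
        Ok(())
    }
}
```

Because the result is an enum rather than a bool, a caller can match on the variant and report exactly which module or field is wrong.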
Mission::validate walks the whole tree and the checks compose: the mission validates its intent, then each module, then cross-references every ingress route against the modules and ports it claims to target. That last one is the kind of bug labels can't catch for you. An ingress route names a module and a port, and validation confirms both exist and the port actually has a target before the mission is allowed to exist:
let Some(module) = self.modules.iter().find(|m| m.name == route.module) else {
    return Err(ModelValidationError::UnknownIngressModule { /* .. */ });
};
let Some(port) = module.payload.ports.iter().find(|p| p.name == route.port_name) else {
    return Err(ModelValidationError::UnknownIngressPort { /* .. */ });
};
if port.target_port.is_none() {
    return Err(ModelValidationError::IngressPortMissingTarget { /* .. */ });
}
In a YAML world this is a runtime surprise: you apply the manifest, it's accepted, and the route 404s in production because the port name had a typo. Here it's a typed error returned before the mission is admitted. The config post covers how a user's `.orbit` file turns into one of these, but the validation lives in the core types, so every path that can produce a Mission (the CLI, the API, the simulator) gets the same checks for free.
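One way to make "every path gets the same checks" structural rather than a convention is to give the validated type private fields and a single checked constructor. The structs shown in this post have public fields, so this is a different shape offered only as a sketch; `MissionDraft`, `Mission::new`, and the accessors are hypothetical names.

```rust
mod model {
    #[derive(Debug, PartialEq)]
    pub enum ModelValidationError {
        EmptyMissionId,
        EmptyModules,
    }

    /// Unchecked input, as a parser or API handler might produce it.
    pub struct MissionDraft {
        pub id: String,
        pub modules: Vec<String>,
    }

    /// Fields are private, so code outside this module cannot build a
    /// `Mission` with a struct literal. `Mission::new` is the only way in.
    #[derive(Debug)]
    pub struct Mission {
        id: String,
        modules: Vec<String>,
    }

    impl Mission {
        pub fn new(draft: MissionDraft) -> Result<Mission, ModelValidationError> {
            if draft.id.trim().is_empty() {
                return Err(ModelValidationError::EmptyMissionId);
            }
            if draft.modules.is_empty() {
                return Err(ModelValidationError::EmptyModules);
            }
            Ok(Mission { id: draft.id, modules: draft.modules })
        }

        pub fn id(&self) -> &str {
            &self.id
        }

        pub fn module_count(&self) -> usize {
            self.modules.len()
        }
    }
}

use model::{Mission, MissionDraft, ModelValidationError};
```

With this shape, a function that takes a `Mission` never needs to re-check it: holding the value is the proof that validation ran. The cost is that every field access goes through a method.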
The state machine that can't forget a case
The best example of letting the compiler do the work is the Capsule lifecycle. A capsule moves through a handful of states, and the legal moves between them are a fixed diagram (Borg drew it as Figure 2 in their paper). The naive way to implement that is a status field and a pile of `if` statements that mutate it, and the bug that always ships is the transition nobody handled.
Instead the state is an enum and the transition is a single total function over `(state, event)`:
pub enum CapsuleState {
    Pending { since: DateTime<Utc>, reason: PendingReason },
    Scheduled { node: NodeId, at: DateTime<Utc> },
    Running { node: NodeId, started: DateTime<Utc> },
    Draining { node: NodeId, deadline: DateTime<Utc> },
    Dead { outcome: Outcome },
}
impl CapsuleState {
    pub fn apply(self, ev: LifecycleEvent) -> CapsuleTransition {
        use CapsuleState::*;
        use LifecycleEvent::*;
        let next = match (self, ev) {
            (Pending { .. }, Schedule { node }) => Scheduled { node, at: Utc::now() },
            (Scheduled { node, .. }, Started) => Running { node, started: Utc::now() },
            (Running { node, .. }, Preempt { deadline }) => Draining { node, deadline },
            (Running { .. } | Scheduled { .. } | Draining { .. }, Finish(o)) => Dead { outcome: o },
            (_, Lost) => Dead { outcome: Outcome::Lost },
            (Dead { outcome }, _) => Dead { outcome },
            // ... a few more real edges
            (s, e) => return CapsuleTransition::Illegal(format!("{s:?} cannot transition on {e:?}")),
        };
        CapsuleTransition::Ok(next)
    }
}
Rust checks that match for exhaustiveness, so every (state, event) pair has to land on some arm. The catch-all at the bottom is what satisfies the check, and it turns anything I didn't explicitly allow into an `Illegal` transition rather than a silent corruption. A forgotten edge surfaces as an explicit rejection, never as a capsule that quietly slipped into a state nobody meant.
A couple of the real edges are more interesting than the happy path. Pending plus a preempt is a no-op: you can't preempt something that isn't running yet, so it stays pending. And a capsule that's already Draining when a second preempt arrives keeps the *earlier* deadline:
(Draining { node, deadline: old }, Preempt { deadline: new }) => {
    let deadline = if new < old { new } else { old };
    Draining { node, deadline }
}
That's a small thing that would be easy to get wrong in ad-hoc code, and once it's a single arm in a total function it's easy to test exhaustively. Which is the other half of the story.
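Here is a self-contained sketch of the machine with both of those edges filled in, small enough to run without any dependencies. It simplifies on purpose: `u64` seconds stand in for `DateTime<Utc>`, `u32` for `NodeId`, `PendingReason` is dropped, `Outcome::Finished` is an invented variant, and the caller passes `now` in where the code above calls `Utc::now()`.

```rust
// Simplified stand-ins for the post's types; see the assumptions above.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Outcome {
    Finished, // hypothetical: the post only shows `Outcome::Lost`
    Lost,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Pending { since: u64 },
    Scheduled { node: u32, at: u64 },
    Running { node: u32, started: u64 },
    Draining { node: u32, deadline: u64 },
    Dead { outcome: Outcome },
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Event {
    Schedule { node: u32 },
    Started,
    Preempt { deadline: u64 },
    Finish(Outcome),
    Lost,
}

#[derive(Debug, PartialEq)]
enum Transition {
    Ok(State),
    Illegal(String),
}

fn apply(state: State, ev: Event, now: u64) -> Transition {
    use Event::*;
    use State::*;
    let next = match (state, ev) {
        (Pending { .. }, Schedule { node }) => Scheduled { node, at: now },
        (Scheduled { node, .. }, Started) => Running { node, started: now },
        (Running { node, .. }, Preempt { deadline }) => Draining { node, deadline },
        (Running { .. } | Scheduled { .. } | Draining { .. }, Finish(o)) => Dead { outcome: o },
        (_, Lost) => Dead { outcome: Outcome::Lost },
        (Dead { outcome }, _) => Dead { outcome },
        // Not running yet, so there is nothing to preempt: stay as we are.
        (s @ Pending { .. }, Preempt { .. }) => s,
        // A second preempt can only pull the deadline earlier.
        (Draining { node, deadline: old }, Preempt { deadline: new }) => {
            Draining { node, deadline: old.min(new) }
        }
        // Everything not listed above is rejected, not silently accepted.
        (s, e) => return Transition::Illegal(format!("{s:?} cannot transition on {e:?}")),
    };
    Transition::Ok(next)
}
```

Passing `now` as an argument makes every transition deterministic, so a test can assert on exact timestamps. Whether that is worth changing the signature is a judgment call the post does not make.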
Proving it instead of hoping
Because the transition is a pure function with no I/O, I can throw property tests at it and assert the invariants directly. Dead is terminal: every event applied to a dead capsule leaves it dead. The one event that changes anything is Lost, which overwrites the outcome with Lost because the `(_, Lost)` arm is matched before the Dead arm. A Lost event from *any* non-dead state lands on Dead { outcome: Lost }. These run across generated states and events on every build:
#[test]
fn prop_lost_from_any_state() {
    proptest!(|(reason_idx in 0u32..4, node_idx in 0u32..100)| {
        // pending, scheduled, running, draining: Lost always yields Dead(Lost)
        // (building `node` and the other three states from the generated indices is elided here)
        let running = CapsuleState::Running { node, started: Utc::now() };
        assert!(matches!(running.apply(LifecycleEvent::Lost),
            CapsuleTransition::Ok(CapsuleState::Dead { outcome: Outcome::Lost })));
    });
}
The same orbit-core crate that defines these types has no I/O in it at all. No network, no disk, no clock beyond Utc::now. That's deliberate, and it pays off twice: the invariants are testable in isolation, and the exact same types get reused everywhere, including the simulator that replays a whole Constellation in a single process. A pure core is the thing that makes the rest cheap.
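The same invariants can also be checked without a property-testing library, by a different technique: strip the payloads off the states and the space becomes small enough to enumerate in full. The sketch below does that. It is exhaustive enumeration rather than proptest, the machine is a payload-free approximation of the one above, and `Outcome::Finished` is an invented variant.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Outcome {
    Finished, // hypothetical: the post only shows `Outcome::Lost`
    Lost,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Pending,
    Scheduled,
    Running,
    Draining,
    Dead(Outcome),
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Event {
    Schedule,
    Started,
    Preempt,
    Finish,
    Lost,
}

const STATES: [State; 6] = [
    State::Pending,
    State::Scheduled,
    State::Running,
    State::Draining,
    State::Dead(Outcome::Finished),
    State::Dead(Outcome::Lost),
];

const EVENTS: [Event; 5] = [
    Event::Schedule,
    Event::Started,
    Event::Preempt,
    Event::Finish,
    Event::Lost,
];

/// `None` plays the role of `CapsuleTransition::Illegal`.
fn apply(s: State, e: Event) -> Option<State> {
    use Event::*;
    use State::*;
    Some(match (s, e) {
        (Pending, Schedule) => Scheduled,
        (Scheduled, Started) => Running,
        (Running, Preempt) => Draining,
        (Running | Scheduled | Draining, Finish) => Dead(Outcome::Finished),
        (_, Lost) => Dead(Outcome::Lost),
        (Dead(o), _) => Dead(o),
        (Pending, Preempt) | (Draining, Preempt) => s,
        _ => return None,
    })
}

/// Every state, dead or not, lands on Dead(Lost) when it receives Lost.
fn lost_always_yields_dead_lost() -> bool {
    STATES
        .iter()
        .all(|&s| apply(s, Event::Lost) == Some(State::Dead(Outcome::Lost)))
}

/// Once dead, no event is illegal and no event brings the capsule back.
fn dead_is_terminal() -> bool {
    STATES
        .iter()
        .filter(|s| matches!(s, State::Dead(_)))
        .all(|&s| EVENTS.iter().all(|&e| matches!(apply(s, e), Some(State::Dead(_)))))
}

/// Apart from Lost, a dead capsule keeps the outcome it already had.
fn dead_outcome_is_sticky() -> bool {
    STATES
        .iter()
        .filter(|s| matches!(s, State::Dead(_)))
        .all(|&s| {
            EVENTS
                .iter()
                .filter(|&&e| e != Event::Lost)
                .all(|&e| apply(s, e) == Some(s))
        })
}
```

Enumeration covers all 30 (state, event) pairs but says nothing about timestamps or deadlines, which is where generated inputs earn their keep.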
The honest tradeoff is that a fixed hierarchy is opinionated. If your mental model doesn't fit Mission-Module-Capsule-Payload, the tags are your relief valve, but the nouns themselves aren't negotiable the way Kubernetes' object soup is. I think that's the right trade: a system that understands your structure is worth more than one that can represent any structure and understands none of them. But it's a trade, and I'd rather say so than pretend the model is free.
Next: the ledger those Commitments live in, and how propose turns "I'd like to place this capsule here" into a durable, conflict-checked fact.