Messages are sent over nanomsg.

Message format is JSON.

All messages have one string key 'op', a string key 'role' and a
unique name field, currently 'host'.

'op' can be:
- 'discover'
- 'register'
- 'start'
- 'stop'
- 'stopped'
- 'output'

'role' can only be:
- 'master'
- 'coordinator'
- 'worker'

'host' is the FQDN or hostname of a coordinator (depends on the network
setup).
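
As a minimal sketch of the common fields described above, a receiver could
decode and check a message as follows. The function name parse_message and
the set ALLOWED_ROLES are hypothetical, not part of the protocol, and
treating 'host' as optional is an assumption made because the master's
example messages do not carry it.

```python
import json

# Roles named by the protocol text.
ALLOWED_ROLES = {"master", "coordinator", "worker"}


def parse_message(raw):
    """Decode one JSON message and check the common fields."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")
    if not isinstance(msg.get("op"), str):
        raise ValueError("missing or non-string 'op'")
    role = msg.get("role")
    if not isinstance(role, str) or role not in ALLOWED_ROLES:
        raise ValueError("missing or invalid 'role'")
    # Assumption: 'host' is optional, but must be a string when present.
    if "host" in msg and not isinstance(msg["host"], str):
        raise ValueError("'host' must be a string")
    return msg
```

The 'op' value is deliberately not checked against a fixed list here, so
the sketch keeps working when new operations are added.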

Operations:

Discovery:
----------

Master sends:

{ "op": "discover", "role": "master" }

All coordinators answer with:

{ "op": "register", "role": "coordinator", "host": "server1", 
  "cpus": 2, "os": "cpe:\/o:arch:arch:rolling", "arch": "x86_64" }

The coordinator sends its own configuration to the master.

On receiving a 'register' operation the master must handle it
accordingly, usually by marking the coordinator as known and alive and
by making its platform and architecture available to run workers on.
Currently scheduled jobs must also be re-examined.
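
The bookkeeping on the master side might look roughly like the sketch
below. The class CoordinatorRegistry and its methods are hypothetical
names, and the in-memory dictionary is an assumption. Re-examining
scheduled jobs is left out because the protocol does not describe it.

```python
import json


class CoordinatorRegistry:
    """Hypothetical master-side record of known coordinators."""

    def __init__(self):
        self.coordinators = {}  # host -> configuration sent by the coordinator

    def handle_register(self, msg):
        """Record a decoded 'register' message; return False for anything else."""
        if msg.get("op") != "register" or msg.get("role") != "coordinator":
            return False
        host = msg.get("host")
        if not isinstance(host, str):
            return False
        self.coordinators[host] = {
            "alive": True,
            "cpus": msg.get("cpus"),
            "os": msg.get("os"),
            "arch": msg.get("arch"),
        }
        return True

    def platforms(self):
        """Return the (os, arch) pairs offered by coordinators marked alive."""
        return {(c["os"], c["arch"])
                for c in self.coordinators.values() if c["alive"]}


if __name__ == "__main__":
    registry = CoordinatorRegistry()
    # Raw string so that the JSON escape \/ reaches the JSON decoder intact.
    raw = (r'{"op": "register", "role": "coordinator", "host": "server1", '
           r'"cpus": 2, "os": "cpe:\/o:arch:arch:rolling", "arch": "x86_64"}')
    registry.handle_register(json.loads(raw))
    print(registry.platforms())
```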

The coordinators also send all known workers to the master as a list:

{ "workers": [ { "name": "worker1", "mode": "direct", "command": "build.sh" },
               { "name": "worker2", "mode": "direct", "command": "build.sh" } ]
}
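
Assuming the worker list arrives as a JSON array under the 'workers' key,
the master could index it by worker name as sketched here. The function
name index_workers is hypothetical.

```python
import json


def index_workers(raw):
    """Return a dict mapping each worker name to its description."""
    msg = json.loads(raw)
    # Assumption: 'workers' holds a JSON array of objects with a 'name' key.
    return {worker["name"]: worker for worker in msg.get("workers", [])}
```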

Coordinator operations:
-----------------------

The master can start and stop workers:

{ "op": "start", "role": "master", "worker": "worker1" }
{ "op": "stop", "role": "master", "worker": "worker1" }

The master sends this as a survey call to all coordinators. The
coordinator responsible for the named worker starts, stops or kills it
and then sends an ack message back to the master:

{ "op": "stopped", "role": "coordinator", "host": "eeepc", "worker": "worker1", "found": true }
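
A coordinator's side of this exchange could be sketched as below. The
class Coordinator is a hypothetical name, and no real process is started
or stopped. Two things are assumptions: the 'started' ack for a 'start'
request (the protocol only shows 'stopped'), and staying silent when the
worker is not known to this coordinator.

```python
import json


class Coordinator:
    """Hypothetical coordinator that tracks whether its workers run."""

    def __init__(self, host, worker_names):
        self.host = host
        self.running = {name: False for name in worker_names}

    def handle(self, raw):
        """Apply a start/stop request; return the ack as JSON text or None."""
        msg = json.loads(raw)
        name = msg.get("worker")
        if msg.get("op") not in ("start", "stop"):
            return None
        if not isinstance(name, str) or name not in self.running:
            # Assumption: only the responsible coordinator answers the survey.
            return None
        starting = msg["op"] == "start"
        self.running[name] = starting
        ack = {
            # Assumption: 'started' mirrors the documented 'stopped' ack.
            "op": "started" if starting else "stopped",
            "role": "coordinator",
            "host": self.host,
            "worker": name,
            "found": True,
        }
        return json.dumps(ack)
```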

Worker messages:
----------------

Workers send their output and states to the master via a data channel
(PIPELINE):

{ "op": "output", "role": "worker", "worker": "worker1", "msg": "Msg 30\n", "stdout": false }
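
On the master, such a message could be routed as in the following sketch.
The function name write_output and the line prefix are hypothetical.
Reading "stdout": false as "this came from the worker's stderr" is an
assumption.

```python
import json
import sys


def write_output(raw, out=sys.stdout, err=sys.stderr):
    """Write a worker 'output' message to out or err; return True if handled."""
    msg = json.loads(raw)
    if msg.get("op") != "output" or msg.get("role") != "worker":
        return False
    # Assumption: "stdout": false marks text from the worker's stderr.
    stream = out if msg.get("stdout") else err
    stream.write("[%s] %s" % (msg.get("worker"), msg.get("msg", "")))
    return True
```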