<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title>enge@inria</title><id>https://enge.math.u-bordeaux.fr/feed.xml</id><subtitle>Recent Posts</subtitle><updated>2026-04-16T16:17:46Z</updated><link href="https://enge.math.u-bordeaux.fr/feed.xml" rel="self" /><link href="https://enge.math.u-bordeaux.fr" /><entry><title>Autonomie et souveraineté numerique</title><id>https://enge.math.u-bordeaux.fr/blog/souverainete.html</id><author><name>Andreas Enge and Michaël Ferrec</name><email>andreas.enge@inria.fr</email></author><updated>2026-04-14T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/souverainete.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;p&gt;
The following text was written in collaboration with
&lt;a href=&quot;https://fr.linkedin.com/in/michaelferrec&quot;&gt;Michaël Ferrec&lt;/a&gt;
of &lt;a href=&quot;https://www.inspeere.com/&quot;&gt;Inspeere&lt;/a&gt;.
It was inspired by discussions within the working group on digital
sovereignty of &lt;a href=&quot;https://pole-enter.org/&quot;&gt;ENTER&lt;/a&gt;,
the competitiveness cluster
&lt;i&gt;Excellence Numérique au service des Transitions Environnementales et
Responsables&lt;/i&gt; in
&lt;a href=&quot;https://www.nouvelle-aquitaine.fr/&quot;&gt;Nouvelle Aquitaine&lt;/a&gt;.
It attempts, fittingly, to propose a definition of the very term
underlying the group's work.
&lt;/p&gt;


&lt;h2&gt;Definition and terminology&lt;/h2&gt;

&lt;p&gt;
Strictly speaking, the notion of sovereignty applies to a state or a
territory. For an organisation (a company, a local authority, a public
administration) or an individual, it is more appropriate to speak of
&lt;i&gt;autonomy&lt;/i&gt;: the ability to make choices without outside
interference and to revise them over time. Resilience to external shocks
is part of it.
In both cases we are dealing with an axis, not a binary state. This axis
runs from total dependence to ideal sovereignty, which is theoretical and
unattainable. What determines an actor's position on this axis is its
ability to identify, measure and manage its risks of dependence.
Autonomy is not an end in itself: it must be assessed against one's
objectives and available means.
&lt;/p&gt;
&lt;p&gt;
&lt;i&gt;Founding principle&lt;/i&gt;:
One cannot delegate one's independence. Entrusting one's sovereignty to a
third party means renouncing it by definition, whatever label is attached
to the service.
&lt;/p&gt;


&lt;h2&gt;The stakes at different scales&lt;/h2&gt;

&lt;h3&gt;At the level of the organisation&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
Operational continuity:
a supplier that is bought out or subject to a foreign jurisdiction can
interrupt operations overnight.
&lt;/li&gt;&lt;li&gt;
Legal risk:
hosting with a provider subject to the Cloud Act or FISA exposes data to
uncontrolled access.
&lt;/li&gt;&lt;li&gt;
Economic dependence:
vendor lock-in reduces the ability to negotiate and to migrate, and
causes a structural increase in costs.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;At the level of the territory&lt;/h3&gt;

&lt;p&gt;
At this scale, the stakes are geostrategic. Technological dependence is
no longer merely an operational risk: it is a national vulnerability in
times of crisis.
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
Decision-making autonomy:
in a geopolitical or military crisis, a nation that depends on foreign
actors for its critical infrastructure can no longer decide freely.
&lt;/li&gt;&lt;li&gt;
Power relations:
technological dependence has become a lever of diplomatic pressure
(sanctions, export restrictions, unilateral decisions).
&lt;/li&gt;&lt;li&gt;
Industrial development:
massively outsourcing to non-European actors amounts to financing foreign
industries. Public procurement directed at local industry is the main
corrective lever.
&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;The dimensions of analysis&lt;/h2&gt;

&lt;p&gt;
Seven dimensions make it possible to measure a position on the axis.
They are cumulative and interdependent.
&lt;/p&gt;

&lt;h3&gt;Software&lt;/h3&gt;

&lt;p&gt;
Software autonomy is closely tied to the
&lt;a href=&quot;https://www.gnu.org/philosophy/free-sw.fr.html#four-freedoms&quot;&gt;freedoms
of free software&lt;/a&gt;:
the freedom to run, to adapt and to redistribute.
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
I am autonomous if I use free software developed in-house.
&lt;/li&gt;&lt;li&gt;
I am less autonomous if I use free software developed by others, or
proprietary software developed in-house.
&lt;/li&gt;&lt;li&gt;
I am even less autonomous if I use proprietary software developed by a
third party.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Data&lt;/h3&gt;

&lt;p&gt;
Autonomy is measured by the use of open, interoperable formats. The
nature of the software available to process the data also comes into
play.
&lt;/p&gt;

&lt;h3&gt;Hardware&lt;/h3&gt;

&lt;p&gt;
Since most digital hardware is designed and manufactured abroad, this
question comes close to that of sovereignty and raises the problem of the
resilience of supply chains.
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
I am autonomous if I use hardware of my own design, manufactured in
Europe.
&lt;/li&gt;&lt;li&gt;
I am less autonomous if I use standard hardware available from a large
number of competing suppliers.
&lt;/li&gt;&lt;li&gt;
I am even less autonomous if I need dedicated hardware, produced in small
series or designed as a non-modifiable black box.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Infrastructure and hosting&lt;/h3&gt;

&lt;p&gt;
This is the place where software and data are brought together, for
internal or external use.
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
I am autonomous if I host my software and data on my own hardware
(&lt;i&gt;on premise&lt;/i&gt;).
&lt;/li&gt;&lt;li&gt;
I am less autonomous if I host with someone else (a third-party
&lt;i&gt;datacentre&lt;/i&gt;).
&lt;/li&gt;&lt;li&gt;
I am even less autonomous if my data is hosted and processed by
third-party software online (the &lt;i&gt;cloud&lt;/i&gt;).
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Legal&lt;/h3&gt;

&lt;p&gt;
The legal framework can limit autonomy or guarantee it. One is more
autonomous under a liberal rule of law and in one's own jurisdiction.
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
I am autonomous if I operate my digital activities in Europe.
&lt;/li&gt;&lt;li&gt;
I am less autonomous if I operate under the rule of law outside Europe.
&lt;/li&gt;&lt;li&gt;
I am even less autonomous if I operate under an opaque or unstable legal
regime.
&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Knowledge&lt;/h3&gt;

&lt;p&gt;
Autonomy increases if one chooses well-documented solutions for which a
large number of trained people exist.
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
I am autonomous if I use solutions that are widespread, taught and well
documented.
&lt;/li&gt;&lt;li&gt;
I am less autonomous if I use isolated niche solutions for which there is
a monopoly on the knowledge.
&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Economic&lt;/h3&gt;

&lt;p&gt;
The finiteness of resources limits autonomy. The absence of competition
can lead to dependence on a single supplier and make choices difficult to
reverse.
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
I am autonomous if I choose solutions compatible with my means, on a
competitive market.
&lt;/li&gt;&lt;li&gt;
I am less autonomous if I depend on a single supplier holding a monopoly
on the solution.
&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;The limits of delegation&lt;/h2&gt;

&lt;p&gt;
An organisation cannot be a specialist in all of its infrastructure. It
is legitimate to entrust operations to a third party. The question is at
what point this delegation becomes a loss of control.
The fundamental distinction: one can delegate operations, but not
control. The question is not &amp;quot;do I run it myself?&amp;quot;
but &amp;quot;can I take back control?&amp;quot;
&lt;/p&gt;&lt;p&gt;
Four criteria help assess whether a delegation is healthy or whether it
amounts to a loss of control:
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;i&gt;Reversibility&lt;/i&gt;:
can I leave and recover my data in usable formats?
&lt;/li&gt;&lt;li&gt;
&lt;i&gt;Criticality&lt;/i&gt;:
is this component central to my activity, or peripheral?
&lt;/li&gt;&lt;li&gt;
&lt;i&gt;Transparency&lt;/i&gt;:
do I know what happens to my data? Is an audit possible?
&lt;/li&gt;&lt;li&gt;
&lt;i&gt;Competition&lt;/i&gt;:
do credible alternatives exist, or am I facing a de facto monopoly?
&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;
Perfect sovereignty does not exist, neither for an organisation nor for a
territory. The goal is not absolute independence but the conscious
mastery of one's dependences: knowing one's position on the axis,
understanding the associated risks, and progressing methodically along
the most critical dimensions.
The absence of a deliberate policy on the matter is itself a choice,
whose cost only becomes fully visible in times of crisis, when power
relations come into play and dependence turns into vulnerability.
&lt;/p&gt;

&lt;/div&gt;</content></entry><entry><title>Primality record with ECPP – 109297 digits</title><id>https://enge.math.u-bordeaux.fr/blog/ecpp-109297.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2026-02-11T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/ecpp-109297.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h2&gt;ECPP&lt;/h2&gt;

&lt;p&gt;
Heuristically, the
&lt;a href=&quot;https://en.wikipedia.org/wiki/Elliptic_curve_primality&quot;&gt;ECPP
algorithm&lt;/a&gt; (more precisely, its FastECPP variant) is the fastest
algorithm for certifying generic primes (as opposed to merely providing
a one-sided test, which can only certify that a number is composite, or
establish that it is probably prime without being certain).
As an additional advantage, it provides a certificate that can be checked
in polynomial time with a lower exponent than the certificate creation.
My &lt;a href=&quot;https://www.multiprecision.org/cm/ecpp.html&quot;&gt;CM software&lt;/a&gt;
has been used for most of the current
&lt;a href=&quot;https://t5k.org/top20/page.php?id=27&quot;&gt;record computations&lt;/a&gt;
(these correspond to all the user codes starting with the letter
&amp;quot;E&amp;quot;; the following number encodes the set of participants in
the computation).
My previous record for a &lt;i&gt;repunit&lt;/i&gt; with 86453 digits, that is,
the number (10&lt;sup&gt;86453&lt;/sup&gt; - 1)/9 consisting of 86453 successive
digits 1, has been described in some detail in the publication
&lt;a href=&quot;https://hal.science/hal-04522492/&quot;&gt;FastECPP over MPI&lt;/a&gt;
at the
&lt;a href=&quot;https://doi.org/10.1007/978-3-031-64529-7_4&quot;&gt;International
Congress on Mathematical Software 2024&lt;/a&gt;, and it has been publicised
in the Dutch newspaper
&lt;a href=&quot;https://www.nrc.nl/nieuws/2023/06/10/8645386453-a4166620&quot;&gt;NRC&lt;/a&gt;.
This little note gives an update for the most recent record, the repunit
with 109297 digits, an endeavour undertaken jointly with
&lt;a href=&quot;http://worldofprimes.co.uk/introdcution&quot;&gt;Paul Underwood&lt;/a&gt;
using my CM software.
&lt;/p&gt;

&lt;h2&gt;The two latest ECPP records&lt;/h2&gt;

&lt;p&gt;
Let me first summarise a few numbers about the previous proof of the
86453 digit repunit. The first phase, which determines the parameters
of a sequence of auxiliary elliptic curves, has been run on the
&lt;a href=&quot;https://plafrim-users.gitlabpages.inria.fr/doc/&quot;&gt;PlaFRIM&lt;/a&gt;
cluster, repeatedly with checkpointing for three days in a row,
on a varying number of cores, from 759 to 2639, depending on how busy the
cluster was. In total, it has taken 383 CPU years and less than 4 months
in real time, determining successively the parameters of 2979 elliptic
curves.
The CM software communicates via MPI over TCP, which enables it to run
on a heterogeneous cluster of machines; it has mainly used the
&lt;a href=&quot;https://plafrim-users.gitlabpages.inria.fr/doc/#standard_nodes&quot;&gt;standard
nodes&lt;/a&gt; of the PlaFRIM cluster, in particular the
zonda and diablo machines with 32-core AMD Zen2 Rome EPYC 7452 @ 2.35 GHz
processors and other
diablo machines with 64-core AMD Zen2 Rome EPYC 7702 @ 2 GHz and
64-core AMD Zen3 Milan EPYC 7763 @ 2.45 GHz processors.
The second phase, which computes the elliptic curves using the complex
multiplication method, has been run on
&lt;a href=&quot;https://plafrim-users.gitlabpages.inria.fr/doc/#big_memory_nodes&quot;&gt;brise&lt;/a&gt;,
a single machine with 96 Intel Xeon E7-8890 v4 cores @ 2.2 GHz,
which are older and slower; but the machine has 1 TB of RAM, which is
important during this part of the algorithm. Being slower, it is also
less popular, so that I could use it for more than three days in a row.
(Thanks to the administrators for being flexible and helpful!)
The second phase has taken about 25 CPU years and also about 4 months
of real time.
I have verified the certificate with
&lt;a href=&quot;https://pari.math.u-bordeaux.fr/&quot;&gt;PARI/GP&lt;/a&gt;
(it is a good idea to use &lt;i&gt;different&lt;/i&gt; software for the verification)
again on brise with its 96 cores, which took 190 CPU days,
or almost two days of real time.
&lt;/p&gt;
&lt;p&gt;
For the new record of a 109297 digit number, Paul Underwood has carried
out the first phase on a single machine with 64 cores
AMD Ryzen Threadripper 3990X Zen2 @ 2.9GHz,
which took 87 CPU years and 21 months of real time, for a certificate
with 6847 steps. The second phase has been carried out by me on the same
machine with 96 cores as the previous record, which took 133 CPU years
and also 21 months of real time.
We ran the two phases mostly in parallel, so that the whole effort took
about two years.
&lt;/p&gt;

&lt;h2&gt;Comparing the first phases&lt;/h2&gt;

&lt;p&gt;
As a first observation, it is remarkable that the first phase on a larger
number actually took less CPU time! This is because this phase
is &lt;i&gt;not&lt;/i&gt; embarrassingly parallel, unlike many computations in number
theory; and as noted in the accompanying publication, the 86453 digit
record was probably over-parallelised.
To measure things, we need to consider the size of the problem, which here
is given by the number L of digits of the number to be proved prime.
The ratio between the sizes of the two
records is 109297/86453, or approximately 1.26. Since both phases of the
algorithm have a complexity of O~(L&lt;sup&gt;4&lt;/sup&gt;), one would expect the
new record to take about 1.26&lt;sup&gt;4&lt;/sup&gt; or 2.6 times as long as the
old one. But in reality, the CPU time was &lt;i&gt;lower&lt;/i&gt; by a factor of
about 4.4; so between the actual performance and the prediction by the
asymptotic complexity there is a factor of about 11.
&lt;/p&gt;
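&lt;p&gt;
These numbers can be replayed in a few lines; the following sketch (in
Python, with all figures taken from the text above) just redoes the
arithmetic:
&lt;/p&gt;

```python
# Figures from the post; only the arithmetic is new here.
old_digits, new_digits = 86453, 109297
old_cpu_years, new_cpu_years = 383, 87

ratio = new_digits / old_digits            # size ratio of the two records
expected = ratio ** 4                      # what O~(L^4) scaling predicts
observed = old_cpu_years / new_cpu_years   # CPU time actually went down

print(round(ratio, 2))               # → 1.26
print(round(expected, 1))            # → 2.6
print(round(observed, 1))            # → 4.4
print(round(expected * observed))    # → 11, the factor to be explained
```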
&lt;p&gt;
Part of this can be explained by different single-core performance.
Paul's machine has a higher clock rate, and he has modified the CM code
to link with gwnum, a library for integer arithmetic using a
floating-point FFT, which may lead to errors depending on the choice of
parameters (but these errors can be found by doing an even faster test
modulo a one-word prime, so they have no impact on the correctness of
the final computation; and in any case, we end up with a certificate
that is checked independently). So the computations are faster than with
&lt;a href=&quot;https://gmplib.org/&quot;&gt;GMP&lt;/a&gt;, in experiments by a factor of
about 1.6 for the size of numbers under consideration. Unfortunately,
gwnum is not free software, so it is not taken into account in the
development of CM.
&lt;/p&gt;
&lt;p&gt;
Concerning parallelism, during the first phase there is a test whether
a number consists of a smooth part multiplied by a prime, where
&lt;i&gt;smooth&lt;/i&gt; means that the number completely factors into primes less
than some bound B. Each core covers a range of about 2&lt;sup&gt;29&lt;/sup&gt;,
so that B is about 2&lt;sup&gt;29&lt;/sup&gt; multiplied by the number of cores.
Thus with more cores, on average a larger smooth part is factored out.
The next step continues with the new, smaller prime number, so that in
fact the size of the smooth part corresponds to the number of digits
gained in one step of the algorithm.
Unfortunately this is not proportional to B, but
only to log B. Dividing the size of the numbers by the length of the
certificate, we see that we gained on average 96 bits per step in the
old record and only 53 bits per step in the new record.
If we assume that the first record was carried out with about 1500 cores
(where in reality the number varied over the course of the computation)
and the second one with 64 cores, then the ratio between the gains per
step should be about
log (2&lt;sup&gt;29&lt;/sup&gt; · 1500) / log (2&lt;sup&gt;29&lt;/sup&gt; · 64)
or 1.13, which on the one hand is a small gain for employing over 20 times
as many cores; and on the other hand does not match the observed factor
of 96/53 or 1.8.
However, using more cores has another effect: it increases the
number of suitable curves found at any given step, of which only one
can be kept. When a round of computations returns several candidates,
the CM software chooses the one with the greatest gain in bits.
With only 64 cores, there is often no choice; with 1500,
one can often choose among several.
So to make a long explanation short:
Using more cores leads to shorter certificates, which are found in
less wallclock time, but at the expense of an increase in total CPU time.
&lt;/p&gt;
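&lt;p&gt;
The estimate can be checked numerically; here is a small Python sketch,
assuming the core counts quoted above (about 1500 for the old record,
64 for the new one):
&lt;/p&gt;

```python
from math import log2

per_core_range = 2 ** 29           # range covered by a single core
old_cores, new_cores = 1500, 64    # core counts assumed in the text

# The number of bits gained per step is proportional to log B,
# where B is the per-core range times the number of cores.
predicted = log2(per_core_range * old_cores) / log2(per_core_range * new_cores)
observed = 96 / 53                 # average bits per step, old vs. new record

print(round(predicted, 2))         # → 1.13
print(round(observed, 2))          # → 1.81
```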

&lt;h2&gt;Comparing the second phases&lt;/h2&gt;

&lt;p&gt;
In the second phase, which for both records has been carried out on the
same machine with the same multiprecision library so that the comparison
becomes easier, the CPU time has indeed increased compared to the previous
record, but by a factor of 133/25 or 5.3 instead of the theoretically
predicted (109297/86453)&lt;sup&gt;4&lt;/sup&gt; or 2.6, which may be surprising
at first.
But we can do a more precise estimate, which takes the certificate length
into account: The power 4 comes from a power 3 related to the arithmetic
(root finding of the class polynomial modulo the prime to be certified)
plus one power from the length of the certificate. Since the certificate
for the new record has disproportionately more steps due to less
parallelisation as seen above, we should use the better estimate
(109297/86453)&lt;sup&gt;3&lt;/sup&gt; · (6847/2979) or 4.6,
which essentially explains the running time.
So with less parallelisation in the first phase, there is also a price
to pay in the second phase.
&lt;/p&gt;
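&lt;p&gt;
Again, the arithmetic can be replayed in Python (all figures from the
text):
&lt;/p&gt;

```python
old_digits, new_digits = 86453, 109297
old_steps, new_steps = 2979, 6847

naive = (new_digits / old_digits) ** 4    # pure L^4 scaling
# one power of L replaced by the actual certificate lengths
refined = (new_digits / old_digits) ** 3 * (new_steps / old_steps)
observed = 133 / 25                       # ratio of CPU years, new over old

print(round(naive, 1))      # → 2.6
print(round(refined, 1))    # → 4.6
print(round(observed, 1))   # → 5.3
```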

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;
The certificate files for the 109297 digit repunit are available
from the
&lt;a href=&quot;https://www.multiprecision.org/cm/ecpp.html&quot;&gt;CM ECPP&lt;/a&gt;
page.
&lt;/p&gt;
&lt;p&gt;
As is often the case with complex algorithms that depend on the careful
choice of several parameters, it is not easy to predict their running
times on inputs of varying size. The above musings try to explain our
running time observations, but cannot claim to be the absolute truth.
One thing is certain: For now, proving the next prime candidate for a
repunit is out of reach.
&lt;a href=&quot;https://oeis.org/A004023&quot;&gt;OEIS A004023&lt;/a&gt;
lists it as the repunit with 270343 digits; using the asymptotic
complexity to obtain an estimated running time, we find that it will
take
(270343/109297)&lt;sup&gt;4&lt;/sup&gt; or 37 times as long as the current record.
Put differently, instead of 21 months, it would take about 66 years!
So if you want to prove a new record prime, make sure to choose it from
a different source.
&lt;/p&gt;
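<p>
For the curious, the extrapolation works out as follows (a rough sketch
based on asymptotic scaling only, so real timings would certainly
deviate):
</p>

```python
current_digits, next_digits = 109297, 270343

factor = (next_digits / current_digits) ** 4   # O~(L^4) scaling
months = 21 * factor                           # current record: 21 months

print(round(factor))        # → 37
print(round(months / 12))   # → 66 (years)
```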

&lt;/div&gt;</content></entry><entry><title>Tiny Build Farm for Guix, part 2</title><id>https://enge.math.u-bordeaux.fr/blog/tbfg-2.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2025-10-24T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/tbfg-2.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h1&gt;Building science packages&lt;/h1&gt;

&lt;p&gt;
In our effort to create a Tiny Build Farm for Guix, which is supposed
to report on the status of the packages assigned to the science team,
so far we have seen how to
&lt;a href=&quot;tbfg-1.html&quot;&gt;set up&lt;/a&gt;
the required infrastructure.
On a dedicated machine with Guix as its operating system, we have added
several Shepherd services:
the Guix Build Coordinator together with a build agent;
and the web server part of the BFFE, which enables us to follow the
activity of the builders.
For performance reasons, we have refrained from installing an instance of
the Guix Data Service and opted instead to talk to the instance operated
by the Guix project at &lt;code&gt;https://data.guix.gnu.org/&lt;/code&gt;, which
continually evaluates the Guix master branch and creates derivations for
all packages in the distribution.
The next step is to explore how to programmatically talk to the remote
data server from a Guile script, how to extract derivations we are
interested in, and how to submit them for building to our instance of the
build coordinator.
&lt;/p&gt;


&lt;h2&gt;Getting information from the data service&lt;/h2&gt;

&lt;p&gt;
We need to install the two packages
&lt;code&gt;guix-data-service&lt;/code&gt; and (for later use)
&lt;code&gt;guix-build-coordinator&lt;/code&gt; on the TBFG machine, which contain
Guile libraries with the necessary functionality.
&lt;/p&gt;
&lt;p&gt;
⚠ If installed into a user profile, both packages pull in the
&lt;code&gt;guix&lt;/code&gt; package as a propagated input, which prevents the user
from updating it through &lt;code&gt;guix pull&lt;/code&gt;.
It is thus recommended to run
&lt;/p&gt;
&lt;pre&gt;
guix shell guile-next guix-data-service guix-build-coordinator
&lt;/pre&gt;
&lt;p&gt;
instead. At the time of writing, the &lt;code&gt;guile&lt;/code&gt; package in Guix
is at version 3.0.9, while the data service library requires
&lt;code&gt;guile-next&lt;/code&gt;, which is at version 3.0.10.
&lt;/p&gt;
&lt;p&gt;
Let us open a Guile REPL and execute the following code
(to ease copy-pasting, I omit the prompt of the REPL;
lines starting with a $ sign and a number correspond to results).
&lt;/p&gt;
&lt;pre&gt;
$ guile
(use-modules (guix-data-service client))
(define my-data-service &amp;quot;https://data.guix.gnu.org/&amp;quot;)

(define json
  (guix-data-service-request my-data-service
                             &amp;quot;repository/1/branch/master.json&amp;quot;))
json
$1 = ((&amp;quot;revisions&amp;quot; . #(((&amp;quot;data_available&amp;quot; . #f) (&amp;quot;commit-hash&amp;quot; . &amp;quot;cb47639a8081e8e2d651ad1612bbd1e482766469&amp;quot;) …
&lt;/pre&gt;
&lt;p&gt;
The call to &lt;code&gt;guix-data-service-request&lt;/code&gt;
is equivalent to opening the URL
&lt;a href=&quot;https://data.guix.gnu.org/repository/1/branch/master.json&quot;&gt;&lt;code&gt;https://data.guix.gnu.org/repository/1/branch/master.json&lt;/code&gt;&lt;/a&gt;,
which executes the same query as the URL
&lt;a href=&quot;https://data.guix.gnu.org/repository/1/branch/master&quot;&gt;&lt;code&gt;https://data.guix.gnu.org/repository/1/branch/master&lt;/code&gt;&lt;/a&gt;
without the &lt;code&gt;.json&lt;/code&gt; at the end, but it returns the result in
&lt;a href=&quot;https://en.wikipedia.org/wiki/JSON#Data_types&quot;&gt;JSON&lt;/a&gt; format.
Moreover, the function call transforms the JSON into a Guile data structure
through the
&lt;a href=&quot;https://github.com/aconchillo/guile-json&quot;&gt;guile-json&lt;/a&gt;
library; in particular, JSON arrays become Guile
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/guile.html#Vectors&quot;&gt;vectors&lt;/a&gt;
and JSON objects become Guile
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/guile.html#Association-Lists&quot;&gt;association
lists&lt;/a&gt;, or &lt;i&gt;alists&lt;/i&gt; for short (these are lists of key-value pairs,
so brace yourself for lots of parentheses in a row).
Thus parsing the result and extracting the information we are interested in
amounts to unwrapping these successive layers; in true Scheme/Lisp style
we will also usually transform the vectors into lists using the
&lt;code&gt;vector-&amp;gt;list&lt;/code&gt; function.
The JSON we asked for is an object with a single field
&lt;code&gt;revisions&lt;/code&gt;, which contains an array of revisions, that is,
git commits on the master branch;
every revision is an object with the three fields
&lt;code&gt;date&lt;/code&gt;, &lt;code&gt;commit-hash&lt;/code&gt; (these are strings)
and &lt;code&gt;data_available&lt;/code&gt;, a boolean indicating whether the data
service has computed the derivations for this commit or not
(which corresponds to the green or grey badges on the website).
This structure can be derived by looking at and playing with the variables
in the REPL, or probably more conveniently by opening the corresponding URL
in a web browser, which should show the JSON in a special mode.
We can now write a small function (or maybe two even smaller functions)
that query the data service and return a list of revisions for which the
data service has computed the derivations:
&lt;/p&gt;
&lt;pre&gt;
(define (data-available? revision)
  ;; Given a REVISION, check whether it has been treated by the
  ;; data service.
  (assoc-ref revision &amp;quot;data_available&amp;quot;))

(define (get-revisions data-service)
  ;; Query DATA-SERVICE for the list of revisions it has successfully
  ;; treated in the master branch.
  (filter data-available?
    (vector-&amp;gt;list
      (assoc-ref
        (guix-data-service-request data-service
          &amp;quot;repository/1/branch/master.json&amp;quot;)
        &amp;quot;revisions&amp;quot;))))

(define revisions (get-revisions my-data-service))
revisions
$2 = (((&amp;quot;data_available&amp;quot; . #t) (&amp;quot;commit-hash&amp;quot; . &amp;quot; …
&lt;/pre&gt;
&lt;p&gt;
In the following, we will work with revisions in this form, although mainly
the commit hashes are of interest. We could print them as follows:
&lt;/p&gt;
&lt;pre&gt;
(define commits
  (map (lambda (revision)
         (assoc-ref revision &amp;quot;commit-hash&amp;quot;))
       revisions))
commits
$3 = (&amp;quot;b966f4007c8492ad89eedf32dd91b3352dba594e&amp;quot; &amp;quot;8a1f56cf8710fc142a2f8ef2e52be82e8aa9f53e&amp;quot; …
(length commits)
$4 = 46
(define commit (car commits))
commit
$5 = &amp;quot;b966f4007c8492ad89eedf32dd91b3352dba594e&amp;quot;
&lt;/pre&gt;
&lt;p&gt;
By default the data service returns 100 revisions (including those for which
no data is available), which will be amply enough for our purposes.
&lt;/p&gt;
&lt;p&gt;
The next step is to obtain the derivations for a given revision, say the
newest one with data available. Again this is most easily
reverse-engineered from the web interface of the data service:
Click on the latest revision with a green badge, then on
&lt;i&gt;View package derivations&lt;/i&gt;; this shows how the URL is to be formed.
Since we need all derivations, we also have to tick the &lt;i&gt;All results&lt;/i&gt;
checkbox; on the other hand, we may limit to one architecture, say
&lt;code&gt;x86_64-linux&lt;/code&gt; as &lt;i&gt;System&lt;/i&gt;, and not consider
cross-compilation by choosing &lt;code&gt;(no target)&lt;/code&gt; for &lt;i&gt;Target&lt;/i&gt;.
These choices add GET parameters to the query, which can be passed
as an alist for the optional third parameter of
&lt;code&gt;guix-data-service-request&lt;/code&gt;. Again adding &lt;code&gt;.json&lt;/code&gt;
to the URL (in front of the &lt;code&gt;?&lt;/code&gt;) shows the structure of the
resulting JSON.
It is then easy to end up with the following function; notice the use
of the
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/guile.html#index-quasiquote&quot;&gt;quasiquote&lt;/a&gt;
&lt;code&gt;`&lt;/code&gt; and the
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/guile.html#index-quasiquote&quot;&gt;unquote&lt;/a&gt;
&lt;code&gt;,&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(define (get-derivations data-service commit system)
  ;; Query DATA-SERVICE for the list of derivations for the given COMMIT
  ;; and SYSTEM.
  (map
    (lambda (p)
      (assoc-ref p &amp;quot;derivation&amp;quot;))
    (vector-&amp;gt;list
      (assoc-ref
        (guix-data-service-request data-service
          (string-append &amp;quot;revision/&amp;quot; commit &amp;quot;/package-derivations.json&amp;quot;)
          `((system . ,system) (target . &amp;quot;none&amp;quot;) (all_results . &amp;quot;on&amp;quot;)))
        &amp;quot;derivations&amp;quot;))))

(define derivations
  (get-derivations my-data-service commit &amp;quot;x86_64-linux&amp;quot;))
(length derivations)
$6 = 29531
(car derivations)
$7 = &amp;quot;/gnu/store/000lxmn2d17bv2v6znvf6z5vi7ndy8q4-r-janeaustenr-1.0.0.drv&amp;quot;
&lt;/pre&gt;
&lt;p&gt;
So the derivations are simply strings pointing to files in the store
(of the data service, so far they are not yet available on the TBFG
machine).
&lt;/p&gt;


&lt;h2&gt;Filtering out team packages&lt;/h2&gt;

&lt;p&gt;
29000 derivations are more than our poor tiny machine can handle; the next
step is to filter out those that correspond to packages in the science team.
The team is responsible for certain package modules (or equivalently, for
&lt;code&gt;.scm&lt;/code&gt; files in the &lt;code&gt;gnu/packages/&lt;/code&gt; directory);
which ones they are can be seen in the file &lt;code&gt;CODEOWNERS&lt;/code&gt;
checked into the Guix git repository, itself derived from
&lt;code&gt;etc/teams.scm&lt;/code&gt;.
As it does not change very often, for simplicity we may determine the list
of modules by hand, which may require us to resolve regular expressions
(here: &lt;code&gt;fortran(-.+|)&lt;/code&gt;) into lists of actually present modules;
here we end up with the following:
&lt;/p&gt;
&lt;pre&gt;
(define my-locations
  '(&amp;quot;algebra&amp;quot; &amp;quot;astronomy&amp;quot; &amp;quot;chemistry&amp;quot; &amp;quot;fortran-check&amp;quot; &amp;quot;fortran-xyz&amp;quot;
  &amp;quot;geo&amp;quot; &amp;quot;graph&amp;quot; &amp;quot;lean&amp;quot; &amp;quot;maths&amp;quot; &amp;quot;medical&amp;quot; &amp;quot;sagemath&amp;quot; &amp;quot;statistics&amp;quot;))
&lt;/pre&gt;
&lt;p&gt;
When starting the project, I had hoped to extract the interesting packages
directly from the (strings representing) derivations, given a fixed list
of package names.
But it is a truth universally acknowledged that a programmer never has the
singularly good fortune of such simplicity, whatever their feelings or views
when first entering the neighbourhood of a problem.
Here two reasons speak against it: First of all, the packages of a team may
change over time as packages are added, removed or moved to a different
module. More immediately, though, only the &lt;i&gt;combination&lt;/i&gt; of package
name and version can be easily recovered from the derivation by removing a
fixed prefix, the hash and a fixed suffix, using the following function:
&lt;/p&gt;
&lt;pre&gt;
(define (derivation-&amp;gt;name+version derivation)
  ;; Given a DERIVATION (by a string of the form &amp;quot;/gnu/store/...&amp;quot;),
  ;; return the part of it that encodes the name and the version
  ;; of the underlying package.
  (string-drop (basename derivation &amp;quot;.drv&amp;quot;) 33))
&lt;/pre&gt;
&lt;p&gt;
Thus
&lt;code&gt;/gnu/store/000lxmn2d17bv2v6znvf6z5vi7ndy8q4-r-janeaustenr-1.0.0.drv&lt;/code&gt;
becomes
&lt;code&gt;r-janeaustenr-1.0.0&lt;/code&gt;, which is the concatenation of the package
name (which is mostly fixed over different revisions) and the package
version (which usually increases over time) with a hyphen in-between.
More often than not it is possible to guess the two components: Here they
are &lt;code&gt;r-janeaustenr&lt;/code&gt; and &lt;code&gt;1.0.0&lt;/code&gt;.
Package names often contain hyphens (as here, where a hyphen separates
the language part, &lt;code&gt;r&lt;/code&gt;, from the upstream name,
&lt;code&gt;janeaustenr&lt;/code&gt;; see the Guix
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Package-Naming.html&quot;&gt;naming
conventions&lt;/a&gt;); this could be handled by splitting at the last hyphen,
but versions may also contain hyphens. Both can contain alphabetic and
numeric components. Thus it would be quite possible that the above
derivation is for the flourishingly named version
&lt;code&gt;janeaustenr-1.0.0&lt;/code&gt; of the &lt;code&gt;r&lt;/code&gt; package.
&lt;/p&gt;
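&lt;p&gt;
The same string surgery, together with the naive split at the last hyphen,
can be sketched in the shell (this is only an illustration; as just
explained, the split is not reliable in general):
&lt;/p&gt;

```shell
# Drop the ".drv" suffix, then the 32-character hash plus its hyphen,
# mirroring the Scheme function above (bash substring syntax).
base=$(basename /gnu/store/000lxmn2d17bv2v6znvf6z5vi7ndy8q4-r-janeaustenr-1.0.0.drv .drv)
nv=${base:33}
echo "$nv"                               # r-janeaustenr-1.0.0
# Naive split at the last hyphen; correct here, but wrong whenever the
# version itself contains a hyphen.
echo "name=${nv%-*} version=${nv##*-}"   # name=r-janeaustenr version=1.0.0
```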
&lt;p&gt;
So we need more code to extract the desired information. Luckily the data
service knows about the packages in a revision, with their names and their
versions in different fields; and also about their locations, that is,
the files in which they are defined.
&lt;/p&gt;
&lt;pre&gt;
(define (get-packages data-service commit)
  ;; Query DATA-SERVICE for the list of packages for the given COMMIT.
  (vector-&amp;gt;list
    (assoc-ref
      (guix-data-service-request data-service
        (string-append &amp;quot;revision/&amp;quot; commit &amp;quot;/packages.json&amp;quot;)
        `((field . &amp;quot;version&amp;quot;) (field . &amp;quot;location&amp;quot;) (all_results . &amp;quot;on&amp;quot;)))
      &amp;quot;packages&amp;quot;)))

(define packages (get-packages my-data-service commit))
(car packages)
$8 = ((&amp;quot;location&amp;quot; (&amp;quot;column&amp;quot; . 2) (&amp;quot;line&amp;quot; . 8273) (&amp;quot;file&amp;quot; . &amp;quot;gnu/packages/games.scm&amp;quot;)) (&amp;quot;version&amp;quot; . &amp;quot;0.27.1&amp;quot;) (&amp;quot;name&amp;quot; . &amp;quot;0ad&amp;quot;))
&lt;/pre&gt;
&lt;p&gt;
It is now enough to compare the file name with our list of locations to
extract the packages we are interested in.
&lt;/p&gt;
&lt;pre&gt;
(define (location-package? package locations)
  ;; Check whether the PACKAGE comes from the list of LOCATIONS.
  (let* ((file (assoc-ref (assoc-ref package &amp;quot;location&amp;quot;) &amp;quot;file&amp;quot;))
         (module (basename file &amp;quot;.scm&amp;quot;)))
        (member module locations)))

(use-modules (srfi srfi-26))
(define (packages-name-version data-service commit locations)
  ;; Query DATA-SERVICE for a list of packages for the given COMMIT
  ;; that come from the list of LOCATIONS. Return a list of two-element
  ;; lists with the names and versions of these packages.
  (map
    (lambda (package)
      (list (assoc-ref package &amp;quot;name&amp;quot;) (assoc-ref package &amp;quot;version&amp;quot;)))
    (filter
      (cut location-package? &amp;lt;&amp;gt; locations)
      (get-packages data-service commit))))

(define team-name-versions
  (packages-name-version my-data-service commit my-locations))
(car team-name-versions)
$9 = (&amp;quot;4ti2&amp;quot; &amp;quot;1.6.12&amp;quot;)
&lt;/pre&gt;
&lt;p&gt;
Finally we &lt;i&gt;just&lt;/i&gt; need to compare the extracted team package names
and their versions with the derivations. Unfortunately this can be
quite costly; the following code presents a somewhat
optimised solution with memory usage linear in the result, but a quadratic
number of comparisons (thanks to Liliana Prikler for suggesting the
use of &lt;code&gt;filter-map&lt;/code&gt; to me):
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1))
(define (special-cartesian-product X Y)
  ;; Let X and Y be lists of two element lists of the form (x z) and (y z),
  ;; respectively. Return a list of all the (x y) such that there is an
  ;; element z with (x z) in X and (y z) in Y.
  (fold cons '()
        (filter-map (lambda (xz)
                      (let ((yz (find (lambda (yz)
                                        (equal? (cadr xz) (cadr yz)))
                                      Y)))
                        (if yz
                            (list (car xz) (car yz))
                            #f)))
                    X)))

(define (team-derivations data-service commit system locations)
  ;; Query DATA-SERVICE for the list of derivations for the given COMMIT
  ;; and SYSTEM, filtered by the LOCATIONS of the packages.
  ;; To preserve the computed information, return a list of two-element
  ;; lists, each containing a derivation and the corresponding name.
  (let* ((derivations (get-derivations data-service commit system))
         (X (map
              (lambda (d)
                (list d (derivation-&amp;gt;name+version d)))
            derivations))
         (name-versions
           (packages-name-version data-service commit locations))
         (Y (map
              (lambda (nv)
                (list (car nv) (string-append (car nv) &amp;quot;-&amp;quot; (cadr nv))))
            name-versions)))
    (special-cartesian-product X Y)))

(define (sort-derivation-names derivation-names)
  ;; Just for the fun of it, sort DERIVATION-NAMES, a list of two element
  ;; lists containing derivations and their names, by names.
  (sort derivation-names
        (lambda (x y)
          (string&amp;lt;? (cadr x) (cadr y)))))

(define good-derivation-names
  (sort-derivation-names
    (team-derivations my-data-service commit &amp;quot;x86_64-linux&amp;quot; my-locations)))
(define derivation-name
        (find (lambda (dn)
                (equal? (cadr dn) &amp;quot;lrslib&amp;quot;))
              good-derivation-names))
derivation-name
$10 = (&amp;quot;/gnu/store/3pxq1g2java4f8nwfq7n98qjvhkr1b34-lrslib-7.2.drv&amp;quot; &amp;quot;lrslib&amp;quot;)
&lt;/pre&gt;
&lt;p&gt;
Strictly speaking, the function
&lt;code&gt;team-derivations&lt;/code&gt; is not correct: if there were
&lt;i&gt;simultaneously&lt;/i&gt; a derivation for the package
&lt;code&gt;r-janeaustenr&lt;/code&gt; at version &lt;code&gt;1.0.0&lt;/code&gt;
&lt;i&gt;and&lt;/i&gt; a derivation for the package
&lt;code&gt;r&lt;/code&gt; at version &lt;code&gt;janeaustenr-1.0.0&lt;/code&gt;,
then either both or none of them would match, while it is possible that
only one of the two packages is covered by the science team. This situation
has not been encountered yet; at worst, we would capture one derivation too many.

For testing purposes during the
development of the TBFG, we additionally check whether the name equals
&lt;code&gt;lrslib&lt;/code&gt;; in this way only one derivation is returned (while
at the time of writing there are more than 700 packages covered by the
science team).
Moreover the package in question is a self-contained C program (without
any inputs), which compiles rather quickly.
&lt;/p&gt;


&lt;h2&gt;Submitting builds&lt;/h2&gt;

&lt;p&gt;
Now that we have a list of derivations, we would like to submit them from
our Guile script to the build coordinator. This is not very different from
the approach seen
&lt;a href=&quot;tbfg-1.html&quot;&gt;last time&lt;/a&gt;
for submitting from the command line.
Again it is recommended to open a browser window on the
&lt;code&gt;/activity&lt;/code&gt; page of the BFFE to see the build coordinator and
the agent in action.
&lt;/p&gt;
&lt;pre&gt;
(use-modules (guix-build-coordinator client-communication))

(define my-build-coordinator &amp;quot;http://localhost:8746&amp;quot;)
(define ignore-if-build-for-derivation-exists? #f)
(define ignore-if-build-for-outputs-exists? #f)
(define ensure-all-related-derivation-outputs-have-builds? #f)
(define priority 0)

(define (submit-build build-coordinator data-service derivation tags)
  ;; Given a DERIVATION (as a string), submit it to BUILD-COORDINATOR
  ;; together with TAGS;
  ;; DATA-SERVICE is passed through and used by the build coordinator to
  ;; obtain the derivation file and further references contained in
  ;; DERIVATION.
  (send-submit-build-request
    build-coordinator derivation (list data-service) 0 priority
    ignore-if-build-for-derivation-exists?
    ignore-if-build-for-outputs-exists?
    ensure-all-related-derivation-outputs-have-builds?
    tags))

(submit-build my-build-coordinator my-data-service (car derivation-name) '())
$11 = ((&amp;quot;build-submitted&amp;quot; . &amp;quot;8f8f1cad-fe9c-462c-bc59-3d1f87abf942&amp;quot;))
$12 = #&amp;lt;&amp;lt;response&amp;gt; …
&lt;/pre&gt;
&lt;p&gt;
The global variables, which we pass on to the &lt;code&gt;submit-build&lt;/code&gt;
function, determine the behaviour of the build coordinator.
If &lt;code&gt;ignore-if-build-for-derivation-exists?&lt;/code&gt; is true,
then the build will not be carried out a second time if it was already tried
(successfully or not) by the build coordinator before.
In production, it will thus be preferable to set it to &lt;code&gt;#t&lt;/code&gt;;
while still experimenting, we are likely to submit the same derivation
several times. Setting the value to &lt;code&gt;#f&lt;/code&gt; would also make sense
to check that rebuilding the same package works.
The variable &lt;code&gt;ignore-if-build-for-outputs-exists?&lt;/code&gt; goes a bit
further; if set to &lt;code&gt;#t&lt;/code&gt;, then the build will not be carried out
if a different derivation with the same output was already tried (a very
technical distinction; I would recommend leaving it at &lt;code&gt;#f&lt;/code&gt;).
If &lt;code&gt;ensure-all-related-derivation-outputs-have-builds?&lt;/code&gt; is
&lt;code&gt;#t&lt;/code&gt;,
then the build coordinator will recursively submit builds for all the
derivations required as inputs to a given derivation. While this sounds
reasonable at first, it can go very far, since the coordinator does not
look at the store, but at the builds it has handled itself and recorded
in its database. This means that the first build submission, when the
database is still empty, will entail a complete bootstrap of the Guix
distribution. So I would recommend leaving this one at &lt;code&gt;#f&lt;/code&gt; as well.
The build then works as follows: The coordinator sends the derivation
to an agent. The agent tries to download all required inputs from a
substitute server and if successful, will build only the derivation it is
asked to build. Otherwise, it reports back to the coordinator that it has
encountered a set-up failure, together with a list of missing inputs.
This triggers a hook in the coordinator, and the default hook is to add
the missing inputs to the list of outstanding builds, as well as the
failed build itself to try it again once the inputs are available.
In this way, even if
&lt;code&gt;ensure-all-related-derivation-outputs-have-builds?&lt;/code&gt; is
&lt;code&gt;#f&lt;/code&gt;, all really missing inputs will be built recursively,
until the build succeeds or a real failure in one of its inputs is
encountered.
&lt;/p&gt;
&lt;p&gt;
The submission immediately returns
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/guile.html#Multiple-Values&quot;&gt;two
values&lt;/a&gt;,
without waiting for the package build to finish. The first return value
can be used to link the submitted derivation to the shown UUID of the
build, which is a key in the build coordinator database. The second
return value is the HTTP response, which we will ignore from now on.
&lt;/p&gt;
&lt;p&gt;
Tags can be added in a parenthesis-rich format; the parameter is a list
of tags, where each tag is a two-element list (not a pair!) whose
elements are both pairs: the first pairs the keyword &lt;code&gt;key&lt;/code&gt;
with a value, the second pairs the keyword &lt;code&gt;value&lt;/code&gt; with a
value (the values are used to construct the URL and can be strings or
numbers). So the following would work:
&lt;/p&gt;
&lt;pre&gt;
(define tags `(((key . &amp;quot;commit&amp;quot;)(value . ,commit))
               ((key . &amp;quot;name&amp;quot;)(value . ,(cadr derivation-name)))
               ((key . &amp;quot;build&amp;quot;)(value . 2))))
(submit-build my-build-coordinator my-data-service (car derivation-name) tags)
$13 = ((&amp;quot;build-submitted&amp;quot; . &amp;quot;82a56cac-1e93-4b4a-926f-d8762f919219&amp;quot;))
$14 = #&amp;lt;&amp;lt;response&amp;gt; …
&lt;/pre&gt;
&lt;p&gt;
The tags are shown in the activity window and are also recorded in the
build coordinator database; as shown here, they can encode arbitrary
additional information about a build, such as the commit it comes from, the
package name or the submission count for a given derivation.
&lt;/p&gt;


&lt;h2&gt;Code&lt;/h2&gt;

&lt;p&gt;
For ease of use, the code developed in this post is made available, under
GPLv3 or later, in a dedicated
&lt;a href=&quot;https://codeberg.org/enge/tbfg&quot;&gt;git repository&lt;/a&gt;
on
&lt;a href=&quot;https://codeberg.org/&quot;&gt;Codeberg&lt;/a&gt;.
More precisely, it is collected in the file
&lt;a href=&quot;https://codeberg.org/enge/tbfg/src/commit/51eb5c6d45c66d15b7c14340ec3af0732b5b66fd/tbfg.scm&quot;&gt;tbfg.scm&lt;/a&gt;
at commit 51eb5c6d45c66d15b7c14340ec3af0732b5b66fd.
&lt;/p&gt;


&lt;h2&gt;Outlook&lt;/h2&gt;

&lt;p&gt;
We have queried the data service and used the resulting information on
packages and derivations to submit build jobs to the build coordinator.
But so far we have no programmatic access to the build results; we have
only seen the builds flicker by on the BFFE website.
It would be nice to record success or failure, and more generally to keep
track of the builds; this will be our next step.
Since we do not want to operate a substitute server, but merely to follow
the state of the packages under the responsibility of the science team, we
are, unlike the official build farms, not necessarily interested in
obtaining the build results. These are sent from the build agents to the build coordinator;
on the bordeaux build farm the
&lt;a href=&quot;https://codeberg.org/guix/nar-herder&quot;&gt;nar herder&lt;/a&gt;
shovels them to a separate substitute server.
For us everything is on the same machine, which will thus contain
successfully built packages in its store (at least until the next
&lt;code&gt;guix gc&lt;/code&gt; run). If desired, these could be made available using
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Invoking-guix-publish.html&quot;&gt;&lt;code&gt;guix
publish&lt;/code&gt;&lt;/a&gt;.
&lt;/p&gt;

&lt;/div&gt;</content></entry><entry><title>Tiny Build Farm for Guix, part 1</title><id>https://enge.math.u-bordeaux.fr/blog/tbfg-1.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2025-08-27T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/tbfg-1.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h1&gt;Setting up scores of services&lt;/h1&gt;

&lt;p&gt;
One of the oft-cited reasons people give for not switching to
&lt;a href=&quot;https://guix.gnu.org/&quot;&gt;Guix&lt;/a&gt; is that their favourite software
is too outdated, and a look at
&lt;a href=&quot;https://repology.org/&quot;&gt;Repology&lt;/a&gt; shows that they are not wrong.
Now the number of active committers in the Guix project is amazingly small,
and even counting all contributors I am impressed by what these few people
actually achieve. Nevertheless I wondered how I could improve the situation
at least a little bit for packages I am interested in, that is, for the
science team; and the first step is to get an account of what actually
builds and what does not.
So I decided to set up my own little build farm, limited to the packages
in the scope of the science team, using the same technology that powers the
&lt;a href=&quot;https://bordeaux.guix.gnu.org/&quot;&gt;bordeaux&lt;/a&gt; build farm.
I call it the &lt;i&gt;Tiny Build Farm for Guix&lt;/i&gt;, or &lt;i&gt;TBFG&lt;/i&gt; for short,
and this post is the first one in (hopefully) a series of blog posts
about the topic; at the time of starting this series, the TBFG does not
actually exist yet, so wish me luck.
&lt;/p&gt;


&lt;h2&gt;Motivation&lt;/h2&gt;

&lt;p&gt;
Before trying to solve a technical problem, let me digress a little bit:
Is there actually a problem? And if yes, why?
As has become my conviction over the years, the really difficult and major
problems in a project such as Guix are actually social and not technical.
They are rooted in the structure of Guix as a loosely coupled group of
volunteers who work on a common goal, mostly in their spare time; but when
I speak about a common goal, every volunteer has in fact their own goals,
and arriving at a coherent whole is partially due to the internal
structuring of the social project, and partially an emergent property
of a complex system.
Concretely, it happens often that contributors propose a package for a
software project they like; if it concerns free software and follows our
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Packaging-Guidelines.html&quot;&gt;packaging
guidelines&lt;/a&gt;, it usually ends up being committed to the software
distribution. The original contributor may leave, the package may bitrot
and stop being buildable due to changes made in other parts of the
distribution. Sometimes people submit a bug report, sometimes committers
without interest in the actual software provide a fix, but maybe nobody
uses the software anymore, and it happens that over several years nobody
notices it is broken. This is problematic since even broken packages use
resources on the build farms. And when introducing, say, an update to a
library package, it becomes difficult to say whether the failure of a
dependency is a new phenomenon due to the change or whether it was already
present. So there are good reasons to strive for a distribution that is
100% buildable at all times.
And while I alone certainly cannot reach this for the currently more than
&lt;a href=&quot;https://repology.org/repository/gnuguix&quot;&gt;28000 packages&lt;/a&gt; in
Guix, doing it only for the science team, or maybe only the
&lt;code&gt;algebra&lt;/code&gt; and &lt;code&gt;maths&lt;/code&gt; modules, in which I am
particularly interested, appears to be a reachable goal.
&lt;/p&gt;
&lt;p&gt;
But is a tiny build farm really needed? The honest answer is “no”, since
the information is already out there in the big build farms.
For historical reasons, Guix has two of them.
One is
&lt;a href=&quot;https://ci.guix.gnu.org/&quot;&gt;CI&lt;/a&gt;, also called &lt;i&gt;berlin&lt;/i&gt; for
the location of most of its build machines; it relies on
&lt;a href=&quot;https://codeberg.org/guix/cuirass&quot;&gt;Cuirass&lt;/a&gt;,
a continuous integration system written purposefully for Guix.
It shows the
&lt;a href=&quot;https://ci.guix.gnu.org/jobset/master&quot;&gt;state of the master
branch&lt;/a&gt; and provides a dashboard from which the desired information
could certainly be extracted automatically.
On the other hand there is the
&lt;a href=&quot;https://bordeaux.guix.gnu.org/&quot;&gt;bordeaux&lt;/a&gt; build farm,
named after the location of its head node; it runs a suite of continuous
integration tools written purposefully for Guix by
&lt;a href=&quot;https://www.cbaines.net/&quot;&gt;Christopher Baines&lt;/a&gt;.
One of its parts is the
&lt;a href=&quot;https://data.guix.gnu.org/&quot;&gt;Guix Data Service&lt;/a&gt;,
and its REST API with a JSON frontend makes it again possible to extract
the information I am interested in – a system that knows about &lt;i&gt;all&lt;/i&gt;
packages in Guix by definition knows about the packages in the realm
of the science team.
&lt;/p&gt;
&lt;p&gt;
So setting up my TBFG is mainly an educational project – I would like to
learn how our technology works. But the TBFG can also be used to obtain
information about a collection of packages that is not part of Guix proper,
or to look at the impact of local changes.
Many people and projects have successfully set up Cuirass to manage their
local package collection; this is, for instance, the case for the
&lt;a href=&quot;https://codeberg.org/guix-science/guix-science&quot;&gt;Guix Science&lt;/a&gt;
and
&lt;a href=&quot;https://hpc.guix.info/&quot;&gt;Guix HPC&lt;/a&gt; projects through the
&lt;a href=&quot;https://guix.bordeaux.inria.fr/&quot;&gt;build server&lt;/a&gt;
at INRIA Bordeaux.
The software behind the bordeaux build farm is more complex and consists
of several interconnected
&lt;a href=&quot;https://www.gnu.org/software/shepherd/&quot;&gt;Shepherd&lt;/a&gt; services,
so that it may in fact be less suited for a personal project.
However, not least because I host part of that build farm at home,
I am more interested in understanding this software stack, which is also
less documented. So I am going to build the TBFG on top of this technology.
Before diving in, I take the opportunity to thank Christopher Baines for
his precious help during a Guix/Nix hackers' meeting; without him, I would
not have been able to launch myself into this endeavour.
&lt;/p&gt;


&lt;h2&gt;Build coordinator and agent&lt;/h2&gt;

&lt;p&gt;
As a first step, we need to install the Guix Build Coordinator and one or
more build agents. For a really tiny TBFG, I will keep everything on only
one machine set aside for the purpose; it is called &lt;i&gt;bedok&lt;/i&gt; and is
one of the &lt;a href=&quot;https://foundation.guix.info/assets/index.html&quot;&gt;Lenovo
Thinkpad X1 Gen9&lt;/a&gt; machines with a four-core
11th Gen Intel Core i7-1165G7 processor running at 2.80GHz graciously
donated by &lt;a href=&quot;https://www.tweag.io/&quot;&gt;Tweag&lt;/a&gt;.
For the build coordinator, this is trivial; simply add the two lines
&lt;/p&gt;
&lt;pre&gt;
(service guix-build-coordinator-service-type
  (guix-build-coordinator-configuration))
&lt;/pre&gt;
&lt;p&gt;
to the services configuration of the Guix system declaration and reconfigure
the machine. For a start, the default configuration options explained in
more detail in the
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Guix-Services.html#index-guix_002dbuild_002dcoordinator_002dservice_002dtype&quot;&gt;manual&lt;/a&gt;
are appropriate (things will become more complicated if anything is to be
done with the build results, which would require setting the
&lt;code&gt;hooks&lt;/code&gt; field of the
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Guix-Services.html#index-guix_002dbuild_002dcoordinator_002dservice_002dtype&quot;&gt;configuration
record&lt;/a&gt;).
See also the documentation in the
&lt;a href=&quot;https://codeberg.org/guix/build-coordinator&quot;&gt;git repository&lt;/a&gt;
of the project.
&lt;/p&gt;
&lt;p&gt;
Next we need the &lt;code&gt;guix-build-coordinator&lt;/code&gt; package, which provides
a command line interface to the coordinator. It could be installed into an
arbitrary profile since the security model of the build coordinator is very
basic: It is assumed that the coordinator runs on a server of its own,
and anybody with access to the machine has control over the service.
&lt;/p&gt;
&lt;p&gt;
⚠ However, installing the package into a user profile currently has a big
drawback: it propagates the guix package, and the corresponding guix takes
precedence over the one in
&lt;code&gt;$HOME/.config/guix/current/bin&lt;/code&gt;.
The latter is updated by &lt;code&gt;guix pull&lt;/code&gt;, but the former is not;
so without special precautions, this prevents the user from updating Guix.
Instead one can start a Guix shell as follows:
&lt;/p&gt;
&lt;pre&gt;
guix shell guix-build-coordinator
&lt;/pre&gt;
&lt;p&gt;
Running
&lt;/p&gt;
&lt;pre&gt;
$ guix-build-coordinator agent list
&lt;/pre&gt;
&lt;p&gt;
shows nothing, so the next step is to set up a build agent, which requires
some preparation on the machine where the build coordinator is running.
Executing
&lt;/p&gt;
&lt;pre&gt;
$ guix-build-coordinator agent new
e092df28-3418-4f94-b7f0-a214b03291ee
&lt;/pre&gt;
&lt;p&gt;
creates and prints a new random version 4
&lt;a href=&quot;https://en.wikipedia.org/wiki/Universally_unique_identifier&quot;&gt;UUID&lt;/a&gt;
and stores it in the agents table in its internal database.
The next step is to set up authentication; for a small number of build
agents, a
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Guix-Services.html#index-guix_002dbuild_002dcoordinator_002dagent_002dpassword_002dfile_002dauth&quot;&gt;password
file&lt;/a&gt; is a suitable approach, otherwise an
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Guix-Services.html#index-guix_002dbuild_002dcoordinator_002dagent_002ddynamic_002dauth&quot;&gt;authentication
token&lt;/a&gt;,
which may be shared among several agents, can be used.
So we create a password for the agent by running
&lt;/p&gt;
&lt;pre&gt;
$ guix-build-coordinator agent e092df28-3418-4f94-b7f0-a214b03291ee password new
new password: IUwGrsEklfu0kVf_QquCOdzfa6-P52qPlcwBd5YB
&lt;/pre&gt;
&lt;p&gt;
which again prints the password and saves it into the coordinator
database.
&lt;/p&gt;
&lt;p&gt;
To simplify things for the human brain, we can give the agent a name; since
there is not yet a command line argument for this, we take it as a pretext
to look more closely at the SQLite database structure. So install the
&lt;code&gt;sqlite&lt;/code&gt; package into the &lt;code&gt;root&lt;/code&gt; profile, launch
&lt;code&gt;sqlite3&lt;/code&gt; on the coordinator database, and run the following
commands:
&lt;/p&gt;
&lt;pre&gt;
# sqlite3 /var/lib/guix-build-coordinator/guix_build_coordinator.db
sqlite&amp;gt; .tables
agent_passwords
agent_tags
agents
…
builds
…
tags
…
sqlite&amp;gt; select * from agent_passwords;
1|e092df28-3418-4f94-b7f0-a214b03291ee|IUwGrsEklfu0kVf_QquCOdzfa6-P52qPlcwBd5YB|2025-08-13 00:00:00
sqlite&amp;gt; .schema agents
CREATE TABLE agents (
       id TEXT PRIMARY KEY,
       description TEXT
, name TEXT, active NOT NULL DEFAULT 1);
&lt;/pre&gt;
&lt;p&gt;
There are quite a few tables, but for now we are mainly interested in
those related to the agents. We can recover the password in case we forgot
to write it down (alternatively we could run
&lt;code&gt;guix-build-coordinator agent e092df28-3418-4f94-b7f0-a214b03291ee password&lt;/code&gt;),
and we see that agents can have a name and a description, and be active
or not.
So still in SQLite run
&lt;/p&gt;
&lt;pre&gt;
sqlite&amp;gt; update agents set name='bedok', description='TBFG agent' where id='e092df28-3418-4f94-b7f0-a214b03291ee';
sqlite&amp;gt; select * from agents;
e092df28-3418-4f94-b7f0-a214b03291ee|TBFG agent|bedok|1
&lt;/pre&gt;
&lt;p&gt;
Alternatively, to check that everything has gone well, run (not necessarily
as root anymore):
&lt;/p&gt;
&lt;pre&gt;
$ guix-build-coordinator agent list
e092df28-3418-4f94-b7f0-a214b03291ee: bedok
  description:
  TBFG agent  active?: true
  0 allocated builds:
  requested systems:
  tags:
&lt;/pre&gt;
&lt;p&gt;
Now it is finally time to really set up the build agent! On the machine
where it is supposed to run (in our case, this is &lt;i&gt;bedok&lt;/i&gt; again, but
it could be an arbitrary machine somewhere on the Internet, since all
communication would take place over https), we need to handle the password.
As usual in Guix, secrets are not saved in configuration that is publicly
visible in the store, but rather as separate state; so as &lt;code&gt;root&lt;/code&gt;
create a file &lt;code&gt;/etc/guix-build-coordinator/agent-bedok-passwd&lt;/code&gt;
containing the password
&lt;code&gt;IUwGrsEklfu0kVf_QquCOdzfa6-P52qPlcwBd5YB&lt;/code&gt;
created above by the coordinator.
Then add the following snippet to the server part of the operating system
configuration:
&lt;/p&gt;
&lt;pre&gt;
(service guix-build-coordinator-agent-service-type
  (guix-build-coordinator-agent-configuration
    (authentication
      (guix-build-coordinator-agent-password-file-auth
        (uuid &amp;quot;e092df28-3418-4f94-b7f0-a214b03291ee&amp;quot;)
        (password-file
          &amp;quot;/etc/guix-build-coordinator/agent-bedok-passwd&amp;quot;)))
    (derivation-substitute-urls
      '(&amp;quot;https://data.guix.gnu.org&amp;quot;))
    (non-derivation-substitute-urls
      '(&amp;quot;https://bordeaux.guix.gnu.org&amp;quot;))
    (systems '(&amp;quot;x86_64-linux&amp;quot; &amp;quot;i686-linux&amp;quot;))
    (max-parallel-builds 4)
    (max-parallel-uploads 2)
    (max-1min-load-average 6)))
&lt;/pre&gt;
&lt;p&gt;
and reconfigure the machine.
Concerning the different parameters, see the
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Guix-Services.html#index-guix_002dbuild_002dcoordinator_002dagent_002dservice_002dtype&quot;&gt;documentation&lt;/a&gt;.
For authentication, we need to provide our &lt;code&gt;uuid&lt;/code&gt; and the
location of the &lt;code&gt;password-file&lt;/code&gt;.
Since the agent runs on the same machine as the coordinator, I kept the
&lt;code&gt;coordinator&lt;/code&gt; field at its default
&lt;code&gt;&amp;quot;http://localhost:8745&amp;quot;&lt;/code&gt;; otherwise &lt;code&gt;localhost&lt;/code&gt; needs
to be replaced by the host name of the coordinator, and the protocol should
be set to &lt;code&gt;https&lt;/code&gt;.
The &lt;code&gt;derivation-substitute-urls&lt;/code&gt; field has no default; we will
discuss it in the next section.
The &lt;code&gt;non-derivation-substitute-urls&lt;/code&gt; field also needs to be set
to avoid compiling each and every package input locally; here I chose to
only use the bordeaux build farm, but one could add
&lt;code&gt;&amp;quot;https://ci.guix.gnu.org&amp;quot;&lt;/code&gt; to also fetch packages from berlin.
If the &lt;code&gt;systems&lt;/code&gt; field is not set, then only packages for the
system on which the agent is running (most likely &lt;code&gt;x86_64-linux&lt;/code&gt;)
are handled; here it is useful to add &lt;code&gt;i686-linux&lt;/code&gt;.
Or on an ARM machine, the combo
&lt;code&gt;'(&amp;quot;aarch64-linux&amp;quot; &amp;quot;armhf-linux&amp;quot;)&lt;/code&gt; makes sense.
The numerical parameters can be left at their defaults; here I am trying
to limit the load on my four core processor.
&lt;/p&gt;
&lt;p&gt;
If all goes well, we should see lines in the agent logfile
&lt;code&gt;/var/log/guix-build-coordinator/agent.log&lt;/code&gt;
looking like
&lt;/p&gt;
&lt;pre&gt;
2025-08-13 00:00:00 (INFO ): starting agent e092df28-3418-4f94-b7f0-a214b03291ee
2025-08-13 00:00:00 (INFO ): connecting to coordinator http://localhost:8745
2025-08-13 00:00:00 (INFO ): running 0 threads, currently allocated 0 builds
2025-08-13 00:00:00 (INFO ): starting 0 new builds
&lt;/pre&gt;
&lt;p&gt;
and running
&lt;/p&gt;
&lt;pre&gt;
$ guix-build-coordinator agent list
&lt;/pre&gt;
&lt;p&gt;
on the coordinator machine again should now print two entries
beneath &lt;code&gt;requested systems&lt;/code&gt;.
&lt;/p&gt;
&lt;p&gt;
As a last step, return to the
&lt;code&gt;/etc/guix-build-coordinator/agent-bedok-passwd&lt;/code&gt; file,
which is probably world-readable. Starting the build agent has created
a user &lt;code&gt;guix-build-coordinator-agent&lt;/code&gt;; I would recommend
making the file owned by that user and removing permissions for all
other users.
&lt;/p&gt;
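&lt;p&gt;
Concretely, this amounts to a &lt;code&gt;chown&lt;/code&gt; and a
&lt;code&gt;chmod&lt;/code&gt;; the following sketch demonstrates the permission
bits on a throw-away file, since the real commands need to be run as
&lt;code&gt;root&lt;/code&gt; on the password file itself:
&lt;/p&gt;

```shell
# On the real system, as root:
#   chown guix-build-coordinator-agent /etc/guix-build-coordinator/agent-bedok-passwd
#   chmod 600 /etc/guix-build-coordinator/agent-bedok-passwd
# Demonstration on a temporary file:
f=$(mktemp)
chmod 600 "$f"        # read and write for the owner, nothing for others
stat -c '%a' "$f"     # prints 600
rm "$f"
```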


&lt;h2&gt;Data service&lt;/h2&gt;

&lt;p&gt;
This is the elephant in the room, almost literally. When starting my project
of the TBFG, I had intended to set up the full stack of software needed
to run the bordeaux build farm, and the
&lt;a href=&quot;https://codeberg.org/guix/data-service/&quot;&gt;Guix Data Service&lt;/a&gt;
is a very important part of it. But the code base is massive (more than
30000 lines of Scheme code at the time of writing), and the required
resources are massive as well: The service spends its time polling the git
repository of Guix, compiling the sources and computing all the derivations
for all the packages, which are then stored in a database. Rinse and repeat
for the next commit. (In reality, the data service does even more, in
particular it also queries build servers and stores information about
builds; but the above functionality is everything we need for the TBFG.)
As can be seen, not even the
&lt;a href=&quot;https://data.guix.gnu.org/repository/1/branch/master&quot;&gt;official
data service&lt;/a&gt;, running continuously on a powerful server, manages to
do that for all commits: Some of them are marked green, while others, the
grey ones, are skipped, in particular at times of high commit activity.
&lt;/p&gt;
&lt;p&gt;
So I have decided to rely on the central Guix data service by setting
the corresponding &lt;code&gt;derivation-substitute-urls&lt;/code&gt; field of the
build coordinator to &lt;code&gt;'(&amp;quot;https://data.guix.gnu.org&amp;quot;)&lt;/code&gt;.
All information obtained by clicking through the data service website
can also be obtained as JSON through its REST API, which we will use in our
scripts to determine the derivations of science team packages to be built.
&lt;/p&gt;
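&lt;p&gt;
As a hedged sketch of what such a script might do: the host below is
real, but the endpoint path is an illustrative assumption to be checked
against the data service itself, and the commit hash is made up.
&lt;/p&gt;

```shell
# Hedged sketch: assemble a JSON request against the Guix Data Service.
# The /revision/<commit>/packages path and the Accept header are
# assumptions; the commit hash is a placeholder.
base=https://data.guix.gnu.org
commit=0123456789abcdef0123456789abcdef01234567   # hypothetical commit
url="$base/revision/$commit/packages"
echo "$url"
# curl -s -H 'Accept: application/json' "$url" | jq .   # needs network
```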
&lt;p&gt;
For this to work, we also need the Guix daemon to accept the signing key
of the data service; so the &lt;code&gt;services&lt;/code&gt; field of the operating
system declaration should look like this:
&lt;/p&gt;
&lt;pre&gt;
(services
  (append
    (modify-services %base-services
      (guix-service-type config =&amp;gt;
        (guix-configuration
          (substitute-urls '(&amp;quot;https://bordeaux.guix.gnu.org&amp;quot;))
          (authorized-keys
            (list
              (local-file &amp;quot;keys/guix/bordeaux.guix.gnu.org-export.pub&amp;quot;)
              (local-file &amp;quot;keys/guix/data.guix.gnu.org.pub&amp;quot;)))
          (max-silent-time (* 24 3600))
          (timeout (* 48 3600)))))
    (list
      (service guix-build-coordinator-service-type
        (guix-build-coordinator-configuration))
      …)))
&lt;/pre&gt;
&lt;p&gt;
where the key files are copy-pasted into the local
&lt;code&gt;keys/guix&lt;/code&gt; subdirectory from the
&lt;a href=&quot;https://codeberg.org/guix/maintenance/src/commit/master/hydra/keys/guix&quot;&gt;corresponding
place&lt;/a&gt;
in the
&lt;a href=&quot;https://codeberg.org/guix/maintenance&quot;&gt;guix/maintenance&lt;/a&gt;
git repository.
&lt;/p&gt;


&lt;h2&gt;BFFE – Build Farm Front End&lt;/h2&gt;

&lt;p&gt;
We could start submitting build jobs now, but to visualise what is
happening, we need another service, &lt;code&gt;bffe&lt;/code&gt;. The
&lt;i&gt;build farm frontend&lt;/i&gt; actually serves two purposes:
On the one hand, on the bordeaux build farm it submits build jobs for the
master branch to ensure continuous substitute availability (while a
different service, &lt;code&gt;qa-frontpage&lt;/code&gt;, submits build jobs for
testing branches and pull requests to the same build coordinator instance).
On the other hand, it provides a web server that connects to the
build coordinator and shows information about its status.
We will only need the second functionality. For this, add the following
snippet to the service configuration of the TBFG machine:
&lt;/p&gt;
&lt;pre&gt;
(service bffe-service-type
  (bffe-configuration
    (arguments
      #~(list
        #:web-server-args
          '(#:event-source &amp;quot;http://localhost:8746&amp;quot;
            #:controller-args (#:title &amp;quot;Science team build farm&amp;quot;))))))
&lt;/pre&gt;
&lt;p&gt;
We use the &lt;code&gt;#:web-server-args&lt;/code&gt; argument and provide as
&lt;code&gt;event-source&lt;/code&gt; the local build coordinator instance, which
communicates with clients on port 8746 (while communication with the
agents runs on port 8745 as seen above).
For more details on the optional &lt;code&gt;#:build&lt;/code&gt; argument, see
the
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Guix-Services.html#index-bffe_002dservice_002dtype&quot;&gt;documentation&lt;/a&gt;
in the Guix manual or the
&lt;a href=&quot;https://codeberg.org/guix/bffe&quot;&gt;source code&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
Reconfigure the system and open the BFFE website, either at
&lt;a href=&quot;http://localhost:8767&quot;&gt;&lt;code&gt;http://localhost:8767&lt;/code&gt;&lt;/a&gt;
locally on the TBFG machine, or at
&lt;a href=&quot;http://192.168.1.80:8767&quot;&gt;&lt;code&gt;http://192.168.1.80:8767&lt;/code&gt;&lt;/a&gt;
from another machine in your local network, where you have to adapt the
IP address to your situation.
The result is a rather empty web page, but it should at least show the title
we have chosen.
For our purposes, we are interested in the page obtained by appending
&lt;a href=&quot;http://192.168.1.80:8767/activity&quot;&gt;&lt;code&gt;/activity&lt;/code&gt;&lt;/a&gt;
to the URL.
This shows a box &lt;i&gt;Recent activity&lt;/i&gt;, which is rightly empty;
and a list of agents per architecture with their current occupation,
which should also be empty, except possibly for the percentage giving the
CPU load in case the agent has other business as well.
Clicking on the name of an agent leads to yet another page with
more details (among them the description we provided earlier).
&lt;/p&gt;


&lt;h2&gt;Submitting a build&lt;/h2&gt;

&lt;p&gt;
After all these preparations, we can finally submit our first build job!
For this, keep the &lt;code&gt;/activity&lt;/code&gt; page open, and run
&lt;/p&gt;
&lt;pre&gt;
$ DRV=$(guix build hello --derivations)
$ guix-build-coordinator build --derivation-substitute-urls=https://data.guix.gnu.org $DRV
build submitted as 7bdd2249-1214-431a-aa61-de4d907f1b32
&lt;/pre&gt;
&lt;p&gt;
Providing a URL from which to fetch derivations is necessary because the
build coordinator receives only the store path of the derivation, not
the actual file; the same then holds recursively for the inputs referenced
by the submitted derivation. This could be a global parameter of the
&lt;code&gt;guix-build-coordinator-configuration&lt;/code&gt;, but currently it needs
to be specified by hand and can thus vary over time or from build to
build.
&lt;/p&gt;
&lt;p&gt;
If all goes well, a UUID for the build is printed in the terminal, and
three lines appear on the BFFE web page in the &lt;i&gt;Recent activity&lt;/i&gt; box:
&lt;i&gt;Build submitted&lt;/i&gt;, &lt;i&gt;Build started&lt;/i&gt; and &lt;i&gt;Build succeeded&lt;/i&gt;.
The &lt;code&gt;/var/log/guix-build-coordinator/coordinator.log&lt;/code&gt; and
&lt;code&gt;/var/log/guix-build-coordinator/agent.log&lt;/code&gt; files should also
contain matching information.
And you can go to the
&lt;code&gt;http://192.168.1.80:8767/build/7bdd2249-1214-431a-aa61-de4d907f1b32&lt;/code&gt;
URL to see more detailed information on the build, with potentially a link
to the build log.
&lt;/p&gt;
&lt;p&gt;
This link unfortunately does not work out of the box, and some more
configuration is required to make the build logs available.
&lt;/p&gt;


&lt;h2&gt;Nginx for the build logs&lt;/h2&gt;

&lt;p&gt;
One possibility for accessing the build logs is by directly looking them
up in the place where they are stored by the build coordinator, the
directory &lt;code&gt;/var/lib/guix-build-coordinator/build-logs/&lt;/code&gt;.
Each build corresponds to a subdirectory named after its UUID and containing
the file &lt;code&gt;log.gz&lt;/code&gt;.
&lt;/p&gt;
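&lt;p&gt;
Reading such a log then boils down to decompressing it. The following
sketch simulates the directory layout under a temporary root, since the
real /var/lib/guix-build-coordinator/build-logs/ only exists on the
coordinator machine:
&lt;/p&gt;

```shell
# Simulated layout: one subdirectory per build UUID, containing log.gz.
root=$(mktemp -d)   # stand-in for /var/lib/guix-build-coordinator/build-logs
uuid=7bdd2249-1214-431a-aa61-de4d907f1b32   # build UUID from above
mkdir -p "$root/$uuid"
printf 'starting build...\n' | gzip > "$root/$uuid/log.gz"
zcat "$root/$uuid/log.gz"   # prints: starting build...
rm -r "$root"
```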
&lt;p&gt;
Alternatively, we can imitate the
&lt;a href=&quot;https://codeberg.org/guix/maintenance/src/commit/bc7f188a027313437f0afb977bfe802d307d8dd3/hydra/bayfront.scm#L902-L918&quot;&gt;behaviour&lt;/a&gt;
of the bordeaux build farm and set up a separate
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Web-Services.html#index-nginx_002dservice_002dtype&quot;&gt;nginx&lt;/a&gt;
web server using the following service snippet:
&lt;/p&gt;
&lt;pre&gt;
(service nginx-service-type
  (nginx-configuration
    (server-blocks
      (list
        (nginx-server-configuration
          (listen '(&amp;quot;80&amp;quot; &amp;quot;[::]:80&amp;quot;))
          (locations
            (list
              (nginx-location-configuration
                (uri &amp;quot;~ \&amp;quot;\\/build\\/([a-z0-9-]{36})/log$\&amp;quot;&amp;quot;)
                (body '(&amp;quot;alias /var/lib/guix-build-coordinator/build-logs/$1/log;&amp;quot;
                        &amp;quot;add_header Content-Type 'text/plain; charset=UTF-8';&amp;quot;
                        &amp;quot;gzip_static always;&amp;quot;
                        &amp;quot;gunzip on;&amp;quot;)))
              (nginx-location-configuration
                (uri &amp;quot;/&amp;quot;)
                (body '(&amp;quot;proxy_pass http://localhost:8767;&amp;quot;
                        &amp;quot;proxy_http_version 1.1;&amp;quot;
                        &amp;quot;proxy_set_header Connection \&amp;quot;\&amp;quot;;&amp;quot;)))
              (nginx-location-configuration
                (uri &amp;quot;/events&amp;quot;)
                (body '(&amp;quot;proxy_pass http://localhost:8767;&amp;quot;
                        &amp;quot;proxy_http_version 1.1;&amp;quot;
                        &amp;quot;proxy_buffering off;&amp;quot;
                        &amp;quot;proxy_set_header Connection \&amp;quot;\&amp;quot;;&amp;quot;))))))))))
&lt;/pre&gt;
&lt;p&gt;
The first
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/Web-Services.html#index-nginx_002dlocation_002dconfiguration&quot;&gt;&lt;code&gt;nginx-location-configuration&lt;/code&gt;&lt;/a&gt;
serves the build logs, while the other two are reverse proxies towards
the pages provided by BFFE at port &lt;code&gt;8767&lt;/code&gt;.
If the activity page is now accessed through the nginx web server at the
standard port, via the URL
&lt;a href=&quot;http://192.168.1.80/activity&quot;&gt;&lt;code&gt;http://192.168.1.80/activity&lt;/code&gt;&lt;/a&gt;,
it presents links to builds and their log files, which can be clicked on,
and the uncompressed log files are shown directly in the browser.
&lt;/p&gt;
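&lt;p&gt;
The regular expression in the log location can be sanity checked outside
nginx: &lt;code&gt;grep -E&lt;/code&gt; understands the same extended syntax (the
pattern below is simplified to drop the backslash-escaped slashes, which
plain grep does not need):
&lt;/p&gt;

```shell
# The character class [a-z0-9-]{36} matches a standard 36-character UUID,
# so a build-log URL path is accepted by the pattern:
echo '/build/7bdd2249-1214-431a-aa61-de4d907f1b32/log' \
  | grep -E '/build/[a-z0-9-]{36}/log$'
```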


&lt;h2&gt;Outlook&lt;/h2&gt;

&lt;p&gt;
This was a lot of work for setting up the necessary services!
But hopefully it was also a good occasion to understand how the different
components interact.
In the
&lt;a href=&quot;tbfg-2.html&quot;&gt;next installment&lt;/a&gt;
we will start writing our own scripts to communicate with these services;
in particular we will work with the data service to retrieve information
about the packages we are interested in.
&lt;/p&gt;

&lt;/div&gt;</content></entry><entry><title>Wireguard VPN with Guix</title><id>https://enge.math.u-bordeaux.fr/blog/wireguard.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2025-08-07T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/wireguard.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h2&gt;Needing a VPN&lt;/h2&gt;

&lt;p&gt;
Recently I changed my ISP, and the new one uses
&lt;a href=&quot;https://en.wikipedia.org/wiki/Carrier-grade_NAT&quot;&gt;Carrier-grade NAT&lt;/a&gt;,
or CGNAT, by default. While this sounds fancy and professional, it is in
fact even worse than conventional NAT: Not only do all my devices share the
same IPv4, but I share one IPv4 with several other customers!
Apparently I am only assigned a few out of the 65535 ports, and this
assignment may change from day to day, which implies that I cannot connect
from the outside to any of my home devices.
However, I do have a separate IPv4 of my own for a
&lt;a href=&quot;https://www.aquilenet.fr/services/h%C3%A9bergement-serveur/&quot;&gt;virtual
machine&lt;/a&gt;
at &lt;a href=&quot;https://www.aquilenet.fr/&quot;&gt;Aquilenet&lt;/a&gt;, and it should be
possible to use this as a trampoline to access my home through a virtual
private network.
We are already employing
&lt;a href=&quot;https://en.wikipedia.org/wiki/WireGuard&quot;&gt;WireGuard&lt;/a&gt;
for one of the
&lt;a href=&quot;https://guix.gnu.org/&quot;&gt;Guix&lt;/a&gt; build farms, so it felt like
a natural choice.
Guix provides the &lt;code&gt;wireguard-service-type&lt;/code&gt;, which is
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/VPN-Services.html&quot;&gt;documented&lt;/a&gt;
with all its options in the manual; but without an explanation of the
general concepts behind the service it is a bit difficult to set up.
The &lt;a href=&quot;https://guix.gnu.org/cookbook/en/html_node/&quot;&gt;Guix Cookbook&lt;/a&gt;
has an
&lt;a href=&quot;https://guix.gnu.org/cookbook/en/guix-cookbook.html#Connecting-to-Wireguard-VPN&quot;&gt;entry&lt;/a&gt;
on WireGuard, but it is concerned with kernel modules and connecting to an
existing WireGuard VPN, while my goal was to set one up in the first place.
This turned out to be surprisingly easy.
&lt;/p&gt;

&lt;p&gt;
The &lt;code&gt;wireguard-tools&lt;/code&gt; package comes with an executable
&lt;code&gt;wg&lt;/code&gt;, and running &lt;code&gt;wg --help&lt;/code&gt; is enough to guess
how WireGuard works; essentially we need the following two subcommands:
&lt;/p&gt;
&lt;pre&gt;
genkey: Generates a new private key and writes it to stdout
pubkey: Reads a private key from stdin and writes a public key to stdout
&lt;/pre&gt;
&lt;p&gt;
Unlike other VPNs, WireGuard appears to be more symmetric in the
sense that it does not distinguish between servers and clients; to talk to
each other, two participants just need to create a pair of public and
private keys each, and then to be made aware of each other's public key.
In our asymmetric situation in which only one of them has a public IPv4,
we will nevertheless distinguish the &lt;i&gt;server&lt;/i&gt;, which is reachable
from everywhere thanks to its IP, and the &lt;i&gt;clients&lt;/i&gt; hidden behind
the CGNAT.
&lt;/p&gt;


&lt;h2&gt;Creating key pairs&lt;/h2&gt;

&lt;p&gt;
In a first step, we create a &lt;i&gt;private&lt;/i&gt; key for the server.
In Guix, secrets are (so far) not handled through
the world-readable store, but as state directly on the machine, and
&lt;code&gt;wireguard-service-type&lt;/code&gt; expects by default the private key in
the file &lt;code&gt;/etc/wireguard/private.key&lt;/code&gt;. So we connect as root
to the server machine and execute
&lt;/p&gt;
&lt;pre&gt;
mkdir /etc/wireguard
umask 077
wg genkey &amp;gt; /etc/wireguard/private.key
&lt;/pre&gt;
&lt;p&gt;
The call to &lt;code&gt;umask&lt;/code&gt; is needed (at least with my shell settings)
to placate the WireGuard warning that the private key file is
world-readable, which would indeed defeat its purpose. The file contains a short
&lt;a href=&quot;https://en.wikipedia.org/wiki/Base64&quot;&gt;base64&lt;/a&gt;
encoded number such as
&lt;code&gt;GEhlpFGslXfo9We9jhrXham4LztmqSmpdE4ivML4qXc=&lt;/code&gt;.
Given the size (or rather lack thereof) of this number, it looks like
WireGuard uses elliptic curve cryptography with a fixed elliptic curve
and a fixed basepoint of 256 bits, so with a security level of 128 bits
(indeed, WireGuard key exchange is based on Curve25519).
&lt;/p&gt;
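&lt;p&gt;
The length of the string is consistent with that guess: base64 turns
every 3 bytes into 4 characters, so 32 random bytes (256 bits) always
encode to 44 characters, padding included, as a quick check confirms:
&lt;/p&gt;

```shell
# 32 bytes -> ceil(32/3)*4 = 44 base64 characters (including '=' padding),
# exactly the length of the keys shown above.
head -c 32 /dev/urandom | base64 | tr -d '\n' | wc -c   # prints 44
```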
&lt;p&gt;
In a second step, we need to create the corresponding public key using
the command
&lt;/p&gt;
&lt;pre&gt;
wg pubkey &amp;lt; /etc/wireguard/private.key
&lt;/pre&gt;
&lt;p&gt;
This outputs a base64 encoded number of similar size; in our example,
&lt;code&gt;AFL8UecS3GFX3hK8e6yWOK4s5RVrTpvTq2A0pdGuylQ=&lt;/code&gt;.
This public key is in fact not needed by the server, but only by the clients
wishing to connect to it (following a basic principle of asymmetric
cryptography), so we need to stow it away in a file. But since the process
of deriving the public key from a private key is deterministic, we may
actually forget the public key and recreate it when needed. And notice that
the public key is so short that it could even be exchanged on a postcard
or over the phone.
&lt;/p&gt;
&lt;p&gt;
This process of creating a key pair needs to be repeated on each client,
or more generally, each participant in the VPN. Let us assume we have one
client with private key
&lt;code&gt;0G4uhLLeY5NYmg/FobRB0p75wMrGwmmzhuoAdfX243I=&lt;/code&gt; in
&lt;code&gt;/etc/wireguard/private.key&lt;/code&gt; and corresponding public key
&lt;code&gt;BgMzEZPUGAtbSqVPRdzgdLVhAPMLaOzHe7uNFAMVLCk=&lt;/code&gt;.
&lt;/p&gt;


&lt;h2&gt;Setting up the server&lt;/h2&gt;

&lt;p&gt;
Each participant in the WireGuard network uses a private IPv4,
usually from the &lt;code&gt;10.0.0.0/8&lt;/code&gt; range.
We give &lt;code&gt;10.0.0.1&lt;/code&gt; to the server and &lt;code&gt;10.0.0.2&lt;/code&gt;
to the client, and can now follow the
&lt;a href=&quot;https://guix.gnu.org/manual/devel/en/html_node/VPN-Services.html&quot;&gt;documentation&lt;/a&gt;
(you need to scroll down a bit) of &lt;code&gt;wireguard-service-type&lt;/code&gt;
to write the corresponding block in the Guix operating system configuration
of the server:
&lt;/p&gt;
&lt;pre&gt;
(service wireguard-service-type
  (wireguard-configuration
    (addresses '(&amp;quot;10.0.0.1/32&amp;quot;))
    (peers
      (list
        (wireguard-peer
          (name &amp;quot;client&amp;quot;)
          (public-key &amp;quot;BgMzEZPUGAtbSqVPRdzgdLVhAPMLaOzHe7uNFAMVLCk=&amp;quot;)
          (allowed-ips '(&amp;quot;10.0.0.2/32&amp;quot;)))))))
&lt;/pre&gt;
&lt;p&gt;
The &lt;code&gt;addresses&lt;/code&gt; field is in fact the default;
there is also an optional &lt;code&gt;port&lt;/code&gt; field with default 51820.
The &lt;code&gt;peers&lt;/code&gt; field is a list of, well, peers in the VPN which are
allowed to connect to the server (so it is in theory possible to create
strange connection graphs); in this case, we register only one client peer
with an arbitrary name, its public key created above, and its assigned
private IP address.
That is all! Now we can &lt;code&gt;guix system reconfigure&lt;/code&gt;, and
the server is ready.
&lt;/p&gt;


&lt;h2&gt;Setting up the client&lt;/h2&gt;

&lt;p&gt;
As stated above, in principle there does not seem to be a distinction
between clients and server in WireGuard, so the operating system
declaration on the client is similar to that on the server. But I think
that, nevertheless, the network topology needs to be bootstrapped.
And in our case, the inherent distinction between the server machine which
is, say, publicly reachable on the IPv4 &lt;code&gt;198.51.100.0&lt;/code&gt; under the
name &lt;code&gt;vpn.example.org&lt;/code&gt;, and the client hidden by CGNAT needs to
be taken into account. So we need the client to punch a hole into the NAT
and to reach out to the server, which leads to the following service block:
&lt;/p&gt;
&lt;pre&gt;
(service wireguard-service-type
  (wireguard-configuration
    (addresses '(&amp;quot;10.0.0.2/32&amp;quot;))
    (peers
      (list
        (wireguard-peer
          (name &amp;quot;server&amp;quot;)
          (public-key &amp;quot;AFL8UecS3GFX3hK8e6yWOK4s5RVrTpvTq2A0pdGuylQ=&amp;quot;)
          (allowed-ips '(&amp;quot;10.0.0.1/32&amp;quot;))
          (endpoint &amp;quot;198.51.100.0:51820&amp;quot;)
          (keep-alive 60))))))
&lt;/pre&gt;
&lt;p&gt;
The first fields are symmetric to the corresponding fields on the server.
But the additional &lt;code&gt;endpoint&lt;/code&gt; and &lt;code&gt;keep-alive&lt;/code&gt; fields
tell the client to connect to the server on its public IPv4 address (and
the default port 51820) for the initial handshake establishing the session,
and to keep it alive by reconnecting every 60 seconds.
I have tried to use the host name &lt;code&gt;vpn.example.org&lt;/code&gt; instead of
the IPv4, but this ended up being resolved to an IPv6, which did not work.
So &lt;code&gt;guix system reconfigure&lt;/code&gt; the client, wait for at most one
minute, and the VPN is running!
&lt;/p&gt;


&lt;h2&gt;Looking behind the scenes&lt;/h2&gt;

&lt;p&gt;
The following is not necessary for setting up the VPN, but it may be
helpful for troubleshooting; and I was curious to see how the VPN
manifested itself.
Running &lt;code&gt;ifconfig&lt;/code&gt; as root on the client, say, shows a new
interface
&lt;/p&gt;
&lt;pre&gt;
wg0 Link encap:(hwtype unknown)
    inet addr:10.0.0.2  P-t-P:10.0.0.2  Mask:255.255.255.255
…
&lt;/pre&gt;
&lt;p&gt;
and running &lt;code&gt;wg&lt;/code&gt; (again as root) shows the information about
the VPN that we entered into the service description:
&lt;/p&gt;
&lt;pre&gt;
interface: wg0
  public key: BgMzEZPUGAtbSqVPRdzgdLVhAPMLaOzHe7uNFAMVLCk=
  private key: (hidden)
  listening port: 51820

peer: AFL8UecS3GFX3hK8e6yWOK4s5RVrTpvTq2A0pdGuylQ=
  endpoint: 198.51.100.0:51820
  allowed ips: 10.0.0.1/32
  latest handshake: 1 minute, 15 seconds ago
  transfer: 555.77 KiB received, 1.38 MiB sent
  persistent keepalive: every 1 minute
&lt;/pre&gt;


&lt;h2&gt;Finally, connecting from the outside!&lt;/h2&gt;

&lt;p&gt;
Let us not get carried away by the beauty of technology (and
cryptography), but get back to our initial concern: connecting from the
outside to the machine in the home network, which we know under the name
of &lt;i&gt;client&lt;/i&gt;.
This is now just a matter of two hops with &lt;code&gt;ssh&lt;/code&gt;:
First do an &lt;code&gt;ssh vpn.example.org&lt;/code&gt;, and once on the VPN server
machine, a second &lt;code&gt;ssh 10.0.0.2&lt;/code&gt;.
This can be automated by the following entry in &lt;code&gt;.ssh/config&lt;/code&gt;
on the machine from which we try to connect:
&lt;/p&gt;
&lt;pre&gt;
Host client
  Hostname 10.0.0.2
  ProxyJump vpn.example.org
&lt;/pre&gt;
&lt;p&gt;
so that from now on, &lt;code&gt;ssh client&lt;/code&gt; will send us into the
home network.
Voilà, problem solved!
&lt;/p&gt;
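&lt;p&gt;
The same two hops can also be made in a one-off command without touching
&lt;code&gt;.ssh/config&lt;/code&gt;, using OpenSSH's &lt;code&gt;-J&lt;/code&gt;
(&lt;code&gt;ProxyJump&lt;/code&gt;) option; and &lt;code&gt;ssh -G&lt;/code&gt; lets us check,
without connecting anywhere, how the option is interpreted:
&lt;/p&gt;

```shell
# One-off equivalent of the ProxyJump stanza (would actually connect):
#   ssh -J vpn.example.org 10.0.0.2
# ssh -G only prints the resolved client configuration, so it is safe
# to run even while the VPN is down:
ssh -G -J vpn.example.org 10.0.0.2 | grep -i '^proxyjump'
```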


&lt;h2&gt;Epilogue&lt;/h2&gt;

&lt;p&gt;
After installing my WireGuard VPN, I talked with a fellow geek from
Aquilenet, who has the same ISP, and who suggested an alternative
solution to me. I should call the hotline and pronounce the magic words
that I would like an “IP rollback” so that servers at home become
accessible. This helps to get past the barrier of the first support level.
The next level then initiates the “rollback”, which means going from
IPv6 (plus IPv4 with CGNAT) back to regular IPv4 (without IPv6). After a
few hours or days, the new, more or less static (not guaranteed to be so,
but in practice not changing) IPv4 address is established, and IPv6 is disabled
in the wireless router provided by the ISP. One can then reenable IPv6
in the router and live in the best of all worlds – with a static IPv4
address, IPv6 and a WireGuard VPN on top of it all.
&lt;/p&gt;

&lt;/div&gt;</content></entry><entry><title>Goblins for number theory, part 4</title><id>https://enge.math.u-bordeaux.fr/blog/goblins-4.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2025-03-17T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/goblins-4.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h1&gt;Client and server Goblins&lt;/h1&gt;

&lt;p&gt;
After having
&lt;a href=&quot;goblins-1.html&quot;&gt;introduced&lt;/a&gt; the basic concepts
of Goblins, and in particular promises;
after having looked at 
&lt;a href=&quot;goblins-2.html&quot;&gt;parallelisation&lt;/a&gt; over the network;
and after an excursion to
&lt;a href=&quot;goblins-3.html&quot;&gt;persistence&lt;/a&gt;,
it is now time to get to the main topic.
We would like more flexibility: in the behaviour of the clients,
in the nature of the tasks they are handling, and in the control flow
of the server.
Clients should be able to come and go, maybe complete only one task never
to be seen again (we will not handle the case of faults, however, that is,
clients accepting a task and disappearing before completing it).
Tasks could be heterogeneous, that is, take more or less time, or,
equivalently, the clients could run on heterogeneous machines, and it would
be nice to give out a new task to a client as soon as it finishes the
previous one.
And we would like the server to be able to work in rounds; in essence,
distribute unrelated tasks corresponding to a loop, then gather the results
and start with the next loop.
&lt;/p&gt;
&lt;p&gt;
After David Thompson and Jessica Tallon had a look at my first solution,
which had performance problems I could not explain, they came up with a
much better idea, so I will present their solution without losing time
in explaining what went wrong. Suffice it to say that one should avoid
nesting &lt;code&gt;with-vat&lt;/code&gt; expressions. In my experience, doing so with
the same vat leads to a deadlock; doing so with different vats seems to
work, but causes a severe performance penalty.
&lt;/p&gt;


&lt;h2&gt;Queueing up clients, tasks and other promises&lt;/h2&gt;

&lt;p&gt;
Our current solution already keeps a list of clients at the server, to
which clients can register in the background. Instead of waiting until
a fixed number of clients have arrived, we should be more dynamic and
implement the server as follows. As long as there are tasks to be
submitted and the client list is not empty, the server removes a client
from the client list and submits a task to this client. If the client list
is empty while there are still unsubmitted tasks, the server waits until a
new client registers. So far, this scheme uses each client for exactly one
task, and works if more clients register than there are tasks.
The trick is to let the client do its computation, and at the end
register itself again with the server as being available for the next task.
My first solution used the existing &lt;code&gt;^registry&lt;/code&gt; actor, added an
&lt;code&gt;'unregister&lt;/code&gt; method used by the server to retrieve an available
client, and let the client call the &lt;code&gt;'register&lt;/code&gt; method after
finishing a task. The problem with this straightforward approach is that
one needs to have the server wait when no client is available, and this
risks stalling everything. In a sense, we are back to the problem discussed
in the &lt;a href=&quot;goblins-1.html&quot;&gt;first&lt;/a&gt; post:
Goblins works with promises, and waiting for their resolution is not
a Goblins concept; one should not try to master time and write code
to be executed at a specific moment, but rather define callbacks that
are run when promises resolve.
&lt;/p&gt;
&lt;p&gt;
Jessica and David pointed out to me that since version 0.14 of Goblins,
a suitable module is available in the actor library: the
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Inbox.html&quot;&gt;inbox&lt;/a&gt;,
which is modelled after a post box that queues messages and delivers them
one by one on request. Actually it rather delivers parcels, since it is
a general first in, first out queue that can be filled with anything.
We will replace the current &lt;code&gt;clients&lt;/code&gt; list in a
&lt;code&gt;^cell&lt;/code&gt; actor by an &lt;code&gt;inbox&lt;/code&gt; that will contain
client actors.
The crucial difference from my home-brew solution is that the level of
an inbox can go below zero without blocking:
If there are no elements in the queue, it nevertheless returns a promise to
a future element, in our case a client actor that will register later.
Thanks to
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Promise-pipelining.html&quot;&gt;promise
pipelining&lt;/a&gt;, we can pretend that this empty promise is actually an actor
and send messages to it using &lt;code&gt;&amp;lt;-&lt;/code&gt;. Once a new actor
registers, the promise fulfills itself, and the new actor will receive
the message sent previously and act on it.
&lt;/p&gt;
&lt;p&gt;
The essential modifications occur in the &lt;code&gt;^worker&lt;/code&gt; type actor
in the client script:
&lt;/p&gt;
&lt;pre&gt;
(define-actor (^worker bcom server) #:self self
  (methods
    ((square x)
     (let ((res (* x x)))
          (format #t &amp;quot;square ~a\n&amp;quot; x)
          (sleep 3)
          (&amp;lt;- server self)
          res))
    ((finish)
     (signal-condition! end))))
&lt;/pre&gt;
&lt;p&gt;
After computing the result (and sleeping a little bit for testing purposes,
since the tasks are so short that otherwise one client would end up
grabbing all of them before we have a chance to start a second one), the
client sends itself back to the server.
To this purpose, it needs to know the server, which can be passed to it
upon spawning; and it needs to have a notion of itself. This is why we
have replaced the &lt;code&gt;define&lt;/code&gt; by the more general
&lt;code&gt;define-actor&lt;/code&gt;, in which the optional
&lt;code&gt;#:self self&lt;/code&gt; defines a formal parameter &lt;code&gt;self&lt;/code&gt; to
later… speak to oneself! 
Notice that we have also modified the client so that as in the first blog
posts, it does not register with its name (which thus is not passed on the
command line either anymore): Using an &lt;code&gt;inbox&lt;/code&gt; instead of the
custom registry function implies that we would need to encapsulate the
client actor into a composite data structure together with its name (for
instance,
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/SRFI_002d9-Records.html&quot;&gt;SRFI-9
records&lt;/a&gt;), which is more hassle than warranted for our experimental
code. To make up for it, we let the client itself print the computing tasks
it receives.
The complete client script, after some shuffling around so that things are
defined in the correct order, looks like this:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (fibers conditions)
             (goblins)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))
(define end (make-condition))

(define-actor (^worker bcom server) #:self self
  (methods
    ((square x)
     (let ((res (* x x)))
          (format #t &amp;quot;square ~a\n&amp;quot; x)
          (sleep 3)
          (&amp;lt;- server self)
          res))
    ((finish)
     (signal-condition! end))))

(define capn
  (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

(define server
  (with-vat vat
    (&amp;lt;- capn 'enliven (string-&amp;gt;ocapn-id (second (command-line))))))

(define client
  (with-vat vat (spawn ^worker server)))
(with-vat net ($ capn 'register client 'tcp-tls))

(with-vat vat
  (&amp;lt;- server client))

(wait end)
&lt;/pre&gt;

&lt;p&gt;
In the server, we essentially replace the custom &lt;code&gt;^registry&lt;/code&gt;
by an &lt;code&gt;inbox&lt;/code&gt;, which results in the following code:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib inbox)
             (goblins actor-lib joiners)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls)
             (goblins persistence-store syrup)
             (goblins vat))

(define persistence-vat (spawn-vat))
(define persistence-registry
  (with-vat persistence-vat
    (spawn ^persistence-registry)))

(define-values (net capn)
  (spawn-persistent-vat
    (make-persistence-env #:extends (list captp-env tcp-tls-netlayer-env))
    (lambda ()
      (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;)))
    (make-syrup-store &amp;quot;ocapn.syrup&amp;quot;)
    #:persistence-registry persistence-registry))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define-values (vat get-client put-client stop-clients)
  (spawn-persistent-vat
    (make-persistence-env
      #:extends inbox-env)
    (lambda ()
      (spawn-inbox))
    (make-syrup-store &amp;quot;registry.syrup&amp;quot;)
    #:persist-on #f
    #:persistence-registry persistence-registry))

(let ((id (with-vat net ($ capn 'register put-client 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(define all-clients (with-vat vat (spawn ^cell '())))
(define v '(1 2 3 4 5))
(with-vat vat
  (let
    ((clients (map (lambda (x) (&amp;lt;- get-client)) v)))
    ($ all-clients (append clients ($ all-clients)))
    (on (all-of* (map (cut &amp;lt;- &amp;lt;&amp;gt; 'square &amp;lt;&amp;gt;) clients v))
      (lambda (res)
        (format #t &amp;quot;~a\n&amp;quot; (sqrt (fold + 0 res)))
        (on (all-of* ($ all-clients))
          (lambda (c)
            (map (cut &amp;lt;- &amp;lt;&amp;gt; 'finish) (delete-duplicates c))))))))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
The &lt;code&gt;spawn-inbox&lt;/code&gt; function does not return a single
actor, but three at once: one for adding elements to the queue,
one for retrieving an element (or a promise thereof), and one for shutting
the &lt;code&gt;inbox&lt;/code&gt; down (which we will not use).
The expression
&lt;/p&gt;
&lt;pre&gt;
(map (lambda (x) (&amp;lt;- get-client)) v)
&lt;/pre&gt;
&lt;p&gt;
creates a list of (promises to) client actors of the same length as the
vector. Then we send the tasks as before and use &lt;code&gt;all-of*&lt;/code&gt;
to wait for their results. There is a little subtlety for sending the
&lt;code&gt;'finish&lt;/code&gt; messages: Since the variable &lt;code&gt;clients&lt;/code&gt;
in general does not contain a list of client actors any more, but a list
of promises, we also need to use &lt;code&gt;(on (all-of* …))&lt;/code&gt; to
retrieve the actual list of actors. We go further by memorising all clients
ever encountered (with multiplicities, actually) in a separate cell
&lt;code&gt;all-clients&lt;/code&gt;. This is a bit convoluted at this point (since at
the end of the script, &lt;code&gt;($ all-clients)&lt;/code&gt; is the same as
&lt;code&gt;clients&lt;/code&gt;), but will make things easier later.
Without any extra code for sending the &lt;code&gt;'finish&lt;/code&gt; signal to the
clients, the main part of the server script could be condensed into only
a few lines:
&lt;/p&gt;
&lt;pre&gt;
(define v '(1 2 3 4 5))
(with-vat vat
  (on (all-of* (map (lambda (x) (&amp;lt;- (&amp;lt;- get-client) 'square x)) v))
    (lambda (res)
      (format #t &amp;quot;~a\n&amp;quot; (sqrt (fold + 0 res))))))
&lt;/pre&gt;
&lt;p&gt;
Notice that this solution is strictly more general than that of the
previous posts:
If only one client registers, it runs all the squaring tasks; if a second
one arrives, it obtains every other task; and so on.
And… that's it! We have parallelised a &lt;code&gt;for&lt;/code&gt; loop which may
contain tasks of differing (and a priori unknown) lengths, and it can
handle the situation where clients join at any time.
To handle clients that may leave after a task is completed, the
framework is essentially in place: Instead of having the client
register again unconditionally after each computing task,
re-registration could depend on a condition checked in the client.
For handling faults, that is, clients which disappear in the middle of
a task, one would need to add timeouts at the server level and requeue
tasks for which the result has not appeared after a reasonable waiting
time, which would depend on the application.
Then &lt;code&gt;all-of*&lt;/code&gt; would not be a suitable joiner, but one could use
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Joiners.html#index-race&quot;&gt;&lt;code&gt;race&lt;/code&gt;&lt;/a&gt;,
which resolves as soon as one of several promises resolves.
Or one could use &lt;code&gt;all-of*&lt;/code&gt; on a list of promises, each
created by &lt;code&gt;race&lt;/code&gt; from a computation promise and a timeout
promise, which is precisely the example given in the documentation of
&lt;code&gt;race&lt;/code&gt;.
We will not pursue the topic of faults in this post, but it is clear
that Goblins mechanisms could be used to solve the problem.
&lt;/p&gt;
&lt;p&gt;
In any case, our current Goblins code is already more flexible than the
MPI solution, which assumes that all clients are known at the beginning
of the computation and do not change throughout, and which also breaks
in the presence of faults.
&lt;/p&gt;


&lt;h2&gt;Time for crochet: loops after loops!&lt;/h2&gt;

&lt;p&gt;
A common situation is that after running one loop, one needs to start a
second round that continues the computations with the intermediate results
that have just been obtained. In what follows, we will modify the server
script accordingly, while keeping the client script unmodified, which may
be seen as a sign that the architecture developed so far makes sense.
&lt;/p&gt;
&lt;p&gt;
For instance, the following sequential
code computes the L4-norm of a vector, that is, the fourth root of the sum
of the fourth powers of its entries:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1))
(define (square x) (* x x))
(define v '(1 2 3 4 5))
(define w (map square v))
(define res (map square w))
(format #t &amp;quot;~a\n&amp;quot; (sqrt (sqrt (fold + 0 res))))
&lt;/pre&gt;
&lt;p&gt;
It consists of two loops, one for squaring each entry in &lt;code&gt;v&lt;/code&gt;
and putting the results into &lt;code&gt;w&lt;/code&gt;, and a second one for
squaring the entries in &lt;code&gt;w&lt;/code&gt; (which effectively computes the
fourth powers of the entries in &lt;code&gt;v&lt;/code&gt;).
&lt;/p&gt;
&lt;p&gt;
This can be goblinified quite naturally by nesting the task submission
and &lt;code&gt;(on (all-of* …))&lt;/code&gt; handling of the results.
Without &lt;code&gt;'finish&lt;/code&gt; signals, this results in the following code:
&lt;/p&gt;
&lt;pre&gt;
(define v '(1 2 3 4 5))
(with-vat vat
  (on (all-of* (map (lambda (x) (&amp;lt;- (&amp;lt;- get-client) 'square x)) v))
    (lambda (w)
      (on (all-of* (map (lambda (x) (&amp;lt;- (&amp;lt;- get-client) 'square x)) w))
        (lambda (res)
          (format #t &amp;quot;~a\n&amp;quot; (sqrt (sqrt (fold + 0 res)))))))))
&lt;/pre&gt;
&lt;p&gt;
Including &lt;code&gt;'finish&lt;/code&gt; handling, the following server script is not
the shortest solution, but its symmetries will be helpful in the next
section. To make the code more readable, we have moved some (repetitive)
code into the functions &lt;code&gt;submit-square-jobs&lt;/code&gt; and
&lt;code&gt;submit-finish-jobs&lt;/code&gt;.
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib inbox)
             (goblins actor-lib joiners)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls)
             (goblins persistence-store syrup)
             (goblins vat))

(define persistence-vat (spawn-vat))
(define persistence-registry
  (with-vat persistence-vat
    (spawn ^persistence-registry)))

(define-values (net capn)
  (spawn-persistent-vat
    (make-persistence-env #:extends (list captp-env tcp-tls-netlayer-env))
    (lambda ()
      (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;)))
    (make-syrup-store &amp;quot;ocapn.syrup&amp;quot;)
    #:persistence-registry persistence-registry))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define-values (vat get-client put-client stop-clients)
  (spawn-persistent-vat
    (make-persistence-env
      #:extends inbox-env)
    (lambda ()
      (spawn-inbox))
    (make-syrup-store &amp;quot;registry.syrup&amp;quot;)
    #:persist-on #f
    #:persistence-registry persistence-registry))

(let ((id (with-vat net ($ capn 'register put-client 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(define all-clients (with-vat vat (spawn ^cell '())))

(define (submit-square-jobs v)
  (let ((clients (map (lambda (x) (&amp;lt;- get-client)) v)))
    ($ all-clients (append clients ($ all-clients)))
    (map (cut &amp;lt;- &amp;lt;&amp;gt; 'square &amp;lt;&amp;gt;) clients v)))

(define (submit-finish-jobs clients)
  (map (cut &amp;lt;- &amp;lt;&amp;gt; 'finish) (delete-duplicates clients)))

(define v '(1 2 3 4 5))
(with-vat vat
  (on (all-of* (submit-square-jobs  v))
    (lambda (w)
      (on (all-of* (submit-square-jobs w))
        (lambda (res)
          (format #t &amp;quot;~a\n&amp;quot; (sqrt (sqrt (fold + 0 res))))
          (on (all-of* ($ all-clients))
            submit-finish-jobs))))))

(sleep 3600)
&lt;/pre&gt;


&lt;h2&gt;Untangling the threads: macros to the rescue&lt;/h2&gt;

&lt;p&gt;
The last block of the server code now clearly shows a recurring pattern:
&lt;/p&gt;
&lt;pre&gt;
(on (all-of* SUBMIT SOME JOBS)
  (lambda (VAR)
    DO SOMETHING WITH THE RESULT IN VAR
&lt;/pre&gt;
&lt;p&gt;
which is actually nested, since handling the results of the first round
requires running the same pattern for the second round of job submissions.
Now a pattern can be handled by a Guile
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/Defining-Macros.html&quot;&gt;macro&lt;/a&gt;,
for instance as follows:
&lt;/p&gt;
&lt;pre&gt;
(define-syntax submit-reduce
  (syntax-rules ()
    ((submit-reduce submit v reduce ...)
     (on (all-of* submit)
       (lambda (v)
         (begin reduce ...))))))
&lt;/pre&gt;
&lt;p&gt;
The line following the &lt;code&gt;syntax-rules ()&lt;/code&gt; contains a pattern
to be matched; the remainder of the macro is the Guile code above, with
placeholders replaced by parts of the matched pattern.
The first argument of the macro is a single expression corresponding to
&lt;code&gt;SUBMIT SOME JOBS&lt;/code&gt;; if several expressions are needed, they can
be transformed into only one using
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/Local-Bindings.html#index-let_002a&quot;&gt;&lt;code&gt;let*&lt;/code&gt;&lt;/a&gt;,
for instance.
The second argument is the (formal) variable name &lt;code&gt;VAR&lt;/code&gt;.
All remaining arguments (of which there may be zero) correspond to
&lt;code&gt;DO SOMETHING WITH THE RESULT IN VAR&lt;/code&gt;; these will in general
use the formal variable.
&lt;/p&gt;
&lt;p&gt;
Using this macro, the main block of the server script can be compressed
as follows:
&lt;/p&gt;
&lt;pre&gt;
(define v '(1 2 3 4 5))
(with-vat vat
  (submit-reduce (submit-square-jobs  v) w
    (submit-reduce (submit-square-jobs w) res
      (format #t &amp;quot;~a\n&amp;quot; (sqrt (sqrt (fold + 0 res))))
      (on (all-of* ($ all-clients))
        submit-finish-jobs))))

&lt;/pre&gt;
&lt;p&gt;
It is also possible to let the macro itself handle the nesting as follows:
&lt;/p&gt;
&lt;pre&gt;
(define-syntax submit-reduce
  (syntax-rules ()
    ((submit-reduce reduce)
     reduce)
    ((submit-reduce submit v reduce ...)
     (on (all-of* submit)
       (lambda (v)
         (submit-reduce reduce ...))))))
&lt;/pre&gt;
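&lt;p&gt;
To see the recursion at work, here is a sketch of how a call with five
arguments unfolds (the names &lt;code&gt;submit1&lt;/code&gt;,
&lt;code&gt;submit2&lt;/code&gt; and &lt;code&gt;reduce&lt;/code&gt; are placeholders,
so this illustrates the expansion rather than being code to run):
&lt;/p&gt;
&lt;pre&gt;
(submit-reduce submit1 w submit2 res reduce)
;; first expansion step: the second pattern matches
(on (all-of* submit1)
  (lambda (w)
    (submit-reduce submit2 res reduce)))
;; second expansion step, after which the first pattern ends the recursion
(on (all-of* submit1)
  (lambda (w)
    (on (all-of* submit2)
      (lambda (res)
        reduce))))
&lt;/pre&gt;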
&lt;p&gt;
If the macro is called with at least three arguments, then the second
pattern &lt;code&gt;(submit-reduce submit v reduce ...)&lt;/code&gt; is matched.
The first argument (a single expression) is considered to be the job
submission phase, the second argument the variable name for the results
of the first jobs; then the macro is called recursively, and more
job submission phases, alternated with variable names, are expected;
in the end, when only one argument remains, the first pattern
&lt;code&gt;(submit-reduce reduce)&lt;/code&gt; is matched, which corresponds to the
handling of the results of the final round of job submissions.
To work, the macro thus requires an odd number of arguments; otherwise it
raises an error (calling it with just one argument is possible, but makes
no sense).
With this macro, the main server block looks as follows:
&lt;/p&gt;
&lt;pre&gt;
(define v '(1 2 3 4 5))
(with-vat vat
  (submit-reduce
    (submit-square-jobs v) w
    (submit-square-jobs w) res
    (begin
      (format #t &amp;quot;~a\n&amp;quot; (sqrt (sqrt (fold + 0 res))))
      (on (all-of* ($ all-clients))
        submit-finish-jobs))))
&lt;/pre&gt;
&lt;p&gt;
Notice that we needed to wrap the final reduction into
&lt;code&gt;(begin …)&lt;/code&gt; since it consists of several expressions.
By hiding the nesting, this macro makes the workflow quite clear: submit
a series of tasks, submit a new series of tasks depending on the results
of the previous series, and so on, until the final result is handled
through a side effect (to break out of the promises).
When exactly three arguments are given, both macros are equivalent,
so that it is still possible to manually nest the macro invocations;
otherwise the second macro is more powerful than the first one (except
that the first one admits several Guile expressions for the reduction
phase).
&lt;/p&gt;
&lt;p&gt;
To illustrate the simplicity with which the pattern continues, here is the
complete server script for computing the L8-norm of a vector, that is, the
eighth root of the sum of the eighth powers of its entries:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib inbox)
             (goblins actor-lib joiners)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls)
             (goblins persistence-store syrup)
             (goblins vat))

(define persistence-vat (spawn-vat))
(define persistence-registry
  (with-vat persistence-vat
    (spawn ^persistence-registry)))

(define-values (net capn)
  (spawn-persistent-vat
    (make-persistence-env #:extends (list captp-env tcp-tls-netlayer-env))
    (lambda ()
      (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;)))
    (make-syrup-store &amp;quot;ocapn.syrup&amp;quot;)
    #:persistence-registry persistence-registry))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define-values (vat get-client put-client stop-clients)
  (spawn-persistent-vat
    (make-persistence-env
      #:extends inbox-env)
    (lambda ()
      (spawn-inbox))
    (make-syrup-store &amp;quot;registry.syrup&amp;quot;)
    #:persist-on #f
    #:persistence-registry persistence-registry))

(let ((id (with-vat net ($ capn 'register put-client 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(define all-clients (with-vat vat (spawn ^cell '())))

(define (submit-square-jobs v)
  (let ((clients (map (lambda (x) (&amp;lt;- get-client)) v)))
    ($ all-clients (append clients ($ all-clients)))
    (map (cut &amp;lt;- &amp;lt;&amp;gt; 'square &amp;lt;&amp;gt;) clients v)))

(define (submit-finish-jobs clients)
  (map (cut &amp;lt;- &amp;lt;&amp;gt; 'finish) (delete-duplicates clients)))

(define-syntax submit-reduce
  (syntax-rules ()
    ((submit-reduce reduce)
     reduce)
    ((submit-reduce submit v reduce ...)
     (on (all-of* submit)
       (lambda (v)
         (submit-reduce reduce ...))))))

(define v '(1 2 3 4 5))
(with-vat vat
  (submit-reduce
    (submit-square-jobs v) w
    (submit-square-jobs w) t
    (submit-square-jobs t) res
    (begin
      (format #t &amp;quot;~a\n&amp;quot; (sqrt (sqrt (sqrt (fold + 0 res)))))
      (on (all-of* ($ all-clients))
        submit-finish-jobs))))

(sleep 3600)
&lt;/pre&gt;
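&lt;p&gt;
For comparison, extending the sequential L4-norm code from above, a purely
sequential version in plain Guile (runnable without Goblins) computes the
same value by squaring three times and taking three square roots:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1))
(define (square x) (* x x))
(define v '(1 2 3 4 5))
;; squaring thrice yields the eighth powers of the entries
(define res (map square (map square (map square v))))
;; three nested square roots yield the eighth root
(format #t &amp;quot;~a\n&amp;quot; (sqrt (sqrt (sqrt (fold + 0 res)))))
&lt;/pre&gt;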


&lt;h2&gt;(Preliminary) conclusion&lt;/h2&gt;

&lt;p&gt;
At this point, the goal set out at the beginning of this series of blog
posts is met. We have developed a client and server structure in which the
clients register with the server and the server hands them computation
tasks that correspond to a sequence of loops. As already said above, the
result is even a bit more flexible than with MPI: The number of clients
need not be known and communicated to the server in advance, but clients
can come and go, as long as they do not vanish in the middle of a task.
And Goblins make it possible to do so over the Internet, either with
TCP/TLS or even through the Tor network.
&lt;/p&gt;

&lt;/div&gt;</content></entry><entry><title>Goblins for number theory, part 3</title><id>https://enge.math.u-bordeaux.fr/blog/goblins-3.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2025-03-07T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/goblins-3.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h1&gt;Ending and persisting&lt;/h1&gt;

&lt;p&gt;
In previous posts we have seen how to solve our
&lt;a href=&quot;goblins-1.html&quot;&gt;toy problem&lt;/a&gt; of computing the
Euclidean length of a vector in a
&lt;a href=&quot;goblins-2.html&quot;&gt;distributed fashion&lt;/a&gt; using Goblins,
with a client script that runs in several copies, carries out most of the
work and reports back to a server script, which collects the partial
results into a solution to the problem.
The clients could in principle live on distant machines and communicate
over the Tor network. For testing in a local setting, however, letting
them run on the same machine as the server and communicate over TCP
turns out to be more efficient.
So far, our architecture is rather inflexible: We assume that the server
knows the number of participating clients beforehand, and that all tasks
take more or less the same time so that distributing them evenly to the
clients is an optimal scheduling strategy.
The logical next step is to overcome these limitations.
My initial solution for a more general framework, however, turned out to
be very inefficient. Jessica Tallon and David Thompson of the
&lt;a href=&quot;https://spritely.institute/&quot;&gt;Spritely Institute&lt;/a&gt; (many thanks
to them!) kindly had a look at it and came up with a much better solution;
but our discussions also helped me understand Goblins better and inspired
ideas on how to improve the current client and server scripts.
So before going for more generality in the next post, let us do a pirouette
with the current framework and also explore some interesting side tracks
that did not make it into the previous post.
&lt;/p&gt;


&lt;h2&gt;Spring cleaning&lt;/h2&gt;

&lt;p&gt;
Before doing anything substantial, let us clean up a few things in the
current code. The main actor in the server script is currently defined
through the type &lt;code&gt;^register&lt;/code&gt; as follows:
&lt;/p&gt;
&lt;pre&gt;
(define clients (with-vat vat (spawn ^cell '())))
(define (^register bcom)
  (lambda (id)
    ($ clients (cons (&amp;lt;- mycapn 'enliven id)
                     ($ clients)))
    (print-id &amp;quot;Registered&amp;quot; id)))
(define register (with-vat vat (spawn ^register)))
&lt;/pre&gt;
&lt;p&gt;
It captures the &lt;code&gt;clients&lt;/code&gt; variable in the closure defined by
&lt;code&gt;lambda&lt;/code&gt;, which works, but requires the variables to be defined
in this order. A more elegant solution is to pass &lt;code&gt;clients&lt;/code&gt;
as an argument. At the same time, we take the opportunity to rename the
verb &lt;code&gt;register&lt;/code&gt; to the noun &lt;code&gt;registry&lt;/code&gt;.
&lt;/p&gt;
&lt;pre&gt;
(define (^registry bcom clients)
  (lambda (id)
    ($ clients (cons (&amp;lt;- mycapn 'enliven id)
                     ($ clients)))
    (print-id &amp;quot;Registered&amp;quot; id)))
(define clients (with-vat vat (spawn ^cell '())))
(define registry (with-vat vat (spawn ^registry clients)))
&lt;/pre&gt;
&lt;p&gt;
Let us also get rid of some “overgoblinification”; indeed the actor of
type &lt;code&gt;^len&lt;/code&gt; in the server can be replaced by a simple function,
or (since the Goblins promises force us to work with side effects anyway)
by sequential code. We end up with the following server script
&lt;code&gt;server.scm&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define capn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

(define (^registry bcom clients)
  (lambda (id)
    ($ clients (cons (&amp;lt;- capn 'enliven id)
                     ($ clients)))
    (print-id &amp;quot;Registered&amp;quot; id)))

(define clients (with-vat vat (spawn ^cell '())))
(define registry (with-vat vat (spawn ^registry clients)))
(let ((id (with-vat net ($ capn 'register registry 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(while (not (eq? (length (with-vat vat ($ clients))) 2))
       (sleep 1))

(define v '(1 2 3 4 5))
(with-vat vat
  (while (&amp;lt; (length ($ clients)) (length v))
     (let ((c ($ clients)))
       ($ clients (append c c)))))
(with-vat vat
  (on (all-of* (map &amp;lt;- ($ clients) v))
      (lambda (res)
        (format #t &amp;quot;~a\n&amp;quot; (sqrt (fold + 0 res))))))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
and the following client script &lt;code&gt;client.scm&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))
(define client
  (with-vat vat (spawn ^square)))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define capn
  (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))
(define id
  (with-vat net ($ capn 'register client 'tcp-tls)))
(print-id &amp;quot;Client ID&amp;quot; id)

(define server
  (with-vat vat
    (&amp;lt;- capn 'enliven (string-&amp;gt;ocapn-id (second (command-line))))))

(with-vat vat
  (on id
    (lambda (id)
      (&amp;lt;- server id))))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
Now run again
&lt;/p&gt;
&lt;pre&gt;
guile server.scm
&lt;/pre&gt;
&lt;p&gt;
in one terminal and two copies of the client script as
&lt;/p&gt;
&lt;pre&gt;
guile client.scm 'ocapn://…'
&lt;/pre&gt;
&lt;p&gt;
in two other terminals, where the ocapn URI has been replaced by the one
printed by the server, to compute the same result as before.
&lt;/p&gt;



&lt;h2&gt;Passing actors around&lt;/h2&gt;

&lt;p&gt;
After going through the
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Example-Two-Goblins-programs-chatting-over-CapTP-via-Tor.html&quot;&gt;CapTP&lt;/a&gt;
tutorial, I was under the impression that the only way to create a handle
on an actor on a different machine was by obtaining its sturdyref ID
and “enlivening” this ID locally. Currently the server script prints its
ID, which the client script obtains as an argument when invoked from the
command line. This enables the client to enliven the server and to send
its ID to the server when registering by a &lt;code&gt;&amp;lt;-&lt;/code&gt; call; then
the server enlivens the client.
It turns out, however, that it is also possible to directly send actors
instead of their IDs through &lt;code&gt;&amp;lt;-&lt;/code&gt;. Printing and copy-pasting
IDs is still necessary for bootstrapping, but once a spanning tree is
generated in this manner between all participating scripts, it is possible
to obtain a complete communication graph by just sending actors along these
bootstrapped network edges.
&lt;/p&gt;
&lt;p&gt;
We would still like the client to somehow present itself to the server with
a name, so that the server can print who connects to it and thus make
debugging easier. If we drop the ocapn ID, then the client can use a pet
name, a string that we pass as an additional argument on the command line.
The server needs only minimal modifications:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define capn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

(define (^registry bcom clients)
  (lambda (client name)
    ($ clients (cons client ($ clients)))
    (format #t &amp;quot;Registered ~a\n&amp;quot; name)))

(define clients (with-vat vat (spawn ^cell '())))
(define registry (with-vat vat (spawn ^registry clients)))
(let ((id (with-vat net ($ capn 'register registry 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(while (not (eq? (length (with-vat vat ($ clients))) 2))
       (sleep 1))

(define v '(1 2 3 4 5))
(with-vat vat
  (while (&amp;lt; (length ($ clients)) (length v))
     (let ((c ($ clients)))
       ($ clients (append c c)))))
(with-vat vat
  (on (all-of* (map &amp;lt;- ($ clients) v))
      (lambda (res)
        (format #t &amp;quot;~a\n&amp;quot; (sqrt (fold + 0 res))))))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
Notice the additional argument &lt;code&gt;name&lt;/code&gt; for the
&lt;code&gt;^registry&lt;/code&gt; actor, which is used for announcing arriving
clients instead of their ocapn ID.
(In this implementation we forget the name of a client immediately;
it would make sense to somehow keep it, either by remembering it directly
in &lt;code&gt;^square&lt;/code&gt; or by having the server memorise it in its client
list.)
Instead of enlivening an ID and adding the resulting actor to the
&lt;code&gt;clients&lt;/code&gt; list, the server adds the client actor directly.
The client modifications are also straightforward and simplify the script
considerably:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))
(define client
  (with-vat vat (spawn ^square)))

(define capn
  (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))
(with-vat net ($ capn 'register client 'tcp-tls))

(define name (second (command-line)))

(define server
  (with-vat vat
    (&amp;lt;- capn 'enliven (string-&amp;gt;ocapn-id (third (command-line))))))

(with-vat vat
  (&amp;lt;- server client name))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
Now start the server as usual, and two clients as
&lt;/p&gt;
&lt;pre&gt;
guile client.scm Alice 'ocapn://…'
guile client.scm Bob 'ocapn://…'
&lt;/pre&gt;
&lt;p&gt;
to see the familiar result.
&lt;/p&gt;


&lt;h2&gt;Being methodical&lt;/h2&gt;

&lt;p&gt;
As it will be useful later on, let us replace the workhorse in the client,
the &lt;code&gt;^square&lt;/code&gt; actor with only one possible action (squaring
a number that is sent to it) by an implementation with potentially more
actions. To do so, we use
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Methods.html&quot;&gt;methods&lt;/a&gt;
from the Goblins actor library, which dispatch actions using an additional symbol.
So
&lt;/p&gt;
&lt;pre&gt;
(define (^square bcom)
  (lambda (x)
    (* x x)))
(define client
  (with-vat vat (spawn ^square)))
&lt;/pre&gt;
&lt;p&gt;
becomes
&lt;/p&gt;
&lt;pre&gt;
(use-modules (goblins actor-lib methods)
…
(define (^worker bcom)
  (methods
    ((square x)
     (* x x))))
(define client
  (with-vat vat (spawn ^worker)))
&lt;/pre&gt;
&lt;p&gt;
Inside the server, we now need to change calls of the form
&lt;/p&gt;
&lt;pre&gt;
(&amp;lt;- client x)
&lt;/pre&gt;
&lt;p&gt;
by adding an additional symbol to
&lt;/p&gt;
&lt;pre&gt;
(&amp;lt;- client 'square x)
&lt;/pre&gt;
&lt;p&gt;
This is made more complicated since they appear inside &lt;code&gt;map&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(map &amp;lt;- ($ clients) v)
&lt;/pre&gt;
&lt;p&gt;
The solution is to change the &lt;code&gt;&amp;lt;-&lt;/code&gt; function, which now takes
three arguments (a client, a symbol and a number) into a function with only
two arguments by fixing the middle argument to &lt;code&gt;'square&lt;/code&gt;.
This can be done using
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/SRFI_002d26.html#index-cut&quot;&gt;SRFI-26
cut&lt;/a&gt;; it takes the function name and for each argument of the function
either a fixed value, or the placeholder &lt;code&gt;&amp;lt;&amp;gt;&lt;/code&gt; indicating
that this argument should be kept as such. In our case, this gives
&lt;/p&gt;
&lt;pre&gt;
(map (cut &amp;lt;- &amp;lt;&amp;gt; 'square &amp;lt;&amp;gt;) ($ clients) v))
&lt;/pre&gt;
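&lt;p&gt;
As a minimal illustration in plain Guile, without any Goblins involved,
&lt;code&gt;cut&lt;/code&gt; behaves as follows:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-26))
;; (cut expt &amp;lt;&amp;gt; 2) is shorthand for (lambda (x) (expt x 2))
(map (cut expt &amp;lt;&amp;gt; 2) '(1 2 3))   ; ⇒ (1 4 9)
&lt;/pre&gt;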
&lt;p&gt;
So altogether, here is our current server:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define capn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

(define (^registry bcom clients)
  (lambda (client name)
    ($ clients (cons client ($ clients)))
    (format #t &amp;quot;Registered ~a\n&amp;quot; name)))

(define clients (with-vat vat (spawn ^cell '())))
(define registry (with-vat vat (spawn ^registry clients)))
(let ((id (with-vat net ($ capn 'register registry 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(while (not (eq? (length (with-vat vat ($ clients))) 2))
       (sleep 1))

(define v '(1 2 3 4 5))
(with-vat vat
  (while (&amp;lt; (length ($ clients)) (length v))
     (let ((c ($ clients)))
       ($ clients (append c c)))))
(with-vat vat
  (on (all-of* (map (cut &amp;lt;- &amp;lt;&amp;gt; 'square &amp;lt;&amp;gt;) ($ clients) v))
      (lambda (res)
        (format #t &amp;quot;~a\n&amp;quot; (sqrt (fold + 0 res))))))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
and here our current client:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (^worker bcom)
  (methods
    ((square x)
     (* x x))))
(define client
  (with-vat vat (spawn ^worker)))

(define capn
  (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))
(with-vat net ($ capn 'register client 'tcp-tls))

(define name (second (command-line)))

(define server
  (with-vat vat
    (&amp;lt;- capn 'enliven (string-&amp;gt;ocapn-id (third (command-line))))))

(with-vat vat
  (&amp;lt;- server client name))

(sleep 3600)
&lt;/pre&gt;


&lt;h2&gt;Everything has an end, but Goblins&lt;/h2&gt;

&lt;p&gt;
It is mildly annoying that the scripts run forever (well, for one hour…)
and need to be stopped with &lt;code&gt;&amp;lt;ctrl-c&amp;gt;&lt;/code&gt;. But it is
somewhat difficult to decide when to stop: In both our scripts, the
control flow reaches the end of the programs, while Goblins are still
working in the background through promises.
It is possible to use
&lt;a href=&quot;https://github.com/wingo/fibers/wiki/Manual#25-conditions&quot;&gt;conditions&lt;/a&gt;
from &lt;a href=&quot;https://github.com/wingo/fibers/&quot;&gt;Guile Fibers&lt;/a&gt;, as
inspired by the
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Example-Two-Goblins-programs-chatting-over-CapTP-via-Tor.html&quot;&gt;chat
example&lt;/a&gt; in the Goblins documentation. Since Fibers are a basic
ingredient of Goblins in Guile, they do not need to be installed
separately.
We can modify the client as follows:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (fibers conditions)
…
(define end (make-condition))
…
(define (^worker bcom)
  (methods
    ((square x)
     (* x x))
    ((finish)
     (signal-condition! end))))
…
(wait end)
&lt;/pre&gt;
&lt;p&gt;
First we import the &lt;code&gt;(fibers conditions)&lt;/code&gt; module. Then we create
the “condition” &lt;code&gt;end&lt;/code&gt;. We use &lt;code&gt;signal-condition!&lt;/code&gt;
to signal, well, that the condition has been fulfilled. And we replace
&lt;code&gt;sleep&lt;/code&gt;ing by &lt;code&gt;wait&lt;/code&gt;ing for the condition.
The signalling is encapsulated in a new method &lt;code&gt;'finish&lt;/code&gt; of the
&lt;code&gt;^worker&lt;/code&gt; actor, which can be called from the server as
&lt;/p&gt;
&lt;pre&gt;
(map (cut &amp;lt;- &amp;lt;&amp;gt; 'finish) ($ clients))
&lt;/pre&gt;
&lt;p&gt;
after the result of the computations has been printed.
This results in the following client script:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (fibers conditions)
             (goblins)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))
(define end (make-condition))

(define (^worker bcom)
  (methods
    ((square x)
     (* x x))
    ((finish)
     (signal-condition! end))))
(define client
  (with-vat vat (spawn ^worker)))

(define capn
  (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))
(with-vat net ($ capn 'register client 'tcp-tls))

(define name (second (command-line)))

(define server
  (with-vat vat
    (&amp;lt;- capn 'enliven (string-&amp;gt;ocapn-id (third (command-line))))))

(with-vat vat
  (&amp;lt;- server client name))

(wait end)
&lt;/pre&gt;
&lt;p&gt;
With the server script modified suitably as explained above, the clients
now end correctly, but the server crashes after printing the result of the
computations. A hasty decision we took earlier comes back to haunt us now:
Since there are more tasks than clients, we have filled the
&lt;code&gt;clients&lt;/code&gt; list with duplicates of the client actors so as to
send multiple &lt;code&gt;'square&lt;/code&gt; messages to the same actor; but now we
send multiple &lt;code&gt;'finish&lt;/code&gt; messages to clients that have stopped
running after the first such message, resulting in a scary error on the
server side that boils down to &lt;code&gt;&amp;amp;non-continuable&lt;/code&gt;.
To bring the scripts to their conclusion more gracefully, we take another
hasty decision and deduplicate the clients list when calling
&lt;code&gt;'finish&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(map (cut &amp;lt;- &amp;lt;&amp;gt; 'finish) (delete-duplicates ($ clients)))
&lt;/pre&gt;
&lt;p&gt;
As an excuse for our laziness in not looking for a more elegant solution,
we remark that this part will in any case be reworked later to obtain a more
flexible client queue.
&lt;/p&gt;
&lt;p&gt;
I have not found a similar approach to also have the server end gracefully.
If one places &lt;code&gt;signal-condition!&lt;/code&gt; in the code right after sending
the &lt;code&gt;'finish&lt;/code&gt; messages to the clients, then the clients do not end,
since it turns out that the server finishes so fast that the messages are
not actually sent. If one tries to wait for the promise coming out of the
&lt;code&gt;'finish&lt;/code&gt; calls, then this also fails, since the finished clients
cannot send back a function value any more.
So I keep the &lt;code&gt;sleep&lt;/code&gt; at the end and just make it a bit shorter.
The current &lt;code&gt;server.scm&lt;/code&gt; then looks like this:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define capn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

(define (^registry bcom clients)
  (lambda (client name)
    ($ clients (cons client ($ clients)))
    (format #t &amp;quot;Registered ~a\n&amp;quot; name)))

(define clients (with-vat vat (spawn ^cell '())))
(define registry (with-vat vat (spawn ^registry clients)))
(let ((id (with-vat net ($ capn 'register registry 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(while (not (= (length (with-vat vat ($ clients))) 2))
       (sleep 1))

(define v '(1 2 3 4 5))
(with-vat vat
  (while (&amp;lt; (length ($ clients)) (length v))
     (let ((c ($ clients)))
       ($ clients (append c c)))))
(with-vat vat
  (on (all-of* (map (cut &amp;lt;- &amp;lt;&amp;gt; 'square &amp;lt;&amp;gt;) ($ clients) v))
      (lambda (res)
        (format #t &amp;quot;~a\n&amp;quot; (sqrt (fold + 0 res)))
        (map (cut &amp;lt;- &amp;lt;&amp;gt; 'finish) (delete-duplicates ($ clients))))))

(sleep 10)
&lt;/pre&gt;


&lt;h2&gt;Resist! er, persist!&lt;/h2&gt;

&lt;p&gt;
Another annoyance in the current code is that the ocapn ID of the server
changes every time it is started, so that there is a lot of copy-pasting
for starting the clients. This turns from a minor annoyance into a problem
when different clients are supposed to be started independently all over
the Internet, and the ocapn ID is the de facto credential to enable
connections. Then a restart of the server script for any reason, be it a
power outage or an update, requires communicating the new ID to all
participants. Judging from its name, it sounds as if
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Persistence.html&quot;&gt;persistence&lt;/a&gt;
could come to the rescue. We only need to persist the server.
In a first step, we add a bit of boilerplate, taken from the
documentation of
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/Persistent-Vats.html&quot;&gt;persistent
vats&lt;/a&gt;; this seems to be required when several vats with cross-references
to each other are to be persisted, but cannot do any harm in general.
&lt;/p&gt;
&lt;pre&gt;
(use-modules (goblins vat)
…
(define persistence-vat (spawn-vat))
(define persistence-registry
  (with-vat persistence-vat
    (spawn ^persistence-registry)))
&lt;/pre&gt;
&lt;p&gt;
Then we follow the example on persistence in the documentation of the
&lt;a href=&quot;https://files.spritely.institute/docs/guile-goblins/0.15.0/TCP-_002b-TLS.html&quot;&gt;TCP
netlayer&lt;/a&gt; (after correcting a small error in the documentation for
version 0.15, which has been updated in the meantime) and replace
&lt;/p&gt;
&lt;pre&gt;
(define net (spawn-vat))
(define capn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))
&lt;/pre&gt;
&lt;p&gt;
by
&lt;/p&gt;
&lt;pre&gt;
(use-modules (goblins persistence-store syrup)
…
(define-values (net capn)
  (spawn-persistent-vat
    (make-persistence-env #:extends (list captp-env tcp-tls-netlayer-env))
    (lambda ()
      (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;)))
    (make-syrup-store &amp;quot;ocapn.syrup&amp;quot;)
    #:persistence-registry persistence-registry))
&lt;/pre&gt;
&lt;p&gt;
The &lt;code&gt;spawn-persistent-vat&lt;/code&gt; function returns a number of values; the first
one is a new vat, the other ones are created by the &lt;code&gt;lambda&lt;/code&gt;
expression and correspond to actors in the vat which are to be persisted
(more precisely, they form the roots of the corresponding graph).
A persistence environment is passed as the first argument; it “knows” how
to store the different types of actors. In this case, we store to a file
named &lt;code&gt;ocapn.syrup&lt;/code&gt;, where syrup is the Goblins internal file
format.
&lt;/p&gt;
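&lt;p&gt;
The multiple-value plumbing itself is plain Guile: as a minimal sketch
without Goblins, &lt;code&gt;define-values&lt;/code&gt; binds one variable per
value returned by the expression.
&lt;/p&gt;
&lt;pre&gt;
;; floor/ returns two values, the quotient and the remainder;
;; define-values binds them to q and r in one go.
(define-values (q r) (floor/ 17 5))
;; Now q is 3 and r is 2.
&lt;/pre&gt;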
&lt;p&gt;
It is instructive to run the server and to inspect the ocapn ID it prints.
The general format seems to be
&lt;code&gt;ocapn://….tcp-tls/s/…?host=localhost&amp;amp;port=…&lt;/code&gt;
where the first ellipsis consists of 52 lower case letters and digits
(a 256 bit hash encoded in base 32?),
the second ellipsis consists of 43 lower and upper case letters, digits
and symbols (a 256 bit hash encoded in base 64?),
and the third ellipsis is a random port.
Previously, all three would change when invoking the script. Now the
sequence in the place of the first ellipsis as well as the port remain
fixed.
&lt;/p&gt;
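&lt;p&gt;
These guesses are at least consistent with the observed lengths: base 32
carries 5 bits per character and base 64 carries 6, so encoding 256 bits
takes
&lt;/p&gt;
&lt;pre&gt;
(ceiling (/ 256 5))   ; 52 characters in base 32
(ceiling (/ 256 6))   ; 43 characters in base 64
&lt;/pre&gt;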
&lt;p&gt;
So we need to persist more, in particular the actor that is registered
in the network layer. So we replace
&lt;/p&gt;
&lt;pre&gt;
(define (^registry bcom clients) …)
(define vat (spawn-vat))
(define clients (with-vat vat (spawn ^cell '())))
(define registry (with-vat vat (spawn ^registry clients)))
&lt;/pre&gt;
&lt;p&gt;
by
&lt;/p&gt;
&lt;pre&gt;
(define-actor (^registry bcom clients) …)
(define-values (vat clients registry)
  (spawn-persistent-vat
    (make-persistence-env
      (list (list '((registry) ^registry) ^registry))
      #:extends cell-env)
    (lambda ()
      (let ((clients (spawn ^cell '())))
        (values
          clients
          (spawn ^registry clients))))
    (make-syrup-store &amp;quot;registry.syrup&amp;quot;)
    #:persistence-registry persistence-registry))
&lt;/pre&gt;
&lt;p&gt;
Notice the use of &lt;code&gt;define-actor&lt;/code&gt; instead of &lt;code&gt;define&lt;/code&gt;,
which appears to be necessary to achieve persistence.
Besides the cell actor known to Goblins from the actor-lib, we also need
to declare our self-defined actor of type &lt;code&gt;^registry&lt;/code&gt; in the
persistence environment; this is done by the rather indigestible
boilerplate line creating nested lists. We use a second file,
&lt;code&gt;registry.syrup&lt;/code&gt;, to store this actor.
&lt;/p&gt;
&lt;p&gt;
However, this fails miserably, as the server crashes with an error message
containing keywords such as &lt;code&gt;vat-churn&lt;/code&gt; and
&lt;code&gt;vat-maybe-persist-changed-objs!&lt;/code&gt;.
What happens exactly seems to depend on timing. In this case there is a
176 byte file &lt;code&gt;registry.syrup&lt;/code&gt; containing a few strings
and binary data. I suppose it stores the empty client list and the
corresponding registry. After clients register, there is a “churn”
(which I understand as the vat taking a break after a turn is over),
and the persistence system tries to update the file. However, the client
list now contains an actor coming from the client script, that is, coming
over the network from potentially a different machine. Since this is not
under the control of the local script, it cannot be stored.
&lt;/p&gt;
&lt;p&gt;
There is apparently a very simple workaround. The
&lt;code&gt;spawn-persistent-vat&lt;/code&gt; function admits an optional parameter
&lt;code&gt;#:persist-on&lt;/code&gt;; if this is changed from the default
&lt;code&gt;'churn&lt;/code&gt; to something else, then the vat changes are not
stored at each churn. In effect, the vat is only stored once at the
beginning, and keeps an empty client list forever. This is actually
exactly what we need, an empty client list at each restart of the server.
So we end up with the following &lt;code&gt;server.scm&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (srfi srfi-26)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib methods)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls)
             (goblins persistence-store syrup)
             (goblins vat))

(define persistence-vat (spawn-vat))
(define persistence-registry
  (with-vat persistence-vat
    (spawn ^persistence-registry)))

(define-values (net capn)
  (spawn-persistent-vat
    (make-persistence-env #:extends (list captp-env tcp-tls-netlayer-env))
    (lambda ()
      (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;)))
    (make-syrup-store &amp;quot;ocapn.syrup&amp;quot;)
    #:persistence-registry persistence-registry))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define-actor (^registry bcom clients)
  (lambda (client name)
    ($ clients (cons client ($ clients)))
    (format #t &amp;quot;Registered ~a\n&amp;quot; name)))

(define-values (vat clients registry)
  (spawn-persistent-vat
    (make-persistence-env
      (list (list '((registry) ^registry) ^registry))
      #:extends cell-env)
    (lambda ()
      (let ((clients (spawn ^cell '())))
        (values
          clients
          (spawn ^registry clients))))
    (make-syrup-store &amp;quot;registry.syrup&amp;quot;)
    #:persist-on #f
    #:persistence-registry persistence-registry))

(let ((id (with-vat net ($ capn 'register registry 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(while (not (= (length (with-vat vat ($ clients))) 2))
       (sleep 1))

(define v '(1 2 3 4 5))
(with-vat vat
  (while (&amp;lt; (length ($ clients)) (length v))
     (let ((c ($ clients)))
       ($ clients (append c c)))))
(with-vat vat
  (on (all-of* (map (cut &amp;lt;- &amp;lt;&amp;gt; 'square &amp;lt;&amp;gt;) ($ clients) v))
      (lambda (res)
        (format #t &amp;quot;~a\n&amp;quot; (sqrt (fold + 0 res)))
        (map (cut &amp;lt;- &amp;lt;&amp;gt; 'finish) (delete-duplicates ($ clients))))))

(sleep 10)
&lt;/pre&gt;
&lt;p&gt;
It may be prudent now to remove all &lt;code&gt;.syrup&lt;/code&gt; files from previous
failed attempts. Running a server and two client scripts computes the
desired result as before. But now one notices that upon restarting the
server script, it prints the exact same ocapn ID as before. So the clients
can also be restarted with the exact same commands, and no more copy-pasting
is needed.
&lt;/p&gt;

&lt;/div&gt;</content></entry><entry><title>Goblins for number theory, part 2</title><id>https://enge.math.u-bordeaux.fr/blog/goblins-2.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2025-02-25T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/goblins-2.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h1&gt;Parallel Goblins&lt;/h1&gt;

&lt;p&gt;
After seeing how to use the
&lt;a href=&quot;goblins-1.html&quot;&gt;programming concepts&lt;/a&gt;
of &lt;a href=&quot;https://spritely.institute/goblins/&quot;&gt;Goblins&lt;/a&gt;
for a toy problem the structure of which resembles algorithms encountered
in number theory, let us turn our attention to parallelising, or rather
distributing the code. We keep the running example of computing the length
of a vector, by giving out the tasks of squaring to the clients, and leaving
the task of adding up the squares and taking the final square root to the
server.
&lt;/p&gt;


&lt;h2&gt;Networking&lt;/h2&gt;

&lt;p&gt;
Communication in Goblins is abstracted over what is called the
“&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/OCapN.html&quot;&gt;Object
Capabilities Network&lt;/a&gt;”, or “OCapN”. This somewhat frightening term
simply means that a function in one script may call functions in another
script running elsewhere in the network.
&lt;/p&gt;
&lt;p&gt;
Goblins suggests using &lt;a href=&quot;https://www.torproject.org&quot;&gt;Tor&lt;/a&gt;
as the underlying network. Indeed after setting up a Tor daemon
as described in the
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/Launching-a-Tor-daemon-for-Goblins.html&quot;&gt;
Goblins documentation&lt;/a&gt; on my laptop, the provided
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/Example-Two-Goblins-programs-chatting-over-CapTP-via-Tor.html&quot;&gt;example&lt;/a&gt;
of a chat client Alice talking to a chat server Bob works directly out of
the box.
This should also make it relatively easy to run distributed projects over
the Internet, which would fit the idea of using Goblins for popular
science projects.
&lt;/p&gt;
&lt;p&gt;
On the other hand, institutional computing clusters tend to limit network
access, sometimes even blocking outgoing HTTP requests to servers outside
a whitelist. So it is unlikely that the Tor approach will work in this
setting. Also it appears that Tor needs to have access to the Internet
for bootstrapping: The chat script does not run purely locally after
turning off Internet access.
It may be possible to set up Tor in a specific way to cover such local
use cases, but so far my knowledge of Tor is limited to what is described
in the Goblins documentation.
The documentation points to the possibility of using
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/TCP-_002b-TLS.html&quot;&gt;TCP&lt;/a&gt;.
This requires that the participating nodes know each other's IP addresses
or hostnames, which sounds restrictive; but since I am currently using MPI
over TCP, where OpenMPI somehow manages to determine these addresses,
TCP should also be a feasible option with Goblins.
But for the time being let us assume that we are working with a machine
that has access to the Tor network after setting up the Tor daemon as
taught by the Goblins documentation; we will come back to the TCP setting
below.
&lt;/p&gt;


&lt;h2&gt;From chatting to computing&lt;/h2&gt;
&lt;p&gt;
When saying that OCapN enables a function to call functions running
somewhere else in the Tor network, one should more precisely use the term
“actor” instead of “function”; and as seen before, these
do not return values, but promises that resolve to the desired values. But
it is conceptually helpful to think of calls to outsourced functions.
So in our very simple model inspired by algorithmic number theory, we will
have a client script that runs in a number of identical copies, and a
server script that calls functions defined in the clients.
This is in fact much easier to program than with MPI, where the exchange
of function arguments and results requires explicit
&lt;code&gt;MPI_Send&lt;/code&gt; and matching &lt;code&gt;MPI_Recv&lt;/code&gt; statements in
the server and the client, and where furthermore complex data types need
to be serialised by hand since the communication functions work only
with basic, scalar types. Finally it is necessary to carefully and
explicitly craft the control flows of the different programs exchanging
data so that indeed the data sending statements exactly match the data
receiving statements; otherwise there will be a deadlock.
In the Goblins framework, this is all implicit.
As an end result a distributed code does not look very different from
the corresponding serial code.
&lt;/p&gt;
&lt;p&gt;
But we still need to make a few things explicit:
First of all, the different running scripts need to connect to the
network. And the functions to be called remotely need to obtain a
unique identifier and advertise it, and the caller needs to know this
identifier to make the call. Luckily in our setting, most of the
corresponding code can be considered as copy-pastable boilerplate.
&lt;/p&gt;
&lt;p&gt;
Indeed the chat example can be transposed to our running example of
vector lengths almost immediately.
&lt;/p&gt;
&lt;p&gt;
Let us start with the client, to be put into a file
&lt;code&gt;client.scm&lt;/code&gt;
(compared with the chat example, the client and server roles are
reversed):
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer onion))

(define vat (spawn-vat))
(define net (spawn-vat))

;; Define the client functionality.
(define (^square bcom)
  (lambda (x)
    (* x x)))
(define client
  (with-vat vat (spawn ^square)))

;; Helper function for printing IDs.
(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

;; Create a communicator.
(define mycapn
  (with-vat net (spawn-mycapn (spawn ^onion-netlayer))))
;; Create an ID for the client and print it.
(define id
  (with-vat net ($ mycapn 'register client 'onion)))
(print-id &amp;quot;Client ID&amp;quot; id)

;; Wait for requests.
(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
The chat example uses two vats to separate the networking part and the
actual functionality. This does not seem to be strictly necessary (the
examples also work when everything is put into the same vat); but if I
understand correctly, each vat corresponds to a separate, concurrent event
loop; so having several vats might help to prevent deadlocks and possibly
speed things up by separating communication from computation. I therefore
follow the example and declare two vats from the start, &lt;code&gt;net&lt;/code&gt; for
everything network related and &lt;code&gt;vat&lt;/code&gt; for everything else.
The client actor in the main vat is defined as before through the function
computing a square.
&lt;/p&gt;
&lt;p&gt;
A network connection &lt;code&gt;mycapn&lt;/code&gt; is defined, the client actor
is registered with it and its network ID &lt;code&gt;id&lt;/code&gt; is obtained
through some magic incantations. Before version 0.15.0 of Goblins,
&lt;code&gt;id&lt;/code&gt; used to be a value, but now it is a promise.
So before printing it as a string by applying the
&lt;code&gt;ocapn-id-&amp;gt;string&lt;/code&gt; function, one needs to wait for the
resolution of the promise; this is moved into the helper function
&lt;code&gt;print-id&lt;/code&gt;. This string value will be used to communicate
the ID manually to the server later on.
&lt;/p&gt;
&lt;p&gt;
Finally, we just wait for requests to compute squares (the chat example
has a more sophisticated approach to waiting using Guile fibers, but
&lt;code&gt;sleep&lt;/code&gt; is enough for illustration purposes).
&lt;/p&gt;
&lt;p&gt;
The corresponding server follows, to be put into a file
&lt;code&gt;server.scm&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners)
             (goblins actor-lib let-on)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer onion))

(define vat (spawn-vat))
(define net (spawn-vat))

(define mycapn
   (with-vat net (spawn-mycapn (spawn ^onion-netlayer))))

;; Enliven the clients.
(define client1
  (with-vat vat
    (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id (second (command-line))))))
(define client2
  (with-vat vat
    (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id (third (command-line))))))

(define (^len bcom)
  (lambda (v)
    (on (all-of (&amp;lt;- client1 (first v))(&amp;lt;- client2 (second v)))
        (lambda (res)
          (sqrt (fold + 0 res)))
        #:promise? #t)))

(define len (with-vat vat (spawn ^len)))

(with-vat vat
  (let-on ((l ($ len '(3 4))))
    (format #t &amp;quot;~a\n&amp;quot; l)))

;; Wait for the result to be computed, otherwise nothing will be printed.
(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
The server also starts by connecting to the network, and then it obtains
live references to (“enlivens”, in Goblins parlance) the two clients by
magic incantations.
The IDs of the clients are supposed to be passed as string
arguments through the commandline, which are retrieved by
&lt;code&gt;(second (command-line))&lt;/code&gt; and
&lt;code&gt;(third (command-line))&lt;/code&gt;, respectively
(as with &lt;code&gt;argv&lt;/code&gt; in C, the first argument is the name of the
program or Guile script itself, and unlike in C, counting starts with 1,
not 0).
So we obtain local variables
&lt;code&gt;client1&lt;/code&gt; and &lt;code&gt;client2&lt;/code&gt;.
The remainder of the code is the same as in the serial example,
except that we again combine the norm and square root computations into
one function &lt;code&gt;len&lt;/code&gt;.
Finally we add a bit of waiting: This is necessary to wait for the
resolution of the promises, since &lt;code&gt;let-on&lt;/code&gt; apparently does
not do so; otherwise the server script would terminate before the result
of the computation is printed.
&lt;/p&gt;
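&lt;p&gt;
As a minimal illustration of the argument indexing (plain Guile, no
Goblins involved), consider a hypothetical script &lt;code&gt;args.scm&lt;/code&gt;
invoked as &lt;code&gt;guile args.scm foo bar&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1))

;; (command-line) returns the list (&amp;quot;args.scm&amp;quot; &amp;quot;foo&amp;quot; &amp;quot;bar&amp;quot;).
(format #t &amp;quot;script: ~a\n&amp;quot; (first (command-line)))    ; prints args.scm
(format #t &amp;quot;argument: ~a\n&amp;quot; (second (command-line)))  ; prints foo
&lt;/pre&gt;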
&lt;p&gt;
To run the example, do not forget to start the Tor daemon with the command
&lt;/p&gt;
&lt;pre&gt;
tor -f $HOME/.config/goblins/tor-config.txt
&lt;/pre&gt;
&lt;p&gt;
Then open three terminals, and in two of them launch a client with the
command
&lt;/p&gt;
&lt;pre&gt;
guile client.scm
&lt;/pre&gt;
&lt;p&gt;
and copy the two URIs of the form &lt;code&gt;ocapn://…&lt;/code&gt;.
In the third terminal, start the server with the command
&lt;/p&gt;
&lt;pre&gt;
guile server.scm ocapn://… ocapn://…
&lt;/pre&gt;
&lt;p&gt;
where the &lt;code&gt;ocapn://…&lt;/code&gt; command line arguments are pasted
from the client output.
After a few seconds the server will print the result of the computation,
and all three programs can be stopped using the
&lt;code&gt;&amp;lt;ctrl&amp;gt;-&amp;lt;c&amp;gt;&lt;/code&gt; key combination.
&lt;/p&gt;
&lt;p&gt;
If nothing happens, chances are there is a problem with the Tor network;
the file &lt;code&gt;$HOME/.cache/goblins/tor/tor-log.txt&lt;/code&gt; may contain
hints. In particular, the network needs to be 100% bootstrapped.
&lt;/p&gt;


&lt;h2&gt;TCP instead of onions, after all&lt;/h2&gt;

&lt;p&gt;
Even if used only locally, the need to access the Internet makes the Tor
protocol relatively slow; connections can fail, and this makes debugging
somewhat painful – it is not easy to distinguish a deadlock in the program
code from a poorly working network. The Goblins documentation does not
provide a working example for using
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/TCP-_002b-TLS.html&quot;&gt;TCP&lt;/a&gt;,
but moving from Tor to TCP is relatively straightforward:
Replace all occurrences of the substring &lt;code&gt;onion&lt;/code&gt; in the
scripts above (also in the name &lt;code&gt;^onion-netlayer&lt;/code&gt;
and the symbol &lt;code&gt;'onion&lt;/code&gt;) by &lt;code&gt;tcp-tls&lt;/code&gt;, then add
the parameter &lt;code&gt;&amp;quot;localhost&amp;quot;&lt;/code&gt; to the invocation of
&lt;code&gt;(spawn ^tcp-tls-netlayer)&lt;/code&gt;.
To simplify copying and pasting, here is the resulting code for
&lt;code&gt;client.scm&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

;; Define the client functionality.
(define (^square bcom)
  (lambda (x)
    (* x x)))
(define client
  (with-vat vat (spawn ^square)))

;; Helper function for printing IDs.
(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

;; Create a communicator.
(define mycapn
  (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))
;; Create an ID for the client and print it.
(define id
  (with-vat net ($ mycapn 'register client 'tcp-tls)))
(print-id &amp;quot;Client ID&amp;quot; id)

;; Wait for requests.
(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
And &lt;code&gt;server.scm&lt;/code&gt; becomes the following code:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners)
             (goblins actor-lib let-on)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define mycapn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

;; Enliven the clients.
(define client1
  (with-vat vat
    (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id (second (command-line))))))
(define client2
  (with-vat vat
    (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id (third (command-line))))))

(define (^len bcom)
  (lambda (v)
    (on (all-of (&amp;lt;- client1 (first v))(&amp;lt;- client2 (second v)))
        (lambda (res)
          (sqrt (fold + 0 res)))
        #:promise? #t)))

(define len (with-vat vat (spawn ^len)))

(with-vat vat
  (let-on ((l ($ len '(3 4))))
    (format #t &amp;quot;~a\n&amp;quot; l)))

;; Wait for the result to be computed, otherwise nothing will be printed.
(sleep 3600)
&lt;/pre&gt;

&lt;p&gt;
When starting the client, notice that the ID changes from a URI of the
form &lt;code&gt;ocapn://….onion/…&lt;/code&gt; to one of the form
&lt;code&gt;ocapn://….tcp-tls/…?host=localhost&amp;amp;port=…&lt;/code&gt;,
where the port is chosen at random; the two clients and the server will
each get their own port. (If desired, a given port can be chosen by
adding a parameter such as &lt;code&gt;#:port 12345&lt;/code&gt; after
&lt;code&gt;&amp;quot;localhost&amp;quot;&lt;/code&gt; in the invocation of
&lt;code&gt;(spawn ^tcp-tls-netlayer)&lt;/code&gt;.)
Due to the special character &lt;code&gt;&amp;amp;&lt;/code&gt; in the URI, it is necessary
to enclose it in single quotes &lt;code&gt;'&lt;/code&gt; on the command line,
so one needs to start the server with the command
&lt;/p&gt;
&lt;pre&gt;
guile server.scm 'ocapn://…' 'ocapn://…'
&lt;/pre&gt;
&lt;p&gt;
That the parameter &lt;code&gt;'onion&lt;/code&gt; or &lt;code&gt;'tcp-tls&lt;/code&gt; is
required in function calls such as
&lt;code&gt;($ mycapn 'register client 'tcp-tls)&lt;/code&gt; is a surprising
design choice in Goblins:
When spawning &lt;code&gt;mycapn&lt;/code&gt;, a netlayer is passed as a
parameter, so in theory the actor should be able to remember the
kind of network setting it is attached to.
&lt;/p&gt;
&lt;p&gt;
Notice that with TCP, the result of the computation is printed immediately,
whereas it takes a few seconds with Tor. So to ease debugging, we will from
now on keep the TCP setting; going back to Tor is straightforward.
&lt;/p&gt;


&lt;h2&gt;Registering clients&lt;/h2&gt;

&lt;p&gt;
The approach in which the server needs to know all client IDs beforehand
becomes unwieldy in a context where we expect hundreds or even thousands
of computation cores. It would be preferable to use a two-stage process:
The server publishes its ID, and the clients use it to connect to the
server and to register their IDs. Then in a second step the server can
send computing tasks to the clients. We will gradually transform the
example code to end up with such a solution.
&lt;/p&gt;
&lt;p&gt;
First of all, let us replace the fixed number (in our case, 2) of client
variables by a more dynamic structure, a list of clients; for this, it is
enough to modify the server as follows:
&lt;/p&gt;
&lt;pre&gt;
(define clients
  (with-vat vat
    (map (lambda (uri)
           (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id uri)))
         (list-tail (command-line) 1))))

(define (^len bcom)
  (lambda (v)
    (on (all-of* (map &amp;lt;- clients v))
        (lambda (res)
          (sqrt (fold + 0 res)))
        #:promise? #t)))
&lt;/pre&gt;
&lt;p&gt;
So here the client variable becomes a list instantiated using the
(in principle variable number of) IDs passed on the command line.
The variant &lt;code&gt;all-of*&lt;/code&gt; of the joiner is used to handle lists
of promises.
Notice that &lt;code&gt;&amp;lt;-&lt;/code&gt; can be used as any other function in
a &lt;code&gt;map&lt;/code&gt; statement:
&lt;code&gt;(map &amp;lt;- clients v)&lt;/code&gt; matches the two clients with the two
entries of the vector and returns a list of promises resolving to the
squares (for the time being we still assume that the length of the client
list matches the length of the vector).
&lt;/p&gt;
&lt;p&gt;
While we are at it, we may as well hold the list in a
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/Cell.html&quot;&gt;cell&lt;/a&gt;
actor, as a way of introducing state through the back door: the cell may hold
values that are exchanged throughout the program execution.
&lt;/p&gt;
&lt;pre&gt;
(use-modules (goblins actor-lib cell))
…
(define clients (with-vat vat (spawn ^cell '())))
(with-vat vat
  ($ clients (map (lambda (uri)
                    (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id uri)))
                  (list-tail (command-line) 1))))

(define (^len bcom)
  (lambda (v)
    (on (all-of* (map &amp;lt;- ($ clients) v))
        (lambda (res)
          (sqrt (fold + 0 res)))
        #:promise? #t)))
&lt;/pre&gt;
&lt;p&gt;
So instead of creating a list, we spawn a cell containing an empty list;
then we put a different value into the cell by applying the &lt;code&gt;$&lt;/code&gt;
function to it with the desired new value as additional argument.
Later we extract the list by applying the &lt;code&gt;$&lt;/code&gt; function without
additional argument to the cell (since we are in the same vat, we may use
&lt;code&gt;$&lt;/code&gt; instead of &lt;code&gt;&amp;lt;-&lt;/code&gt; and need not worry about
promise resolution).
&lt;/p&gt;
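&lt;p&gt;
To illustrate the cell protocol (a sketch following the Goblins
documentation; the names are only examples), calling the cell without
an argument reads its content, while calling it with one argument
overwrites it:
&lt;/p&gt;
&lt;pre&gt;
(with-vat vat
  (define c (spawn ^cell 'initial))
  ($ c)           ; read: yields 'initial
  ($ c 'updated)  ; write: the cell now holds 'updated
  ($ c))          ; read: yields 'updated
&lt;/pre&gt;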
&lt;p&gt;
We are now prepared to implement the registration of clients in the server.
For this, we create a new type of actor, which takes a URI identifying a
client and adds it to the list of clients in the cell. To see
that something actually happens, we then print the added URI:
&lt;/p&gt;
&lt;pre&gt;
(define (^register bcom)
  (lambda (uri)
    ($ clients (cons (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id uri))
                     ($ clients)))
    (format #t &amp;quot;Registered ~a\n&amp;quot; uri)))
&lt;/pre&gt;
&lt;p&gt;
We create an instance of this actor type, add it to the network and
print its ID (using the same &lt;code&gt;print-id&lt;/code&gt; function):
&lt;/p&gt;
&lt;pre&gt;
(define register (with-vat vat (spawn ^register)))
(let ((id (with-vat net ($ mycapn 'register register 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))
&lt;/pre&gt;
&lt;p&gt;
Finally we can use this new register function instead of the ad-hoc
creation to add the clients from the command line to the list:
&lt;/p&gt;
&lt;pre&gt;
(with-vat vat
  (map (lambda (uri)
         ($ register uri))
       (list-tail (command-line) 1)))
&lt;/pre&gt;
&lt;p&gt;
Altogether we arrive at the following code, which can replace the
&lt;code&gt;server.scm&lt;/code&gt; script while keeping the current clients:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib let-on)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define mycapn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

;; Register clients.
(define clients (with-vat vat (spawn ^cell '())))

(define (^register bcom)
  (lambda (uri)
    ($ clients (cons (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id uri))
                     ($ clients)))
    (format #t &amp;quot;Registered ~a\n&amp;quot; uri)))

(define register (with-vat vat (spawn ^register)))
(let ((id (with-vat net ($ mycapn 'register register 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

(with-vat vat
  (map (lambda (uri)
         ($ register uri))
       (list-tail (command-line) 1)))

;; Use clients.
(define (^len bcom)
  (lambda (v)
    (on (all-of* (map &amp;lt;- ($ clients) v))
        (lambda (res)
          (sqrt (fold + 0 res)))
        #:promise? #t)))

(define len (with-vat vat (spawn ^len)))

(with-vat vat
  (let-on ((l ($ len '(3 4))))
    (format #t &amp;quot;~a\n&amp;quot; l)))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
Now it is time to swap the roles! We first start the server without
command line arguments (as it is written, it then just has an initial
empty client list):
&lt;/p&gt;
&lt;pre&gt;
guile server.scm
&lt;/pre&gt;
&lt;p&gt;
Two clients are now started using the URI printed by the server
as a command line argument:
&lt;/p&gt;
&lt;pre&gt;
guile client.scm 'ocapn://…'
guile client.scm 'ocapn://…'
&lt;/pre&gt;
&lt;p&gt;
For this to work, we need to add to the client script the necessary
(and straightforward) code to enliven the server and to remotely register
the client with the server.
We end up with the following script &lt;code&gt;client.scm&lt;/code&gt;:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))
(define client
  (with-vat vat (spawn ^square)))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define mycapn
  (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))
(define id
  (with-vat net ($ mycapn 'register client 'tcp-tls)))
(print-id &amp;quot;Client ID&amp;quot; id)

;; Enliven server.
(define server
  (with-vat vat
    (&amp;lt;- mycapn 'enliven (string-&amp;gt;ocapn-id (second (command-line))))))

;; Register with server.
(with-vat vat
  (on id
    (lambda (id)
      (&amp;lt;- server id))))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
Notice that we have slightly modified registration with the server:
Since the client communicates directly with the server, there is no need
to go through a string representation of the ID; the ID itself may be
passed directly as an argument to the function call.
This assumes that in the server, registration has been modified as follows:
&lt;/p&gt;
&lt;pre&gt;
(define (^register bcom)
  (lambda (id)
    ($ clients (cons (&amp;lt;- mycapn 'enliven id)
                     ($ clients)))
    (print-id &amp;quot;Registered&amp;quot; id)))
&lt;/pre&gt;
&lt;p&gt;
Running the server and two copies of the client, one should now see the
client IDs printed in their respective terminals, and messages in the
server terminal that these clients have been registered.
However, the desired length 5 is not printed. In fact, the 
&lt;code&gt;len&lt;/code&gt; actor is called at the end of the server script
before the clients have had a chance to register through the network
(actually even before the clients are started), so the
expression &lt;code&gt;($ clients)&lt;/code&gt; yields an empty list.
Now the &lt;code&gt;map&lt;/code&gt; function from
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/SRFI_002d1-Fold-and-Map.html#index-map-1&quot;&gt;SRFI
1&lt;/a&gt; also truncates &lt;code&gt;v&lt;/code&gt; to the
empty list, &lt;code&gt;all-of*&lt;/code&gt; resolves to the empty list, and
&lt;code&gt;fold&lt;/code&gt; returns the starting value 0, which is actually printed
before the two client IDs.
&lt;/p&gt;
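&lt;p&gt;
This empty-list behaviour is easy to reproduce in a plain Guile REPL,
independently of Goblins:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1))
(map + '() '(3 4))  ; =&amp;gt; (), map truncates to the shortest list
(fold + 0 '())      ; =&amp;gt; 0, the starting value
&lt;/pre&gt;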
&lt;p&gt;
This can be solved by having the server wait until the desired number of
clients has registered, by adding the following code:
&lt;/p&gt;
&lt;pre&gt;
(define v '(3 4))
(while (not (= (length (with-vat vat ($ clients))) (length v)))
       (sleep 1))
&lt;/pre&gt;
&lt;p&gt;
As a warning, the seemingly equivalent lines
&lt;/p&gt;
&lt;pre&gt;
(define v '(3 4))
(with-vat vat
  (while (not (= (length ($ clients)) (length v)))
         (sleep 1)))
&lt;/pre&gt;
&lt;p&gt;
result in a deadlock in which none of the clients get a chance to
register. It looks as if operations inside &lt;code&gt;with-vat&lt;/code&gt;
block the vat so that it does not handle incoming remote function
calls.
&lt;/p&gt;
&lt;p&gt;
After also removing the code that registers clients specified on the
command line, the server script currently looks like this:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib let-on)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define mycapn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

;; Register clients.
(define clients (with-vat vat (spawn ^cell '())))

(define (^register bcom)
  (lambda (id)
    ($ clients (cons (&amp;lt;- mycapn 'enliven id)
                     ($ clients)))
    (print-id &amp;quot;Registered&amp;quot; id)))

(define register (with-vat vat (spawn ^register)))
(let ((id (with-vat net ($ mycapn 'register register 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

;; Use clients.
(define (^len bcom)
  (lambda (v)
    (on (all-of* (map &amp;lt;- ($ clients) v))
        (lambda (res)
          (sqrt (fold + 0 res)))
        #:promise? #t)))

(define len (with-vat vat (spawn ^len)))

;; Wait until enough clients have registered.
(define v '(3 4))
(while (not (= (length (with-vat vat ($ clients))) (length v)))
       (sleep 1))

(with-vat vat
  (let-on ((l ($ len '(3 4))))
    (format #t &amp;quot;~a\n&amp;quot; l)))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
Notice that the same code can be run for vectors with different numbers
of entries; it just requires that (at least) as many clients connect as
there are tasks to handle.
As a small caveat, the code is correct only because we did not implement an
unregister procedure for the clients, so their number increases
monotonically. Otherwise it could happen that between the arrival
of the second client and the call to the &lt;code&gt;len&lt;/code&gt; function,
one of the clients disappears again and the &lt;code&gt;clients&lt;/code&gt;
list contains only one entry, say. Then the
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/SRFI_002d1-Fold-and-Map.html#index-map-1&quot;&gt;SRFI-1
map&lt;/a&gt;
function we are using, which accepts lists of different lengths by
truncating them all to the smallest occurring length, would only consider
the first entry of &lt;code&gt;v&lt;/code&gt;, and the incorrect length 3 would be
computed.
&lt;/p&gt;
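&lt;p&gt;
This failure mode can also be simulated in a plain Guile REPL, with a
hypothetical list holding a single remaining client (here replaced by
an ordinary function application instead of a message send):
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1))
(define (square x) (* x x))
(map (lambda (client x) (square x)) '(client1) '(3 4))  ; =&amp;gt; (9)
(sqrt (fold + 0 '(9)))                                  ; =&amp;gt; 3
&lt;/pre&gt;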
&lt;p&gt;
In a more realistic setting, there are more computing tasks than clients.
When these all take more or less the same time, they may be evenly split
among the available clients. For instance, the following server code
waits for two clients to connect and then computes the length of vectors
of arbitrary dimension:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib cell)
             (goblins actor-lib joiners)
             (goblins actor-lib let-on)
             (goblins ocapn ids)
             (goblins ocapn captp)
             (goblins ocapn netlayer tcp-tls))

(define vat (spawn-vat))
(define net (spawn-vat))

(define (print-id prefix id)
  (with-vat net
    (on id
      (lambda (sref)
        (format #t &amp;quot;~a ~a\n&amp;quot;
                   prefix (ocapn-id-&amp;gt;string sref))))))

(define mycapn
   (with-vat net (spawn-mycapn (spawn ^tcp-tls-netlayer &amp;quot;localhost&amp;quot;))))

;; Register clients.
(define clients (with-vat vat (spawn ^cell '())))

(define (^register bcom)
  (lambda (id)
    ($ clients (cons (&amp;lt;- mycapn 'enliven id)
                     ($ clients)))
    (print-id &amp;quot;Registered&amp;quot; id)))

(define register (with-vat vat (spawn ^register)))
(let ((id (with-vat net ($ mycapn 'register register 'tcp-tls))))
  (print-id &amp;quot;Server ID&amp;quot; id))

;; Use clients.
(define (^len bcom)
  (lambda (v)
    (on (all-of* (map &amp;lt;- ($ clients) v))
        (lambda (res)
          (sqrt (fold + 0 res)))
        #:promise? #t)))

(define len (with-vat vat (spawn ^len)))

(while (not (= (length (with-vat vat ($ clients))) 2))
       (sleep 1))

(define v '(1 2 3 4 5))
(with-vat vat
  (while (&amp;lt; (length ($ clients)) (length v))
     (let ((c ($ clients)))
       ($ clients (append c c)))))

(with-vat vat
  (let-on ((l ($ len v)))
    (format #t &amp;quot;~a\n&amp;quot; l)))

(sleep 3600)
&lt;/pre&gt;
&lt;p&gt;
The code somewhat crudely “doubles” the client list until there are
at least as many occurrences of clients (with multiplicities) as tasks;
then &lt;code&gt;map&lt;/code&gt; does the right thing.
&lt;/p&gt;
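&lt;p&gt;
The doubling loop could also be factored into a small pure helper, a
hypothetical refactoring not used in the script above:
&lt;/p&gt;
&lt;pre&gt;
(define (pad-clients cs n)
  ;; Repeat the client list until it contains at least n entries.
  (if (&amp;lt; (length cs) n)
      (pad-clients (append cs cs) n)
      cs))
(pad-clients '(c1 c2) 5)  ; =&amp;gt; (c1 c2 c1 c2 c1 c2 c1 c2)
&lt;/pre&gt;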
&lt;p&gt;
This simple situation occurs surprisingly often in number theory.
For instance in ECPP, one needs to compute many modular square roots for
the same modulus; trial factor many batches of numbers of the same size;
do many primality tests for numbers of the same size. However, the more
general case of tasks of varying duration also occurs (in ECPP,
for instance, when computing roots of class polynomials of vastly
differing degrees), and the relative task durations are not necessarily
easy to estimate.
In a more distributed setting, one can also imagine that even homogeneous
tasks are more or less quickly solved with more or less powerful
participating machines.
Scheduling tasks by hand is thus not realistic in general.
Instead, one would need a more dynamic approach, in which the server
maintains a list of tasks and a list of clients; whenever a client is
idle it should be sent a new task.
&lt;/p&gt;
&lt;p&gt;
Given the length of this second part, this is a question I plan to pursue
in another instalment.
&lt;/p&gt;

&lt;p&gt;
This blog post, originally published on 2024-09-04, was updated
on 2025-02-25 to cover changes between Goblins 0.13.0 and 0.15.0
and to incorporate minor improvements.
&lt;/p&gt;

&lt;/div&gt;</content></entry><entry><title>Goblins for number theory, part 1</title><id>https://enge.math.u-bordeaux.fr/blog/goblins-1.html</id><author><name>Andreas Enge</name><email>andreas.enge@inria.fr</email></author><updated>2024-09-04T00:00:00Z</updated><link href="https://enge.math.u-bordeaux.fr/blog/goblins-1.html" rel="alternate" /><content type="html">&lt;div&gt;

&lt;h1&gt;Starting with Goblins&lt;/h1&gt;

&lt;h2&gt;Motivation&lt;/h2&gt;

&lt;p&gt;
Most of my code in algorithmic number theory is written in C and runs in a
parallelised version using MPI on a cluster.
The C language is mandatory for efficiency reasons;
MPI is mostly a convenience. Indeed number theoretic code is often
embarrassingly parallel. For instance my
&lt;a href=&quot;https://www.multiprecision.org/cm/ecpp.html&quot;&gt;ECPP
implementation&lt;/a&gt; for primality proving essentially consists of a number
of &lt;code&gt;for&lt;/code&gt; loops running one after the other. A server process
distributes evaluations of the function inside the loop to the available
clients, which take a few seconds or even minutes to report back their
respective results. These are then handled by the server before entering
the next loop.
In a cluster environment with a shared file system, this could even be
realised by starting a number of clients over SSH, starting a server, and
then using numbered files to exchange function arguments and results,
or &lt;code&gt;touch&lt;/code&gt; on files to send “signals” between the server and
the clients.
Computation time is the bottleneck, communication is minimal, so even
doing this over NFS is perfectly feasible (and I have written and deployed
such code in the past).
MPI then provides a convenience layer that makes the process look more
professional, and also integrates more smoothly with the batch submission
approach of computation clusters.
&lt;/p&gt;
&lt;p&gt;
The very loosely coupled nature of number theoretic computations should
make it possible to distribute them beyond a cluster. Why not even do
a primality proof with several participants working together over the
Internet?
I have looked at &lt;a href=&quot;https://boinc.berkeley.edu/&quot;&gt;BOINC&lt;/a&gt;
previously; but the system seems to be intended for completely uncoupled
problems, essentially exploration of a large search space. The work is
cut up into a number of independent tasks that are sent out to the
participants; if they do not report back in a few days, the same task is
sent out again, and over several months all tasks are treated.
While number theoretic computations may also take a few months, they do
require at least some synchronisation, and the server needs to hear back
from the clients every few minutes so as not to be blocked.
(Each &lt;code&gt;for&lt;/code&gt; loop is embarrassingly parallel, but several loops
must be run sequentially.)
Also the “administrative” overhead of things to be done for BOINC outside
the program itself looks rather daunting: setting up a database, for
instance.
So I have been looking for a programming environment that is somewhere
between MPI and BOINC, making loosely coupled computations possible;
it should be able to run over the Internet and not only on a cluster
connected by SSH;
and it should result in code that is relatively easy to write, and
for which just as with MPI the parallel version does not look very
different from the sequential one.
(In my C code, I usually end up having everything in one file,
with the parallel and the sequential versions being handled by
alternating blocks selected by &lt;code&gt;#ifdef&lt;/code&gt;.)
&lt;/p&gt;
&lt;p&gt;
&lt;a href=&quot;https://spritely.institute/goblins/&quot;&gt;Goblins&lt;/a&gt; is a distributed
programming environment by the
&lt;a href=&quot;https://spritely.institute/&quot;&gt;Spritely Institute&lt;/a&gt;
that seems to fit the bill. It is meant for distributed programming, and
locally running code can seamlessly be run over
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/OCapN.html&quot;&gt;networks&lt;/a&gt;
using various mechanisms such as
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/Tor-Onion-Services.html&quot;&gt;Tor&lt;/a&gt;
or simply
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/TCP-_002b-TLS.html&quot;&gt;TCP
and TLS&lt;/a&gt;.
On the other hand, not only does it use puzzling vocabulary, but also
puzzling concepts, such as object “capabilities”, actor “model” and
“vats”. For someone coming from imperative programming and good old C,
with a penchant for assembly, looking at Goblins can feel like reading
Heidegger.
Fortunately Goblins comes with an excellent
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/index.html&quot;&gt;tutorial&lt;/a&gt;,
which clarifies the seemingly exotic concepts by a hands-on approach with
concrete examples. These put the emphasis, however, on the distribution and
communication layer; tangible results are essentially obtained as side
effects of printing values on screen. While I think it is an excellent
idea to teach programming without mathematics to lower the barrier for
people who do not like mathematics, I had the opposite problem: As a
mathematician wanting to do computations, it was not immediately clear
to me how to have the server send computational tasks to the clients
and recover the results.
Goblins is a library (or collection of “modules”) for
&lt;a href=&quot;https://www.gnu.org/software/guile/&quot;&gt;Guile&lt;/a&gt;
(or &lt;a href=&quot;https://racket-lang.org/&quot;&gt;Racket&lt;/a&gt;), two
&lt;a href=&quot;https://www.r6rs.org/&quot;&gt;Scheme&lt;/a&gt; dialects;
from what I understand, it is unlikely that a C implementation will be
available any time soon.
&lt;/p&gt;
&lt;p&gt;
So the way in which I see Goblins being useful to distributed number theory
computations is as follows:
&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
Use Guile and Goblins to express the high-level control flow of the program,
and additionally Goblins to express its parallelised and distributed
aspects.
&lt;/li&gt;
&lt;li&gt;
Use functions from a C library to do the heavy lifting,
for instance functions from
&lt;a href=&quot;https://www.multiprecision.org/cm/&quot;&gt;CM&lt;/a&gt;
to do the computationally intensive tasks related to primality proving,
which can be called from Guile using the
&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/Foreign-Function-Interface.html&quot;&gt;Foreign
Function Interface&lt;/a&gt; or FFI.
In a number theoretic context, function arguments and values are often
arbitrarily long integers coming from the
&lt;a href=&quot;https://gmplib.org/&quot;&gt;GMP&lt;/a&gt; library.
Given that Guile itself is written in C and relies on GMP for its
implementation of integers, one can be hopeful that this should not pose
too many problems.
&lt;/li&gt;
&lt;/ol&gt;
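&lt;p&gt;
As a minimal sketch of the second point (assuming Guile 3.0.6 or later
on a typical GNU system, and taking &lt;code&gt;sqrt&lt;/code&gt; from the C
library as a stand-in for a CM function), a C function can be bound
through the FFI as follows:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (system foreign)
             (system foreign-library))
;; Bind the C function double sqrt (double) from libm.
(define c-sqrt
  (foreign-library-function &amp;quot;libm&amp;quot; &amp;quot;sqrt&amp;quot;
                            #:return-type double
                            #:arg-types (list double)))
(c-sqrt 25.0)  ; =&amp;gt; 5.0
&lt;/pre&gt;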


&lt;h2&gt;The prototypical example&lt;/h2&gt;

&lt;p&gt;
As a running example, I would like to treat the computation of the euclidean
length of a vector; starting simple and enhancing the example step by step
in the following tutorial, which I am making up while I am trying to solve
the problem for myself.
&lt;/p&gt;
&lt;p&gt;
Let us begin with a fixed vector of small, fixed size,
which would look like the following in C:
&lt;/p&gt;
&lt;pre&gt;
int square (int x) {
   return x*x;
}

int v [] = {3, 4};
double len;

len = 0.0;
for (int i = 0; i &amp;lt; sizeof (v) / sizeof (int); i++)
   len += square (v [i]);
len = sqrt (len);
&lt;/pre&gt;
&lt;p&gt;
Granted, this is not number theory, but it fits the situation described
in the motivational section above:
There is a &lt;code&gt;for&lt;/code&gt; loop going through the vector, calling
the &lt;code&gt;square&lt;/code&gt; function independently for each entry; it
stands for a function that is expensive to compute, should be
distributed to the clients and would be loaded from a C library. The
additions and the square root, the &lt;code&gt;+&lt;/code&gt; and &lt;code&gt;sqrt&lt;/code&gt; functions,
stand for cheap post-treatment done at the server level.
&lt;/p&gt;
&lt;p&gt;
The following Guile code captures this sequential computation
quite compactly, using the &lt;code&gt;map&lt;/code&gt; and &lt;code&gt;fold&lt;/code&gt;
idiom:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1))
(define (square x) (* x x))
(define v '(3 4))
(sqrt (fold + 0 (map square v)))
&lt;/pre&gt;
&lt;p&gt;
Open a Guile REPL using the &lt;code&gt;guile&lt;/code&gt; command.
Then copy-paste this code into the REPL;
or save it as a file &lt;code&gt;euclid.scm&lt;/code&gt; and
type &lt;code&gt;(load &amp;quot;euclid.scm&amp;quot;)&lt;/code&gt; in the REPL;
enjoy the Pythagorean result!
&lt;/p&gt;
&lt;p&gt;
Before continuing, please go first through Chapters 1 to 4 of the Goblins
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/index.html&quot;&gt;documentation
and tutorial&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
The next step is to “locally distribute” the computation, that is, to
create two clients and a server for the different steps of the computation;
for the time being, these will all live in the same local Guile REPL.
What is called a “process” in MPI corresponds to a “vat” in Goblins;
so we create &lt;code&gt;vat0&lt;/code&gt; for the server and &lt;code&gt;vat1&lt;/code&gt;
and &lt;code&gt;vat2&lt;/code&gt; for the clients:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (goblins))
(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))
&lt;/pre&gt;
&lt;p&gt;
Functions in an MPI instrumented parallel program correspond to
“actors” living in a vat; we first define a type of actor computing
squares:
&lt;/p&gt;
&lt;pre&gt;
(define (^square bcom)
  (lambda (x)
    (* x x)))
&lt;/pre&gt;
&lt;p&gt;
To distinguish it from the &lt;code&gt;square&lt;/code&gt; function above, we prepend
a &lt;code&gt;^&lt;/code&gt; to its name; it takes a formal parameter called
&lt;code&gt;bcom&lt;/code&gt; that we need not worry about.
&lt;/p&gt;
&lt;p&gt;
Then we populate the client vats with a square actor each and keep
references to the different actors under different global names:
&lt;/p&gt;
&lt;pre&gt;
(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))
&lt;/pre&gt;
&lt;p&gt;
Now we can create a function in the server vat which computes the length
of a 2-dimensional vector by calls to the client actors using
&lt;code&gt;&amp;lt;-&lt;/code&gt;. For this to work, we will need to wait for the
clients to finish their computations (or, in Goblins parlance, for their
“promises” to be “fulfilled”); this is done using &lt;code&gt;on&lt;/code&gt; for
each call to a client actor.
The “Goblins standard library”, described in
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/actor_002dlib.html&quot;&gt;Chapter
6&lt;/a&gt; of the documentation, comes in handy here; in particular we can use a
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/Joiners.html&quot;&gt;joiner&lt;/a&gt;
to wait for several actors at the same time.
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins actor-lib joiners))
(define (^len bcom)
  (lambda (v)
    (on (all-of (&amp;lt;- client1 (first v))(&amp;lt;- client2 (second v)))
        (lambda (res)
          (let ((l (sqrt (fold + 0 res))))
            (format #t &amp;quot;~a\n&amp;quot; l))))))
(define len (with-vat vat0 (spawn ^len)))
&lt;/pre&gt;
&lt;p&gt;
The &lt;code&gt;len&lt;/code&gt; actor can now be called as follows, which will print the
euclidean length of a vector on screen; from within the vat where the
actor resides, we may use &lt;code&gt;$&lt;/code&gt; instead of &lt;code&gt;&amp;lt;-&lt;/code&gt;,
which behaves like a normal function call:
&lt;/p&gt;
&lt;pre&gt;
(with-vat vat0 ($ len '(3 4)))
&lt;/pre&gt;
&lt;p&gt;
Putting this all together for convenient copy-pasting, here is the complete
code:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners))

(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))

(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))

(define (^len bcom)
  (lambda (v)
    (on (all-of (&amp;lt;- client1 (first v))(&amp;lt;- client2 (second v)))
        (lambda (res)
          (let ((l (sqrt (fold + 0 res))))
            (format #t &amp;quot;~a\n&amp;quot; l))))))

(define len (with-vat vat0 (spawn ^len)))

(with-vat vat0 ($ len '(3 4)))
&lt;/pre&gt;


&lt;h2&gt;Promises, promises!&lt;/h2&gt;

&lt;p&gt;
The previous code prints the length of the vector on screen using the
&lt;code&gt;format&lt;/code&gt; function; one might wish to instead create a function
that returns the value, to assign it to a variable for future treatment,
for instance, or to enter the Guile equivalent of the next &lt;code&gt;for&lt;/code&gt;
loop. This turns out to be surprisingly difficult, or, to be more precise,
impossible. The reason is that
&lt;code&gt;on&lt;/code&gt; handles the promise by calling the function in its body
with the return value of the promise, but does not itself return the result
of this evaluation, as I would have expected.
(&lt;a href=&quot;https://www.gnu.org/software/guile/manual/html_node/Delayed-Evaluation.html#index-promises&quot;&gt;Promises&lt;/a&gt;
in Guile itself, created with &lt;code&gt;delay&lt;/code&gt;, behave in this expected
way when using &lt;code&gt;force&lt;/code&gt;.)
It is possible to obtain a return value for &lt;code&gt;on&lt;/code&gt;, but this will
again be a promise and not a “real” value — once a promise, always a
promise!
&lt;/p&gt;
&lt;p&gt;
So instead of passing around values, one quickly ends up passing around
promises; this takes some getting used to, and entails an additional layer
of wrapping everything into &lt;code&gt;on&lt;/code&gt; and a function instead of
just evaluating the body of the function. As far as I understand, to
obtain any tangible result, one eventually needs to print it on screen
or into a file. The following example illustrates how to use the
&lt;code&gt;#:promise? #t&lt;/code&gt; keyword parameter with &lt;code&gt;on&lt;/code&gt; to
ensure that it returns a promise, and how to lug this promise around to
continue computations with its encapsulated value, while never leaving the
realm of promises until eventually printing a result. It moves the
computation of the square root out of the &lt;code&gt;^len&lt;/code&gt; actor.
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners))

(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))

(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))

(define (^norm bcom)
  (lambda (v)
    (on (all-of (&amp;lt;- client1 (first v))(&amp;lt;- client2 (second v)))
        (lambda (res)
          (fold + 0 res))
        #:promise? #t)))

(define norm (with-vat vat0 (spawn ^norm)))

(with-vat vat0
  (define n ($ norm '(3 4)))
  (define l (on n (lambda (x) (sqrt x)) #:promise? #t))
  (on l (lambda (x) (format #t &amp;quot;~a\n&amp;quot; x))))
&lt;/pre&gt;
&lt;p&gt;
So here, &lt;code&gt;n&lt;/code&gt; and &lt;code&gt;l&lt;/code&gt; are promises, and fulfilment
occurs only in the last &lt;code&gt;on&lt;/code&gt; that prints a result.
&lt;/p&gt;
&lt;p&gt;
Now that we have understood how things work, it is useful to introduce the
&lt;code&gt;let*-on&lt;/code&gt;
&lt;a href=&quot;https://spritely.institute/files/docs/guile-goblins/0.13.0/Let_002dOn.html&quot;&gt;syntactic
sugar&lt;/a&gt;, which lets us end up with the following code:
&lt;/p&gt;
&lt;pre&gt;
(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners)
             (goblins actor-lib let-on))

(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))

(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))

(define (^norm bcom)
  (lambda (v)
    (on (all-of (&amp;lt;- client1 (first v))(&amp;lt;- client2 (second v)))
        (lambda (res)
          (fold + 0 res))
        #:promise? #t)))

(define norm (with-vat vat0 (spawn ^norm)))

(with-vat vat0
  (let*-on ((n ($ norm '(3 4)))
            (l (sqrt n)))
    (format #t &amp;quot;~a\n&amp;quot; l)))
&lt;/pre&gt;
&lt;p&gt;
This looks exactly like normal &lt;code&gt;let*&lt;/code&gt; syntax in Guile!
So in the end, we arrive at a program which looks as if it handled normal
values, with all promises swept under the rug.
&lt;/p&gt;

&lt;h3&gt;Acknowledgements&lt;/h3&gt;
&lt;p&gt;
I thank Jessica Tallon and David Thompson for their kind help with
understanding the concept of promises covered in the previous section.
&lt;/p&gt;

&lt;p&gt;
This first part has dealt with basic programming concepts in Goblins.
In the end, all our code still runs in a single script, so we have
taken a twisted path to write essentially serial code, but in doing so,
we have laid the groundwork for
&lt;a href=&quot;goblins-2.html&quot;&gt;true parallelisation&lt;/a&gt;
with Goblins.
&lt;/p&gt;

&lt;/div&gt;</content></entry></feed>