Goblins for number theory, part 1

Starting with Goblins

Motivation

Most of my code in algorithmic number theory is written in C and runs in a parallelised version using MPI on a cluster. The C language is mandatory for efficiency reasons; MPI is mostly a convenience. Indeed number theoretic code is often embarrassingly parallel. For instance my ECPP implementation for primality proving essentially consists of a number of for loops running one after the other. A server process distributes evaluations of the function inside the loop to the available clients, which take a few seconds or even minutes to report back their respective results. These are then handled by the server before entering the next loop. In a cluster environment with a shared file system, this could even be realised by starting a number of clients over SSH, starting a server, and then using numbered files to exchange function arguments and results, or touch on files to send “signals” between the server and the clients. Computation time is the bottleneck, communication is minimal, so even doing this over NFS is perfectly feasible (and I have written and deployed such code in the past). MPI then provides a convenience layer that makes the process look more professional, and also integrates more smoothly with the batch submission approach of computation clusters.

The very loosely coupled nature of number theoretic computations should make it possible to distribute them beyond a cluster. Why not even do a primality proof with several participants working together over the Internet? I have looked at BOINC previously; but the system seems to be intended for completely uncoupled problems, essentially exploration of a large search space. The work is cut up into a number of independent tasks that are sent out to the participants; if they do not report back in a few days, the same task is sent out again, and over several months all tasks are treated. While number theoretic computations may also take a few months, they do require at least some synchronisation, and the server needs to hear back from the clients every few minutes so as not to be blocked. (Each for loop is embarrassingly parallel, but several loops must be run sequentially.) Also the “administrative” overhead of things to be done for BOINC outside the program itself looks rather daunting: setting up a database, for instance. So I have been looking for a programming environment that is somewhere between MPI and BOINC, making loosely coupled computations possible; it should be able to run over the Internet and not only on a cluster connected by SSH; and it should result in code that is relatively easy to write, and for which just as with MPI the parallel version does not look very different from the sequential one. (In my C code, I usually end up having everything in one file, with the parallel and the sequential versions being handled by alternating blocks selected by #ifdef.)

Goblins is a distributed programming environment by the Spritely Institute that seems to fit the bill. It is meant for distributed programming, and locally running code can seamlessly be run over networks using various mechanisms such as Tor or simply TCP and TLS. On the other hand, not only does it use puzzling vocabulary, but also puzzling concepts, such as object “capabilities”, actor “model” and “vats”. For someone coming from imperative programming and good old C, with a penchant for assembly, looking at Goblins can feel like reading Heidegger. Fortunately Goblins comes with an excellent tutorial, which clarifies the seemingly exotic concepts by a hands-on approach with concrete examples. These put the emphasis, however, on the distribution and communication layer; tangible results are essentially obtained as side effects of printing values on screen. While I think it is an excellent idea to teach programming without mathematics to lower the barrier for people who do not like mathematics, I had the opposite problem: As a mathematician wanting to do computations, it was not immediately clear to me how to have the server send computational tasks to the clients and recover the results. Goblins is a library (or collection of “modules”) for Guile (or Racket), two Scheme dialects; from what I understand, it is unlikely that a C implementation will be available any time soon.

So the way in which I see Goblins being useful to distributed number theory computations is as follows:

  1. Use Guile and Goblins to express the high-level control flow of the program, and additionally Goblins to express its parallelised and distributed aspects.
  2. Use functions from a C library to do the heavy lifting, for instance functions from CM to do the computationally intensive tasks related to primality proving, which can be called from Guile using the Foreign Function Interface or FFI. In a number theoretic context, function arguments and values are often arbitrarily long integers coming from the GMP library. Given that Guile itself is written in C and relies on GMP for its implementation of integers, one can be hopeful that this should not pose too many problems.

The prototypical example

As a running example, I would like to treat the computation of the euclidean length of a vector; starting simple and enhancing the example step by step in the following tutorial, which I am making up while I am trying to solve the problem for myself.

Let us begin with a fixed vector of small, fixed size, which would look like the following in C:

int square (int x) {
   return x*x;
}

int v [] = {3, 4};
double len;

len = 0.0;
for (int i = 0; i < sizeof (v) / sizeof (int); i++)
   len += square (v [i]);
len = sqrt (len);

Granted, this it not number theory, but it fits the situation described in the motivational section above: There is a for loop going through the vector calling independently for each entry the square function, which stands for a function that would be expensive to compute, should be distributed to the clients and loaded from a C library; the additions and the square root, the + and sqrt functions, stand for cheap post-treatment done at the server level.

The following Guile code captures this sequential computation with the map and fold idiom in appreciable compactness:

(use-modules (srfi srfi-1))
(define (square x) (* x x))
(define v '(3 4))
(sqrt (fold + 0 (map square v)))

Open a Guile REPL using the guile command. Then copy-paste this code into the REPL; or save it as a file euclid.scm and type (load "euclid.scm") in the REPL; enjoy the Pythagorean result!

Before continuing, please go first through Chapters 1 to 4 of the Goblins documentation and tutorial.

The next step is to “locally distribute” the computation, that is, to create two clients and a server for the different steps of the computation; for the time being, these will all live in the same local Guile REPL. What is called a “process” in MPI corresponds to a “vat” in Goblins; so we create vat0 for the server and vat1 and vat2 for the clients:

(use-modules (goblins))
(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))

Functions in an MPI instrumented parallel program correspond to “actors” living in a vat; we first define a type of actor computing squares:

(define (^square bcom)
  (lambda (x)
    (* x x)))

To distinguish it from the square function above, we prepend a ^ to its name; it takes a formal parameter called bcom that we need not worry about.

Then we populate the client vats with a square actor each and keep references to the different actors under different global names:

(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))

Now we can create a function in the server vat which computes the length of a 2-dimensional vector by calls to the client actors using <-. For this to work, we will need to wait for the clients to finish their computations (or, in Goblins parlance, for their “promises” to be “fulfilled”); this is done using on for each call to a client actor. The “Goblins standard library”, described in Chapter 6 of the documentation, comes in handy here; in particular we can use a joiner to wait for several actors at the same time.

(use-modules (srfi srfi-1)
             (goblins actor-lib joiners))
(define (^len bcom)
  (lambda (v)
    (on (all-of (<- client1 (first v))(<- client2 (second v)))
        (lambda (res)
          (let ((l (sqrt (fold + 0 res))))
            (format #t "~a\n" l))))))
(define len (with-vat vat0 (spawn ^len)))

The len actor can now be called as follows, which will print the euclidean length of a vector on screen; from within the vat where the actor resides, we may use $ instead of <-, which behaves like a normal function call:

(with-vat vat0 ($ len '(3 4)))

Putting this all together for convenient copy-pasting, here is the complete code:

(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners))

(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))

(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))

(define (^len bcom)
  (lambda (v)
    (on (all-of (<- client1 (first v))(<- client2 (second v)))
        (lambda (res)
          (let ((l (sqrt (fold + 0 res))))
            (format #t "~a\n" l))))))

(define len (with-vat vat0 (spawn ^len)))

(with-vat vat0 ($ len '(3 4)))

Promises, promises!

The previous code prints the length of the vector on screen using the format function; one might wish to instead create a function that returns the value, to assign it to a variable for future treatment, for instance, or to enter the Guile equivalent of the next for loop. This turns out to be surprisingly difficult, or, to be more precise, impossible. The reason is that on handles the promise by calling the function in its body with the return value of the promise, but does not itself return the result of this evaluation, as I would have expected. (Promises in Guile itself, created with delay, behave in this expected way when using force.) It is possible to obtain a return value for on, but this will again be a promise and not a “real” value — once a promise, always a promise!

So instead of passing around values, one quickly ends up passing around promises; this requires to get used to, and entails an additional layer of wrapping everything into on and a function instead of just evaluating the body of the function. As far as I understand, to obtain any tangible result, one eventually needs to print it on screen or into a file. The following example illustrates how to use the #:promise? #t keyword parameter with on to ensure that it returns a promise, and how to lug this promise around to continue computations with its encapsulated value, while never leaving the realm of promises until eventually printing a result. It moves the computation of the square root out of the ^len actor.

(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners))

(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))

(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))

(define (^norm bcom)
  (lambda (v)
    (on (all-of (<- client1 (first v))(<- client2 (second v)))
        (lambda (res)
          (fold + 0 res))
        #:promise? #t)))

(define norm (with-vat vat0 (spawn ^norm)))

(with-vat vat0
  (define n ($ norm '(3 4)))
  (define l (on n (lambda (x) (sqrt x)) #:promise? #t))
  (on l (lambda (x) (format #t "~a\n" x))))

So here, n and l are promises, and fulfillment occurs only in the last on that prints a result.

Now that we have understood how things work, it is useful to introduce the let*-on syntactic sugar, which lets us end up with the following code:

(use-modules (srfi srfi-1)
             (goblins)
             (goblins actor-lib joiners)
             (goblins actor-lib let-on))

(define vat0 (spawn-vat))
(define vat1 (spawn-vat))
(define vat2 (spawn-vat))

(define (^square bcom)
  (lambda (x)
    (* x x)))

(define client1
  (with-vat vat1 (spawn ^square)))
(define client2
  (with-vat vat2 (spawn ^square)))

(define (^norm bcom)
  (lambda (v)
    (on (all-of (<- client1 (first v))(<- client2 (second v)))
        (lambda (res)
          (fold + 0 res))
        #:promise? #t)))

(define norm (with-vat vat0 (spawn ^norm)))

(with-vat vat0
  (let*-on ((n ($ norm '(3 4)))
            (l (sqrt n)))
    (format #t "~a\n" l)))

This looks exactly like normal let* syntax in Guile! So in the end, we arrive at a program which looks as if it handled normal values, with all promises swept under the rug.

Acknowledgements

I thank Jessica Tallon and David Thompson for their kind help with understanding the concept of promises covered in the previous section.

This first part has dealt with basic programming concepts in Goblins. In the end, all our code still runs in a single script, so we have taken a twisted path to write essentially serial code, but in doing so, we have laid the groundwork for true parallelisation with Goblins.