Erlang/OTP’s global module

This page is a mirrored copy of an article originally posted on the (now sadly defunct) LShift blog; see the archive index here.

Fri, 13 February 2009

Erlang/OTP’s global module provides atomic assignment of names to processes in a distributed Erlang cluster. It makes sure that only a single process at a time holds any given name, across all connected nodes. Unlike local registration with register/2, where names must be atoms, names registered with global can be any term at all.
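As a minimal sketch of the "any term" point, the module below registers the calling process under a tuple, looks it up, and releases the name again. The module name global_term_name and the name {worker, 42} are made up for illustration; the global calls are the standard register_name/2, whereis_name/1 and unregister_name/1.

```erlang
-module(global_term_name).
-export([demo/0]).

%% Register the calling process under a tuple name, look the name up,
%% and then release it again. Crashes with badmatch if any step fails.
demo() ->
    Name = {worker, 42},                      % hypothetical name; any term is allowed
    Self = self(),
    yes = global:register_name(Name, Self),   % yes means the name is now ours
    Self = global:whereis_name(Name),         % lookup returns the registered pid
    _ = global:unregister_name(Name),
    undefined = global:whereis_name(Name),    % the name is free again
    ok.
```

Calling global_term_name:demo() from a shell should return ok if the name was not already taken.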

To see global’s conflict resolution in action, we need to register a name on two nodes that are not initially connected, and then make them aware of each other. The system will pick one registration to survive, and will terminate the process behind the other.

First, register the name “a” on each of two nodes (started with erl -sname one and erl -sname two, respectively). On node one:

Eshell V5.6.2  (abort with ^G)
(one@walk)1> global:register_name(a, self()).
yes
(one@walk)2> global:whereis_name(a).
<0.37.0>

We see that the name was registered successfully (the call to register_name returned yes), and that when looked up, a pid (the pid of the shell process) is returned, as we would expect. Now, the same on node two:

Eshell V5.6.2  (abort with ^G)
(two@walk)1> global:register_name(a, self()).
yes
(two@walk)2> global:whereis_name(a).
<0.37.0>

Again, we see it succeeding. Note that each node has successfully registered the “global” name “a”. This is because they are unaware of each other. Once they’re connected, Erlang/OTP will automatically resolve the situation. By default, it does this by terminating one of the two contending processes.
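The default behaviour comes from global:random_exit_name/3, which is the resolver used by register_name/2. A different resolver can be supplied through register_name/3: it is called with the name and the two clashing pids, and returns the pid that should keep the name (or none). The sketch below is one possible alternative, not part of the original walkthrough: it keeps the pid on the node whose name sorts first and sends the loser a message instead of killing it. The module name, function names and the lost_global_name message are all hypothetical.

```erlang
-module(global_resolve_demo).
-export([register_by_node/2, by_node_resolve/3]).

%% Register Pid under Name, using by_node_resolve/3 if a clash is
%% discovered later. An external fun is used so that the resolver
%% survives code upgrades.
register_by_node(Name, Pid) ->
    global:register_name(Name, Pid, fun ?MODULE:by_node_resolve/3).

%% Hypothetical resolver: the pid on the node whose name sorts first
%% keeps the registration; the other pid is notified rather than killed.
by_node_resolve(Name, Pid1, Pid2) ->
    {Winner, Loser} =
        case node(Pid1) =< node(Pid2) of
            true  -> {Pid1, Pid2};
            false -> {Pid2, Pid1}
        end,
    Loser ! {lost_global_name, Name, Winner},   % made-up message format
    Winner.
```

The global module also ships global:random_notify_name/3 and global:notify_all_name/3 for cases where the contending processes should be told about the conflict rather than terminated.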

Let’s see what happens. Connect the two nodes by pinging one from the other. Here, we ping node two from node one:

(one@walk)3> net_adm:ping(two@walk).
pong
(one@walk)4> 
=INFO REPORT==== 13-Feb-2009::03:05:22 ===
global: Name conflict terminating {a,<5744.37.0>}

(one@walk)4> global:whereis_name(a).
<0.37.0>
(one@walk)5> 

The termination of one of the contenders is reported with a message in the system log. The pid in the report, <5744.37.0>, is a remote pid as seen from node one: it was the registration on node two that was terminated, and the registration on node one that survived. Here’s what we see on node two:

** exception error: killed
(two@walk)3> global:whereis_name(a).
<5768.37.0>
(two@walk)4> node(global:whereis_name(a)).   
one@walk

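Node two’s registered process has been killed. Because that process was the shell itself, the shell reports the exception and carries on with a fresh process. When we then ask about the registration for the name “a”, we see a pid from node one.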

Finally, we’ll try registering the name for a second time:

(two@walk)5> global:register_name(a, self()).
no
(two@walk)6> 

The call returns no because the name is already registered, and node two now knows about that registration. The same no answer would have been returned if we’d tried the same thing on node one instead.
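Code that registers a global name usually has to deal with both answers. The following is a rough sketch of one way to do that, assuming the caller wants to know who holds the name when registration is refused. The module global_claim, the function claim/1 and the returned tuples are hypothetical names chosen for this example.

```erlang
-module(global_claim).
-export([claim/1]).

%% Try to register the calling process under Name. If the name is
%% already taken, report the current holder instead of failing.
claim(Name) ->
    case global:register_name(Name, self()) of
        yes ->
            {ok, self()};
        no ->
            case global:whereis_name(Name) of
                undefined ->
                    %% The holder went away between the two calls.
                    {error, name_vanished};
                Pid ->
                    {taken, Pid}
            end
    end.
```

On node two in the session above, global_claim:claim(a) would be expected to return {taken, Pid}, with Pid living on node one.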