:tocdepth: 2 .. _protocol-design: ************************** Protocol design ************************** .. index:: leader election protocol example In this chapter we will explain how to specify and analyze a distributed protocol using Alloy. We will use a very simple example of a leader election protocol. The aim of a leader election protocol is to designate one node in a network as the *organizer* or *leader* of some distributed task. Nodes are initially unaware of which node will play that role, but after running the protocol all nodes should recognize the same node as leader. Many leader election protocols exist: here we will use a well-known version that can be used to elect a leader in a network of nodes organized in a ring, and where nodes also have unique comparable identifiers. This protocol was proposed by Chang and Roberts [CACM79]_ and roughly operates as follows: the goal is to elect as leader the node with the highest identifier; each node starts by sending its own identifier to its successor in the ring; upon receiving an identifier, a node only forwards it if it is higher than its own; a node receiving back its own identifier designates itself as leader; finally the leader broadcasts its newly found role to the network. In our specification we will start by omitting this second last phase of the protocol and will only be concerned with the fundamental property of a leader election protocol: eventually exactly one node will designate itself as the leader. Specifying the network configuration ==================================== .. index:: configuration When designing a distributed protocol, we should start by specifying the network configurations in which it is supposed to operate. In this case, the protocol operates in a ring network of nodes with unique identifiers. Notice that many network configurations satisfy this requirement: we could have rings of different size, and for the same size, different assignments of the unique identifiers to the nodes (the relevant fact here is the relative order in which the identifiers appear in the network). The verification of the expected properties of the protocol should take into account all possible different configurations (up to the specified bound on the ring size). The network configuration, although arbitrary, does not change during the execution of the protocol. As such, it will be specified with immutable signatures and fields. Unlike some formal specification languages, Alloy has no pre-defined notion of *process* or *node*. These have to be explicitly specified, by declaring a signature. Since the network of nodes forms a ring, a binary relation that associates each node with its immediate successor in the ring should also be declared. .. code-block:: sig Node { succ : one Node } Recall that binary relations should be declared as fields inside the domain signature. In the declaration of field :alloy:`succ` the multiplicity :alloy:`one` is used to ensure that each node has exactly one successor in the ring. As usual in Alloy, we can immediately start validating our specification by inspecting possible instances generated using a :alloy:`run` command. .. code-block:: run example {} By iterating over the returned instances with :guilabel:`New` we could get the following instance, which is of course not a valid ring. .. image:: instance1.png :width: 400 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_1/leaderelection.als#L9-L14 :alt: Get the code to generate this instance. .. index:: ^ (transitive closure) transitive closure operator In addition to requiring each node to have exactly one successor, a simple way to ensure that a network forms a ring is to require that every node is reachable from any other node. Given a node :alloy:`n`, to determine the set of nodes reachable from :alloy:`n` via relation :alloy:`succ` we can use expression :alloy:`n.^succ`. This expression makes use of the composition (:alloy:`.`) and transitive closure (:alloy:`^`) operators already presented in the :ref:`structural-modeling` chapter. The constraint that :alloy:`succ` forms a ring can thus be specified in a fact as follows. .. code-block:: fact ring { // succ forms a ring all n, m : Node | m in n.^succ } An alternative specification of fact `ring` is to require that the set of all nodes (denoted by :alloy:`Node`) is reachable from every node. .. code-block:: fact ring { // succ forms a ring all n : Node | Node in n.^succ } If we now iterate over all instances returned by command :alloy:`example` we will get the 3 possible rings with up to 3 nodes and a configuration without any node, which is also allowed by our specification. The limit of 3 is due to the implicit command scope of 3 atoms per top-level signature. As an example we show the ring with 3 nodes. .. image:: instance2.png :width: 400 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_2/leaderelection.als#L12-L19 :alt: Get the code to generate this instance. To enforce that we have at least one node that can be elected we can add the following fact. .. code-block:: fact some_node { // at least one node some Node } .. index:: util/ordering module Besides being organized in a ring, nodes must also have unique identifiers. Since we are going to need to compare identifiers, one possibility would be to just use integers to represent them. However, besides being totally ordered, integers carry additional semantics that is not necessary here. In particular, integers support arithmetic operations that will not be required in this example. When developing a formal specification it is good practice to be as abstract as possible and use the simplest structure that fulfills our needs. Also, the analysis of Alloy specifications with integers has some subtleties which makes it better to avoid, unless strictly necessary. As such, we will introduce a signature :alloy:`Id`, to denote identifiers, and declare that this signature is totally ordered using a call to module `util/ordering` at the top of the file. Doing so implicitly extends the model with a binary relation :alloy:`next` that given an identifier will yield the next identifier in the total order, two singleton signatures to denote the :alloy:`first` and :alloy:`last` identifiers of the order, and a binary predicate `gt` to compare if the first argument is greater than the second. All identifiers but one (the `last`) have a :alloy:`next`. .. code-block:: open util/ordering[Id] sig Id {} Having our identifiers totally ordered, we can now declare a field :alloy:`id` inside signature :alloy:`Node`, to associate each node with its identifier. .. code-block:: sig Node { succ : one Node, id : one Id } .. index:: relation; injective To ensure that identifiers are unique it suffices to require :alloy:`id` to be injective, which can be done as follows. .. code-block:: fact unique_ids { // ids are unique all i : Id | lone id.i } If we can now create a new :alloy:`example3` command to ask for all configurations with exactly 3 nodes and 3 identifiers, as follows. .. code-block:: run example3 {} for exactly 3 Node, exactly 3 Id .. index:: symmetry breaking instance; exploration New button; in visualizer Setting an exact scope over identifiers is actually redundant because `util/ordering` always imposes an exact scope. If we iterate over all instances with :guilabel:`New`, we will get the two only (truly) different configurations with this scope: one where the identifiers increase as we move forward in the ring, and another where they decrease. The latter is depicted below (notice the `next`, `prev`, `first` and `last` elements introduced by `util/ordering`). This illustrates the power of Alloy's symmetry breaking, which in this case prevents the return of any other instance (that necessarily would be isomorphic to one of these two). .. image:: instance3.png :width: 400 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_3/leaderelection.als#L30-L38 :alt: Get the code to generate this instance. .. card-carousel:: 2 .. card:: Full model for the section :link: https://github.com/practicalalloy/models/tree/main/protocol-design/specifying-the-network-configuration :octicon:`file-badge` Alloy model ^^^ Download and explore the files relevant for the model at this point of the book. .. card:: The predefined :alloy:`ordering` module :link-type: ref :link: ordering :octicon:`beaker` Further reading ^^^ Learn more about the predefined module that we have used to impose a total order on the identifiers. .. card:: Working with integers :link-type: ref :link: integers :octicon:`beaker` Further reading ^^^ Learn what are the caveats of using integers and how to properly address them in Alloy. .. _specifying-protocol-dynamics: Specifying the protocol dynamics ================================ .. index:: field; declaration field; mutable var keyword The coordination between the different nodes in this distributed protocol, like in many others, is achieved by means of message passing. Again, likewise for nodes, Alloy has no special support for message passing, so all the relevant concepts have to be specified from scratch. In this phase of the protocol messages are just node identifiers, so there is no need to add a signature to model those, and we can use signature `Id` directly to represent messages. To capture the incoming and outgoing messages we can declare mutable binary fields inside signature :alloy:`Node`, to associate each node with the set of identifiers it has received or that need to be sent, respectively. Using sets for the mailboxes is an abstraction that allows us to verify that the protocol is correct even if messages are transmitted out of order. In this protocol the same message is never sent twice by each node, so we also don't need to account for multiple copies of the same message. As we have seen in the :ref:`behavioral-modeling` chapter, to declare a mutable field the keyword :alloy:`var` should be used. .. code-block:: sig Node { succ : one Node, id : one Id, var inbox : set Id, var outbox : set Id } .. index:: signature; subset signature; mutable We also need to distinguish the nodes that have been elected as leaders. To do so we can declare a mutable signature :alloy:`Elected` that is a subset of :alloy:`Node`. .. code-block:: var sig Elected in Node {} Note that, on purpose, we did not impose a multiplicity `lone` in this declaration, because that constraint, that at most one node is elected leader, is one of the key assertions we will want to verify to be a consequence of the protocol dynamics, and thus should not be imposed as an assumption upfront. .. index:: trace; visualization If we now run the :alloy:`example3` command we might get the following instance. .. image:: instance4.png :width: 700 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_4/leaderelection.als#L34-L45 :alt: Get the code to generate this instance. As we have seen in the :ref:`behavioral-modeling` chapter, the interface of the visualizer is different when we have mutable elements: it now shows two states of a trace side-by-side, there are several new instance exploration buttons in the toolbar, and a depiction of a path just below those. In this case we have a trace where the same state 0 repeats itself indefinitely, something that is possible since so far we added no restrictions that constrain the behavior of the protocol. We recall that in the depiction of the states, by default, immutable sets and fields (the configuration of the protocol) are shown with solid lines, while mutable ones are shown with dashed lines. To simplify the visualization, we can configure the theme so that node identifiers, the inbox, and the outbox are shown as node attributes, and change the color of elected nodes to green. Since the elected nodes can now be easily distinguished by color we can also hide the respective textual label. As for the totally ordered identifiers, we will just show the :alloy:`next` relation. The result of this customization will be the following. .. image:: instance5.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_5/leaderelection.als#L34-L45 :alt: Get the code to generate this instance. Something that is clearly wrong in this execution is that in the initial state nodes already have identifiers in their inboxes and outboxes, and some of them are already elected. To restrict the initial value of these mutable fields we can add the following fact. .. code-block:: fact init { // initially inbox and outbox are empty no inbox and no outbox // initially there are no elected nodes no Elected } Since this fact has no temporal connectives it applies to the initial state of the system. Rerunning the :alloy:`example3` command now yields instances such as the following, where the systems remains indefinitely in a valid initial state. .. image:: instance6.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_6_7/leaderelection.als#L41-L52 :alt: Get the code to generate this instance. .. index:: instance; exploration We can explore additional instances by using the instance exploration buttons in the toolbar. For example, the :guilabel:`New Fork` button asks the Analyzer for a different trace with the same behavior up to the left-hand side state, but a different outcome of the current transition (a different right-hand side state). By pressing this button we may get the following. .. image:: instance7.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_6_7/leaderelection.als#L54-L70 :alt: Get the code to generate this instance. This trace displays a behavior that should not be allowed in a correct execution of the protocol: in the second state all of a sudden several identifiers appeared in the inbox and outbox of :alloy:`Node1`. Moreover, this node was also randomly designated as leader. So far this is allowed because we added no constraints to our specification that restrict how the system can transition from one state to another. If there are no constraints then any transition is possible. .. index:: event event; guard event; effect As already explained in chapter :ref:`behavioral-modeling`, the simplest way to constrain the valid transitions is to consider all the possible events that can occur in the system, and for each event specify when it can occur (its *guard*) and what is its *effect* on the mutable relations. In our protocol we can distinguish three possible events: * a node *initiates* the execution by sending its own identifier to the next node; * a node reads and *processes* an identifier in its inbox, decides whether it should be propagated or discarded and, if it is its own identifier, elects itself as leader; * the network *sends* one message in the outbox of a node to the inbox of the next one. Alloy has no special keywords to declare events. A common approach is to declare a predicate for each event, that specifies its guard and effect, and add a constraint enforcing that at each possible state only one of those predicates can hold. Event predicates can also be parametrized, and in our specification they will be parametrized the node where the event occurs. .. index:: event; frame condition ' (prime) prime operator Let's begin by specifying the *initiate* event. For each event we should always start by specifying what is its guard, a constraint on the current (or pre-) state that determines when can the event occur. To simplify our specification, for the moment, we will assume that a node is free to initiate (or re-initiate) the protocol whenever it wants, so in this case there will be no guard, meaning the event is always possible. Then we should specify the effect of the event on all mutable relations of the specification: note that if nothing is said about a particular mutable relation then its value can change freely when the event occurs. A special effect is forcing the value of a mutable relation to not change at all, something known as a *frame condition*. For example in this event we will have two frame conditions, since it does not change the value of :alloy:`inbox` nor :alloy:`Elected`. To specify that a relation does not change with a logic formula we need to refer to its value in the next (or post-) state. In Alloy the value of a relation (or relational expression) in the next state is accessed by appending a prime. So, to state that, for example, the relation :alloy:`inbox` does not change we could add the constraint :alloy:`inbox' = inbox`. Likewise for :alloy:`Elected`. So far, the specification of event initiate looks as follows. .. code-block:: pred initiate_no_effect [n : Node] { // node n initiates the protocol inbox' = inbox // frame condition on inbox Elected' = Elected // frame condition on Elected } We now need to specify the effect on relation :alloy:`outbox`. We can start by specifying its effect on the outbox of the initiating node. Expression :alloy:`n.outbox` denotes the set of identifiers in the :alloy:`outbox` of node :alloy:`n` in the current state. The effect of this event on this set is to add :alloy:`n`'s own identifier, which can be specified by :alloy:`n.outbox' = n.outbox + n.id`. Here we used the set union operator :alloy:`+` to add the singleton set :alloy:`n.id` to :alloy:`n.outbox`. Note that requiring :alloy:`n.id` to be present in the outbox in the next state, which could be specified by :alloy:`n.id in n.outbox'`, is not sufficient, as it would allow identifiers other than :alloy:`n.id` to be freely removed or added from :alloy:`n.outbox`. Having specified the effect on the :alloy:`outbox` of `n` we now need to specify the effect on the :alloy:`outbox` of all other nodes. Again, if nothing is said about those, they will be allowed to vary freely. Of course this event should not modify the outboxes of other nodes, something we could specify with constraint :alloy:`all m : Node - n | m.outbox' = m.outbox`, that quantifies over all nodes except the initiating one. The final specification of our event looks as follows. .. code-block:: pred initiate_no_guard [n : Node] { // node n initiates the protocol n.outbox' = n.outbox + n.id // effect on n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes inbox' = inbox // frame condition on inbox Elected' = Elected // frame condition on Elected } .. index:: always keyword To validate our first event we can impose a restriction stating that, for the moment, this is the only event that can occur, and inspect the possible instances to see if the system behaves as expected. To specify such a restriction we need the temporal logic operator :alloy:`always` to enforce a formula to be true at all possible states. In particular, to enforce that, at every possible state, one of the nodes initiates the protocol, the following fact can be added to the specification. .. code-block:: fact always_initiate { // possible events always (some n : Node | initiate_no_guard[n]) } Running command :alloy:`example3` yields the following trace, where :alloy:`Node2` initiates the protocol and then nothing else happens. .. image:: instance8.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_8/leaderelection.als#L63-L79 :alt: Get the code to generate this instance. .. index:: stuttering On first glance, this may seem like an invalid trace, given that we required that at every state one node initiates the protocol. However, if we look closely at our specification of event *initiate* we can see that it also holds true in a state where a node already has its own identifier in the outbox, which will remain in the outbox. Our specification did not had a guard forbidding this occurrence of the event, and the effect :alloy:`n.outbox' = n.outbox + n.id` is also true if :alloy:`n.id` is already in :alloy:`n.outbox` and in :alloy:`n.outbox'`. To address this issue we could add a guard like :alloy:`n.id not in n.outbox` to predicate :alloy:`initiate_no_guard`. However, this only forbids a node from initiating the protocol if it's own identifier is in its outbox at that particular moment. This means that, once we include an event to send messages and this identifier leaves the outbox, a node will be able to re-initiate the protocol. In principle, this does not pose any problem concerning the correctness of this particular protocol, but if we really wanted each node to initiate the protocol only once we would need to strengthen this guard. One possibility would be to add a field to record at which phase of the protocol execution each node is in, and allow this event only when a node is in an "uninitiated" phase. This would be the common approach in most formal specification languages. However, with Alloy we have a more direct alternative. Since there is no special language to specify events and they are specified by arbitrary logic formulas, we are free to use temporal operators to specify guards or effects. By contrast, in most formal specification languages we can only specify the relation between the pre- and post-state. .. index:: historically keyword In this case we could, for example, use the temporal operator :alloy:`historically`, that checks if something was always true in the past up to the current state, to only allow *initiate* to occur if the node's identifier was never in its outbox. The specification of our event would look as follows. .. code-block:: pred initiate [n : Node] { // node n initiates the protocol historically n.id not in n.outbox // guard n.outbox' = n.outbox + n.id // effect on n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes inbox' = inbox // frame condition on inbox Elected' = Elected // frame condition on Elected } .. index:: stuttering Unfortunately, after adding this guard our command will return no trace, meaning there are no behaviors that satisfy our constraints. This happens because we are currently not allowing messages to be sent or read, so after all nodes have initiated the protocol nothing else is allowed to happen by our specification, and it is impossible to obtain a valid trace. Recall that traces are infinite sequences of states. A simple way to solve this issue is to add an event allowing the system to *stutter*, that is to do nothing (keeping the value of all relations unchanged). Stuttering can also be understood as an event specifying something else that is occurring on the environment outside the scope of our specification, and in general it is a good idea to add such stuttering events. In this case, a predicate to specify a stuttering event can be specified as follows. .. code-block:: pred stutter { // no node acts outbox' = outbox inbox' = inbox Elected' = Elected } To allow this event we need to change the above fact that restricts the possible behaviors. .. code-block:: fact stutter_or_initiate { // possible events always (stutter or some n : Node | initiate[n]) } .. index:: instance; exploration New Fork button → button Running command :alloy:`example3` returns a trace where nothing happens in our system, something that is now allowed. If we press :guilabel:`New Fork` we might get a trace such as the previously depicted, where :alloy:`Node2` initiates the protocol and then nothing else happens. If we move to the next state by pressing :guilabel:`→`, and fork again with :guilabel:`New Fork`, we might get the following trace where :alloy:`Node0` initiates the protocol after :alloy:`Node2`. .. image:: instance9.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_9/leaderelection.als#L66-L77 :alt: Get the code to generate this instance. By exploring a bit more the possible traces we can get some confidence that event *initiate* is well specified. Let us now specify the *send* event. In this event we could have as parameters the node :alloy:`n` from where the message will be sent and the message :alloy:`i` that will be sent (an identifier). The guard of this event should require :alloy:`i` to be in the :alloy:`outbox` of :alloy:`n`. The effect will be to remove :alloy:`i` from the :alloy:`outbox` of :alloy:`n`, and add it to the :alloy:`inbox` of the next node :alloy:`n.succ`. All other inboxes and outboxes should keep their value, as well as the :alloy:`Elected` set. The specification of this event would be the following. .. code-block:: pred send [n : Node, i : Id] { // i is sent from node n to its successor i in n.outbox // guard n.outbox' = n.outbox - i // effect on n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes n.succ.inbox' = n.succ.inbox + i // effect on n.succ.inbox all m : Node - n.succ | m.inbox' = m.inbox // effect on the inboxes of other nodes Elected' = Elected // frame condition on Elected } .. index:: else keyword For event *process* message, again we will have as parameters the node :alloy:`n` and the message :alloy:`i` to be read and processed. The guard should require that :alloy:`i` is in the :alloy:`inbox` of :alloy:`n`. The effect on :alloy:`inbox` is obvious: :alloy:`i` should be removed from the :alloy:`inbox` of :alloy:`n` and all other inboxes should keep their value. The effect on the :alloy:`outbox` of :alloy:`n` depends on whether :alloy:`i` is greater than :alloy:`n.id`: if so, :alloy:`i` should be added to the :alloy:`outbox`, to be later propagated along the ring; if not, it should not be propagated, meaning the :alloy:`outbox` of :alloy:`n` will keep its value. To write such conditional outcome we could use the logical operator :alloy:`implies` together with an :alloy:`else`: :alloy:`C implies A else B` is the same as :alloy:`(C and A) or (not C and B)`, but the former is easier to understand. The event has no effect on the outboxes of other nodes, which should all keep their value. The same conditional outcome applies to :alloy:`Elected`: if the received :alloy:`i` is equal to the :alloy:`n`'s own identifier, then :alloy:`n` should be included in :alloy:`Elected`; otherwise :alloy:`Elected` keeps its value. The full specification of :alloy:`process` is as follows. .. code-block:: pred process [n : Node, i : Id] { // i is read and processed by node n i in n.inbox // guard n.inbox' = n.inbox - i // effect on n.inbox all m : Node - n | m.inbox' = m.inbox // effect on the inboxes of other nodes gt[i, n.id] implies n.outbox' = n.outbox + i // effect on n.outbox else n.outbox' = n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes i = n.id implies Elected' = Elected + n // effect on Elected else Elected' = Elected } We should now add these two events to the fact that constraints the valid behaviors of our system. We will also factorize all node events into a separate predicate, to later simplify the specification of fairness conditions. .. code-block:: pred node_acts [n : Node] { initiate[n] or (some i : Id | send[n, i]) or (some i : Id | process[n, i]) } fact events { // possible events always (stutter or some n : Node | node_acts[n]) } .. index:: instance; exploration Before proceeding to the verification of the expected properties of the protocol, we should play around with the different instance exploration buttons to explore different execution scenarios and validate the specification of our events. An alternative would be to ask for specific scenarios directly in a :alloy:`run` command. For example, we could change command :alloy:`example` to ask directly for a scenario where some node will be elected. To do so we need to use the temporal operator :alloy:`eventually`, which checks if a formula is valid at some point in the future (including the present state). Our command would look as follows. .. code-block:: run eventually_elected { eventually some Elected } for exactly 3 Node, exactly 3 Id Running this command will return a trace with 8 states where the highest identifier is passed around until the respective node gets elected. This is the shortest trace where a leader can be elected in a ring with 3 nodes, corresponding to one initiate event followed by 3 interleaved send and process message events. With the bounded model-checking engine that is used by default, the Analyzer guarantees that the shortest traces that satisfy (or refute) a property are returned first. The first transition of this trace is the following, where the node with the highest identifier (:alloy:`Id2`) initiates the protocol. .. image:: instance10.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_10_11/leaderelection.als#L106-L118 :alt: Get the code to generate this instance. In the last transition, that can be focused by pressing :guilabel:`→` 6 times or by directly clicking the state numbered 6 in the trace depiction, this node reads back its own identifier that was passed around in the ring and elects himself as leader. .. image:: instance11.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_10_11/leaderelection.als#L106-L118 :alt: Get the code to generate this instance. .. card-carousel:: 2 .. card:: Full model for the section :link: https://github.com/practicalalloy/models/tree/main/protocol-design/specifying-the-protocol-dynamics :octicon:`file-badge` Alloy model ^^^ Download and explore the files relevant for the model at this point of the book. .. card:: Encoding scenarios :link-type: ref :link: scenarios :octicon:`beaker` Further reading ^^^ Learn how to specify richer :alloy:`run` commands to search for specific scenarios, where a particular sequence of events occurred. .. card:: Event depiction :link-type: ref :link: event-depiction :octicon:`beaker` Further reading ^^^ Learn a simple Alloy idiom that can be used to depict which events occurred in each transition of a trace. Verifying the expected properties ================================================= The key property of our protocol is that exactly one node will become leader. We can break this property into three simpler ones: - there will never be more than one leader; - every leader remains leader; - eventually there will be at least one leader. .. index:: safety liveness fairness These properties are of very different nature: the first two are a *safety* properties, forbidding some (undesired) behavior of the system, while the latter is a *liveness* property, forcing some (desirable) behavior of the system. The analysis of safety properties is usually simpler than the analysis of liveness properties. To find a counter-example for a safety property it suffices to search for a finite sequence of states that leads to a (bad) state (in the case of the first property, one state where there are two or more leaders), and it is irrelevant what happens afterwards, as any continuation of this finite sequence will still be a counter-example. On the other hand, to find a counter-example for a liveness property it is necessary to search for a complete infinite trace where the expected behavior definitely never happened (in the case of the last property, one trace where a leader is never elected). Moreover, it will be necessary to impose additional *fairness* conditions when verifying liveness properties, in particular to forbid unreasonable counter-examples where at some point the system stutters forever and the nodes never execute an enabled event. .. index:: invariant assert keyword check keyword Let us start by verifying the first (safety) property. Recall that properties that are expected to hold can be written in named assertions (declared with keyword :alloy:`assert`) and then verified with :alloy:`check` commands. This safety property is a very simple example of an *invariant*, a property that requires something to be be true in all states of all possible traces. In Alloy, invariants can be specified using the temporal operator :alloy:`always`. In each state, to check that there is at most one leader in set :alloy:`Elected` we could use the keyword :alloy:`lone`. This invariant could thus be specified as follows. .. code-block:: assert at_most_one_leader { always (lone Elected) } To check this assertion we could define the following command. .. code-block:: check at_most_one_leader .. index:: configuration This protocol is known to be correct, so, as expected, running this command returns no counter-example. Recall that the default scope on top-level signatures is 3, so this command actually verifies that the property holds for all rings with up to three nodes. This Alloy ability of model checking a property at once for all possible configurations of a system is not possible in some competing model checkers, where we are forced to specify the specific configuration on which we want to verify a protocol. That is rather cumbersome, since with a reasonable scope there can be many different configurations and its difficult to even enumerate them manually. Also, we could easily miss a bug in a specific configuration. .. index:: model checking; bounded steps keyword scope; definition As mentioned in chapter :ref:`behavioral-modeling`, by default, verification of temporal properties in Alloy is done with a technique known as *bounded model checking*, meaning that the search for counter-examples will only consider a given maximum number of different transitions before the system starts exhibiting a repeated behavior. By default this maximum number is 10. To increase our confidence in the result of the analysis we could, for example, check this property for all rings with up to 4 nodes and consider up to 20 different transitions (or steps). To do so we could change the scope of the command as follows. .. code-block:: check at_most_one_leader for 4 but 20 steps This commands sets the default scope on signatures to 4 but also changes the scope on transitions to 20 using keyword :alloy:`steps`. Again, as expected, no counter-example is returned, but the analysis now takes considerably longer, as there are many more network configurations and states to explore. .. index:: model checking; complete steps keyword scope; definition It is also possible to check a temporal property with *complete model checking* if a compatible solver is selected. This will consider an arbitrary number of transitions, but still with the signatures bounded by a maximum scope. To do so, a special scope on :alloy:`steps` is needed to trigger unbounded temporal analysis. .. code-block:: check at_most_one_leader for 4 but 1.. steps The second expected property is another example of a safety property, this time requiring a nested :alloy:`always` operator to be specified. .. code-block:: assert leader_stays_leader { always (all n : Elected | always n in Elected) } check leader_stays_leader Again this assertion is true, as expected. Note that the universal quantification on `Elected` could not be moved outside the first `always`, because `Elected` is a mutable subset signature, and if used outside any temporal connective it will be evaluated in the initial state, where it is empty. Of course, that is not what we want. Here we check that `Elected` never decreases in content, that is, at every state any node that it contains will remain in `Elected` for ever. Combined with `at_most_one_leader` this ensures that there can never exist two different leaders. The third expected property is a very simple example of a liveness requirement, namely one that requires something to hold at least in one state of every trace. In this case we want to make sure that at least one node will be elected. Simple liveness properties such as this one can many times be specified directly with the temporal connective :alloy:`eventually`. Inside this temporal connective, again a very simple cardinality check on :alloy:`Elected` suffices, this time with keyword :alloy:`some`. .. code-block:: assert at_least_one_leader { eventually (some Elected) } The following command can be used to check this property with bounded model checking and the default scopes. .. code-block:: check at_least_one_leader We recommend that you start by verifying specifications with bounded model checking, because for small scopes on `steps` it is usually much faster than the unbounded counterpart, and most counter-examples tend to require only a handful of transitions. Once you have a design where no counter-example is returned with bounded model checking, you can increase your confidence on the analysis by using unbounded model checking. As hinted above, running this command immediately returns a counter-example where the system stutters forever, and obviously no leader is elected. This counter-example can be replicated in trivial configurations, for example the following with a single node. .. image:: instance12.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_12/leaderelection.als#L123-L134 :alt: Get the code to generate this instance. .. index:: event; enabled This behavior is currently allowed because we added a stutter event that captures events external to our system. At the moment this event can occur indefinitely, which is a bit unrealistic or *unfair* to the system under analysis: in particular, if at some point one node is continuously ready to execute one of its events (the node is continuously *enabled*), then it should eventually do something. For example, in our protocol, a trace where at some point a message is in the inbox of a node but is never processed is unfair to that node. To specify fairness properties that exclude such unreasonable executions, we first need to understand two well-known combinations of the :alloy:`always` and :alloy:`eventually` temporal operators: - :alloy:`eventually always P` holds in a system if all traces reach a state where :alloy:`P` becomes valid indefinitely. - :alloy:`always eventually P` holds in a system if in every state of every trace :alloy:`P` is later valid, which means that :alloy:`P` is true infinitely often. To specify that a node is at some point continuously enabled we can use the first combination. Most of the times the enabled condition for a node is exactly the same as the disjunction of the guards of its events. In our case a node :alloy:`n` is enabled if it has never initiated or has some message in the inbox or in the outbox. To specify that a continuously enabled event eventually occurs (after the point it becomes enabled) the second combination of operators presented above can be used. Note that a continuously enabled event would still be (continuously) enabled after the required occurrence of the event, which means that the event should occur again and again, or infinitely often. So, the desired fairness condition can be specified as follows. .. code-block:: pred initiate_enabled [n : Node] { historically n.id not in n.outbox } pred process_enabled [n : Node, i : Id] { some n.inbox } pred send_enabled [n : Node, i : Id] { some n.outbox } pred node_enabled [n : Node] { initiate_enabled[n] or (some i : Id | process_enabled[n, i]) or (some i : Id | send_enabled[n, i]) } pred fairness { all n : Node { eventually always node_enabled[n] implies always eventually node_acts[n] } } .. index:: fairness fairness; weak fairness; strong Here we specified separately the enabled conditions for each of the events. Technically, this fairness condition is known as *weak fairness*, as a node is only required to act if is continuously enabled. Often this suffices to verify most liveness properties. However, sometimes we may need *strong fairness* constraints, requiring that some event occurs when it becomes recurrently enabled (infinitely often, but not necessarily continuously). If we change the :alloy:`at_least_one_leader` assertion to assume the above fairness condition, no counter-example will be returned even with increased scopes, as expected. .. code-block:: assert at_least_one_leader_fair { fairness implies eventually (some Elected) } check at_least_one_leader_fair .. card-carousel:: 2 .. card:: Full model for the section :link: https://github.com/practicalalloy/models/tree/main/protocol-design/verifying-the-expected-properties :octicon:`file-badge` Alloy model ^^^ Download and explore the files relevant for the model at this point of the book. .. card:: Fairness and Liveness :link-type: ref :link: fairness :octicon:`beaker` Further reading ^^^ Learn how to specify different fairness conditions and the impact they have on the analysis. .. card:: A temporal logic primer :link-type: ref :link: temporal-logic :octicon:`beaker` Further reading ^^^ Read about the full syntax and semantics of the temporal logic supported in Alloy. Making the specification more abstract ====================================== There are several ways in which we can make our specification of this protocol more abstract, and potentially speed up its analysis. For example, we actually do not need to model the identifiers explicitly: we can use each node atom name as its own identifier. To model this, we could impose the total order on :alloy:`Node` itself, and get rid of signature :alloy:`Id` and field :alloy:`id`. Also, messages are now the node atoms, and the declaration of `inbox` and `outbox` should change accordingly. However, imposing a total order with `util/ordering` will make the scope on `Node` exact, which means that for a given scope we would only verify all configurations with that exact number of nodes. This is undesirable, because we could easily miss bugs for smaller configurations. As such, instead of using `util/ordering` we will directly impose a total order on the nodes, by explicitly adding a `next` relation, the `first` and `last` nodes, and the corresponding fact. .. code-block:: sig Node { next : lone Node, succ : one Node, var inbox : set Node, var outbox : set Node } one sig first, last in Node {} fact ordering { no next.first and no last.next Node-first in first.^next } Actually, in our first model with a separate `Id` signature and using `util/ordering` we were verifying some redundant configurations that were symmetric for this protocol. For example, with the default scope of 3, the scope of `Id` was exactly 3, meaning that even if we had less than 3 nodes, there would still exist 3 identifiers. This means that, for example, for a network of 2 nodes we were analyzing three configurations: one where the identifiers were consecutive starting in the first identifier, one where the identifiers were consecutive starting in the second identifier, and one where the identifiers differed in two units. For this protocol, where we just need to check which identifier is greater, all these configurations are equivalent, and thus it would be unnecessary to check them all. In this new version without explicit identifiers, we no longer have this issue, and for a scope of 3 only 4 truly different configurations will be analyzed: one network with one node, one with two nodes, and two networks with three nodes, one where `next` is in the same order of `succ` and one where it is in reverse order. Of course, after removing `Id` we have to slightly change the specification of our event predicates and the assertions. For example, event *process* can now be specified as follows. .. code-block:: pred process [n : Node, i : Node] { // i is read and processed by node n i in n.inbox // guard n.inbox' = n.inbox - i // effect on n.inbox all m : Node - n | m.inbox' = m.inbox // effect on the inboxes of other nodes i in n.^next implies n.outbox' = n.outbox + i // effect on n.outbox else n.outbox' = n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes i = n implies Elected' = Elected + n // effect on Elected else Elected' = Elected } Note that since we now don't have the `gt` predicate available we directly compared nodes using the `next` relation. The :alloy:`eventually_elected` command now only needs to specify a scope for :alloy:`Node`. .. code-block:: run eventually_elected { eventually some Elected } for exactly 3 Node Running this command yields the following instance (with the theme already customized), and it is now more clear if the ring follows the increasing order of the identifiers (as happens here), or the other way around. Notice that without using `util/ordering`, the node identifiers are no longer representative of the total order (in the trace below, `Node2` is the first element of the total order, and `Node1` the last). .. image:: instance13.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_13/leaderelection.als#L103-L113 :alt: Get the code to generate this instance. Another signature we can remove is :alloy:`Elected`, since actually we can infer which node has been elected by looking at what happened in the past: if at some point in the past a node had its identifier in the inbox and processed it, then it necessarily became a leader. Instead of declaring a subset signature we can define :alloy:`Elected` as a derived set of nodes using a function without parameters. .. code-block:: fun Elected : set Node { { n : Node | once (before n in n.inbox and n not in n.inbox) } } Here, the set of elected nodes at each state is defined by comprehension. Note the usage of the :alloy:`once` and :alloy:`before` past operators to check if the desired transition happened in the past. We can now simplify the specification of all three events, removing the effects on :alloy:`Elected`. For example, event *process* can now be specified as follows. .. code-block:: pred process [n : Node, i : Node] { // i is read and processed by node n i in n.inbox // guard n.inbox' = n.inbox - i // effect on n.inbox all m : Node - n | m.inbox' = m.inbox // effect on the inboxes of other nodes i in n.^next implies n.outbox' = n.outbox + i // effect on n.outbox else n.outbox' = n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes } Another abstraction could be to remove the :alloy:`outbox` and incorporate the sending of a message in the *initiate* and *process* events. This is a reasonable simplification in this protocol, since each node only communicates with the successor node, and out of order message delivery is already being modeled by having the inboxes contain sets of messages. The declaration of signature :alloy:`Node` becomes the following. .. code-block:: sig Node { next : lone Node, succ : one Node, var inbox : set Node } The specification of event *initiate* should be changed to directly add the node identifier to the inbox of the successor node. .. code-block:: pred initiate [n : Node] { // node n initiates the protocol historically n not in n.succ.inbox // guard n.succ.inbox' = n.succ.inbox + n // effect on n.succ.inbox all m : Node - n.succ | m.inbox' = m.inbox // effect on the outboxes of other nodes } Finally, the specification of *process* should also be changed accordingly. .. code-block:: pred process [n : Node, i : Node] { // i is read and processed by node n i in n.inbox // guard n.inbox' = n.inbox - i // effect on n.inbox i in n.^next implies n.succ.inbox' = n.succ.inbox + i // effect on n.succ.inbox else n.succ.inbox' = n.succ.inbox all m : Node - n - n.succ | m.inbox' = m.inbox // effect on the inboxes of other nodes } After removing event *send* from predicate `node_acts`, we can run the :alloy:`eventually_elected` command and see that it is now possible to elect a leader in a ring with three nodes with almost half the transitions. .. image:: instance14.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_14/leaderelection.als#L77-L86 :alt: Get the code to generate this instance. With the same scope on the :alloy:`steps` we can now check much longer behaviors with roughly the same efficiency. If we check our three assertions they are all still valid, as expected. However, this specification is actually not correct! If we run the following command to get an example of electing a leader in a ring with one node, we will get no instances. .. code-block:: run eventually_elected_1node { eventually some Elected } for exactly 1 Node The problem is in the specification of the effect of *process* on :alloy:`n.succ.inbox`. If the ring has a single node, both :alloy:`n` and :alloy:`n.succ` are the same node, and the two conditions specifying the effect on :alloy:`n.succ.inbox` and on :alloy:`n.inbox` contradict themselves, so the *process* event cannot actually occur (is not truly enabled) even if its guard is true in a state. This was the reason why checking the :alloy:`at_least_one_leader_fair` assertion did not produce a counter-example, because the fairness condition now has the wrong enabledeness condition for this event. It is very easy to make such subtle mistakes while specifying, and introduce inconsistencies in our model, and to avoid them we should always exhaustively validate our specification before checking our assertions. This is specially true for liveness properties. Ideally, we should have different :alloy:`run` commands to check the consistency of different scenarios, for instance asking for examples where leaders are elected in rings of different sizes. To fix the specification of *process* we could require the frame condition on :alloy:`n.succ.inbox` to only apply when :alloy:`n.succ` is not `n` itself. .. code-block:: pred process [n : Node, i : Node] { // i is read and processed by node n i in n.inbox // guard n.inbox' = n.inbox - i // effect on n.inbox i in n.^next implies n.succ.inbox' = n.succ.inbox + i // effect on n.succ.inbox else n.succ != n implies n.succ.inbox' = n.succ.inbox all m : Node - n - n.succ | m.inbox' = m.inbox // effect on the inboxes of other nodes } If we ask for a trace where a leader is elected in a ring with one node, we now get the expected outcome after two events. .. image:: instance15.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_15_16/leaderelection.als#L81-L90 :alt: Get the code to generate this instance. .. image:: instance16.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/e9d9cb2eff0bb71f51de53791e5cb2a1280bafe9/protocol-design/instance_15_16/leaderelection.als#L81-L90 :alt: Get the code to generate this instance. So far we have specified the effects of events in a *point-wise* style, stating the effect on different projections of the `inbox` separately. Using a *point-free* style, where we specify the effect on all inboxes at once, can many times considerably simplify the specification. Consider for example the point-free version of the *initiate* event. .. code-block:: pred initiate [n : Node] { // node n initiates the protocol historically n not in n.succ.inbox // guard inbox' = inbox + n.succ->n // effect on inbox } Now we just specify in a single constraint that the binary relation `inbox` should have a new tuple in the next state, namely one where `n` was added to the inbox of `n.succ`. For `process` specifying the effect on the all `inbox` in a single constraint is a bit more challenging, but could be done as follows. .. code-block:: pred process [n : Node, i : Node] { // i is read and processed by node n i in n.inbox // guard inbox' = inbox - n->i + n.succ->(i & n.^next) // effect on inbox } In the next state, binary relation `inbox` should have one less tuple, corresponding to deleting `i` from the inbox of `n`, and, if `i in n.^next` one more tuple, corresponding to adding `i` to the inbox of `n.succ`. This last conditional addition was specified by always adding tuple `n.succ->(i & n.^next)` to `inbox`: note that expression `i & n.^next` is equal to `i` if `i in n.^next` or `none` otherwise, hence in the latter case no tuple will be added since the Cartesian product with an empty set is also an empty binary relation. .. card-carousel:: 2 .. card:: Full model for the section :link: https://github.com/practicalalloy/models/tree/main/protocol-design/making-the-specification-more-abstract :octicon:`file-badge` Alloy model ^^^ Download and explore the files relevant for the model at this point of the book. .. card:: Pointwise effects :link-type: ref :link: pointwise-effects :octicon:`beaker` Further reading ^^^ Learn about the different ways to specify effects and frame conditions of events. Here we used the `point-free` style to simplify the specifications. Explicit messages as signatures ====================================== Up until now, we only specified the first phase of the leader election protocol. In a second phase, once a node identifies itself as the leader, it should propagate that information to all the other nodes to make sure that every node is aware of the elected leader. This second phase can begin while there are still messages from the first phase being exchanged. A message informing about the newly elected leader contains the payload exchanged in the first phase: a node identifier, in this case of the elected leader instead of candidate. Thus, we can no longer abstract messages by just exchanging identifiers between nodes, and must introduce a richer message type into our model to distinguish between the messages exchanged in the two phases of the protocol. In a programming language we would probably encode messages using a *record*, or *struct* type. In Alloy such records can to be encoded with signatures, but that causes some issues that will be discussed in this section. We can easily extend our model to support different types of messages by introducing an abstract static signature to represent them, and then extend it by the more specific message types. Signature fields would represent the payload or data fields of the messages. In our leader election protocol, we have two message types that happen to contain the same payload -- a node identifier -- so we can define that field at the top-level signature, which will be inherited by the specific `CandidateMsg` and `ElectedMsg`. The payload is mandatory, so it is assigned multiplicity `one`. .. code-block:: abstract sig Message { payload : one Node } sig CandidateMsg, ElectedMsg extends Message {} Then we need to adapt the `inbox` declaration to contain the new messages instead of just node identifiers. .. code-block:: sig Node { next : lone Node, succ : one Node, var inbox : set Message } Next, we must adapt the event predicates to consider the new message types. Some of these events (such as when a node initiates, or elects itself the leader) are expected to 'create' new messages. This will expose a main difference between having true record-like data structures and Alloy static signatures: with the latter, messages will in fact never really be created, since all possible messages that can be sent in an execution already belong to signature `Message` in the initial state. The 'creation' of a new message will be specified by an existential quantification that searches for an atom inside `Message` with the required type and content. Predicate `initiate`, for instance, could be adapted in the following way. .. code-block:: pred initiate [n : Node] { // node n initiates the protocol historically no CandidateMsg & payload.n & n.succ.inbox // guard some m : CandidateMsg & payload.n | inbox' = inbox + n.succ->m // effect on inbox } The first difference to our previous specification is in the guard: rather than testing whether its own identifier has ever been in the inbox of its successor, a node can now only initiate if a candidate message with its identifier was never in that inbox. Note that for a node identifier `n`, `payload.n` retrieves all messages with that identifier, which intersected with `CandidateMsg` gives all candidate messages for node `n` (in the current model, there may be multiple `CandidateMsg` atoms with the same payload). The second difference, as we already mentioned, is the use of an existential quantification in the effect to pick a message of the appropriate type and content to be added it to the successor's inbox. Rather than having an identifier as a parameter, the *process* event should now consider a message. To ease the specification, we split the *process* event into two different events specified in different predicates: processing a `CandidateMsg` message or processing an `ElectedMsg` message. Predicate `processCandidate` becomes slightly more complicated, since there are now three possible effects on the successor's inbox: as previously, a node receiving a candidate message may discard it (if the identifier in the message is smaller than its own), or propagate it (if the identifier in the message is larger than its own). But now, when a node identifies itself as the leader (the identifier in the message is the same as its own), it must also send a new elected message informing other nodes that it become the leader. One possibility to specify this event is as follows. .. code-block:: pred processCandidate[n : Node, m : CandidateMsg] { // candidate message m is read and processed by node n m in n.inbox // guard m.payload in n.^next implies inbox' = inbox - n->m + n.succ->m // effect on inbox else m.payload in ^next.n implies inbox' = inbox - n->m else some e : ElectedMsg & payload.n | inbox' = inbox - n->m + n.succ->e } Due to the need to send a different type of message in the last case, it is now more difficult to specify the effect in a single case, so we keep the three cases separate. Predicate `processElected` is quite simple, just propagating the elected messages unless it's the elected leader receiving the message again, in which case it is discarded. .. code-block:: pred processElected[n : Node, m : ElectedMsg] { // elected message m is read and processed by node n m in n.inbox // guard m.payload != n implies inbox' = inbox - n->m + n.succ->m // effect on inbox else inbox' = inbox - n->m } Concerning `Elected`, we would now to consider a more interesting version, where we register for each node which other node it considers the leader, if any. As such, we will change `Elected` to return not the set of elected nodes, but a binary relation that for each node returns the leader it currently knows. A node gets to know who is the leader when it processes an elected message, so the set of leaders for each node is the payload of all previously processed messages of this type. We can define this relation by comprehension, using a `let`-expression to denote the set of elected messages in a node inbox. .. code-block:: fun Elected : Node -> Node { { n, i : Node | let inbox_elected = payload.i & ElectedMsg & n.inbox | once (before some inbox_elected and no inbox_elected) } } Lastly, we just have to update the fact that specifies what are the valid executions of the protocol. .. code-block:: pred node_acts [n : Node] { initiate[n] or (some m : CandidateMsg | processCandidate[n, m]) or (some m : ElectedMsg | processElected[n, m]) } fact events { // possible events always (stutter or some n : Node | node_acts[n]) } Let us validate the new version of our protocol using the functionalities provided by the Analyzer. Let us run our empty `run` command `example` just to explore some possible traces. The first trace returned is one where there is a single node and the system stutters forever (recall that bounded model checking guarantees minimal trace prefix length). By hitting :guilabel:`New Fork` we should be able to observe a different event, namely an *initiate* event. However, depending on the selected solver, we may get the following message. .. image:: validation1.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/40da9924bc7abef72c2bafb6875e9f779230d48a/protocol-design/instance_17_18/leaderelection.als#L96-L105 :alt: Get the code to generate this instance. .. index:: trace; evaluator Let's try to use the evaluator to debug the trace. Recall that the evaluator for traces supports any formula and expression, and that this evaluation is always performed on the focused state (the one at the left-hand side). We can start by confirming that the system is simply stuttering by calling `stutter` at the current state (in this case, this is still the initial one since we did not navigate in the trace). We can also ask if `stutter` is happening `after` or even `always`. For all theses cases the evaluator will answer `true` as expected. We can then ask the evaluator whether there is `some` node for which the guard of the *initiate* predicate holds (a formula with the past operator `historically`), or use comprehension to get all nodes for which that guard holds. The evaluator will show us that it holds for the existing node. So this must be a case where the guard is not necessarily the same as the enabled condition of the event. The problem is in the existential quantifier searching for the new candidate message: this cannot be satisfied because there is no available `CandidateMsg` message in the configuration of this trace that can be added to the successor's inbox. .. image:: validation2.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/40da9924bc7abef72c2bafb6875e9f779230d48a/protocol-design/instance_17_18/leaderelection.als#L96-L105 :alt: Get the code to generate this instance. This issue is actually evident if we inspect the available messages in the visualizer, as there are only `ElectedMsg` messages in the configuration. So we can hit :guilabel:`New Config` until a `CandidateMsg` is available (after closing the evaluator). This should happen after a few iterations. Then, we can ask again for :guilabel:`New Fork`, and this time a node will effectively initiate. .. image:: instance18.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/40da9924bc7abef72c2bafb6875e9f779230d48a/protocol-design/instance_17_18/leaderelection.als#L107-L116 :alt: Get the code to generate this instance. This means that to simulate a specific scenario we must first think about which messages will be necessary and hit :guilabel:`New Config` until those messages are in scope at the initial state. Of course, this does not scale to more complex scenarios. Another problem occurs when we want to inspect a different network configuration (recall that with up to 3 nodes, there are only 4 distinct configurations). This should be easy to achieve by using :guilabel:`New Config`. Unfortunately, you'll notice that the network configuration most likely stays the same, and only the set of available messages keeps changing. This is due to the fact that, as we've seen, the available messages are also part of the configuration because `Message` is an immutable signature. With one of the available solvers we actually needed to hit :guilabel:`New Config` 10 times before seeing a different ring network, and dozens of times to inspect the 4 alternatives, as the Analyzer will just iterate over different sets of possible messages. .. image:: validation3.png :width: 800 :align: center .. image:: validation4.png :width: 800 :align: center .. image:: validation5.png :width: 800 :align: center .. index:: generator axiom One possible solution to circumvent these issues if we have explicit messages is to enforce a so-called *generator axiom*. These are constraints that force every possible combination of field values of a signature to exist *in every instance*. In our leader election model, we would want to force the existence of messages with all possible types and values, which could be specified as follows. .. code-block:: pred generator { all n : Node { some m : CandidateMsg | m.payload = n some m : ElectedMsg | m.payload = n } } In order for models with the generator axiom to be consistent, we must set the scope to allow every possible message. In our model, we can have one message of each type for each available node, so the `Message` scope should be twice that of `Node`. Notice that we define the generator axiom as a predicate, rather than a fact: enforcing all possible messages to exist is helpful when exploring scenarios, but might slow down too much the verification properties, where enforcing a large scope for messages might possibly encumber the solving process. This is less of a problem for `run` commands that are expected to be satisfiable, which usually finish quickly. So, for the moment, we call this predicate only in `run` commands used for validation, such as the one below for ring networks with exactly 3 nodes. .. code-block:: run example3_generator { generator } for exactly 3 Node, 6 Message This command now guarantees the existence of all possible messages, and simulation using :guilabel:`New Fork` can easily be done as before. And since we have a tight scope of `Message` there is a unique configuration of messages available, and hitting :guilabel:`New Config` will effectively generate a new network configuration. Unfortunately, this will only work if we know the smallest scope needed for messages, which is not always easy to identify. Otherwise, different messages with the same payload will be allowed to appear in the configuration. That will be the case if we no longer ask for `exactly` 3 nodes in the command above: for ring networks with 2 nodes, 6 messages result in repetitions of the network configurations, changing only the available messages. To address this, we can also additionally enforce messages to be unique with the following predicate. .. code-block:: pred unique { all m1, m2 : CandidateMsg | m1.payload = m2.payload implies m1 = m2 all m1, m2 : ElectedMsg | m1.payload = m2.payload implies m1 = m2 } Given this, even if we assign a scope to messages larger than the one needed, a unique message configuration is guaranteed to be generated, and :guilabel:`New Config` will necessarily return a different network configuration (a different ring or a different order of node identifiers). .. code-block:: run example_unique_generator { generator unique } for 3 Node, 10 Message However, by forcing messages to be unique, we are actually restricting the set of possible behaviors of our protocol, and we must consider the implications of that decision. The main consequence is that, since inboxes are modeled as sets, it is now impossible to have two messages with the same content in an inbox (elements are unique in sets). But this was already the case in the version of the model without messages - a node identifier could only appear once in an inbox - so supporting duplicated messages was actually a deviation from the original model. Regardless, this is not an issue in this protocol, since each node sends at most one candidate message and one elected message with its own identifier. Regardless of the issues with trace iteration, we should always reason carefully about message scopes. For instance, we could try to ask for a scenario where all nodes initialize, and eventually all nodes know who is the leader, with the following invalid command. .. code-block:: run bad_all_initiate { all n : Node | eventually initiate[n] eventually Node = Elected.Node } for 3 but exactly 3 Node The Analyzer will reply that there is no such trace. That's because the default scope for signatures is 3, and 3 messages are not sufficient to represent this trace (at least 4 are needed). This can also lead to a false sense of safety from `check` commands, since a limited set of messages may reduce the valid traces explored by the Analyzer. To actually see a trace with 3 nodes where eventually a node is elected, we should use the following command, which returns a trace such as the one below. Notice how `Elected` is now a binary relation. To improve the visualization, we could introduce an auxiliary function for `Node.Elected` to highlight the nodes considered the leader by any node. .. code-block:: run all_initiate { all n : Node | eventually initiate[n] eventually Node = Elected.Node } for exactly 3 Node, exactly 6 Message .. image:: instance19.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/40da9924bc7abef72c2bafb6875e9f779230d48a/protocol-design/instance_19/leaderelection.als#L166-L180 :alt: Get the code to generate this instance. So, before re-running the 3 check commands defined previously for our leader election protocol, we should extend the scope of available messages. However, we must first adapt the specification of the properties to the new `Elected` relation. .. code-block:: assert at_most_one_leader { always lone Node.Elected } check at_most_one_leader for 3 but 6 Message assert leader_stays_leader { always (all n : Node.Elected | always n in Node.Elected) } check leader_stays_leader for 3 but 6 Message assert at_least_one_leader_fair { fairness implies eventually (Node = Elected.Node) } check at_least_one_leader_fair for 3 but 6 Message Note that `Node.Elected` is now the set of leaders known to any of the nodes, so the two safety properties are the same as before, but with the previous `Elected` set replaced by `Node.Elected`. The liveness property specifies that all nodes will eventually know who is the leader, by requiring that eventually the domain of relation `Elected` contains all nodes. The two safety properties are satisfied as before. Here is the result of the analysis for the first. .. image:: log1.png :width: 600 :align: center To verify the liveness property `at_least_one_leader`, we should first update the fairness condition to consider the new event and the new guards. .. code-block:: pred initiate_enabled [n : Node] { historically no CandidateMsg & payload.n & n.succ.inbox } pred processCandidate_enabled [n : Node, m : Message] { m in n.inbox } pred processElected_enabled [n : Node, m : Message] { m in n.inbox } pred node_enabled [n : Node] { initiate_enabled[n] or (some m : CandidateMsg | processCandidate_enabled[n, m]) or (some m : ElectedMsg | processElected_enabled[n, m]) } pred fairness { all n : Node { eventually always node_enabled[n] implies always eventually node_acts[n] } } With this fairness condition property `at_least_one_leader_fair` holds. Note that in the above enabled conditions we did not explicitly checked if it was possible to create a message of the appropriate type to send. For example in `initiate_enabled` we did not include the condition `some CandidateMsg & payload.n`. The behaviors where that condition is false will be excluded from the analysis by the fairness property, which is precisely the intended. Also, since our scope for `Message` is big enough, the configuration where all possible messages exist will also be analyzed, and in that configuration those constraints would be trivially true, and so all the interesting behaviors will still be verified. Actually, when using explicit messages it might be a good idea to just change the `generator` and `unique` predicates to facts, because analysis commands will only have to explore a single message configuration and this might lead to efficiency gains (besides all the advantages in validation). Of course, then we must be very careful to always choose a big enough scope for messages. The only reason to keep them as predicates is if the scope needed to have all possible messages becomes too big: in that case the analysis might be too inefficient, and without the `generator` axiom we might still be able to check the property for a smaller number of messages. Of course, then we will have no guarantees that all possible executions for a given number of nodes are explored, but we might still be able to find bugs and have some confidence in the results. The reader might wonder if instead of having a static `Message` signature we could declare it as mutable, and then really create new messages when they are needed. That can be done in Alloy, but, unfortunately, from our experience, that option makes specification more difficult and analyses less efficient, when compared to the static version with the `generator` and `unique` assumptions, so we will not discuss it here. And concerning efficiency, there is a much better alternative that we will discuss next, that is to not have explicit messages as signatures. .. card-carousel:: 2 .. card:: Full model for the section :link: https://github.com/practicalalloy/models/tree/main/protocol-design/explicit-messages-as-signatures :octicon:`file-badge` Alloy model ^^^ Download and explore the files relevant for the model at this point of the book. .. card:: The trace evaluator :link-type: ref :link: evaluator_trace :octicon:`beaker` Further reading ^^^ Learn more about using the evaluator to validate and debug traces. .. card:: Commands in detail :link-type: ref :link: commands :octicon:`beaker` Further reading ^^^ Understanding how scopes work is essential to write useful analysis commands. Messages as tuples ============================= In the previous section we used signatures to encode records, which led to some issues with validation. However, we can also explore Alloy's flexible relational logic to implement a more abstract version that instead of using signatures implements messages directly as tuples. This has the advantage of avoiding the validation issues, and, by removing message atoms, it will also speed up analysis. The trade off is that it does not scale very well when we have very complex message types. Recall that fields in Alloy can be of arbitrary arity, so one can simply declare the inboxes to return a relation with a set of tuples, each containing all the information of a message: if a message has 2 fields of type `A` and `B`, then the inbox will contain a subset of pairs contained in `A -> B`. However, we will also need to identify the type of the message in the inbox. So, if we have different message types, the inbox should actually contain triples belonging to `Type -> A -> B`, where `Type` will be an enumerated signature discriminating the different types of messages. Since each inbox is associated to a node, this would result in a quaternary `inbox` relation with type `Node -> Type -> A -> B`. This strategy works if all message types contain exactly the same payload, otherwise we would need to somehow normalize this tuple set (more on this later). In our current example, we have two kinds of messages that happen to have the same payload, so this strategy can simply be encoded by changing the declaration of `inbox` to a mutable ternary relation. .. code-block:: abstract sig Type {} one sig Candidate, Elect extends Type {} sig Node { succ : one Node, next : lone Node, var inbox : Type -> Node } Notice that this idiom again disallows repeated messages in the inboxes, since a tuple belonging to `Type -> Node` can only appear once in the inbox. Specifying the event predicates is actually simpler than the version with explicit messages, since we can always create a new tuple with the needed message payload. So, in the *initiate* event, the guard just has to check whether the tuple `Candidate->n` has ever been in the inbox of the successor, and if not that 'message' is added to the successor's inbox. .. code-block:: pred initiate [n : Node] { // node n initiates the protocol historically Candidate->n not in n.succ.inbox // guard inbox' = inbox + n.succ->Candidate->n // effect on inbox } The *process* events are also simpler, and with messages as tuples it is even simpler to specify effects in a point-free style. The `processElected` event can be specified as follows. Note that now it is again parametrized by the node identifier expected to be contained in the message to be processed. .. code-block:: pred processElected[n : Node, i : Node] { // elected message m is read and processed by node n Elect->i in n.inbox // guard inbox' = inbox - n->Elect->i + n.succ->Elect->(i - n) // effect on inbox } The guard states that a message with content `Elect->i` should exist in the inbox of `n`. In the effect, this message is removed from the inbox of `n` and only added to the inbox of `n.succ` if the `i != n`. This conditional addition is achieved by adding message `Elect->(i - n)` to the inbox of `n.succ`. Note that `i - n` is empty iff `i != n`, hence no message will be added in that case. The specification of `processCandidate` can also be done with a couple of one-liners. .. code-block:: pred processCandidate[n : Node, i : Node] { // candidate msg m is read and processed by node n Candidate->i in n.inbox // guard inbox' = inbox - n->Candidate->i + n.succ->Candidate->(i & n.^next) + n.succ->Elect->(n & i) // effect on inbox } Here we expect a `Candidate->i` message in the inbox of `n`, which is then removed in the effect. The conditional addition of that message to the inbox of `n.succ` when `i in n.^next` is achieved by adding tuple `Candidate->(i & n.^next)`, and the conditional addition of an elected message when the received identifier is equal to `n` is achieved by adding tuple `Elect->(n & i)`, which will be empty iff `n != i`. The `Elected` function is also simpler. .. code-block:: fun Elected : Node -> Node { { n, i : Node | once (before Elect->i in n.inbox and Elect->i not in n.inbox) } } Finally, you must update fact `events` and the enabledness predicates for the fairness constraint for this new version of the model. The specification of all properties remains the same, only the commands need to be adapted by removing the scope on `Message`. The specification of the enabledness predicates is the following. .. code-block:: pred initiate_enabled [n : Node] { historically Candidate->n not in n.succ.inbox } pred processCandidate_enabled [n : Node, i : Node] { Candidate->i in n.inbox } pred processElected_enabled [n : Node, i : Node] { Elect->i in n.inbox } pred node_enabled [n : Node] { initiate_enabled[n] or (some i : Node | processCandidate_enabled[n, i]) or (some i : Node | processElected_enabled[n, i]) } In this new model all commands return the expected result. Below is a transition for the `eventually_elected` command. .. image:: instance20.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/40da9924bc7abef72c2bafb6875e9f779230d48a/protocol-design/instance_20/leaderelection.als#L135-L145 :alt: Get the code to generate this instance. Notice that, without messages as signatures, messages are now represented as edges in the visualizer. In this case, the inbox is a ternary relation `Node -> Type -> Node`, so the type appears as an annotation of the edge. This may be arguably more difficult to interpret than the previous version. It is also worth noting that the analysis of this version of the model is much more efficient than the one with explicit messages. Running the `at_most_one_leader` command, whose log was already shown for the other version, is now one order of magnitude faster than the version with an explicit `Message` signature. .. code-block:: check at_most_one_leader for 3 .. image:: log2.png :width: 600 :align: center The payload of the messages in this protocol is very simple, with just two message types that happen to contain the same single, mandatory field. Let us imagine we needed an additional, optional field `r` with type `X`, in elected messages. With messages as signatures this would be easy to encode: just add a new `lone` field to the respective message type. .. code-block:: sig ElectedMsg extends Message { r : lone X } Adapting the events is mostly straightforward, just determining the correct value of `r`. However, issues with scope are now exacerbated: the number of messages must now increase to consider the possibility of an empty `r` (and the generator axiom must also be adapted to force those cases). In the messages as tuples idiom, we must now adapt the declaration of `inbox` so that it always contain tuples with the same arity. Elected messages with an `r` are quaternary tuples of type `Node -> Type -> Node -> X`, while all other messages are of type `Node -> Type -> Node`. To make all tuples of the same arity, we could introduce an `Empty` singleton signature to pad the tuples of the messages with less content. In particular, it will always be the forth element of candidate messages of elected messages without the `r` field. The declaration of the `inbox` would look as follows. .. code-block:: abstract sig Type {} one sig Candidate, Elect extends Type {} one sig Empty {} sig Node { succ : one Node, next : lone Node, var inbox : Type -> Id -> X + Empty } Using tuples to direct represent messages has many advantages as we have seen. Unfortunately, if messages have a lot of different fields it will not scale up, due to the wat the analysis of Alloy models is performed. Fields of arity :math:`n` might require :math:`2^{n^2}` boolean variables in the backend analysis engine, and this value becomes too large for larger values of :math:`n`. .. card-carousel:: 2 .. card:: Full model for the section :link: https://github.com/practicalalloy/models/tree/main/protocol-design/messages-as-tuples :octicon:`file-badge` Alloy model ^^^ Download and explore the files relevant for the model at this point of the book. .. card:: Model finding :link-type: ref :link: analysis :octicon:`beaker` Further reading ^^^ Learn how the analysis engine of Alloy works, to understand why analysis becomes unfeasible with fields of large arity. .. card:: Higher-arity relations :link-type: ref :link: nary-relations :octicon:`beaker` Further reading ^^^ In this section we used higher-arity fields to model inboxes. Learn how to use them. Refining the network model ====================================== So far we've considered node inboxes to be sets, meaning that received messages are unordered and thus not necessarily processed in the delivery order. This allows us to show that our protocol is correct in a network model where there are no guarantees or ordered delivery. However, we might want to verify the correctness of our protocol in a more refined network model, closer to a possible implementation. For example, some middlewares ensure ordered delivery, but only support inboxes with a maximum size, meaning that if a message is received while the inbox is still full it will be silently discarded. If we wanted to check the correctness of our protocol with this network model, we would need to model inboxes using bounded queues, which can be done by resorting to Alloy sequences. To simplify the presentation, we will illustrate this by building on one of the simpler models of our protocol, where only the first phase was specified using a point-wise style, and where messages where just node identifiers. Sequences in Alloy are introduced with the keyword `seq`, and are essentially partial functions from integer indices to elements of the specified type. The scope of indices is bounded, meaning that sequences cannot grow indefinitely. Changing our inboxes to (mutable) sequences results in the following signature declaration. .. code-block:: sig Node { next : lone Node, succ : one Node, var inbox : seq Node } To adapt the events, we can make use of the auxiliary functions provided by utility module `sequniv` (which is imported automatically whenever sequences are used). But first, the Analyzer will throw an ambiguity error in our `ordering` do to a call to our `first` signature that clashes with a function defined in `sequniv`. To disambiguate, just replace it with `this/first`. Writing the events in the point-free style is complicated when sequence operations are involved, so let us adapt the point-wise version previously presented. To test whether a message is already in the inbox of the successor in the guard of the *initiate* event, one can use `elems` to get all elements in a sequence. To add a new message to an inbox, one can use function `add` to add an element to the end of a sequence. If the sequence is already full, `add` discards the new element and returns the sequence unchanged. .. code-block:: pred initiate [n : Node] { // node n initiates the protocol historically n not in elems[n.succ.inbox] // guard n.succ.inbox' = add[n.succ.inbox, n] // effect on n.succ.inbox all m : Node - n.succ | m.inbox' = m.inbox // effect on the outboxes of other nodes } The *process* event can be adapted in a similar manner. Rather than selecting an arbitrary message from the inbox, it now must select the `first` element in the sequence, and then update the inbox by keeping only the `rest` of the sequence. When propagating the message, is it also added to the successor's inbox with `add`. .. code-block:: pred process [n : Node, i : Node] { // i is read and processed by node n i in first[n.inbox] // guard n.inbox' = rest[n.inbox] // effect on n.inbox i in n.^next implies n.succ.inbox' = add[n.succ.inbox, i] // effect on n.succ.inbox else n.succ != n implies n.succ.inbox' = n.succ.inbox all m : Node - n - n.succ | m.inbox' = m.inbox // effect on the inboxes of other nodes } Finally, we must adapt the notion of elected node. We'll just say that a node is considered to be elected when its own identifier was at the head of its inbox and processed in the next state. .. code-block:: fun Elected : set Node { { n : Node | once (before n in first[n.inbox] and n not in first[n.inbox]) } } At this point our leader election model with queues can be validated, and the previously defined safety and liveness properties will still be valid for a default scope of 3. However, this version of the protocol will actually not work if the capacity of the inboxes is less than the number of nodes. In particular it is no longer guaranteed that a leader will be elected. This can be checked with the following command. .. code-block:: check at_least_one_leader_fair for 3 but 2 seq This will now return a counter-example, for which a relevant transition is shown below. .. image:: instance21.png :width: 600 :align: center :target: https://github.com/practicalalloy/models/blob/40da9924bc7abef72c2bafb6875e9f779230d48a/protocol-design/instance_21/leaderelection.als#L126-L136 :alt: Get the code to generate this instance. The issue is that the inbox of `Node2`, the potential leader, is already at full capacity. `Node0` is attempting to send the `Node2` identifier to `Node2`, but since the queue is already full, the message is discarded. So to ensure liveness in this version of the protocol - and in an eventual implementation with such a communication middleware - one must make sure that the inboxes are large enough to contain all possible messages. Although this counter-example uses very small queues sizes, the problem still might exist in realistic implementations at larger scales. .. card-carousel:: 2 .. card:: Full model for the section :link: https://github.com/practicalalloy/models/tree/main/protocol-design/refining-the-network-model :octicon:`file-badge` Alloy model ^^^ Download and explore the files relevant for the model at this point of the book. .. card:: Sequences :link-type: ref :link: sequences :octicon:`beaker` Further reading ^^^ Sequences in Alloy have some particularities, learn how to use them correctly.