.. _analysis:

Model finding
===================

.. index::
  model finding

The Alloy Analyzer relies on a procedure called *model finding* to execute the
analysis commands. This procedure, in turn, relies on off-the-shelf solvers for
Boolean satisfiability problems (SAT). This topic provides an overview of this
process, pointing out to other references for more detailed presentations.

Relational model finding
------------------------

.. index::
   Kodkod

The Analyzer relies on an intermediate tool to perform relational model finding
named Kodkod [FSE04]_. Kodkod is also based on first-order relational logic, but
is stripped of all the structure and syntactic sugar supported by the Alloy
language: a Kodkod problem is simply a set of relations bounded by tuple sets
and a single formula that will be tested for satifiability. Relations have both a lower-bound declaring which tuples it *must* contain, and an upper-bound declaring which tuples it *may* contain. These bounds allow the user to express some partial-knowledge about the possible solutions to the problem.
If a Kodkod problem
is satisfiable, an instance is returned that assigns a tuple set to each
declared relation, ensuring each tuple set is within the specified bounds. 
The problem's relations are derived from the signatures and
fields declared in the Alloy model, and their bounds from the scope of the
command under analysis. The problem's formula contains the model facts and
constraints defined in the command, as well as additional restrictions to encode
the signature hierarchy and multiplicity constraints. For :alloy:`check`
commands, the expected property is negated so that Kodkod will search for
counter-examples, i.e., instances that obey all the model's constraints but
break the expected property. Kodkod is also responsible for performing
Skolemization (see below), and detecting and enforcing symmetry breaking.

Let us get back to our file system example from the  
:ref:`structural-modeling` chapter, and consider the following simple command with a small
scope, asking for instances where all entries point to directories.

.. code-block::

  run all_entries_dir { 
    Entry.object in Dir 
  } for 2

Given this scope, a universe with 6 atoms is created, with 2 atoms for each
top-level signature. Then the following relations, and the respective lower- and
upper-bounds, are declared in the Kodkod problem . 

.. code-block::

   Root    = {(Root0)} {(Root0)}
   Dir$    = {}        {(Object0)}
   File    = {}        {(Object0)}
   Entry   = {}        {(Entry0),(Entry1)}
   Name    = {}        {(Name0),(Name1)}
 
   entries = {}        {(Root0,Entry0),(Root0,Entry1),(Object0,Entry0),(Object0,Entry1)}
   object  = {}        {(Entry0,Root0),(Entry1,Root0),(Entry0,Object0),(Entry1,Object0)}
   name    = {}        {(Entry0,Name0),(Entry1,Name1),(Entry0,Name1),(Entry0,Name1)}

Note that there isn't a relation created for signatures that are extended by
others. Instead, whenever called, they are replaced by the union of all its
children signatures. If the signature is not abstract, then a new relation is
declared to encode the atoms that do not belong to any of its children (we
denote this set with a :code:`$` suffix). For instance, any call to
:alloy:`Dir` in an expression is replaced by :alloy:`Root + Dir$` in Kodkod, and
calls to :alloy:`Object` by :alloy:`File + Root + Dir$` (since :alloy:`Object`
is abstract, there are no objects that are not files or directories). Note also
that, since :alloy:`Root` has multiplicity :alloy:`one`, its lower-bound is the
same as its upper-bound, making its bounds exact; all other relations in this
model have an empty lower-bound.

Then, besides the constraints imposed by the model's facts and the :alloy:`run`
command, a few other constraints are introduced from the signature declarations.
The first, shown below, forces the set of directories and files to be disjoint.

.. code-block::

  no (Root + Dir$) & File

The others enforce the type and multiplicity of the binary relations. Below are
the constraints corresponding to the :alloy:`object` relation. The first
constraint enforces the multiplicity :alloy:`one` for all entries and the
appropriate range type, :alloy:`Object`. The second constraint enforces the type
of the domain, :alloy:`Entry`. These constraints may seem a bit spurious when
looking at the upper-bound of :alloy:`object`, since it only contains pairs
relating entries and objects. However, recall that at this point it is still
unknown which atoms will effectively exist in the instance. For instance,
:alloy:`Entry` can even be empty, and :alloy:`object` must be restricted
accordingly during solving.

.. code-block::

    all e : Entry | one e.object and e.object in Root + Dir$ + File    
    object.univ in Entry

.. index::
   Options menu
   Record the Kodkod input/output menu option

You can actually inspect the generated Kodkod problem by setting menu option
:menuselection:`Options --> Record the Kodkod input/output` to
:menuselection:`Yes`. When turned on, words :guilabel:`Assertion` or
:guilabel:`Predicate` in the Analyzer log, for :alloy:`check` and :alloy:`run`
commands respectively, will open a pop-window with this information. Below is
how it looks for the command above. 

.. image:: log1.png
   :width: 550 px
   :align: center

You can inspect the full formula that was extracted from an Alloy model after
all the structure was flattened (it will probably be a bit more verbose than the
simplified snippets shown above). You'll also find the corresponding Java code
that allows the generation of that problem, since Kodkod does not support a
concrete syntax for defining problems. 

.. card-carousel:: 2

   .. card:: Full model for the section
      :link: https://github.com/practicalalloy/models/tree/2024-02-28/structural-topics/model-finding/relational-model-finding

      :octicon:`file-badge` Alloy model 
      ^^^
      Download and explore the files relevant for the model at this point of the book.

   .. card:: Commands in detail
      :link-type: ref
      :link: commands

      :octicon:`beaker` Further reading
      ^^^
      Analysis is triggered by commands declared in the model. Learn in detail how these are defined.

   .. card:: Type system
      :link-type: ref
      :link: type-system

      :octicon:`beaker` Further reading
      ^^^
      Alloy type system also flattens the signature hierarchy to simplify type checking.


From relational logic to SAT
----------------------------------

.. index::
   SAT solving

Kodkod relies on off-the-shelf SAT solvers to solve model finding problems. SAT
solvers process only propositional formulas in conjunctive normal form (CNF),
and generating these from complex  Kodkod problems is not trivial [FSE04]_.
Roughly, if the universe has *u* atoms, each *n*-ary relation is seen as an
*n*-dimensional Boolean matrix with size *u*, where each entry in the matrix
denotes whether an *n*-ary tuple belongs to the relation. Using the bounds
declared for the relation, some of these entries will be necessarily true, and
some necessarily false; for those whose value is not pre-defined in the bounds, Boolean variables are created
whose value will be determined by the solver. Then, relational operations are
converted into corresponding matrix operators. For instance, the union
:alloy:`+` is encoded as entrywise matrix disjunction, while composition
:alloy:`.` is encoded as matrix multiplication. Atomic formulas are converted
accordingly. For instance, inclusion :alloy:`in` is encoded as the conjunction
of entrywise implications. Finally, first-order operators are translated into
propositional logic, which is possible because at this point the finite universe
of atoms is known. So universal quantifications :alloy:`all` are flattened into
conjunctions and existential :alloy:`some` into disjunctions.

Let us get back to our example. It has 6 atoms, so unary relations are
represented by 6 x 1 matrices (a vector). Below are the matrices corresponding
to the unary relations. Notice how tuples that necessarily belong to a relation
due to its lower-bound are set to true (here only :alloy:`Root`), and those
outside the upper-bound are set to false. Those between the lower- and
upper-bound are assigned a new Boolean variable.

+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+
| :alloy:`Root`      |            || :alloy:`Dir$`      |            || :alloy:`File`      |            || :alloy:`Entry`     |            || :alloy:`Name`      |            |
+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+
| :alloy:`Root0`     |:math:`\top`|| :alloy:`Root0`     |:math:`\bot`|| :alloy:`Root0`     |:math:`\bot`|| :alloy:`Root0`     |:math:`\bot`|| :alloy:`Root0`     |:math:`\bot`|
+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+
| :alloy:`Object0`   |:math:`\bot`|| :alloy:`Object0`   | :math:`v_1`|| :alloy:`Object0`   |:math:`v_2` || :alloy:`Object0`   |:math:`\bot`|| :alloy:`Object0`   |:math:`\bot`|
+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+
| :alloy:`Entry0`    |:math:`\bot`|| :alloy:`Entry0`    |:math:`\bot`|| :alloy:`Entry0`    |:math:`\bot`|| :alloy:`Entry0`    |:math:`v_3` || :alloy:`Entry0`    |:math:`\bot`|
+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+
| :alloy:`Entry1`    |:math:`\bot`|| :alloy:`Entry1`    |:math:`\bot`|| :alloy:`Entry1`    |:math:`\bot`|| :alloy:`Entry1`    |:math:`v_4` || :alloy:`Entry1`    |:math:`\bot`|
+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+
| :alloy:`Name0`     |:math:`\bot`|| :alloy:`Name0`     |:math:`\bot`|| :alloy:`Name0`     |:math:`\bot`|| :alloy:`Name0`     |:math:`\bot`|| :alloy:`Name0`     |:math:`v_5` |
+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+
| :alloy:`Name1`     |:math:`\bot`|| :alloy:`Name1`     |:math:`\bot`|| :alloy:`Name1`     |:math:`\bot`|| :alloy:`Name1`     |:math:`\bot`|| :alloy:`Name1`     |:math:`v_6` |
+--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------++--------------------+------------+

Getting back to command :alloy:`all_entries_dir`, relation :alloy:`Root + Dir$`
(i.e., relation :alloy:`Dir`) is calculated by the entrywise disjunction of the
corresponding matrices, resulting in the matrix below.

+----------------------+------------+
| :alloy:`Root + Dir$` |            |
+----------------------+------------+
| :alloy:`Root0`       |:math:`\top`|
+----------------------+------------+
| :alloy:`Object0`     |:math:`v_1` |
+----------------------+------------+
| :alloy:`Entry0`      |:math:`\bot`|
+----------------------+------------+
| :alloy:`Entry1`      |:math:`\bot`|
+----------------------+------------+
| :alloy:`Name0`       |:math:`\bot`|
+----------------------+------------+
| :alloy:`Name1`       |:math:`\bot`|
+----------------------+------------+

The binary relations are represented by 6 x 6 matrices. Below is the matrix for
:alloy:`object` (matrices for :alloy:`entries` and :alloy:`name` are omitted but
look similar).

+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+
| :alloy:`object` | :alloy:`Root0`| :alloy:`Object0`| :alloy:`Entry0` | :alloy:`Entry1` | :alloy:`Name0` | :alloy:`Name1` |
+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+
| :alloy:`Root0`  |  :math:`\bot` | :math:`\bot`    | :math:`\bot`    | :math:`\bot`    | :math:`\bot`   | :math:`\bot`   |
+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+
| :alloy:`Object0`|  :math:`\bot` | :math:`\bot`    | :math:`\bot`    | :math:`\bot`    | :math:`\bot`   | :math:`\bot`   |
+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+
| :alloy:`Entry0` | :math:`v_{7}` | :math:`v_{8}`   | :math:`\bot`    | :math:`\bot`    | :math:`\bot`   | :math:`\bot`   |
+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+
| :alloy:`Entry1` | :math:`v_{9}` | :math:`v_{10}`  | :math:`\bot`    | :math:`\bot`    | :math:`\bot`   | :math:`\bot`   |
+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+
| :alloy:`Name0`  |  :math:`\bot` | :math:`\bot`    | :math:`\bot`    | :math:`\bot`    | :math:`\bot`   | :math:`\bot`   |
+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+
| :alloy:`Name1`  |  :math:`\bot` | :math:`\bot`    | :math:`\bot`    | :math:`\bot`    | :math:`\bot`   | :math:`\bot`   |
+-----------------+---------------+-----------------+-----------------+-----------------+----------------+----------------+

Continuing with command :alloy:`all_entries_dir`, we must calculate composition
:alloy:`Entry.object`, which is encoded as matrix multiplication. The result is
the following vector. Intuitively, this states that, for instance, atom
:alloy:`Object0` is present in :alloy:`Entry.object` if :alloy:`Entry0` exists in
:alloy:`Entry` (:math:`v_3`) and :alloy:`object` relates those two atoms
(:math:`v_9`), or :alloy:`Entry1` exists in :alloy:`Entry` (:math:`v_4`) and is
related with :alloy:`Object0` through :alloy:`object` (:math:`v_{10}`).

+------------------------------+--------------------------------------------------+
| :alloy:`Entry.object`        |                                                  |
+------------------------------+--------------------------------------------------+
| :alloy:`Root`                | :math:`(v_3 \wedge v_7) \vee (v_4 \wedge v_8)`   |
+------------------------------+--------------------------------------------------+
| :alloy:`Object`              | :math:`(v_3 \wedge v_9) \vee (v_4 \wedge v_{10})`|
+------------------------------+--------------------------------------------------+
| :alloy:`Entry0`              |  :math:`\bot`                                    |
+------------------------------+--------------------------------------------------+
| :alloy:`Entry1`              |  :math:`\bot`                                    |
+------------------------------+--------------------------------------------------+
| :alloy:`Name0`               |  :math:`\bot`                                    |
+------------------------------+--------------------------------------------------+
| :alloy:`Name1`               |  :math:`\bot`                                    |
+------------------------------+--------------------------------------------------+

Now, it only remains to encode the inclusion :alloy:`Entry.object in Dir`
through entrywise implication of the two corresponding matrices. The result is
the following propositional formula.

.. math::

  &(((v_3 \wedge v_7) \vee (v_4 \wedge v_8)) \rightarrow \top) \wedge (((v_3 \wedge v_9) \vee (v_4 \wedge v_{10})) \rightarrow v_1) \\
  &\equiv \\ 
  &((v_3 \wedge v_9) \vee (v_4 \wedge v_{10})) \rightarrow v_1

This reads as: if atom :alloy:`Object0` belongs to :alloy:`Entry.object`, then
it must belong to :alloy:`Dir$` (:math:`v_1`). The first conjunct was trivially
true since it related to :alloy:`Root0`, which is necessarily in :alloy:`Dir`
through :alloy:`Root`. Although we avoided it for simplicity, in practice, the
resulting formula will be in CNF.

When the Analyzer reports the result of a command, you can actually get a bit of
an insight of the complexity of the SAT problem deployed underneath. The
*primary vars* reported are precisely those that correspond to tuples of the
problem's relations. The execution of command :alloy:`all_entries_dir` shown
above reported 18 primary variables (the 10 shown above plus another 8 for the
omitted :alloy:`entries` and :alloy:`name` relations). The *vars* and *clauses*
result from the translation of the problem's formula into CNF.

Since this procedure generates SAT problems, we can then use many different solvers to check satisfiability, and
the Analyzer comes packaged with many state-of-the-art SAT solvers. The SAT solver
used by Kodkod can be set through the menu option :menuselection:`Options -->
Solver` of the Analyzer. By default, this solver is set to
:menuselection:`SAT4J`, a Java SAT solver that is guaranteed to run in every
operating system that the Analyzer also runs. While it's impossible to predict
which SAT solver will perform best for each particular model, SAT4J is rarely
the fastest, and the user can explore other solvers that come pre-packaged with
the Analyzer depending on the operating system, such as :menuselection:`Glucose`
or :menuselection:`MiniSAT`. Note that instance iteration requires the usage of
an incremental SAT solver to be feasible. So if you select a SAT solver that is
not incremental (such as :menuselection:`Lingeling`), you'll get an error
message when trying to iterate instances in the visualizer. Note also that the
order on which the instances are generated my change depending on the selected
solver (although eventually they will all generate the same set of instances).

.. card-carousel:: 2


   .. card:: Full model for the section
      :link: https://github.com/practicalalloy/models/tree/2024-02-28/structural-topics/model-finding/from-relational-logic-to-sat

      :octicon:`file-badge` Alloy model 
      ^^^
      Download and explore the files relevant for the model at this point of the book.
      

   .. card:: A relational logic primer
      :link-type: ref
      :link: relational-logic

      :octicon:`beaker` Further reading
      ^^^
      Learn about all the different operators of the relational logic, that are also supported by Kodkod.


Skolemization
---------------------------

.. index::
   Skolemization
   quantifier; higher-order
   Skolem depth menu option

Before deploying the solver, Kodkod performs a process called *Skolemization*
whenever outermost existential quantifications occur in :alloy:`run` commands
(or universal quantifications for :alloy:`check` commands). Instead of flattening the existential quantifier to a disjunction, skolemization transforms the problem to an equisatisfiable one, by creating a new free singleton set (a unary relation) for the existentially quantified variable, and then removing the quantifier and leaving just the quantified formula. The solver will then try to search for a value for the new free variable that satisfies the formula. This largely
improves the efficiency of the analysis procedures, and as a side-effect such
singleton sets are now also part of the generated instance. Moreover,
Skolemization allows the analysis of some commands with *higher-order
quantifications*, which is forbidden in Alloy in general. In particular an existential
quantified higher-order variable in a `run` command will be Skolomized to a new relation with the appropriate
arity. Such Skolemized variables can be treated as any other relation in the theme
customization. You can control how deep the Analyzer will search for
quantifications to Skolemize through menu option :menuselection:`Options -->
Skolem depth`, but you cannot actually turn it off altogether.

Let us consider an example, for our file system model. For instance, the following
command asks for instances where some entry points to a directory.

.. code-block::

  run some_entries_dir {
    some d : Dir | d in Entry.object
  } for 2

The resulting instance is the following, where variable :alloy:`d` was
Skolemized and is treated as a regular subset of :alloy:`Dir`. Its name is
prefixed with the command name, and with a :code:`$` to distinguish it from the
model's relations. So it appears in the visualization, can be customized in the
theme, or called in the evaluator. Here, we just set its border to be
:guilabel:`Bold`. If you look at the log, you'll notice that the Analyzer now
reported 20 primary variables rather than the expected 18: that's because
:alloy:`d` was encoded as a relation of the Kodkod problem.

.. image:: instance1.png
   :width: 450 px
   :align: center
   :target: https://github.com/practicalalloy/models/blob/2024-02-28/structural-topics/model-finding/instance_01_02/filesystem.als#L87-L99
   :alt: Get the code to generate this instance.
  
As an example of a higher-order quantification, let's suppose we wanted to check the obviously false property that all subsets of entries in a file system have the same name. The `check` command to do so is the following.

.. code-block::

  check all_entries_same_name {
    all s : set Entry | lone s.name
  }

When translated to Kodkod the formula in this command will be negated and originate an existential higher-order quantification that searches for any subset of entries with at least two different names, which would be a counter-example to this property. The Skolomized variable will be part of the counter-example instance, as can be seen
in the instance below, where `$all_entries_same_name_s` contains entries with two different names.

.. image:: instance2.png
   :width: 450 px
   :align: center
   :target: https://github.com/practicalalloy/models/blob/2024-02-28/structural-topics/model-finding/instance_01_02/filesystem.als#L101-L112
   :alt: Get the code to generate this instance.

It is worth noting that Skolomization of higher-order quantifications does not add any expressiveness to the language: you
could achieve the same result by introducing a new subset signature of
:alloy:`Entry` and enforcing the same constraints over it (although this would
pollute the model, while the higher-order quantification is local to the assertion).
If an higher-order quantification cannot be solved by 
Skolemized an error is thrown by the Analyzer.

.. card-carousel:: 2

   .. card:: Full model for the section
      :link: https://github.com/practicalalloy/models/tree/2024-02-28/structural-topics/model-finding/skolemization

      :octicon:`file-badge` Alloy model 
      ^^^
      Download and explore the files relevant for the model at this point of the book.