Introduce the main modular simulator interfaces and documentation

author Pascal Merz <pascal.merz@me.com>

Tue, 18 Jun 2019 13:33:24 +0000 (07:33 -0600)

committer Mark Abraham <mark.j.abraham@gmail.com>

Wed, 28 Aug 2019 13:40:27 +0000 (15:40 +0200)
author Pascal Merz <pascal.merz@me.com>
Tue, 18 Jun 2019 13:33:24 +0000 (07:33 -0600)
committer Mark Abraham <mark.j.abraham@gmail.com>
Wed, 28 Aug 2019 13:40:27 +0000 (15:40 +0200)
diff --git a/docs/doxygen/lib/modularsimulator.md b/docs/doxygen/lib/modularsimulator.md

new file mode 100644 (file)

index 0000000..687776c
--- /dev/null
+++ b/docs/doxygen/lib/modularsimulator.md
@@ -0,0 +1,317 @@
+The modular simulator {#page_modularsimulator}
+==============================================
+
+A new modular approach to the GROMACS simulator is described. The
+simulator in GROMACS is the object which carries out a simulation. The
+simulator object is created and owned by the runner object, which is
+outside of the scope of this new approach, and will hence not be further
+described. The simulator object provides access to some generally used
+data, most of which is owned by the runner object.
+
+## Legacy implementation
+
+In the legacy implementation, the simulator consisted of a number of
+independent functions carrying out different type of simulations, such
+as `do_md` (MD simulations), `do_cg` and `do_steep` (minimization),
+`do_rerun` (force and energy evaluation of simulation trajectories),
+`do_mimic` (MiMiC QM/MM simulations), `do_nm` (normal mode analysis),
+and `do_tpi` (test-particle insertion).
+
+The legacy approach has some obvious drawbacks:
+* *Data management:* Each of the `do_*` functions defines local data,
+  including complex objects encapsulating some data and functionality,
+  but also data structures effectively used as "global variables" for
+  communication between different parts of the simulation. Neither the
+  ownership nor the access rights (except for `const` qualifiers) are
+  clearly defined.
+* *Dependencies:* Many function calls in the `do_*` functions are
+  dependent on each others, i.e. rely on being called in a specific
+  order, but these dependencies are not clearly defined.
+* *Branches:* The flow of the `do_*` functions are hard to understand
+  due to branching. At setup time, and then at every step of the
+  simulation run, a number of booleans are set (e.g. `bNS` (do neighbor
+  searching), `bCalcEner` (calculate energies), `do_ene` (write
+  energies), `bEner` (energy calculation needed), etc). These booleans
+  enable or disable branches of the code (for the current step or the
+  entire run), mostly encoded as `if(...)` statements in the main `do_*`
+  loop, but also in functions called from there.
+* *Task scheduling:* Poorly defined dependencies and per-step branching
+  make task scheduling (e.g. parallel execution of independent tasks)
+  very difficult.
+* *Error-prone for developers:* Poorly defined dependencies and unclear
+  code flow make changing the simulator functions very error-prone,
+  rendering the implementation of new methods tedious.
+
+## The modular simulator approach
+
+The main design goals of the new, fully modular simulator approach
+include
+* *Extensibility:* We want to ease maintenance and the implementation
+  of new integrator schemes.
+* *Monte Carlo:* We want to add MC capability, which can be mixed with
+  MD to create hybrid MC/MD schemes.
+* *Data locality & interfaces:* We aim at localizing data in objects,
+  and offer interfaces if access from other objects is needed.
+* *Multi-stepping:* We aim at a design which intrinsically supports
+  multi-step integrators, e.g. having force calls at different
+  frequencies, or avoid having branches including rare events
+  (trajectory writing, neighbor search, ...) in the computation loops.
+* *Task parallelism:* Although not first priority, we want to have a
+  design which can be extended to allow for task parallelism.
+
+The general design approach is that of a **task scheduler**. *Tasks*
+are argument-less functions which perform a part of the computation.
+Periodically during the simulation, the scheduler builds a
+*queue of tasks*, i.e. a list of tasks which is then run through in
+order. Over time, with data dependencies clearly defined, this
+approach can be modified to have independent tasks run in parallel.
+
+The approach is most easily displayed using some pseudo code:
+
+    class ModularSimulator : public ISimulator
+    {
+        public:
+            //! Run the simulator
+            void run() override;
+        private:
+            std::vector<ISignaller*> signallers_;
+            std::vector<ISimulatorElement*> elements_;
+            std::queue<SimulatorRunFunction*> taskQueue_;
+    }
+
+    void ModularSimulator::run()
+    {
+        constructElementsAndSignallers();
+        setupAllElements();
+        while (not lastStep)
+        {
+            // Fill the task queue with new tasks (can be precomputed for many steps)
+            populateTaskQueue();
+            // Now simply loop through the queue and run one task after the next
+            for (auto task : taskQueue)
+            {
+                (*task)();  // run task
+            }
+        }
+    }
+
+This allows for an important division of tasks.
+
+* `constructElementsAndSignallers()` is responsible to **store the
+  elements in the right order**. This includes the different order of
+  element in different algorithms (e.g. leap-frog vs. velocity
+  verlet), but also logical dependencies (energy output after compute
+  globals).
+* `populateTaskQueue()` is responsible to **decide if elements need to
+  run at a specific time step**. The elements get called in order, and
+  decide whether they need to run at a specific step. This can be
+  pre-computed for multiple steps. In the current implementation, the
+  tasks are pre-computed for the entire life-time of the neighbor
+  list.
+* **Running the actual simulation tasks** is done after the task queue
+  was filled.  This is achieved by simply looping over the task list,
+  no conditionals or branching needed.
+
+### Simulator elements
+
+The task scheduler holds a list of *simulator elements*, defined by
+the `ISimulatorElement` interface. These elements have a
+`scheduleTask(Step, Time)` function, which gets called by the task
+scheduler. This allows the simulator element to register one (or more)
+function pointers to be run at that specific `(Step, Time)`. From the
+point of view of the element, it is important to note that the
+computation will not be carried out immediately, but that it will be
+called later during the actual (partial) simulation run. From the
+point of view of the builder of the task scheduler, it is important to
+note that the order of the elements determines the order in which
+computation is performed. The task scheduler periodically loops over
+its list of elements, builds a queue of function pointers to run, and
+returns this list of tasks. As an example, a possible application
+would be to build a new queue after each domain-decomposition (DD) /
+neighbor-searching (NS) step, which might occur every 100 steps. The
+scheduler would loop repeatedly over all its elements, with elements
+like the trajectory-writing element registering for only one or no
+step at all, the energy-calculation element registering for every
+tenth step, and the force, position / velocity propagation, and
+constraining algorithms registering for every step. The result would
+be a (long) queue of function pointers including all computations
+needed until the next DD / NS step, which can be run without any
+branching.
+
+### Signallers
+
+Some elements might require computations by other elements. If for
+example, the trajectory writing is an element independent from the
+energy-calculation element, it needs to signal to the energy element
+that it is about to write a trajectory, and that the energy element
+should be ready for that (i.e. perform an energy calculation in the
+upcoming step). This requirement, which replaces the boolean branching
+in the current implementation, is fulfilled by a Signaller - Client
+model. Classes implementing the `ISignaller` interface get called
+*before* every loop of the element list, and can inform registered
+clients about things happening during that step. The trajectory
+element, for example, can tell the energy element that it will write
+to trajectory at the end of this step. The energy element can then
+register an energy calculation during that step, being ready to write
+to trajectory when requested.
+
+### Sequence diagrams
+
+#### Pre-loop
+In the loop preparation, the signallers and elements are created and
+stored in the right order. The signallers and elements can then
+perform any setup operations needed.
+
+\msc
+hscale="2";
+
+ModularSimulator,
+Signallers [label="ModularSimulator::\nSignallers"],
+Elements [label="ModularSimulator::\nElements"],
+TaskQueue [label="ModularSimulator::\nTaskQueue"];
+
+--- [ label = "constructElementsAndSignallers()" ];
+    ModularSimulator => Signallers [ label = "Create signallers\nand order them" ];
+    ModularSimulator => Elements [ label = "Create elements\nand order them" ];
+--- [ label = "constructElementsAndSignallers()" ];
+|||;
+|||;
+
+--- [ label = "setupAllElements()" ];
+    ModularSimulator => Signallers [ label = "Call setup()" ];
+    Signallers box Signallers [ label = "for signaler in Signallers\n    signaller->setup()" ];
+    |||;
+    ModularSimulator => Elements [ label = "Call setup()" ];
+    Elements box Elements [ label = "for element in Elements\n    element->setup()" ];
+--- [ label = "setupAllElements()" ];
+\endmsc
+
+#### Main loop
+The main loop consists of two parts which are alternately run until the
+simulation stop criterion is met. The first part is the population of
+the task queue, which determines all tasks that will have to run to
+simulate the system for a given time period. In the current implementation,
+the scheduling period is set equal to the lifetime of the neighborlist.
+Once the tasks have been predetermined, the simulator runs them in order.
+This is the actual simulation computation, which can now run without any
+branching.
+
+\msc
+hscale="2";
+
+ModularSimulator,
+Signallers [label="ModularSimulator::\nSignallers"],
+Elements [label="ModularSimulator::\nElements"],
+TaskQueue [label="ModularSimulator::\nTaskQueue"];
+
+ModularSimulator box TaskQueue [ label = "loop: while(not lastStep)" ];
+ModularSimulator note TaskQueue [ label = "The task queue is empty. The simulation state is at step N.", textbgcolor="yellow" ];
+|||;
+|||;
+ModularSimulator box ModularSimulator [ label = "populateTaskQueue()" ];
+ModularSimulator =>> TaskQueue [ label = "Fill task queue with tasks until next neighbor-searching step" ];
+|||;
+|||;
+ModularSimulator note TaskQueue [ label = "The task queue now holds all tasks needed to move the simulation from step N to step N + nstlist. The simulation for these steps has not been performed yet, however. The simulation state is hence still at step N.", textbgcolor="yellow" ];
+|||;
+|||;
+
+ModularSimulator => TaskQueue [ label = "Run all tasks in TaskQueue" ];
+TaskQueue box TaskQueue [label = "for task in TaskQueue\n    run task" ];
+TaskQueue note TaskQueue [ label = "All simulation computations are happening in this loop!", textbgcolor="yellow" ];
+|||;
+|||;
+ModularSimulator note TaskQueue [ label = "The task queue is now empty. The simulation state is at step N + nstlist.", textbgcolor="yellow" ];
+ModularSimulator box TaskQueue [ label = "end loop: while(not lastStep)" ];
+
+\endmsc
+
+#### Task scheduling
+A part of the main loop, the task scheduling in `populateTaskQueue()` 
+allows the elements to push tasks to the task queue. For every scheduling 
+step, the signallers are run first to give the elements information about 
+the upcoming scheduling step. The scheduling routine elements are then 
+called in order, allowing the elements to register their respective tasks.
+
+\msc
+hscale="2";
+
+ModularSimulator,
+Signallers [label="ModularSimulator::\nSignallers"],
+Elements [label="ModularSimulator::\nElements"],
+TaskQueue [label="ModularSimulator::\nTaskQueue"];
+
+--- [ label = "populateTaskQueue()" ];
+    ModularSimulator box ModularSimulator [ label = "doDomainDecomposition()\ndoPmeLoadBalancing()" ];
+    ModularSimulator =>> Elements [ label = "Update state and topology" ];
+    |||;
+    |||;
+
+    ModularSimulator note ModularSimulator [ label = "schedulingStep == N\nsimulationStep == N", textbgcolor="yellow" ];
+    ModularSimulator box TaskQueue [ label = "loop: while(not nextNeighborSearchingStep)" ];
+        ModularSimulator => Signallers [ label = "Run signallers for schedulingStep" ];
+        Signallers box Signallers [label = "for signaller in Signallers\n    signaller->signal(scheduleStep)" ];
+        Signallers =>> Elements [ label = "notify" ];
+        Signallers note Elements [ label = "The elements now know if schedulingStep has anything special happening, e.g. neighbor searching, log writing, trajectory writing, ...", textbgcolor="yellow" ];
+        |||;
+        |||;
+
+        ModularSimulator => Elements [ label = "Schedule run functions for schedulingStep" ];
+        Elements box Elements [label = "for element in Elements\n    element->scheduleTask(scheduleStep)" ];
+        Elements =>> TaskQueue [ label = "Push task" ];
+        Elements note TaskQueue [ label = "The elements have now registered everything they will need to do for schedulingStep.", textbgcolor="yellow" ];
+        ModularSimulator note ModularSimulator [ label = "schedulingStep++", textbgcolor="yellow" ];
+
+    ModularSimulator box TaskQueue [ label = "end loop: while(not nextNeighborSearchingStep)" ];
+--- [ label = "populateTaskQueue()" ];
+ModularSimulator note ModularSimulator [ label = "schedulingStep == N + nstlist\nsimulationStep == N", textbgcolor="yellow" ];
+
+\endmsc
+
+## Acceptance tests and further plans
+
+Acceptance tests before this can be made default code path (as
+defined with Mark Jan 2019)
+* End-to-end tests pass on both `do_md` and the new loop in
+  Jenkins pre- and post-submit matrices
+* Physical validation cases pass on the new loop
+* Performance on different sized benchmark cases, x86 CPU-only
+  and NVIDIA GPU are at most 1% slower -
+  https://github.com/ptmerz/gmxbenchmark has been developed to
+  this purpose.
+
+After the NVE MD bare minimum, we will want to add support for
+* Thermo- / barostats
+* FEP
+* Pulling
+* Checkpointing
+
+Using the new modular simulator framework, we will then explore
+adding new functionality to GROMACS, including
+* Monte Carlo barostat
+* hybrid MC/MD schemes
+* multiple-time-stepping integration
+
+We sill also explore optimization opportunities, including
+* re-use of the same queue if conditions created by user input are 
+  sufficiently favorable (e.g. by design or when observed)
+* simultaneous execution of independent tasks
+
+We will probably not prioritize support for (and might consider
+deprecating from do_md for GROMACS 2020)
+* Simulated annealing
+* REMD
+* Simulated tempering
+* Multi-sim
+* Membrane embedding
+* QM/MM
+* FEP lambda vectors
+* Fancy mdp options for FEP output
+* MTTK
+* Shell particles
+* Essential dynamics
+* Enforced rotation
+* Constant acceleration groups
+* Ensemble-averaged restraints
+* Time-averaged restraints
+* Freeze, deform, cos-acceleration
diff --git a/docs/doxygen/user/mainpage.md b/docs/doxygen/user/mainpage.md

index dbe55531afa150a7ff51286082b1b2dfaca897d3..959f1e417623ae70cbd3478572725532dcac771c 100644 (file)
--- a/docs/doxygen/user/mainpage.md
+++ b/docs/doxygen/user/mainpage.md
@@ -75,6 +75,8 @@ give an overview of some of the topics that are documented:
   - \subpage page_refdata <br/>
     Documentation for writing tests that use reference data generated by the
     same test.
+ - \subpage page_modularsimulator <br/>
+   Documentation of the modular simulator approach.
  
   - \subpage page_modulegraph <br/>
     Automatically generated module dependency graph.
diff --git a/src/gromacs/CMakeLists.txt b/src/gromacs/CMakeLists.txt

index 9471ce6e79b2ad3259d3570a624fee61f41cbb85..1f8225f92cb2bd76d5e12154f3dc9bb644620a00 100644 (file)
--- a/src/gromacs/CMakeLists.txt
+++ b/src/gromacs/CMakeLists.txt
@@ -105,6 +105,7 @@ add_subdirectory(simd)
  add_subdirectory(imd)
  add_subdirectory(compat)
  add_subdirectory(mimic)
+add_subdirectory(modularsimulator)
  if (NOT GMX_BUILD_MDRUN_ONLY)
      add_subdirectory(gmxana)
      add_subdirectory(gmxpreprocess)
diff --git a/src/gromacs/modularsimulator/CMakeLists.txt b/src/gromacs/modularsimulator/CMakeLists.txt

new file mode 100644 (file)

index 0000000..e9e40d9
--- /dev/null
+++ b/src/gromacs/modularsimulator/CMakeLists.txt
@@ -0,0 +1,35 @@
+#
+# This file is part of the GROMACS molecular simulation package.
+#
+# Copyright (c) 2019, by the GROMACS development team, led by
+# Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
+# and including many others, as listed in the AUTHORS file in the
+# top-level source directory and at http://www.gromacs.org.
+#
+# GROMACS is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public License
+# as published by the Free Software Foundation; either version 2.1
+# of the License, or (at your option) any later version.
+#
+# GROMACS is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with GROMACS; if not, see
+# http://www.gnu.org/licenses, or write to the Free Software Foundation,
+# Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA.
+#
+# If you want to redistribute modifications to GROMACS, please
+# consider that scientific software is very special. Version
+# control is crucial - bugs must be traceable. We will be happy to
+# consider code for inclusion in the official distribution, but
+# derived work must not be called official GROMACS. Details are found
+# in the README & COPYING files - if they are missing, get the
+# official version at http://www.gromacs.org.
+#
+# To help us fund GROMACS development, we humbly ask that you cite
+# the research papers on the package. Check out http://www.gromacs.org.
+
+gmx_install_headers()
diff --git a/src/gromacs/modularsimulator/modularsimulatorinterfaces.h b/src/gromacs/modularsimulator/modularsimulatorinterfaces.h

new file mode 100644 (file)

index 0000000..8992602
--- /dev/null
+++ b/src/gromacs/modularsimulator/modularsimulatorinterfaces.h
@@ -0,0 +1,145 @@
+/*
+ * This file is part of the GROMACS molecular simulation package.
+ *
+ * Copyright (c) 2019, by the GROMACS development team, led by
+ * Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
+ * and including many others, as listed in the AUTHORS file in the
+ * top-level source directory and at http://www.gromacs.org.
+ *
+ * GROMACS is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 2.1
+ * of the License, or (at your option) any later version.
+ *
+ * GROMACS is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with GROMACS; if not, see
+ * http://www.gnu.org/licenses, or write to the Free Software Foundation,
+ * Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA.
+ *
+ * If you want to redistribute modifications to GROMACS, please
+ * consider that scientific software is very special. Version
+ * control is crucial - bugs must be traceable. We will be happy to
+ * consider code for inclusion in the official distribution, but
+ * derived work must not be called official GROMACS. Details are found
+ * in the README & COPYING files - if they are missing, get the
+ * official version at http://www.gromacs.org.
+ *
+ * To help us fund GROMACS development, we humbly ask that you cite
+ * the research papers on the package. Check out http://www.gromacs.org.
+ */
+/*! \defgroup module_modularsimulator The modular simulator
+ * \ingroup group_mdrun
+ * \brief
+ * The modular simulator improves extensibility, adds Monte Carlo capabilities,
+ * promotes data locality and communication via interfaces, supports
+ * multi-stepping integrators, and paves the way for some task parallelism.
+ *
+ * For more information, see page_modularsimulator
+ * \todo Can we link to `docs/doxygen/lib/modularsimulator.md`?
+ *
+ * \author Pascal Merz <pascal.merz@me.com>
+ */
+/*! \libinternal \file
+ * \brief
+ * Declares the main interfaces used by the modular simulator
+ *
+ * \author Pascal Merz <pascal.merz@me.com>
+ * \ingroup module_modularsimulator
+ */
+#ifndef GMX_MODULARSIMULATOR_MODULARSIMULATORINTERFACES_H
+#define GMX_MODULARSIMULATOR_MODULARSIMULATORINTERFACES_H
+
+#include <functional>
+#include <memory>
+
+namespace gmx
+{
+//! \addtogroup module_modularsimulator
+//! \{
+
+// This will make signatures more legible
+//! Step number
+using Step = int64_t;
+//! Simulation time
+using Time = double;
+
+//! The function type that can be scheduled to be run during the simulator run
+typedef std::function<void()> SimulatorRunFunction;
+//! Pointer to the function type that can be scheduled to be run during the simulator run
+typedef std::unique_ptr<SimulatorRunFunction> SimulatorRunFunctionPtr;
+
+//! The function type that allows to register run functions
+typedef std::function<void(SimulatorRunFunctionPtr)> RegisterRunFunction;
+//! Pointer to the function type that allows to register run functions
+typedef std::unique_ptr<RegisterRunFunction> RegisterRunFunctionPtr;
+
+/*! \libinternal
+ * \brief The general interface for elements of the modular simulator
+ *
+ * Setup and teardown are run once at the beginning of the simulation
+ * (after all elements were set up, before the first step) and at the
+ * end of the simulation (after the last step, before elements are
+ * destructed), respectively. `registerRun` is periodically called
+ * during the run, at which point elements can decide to register one
+ * or more run functions to be run at a specific time / step. Registering
+ * more than one function is especially valuable for collective elements
+ * consisting of more than one single element.
+ */
+class ISimulatorElement
+{
+    public:
+        /*! \brief Query whether element wants to run at step / time
+         *
+         * Element can register one or more functions to be run at that step through
+         * the registration pointer.
+         */
+        virtual void scheduleTask(Step, Time, const RegisterRunFunctionPtr&) = 0;
+        //! Method guaranteed to be called after construction, before simulator run
+        virtual void elementSetup() = 0;
+        //! Method guaranteed to be called after simulator run, before deconstruction
+        virtual void elementTeardown() = 0;
+        //! Standard virtual destructor
+        virtual ~ISimulatorElement() = default;
+};
+
+/*! \libinternal
+ * \brief The general Signaller interface
+ *
+ * Signallers are run at the beginning of Simulator steps, informing
+ * their clients about the upcoming step. This allows clients to
+ * decide if they need to be activated at this time step, and what
+ * functions they will have to run. Examples for signallers
+ * include the neighbor-search signaller (informs its clients when a
+ * neighbor-searching step is about to happen) or the logging
+ * signaller (informs its clients when a logging step is about to
+ * happen).
+ *
+ * We expect all signallers to accept registration from clients, but
+ * different signallers might handle that differently, so we don't
+ * include this in the interface.
+ */
+class ISignaller
+{
+    public:
+        //! Function run before every step of scheduling
+        virtual void signal(Step, Time) = 0;
+        //! Method guaranteed to be called after construction, before simulator run
+        virtual void signallerSetup() = 0;
+        //! Standard virtual destructor
+        virtual ~ISignaller() = default;
+};
+
+//! The function type that can be registered to signallers for callback
+typedef std::function<void(Step, Time)> SignallerCallback;
+//! Pointer to the function type that can be registered to signallers for callback
+typedef std::unique_ptr<SignallerCallback> SignallerCallbackPtr;
+
+//! /}
+}      // namespace gmx
+
+#endif // GMX_MODULARSIMULATOR_MODULARSIMULATORINTERFACES_H
author	Pascal Merz <pascal.merz@me.com>
	Tue, 18 Jun 2019 13:33:24 +0000 (07:33 -0600)
committer	Mark Abraham <mark.j.abraham@gmail.com>
	Wed, 28 Aug 2019 13:40:27 +0000 (15:40 +0200)
docs/doxygen/lib/modularsimulator.md	[new file with mode: 0644]	patch \| blob
docs/doxygen/user/mainpage.md		patch \| blob \| history
src/gromacs/CMakeLists.txt		patch \| blob \| history
src/gromacs/modularsimulator/CMakeLists.txt	[new file with mode: 0644]	patch \| blob
src/gromacs/modularsimulator/modularsimulatorinterfaces.h	[new file with mode: 0644]	patch \| blob