docs/gmxapi/userguide/usage.rst

   1 ========================
   2 Using the Python package
   3 ========================
   4
   5 After installing GROMACS, sourcing :file:`GMXRC` (see the GROMACS docs), and installing
   6 the *gmxapi* Python package (see :doc:`install`), import the package in a Python
   7 script or interactive interpreter. This documentation uses ``gmx`` as an alias
   8 for the ``gmxapi`` Python package.
   9
  10 ::
  11
  12     import gmxapi as gmx
  13
  14 For full documentation of the Python-level interface and API, use the ``pydoc``
  15 command line tool or the :py:func:`help` interactive Python function, or refer to
  16 the :doc:`pythonreference`.
  17
  18 Any Python *exception* raised by gmxapi
  19 should be descended from (and catchable as) :class:`gmxapi.exceptions.Error`.
  20 Additional status messages can be acquired through the :ref:`gmxapi logging`
  21 facility.
  22 Unfortunately, some errors occurring in the GROMACS library are not yet
  23 recoverable at the Python level, and much of the standard GROMACS terminal
  24 output is not yet accessible through Python.
  25 If you find a particularly problematic scenario, please file a GROMACS bug report.
  26
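As a minimal sketch of the error handling described above, a script can wrap
its *gmxapi* calls and catch :class:`gmxapi.exceptions.Error`. The helper name
``run_or_report`` is hypothetical, and the fallback class exists only so that
the sketch still runs where *gmxapi* is not installed.

```python
import logging

try:
    from gmxapi.exceptions import Error as GmxapiError
except ImportError:
    # Assumption: stand-in so the sketch runs without gmxapi installed.
    class GmxapiError(Exception):
        """Placeholder for gmxapi.exceptions.Error."""


def run_or_report(work):
    """Call work() and return its result, or None if a gmxapi error is raised."""
    try:
        return work()
    except GmxapiError as error:
        # Report through the same logger hierarchy that gmxapi uses.
        logging.getLogger("gmxapi").error("gmxapi work failed: %s", error)
        return None
```

Errors that occur inside the GROMACS library and are not recoverable at the
Python level will not reach a handler like this one.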
  27 During installation, the *gmxapi* Python package becomes tied to a specific
  28 GROMACS installation.
  29 If you would like to access multiple GROMACS installations
  30 from Python, build and install *gmxapi* in separate
  31 :ref:`virtual environments <gmxapi venv>`.
  32
  33 In some cases *gmxapi* still needs help finding infrastructure from the
  34 GROMACS installation.
  35 For instance, :py:func:`gmxapi.commandline_operation` is not a pure API utility,
  36 but a wrapper for command line tools.
  37 Make sure that the command line tools you intend to use are discoverable in
  38 your :envvar:`PATH`, for example by sourcing your :file:`GMXRC` before launching
  39 a *gmxapi* script.
  40
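One way to confirm that a tool is discoverable before building any command
line operations is the standard library function ``shutil.which``. This is a
minimal sketch; the helper name ``locate_tool`` is hypothetical and not part of
*gmxapi*.

```python
import shutil


def locate_tool(name="gmx", search_path=None):
    """Return the full path of `name` if it is discoverable, else None.

    When `search_path` is None, the PATH of the current process is searched.
    """
    return shutil.which(name, path=search_path)


gmx_path = locate_tool("gmx")
if gmx_path is None:
    print("gmx was not found on PATH; source GMXRC before launching the script.")
else:
    print("Using GROMACS wrapper binary at", gmx_path)
```

Running the check at the top of a script turns a missing executable into an
immediate, readable message instead of a failure later in the work graph.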
  41 .. todo:: Get relevant GROMACS paths in Python environment.
  42
  43     :py:func:`gmxapi.commandline_operation` relies on the environment :envvar:`PATH`
  44     to locate executables, including the :command:`gmx` wrapper binary.
  45     Relates to `#2961 <https://redmine.gromacs.org/issues/2961>`__.
  46
  47 .. _parallelism:
  48
  49 Notes on parallelism and MPI
  50 ============================
  51
  52 When launching a *gmxapi* script in an MPI environment,
  53 such as with :command:`mpiexec` or :command:`srun`,
  54 you must help *gmxapi* detect the MPI environment by ensuring that :py:mod:`mpi4py`
  55 is loaded.
  56 Refer to :ref:`mpi_requirements` for more on installing :py:mod:`mpi4py`.
  57
  58 Assuming you use :command:`mpiexec` to launch MPI jobs in your environment,
  59 run a *gmxapi* script on two ranks with something like the following.
  60 Note that it can be helpful to provide :command:`mpiexec` with the full path to
  61 the intended Python interpreter since new process environments are being created.
  62
  63 ::
  64
  65     mpiexec -n 2 `which python` -m mpi4py myscript.py
  66
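If the job is launched from Python rather than from a shell, ``sys.executable``
can play the role of ``which python``. The sketch below only builds and prints
the command. It assumes that the launching interpreter is the one intended for
the ranks, and the helper name ``mpiexec_command`` is hypothetical.

```python
import shlex
import sys


def mpiexec_command(script, ranks=2):
    """Build an mpiexec argument list that names this interpreter explicitly."""
    return ["mpiexec", "-n", str(ranks), sys.executable, "-m", "mpi4py", script]


command = mpiexec_command("myscript.py", ranks=2)
# Print a shell-quoted form that can be pasted into a job script.
print(shlex.join(command))
```

The resulting list could be passed to ``subprocess.run`` if the launch should
happen from Python as well.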
  67 *gmxapi* 0.1 has limited parallelism, but future versions will include seamless
  68 acceleration as integration improves with the GROMACS library and computing
  69 environment runtime resources.
  70 Currently, *gmxapi* and the GROMACS library do not have an effective way to
  71 share an MPI environment.
  72 Therefore, if you intend to run more than one simulation at a time, in parallel,
  73 in a *gmxapi* script, you should build GROMACS with *thread-MPI* instead of a
  74 standard MPI library.
  75 That is, configure GROMACS with the CMake flag ``-DGMX_THREAD_MPI=ON``.
  76 Then, launch your *gmxapi* script with one MPI rank per node, and *gmxapi* will
  77 assign each (non-MPI) simulation to its own node, while keeping the full MPI
  78 environment available for use via :py:mod:`mpi4py`.
  79
  80 Running simple simulations
  81 ==========================
  82
  83 Once the ``gmxapi`` package is installed, run a simulation by loading its input with
  84 :py:func:`gmxapi.read_tpr` and passing the result to :py:func:`gmxapi.mdrun`.
  85
  86 ::
  87
  88     import gmxapi as gmx
  89     simulation_input = gmx.read_tpr(tpr_filename)
  90     md = gmx.mdrun(simulation_input)
  91
  92 Note that this sets up the work you want to perform, but does not immediately
  93 trigger execution. You can explicitly trigger execution with::
  94
  95     md.run()
  96
  97 or you can let gmxapi automatically launch work in response to the data you
  98 request.
  99
 100 The :py:func:`gmxapi.mdrun` operation produces a simulation trajectory output.
 101 You can use ``md.output.trajectory`` as input to other operations,
 102 or you can get the output directly by calling ``md.output.trajectory.result()``.
 103 If the simulation has not been run yet when ``result()`` is called,
 104 the simulation will be run before the function returns.
 105
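The deferred execution described above can be pictured with a toy model in
plain Python. This is an analogy only, not how *gmxapi* is implemented: the
class ``LazyResult``, the function ``fake_simulation``, and the file name are
all hypothetical.

```python
class LazyResult:
    """Toy stand-in for a data Future: work runs on the first result() call."""

    def __init__(self, work):
        self._work = work
        self._done = False
        self._value = None

    def result(self):
        if not self._done:
            # Nothing has been computed yet, so run the work now.
            self._value = self._work()
            self._done = True
        return self._value


calls = []


def fake_simulation():
    calls.append("ran")
    return "trajectory.trr"  # hypothetical output file name


trajectory = LazyResult(fake_simulation)  # sets up work, runs nothing
first = trajectory.result()               # triggers the work
second = trajectory.result()              # reuses the stored value
```

Creating the object does no work; the first ``result()`` call runs it, and
later calls return the stored value.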
 106 Running ensemble simulations
 107 ============================
 108
 109 To run a batch of simulations, pass an array of inputs::
 110
 111     md = gmx.mdrun(gmx.read_tpr([tpr_filename1, tpr_filename2, ...]))
 112     md.run()
 113
 114 Make sure to launch the script in an MPI environment with a sufficient number
 115 of ranks to allow one rank per simulation.
 116
 117 For *gmxapi* 0.1, we recommend configuring the GROMACS build with
 118 ``GMX_THREAD_MPI=ON`` and allowing one rank per node in order to allow each
 119 simulation ensemble member to run on a separate node.
 120
 121 .. seealso:: :ref:`parallelism`
 122
 123 .. _commandline:
 124
 125 Accessing command line tools
 126 ============================
 127
 128 In *gmxapi* 0.1, most GROMACS tools are not yet exposed as *gmxapi* Python operations.
 129 :py:func:`gmxapi.commandline_operation` provides a way to convert a :command:`gmx`
 130 (or other) command line tool into an operation that can be used in a *gmxapi*
 131 script.
 132
 133 In order to establish data dependencies, input and output files need to be
 134 indicated with the ``input_files`` and ``output_files`` parameters.
 135 The ``input_files`` and ``output_files`` keyword arguments are dictionaries
 136 of files keyed by command line flags.
 137
 138 For example, you might create a :command:`gmx solvate` operation as::
 139
 140     solvate = gmx.commandline_operation('gmx',
 141                                         arguments=['solvate', '-box', '5', '5', '5'],
 142                                         input_files={'-cs': structurefile},
 143                                         output_files={'-p': topfile,
 144                                                       '-o': structurefile,
 145                                                       })
 146
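Conceptually, the flag-keyed dictionaries expand into flag and file name pairs
on the command line. The sketch below illustrates that idea in plain Python.
The helper ``build_command_line``, the file names, and the ordering of the
flags are assumptions for illustration; the real wrapper may assemble the
command differently.

```python
def build_command_line(executable, arguments=(), input_files=None, output_files=None):
    """Flatten flag-keyed file dictionaries into a single argument list."""
    command = [executable, *arguments]
    for files in (input_files or {}, output_files or {}):
        for flag, filename in files.items():
            command.extend([flag, filename])
    return command


command = build_command_line(
    "gmx",
    arguments=["solvate", "-box", "5", "5", "5"],
    input_files={"-cs": "solvent.gro"},  # hypothetical file names
    output_files={"-p": "topol.top", "-o": "solvated.gro"},
)
print(" ".join(command))
```

Keeping inputs and outputs in separate dictionaries is what lets the outputs of
one operation be recognized as the inputs of another.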
 147 To check the status or error output of a command line operation, refer to the
 148 ``returncode`` and ``erroroutput`` outputs.
 149 To access the results from the output file arguments, use the command line flags
 150 as keys in the ``file`` dictionary output.
 151
 152 Example::
 153
 154     structurefile = solvate.output.file['-o'].result()
 155     if solvate.output.returncode.result() != 0:
 156         print(solvate.output.erroroutput.result())
 157
 158 Preparing simulations
 159 =====================
 160
 161 Continuing the previous example, the output of ``solvate`` may be used as the
 162 input for ``grompp``::
 163
 164     grompp = gmx.commandline_operation('gmx', 'grompp',
 165                                        input_files={
 166                                            '-f': mdpfile,
 167                                            '-p': solvate.output.file['-p'],
 168                                            '-c': solvate.output.file['-o'],
 169                                            '-po': mdout_mdp,
 170                                        },
 171                                        output_files={'-o': tprfile})
 172
 173 Then, ``grompp.output.file['-o']`` can be used as the input for :py:func:`gmxapi.read_tpr`.
 174
 175 Simulation input can be modified with the :py:func:`gmxapi.modify_input` operation
 176 before being passed to :py:func:`gmxapi.mdrun`.
 177 For *gmxapi* 0.1, a subset of MDP parameters may be overridden using the
 178 dictionary passed with the ``parameters`` keyword argument.
 179
 180 Example::
 181
 182     simulation_input = gmx.read_tpr(grompp.output.file['-o'])
 183     modified_input = gmx.modify_input(input=simulation_input, parameters={'nsteps': 1000})
 184     md = gmx.mdrun(input=modified_input)
 185     md.run()
 186
 187 Using arbitrary Python functions
 188 ================================
 189
 190 Generally, a function in the *gmxapi* package returns an object that references
 191 a node in a work graph,
 192 representing an operation that will be run when the graph executes.
 193 The object has an ``output`` attribute providing access to data Futures that
 194 can be provided as inputs to other operations before computation has actually
 195 been performed.
 196
 197 You can also provide native Python data as input to operations,
 198 or you can operate on native results retrieved from a Future's ``result()``
 199 method.
 200 However, most Python functions are easy to convert into *gmxapi*-compatible
 201 operations with :py:func:`gmxapi.function_wrapper`.
 202 All function inputs and outputs must have a name and type.
 203 Additionally, functions should be stateless and importable
 204 (e.g. via Python ``from some.module import myfunction``)
 205 for future compatibility.
 206
 207 Simple functions can just use ``return`` to publish their output,
 208 as long as they are defined with a return value type annotation.
 209 Functions with multiple outputs can accept an ``output`` keyword argument and
 210 assign values to named attributes on the received argument.
 211
 212 Examples::
 213
 214     from gmxapi import function_wrapper
 215
 216     @function_wrapper(output={'data': float})
 217     def add_float(a: float, b: float) -> float:
 218         return a + b
 219
 220
 221     @function_wrapper(output={'data': bool})
 222     def less_than(lhs: float, rhs: float, output=None):
 223         output.data = lhs < rhs
 224
 225 .. seealso::
 226
 227     For more on Python type hinting with function annotations,
 228     see :pep:`3107` (function annotations) and :pep:`484` (type hints).
 229
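The requirement that every input and output have a name and a type can be
checked on a plain function with the standard library. This sketch shows only
what the annotations look like to Python; it makes no claim about how
:py:func:`gmxapi.function_wrapper` reads them.

```python
from typing import get_type_hints


def add_float(a: float, b: float) -> float:
    return a + b


# Maps each parameter name, plus 'return', to its annotated type.
hints = get_type_hints(add_float)
for name, kind in hints.items():
    print(name, kind.__name__)
```

A parameter without an annotation is simply absent from ``hints``, which makes
missing types easy to spot before wrapping a function.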
 230 Subgraphs
 231 =========
 232
 233 Basic *gmxapi* work consists of a flow of data from operation outputs to
 234 operation inputs, forming a directed acyclic graph (DAG).
 235 In many cases, it can be useful to repeat execution of a subgraph with updated
 236 inputs.
 237 You may want a data reference that is not tied to the immutable result
 238 of a single node in the work graph, but which instead refers to the most recent
 239 result of a repeated operation.
 240
 241 One or more operations can be staged in a :py:class:`gmxapi.operation.Subgraph`,
 242 a sort of meta-operation factory that can store input binding behavior so that
 243 instances can be created without providing input arguments.
 244
 245 The subgraph *variables* serve as input, output, and mutable internal data
 246 references which can be updated by operations in the subgraph.
 247 Variables also allow state to be propagated between iterations when a subgraph
 248 is used in a *while* loop.
 249
 250 Use :py:func:`gmxapi.subgraph` to create a new empty subgraph.
 251 The ``variables`` argument declares data handles that define the state of the
 252 subgraph when it is run.
 253 To initialize input to the subgraph, give each variable a name and a value.
 254
 255 To populate a subgraph, enter a SubgraphContext by using a ``with`` statement.
 256 Operations created in the *with* block will be captured by the SubgraphContext.
 257 Define the subgraph outputs by assigning operation outputs to subgraph variables
 258 within the *with* block.
 259
 260 After exiting the *with* block, the subgraph may be used to create operation
 261 instances or may be executed repeatedly in a *while* loop.
 262
 263 .. note::
 264
 265     The object returned by :py:func:`gmxapi.subgraph` is atypical of *gmxapi*
 266     operations, and has some special behaviors. When used as a Python
 267     `context manager <https://docs.python.org/3/reference/datamodel.html#context-managers>`__,
 268     it enters a "builder" state that changes the behavior of its attribute
 269     variables and of operation instantiation. After exiting the ``with``
 270     block, the subgraph variables are no longer assignable, and operation
 271     references obtained within the block are no longer valid.
 272
 273 Looping
 274 =======
 275
 276 An operation can be executed an arbitrary number of times with a
 277 :py:func:`gmxapi.while_loop` by providing a factory function as the
 278 *operation* argument.
 279 When the loop operation is run, the *operation* is instantiated and run repeatedly
 280 until *condition* evaluates ``False``.
 281
 282 :py:func:`gmxapi.while_loop` does not offer a direct way to pass arguments to the
 283 *operation*. Use a *subgraph* to define the data flow for iterative operations.
 284
 285 When a *condition* is a subgraph variable, the variable is evaluated in the
 286 running subgraph instance at the beginning of an iteration.
 287
 288 Example::
 289
 290     subgraph = gmx.subgraph(variables={'float_with_default': 1.0, 'bool_data': True})
 291     with subgraph:
 292         # Define the update for float_with_default to come from an add_float operation.
 293         subgraph.float_with_default = add_float(subgraph.float_with_default, 1.).output.data
 294         subgraph.bool_data = less_than(lhs=subgraph.float_with_default, rhs=6.).output.data
 295     operation_instance = subgraph()
 296     operation_instance.run()
 297     assert operation_instance.values['float_with_default'] == 2.
 298
 299     loop = gmx.while_loop(operation=subgraph, condition=subgraph.bool_data)
 300     handle = loop()
 301     assert handle.output.float_with_default.result() == 6
 302
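Tracing the same data flow in plain Python shows why the loop in the example
ends with a value of 6. This is only a hand trace using the variable names and
initial values from the example, not *gmxapi* code; the ``iterations`` counter
is added for illustration.

```python
float_with_default = 1.0
bool_data = True
iterations = 0

# The condition is checked at the beginning of each iteration.
while bool_data:
    float_with_default = float_with_default + 1.0  # add_float(..., 1.)
    bool_data = float_with_default < 6.0           # less_than(..., rhs=6.)
    iterations += 1
```

The body runs five times, producing 2, 3, 4, 5, and 6. The iteration that
reaches 6 sets ``bool_data`` to ``False``, so the loop stops there.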
 303 .. _gmxapi logging:
 304
 305 Logging
 306 =======
 307
 308 *gmxapi* uses the Python :py:mod:`logging` module to provide hierarchical
 309 logging, organized by submodule.
 310 You can access the logger at ``gmxapi.logger`` or, after importing *gmxapi*,
 311 through the Python logging framework::
 312
 313     import gmxapi as gmx
 314     import logging
 315
 316     # Get the root gmxapi logger.
 317     gmx_logger = logging.getLogger('gmxapi')
 318     # Set a low default logging level
 319     gmx_logger.setLevel(logging.WARNING)
 320     # Make some tools very verbose
 321     #  by descending the hierarchy
 322     gmx_logger.getChild('commandline').setLevel(logging.DEBUG)
 323     #  or by direct reference
 324     logging.getLogger('gmxapi.mdrun').setLevel(logging.DEBUG)
 325
 326 You may prefer to adjust the log format or manipulate the log handlers.
 327 For example, tag the log output with MPI rank::
 328
 329     try:
 330         from mpi4py import MPI
 331         rank_number = MPI.COMM_WORLD.Get_rank()
 332     except ImportError:
 333         rank_number = 0
 334         rank_tag = ''
 335         MPI = None
 336     else:
 337         rank_tag = 'rank{}:'.format(rank_number)
 338
 339     formatter = logging.Formatter(rank_tag + '%(name)s:%(levelname)s: %(message)s')
 340
 341     # For additional console logging, create and attach a stream handler.
 342     ch = logging.StreamHandler()
 343     ch.setFormatter(formatter)
 344     logging.getLogger().addHandler(ch)
 345
 346 For more information, refer to the Python `logging documentation <https://docs.python.org/3/library/logging.html>`__.
 347
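The hierarchical behavior can be verified with the standard library alone. In
this sketch a child logger without its own level inherits ``WARNING`` from the
``gmxapi`` logger, while ``gmxapi.mdrun`` keeps its explicit ``DEBUG`` level.
The name ``gmxapi.hypothetical_submodule`` is made up for illustration.

```python
import logging

gmx_logger = logging.getLogger("gmxapi")
gmx_logger.setLevel(logging.WARNING)
logging.getLogger("gmxapi.mdrun").setLevel(logging.DEBUG)

# No level is set on this child, so it falls back to its parent's level.
quiet_child = logging.getLogger("gmxapi.hypothetical_submodule")
verbose_child = logging.getLogger("gmxapi.mdrun")

print(logging.getLevelName(quiet_child.getEffectiveLevel()))
print(logging.getLevelName(verbose_child.getEffectiveLevel()))
```

Checking ``getEffectiveLevel()`` in this way is a quick test of whether a
message at a given level will be emitted by a particular submodule logger.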
 348 More
 349 ====
 350
 351 Refer to the :doc:`pythonreference` for complete and granular documentation.
 352
 353 For more information on writing or using pluggable simulation extension code,
 354 refer to https://redmine.gromacs.org/issues/3133.
 355 (For gmxapi 0.0.7 and GROMACS 2019, see https://github.com/kassonlab/sample_restraint)
 356
 357 .. todo:: :issue:`3133`: Replace these links as resources for pluggable extension code become available.