From: Teemu Murtola <teemu.murtola@gmail.com>
Date: Fri, 25 Jul 2014 04:02:44 +0000 (+0300)
Subject: Include directive sorter
X-Git-Url: http://biod.pnpi.spb.ru/gitweb/?p=alexxy%2Fgromacs.git;a=commitdiff_plain;h=1bf821d461c749fbc34987c5b10f4527da782ba1

Include directive sorter

To facilitate discussion on how we want to sort the includes, the tool
supports a few different styles (with a --style option) that at least
roughly correspond to what has been proposed in #1557 or related
discussions.  Can be removed later when the final style is agreed on.

Having an automatic sorter also allows relatively easily using other
tools that may sort the includes as part of their operation (the
automatic sorter can just be run afterwards).

Made gmxtree.py aware of the location of the list of installer headers,
removing the need to pass the file on the command line.  This simplifies
manual invocation (in particular for the include sorter).

It only sorts a list of consecutive #include lines, possibly separated
by empty lines.  Comments between the lines or #ifdefs are left alone,
and only includes on both sides sorted independently.  This is by
design, as it keeps the script relatively simple, and also allows using
comments or similar to override the sorting for special cases.  Some
manual formatting may be required to make files with heavy use of
conditinal includes to look nice (to move headers to the appropriate
side of the #ifdefs), but after that, the tool should work quite well.

Add a git attribute to allow easily applying the script in combination
with a file list produced by admin/reformat_all.sh, and extend
reformat_all.sh such that it can be be run as
  admin/reformat_all.sh includesort -B=<builddir>
to sort all files with the attribute.

Related to #1557.

Change-Id: I362d7c5131a2e6952bbededfd51da4c4e1fc4a7e
---

diff --git a/.gitattributes b/.gitattributes
index 7712db33da..b558257f87 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -8,9 +8,9 @@ CMakeLists.txt  filter=copyright
 *.cmake         filter=copyright
 *.cmakein       filter=copyright
 *.py            filter=copyright
-*.l             filter=copyright
-*.y             filter=copyright
-*.pre           filter=copyright
+*.l             filter=includesort
+*.y             filter=includesort
+*.pre           filter=includesort
 *.tex           filter=copyright
 *.bm            filter=copyright
 # Exceptions: extra files to include
@@ -41,3 +41,4 @@ src/gromacs/selection/parser.h          !filter
 src/gromacs/selection/scanner.cpp       !filter
 src/gromacs/selection/scanner_flex.h    !filter
 nb_kernel_Elec*                         filter=copyright
+nb_kernel*pre                           filter=copyright
diff --git a/admin/reformat_all.sh b/admin/reformat_all.sh
index 764c44868f..caeede50fa 100755
--- a/admin/reformat_all.sh
+++ b/admin/reformat_all.sh
@@ -35,9 +35,9 @@
 
 function usage() {
     echo "usage: reformat_all.sh [-f|--force]"
-    echo "           [--filter=(uncrustify|copyright)] [--pattern=<pattern>]"
-    echo "           [<action>]"
-    echo "<action>: (list-files|uncrustify*|copyright) (*=default)"
+    echo "           [--filter=(uncrustify|copyright|includesort)]"
+    echo "           [--pattern=<pattern>] [<action>] [-B=<build dir>]"
+    echo "<action>: (list-files|uncrustify*|copyright|includesort) (*=default)"
 }
 
 filter=default
@@ -46,12 +46,14 @@ patterns=()
 action=uncrustify
 for arg in "$@" ; do
     if [[ "$arg" == "list-files" || "$arg" == "uncrustify" ||
-          "$arg" == "copyright" ]] ; then
+          "$arg" == "copyright" || "$arg" == "includesort" ]] ; then
         action=$arg
     elif [[ "$arg" == --filter=* ]] ; then
         filter=${arg#--filter=}
     elif [[ "$arg" == --pattern=* ]] ; then
         patterns[${#patterns[@]}]=${arg#--pattern=}
+    elif [[ "$arg" == -B=* ]] ; then
+        builddir=${arg#-B=}
     elif [[ "$arg" == "-f" || "$arg" == "--force" ]] ; then
         force=1
     else
@@ -90,6 +92,13 @@ case "$action" in
     copyright)
         command="xargs admin/copyright.py --check"
         ;;
+    includesort)
+        if [ -z "${builddir}" ] ; then
+            echo "Build directory must be set with -B for includesort."
+            exit 2
+        fi
+        command="docs/doxygen/includesorter.py -S . -B ${builddir} -F -"
+        ;;
     *)
         echo "Unknown action: $action"
         exit 2
@@ -100,11 +109,14 @@ if [[ "$filter" == "default" ]] ; then
 fi
 
 case "$filter" in
+    includesort)
+        filter_re="(uncrustify|includesort)"
+        ;;
     uncrustify)
         filter_re="(uncrustify|uncrustify_only)"
         ;;
     copyright)
-        filter_re="(uncrustify|copyright)"
+        filter_re="(uncrustify|copyright|includesort)"
         ;;
     *)
         echo "Unknown filter mode: $filter"
diff --git a/admin/uncrustify.sh b/admin/uncrustify.sh
index c683163076..e9590bdab3 100755
--- a/admin/uncrustify.sh
+++ b/admin/uncrustify.sh
@@ -82,9 +82,9 @@
 # filters, specified in .gitattributes files.  Only files that have the filter
 # set as "uncrustify" (or something ending in "uncrustify") are processed: if
 # other files have been changed, they are ignored by the script.  Files passed
-# to uncrustify, as well as files with filter "copyright", get their copyright
-# header checked.  To only run uncrustify for a file, set the filter to
-# "uncrustify_only".
+# to uncrustify, as well as files with filter "copyright" or "includesort", get
+# their copyright header checked.  To only run uncrustify for a file, set the
+# filter to "uncrustify_only".
 #
 # If you want to run uncrustify automatically for changes you make, there are
 # two options:
@@ -202,7 +202,7 @@ cut -f2 <$tmpdir/difflist | \
 cut -f2 <$tmpdir/filtered >$tmpdir/filelist_all
 grep -E '(uncrustify|uncrustify_only)$' <$tmpdir/filtered | \
     cut -f2 >$tmpdir/filelist_uncrustify
-grep -E '(uncrustify|copyright)$' <$tmpdir/filtered | \
+grep -E '(uncrustify|copyright|includesort)$' <$tmpdir/filtered | \
     cut -f2 >$tmpdir/filelist_copyright
 git diff-files --name-only | grep -Ff $tmpdir/filelist_all >$tmpdir/localmods
 
diff --git a/docs/doxygen/CMakeLists.txt b/docs/doxygen/CMakeLists.txt
index 95c73442c4..30dbf7cc20 100644
--- a/docs/doxygen/CMakeLists.txt
+++ b/docs/doxygen/CMakeLists.txt
@@ -126,7 +126,6 @@ if (DOXYGEN_FOUND)
         set(doc_check_command
             ${PYTHON_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/doxygen-check.py
             -S ${CMAKE_SOURCE_DIR} -B ${CMAKE_BINARY_DIR}
-            --installed ${CMAKE_CURRENT_BINARY_DIR}/installed-headers.txt
             -l ${CMAKE_CURRENT_BINARY_DIR}/doxygen-check.log
             --exitcode
             --ignore ${CMAKE_CURRENT_SOURCE_DIR}/suppressions.txt
@@ -139,7 +138,6 @@ if (DOXYGEN_FOUND)
         set(dep_graphs_command_python
             ${PYTHON_EXECUTABLE} ${CMAKE_CURRENT_SOURCE_DIR}/graphbuilder.py
             -S ${CMAKE_SOURCE_DIR} -B ${CMAKE_BINARY_DIR}
-            --installed ${CMAKE_CURRENT_BINARY_DIR}/installed-headers.txt
             --ignore-cycles ${CMAKE_CURRENT_SOURCE_DIR}/cycle-suppressions.txt
             -o ${CMAKE_CURRENT_BINARY_DIR}/depgraphs)
         set(dep_graphs_command_dot
diff --git a/docs/doxygen/codelayout.md b/docs/doxygen/codelayout.md
index 0d104c0626..727f978c43 100644
--- a/docs/doxygen/codelayout.md
+++ b/docs/doxygen/codelayout.md
@@ -68,7 +68,7 @@ See \ref page_unittesting for more details.
 \endif
 
 When compiling, the include search path is set to `src/`.  This means that
-source files include headers as
+files include headers as
 
     #include "gromacs/<module>/<file>.h"
 
@@ -76,13 +76,6 @@ The following is also possible for intra-module headers:
 
     #include "<file>.h"
 
-Header files include other headers using
-
-    #include "../<othermodule>/<file>.h"
-
-because relative paths work best for installed headers.  For non-installed
-headers, the path relative to `src/` is sometimes also used.
-
 For historical reasons, there are directories `src/gromacs/gmxana/`,
 `src/gromacs/gmxlib/`, `src/gromacs/mdlib/`, and `src/gromacs/gmxpreprocess/`
 that do not follow the above rules.  The installed headers for these are in
diff --git a/docs/doxygen/doxygen-check.py b/docs/doxygen/doxygen-check.py
index db560720cc..773dbfbefd 100755
--- a/docs/doxygen/doxygen-check.py
+++ b/docs/doxygen/doxygen-check.py
@@ -55,6 +55,7 @@ them to the script.
 import sys
 from optparse import OptionParser
 
+import gmxtree
 from gmxtree import GromacsTree, DocType
 from reporter import Reporter
 
@@ -342,8 +343,6 @@ def main():
                       help='Source tree root directory')
     parser.add_option('-B', '--build-root',
                       help='Build tree root directory')
-    parser.add_option('--installed',
-                      help='Read list of installed files from given file')
     parser.add_option('-l', '--log',
                       help='Write issues into a given log file in addition to stderr')
     parser.add_option('--ignore',
@@ -358,12 +357,6 @@ def main():
                       help='Return non-zero exit code if there are warnings')
     options, args = parser.parse_args()
 
-    installedlist = []
-    if options.installed:
-        with open(options.installed, 'r') as outfile:
-            for line in outfile:
-                installedlist.append(line.strip())
-
     reporter = Reporter(options.log)
     if options.ignore:
         reporter.load_filters(options.ignore)
@@ -371,7 +364,7 @@ def main():
     if not options.quiet:
         sys.stderr.write('Scanning source tree...\n')
     tree = GromacsTree(options.source_root, options.build_root, reporter)
-    tree.set_installed_file_list(installedlist)
+    tree.load_installed_file_list()
     if not options.quiet:
         sys.stderr.write('Reading source files...\n')
     tree.scan_files()
@@ -387,6 +380,8 @@ def main():
         sys.stderr.write('Checking...\n')
 
     for fileobj in tree.get_files():
+        if isinstance(fileobj, gmxtree.GeneratorSourceFile):
+            continue
         check_file(fileobj, reporter)
         for includedfile in fileobj.get_includes():
             check_include(fileobj, includedfile, reporter)
diff --git a/docs/doxygen/doxygenxml.py b/docs/doxygen/doxygenxml.py
index 49de76261c..b6dd5659f5 100755
--- a/docs/doxygen/doxygenxml.py
+++ b/docs/doxygen/doxygenxml.py
@@ -1117,13 +1117,34 @@ class DocumentationSet(object):
                 member.add_parent_compound(compound)
                 compound.add_member(member)
 
-    def load_file_details(self):
-        """Load detailed XML files for all files and possible parents of files."""
+    def load_file_details(self, filelist=None):
+        """Load detailed XML files for all files and possible parents of files.
+
+        If filelist is set, it should be a list of file paths, and details will
+        be loaded only for files in those paths.  The path format should match
+        what Doxygen writes into the files (with Gromacs setup, it seems to be
+        absolute paths)."""
         for compound in self._compounds.itervalues():
-            if isinstance(compound, (File, Directory, Group)):
+            if isinstance(compound, (Directory, Group)):
+                compound.load_details()
+            elif not filelist and isinstance(compound, File):
                 compound.load_details()
+                self._files[compound.get_path()] = compound
+        if filelist:
+            # We can't access the full path from the File object before the
+            # details are loaded, because Doxygen does not write that into
+            # index.xml.  But we can use the Directory objects (which were
+            # loaded above) to get the path.
+            for compound in self._compounds.itervalues():
                 if isinstance(compound, File):
-                    self._files[compound.get_path()] = compound
+                    dirobj = compound.get_directory()
+                    if not dirobj:
+                        continue
+                    abspath = compound.get_directory().get_path()
+                    abspath = os.path.join(abspath, compound.get_name())
+                    if abspath in filelist:
+                        compound.load_details()
+                        self._files[compound.get_path()] = compound
 
     def load_details(self):
         """Load detailed XML files for each compound."""
diff --git a/docs/doxygen/gmxtree.py b/docs/doxygen/gmxtree.py
index aaf21e01c7..1c28e6a3e4 100644
--- a/docs/doxygen/gmxtree.py
+++ b/docs/doxygen/gmxtree.py
@@ -50,6 +50,7 @@ rules that come from GROMACS-specific knowledge.  In the future, more such
 customizations will be added.
 """
 
+import collections
 import os
 import os.path
 import re
@@ -96,15 +97,44 @@ class IncludedFile(object):
     def is_relative(self):
         return self._is_relative
 
+    def get_included_path(self):
+        return self._included_path
+
     def get_including_file(self):
         return self._including_file
 
     def get_file(self):
         return self._included_file
 
+    def get_line_number(self):
+        return self._line_number
+
     def get_reporter_location(self):
         return reporter.Location(self._including_file.get_abspath(), self._line_number)
 
+class IncludeBlock(object):
+
+    """Block of consequent #include directives in a file."""
+
+    def __init__(self, first_included_file):
+        self._first_line = first_included_file.get_line_number()
+        self._last_line = self._first_line
+        self._files = []
+        self.add_file(first_included_file)
+
+    def add_file(self, included_file):
+        self._files.append(included_file)
+        self._last_line = included_file.get_line_number()
+
+    def get_includes(self):
+        return self._files
+
+    def get_first_line(self):
+        return self._first_line
+
+    def get_last_line(self):
+        return self._last_line
+
 class File(object):
 
     """Source/header file in the GROMACS tree."""
@@ -121,6 +151,9 @@ class File(object):
         self._apitype = DocType.none
         self._modules = set()
         self._includes = []
+        self._include_blocks = []
+        self._main_header = None
+        self._lines = None
         directory.add_file(self)
 
     def set_doc_xml(self, rawdoc, sourcetree):
@@ -140,6 +173,11 @@ class File(object):
         """Mark the file installed."""
         self._installed = True
 
+    def set_main_header(self, included_file):
+        """Set the main header file for a source file."""
+        assert self.is_source_file()
+        self._main_header = included_file
+
     def _process_include(self, lineno, is_system, includedpath, sourcetree):
         """Process #include directive during scan()."""
         is_relative = False
@@ -153,21 +191,36 @@ class File(object):
                 fileobj = sourcetree.get_file(fullpath)
             else:
                 fileobj = sourcetree.find_include_file(includedpath)
-        self._includes.append(IncludedFile(self, lineno, fileobj, includedpath,
-                is_relative, is_system))
+        included_file = IncludedFile(self, lineno, fileobj, includedpath,
+            is_relative, is_system)
+        self._includes.append(included_file)
+        return included_file
 
-    def scan_contents(self, sourcetree):
+    def scan_contents(self, sourcetree, keep_contents):
         """Scan the file contents and initialize information based on it."""
         # TODO: Consider a more robust regex.
-        include_re = r'^#\s*include\s+(?P<quote>["<])(?P<path>[^">]*)[">]'
+        include_re = r'^\s*#\s*include\s+(?P<quote>["<])(?P<path>[^">]*)[">]'
+        current_block = None
+        # TODO: Consider reading directly into this list, and iterate that.
+        lines = []
         with open(self._abspath, 'r') as scanfile:
             for lineno, line in enumerate(scanfile, 1):
+                lines.append(line)
                 match = re.match(include_re, line)
                 if match:
                     is_system = (match.group('quote') == '<')
                     includedpath = match.group('path')
-                    self._process_include(lineno, is_system, includedpath,
-                            sourcetree)
+                    included_file = self._process_include(lineno, is_system,
+                            includedpath, sourcetree)
+                    if current_block is None:
+                        current_block = IncludeBlock(included_file)
+                        self._include_blocks.append(current_block)
+                    else:
+                        current_block.add_file(included_file)
+                elif line and not line.isspace():
+                    current_block = None
+        if keep_contents:
+            self._lines = lines
 
     def get_reporter_location(self):
         return reporter.Location(self._abspath, None)
@@ -199,6 +252,9 @@ class File(object):
     def get_name(self):
         return os.path.basename(self._abspath)
 
+    def get_directory(self):
+        return self._dir
+
     def get_doc_type(self):
         if not self._rawdoc:
             return DocType.none
@@ -208,7 +264,7 @@ class File(object):
         return self._apitype
 
     def api_type_is_reliable(self):
-        if self._apitype > DocType.internal:
+        if self._apitype in (DocType.internal, DocType.library):
             return True
         module = self.get_module()
         return module and module.is_documented()
@@ -238,7 +294,29 @@ class File(object):
     def get_includes(self):
         return self._includes
 
+    def get_include_blocks(self):
+        return self._include_blocks
+
+    def get_main_header(self):
+        return self._main_header
+
+    def get_contents(self):
+        return self._lines
+
 class GeneratedFile(File):
+    def __init__(self, abspath, relpath, directory):
+        File.__init__(self, abspath, relpath, directory)
+        self._generator_source_file = None
+
+    def set_generator_source(self, sourcefile):
+        self._generator_source_file = sourcefile
+
+    def get_reporter_location(self):
+        if self._generator_source_file:
+            return self._generator_source_file.get_reporter_location()
+        return File.get_reporter_location(self)
+
+class GeneratorSourceFile(File):
     pass
 
 class Directory(object):
@@ -327,6 +405,15 @@ class Directory(object):
         for fileobj in self._files:
             yield fileobj
 
+    def contains(self, fileobj):
+        """Check whether file is within the directory or its subdirectories."""
+        dirobj = fileobj.get_directory()
+        while dirobj:
+            if dirobj == self:
+                return True
+            dirobj = dirobj._parent
+        return False
+
 class ModuleDependency(object):
 
     """Dependency between modules."""
@@ -515,8 +602,8 @@ class GromacsTree(object):
     subdirectories.  At this point, only information that is accessible from
     file names and paths only is available.
 
-    set_installed_file_list() can be called to set the list of installed
-    files.
+    load_installed_file_list() can be called to load the list of installed
+    files from the build tree (generated by the find-installed-headers target).
 
     scan_files() can be called to read all the files and initialize #include
     dependencies between the files based on the information.  This is done like
@@ -543,6 +630,29 @@ class GromacsTree(object):
         self._namespaces = set()
         self._members = set()
         self._walk_dir(os.path.join(self._source_root, 'src'))
+        for fileobj in self.get_files():
+            if fileobj and fileobj.is_source_file() and not fileobj.is_external():
+                (basedir, name) = os.path.split(fileobj.get_abspath())
+                (basename, ext) = os.path.splitext(name)
+                header = self.get_file(os.path.join(basedir, basename + '.h'))
+                if not header and ext == '.cu':
+                    header = self.get_file(os.path.join(basedir, basename + '.cuh'))
+                if not header and fileobj.is_test_file():
+                    basedir = os.path.dirname(basedir)
+                    header = self.get_file(os.path.join(basedir, basename + '.h'))
+                    if not header:
+                        # Somewhat of a hack; currently, the tests for
+                        # analysisdata/modules/ and trajectoryanalysis/modules/
+                        # is at the top-level tests directory.
+                        # TODO: It could be clearer to split the tests so that
+                        # there would be a separate modules/tests/.
+                        header = self.get_file(os.path.join(basedir, 'modules', basename + '.h'))
+                    if not header and basename.endswith('_tests'):
+                        header = self.get_file(os.path.join(basedir, basename[:-6] + '.h'))
+                if not header and fileobj.get_relpath().startswith('src/gromacs'):
+                    header = self._files.get(os.path.join('src/gromacs/legacyheaders', basename + '.h'))
+                if header:
+                    fileobj.set_main_header(header)
         rootdir = self._get_dir(os.path.join('src', 'gromacs'))
         for subdir in rootdir.get_subdirectories():
             self._create_module(subdir)
@@ -587,10 +697,20 @@ class GromacsTree(object):
                 elif extension == '.cmakein':
                     extension = os.path.splitext(basename)[1]
                     if extension in extensions:
+                        fullpath = os.path.join(dirpath, filename)
+                        relpath = self._get_rel_path(fullpath)
+                        sourcefile = GeneratorSourceFile(fullpath, relpath, currentdir)
+                        self._files[relpath] = sourcefile
                         fullpath = os.path.join(dirpath, basename)
                         relpath = self._get_rel_path(fullpath)
-                        fullpath = os.path.join(dirpath, filename)
-                        self._files[relpath] = GeneratedFile(fullpath, relpath, currentdir)
+                        fullpath = os.path.join(self._build_root, relpath)
+                        generatedfile = GeneratedFile(fullpath, relpath, currentdir)
+                        self._files[relpath] = generatedfile
+                        generatedfile.set_generator_source(sourcefile)
+                elif extension in ('.l', '.y', '.pre'):
+                    fullpath = os.path.join(dirpath, filename)
+                    relpath = self._get_rel_path(fullpath)
+                    self._files[relpath] = GeneratorSourceFile(fullpath, relpath, currentdir)
 
     def _create_module(self, rootdir):
         """Create module for a subdirectory."""
@@ -599,11 +719,15 @@ class GromacsTree(object):
         rootdir.set_module(moduleobj)
         self._modules[name] = moduleobj
 
-    def scan_files(self):
+    def scan_files(self, only_files=None, keep_contents=False):
         """Read source files to initialize #include dependencies."""
-        for fileobj in self._files.itervalues():
+        if only_files:
+            filelist = only_files
+        else:
+            filelist = self._files.itervalues()
+        for fileobj in filelist:
             if not fileobj.is_external():
-                fileobj.scan_contents(self)
+                fileobj.scan_contents(self, keep_contents)
                 module = fileobj.get_module()
                 if module:
                     for includedfile in fileobj.get_includes():
@@ -613,7 +737,7 @@ class GromacsTree(object):
                             if othermodule and othermodule != module:
                                 module.add_dependency(othermodule, includedfile)
 
-    def load_xml(self, only_files=False):
+    def load_xml(self, only_files=None):
         """Load Doxygen XML information.
 
         If only_files is True, XML data is not loaded for code constructs, but
@@ -622,7 +746,11 @@ class GromacsTree(object):
         xmldir = os.path.join(self._build_root, 'docs', 'html', 'doxygen', 'xml')
         self._docset = xml.DocumentationSet(xmldir, self._reporter)
         if only_files:
-            self._docset.load_file_details()
+            if isinstance(only_files, collections.Iterable):
+                filelist = [x.get_abspath() for x in only_files]
+                self._docset.load_file_details(filelist)
+            else:
+                self._docset.load_file_details()
         else:
             self._docset.load_details()
             self._docset.merge_duplicates()
@@ -675,11 +803,15 @@ class GromacsTree(object):
         """Load Doxygen XML file information."""
         for filedoc in self._docset.get_files():
             path = filedoc.get_path()
+            if not path:
+                # In case of only partially loaded file information,
+                # the path information is not set for unloaded files.
+                continue
             if not os.path.isabs(path):
                 self._reporter.xml_assert(filedoc.get_xml_path(),
                         "expected absolute path in Doxygen-produced XML file")
                 continue
-            extension = os.path.splitext(filedoc.get_path())[1]
+            extension = os.path.splitext(path)[1]
             # We don't care about Markdown files that only produce pages
             # (and fail the directory check below).
             if extension == '.md':
@@ -740,20 +872,23 @@ class GromacsTree(object):
             if testpath in self._files:
                 return self._files[testpath]
 
-    def set_installed_file_list(self, installedfiles):
-        """Set list of installed files."""
-        for path in installedfiles:
-            if not os.path.isabs(path):
-                self._reporter.input_error(
-                        "installed file not specified with absolute path: {0}"
-                        .format(path))
-                continue
-            relpath = self._get_rel_path(path)
-            if relpath not in self._files:
-                self._reporter.input_error(
-                        "installed file not in source tree: {0}".format(path))
-                continue
-            self._files[relpath].set_installed()
+    def load_installed_file_list(self):
+        """Load list of installed files from the build tree."""
+        listpath = os.path.join(self._build_root, 'docs', 'doxygen', 'installed-headers.txt')
+        with open(listpath, 'r') as installedfp:
+            for line in installedfp:
+                path = line.strip()
+                if not os.path.isabs(path):
+                    self._reporter.input_error(
+                            "installed file not specified with absolute path: {0}"
+                            .format(path))
+                    continue
+                relpath = self._get_rel_path(path)
+                if relpath not in self._files:
+                    self._reporter.input_error(
+                            "installed file not in source tree: {0}".format(path))
+                    continue
+                self._files[relpath].set_installed()
 
     def load_cycle_suppression_list(self, filename):
         """Load a list of edges to suppress in cycles.
diff --git a/docs/doxygen/graphbuilder.py b/docs/doxygen/graphbuilder.py
index 10db472c40..f666dbeaa7 100755
--- a/docs/doxygen/graphbuilder.py
+++ b/docs/doxygen/graphbuilder.py
@@ -531,8 +531,6 @@ def main():
                       help='Source tree root directory')
     parser.add_option('-B', '--build-root',
                       help='Build tree root directory')
-    parser.add_option('--installed',
-                      help='Read list of installed files from given file')
     parser.add_option('--ignore-cycles',
                       help='Set file with module dependencies to ignore in cycles')
     parser.add_option('-o', '--outdir', default='.',
@@ -541,18 +539,12 @@ def main():
                       help='Do not write status messages')
     options, args = parser.parse_args()
 
-    installedlist = []
-    if options.installed:
-        with open(options.installed, 'r') as outfile:
-            for line in outfile:
-                installedlist.append(line.strip())
-
     reporter = Reporter(quiet=True)
 
     if not options.quiet:
         sys.stderr.write('Scanning source tree...\n')
     tree = GromacsTree(options.source_root, options.build_root, reporter)
-    tree.set_installed_file_list(installedlist)
+    tree.load_installed_file_list()
     if not options.quiet:
         sys.stderr.write('Reading source files...\n')
     tree.scan_files()
diff --git a/docs/doxygen/includesorter.py b/docs/doxygen/includesorter.py
new file mode 100755
index 0000000000..23c2ecf0c6
--- /dev/null
+++ b/docs/doxygen/includesorter.py
@@ -0,0 +1,367 @@
+#!/usr/bin/python
+#
+# This file is part of the GROMACS molecular simulation package.
+#
+# Copyright (c) 2012,2013,2014, by the GROMACS development team, led by
+# Mark Abraham, David van der Spoel, Berk Hess, and Erik Lindahl,
+# and including many others, as listed in the AUTHORS file in the
+# top-level source directory and at http://www.gromacs.org.
+#
+# GROMACS is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public License
+# as published by the Free Software Foundation; either version 2.1
+# of the License, or (at your option) any later version.
+#
+# GROMACS is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with GROMACS; if not, see
+# http://www.gnu.org/licenses, or write to the Free Software Foundation,
+# Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA.
+#
+# If you want to redistribute modifications to GROMACS, please
+# consider that scientific software is very special. Version
+# control is crucial - bugs must be traceable. We will be happy to
+# consider code for inclusion in the official distribution, but
+# derived work must not be called official GROMACS. Details are found
+# in the README & COPYING files - if they are missing, get the
+# official version at http://www.gromacs.org.
+#
+# To help us fund GROMACS development, we humbly ask that you cite
+# the research papers on the package. Check out http://www.gromacs.org.
+
+"""Include directive sorter for GROMACS.
+
+This module implements an #include directive sorter for GROMACS C/C++ files.
+It allows (in most cases) automatically sorting includes and formatting
+the paths to use either relative paths or paths relative to src/.
+It groups includes in groups of related headers, sorts the headers
+alphabetically within each block, and inserts empty lines in between.
+The script requires an up-to-date list of installed headers and Doxygen XML
+documentation to be present in the build tree.
+
+The sorting assumes some conventions (e.g., that system headers are included
+with angle brackets instead of quotes).  Generally, these conventions are
+checked by the doxygen-check.py script.
+"""
+
+import os.path
+import re
+import sys
+
+class IncludeGroup(object):
+
+    """Enumeration type for grouping includes."""
+
+    def __init__(self, value):
+        """Initialize a IncludeGroup instance.
+
+        IncludeGroup.{main,system_c,...} should be used outside the
+        class instead of calling the constructor.
+        """
+        self._value = value
+
+    def __cmp__(self, other):
+        """Order include groups in the desired order."""
+        return cmp(self._value, other._value)
+
+# gmxpre.h is always first
+IncludeGroup.pre = IncludeGroup(0)
+# "main" include file for the source file is next
+IncludeGroup.main = IncludeGroup(1)
+# config.h is next, if present, to keep its location consistent
+IncludeGroup.config = IncludeGroup(2)
+# Followed by system headers, with C first and C++ following
+IncludeGroup.system_c = IncludeGroup(3)
+IncludeGroup.system_c_cpp = IncludeGroup(4)
+IncludeGroup.system_cpp = IncludeGroup(5)
+# System headers not in standard C/C++ are in a separate block
+IncludeGroup.system_other = IncludeGroup(6)
+# src/external/ contents that are included with quotes go here
+IncludeGroup.nonsystem_other = IncludeGroup(7)
+# Other GROMACS headers
+IncludeGroup.gmx_general = IncludeGroup(8)
+# This group is for shared (unit) testing utilities
+IncludeGroup.gmx_test = IncludeGroup(9)
+# This group is for headers local to the including file/module
+IncludeGroup.gmx_local = IncludeGroup(10)
+
+class GroupedSorter(object):
+
+    """Grouping and formatting logic for #include directives.
+
+    This class implements the actual logic that decides how includes are
+    grouped and sorted, and how they are formatted."""
+
+    # These variables contain the list of system headers for various blocks
+    _std_c_headers = ['assert.h', 'ctype.h', 'errno.h', 'float.h',
+            'inttypes.h', 'limits.h', 'math.h', 'signal.h', 'stdarg.h',
+            'stddef.h', 'stdint.h', 'stdio.h', 'stdlib.h', 'string.h',
+            'time.h']
+    _std_c_cpp_headers = ['c' + x[:-2] for x in _std_c_headers]
+    _std_cpp_headers = ['algorithm', 'deque', 'exception', 'fstream',
+            'iomanip', 'ios', 'iosfwd', 'iostream', 'istream', 'iterator',
+            'limits', 'list', 'map', 'memory', 'new', 'numeric', 'ostream',
+            'regex', 'set', 'sstream', 'stdexcept', 'streambuf', 'string', 'strstream',
+            'typeinfo', 'vector', 'utility']
+
+    def __init__(self, style, absolute):
+        """Initialize a sorted with the given style."""
+        if style == 'single-group':
+            self._local_group = 'none'
+        elif style == 'pub-priv':
+            self._local_group = 'private'
+        else:
+            self._local_group = 'local'
+        if absolute:
+            self._abspath_main = True
+            self._abspath_local = True
+        else:
+            self._abspath_main = False
+            self._abspath_local = False
+
+    def _get_path(self, included_file, group, including_file):
+        """Compute include path to use for an #include.
+
+        The path is made either absolute (i.e., relative to src/), or
+        relative to the location of the including file, depending on the group
+        the file is in.
+        """
+        use_abspath = including_file is None or group is None
+        if not use_abspath:
+            if group in (IncludeGroup.gmx_general, IncludeGroup.gmx_test):
+                use_abspath = True
+            elif group == IncludeGroup.main and self._abspath_main:
+                use_abspath = True
+            elif group == IncludeGroup.gmx_local and self._abspath_local:
+                use_abspath = True
+        if not use_abspath:
+            fromdir = os.path.dirname(including_file.get_abspath())
+            relpath = os.path.relpath(included_file.get_abspath(), fromdir)
+            if not relpath.startswith('..'):
+                return relpath
+        path = included_file.get_relpath()
+        assert path.startswith('src/')
+        return path[4:]
+
+    def _get_gmx_group(self, including_file, included_file):
+        """Determine group for GROMACS headers.
+
+        Helper function to determine the group for an #include directive
+        when the #include is in one of the gmx_* groups (or in the main group).
+        """
+        main_header = including_file.get_main_header()
+        if main_header and main_header == included_file:
+            return IncludeGroup.main
+        if included_file.get_directory().get_name() == 'testutils':
+            return IncludeGroup.gmx_test
+        if including_file.get_directory().contains(included_file):
+            if self._local_group == 'local':
+                return IncludeGroup.gmx_local
+            if self._local_group == 'private':
+                if included_file.api_type_is_reliable() \
+                        and included_file.is_module_internal():
+                    return IncludeGroup.gmx_local
+                if not included_file.api_type_is_reliable() \
+                        and including_file.get_relpath().startswith('src/programs'):
+                    return IncludeGroup.gmx_local
+        if included_file.is_test_file():
+            return IncludeGroup.gmx_test
+        return IncludeGroup.gmx_general
+
+    def get_sortable_object(self, include):
+        """Produce a sortable, opaque object for an include.
+
+        Includes are sorted by calling this function for each #include object,
+        and sorting the list made up of these objects (using the default
+        comparison operators).  Each element from the sorted list is then
+        passed to format_include(), which extracts information from the opaque
+        object and formats the #include directive for output.
+        """
+        included_file = include.get_file()
+        if not included_file:
+            path = include.get_included_path()
+            if path in self._std_c_headers:
+                group = IncludeGroup.system_c
+            elif path in self._std_c_cpp_headers:
+                group = IncludeGroup.system_c_cpp
+            elif path in self._std_cpp_headers:
+                group = IncludeGroup.system_cpp
+            else:
+                group = IncludeGroup.system_other
+        elif included_file.is_external():
+            group = IncludeGroup.nonsystem_other
+            if 'external/' in include.get_included_path():
+                path = self._get_path(included_file, group, None)
+            else:
+                path = include.get_included_path()
+        elif included_file.get_name() == 'gmxpre.h':
+            group = IncludeGroup.pre
+            path = self._get_path(included_file, group, None)
+        elif included_file.get_name() in ('config.h', 'gmx_header_config.h'):
+            group = IncludeGroup.config
+            path = self._get_path(included_file, group, None)
+        else:
+            including_file = include.get_including_file()
+            group = self._get_gmx_group(including_file, included_file)
+            path = self._get_path(included_file, group, including_file)
+        return (group, os.path.split(path), include)
+
+    def format_include(self, obj, prev, lines):
+        """Format an #include directive after sorting."""
+        result = []
+        if prev:
+            if prev[0] != obj[0]:
+                # Print empty line between groups
+                result.append('\n')
+            elif prev[1] == obj[1]:
+                # Skip duplicates
+                return result
+        include = obj[2]
+        line = lines[include.get_line_number()-1]
+        include_re = r'^(?P<head>\s*#\s*include\s+)["<][^">]*[">](?P<tail>.*)$'
+        match = re.match(include_re, line)
+        assert match
+        if include.is_system():
+            path = '<{0}>'.format(os.path.join(obj[1][0], obj[1][1]))
+        else:
+            path = '"{0}"'.format(os.path.join(obj[1][0], obj[1][1]))
+        result.append('{0}{1}{2}\n'.format(match.group('head'), path, match.group('tail')))
+        return result
+
+class IncludeSorter(object):
+
+    """High-level logic for sorting includes.
+
+    This class contains the high-level logic for sorting include statements.
+    The actual ordering and formatting the includes is delegated to a sort method
+    (see GroupedSorter) to keep things separated.
+    """
+
+    def __init__(self, sortmethod, quiet):
+        """Initialize the include sorter with the given sorter and options."""
+        self._sortmethod = sortmethod
+        self._quiet = quiet
+        self._changed = False
+
+    def _sort_include_block(self, block, lines):
+        """Sort a single include block.
+
+        Returns a new list of lines for the block.
+        If anything is changed, self._changed is set to True, and the caller
+        can check that."""
+        includes = map(self._sortmethod.get_sortable_object, block.get_includes())
+        includes.sort()
+        result = []
+        prev = None
+        current_line_number = block.get_first_line()-1
+        for include in includes:
+            newlines = self._sortmethod.format_include(include, prev, lines)
+            result.extend(newlines)
+            if not self._changed:
+                for offset, newline in enumerate(newlines):
+                    if lines[current_line_number + offset] != newline:
+                        self._changed = True
+                        break
+                current_line_number += len(newlines)
+            prev = include
+        return result
+
+    def sort_includes(self, fileobj):
+        """Sort all includes in a file."""
+        lines = fileobj.get_contents()
+        # Format into a list first:
+        #  - avoid bugs or issues in the script truncating the file
+        #  - can check whether anything was changed before touching the file
+        newlines = []
+        prev = 0
+        self._changed = False
+        for block in fileobj.get_include_blocks():
+            newlines.extend(lines[prev:block.get_first_line()-1])
+            newlines.extend(self._sort_include_block(block, lines))
+            # The returned values are 1-based, but indexing here is 0-based,
+            # so an explicit +1 is not needed.
+            prev = block.get_last_line()
+        if self._changed:
+            if not self._quiet:
+                sys.stderr.write('{0}: includes reformatted\n'.format(fileobj.get_relpath()))
+            newlines.extend(lines[prev:])
+            with open(fileobj.get_abspath(), 'w') as fp:
+                fp.write(''.join(newlines))
+
+def main():
+    """Run the include sorter script."""
+    import os
+    import sys
+
+    from optparse import OptionParser
+
+    from gmxtree import GromacsTree
+    from reporter import Reporter
+
+    parser = OptionParser()
+    parser.add_option('-S', '--source-root',
+                      help='Source tree root directory')
+    parser.add_option('-B', '--build-root',
+                      help='Build tree root directory')
+    parser.add_option('-F', '--files',
+                      help='Specify files to sort')
+    parser.add_option('-q', '--quiet', action='store_true',
+                      help='Do not write status messages')
+    # This is for evaluating different options; can be removed from the final
+    # version.
+    parser.add_option('-s', '--style', type='choice', default='pub-priv',
+                      choices=('single-group', 'pub-priv', 'pub-local'),
+                      help='Style for Gromacs includes')
+    parser.add_option('--absolute', action='store_true',
+                      help='Write all include paths relative to src/')
+    options, args = parser.parse_args()
+
+    filelist = args
+    if options.files:
+        if options.files == '-':
+            lines = sys.stdin.readlines()
+        else:
+            with open(options.files, 'r') as fp:
+                lines = fp.readlines()
+        filelist.extend([x.strip() for x in lines])
+
+    reporter = Reporter(quiet=True)
+
+    if not options.quiet:
+        sys.stderr.write('Scanning source tree...\n')
+    tree = GromacsTree(options.source_root, options.build_root, reporter)
+    tree.load_installed_file_list()
+    files = []
+    for filename in filelist:
+        fileobj = tree.get_file(os.path.abspath(filename))
+        if not fileobj:
+            sys.stderr.write('warning: ignoring unknown file {0}\n'.format(filename))
+            continue
+        files.append(fileobj)
+    if not options.quiet:
+        sys.stderr.write('Reading source files...\n')
+    tree.scan_files(only_files=files, keep_contents=True)
+    extfiles = set(files)
+    for fileobj in files:
+        for included_file in fileobj.get_includes():
+            other_file = included_file.get_file()
+            if other_file:
+                extfiles.add(other_file)
+    if not options.quiet:
+        sys.stderr.write('Reading Doxygen XML files...\n')
+    tree.load_xml(only_files=extfiles)
+
+    if not options.quiet:
+        sys.stderr.write('Sorting includes...\n')
+
+    sorter = IncludeSorter(GroupedSorter(options.style, options.absolute), options.quiet)
+
+    for fileobj in files:
+        sorter.sort_includes(fileobj)
+
+if __name__ == '__main__':
+    main()
diff --git a/docs/doxygen/suppressions.txt b/docs/doxygen/suppressions.txt
index 6d4cde7532..fe616bb451 100644
--- a/docs/doxygen/suppressions.txt
+++ b/docs/doxygen/suppressions.txt
@@ -5,11 +5,14 @@ src/gromacs/gmxlib/gmx_cpuid.c: warning: duplicate declarations for a member 'gm
 
 # The script is currently a bit too eager
 share/template/template.cpp: error: source file documentation appears outside full documentation
+# The parser in the script is not clever enough
+src/gromacs/version.h: warning: includes local file as <gromacs/version.h>
 
 # These are OK
 src/gromacs/math/vec.h: warning: installed header includes non-installed "config.h"
 src/gromacs/linearalgebra/gmx_blas/*: warning: does not include "gmxpre.h" first
 src/gromacs/linearalgebra/gmx_lapack/*: warning: does not include "gmxpre.h" first
+src/gromacs/utility/baseversion-gen.c: warning: does not include "gmxpre.h" first
 
 # This module name doesn't really fall into any currently used pattern; needs some thought
 : error: no matching directory for module: module_mdrun_integration_tests