improve readability of help/man output
I find the new help (and man) page formatting introduced in 5.0 considerably harder to read than the previous one and I would like to ask for improvements for 5.1.
Update help output
Various changes to console help output (and some to man output) based on
feedback in #1687:
- People seem to prefer a centered startup header with some ASCII art,
so made the first output line just like that (except if running with
-quiet), and center the list of authors.
- Always start the option description at a new line.
- Give more space to various parts of the option list so that they are
more likely to align up.
- Split the file listing based on input/output type. This causes,
e.g., some logically related options (e.g., -cpi and -cpo) to get
separated in the output, but this seems to be the preferred approach.
- Add formatting to the synopsis on man pages (similar to how the
options appeared in the 4.6-era man pages, except that reasonable
line wrapping is still there).
Clean up the design for formatting the options list; now the actual
formatting code is better encapsulated in HelpWriterContext.
The user-visible scope of this change is limited to changing behavior
that changed between 4.6 and 5.0. Thus, the changes here should be
sufficient to close #1687.
#2 Updated by Teemu Murtola about 5 years ago
This is useless if you don't have some concrete suggestions. The change that introduced the formatting didn't exactly get that much attention during those several months it was in gerrit, and many of the changes there were actually done in response to your requests about following more standard notation (and you even commented that it looks good on the main change). There certainly are things that are better formatted now than previously (for example, pre-5.0 man pages had several spacing issues with either missing or double spaces). And the line wrapping wasn't any better in pre-5.0 man pages.
#3 Updated by Mark Abraham about 5 years ago
Indeed, concrete suggestions are good. For example, I lately looked at
mdrun -h in master and thought that
SYNOPSIS
gmx mdrun [-s [<.tpr>]] [-o [<.trr/.cpt/...>]] [-x [<.xtc/.tng>]]
          [-cpi [<.cpt>]] [-cpo [<.cpt>]] [-c [<.gro/.g96/...>]]
          [-e [<.edr>]] [-g [<.log>]] [-dhdl [<.xvg>]] [-field [<.xvg>]]
          [-table [<.xvg>]] [-tabletf [<.xvg>]] [-tablep [<.xvg>]]
          [-tableb [<.xvg>]] [-rerun [<.xtc/.trr/...>]] [-tpi [<.xvg>]]
          [-tpid [<.xvg>]] [-ei [<.edi>]] [-eo [<.xvg>]] [-devout [<.xvg>]]
          [-runav [<.xvg>]] [-px [<.xvg>]] [-pf [<.xvg>]] [-ro [<.xvg>]]
          [-ra [<.log>]] [-rs [<.log>]] [-rt [<.log>]] [-mtx [<.mtx>]]
          [-dn [<.ndx>]] [-multidir [<dir> [...]]] [-membed [<.dat>]]
          [-mp [<.top>]] [-mn [<.ndx>]] [-if [<.xvg>]] [-swap [<.xvg>]]
          [-deffnm <string>] [-xvg <enum>] [-dd <vector>] [-ddorder <enum>]
          [-npme <int>] [-nt <int>] [-ntmpi <int>] [-ntomp <int>]
          [-ntomp_pme <int>] [-pin <enum>] [-pinoffset <int>]
          [-pinstride <int>] [-gpu_id <string>] [-[no]ddcheck] [-rdd <real>]
          [-rcon <real>] [-dlb <enum>] [-dds <real>] [-gcom <int>]
          [-nb <enum>] [-nstlist <int>] [-[no]tunepme] [-[no]v]
          [-[no]compact] [-pforce <real>] [-[no]reprod] [-cpt <real>]
          [-[no]cpnum] [-[no]append] [-nsteps <int>] [-maxh <real>]
          [-multi <int>] [-replex <int>] [-nex <int>] [-reseed <int>]
would be more readable with newlines between each option. That's going to mean a lot of lines in this case - but mdrun -h is already huge, and IMO half those things can be implemented better such that they are not on the command line of mdrun... (side issue)
#4 Updated by Teemu Murtola about 5 years ago
Newline between every option would waste a lot of screen real estate, which makes it difficult to read in other ways. How about adding some extra whitespace between the options, and/or trying to align them to some columns if they fit? In man pages, we could consider reintroducing different fonts for the different elements in the synopsis (that is much easier for 5.1, now that it is not necessary to do everything as raw nroff)...
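For reference, wrapping the synopsis with a hanging indent, so that continuation lines start under the first option, could be sketched as below. This is a minimal illustration in Python, not GROMACS code; the function name wrap_synopsis and its parameters are hypothetical.

```python
def wrap_synopsis(command, options, width=80):
    """Wrap a synopsis so continuation lines align under the first option.

    Each entry of `options` (e.g. "[-deffnm <string>]") is treated as one
    unbreakable unit, even though it contains a space.
    """
    indent = " " * (len(command) + 1)
    lines = []
    current = command      # text of the line being built
    has_option = False     # whether `current` already holds an option
    for opt in options:
        if has_option and len(current) + 1 + len(opt) > width:
            # The option does not fit: start a new, indented line.
            lines.append(current)
            current = indent + opt
        elif has_option:
            current += " " + opt
        else:
            current = command + " " + opt
            has_option = True
    lines.append(current)
    return "\n".join(lines)


if __name__ == "__main__":
    demo = ["[-s [<.tpr>]]", "[-o [<.trr/.cpt/...>]]", "[-x [<.xtc/.tng>]]",
            "[-cpi [<.cpt>]]", "[-deffnm <string>]", "[-[no]v]"]
    print(wrap_synopsis("gmx mdrun", demo, width=60))
```

Aligning options to fixed columns would need a further pass over the option widths; the sketch only handles the wrapping and the indent.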
#6 Updated by Justin Lemkul about 5 years ago
I'll add a few thoughts of my own and things that people in my group have complained to me about :)
1. I find the synopsis is not helpful. Without context, what good is a block of options and file extensions?
2. I liked the :-) GROMACS (-: from before because the unique character pattern made it really easy to skip through a file when looking for information, debugging, etc. Now, you can't even search efficiently for "GROMACS" because every tool prints it at least 4 times in the header of its printed output.
3. The I/O options are a mess for most tools. Take grompp, for instance, which only has a few options:
-f [<.mdp>] (grompp.mdp) (Input) grompp input file with MD parameters
-po [<.mdp>] (mdout.mdp) (Output) grompp input file with MD parameters
-c [<.gro/.g96/...>] (conf.gro) (Input) Structure file: gro g96 pdb brk ent esp tpr tpb tpa
-r [<.gro/.g96/...>] (conf.gro) (Input, Opt.) Structure file: gro g96 pdb brk ent esp tpr tpb tpa
-rb [<.gro/.g96/...>] (conf.gro) (Input, Opt.) Structure file: gro g96 pdb brk ent esp tpr tpb tpa
-n [<.ndx>] (index.ndx) (Input, Opt.) Index file
-p [<.top>] (topol.top) (Input) Topology file
-pp [<.top>] (processed.top) (Output, Opt.) Topology file
-o [<.tpr/.tpb/...>] (topol.tpr) (Output) Run input file: tpr tpb tpa
-t [<.trr/.cpt/...>] (traj.trr) (Input, Opt.) Full precision trajectory: trr cpt trj tng
-e [<.edr>] (ener.edr) (Input, Opt.) Energy file
-imd [<.gro>] (imdgroup.gro) (Output, Opt.) Coordinate file in Gromos-87 format
-ref [<.trr/.cpt/...>] (rotref.trr) (In/Out, Opt.) Full precision trajectory: trr cpt trj tng
The alignment makes it hard to read and it's haphazard in terms of inputs and outputs. What I'd like to see is something more like:
INPUT
-f [<.mdp>] (grompp.mdp) grompp input file with MD parameters
-c [<.gro/.g96/...>] (conf.gro) Structure file: gro g96 pdb brk ent esp tpr tpb tpa
-p [<.top>] (topol.top) Topology file
OUTPUT
-po [<.mdp>] (mdout.mdp) grompp input file with MD parameters
-o [<.tpr/.tpb/...>] (topol.tpr) Run input file: tpr tpb tpa
OPTIONAL INPUT
-r [<.gro/.g96/...>] (conf.gro) Structure file: gro g96 pdb brk ent esp tpr tpb tpa
-rb [<.gro/.g96/...>] (conf.gro) Structure file: gro g96 pdb brk ent esp tpr tpb tpa
-n [<.ndx>] (index.ndx) Index file
-t [<.trr/.cpt/...>] (traj.trr) Full precision trajectory: trr cpt trj tng
-e [<.edr>] (ener.edr) Energy file
-ref [<.trr/.cpt/...>] (rotref.trr) Full precision trajectory: trr cpt trj tng
OPTIONAL OUTPUT
-pp [<.top>] (processed.top) Topology file
-imd [<.gro>] (imdgroup.gro) Coordinate file in Gromos-87 format
(OK, I don't know where -ref goes because I've never used that feature) Improve the alignment a bit so that extensions, default names, and short descriptions all start at some uniform place and that will look downright elegant.
But the gist is that something like this would (1) reduce clutter by taking out repetitive (Input), (Output Opt), etc from each line and (2) provide some structure to the help information. Heck, maybe even alphabetize the options. The short descriptions could also be improved, because for options like -c, -r, -rb, etc. the extension and "description" say the same thing. If we improve the descriptions, that can cut down on some of the "wall of text" problems we're having, too.
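To make the grouping idea concrete, here is a rough sketch in Python (not GROMACS code) that sorts file options under INPUT / OUTPUT / OPTIONAL INPUT / OPTIONAL OUTPUT headings and pads the first three columns so that they start at a uniform place. The tuple layout and the name format_grouped are assumptions made for this illustration; the option data is a subset of the grompp listing above.

```python
# (flag, file type, default name, kind, optional?, short description)
FILE_OPTIONS = [
    ("-f", "[<.mdp>]", "grompp.mdp", "Input", False,
     "grompp input file with MD parameters"),
    ("-po", "[<.mdp>]", "mdout.mdp", "Output", False,
     "grompp input file with MD parameters"),
    ("-n", "[<.ndx>]", "index.ndx", "Input", True, "Index file"),
    ("-p", "[<.top>]", "topol.top", "Input", False, "Topology file"),
    ("-pp", "[<.top>]", "processed.top", "Output", True, "Topology file"),
]


def format_grouped(options):
    """Group options by kind/optionality and align the leading columns."""
    groups = {}
    for flag, ftype, default, kind, optional, desc in options:
        heading = ("OPTIONAL " if optional else "") + kind.upper()
        row = (flag, ftype, "(" + default + ")", desc)
        groups.setdefault(heading, []).append(row)
    rows = [row for group in groups.values() for row in group]
    if not rows:
        return ""
    # Widths are computed over all options so columns match across groups.
    widths = [max(len(row[i]) for row in rows) for i in range(3)]
    preferred = ["INPUT", "OUTPUT", "OPTIONAL INPUT", "OPTIONAL OUTPUT"]
    headings = [h for h in preferred if h in groups]
    # Any other kind (e.g. In/Out) is listed after the preferred headings.
    headings += [h for h in groups if h not in preferred]
    lines = []
    for heading in headings:
        lines.append(heading)
        for row in groups[heading]:
            cells = [row[i].ljust(widths[i]) for i in range(3)]
            lines.append("  " + "  ".join(cells + [row[3]]))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_grouped(FILE_OPTIONS))
```

Within each group the options keep the order in which they were declared; alphabetizing would be one extra sort per group.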
I had a point #4 when I started this, but now I've forgotten...
I should also state that I appreciate everyone's efforts thus far in improving things like this. Our guys are always in awe of GROMACS documentation, which makes CHARMM look like a train wreck :)
#7 Updated by Teemu Murtola about 5 years ago
For the file options and alignment in general, I agree that there is definitely some room for improvement, but there are issues with your proposal as well:
- Even with the repetitive (Input) etc. taken out, your example still has lines longer than 80 characters, so it will look a mess on a standard-width terminal. Portably detecting the width of the terminal so that the wrapping can adapt is not trivial, although probably doable.
- Grouping the options like you suggest, and/or alphabetizing them breaks any natural order that there is in the option declarations in the source code. For example, most tools now list the generic options first, followed by tool-specific options, and have related options next to each other.
- Making everything line up is certainly possible, but that means that the alignment will eat up a lot of screen real estate. If it is acceptable that every option always takes two lines so that the description always starts on a new line, then that should be doable. Otherwise there isn't enough space on a standard-width terminal. Or we will end up with longer descriptions taking up many more lines, if we only use the right half of the terminal to cram the descriptions into a narrow column, like the pre-5.0 code did.
- There was/is no API to specify alternative descriptions for file options in t_filenm. The new/C++-converted tools actually specify more meaningful short descriptions; take a look at, e.g., gmx help sasa. But using that in old tools would require API changes, and someone would need to write the descriptions for all the options for the mostly unmaintained tools.
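As an illustration of two of the points above (adapting to the terminal width, and always starting the description on a new line), a minimal Python sketch could look like this. It is not how GROMACS does it; terminal_width, format_option, the 40/120 limits and the 11-space description indent are all assumptions chosen for the example. shutil.get_terminal_size falls back to the given default when the width cannot be determined, e.g. when output is piped.

```python
import shutil
import textwrap


def terminal_width(default=80, minimum=40, maximum=120):
    """Best-effort terminal width, clamped to a sane range."""
    columns = shutil.get_terminal_size(fallback=(default, 24)).columns
    return max(minimum, min(columns, maximum))


def format_option(flag, ftype, default, kind, description, width):
    """Two-line layout: option header first, description on a new line."""
    header = " {} {} ({}) ({})".format(flag, ftype, default, kind)
    indent = " " * 11   # hypothetical indent for the description lines
    body = textwrap.fill(description, width=width,
                         initial_indent=indent, subsequent_indent=indent)
    return header + "\n" + body


if __name__ == "__main__":
    print(format_option("-c", "[<.gro/.g96/...>]", "conf.gro", "Input",
                        "Structure file: gro g96 pdb brk ent esp tpr tpb tpa",
                        terminal_width()))
```

With this layout the description can use nearly the full line width, at the cost of at least two lines per option.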
In the context of this issue, only the alignment and introduction of the [<.xvg>] part is something that has changed from pre-5.0, and the latter is there to make things a bit more consistent between the two option listings.
#8 Updated by Szilárd Páll almost 5 years ago
Thanks for quickstarting the discussion.
@Teemu: my request at the time of 5.0 was asking for more adherence to the formatting standards. Sadly, I did not have enough time to provide more input or work on the code itself, but my current complaints, as you will hopefully agree, are complementary or unrelated to what I labeled as "looks good".
Secondly, sorry for the delay, it took me some time to compile useful information, had to dig through help outputs and man pages for examples; please take it as constructive criticism and not bashing!
0. :-) GROMACS (-: I already noted when the changes were first introduced that the header, centered on an 80-column width and with its peculiar and hence easy-to-spot pattern, was, as Justin points out, a great way to identify the start of a file or of terminal output. Now that the header is left-aligned
1. Help and man output are (mostly) different. Without too many details (more on these in 2/3), the former should be short and concise (typically with usage and the most common options), while the latter is typically detailed and structured. A good example to check is man git vs git --help or man find vs find --help. One simple metric that should illustrate my point even without typing is:
$ man git | wc
965 4625 38192
$ git --help | wc
32 235 1768
$ man find | wc
1316 9573 69663
$ find --help | wc
32 235 1768
The 32-line help in both cases is not a typo, just a coincidence, I guess :)
2. -h output: This typically contains only usage info, a brief description in some cases, a short list of (typically mandatory) options/arguments, sometimes a few lines of important description, and contact/home page/bug reporting info.
Regarding the changes I propose/consider useful: at the risk of stating the obvious, the user docs should not be dumped on the help output, and the current order in particular makes the already hard-to-read synopsis even less useful. I think that even if the effort of a complete redesign is (understandably) too much in the first stage, at least a few moderate changes could improve things a lot: the hundreds of lines of user guide text could simply be dropped; the synopsis could be turned into simpler/shorter usage info; and the option list could at least be reformatted and possibly trimmed to contain only the important stuff. The complete list of all the exotic input/output control options that most users (including me) have never used should definitely be reserved for the man page, I think.
3. man page: In general, the current formatting and organization of the content is decent for the man page format. However, I think more categorization, i.e. classification into topics/usage scenarios, is needed to improve the readability and ease of navigation.
3a. Synopsis: IMHO it is really important. However, it rarely contains all options; it is meant to show in what ways/use cases the command/program/function can be employed, not all possible uses of it. Typically the most important features or categories of use are included, and sometimes many or most existing options are omitted to avoid clutter. Examples:
- The git merge example shows the different functionalities nicely:
git merge [-n] [--stat] [--no-commit] [--squash] [--[no-]edit] [-s <strategy>] [-X <strategy-option>] [-S[<key-id>]] [--[no-]rerere-autoupdate] [-m <msg>] [<commit>...]
git merge <msg> HEAD <commit>...
git merge --abort
- The rsync synopsis goes even further highlighting the distinct categories of use-cases and it obviously does not list the huge number of options - for those they have a separate ~250 line option summary section and then about 2500-3000 lines of option documentation (reflecting the importance of categorization, more on that later).
Local: rsync [OPTION...] SRC... [DEST]
Access via remote shell:
  Pull: rsync [OPTION...] [USER@]HOST:SRC... [DEST]
  Push: rsync [OPTION...] SRC... [USER@]HOST:DEST
Access via rsync daemon:
  Pull: rsync [OPTION...] [USER@]HOST::SRC... [DEST]
        rsync [OPTION...] rsync://[USER@]HOST[:PORT]/SRC... [DEST]
  Push: rsync [OPTION...] SRC... [USER@]HOST::DEST
        rsync [OPTION...] SRC... rsync://[USER@]HOST[:PORT]/DEST
Usages with just one SRC arg and no DEST arg will list the source files instead of copying.
Based on these, I propose that our synopsis include the different usage scenarios, e.g. for mdrun: tMPI mode, (native) MPI mode, -multi mode, etc.
3b. Description: It needs some kind of organization into topics/categories.
3c. Options: The list of options would be best categorized (e.g. input/output options, run control options, etc.). Additionally, as other programs' man pages illustrate well, the -h one-liner description of an option is different from the man page's option description. Hence, in the long run, more detailed descriptions of the options could be useful. I guess this will mean shifting some of the content from the main block of text to the option descriptions.
3d. Other sections WIP
(Sorry for the WIP post, still need to put some thought into the last/last 2-3 items.)
#9 Updated by Teemu Murtola almost 5 years ago
I don't disagree with your points, but I fail to see how they relate to your original description in this issue: all of them, except for 0, relate to content of the help/man pages, and things are there exactly the same in 4.6 and 5.0. Many of your points were also already captured in #1437, and replied to there, so I don't want to start repeating things here. None of the things you mention, except 0 marginally, have any relation to the changes done to -h/man page formatting between 4.6 and 5.0.
#10 Updated by Szilárd Páll almost 5 years ago
You're right, the 5.0 changes indeed mostly involved a redesign of the header, but also a reformatting of the option list. You're also right that some of the other points are not related to the changes in 5.0 and greatly overlap with #1437, which I forgot to read through - sorry about that.
Regarding the option list, I meant to but forgot to mention that Justin's proposed formatting looks reasonable and even though this will require multiple lines per option, I don't think that is an issue - especially if the help output gets reduced to display only a subset of options. One way to accomplish this is to tag some of the options for help output. The same mechanism of using flags/tags could be used for custom categorization (e.g. input/output, run control options etc.). It seems that such flags/tags would require a relatively small change in the option module's API and no changes in the order the options are listed in the code.
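To illustrate what I mean by flags/tags, here is a rough sketch in Python (purely illustrative, not the options module API; the Option class, its fields, the category names and the choice of which options count as "important" are all hypothetical). The option names and value types are taken from the mdrun synopsis quoted earlier.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    value: str
    category: str = "Other"        # hypothetical categorization tag
    in_short_help: bool = False    # hypothetical "show in -h" tag


# Declared in source-code order; the tags do not change that order.
OPTIONS = [
    Option("-s", "[<.tpr>]", category="Input/output", in_short_help=True),
    Option("-deffnm", "<string>", category="Input/output", in_short_help=True),
    Option("-nt", "<int>", category="Run control"),
    Option("-maxh", "<real>", category="Run control"),
    Option("-reseed", "<int>", category="Run control"),
]


def short_help(options):
    """Names of the options tagged for the trimmed -h output."""
    return [opt.name for opt in options if opt.in_short_help]


def by_category(options):
    """Option names grouped by category, keeping declaration order."""
    grouped = {}
    for opt in options:
        grouped.setdefault(opt.category, []).append(opt.name)
    return grouped


if __name__ == "__main__":
    print(short_help(OPTIONS))
    print(by_category(OPTIONS))
```

The man page would then list everything, grouped via by_category, while -h would only print what short_help returns.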
#12 Updated by Teemu Murtola almost 5 years ago
- Category set to core library
- Status changed from New to Blocked, need info
The linked change should resolve all the concrete feedback to date about 4.6 to 5.0 changes; others should be tracked in some other issue (e.g., #1437), as they are not related to the original description of this issue. Now we need some feedback on whether the proposed changes actually address the original concerns to finalize the change.