Project

General

Profile

Feature #2570

Better string formatting and printing

Added by Joe Jordan 7 days ago. Updated 2 days ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Difficulty:
uncategorized
Close

Description

The transition to c++ means that there is an unholy mix of std::sting and char floating around the codebase. This means there is a lot of converting between types; currently there are >500 instances of c_str(). The current potential solutions leave some things still desired e.g.:
IOStream lacks formated output and needs an unaesthetic number of chevrons.
gmx::formatString is somewhat useful here but still needs c_str() conversions.

A potential solution (h/t Roland S.) is http://fmtlib.net/latest/index.html which has python-like positional formatted printing of arbitrary types. I don't know that we want to make gromacs dependent on the fmt library but it is worth discussing whether a similar functionality should exist in gromacs.

Associated revisions

Revision 4c266b29 (diff)
Added by Roland Schulz 3 days ago

Warn for type mismatch for gmx printf like functions 1/3

Automatic fixes with clang fix-it

Related #2570

Change-Id: Id565d2705c673d1e45318bb8bf1091c165a724de

History

#1 Updated by Mark Abraham 7 days ago

There's a few related issues entangled here.

These kind outputs go to a stream, which should be a low level component used by some higher level writer that understands stuff like lines, indenting, wrapping and formatting. That permits the stream to be swapped out for testing, or do nothing if the kind of output should not be emitted at this verbosity level. Teemu made a lot of this already (see http://manual.gromacs.org/documentation/current/doxygen/html-full/classgmx_1_1TextWriter.xhtml and related TextReader and various text and file stream implementaionts). There's an example use case in write_inpfile() of src/gromacs/fileio/readinp.cpp for writing out .mdp files. A useful addition would be to borrow from the implementation of formatString and add to TextWriter some methods like writeFormattedLine() and writeFormattedString(). C++11 argument forwarding makes this easier than back when Teemu first implemented formatString.

An even more useful thing would be to forward the arguments after using metaprogramming to call c_str() on any type that supports it. Who's keen to hurt their brain a bit? :-) Example at https://msdn.microsoft.com/en-us/magazine/dn913181.aspx

Avoiding other use of c_str() is often a matter of providing a convenience overload, or refactoring to no longer have a C-style consumer. However there are plenty of cases where neither of those is feasible in the scope of a current change, and one should use c_str() and maybe leave a TODO. And later coming back to resolve them some time ;-) Some good news is that there is tooling that can find that a c_str() is being used to construct a temporary std::string to pass to a call that now uses std::string rather than const char *.

Type safe formatting is a nice thing to have, but it's a stretch goal. Until the string output of some area of the code is under test (e.g. by using the above approaches), it isn't be a priority to replace the formatting by something else (even if the library dependency was 100% desirable) because there's a risk of inadvertently changing the usability of the resulting output. I agree there's good things that would flow from such formatting, but there are simpler ways to get bigger impact for now.

#2 Updated by Roland Schulz 6 days ago

TextWriter and formatString are orthogonal. If we had a better formatString that could be used by TextWriter.

Writing a formatString which is better requires a lot of code. As you say we would need overloading. And va_args doesn't work with overloading and is anyhow very ugly (core-guidelines says you shoulnd't use). It should be a variadic template. The problem is that there is no built-in function to just replace the first conversion "%..." of the fmt string. But that is required when replacing va_args because you need to iterate (or recurse) over the arguments. Thus you would end having to implement your own conversion function. And in that case you would end up with almost as much code as if you just take the one header from fmt which does that (that code isn't big). And if you anyhow have a type-aware conversion function (which you would automatically get by using a variadic template) there is no reason not to make it typesafe.

There is no need to change the print format. fmt supports printf style as legacy format: http://fmtlib.net/latest/api.html#printf-api . Whether we decide to stick to the legacy fmt even for new code (to have consistent format) or use the new format for new code (to avoid specifying information which is redundant for a typesafe printing) is a separate question.

#3 Updated by Mark Abraham 6 days ago

Roland Schulz wrote:

TextWriter and formatString are orthogonal. If we had a better formatString that could be used by TextWriter.

Writing a formatString which is better requires a lot of code. As you say we would need overloading. And va_args doesn't work with overloading and is anyhow very ugly (core-guidelines says you shoulnd't use). It should be a variadic template. The problem is that there is no built-in function to just replace the first conversion "%..." of the fmt string. But that is required when replacing va_args because you need to iterate (or recurse) over the arguments. Thus you would end having to implement your own conversion function. And in that case you would end up with almost as much code as if you just take the one header from fmt which does that (that code isn't big). And if you anyhow have a type-aware conversion function (which you would automatically get by using a variadic template) there is no reason not to make it typesafe.

I think we're talking about different things. I'm proposing a variadic template function that recognizes any std::string in the template parameter pack and uses c_str() to convert any such arg, which then calls something like formatString() with the resulting function parameter pack. That doesn't touch the format string or va_args, and sounds to me like a much lighter weight way to let reduce developers frequency of calling c_str().

#4 Updated by Roland Schulz 6 days ago

I agree. I misunderstood you and your solution should be easy to do.
In gerrit, I was talking mostly about the problem that forgetting .cstr() isn't a compiler error, rather than the very minor issue of having to type 7 more characters. And of course this type safety issue is something which doesn't just apply to string. But of course there would be an advantage to at least avoid the string usage error without fully resolving the type safety issue. If we don't do a proper type safe solution we should at least add format function attribute to formatString.

#5 Updated by Roland Schulz 6 days ago

const char* conv(const char *s) { return s;}
const char* conv(const std::string &s) { return s.c_str();}

//template<typename ...Args>
//std::string formatString(const char*, Args...) __attribute__ ((format (printf, 1, 2)));

template<typename ...Args>
std::string formatString(const char* fmt, Args...args)
{
    return formatStringImpl(fmt, conv(args)...);
}

This works. But only without the attribute because the attribute is only valid for variadic functions not variadic templates. I think the warning is more important than the convenience.

#6 Updated by Gerrit Code Review Bot 6 days ago

Gerrit received a related patchset '1' for Issue #2570.
Uploader: Roland Schulz ()
Change-Id: gromacs~master~Ie2bc5181a7fee1c0d1a647a12d8c73c4fb3ebe96
Gerrit URL: https://gerrit.gromacs.org/8080

#7 Updated by Gerrit Code Review Bot 6 days ago

Gerrit received a related patchset '1' for Issue #2570.
Uploader: Roland Schulz ()
Change-Id: gromacs~master~Id565d2705c673d1e45318bb8bf1091c165a724de
Gerrit URL: https://gerrit.gromacs.org/8082

#8 Updated by Gerrit Code Review Bot 6 days ago

Gerrit received a related patchset '1' for Issue #2570.
Uploader: Roland Schulz ()
Change-Id: gromacs~master~I2046d323dad6420e61f47918b12aa538f4a6d4a0
Gerrit URL: https://gerrit.gromacs.org/8083

#9 Updated by Gerrit Code Review Bot 4 days ago

Gerrit received a related patchset '1' for Issue #2570.
Uploader: Roland Schulz ()
Change-Id: gromacs~master~I9b46559c2554404b309aa8710eb34ffa128d8cae
Gerrit URL: https://gerrit.gromacs.org/8087

#10 Updated by Roland Schulz 3 days ago

absl also has a type-safe fmt library: https://abseil.io/docs/cpp/guides/format
Uses standard printf syntax. Ignoring the size modifier and supporting string/string_view.
And absl has a lot of other things which would improve code quality a lot (span, fixed_array, optional, variant).

#11 Updated by Roland Schulz 2 days ago

For compile time format string checking absl requires clang's enable_if attribute and fmt requires full c++14 constexpr support(clang, msvc, gcc >=6). While neither requirement covers all compilers, as long some Jenkins configuration have compile time checking enabled, it should be sufficient.

Also available in: Atom PDF