Project

General

Profile

Task #887

Decide on pointer usage in C++

Added by Roland Schulz over 8 years ago. Updated about 6 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
core library
Target version:
Difficulty:
uncategorized
Close

Description

Our C++ developer guidelines are not clear yet on the rules for pointer usage. For exception safety it is important to use RAII and thus smart pointers.
Depending on the ownership pointers can be classified into 3 groups:
- Shared ownership with reference counting > shared_ptr
Passing/moving of (unique) ownership > ideal unique_ptr (also possible with shared_ptr and auto_ptr)
Non transferable ownership -> unique_ptr or scoped_ptr

The issue is that the ideal unique_ptr is introduced in C++11 and not supported yet by all important compilers. auto_ptr is confusing (and error prone) syntax. shared_ptr has overhead and might mislead developers to create less clear shared ownership designs.

A few proposals of which pointers to use for these 3 classes:
1) shared_ptr, shared_ptr, scoped_ptr
2) shared_ptr, unique_ptr, unique_ptr (with a C++03 emulation of unique_ptr)
3) shared_ptr, unique_ptr, scoped_ptr (also with emulation)
4) shared_ptr, unique_ptr, scoped_ptr (where unique_ptr is not a emulation but syntax compatible in C++03, C++11 uses std::unique_ptr)

Other questions are:
- Should pointer or references be used for (const) arguments/return-values. Should bare pointers be avoided?

For the beginning of the discussion see: https://gerrit.gromacs.org/#/c/516/1 . Some was also in private email. Please ask if you need copies of those.

hello.cpp (970 Bytes) hello.cpp Example Program John Eblen, 03/02/2012 12:00 AM

Associated revisions

Revision a20e3f5b (diff)
Added by Roland Schulz over 8 years ago

Use smart pointers for internal memory management.

- Replaced uses of std::auto_ptr with boost::scoped_ptr where memory was
allocated locally and release() was not used, or refactored the use of
pointers out completely.
- Use gmx_unique_ptr in containers that are responsible for freeing
the contained objects.
- Removed an unnecessary #include <memory>.
- Also added doxygen comments in modified parts to satisfy Jenkins.

Part of issue #887.

Change-Id: Ib582b6395fb348d392a742acd4b1ff73acfa74a3

History

#1 Updated by Roland Schulz over 8 years ago

John (on Gerrit)

Regarding issue 2 and the unique_or_shared_ptr, I'm confused. What is the policy for transfer of ownership of such a pointer?
Simple assignment will not work (aside from the obvious problem that this only copies and does not move):

I think it depends on what we want to achieve. Sure we don't need a unique_or_shared_ptr if we go for proposal 1. We would only need a special pointer for proposal 4. And proposal 4 would require that the pointer being used works like a unique_ptr (but can be less efficient and less strict). Thus one could use a shared_ptr to implement it (auto_ptr only if we knew we don't need collections). The advantage would be one would have code which already works with C++11 unique_ptr (which is probably the long therm aim) and one would guarantee that it isn't misused because C++11 compilers would complain. But this would also imply that passing ownership would need to use the move syntax (e.g. a = std::move(b)). If one would implement that using Boost::Move one would have all the dependency and could as well use proposal 2/3. But if move is simply a no-op and unique_ptr is simply a shared_ptr for C++03 that actually might work. Only developer would better test with a C++11 compiler so that they don't get a whole bunch of compile errors (for each forgotten move) in Jenkins for errors their C++03 compiler didn't catch. Another nice advantage of that would be that it would work with non move aware containers. Of course the disadvantage is that it has the shared_ptr overhead. But since this would only be the C++03 fallback, this might be OK.

#2 Updated by Roland Schulz over 8 years ago

Teemu (on Gerrit)

3) One of the big benefits of C++11 unique_ptr is that it is not possible to implicitly move it; according to the link John provided, the boost::move emulation allows it (although only if we use their rvalue reference wrapper in our code, which we may not want to do).

I don't think this is an issue. unique_ptr(using boost::move) guarantees that you can't implicitly move. Thus given "unique_ptr<int> p(new int(1)), p2;", p2=p is illegal but p2=move(p) is legal. I just tested that. The only problem I can find in John's link is some corner case for types.

4) My gut feeling is that boost unique_ptr does not provide enough benefits to be worth the effort or the size it brings. For reference, I checked that the whole <memory> header on MSVC (which declares unique_ptr, shared_ptr and some more) is only ~3000 lines. I think that for C++11 features, it could be better to wait until the compilers we need have the support before we start using them, and not try to do any complex workarounds.

Unique_ptr (465 lines) and move (1267 lines) are both small. The problem is that both depend on parts of type_traits and mpl. unique_ptr depends on type_traits/is_pointer.hpp and mpl/if.hpp. These two alone are 3.4MB and all unique_ptr dependencies (without move) are 3.5MB. move depends on boost/mpl/and.hpp which makes it with is_pointer 7MB. All of move is 7.2MB. Most of that are the preprocessed header files for the different compilers. Thus each individual compiler would only compiles a very small part of it.

Almost all boost libraries (>80%) have some dependency on type_traits and/or mpl (e.g. serialization and mpi). Thus it is questionable whether we can avoid that dependency in the long run even if we only use small parts of boost.

Having said that: if we could make it work that we can use shared_ptr on C++03 and unique_ptr on C++11 compilers the advantage would really be very small and probably not worth it. If we can't make that work, I would think the advantage is in the long run quite significant. It will be many years until all compilers we want to support have all unique_ptr implemented (judging from the fact that people use really old compiler and how long it took in the past for all compilers to support C++03). Thus we would be stuck with using shared_ptr instead of unique_ptr for a long while even for modern compilers (and thus having the (small) overhead).

#3 Updated by Mark Abraham over 8 years ago

I'm a long way from being a good user of C++, but doesn't "passing/moving of (unique) ownership" suggest a design flaw? If the ownership needs to move, why should it have been unique? Can someone suggest a good example of where we really need this kind of pointer? If not, then we can plan on shared_ptr for type 1 and scoped_ptr for type 3 until we hit an actual problem?

#4 Updated by Roland Schulz over 8 years ago

E.g. source+sink function (e.g. a factory + a consumer), containers (ownership needs to be transferred into the container). In the current code for review look at all usages for shared_ptr. They are all of the 2nd type and none of the 1st. The 1st should be avoided when possible. In all these cases the ownership of the pointer is still unique in the sense that only one class/scope/method/container/... has ownership. And thus as soon as this class is deleted (/scope is left) the object the smart pointer points to is deleted. Thus no reference counting is required.

#5 Updated by Teemu Murtola over 8 years ago

  • Tracker changed from Bug to Task

Performance issues won't probably have any major role in this: typically, pass-of-ownership would occur in initialization and not in any inner loop (if it does, it means that heap memory is also constantly allocated, which is probably more critical to performance). There shouldn't be any performance cost in just dereferencing a shared_ptr.

If I'm not mistaken, both the emulation alternatives would also require us to use containers that are aware of this emulation (unless we still use shared_ptr in the containers), which will increase the dependency footprint.

If I understood correctly, the emulation does not correctly handle this (skipping the fact that syntax is different for the emulation):

void f(unique_ptr<int> &&p);

unique_ptr<int> p1(new int(1));
f(p1);

This fails in C++11 (should be f(std::move(p1))), but the emulation limitations page seems to indicate that this passes with the emulation. E.g., std::vector::push_back() has this kind of signature in C++11.

#6 Updated by Roland Schulz over 8 years ago

Teemu Murtola wrote:

Performance issues won't probably have any major role in this: typically, pass-of-ownership would occur in initialization and not in any inner loop (if it does, it means that heap memory is also constantly allocated, which is probably more critical to performance). There shouldn't be any performance cost in just dereferencing a shared_ptr.

True. So the only real advantage would be that by the different name the intention would be clear and that (at least new) compilers would prevent wrong usage.

If I'm not mistaken, both the emulation alternatives would also require us to use containers that are aware of this emulation (unless we still use shared_ptr in the containers), which will increase the dependency footprint.

Yes for the unique_ptr emulation that is true and is an important disadvantage. But if we simply use shared_ptr on C++03 compilers and (the real) unique_ptr on C++11 (as long as they support at least unique_ptr and the move aware containers) this would work without any dependency. And we would still have the advantage that C++11 compilers would give errors about incorrect usage. Would we need anything else then a gmx::move (noop function for C++03 and imported std::move for C++11) and gmx_unique_ptr (shared_ptr for C++03, unique_ptr for C++11 - see earlier discussion of how to best do emulate a template alias) to make this work? What do you think about that idea?

If I understood correctly, the emulation does not correctly handle this (skipping the fact that syntax is different for the emulation):
[...]
This fails in C++11 (should be f(std::move(p1))), but the emulation limitations page seems to indicate that this passes with the emulation. E.g., std::vector::push_back() has this kind of signature in C++11.

It works correctly with boost::container::vector because they do extra magic in the container to make the compiler fail (the resulting compiling errors are pretty incomprehensible) but you are right just declaring a function f(BOOST_RV_REF(unique_ptr<int> p) is not enough to get the compiler to fail.

#7 Updated by Teemu Murtola over 8 years ago

Roland Schulz wrote:

But if we simply use shared_ptr on C++03 compilers and (the real) unique_ptr on C++11 (as long as they support at least unique_ptr and the move aware containers) this would work without any dependency. And we would still have the advantage that C++11 compilers would give errors about incorrect usage. Would we need anything else then a gmx::move (noop function for C++03 and imported std::move for C++11) and gmx_unique_ptr (shared_ptr for C++03, unique_ptr for C++11 - see earlier discussion of how to best do emulate a template alias) to make this work? What do you think about that idea?

I think that this (option 4) would be a reasonable approach, if we can make it work. I think the essential parts are the emulated template alias and std::move wrapper as you say. We can't use rvalue parameters for functions, but those should be possible to replace with simple pass-by-value at a (negligible) performance cost. I don't think I will have time to do anything about this in the next two weeks, though...

My preference would be to first try whether this can be made to work, and if not, simply go for option 1.

#8 Updated by John Eblen over 8 years ago

Roland and I created a "Hello World" program, attached,that shows one way that option 4 could work (using shared_ptr for C++03 and unique_ptr for C++0x).

We compiled as follows with gcc (must be version 4.4 or higher):
g++ -o hello hello.cpp
g++ -DC11 -std=c++0x -o hello11 hello.cpp

#9 Updated by Roland Schulz over 8 years ago

A few issues where gmx_unique_ptr doesn't behave as unique_ptr when using a C++03 compiler (and thus using shared_ptr underneath):
  • no release method. One cannot implement one for shared_ptr because it doesn't provide any method to remove ownership.
  • As is, does not work for arrays. Easy to add by
      template<typename T> struct gmx_unique_ptr<T[]> { typedef shared_array<T> type; };
      

    But that adds a dependency on shared_array and I assume we don't want to use it for arrays anyhow. Since one gets an error if trying to use gmx_unique_ptr<T[]> without the additional code, I think it is fine. It isn't unsafe (incorrect but no error) as gmx_unique_ptr<T> p(new T[N]) (which calls delete instead of delete[] - but isn't caught by unique_ptr either).
  • Does not provide a way to set custom deleter function. It can be added by (replacing the gmx_unique_ptr one parameter version in John's version):
    #ifdef C11
    template<typename T, typename D = std::default_delete<T>>
    struct gmx_unique_ptr {
        typedef std::unique_ptr<T,D> type;
    };
    #else
    template<typename T, typename D=void> //D is not used so type does not matter
    struct gmx_unique_ptr {
        typedef shared_ptr<T> type;
    };
    #endif
    

    And then used as:
    gmx_unique_ptr<int,void (*)(int*)>::type p4(new int(12),&custom_deleter);
    

    Of course this emulation doesn't make sure that the type passed to the constructor is of the right type and behaves different if the type has an implicit conversion or doesn't need any value at all. Because it would behave different in a few (odd) cases, it might be better to just stay with the 1 parameter template version and not support custom deleter.

#10 Updated by Roland Schulz over 8 years ago

  • Status changed from New to Closed

With change Ib582b639 merged the main issue is mostly solved. If we still have open policy question regarding bare pointer usage, it probably should be discussed on the mailing list or in a separate issue.

#11 Updated by Teemu Murtola about 6 years ago

  • Category set to core library
  • Target version set to 5.0

Also available in: Atom PDF