Thursday, February 17, 2011

Could C++ have not obviated the pimpl idiom?

As I understand it, the pimpl idiom exists only because C++ forces you to place all the private class members in the header. If the header contained only the public interface, then in theory a change in the class implementation would not necessitate a recompile of the rest of the program.

What I want to know is why C++ was not designed to allow such a convenience. Why does it demand that the private parts of a class be openly displayed in the header (no pun intended)?

From stackoverflow
  • Maybe because the size of the class is required when passing an instance by value, aggregating it in other classes, etc.?

    If C++ did not support value semantics, it would have been fine, but it does.
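
    As a minimal sketch of this point (the names Widget, HoldsPointer and HoldsValue are hypothetical, not from the question): a class that is only forward-declared can be referred to through a pointer, but it cannot be held by value until its full definition, private members included, is visible.

    ```cpp
    class Widget;  // forward declaration only: size and layout are unknown here

    // Fine: a pointer to an incomplete type has a known size.
    struct HoldsPointer {
        Widget* w;
    };

    // Not fine at this point: uncommenting this fails to compile,
    // because the compiler cannot compute sizeof(Widget) yet.
    // struct TooEarly {
    //     Widget w;
    // };

    // The full definition, including the private members.
    class Widget {
        int id_;
        double weight_;
    };

    // Now a by-value member works: the size of Widget and the offset of
    // `tag` can both be computed.
    struct HoldsValue {
        Widget w;
        char tag;
    };

    static_assert(sizeof(HoldsPointer) == sizeof(Widget*),
                  "a pointer member does not depend on Widget's layout");
    static_assert(sizeof(HoldsValue) > sizeof(Widget),
                  "a by-value member needs Widget's full size");
    ```

    Adding a private member to Widget changes sizeof(HoldsValue), which is why every client that holds a Widget by value has to be recompiled.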

  • Someone will have a much more verbose answer than I, but the quick response is two-fold: the compiler needs to know all the members of a struct to determine the storage space requirements, and the compiler needs to know the ordering of those members to generate offsets in a deterministic way.

    The language is already fairly complicated; I think a mechanism to split the definitions of structured data across the code would be a bit of a calamity.

    Typically, I've seen policy classes used to define implementation behavior in a pimpl-like manner. I think the policy pattern has some added benefits: it is easier to interchange implementations, and you can combine multiple partial implementations into a single unit, which lets you break up the implementation code into functional, reusable units.

  • This would be a nice feature, but the obstacle is the size of the object. When the .h file is read, the size of the object is known (based on all its contained elements).

    If the private elements are not known, then the compiler would not know how large an object to allocate with new.

    You can simulate your desired behavior by the following:

    class MyClass
    {
    public:
       // public stuff
    
    private:
    #include "MyClassPrivate.h"
    };
    

    This does not enforce the behavior, but it gets the private stuff out of the .h file. On the downside, this adds another file to maintain. Also, in Visual Studio, IntelliSense does not work for the private members; this could be a plus or a minus.

    Frederick : But of course! What a boob I am to ask that question. Thanks everyone anyway.
    peterchen : Another downside is that a change to the private interface still does require a recompile of the clients.
    J.F. Sebastian : Why is this answer accepted? "MyClassPrivate.h" can be read as easily as the original header. It still requires a recompile. Size of the object is a minor issue. The true show-stoppers are efficiency and backward-compatibility with some C idioms.
    Edward Kmett : Unfortunately MyClassPrivate.h doesn't add any value from a link-time perspective. You still have to traverse them all. For private template member functions, it would be much nicer to not have to include them in the header at all.
    KJAWolf : My soul aches at the sight of that include statement.
  • Yes, but...

    You need to read Stroustrup's book "The Design and Evolution of C++". Such a design would have inhibited the uptake of C++.

  • You're all ignoring the point of the question -

    Why must the developer type out the PIMPL code?

    For me, the best answer I can come up with is that we don't have a good way to express C++ code that allows you to operate on it. For instance, compile-time (or pre-processor, or whatever) reflection or a code DOM.

    C++ badly needs one or both of these to be available to a developer to do meta-programming.

    Then you could write something like this in your public MyClass.h:

    #pragma pimpl(MyClass_private.hpp)
    

    And then write your own, really quite trivial wrapper generator.
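
    For reference, this is roughly the boilerplate such a generator would have to emit. It is a hand-written sketch, assuming C++11 and std::unique_ptr; the members name() and bump() are hypothetical, and the two halves are shown in one listing although they would normally live in MyClass.h and MyClass.cpp.

    ```cpp
    #include <cassert>
    #include <memory>
    #include <string>
    #include <utility>

    // ---- would live in MyClass.h (seen by every client) ----
    class MyClass {
    public:
        explicit MyClass(std::string name);
        ~MyClass();                              // defined where Impl is complete
        MyClass(MyClass&&) noexcept;
        MyClass& operator=(MyClass&&) noexcept;

        const std::string& name() const;
        int bump();                              // increments a hidden counter

    private:
        struct Impl;                             // incomplete here: members stay hidden
        std::unique_ptr<Impl> impl_;
    };

    // ---- would live in MyClass.cpp (the only file that changes) ----
    struct MyClass::Impl {
        explicit Impl(std::string n) : name(std::move(n)), counter(0) {}
        std::string name;
        int counter;
    };

    MyClass::MyClass(std::string name) : impl_(new Impl(std::move(name))) {}
    MyClass::~MyClass() = default;
    MyClass::MyClass(MyClass&&) noexcept = default;
    MyClass& MyClass::operator=(MyClass&&) noexcept = default;

    // Every public member is a one-line forwarder to the hidden Impl.
    const std::string& MyClass::name() const {
        assert(impl_ && "MyClass used after being moved from");
        return impl_->name;
    }

    int MyClass::bump() {
        assert(impl_ && "MyClass used after being moved from");
        return ++impl_->counter;
    }
    ```

    Adding a field to Impl touches only the implementation file, because clients see nothing but a pointer-sized member. The price is the forwarding code and a heap allocation per object.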

  • I think there is a confusion here. The problem is not about headers. Headers don't do anything (they are just ways to include common bits of source text among several source-code files).

    The problem, as much as there is one, is that class declarations in C++ have to define everything, public and private, that an instance needs to have in order to work. (The same is true of Java, but the way reference to externally-compiled classes works makes the use of anything like shared headers unnecessary.)

    It is in the nature of common Object-Oriented Technologies (not just the C++ one) that someone needs to know the concrete class that is used and how to use its constructor to deliver an implementation, even if you are using only the public parts. The device in (3, below) hides it. The practice in (1, below) separates the concerns, whether you do (3) or not.

    1. Use abstract classes that define only the public parts, mainly methods, and let the implementation class inherit from that abstract class. So, using the usual convention for headers, there is an abstract.hpp that is shared around. There is also an implementation.hpp that declares the inherited class and that is only passed around to the modules that implement methods of the implementation. The implementation.hpp file will #include "abstract.hpp" for use in the class declaration it makes, so that there is a single maintenance point for the declaration of the abstracted interface.

    2. Now, if you want to enforce hiding of the implementation class declaration, you need to have some way of requesting construction of a concrete instance without possessing the specific, complete class declaration: you can't use new and you can't use local instances. (You can delete though.) Introduction of helper functions (including methods on other classes that deliver references to class instances) is the substitute.

    3. Along with or as part of the header file that is used as the shared definition for the abstract class/interface, include function signatures for external helper functions. These functions should be implemented in modules that are part of the specific class implementations (so they see the full class declaration and can exercise the constructor). The signature of the helper function is probably much like that of the constructor, but it returns an instance reference as a result. (This constructor proxy can return a NULL pointer, and it can even throw exceptions if you like that sort of thing.) The helper function constructs a particular implementation instance and returns it cast as a reference to an instance of the abstract class.

    Mission accomplished.

    Oh, and recompilation and relinking should work the way you want, avoiding recompilation of calling modules when only the implementation changes (since the calling module no longer does any storage allocations for the implementations).
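
    The three steps above can be sketched as follows. This is an illustration under assumptions, not a definitive layout: Shape, Rectangle and make_rectangle are hypothetical names, and the parts marked abstract.hpp and implementation.hpp are shown in one listing for brevity.

    ```cpp
    #include <cassert>
    #include <string>

    // ---- abstract.hpp: shared with every caller (step 1) ----
    class Shape {
    public:
        virtual ~Shape() {}                      // lets callers delete through Shape*
        virtual double area() const = 0;
        virtual std::string name() const = 0;
    };

    // Helper function ("constructor proxy") declared with the interface (step 3).
    // Returns a null pointer when the arguments are invalid.
    Shape* make_rectangle(double width, double height);

    // ---- implementation.hpp / .cpp: seen only by the implementing module ----
    class Rectangle : public Shape {
    public:
        Rectangle(double w, double h) : w_(w), h_(h) {
            assert(w >= 0 && h >= 0);            // the factory checks this first
        }
        double area() const override { return w_ * h_; }
        std::string name() const override { return "rectangle"; }

    private:
        double w_;                               // private layout is invisible to callers
        double h_;
    };

    // Only this module sees the complete class, so only it can call new (step 2).
    Shape* make_rectangle(double width, double height) {
        if (width < 0 || height < 0) {
            return nullptr;
        }
        return new Rectangle(width, height);
    }
    ```

    A caller that includes only abstract.hpp would write Shape* s = make_rectangle(2.0, 3.0); and later delete s;. It never learns the size of Rectangle, so changing the private members of Rectangle requires recompiling only the implementing module and relinking.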
