Sunday, March 6, 2011

Why is ++i considered an l-value, but i++ is not?

Why is ++i an l-value, but i++ is not?

Initially there were two questions; one was removed because it was an exact duplicate. So please don't downvote the answers that address the difference between pre- and post-increment.

From stackoverflow
    • i++ - increment i and return its previous value.
    • ++i - increment i and return its current value.
  • i++ increments i and returns its original value. In other words, post-increment returns the variable and then increments the variable. ++i increments i and returns its new value. Or it increments the variable and then returns the value.

    In general, in C++, you should use pre-increment, unless you need the original value. It's more efficient, because it doesn't involve making a copy of the original value.

    jjnguy : This is no longer answering his question.
    Dan : It's not more efficient. This is a pretty straightforward optimisation for compilers to make. In fact, tiling algorithms (see http://en.wikipedia.org/wiki/Instruction_selection and maximal munch, which I've implemented in JavaCC for uni once) should catch this optimisation easily, in most cases.
    Bill the Lizard : The optimization can only really be done when the increment is in a statement by itself (or so I understand). If it's part of a compound statement the compiler still needs to use the temp variable.
    David Thornley : For a built-in type, there's little or no efficiency difference. For a user-defined type, it can be major. It's best to get into the pre- habit.
    Dan : @Bill, I guess it depends on the expression, but in those cases it's not as simple as switching i++ to ++i anyway. I imagine when people mention this "optimization" they mean cases where you can simply switch them.
  • Regarding LValue

    • In C (and Perl for instance), neither ++i nor i++ are LValues.

    • In C++, i++ is not an LValue but ++i is.

      ++i is equivalent to i += 1, which is equivalent to i = i + 1.
      The result is that we're still dealing with the same object i.
      It can be viewed as:

      int i = 0;
      ++i = 3;  
      // is understood as
      i = i + 1;  // i now equals 1
      i = 3;
      

      i++ on the other hand could be viewed as:
      First we use the value of i, then increment the object i.

      int i = 0;
      i++ = 3;  
      // would be understood as 
      0 = 3  // Wrong!
      i = i + 1;
      

    (edit: updated after a botched first attempt).
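
    The distinction can also be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the function names are hypothetical. It relies on `decltype` yielding `int&` for an lvalue expression and plain `int` for a prvalue, and it assigns through a reference rather than writing `++i = 3` directly.

```cpp
#include <type_traits>

// decltype of a non-identifier expression is T& for an lvalue and T for a
// prvalue; the operand is not evaluated, so i is never actually changed here.
bool pre_increment_is_lvalue() {
    int i = 0;
    return std::is_lvalue_reference<decltype(++i)>::value;
}

bool post_increment_is_lvalue() {
    int i = 0;
    return std::is_lvalue_reference<decltype(i++)>::value;
}

// Because ++i designates i itself, a non-const reference can bind to it
// and a later assignment through that reference lands in i.
int assign_through_pre_increment() {
    int i = 0;
    int& alias = ++i;  // i is now 1, alias refers to i
    alias = 3;         // writes to i
    // i++ = 3;        // does not compile: i++ is not an lvalue
    return i;
}
```

    Here `assign_through_pre_increment()` returns 3, matching the "i = i + 1; i = 3;" reading above.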

    Paul Stephenson : In my compiler, it is 'i++ = 5' that doesn't make sense. '++i = 5' is OK: you increment 'i', return 'i' and then reassign it to 5.
    Martin York : @Paul: Incrementing and assignment in the same expression is undefined behavior.
    Renaud Bompuis : @Paul and Martin: I corrected my post after my botched attempt, having worked on it with a clearer head than last night :-)
  • C#:

    public void test(int n)
    {
      Console.WriteLine(n++);
      Console.WriteLine(++n);
    }
    
    /* Output:
    n
    n+2
    */
    
  • The main difference is that i++ returns the pre-increment value whereas ++i returns the post-increment value. I normally use ++i unless I have a very compelling reason to use i++ - namely, if I really do need the pre-increment value.

    IMHO it is good practice to use the '++i' form. While the difference between pre- and post-increment is not really measurable when you compare integers or other PODs, the additional object copy you have to make and return when using 'i++' can have a significant performance impact if the object is either quite expensive to copy, or incremented frequently.

  • An example:

    var i = 1;
    var j = i++;
    
    // i = 2, j = 1
    

    and

    var i = 1;
    var j = ++i;
    
    // i = 2, j = 2
    
  • POD Pre increment:

    The pre-increment should act as if the object was incremented before the expression and be usable in this expression as if that happened. Thus the C++ standards committee decided it can also be used as an l-value.

    POD Post increment:

    The post-increment should increment the POD object and return a copy for use in the expression (See n2521 Section 5.2.6). As a copy is not actually a variable making it an l-value does not make any sense.

    Objects:

    Pre- and post-increment on objects are just syntactic sugar: the language provides a means to call methods on the object. Thus technically objects are not restricted by the standard behavior of the language but only by the restrictions imposed by method calls.

    It is up to the implementor of these methods to make the behavior of these objects mirror the behavior of the POD objects (It is not required but expected).

    Objects Pre-increment:

    The requirement (expected behavior) here is that the object is incremented (what that means depends on the object) and the method returns a value that is modifiable and looks like the original object after the increment happened (as if the increment had happened before this statement).

    This is simple and only requires that the method return a reference to itself. A reference is an l-value and thus will behave as expected.

    Objects Post-increment:

    The requirement (expected behavior) here is that the object is incremented (in the same way as pre-increment) and the value returned looks like the old value and is non-mutable (so that it does not behave like an l-value).

    Non-Mutable:
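    To do this you should return an object by value. When the result is used within an expression it is copy constructed into a temporary. That temporary is an rvalue: it cannot bind to a non-const reference and its address cannot be taken, so it does not behave like an l-value. (A class-type temporary is not itself const; declare the return type as const Node if calls to non-const members on the result should also be rejected.)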

    Looks like the old value:
    This is simply achieved by creating a copy of the original (probably using the copy constructor) before making any modifications. The copy should be a deep copy, otherwise any changes to the original will affect the copy and thus the state will change in relationship to the expression using the object.

    In the same way as pre-increment:
    It is probably best to implement post increment in terms of pre-increment so that you get the same behavior.

    class Node // Simple Example
    {
    public:
         /*
          * Post-Increment:
          * To make the result non-mutable return an object
          */
         Node operator++(int)
         {
             Node result(*this);   // Make a copy
             operator++();         // Define Post increment in terms of Pre-Increment
    
             return result;        // return the copy (which looks like the original)
         }
    
         /*
          * Pre-Increment:
          * To make the result an l-value return a reference to this object
          */
         Node& operator++()
         {
             /*
              * Update the state appropriately */
             return *this;
         }
    };
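
    For something that can actually be exercised, here is a minimal sketch of the same pattern with real state. The `Counter` class, its `value_` member and the `value()` accessor are hypothetical names chosen for illustration; they are not part of the answer above.

```cpp
// Hypothetical Counter type; it wraps a single int so the effect of each
// operator is easy to observe.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}

    // Pre-increment: modify in place and return a reference to *this,
    // so the result is an lvalue naming the same object.
    Counter& operator++() {
        ++value_;
        return *this;
    }

    // Post-increment: copy the old state, reuse pre-increment for the
    // update, and return the copy by value.
    Counter operator++(int) {
        Counter old(*this);
        ++*this;
        return old;
    }

    int value() const { return value_; }

private:
    int value_;
};
```

    With `Counter c(5);`, the expression `c++` yields a separate object holding 5 while `c` moves to 6, and `++c` yields a reference to `c` itself.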
    
  • By the way - avoid using multiple increment operators on the same variable in the same statement. You get into a mess of "where are the sequence points" and undefined order of operations, at least in C. I think some of that was cleaned up in Java and C#.

    David Thornley : In C and C++, using multiple increment operators on the same variable with no sequence points in between is undefined behavior. Java and C# may well have defined the behavior, I don't know offhand. I wouldn't call that "cleaning up", and wouldn't write such code anyway.
  • I'm getting the lvalue error when I try to compile

    i++ = 2;
    

    but not when I change it to

    ++i = 2;
    

    This is because the prefix operator (++i) changes the value in i, then returns i, so it can still be assigned to. The postfix operator (i++) changes the value in i, but returns a temporary copy of the old value, which cannot be modified by the assignment operator.


    Answer to original question:

    If you're talking about using the increment operators in a statement by themselves, like in a for loop, it really makes no difference. Preincrement appears to be more efficient, because postincrement has to increment itself and return a temporary value, but a compiler will optimize this difference away.

    for(int i=0; i<limit; i++)
    ...
    

    is the same as

    for(int i=0; i<limit; ++i)
    ...
    

    Things get a little more complicated when you're using the return value of the operation as part of a larger statement.

    Even the two simple statements

    int i = 0;
    int a = i++;
    

    and

    int i = 0;
    int a = ++i;
    

    are different. Which increment operator you choose to use as a part of multi-operator statements depends on what the intended behavior is. In short, no you can't just choose one. You have to understand both.
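
    A small sketch of those two snippets side by side, wrapped in functions so the results can be inspected. The function names are hypothetical; each returns the pair (final i, a).

```cpp
#include <utility>

// int i = 0; int a = i++;  -> a receives the old value of i
std::pair<int, int> use_post_increment() {
    int i = 0;
    int a = i++;
    return std::make_pair(i, a);  // (1, 0)
}

// int i = 0; int a = ++i;  -> a receives the new value of i
std::pair<int, int> use_pre_increment() {
    int i = 0;
    int a = ++i;
    return std::make_pair(i, a);  // (1, 1)
}
```

    In both cases i ends up as 1; only the value handed to a differs.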

  • Other people have tackled the functional difference between post and pre increment.

    As far as being an lvalue is concerned, i++ can't be assigned to because it doesn't refer to a variable. It refers to a calculated value.

    In terms of assignment, both of the following make no sense in the same sort of way:

    i++   = 5;
    i + 0 = 5;
    

    Because pre-increment returns a reference to the incremented variable rather than a temporary copy, ++i is an lvalue.

    Preferring pre-increment for performance reasons becomes an especially good idea when you are incrementing something like an iterator object (eg in the STL) that may well be a good bit more heavyweight than an int.
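
    To make that cost visible, here is a rough sketch using a hypothetical iterator-like type, `CountingCursor`, whose copy constructor counts how often it runs. It stands in for a heavyweight iterator and is not taken from any real library.

```cpp
// Hypothetical stand-in for an expensive-to-copy iterator.
struct CountingCursor {
    static int copies;  // number of copy-constructions so far
    int position;

    explicit CountingCursor(int start) : position(start) {}
    CountingCursor(const CountingCursor& other) : position(other.position) {
        ++copies;
    }

    CountingCursor& operator++() { ++position; return *this; }
    CountingCursor operator++(int) {
        CountingCursor old(*this);  // this copy cannot be optimized away
        ++position;
        return old;
    }
};
int CountingCursor::copies = 0;

// Advance a cursor `steps` times with ++c and report the copies made.
int copies_made_by_pre_increment(int steps) {
    CountingCursor::copies = 0;
    CountingCursor c(0);
    for (int n = 0; n < steps; ++n) ++c;
    return CountingCursor::copies;
}

// Same loop with c++, discarding the returned old value each time.
int copies_made_by_post_increment(int steps) {
    CountingCursor::copies = 0;
    CountingCursor c(0);
    for (int n = 0; n < steps; ++n) c++;
    return CountingCursor::copies;
}
```

    The pre-increment loop makes no copies, while the post-increment loop makes at least one per step (possibly more if the compiler does not elide the copy on return), even though the result is thrown away.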

    Paul Stephenson : +1: appears to be the only answer to respond to the actual question correctly!
    Paul Tomblin : @Paul - to be fair, originally the question was written differently and appeared to be asking what people have answered.
    Paul Stephenson : Fair enough, I must have come in after the first few minutes before it was edited as I didn't see the original. I guess it should be bad practice on SO to change questions substantially after answers have been received.
    mackenir : @Paul (heh, getting confusing) - I didn't see the original message, and must admit I was a bit confused as to why nobody had addressed the lvalue issue.
  • It seems that a lot of people are explaining how ++i is an lvalue, but not the why, as in, why did the C++ standards committee put this feature in, especially in light of the fact that C doesn't allow either as lvalues. From this discussion on comp.std.c++, it appears that it is so you can take its address or assign to a reference. A code sample excerpted from Christian Bau's post:

       int i;
       extern void f (int* p);
       extern void g (int& p);
    
       f (&++i);   /* Would be illegal C, but C programmers
                      havent missed this feature */
       g (++i);    /* C++ programmers would like this to be legal */
       g (i++);    /* Not legal C++, and it would be difficult to
                      give this meaningful semantics */
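
    The excerpt only declares f and g. As a sketch, assuming some hypothetical bodies for them (not from the original post), the two legal calls can be traced like this:

```cpp
// Hypothetical bodies; the quoted post only declares these functions.
void f(int* p) { *p += 10; }
void g(int& r) { r += 100; }

int pass_pre_increment_around() {
    int i = 0;
    f(&++i);    // ++i names i itself, so &++i is the address of i: 1, then 11
    g(++i);     // the reference parameter binds to i: 12, then 112
    // g(i++);  // does not compile: i++ is an rvalue, cannot bind to int&
    return i;
}
```

    Both callees end up modifying i, which is only possible because ++i designates the variable rather than a copy of its value.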
    
    

    By the way, if i happens to be a built-in type, then assignment statements such as ++i = 10 invoke undefined behavior, because i is modified twice between sequence points.

    yesraaj : why did u make this as community post?
    Cirno de Bergerac : I guess the CW checkbox defaulted to checked, and I didn't notice.
    mackenir : CW is the default setting for answers to CW questions. Your question transitioned to CW because you edited it quite a few times. So I think this answer was made late on, when the question had gone to CW. As a result it was by default CW.
    Arkadiy : The last paragraph (about sequence points) is quite curious. Could you provide a link to the source of this idea?
    Martin York : Updating an l-value twice in the same expression is undefined (unspecified) behavior. The compiler is free to aggressively optimise the code between two sequence points. see: http://stackoverflow.com/questions/367633/what-are-all-the-common-undefined-behaviour-that-c-programmer-should-know-about#367690
    yesraaj : http://en.wikipedia.org/wiki/Sequence_point
  • Maybe this has something to do with the way the post-increment is implemented. Perhaps it's something like this:

    • Create a copy of the original value in memory
    • Increment the original variable
    • Return the copy

    Since the copy is neither a variable nor a reference to dynamically allocated memory, it can't be a l-value.
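
    Written out as ordinary functions for an int, those steps might look like the sketch below. The names `post_increment` and `pre_increment` are hypothetical helpers for illustration; the built-in operators are of course not implemented as library functions.

```cpp
// What i++ does for an int, spelled out step by step.
int post_increment(int& x) {
    int copy = x;  // 1. create a copy of the original value
    x = x + 1;     // 2. increment the original variable
    return copy;   // 3. return the copy (by value, so the result is an rvalue)
}

// For contrast, what ++i does: update, then hand back the variable itself.
int& pre_increment(int& x) {
    x = x + 1;
    return x;      // a reference to the caller's variable (an lvalue)
}
```

    The return types carry the whole difference: `int` gives the caller a detached value, `int&` gives the caller the variable.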

  • Well as another answerer pointed out already the reason why ++i is an lvalue is to pass it to a reference.

    int v = 0;
    int const & rcv = ++v; // would work if ++v is an rvalue too
    int & rv = ++v; // would not work if ++v is an rvalue
    

    The reason an rvalue is allowed to bind to a reference-to-const (as in the first line) is to allow initializing a reference from a literal when the reference is a reference to const:

    void taking_refc(int const& v);
    taking_refc(10); // valid, 10 is an rvalue though!
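
    A runnable sketch of these binding rules applied to the increment operators follows. The `BindingResult` struct and the function name are hypothetical, introduced only to return both observations at once.

```cpp
// Hypothetical holder for the two values observed below.
struct BindingResult {
    int v;          // final value of the variable
    int old_value;  // value seen through the reference-to-const
};

BindingResult bind_increment_results() {
    int v = 0;
    int& rv = ++v;         // lvalue: binds to a non-const reference, rv aliases v
    rv = 10;               // writes through to v
    const int& old = v++;  // rvalue: binds only to reference-to-const;
                           // a temporary holding 10 is kept alive for old
    // int& bad = v++;     // does not compile
    BindingResult r = { v, old };
    return r;
}
```

    After the call, `v` is 11 and `old_value` is 10: the reference-to-const saw a snapshot, while the plain reference saw the variable.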
    

    Why do we introduce an rvalue at all you may ask. Well, these terms come up when building the language rules for these two situations:

    • We want to have a locator value. That will represent a location which contains a value that can be read.
    • We want to represent the value of an expression.

    The above two points are taken from the C99 Standard, which includes this helpful footnote:

    [ The name ‘‘lvalue’’ comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object ‘‘locator value’’. What is sometimes called ‘‘rvalue’’ is in this International Standard described as the ‘‘value of an expression’’. ]

    The locator value is called lvalue, while the value resulting from evaluating that location is called rvalue. That's right according also to the C++ Standard (talking about the lvalue-to-rvalue conversion):

    4.1/2: The value contained in the object indicated by the lvalue is the rvalue result.

    Conclusion

    Using the above semantics, it is clear now why i++ is no lvalue but an rvalue. Because the expression returned is not located in i anymore (it's incremented!), it is just the value that can be of interest. Modifying that value returned by i++ would make no sense, because we don't have a location from which we could read that value again. And so the Standard says it is an rvalue, and it thus can only bind to a reference-to-const.

    However, in contrast, the expression returned by ++i is the location (lvalue) of i. Provoking an lvalue-to-rvalue conversion, like in int a = ++i; will read the value out of it. Alternatively, we can make a reference point to it, and read out the value later: int &a = ++i;.

    Note also the other occasions where rvalues are generated. For example, all temporaries are rvalues, as are the results of binary/unary + and - and all return value expressions that are not references. All those expressions are not located in a named object, but rather carry values only. Those values can of course be backed up by objects that are not constant.

    The next C++ Version will include so-called rvalue references that, even though they point to nonconst, can bind to an rvalue. The rationale is to be able to "steal" away resources from those anonymous objects, and avoid copies doing that. Assuming a class-type that has overloaded prefix ++ (returning Object&) and postfix ++ (returning Object), the following would cause a copy first, and for the second case it will steal the resources from the rvalue:

    Object o1(++a); // lvalue => can't steal. It will deep copy.
    Object o2(a++); // rvalue => steal resources (like just swapping pointers)
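
    Rvalue references did ship in C++11. The sketch below, assuming a C++11 compiler, shows only the overload selection that makes the "steal" possible; the `Object` struct and the `how_bound` overload pair are hypothetical stand-ins for a real class with a copy constructor and a move constructor.

```cpp
#include <string>

// Hypothetical Object with the two increment overloads described above.
struct Object {
    int n;
    explicit Object(int start) : n(start) {}
    Object& operator++() { ++n; return *this; }                      // lvalue result
    Object operator++(int) { Object old(*this); ++n; return old; }   // rvalue result
};

// Overload pair standing in for a copy constructor (const Object&)
// and a move constructor (Object&&).
std::string how_bound(const Object&) { return "lvalue: copy"; }
std::string how_bound(Object&&) { return "rvalue: may steal"; }
```

    Calling `how_bound(++a)` selects the `const Object&` overload, and `how_bound(a++)` selects the `Object&&` overload, which is the same choice the compiler makes between a copy constructor and a move constructor.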
    
    yesraaj : y did u make it as community wiki
    Johannes Schaub - litb : :) i've already got my 200p limit for today. doesn't matter whether it's community or not really. so much other questions around there to collect points from hehe.
    yesraaj : anyway i will accept if this give much more clarity to the qn.
    Johannes Schaub - litb : btw, now you understand http://stackoverflow.com/questions/373419/whats-the-difference-between-a-parameter-passed-by-reference-vs-passed-by-value#373429 . pass-by-reference just means that an lvalue is passed instead of an rvalue. And that, as we've seen, requires a reference type parameter.
