Dissecting the C++ STL Vector: Part 3 – Capacity & Size

This is the third part of a series about dissecting the vector container in C++. It depends on the first part and the second part, so make sure that you have read them before continuing. In this part, I will dissect the “capacity” methods of the vector template. That is, size, max_size, resize, capacity, empty, reserve, and shrink_to_fit. It will be relatively simpler than the previous parts. In fact, as you will see, the implementation of most of these methods is obvious. My goal is to help you be more familiar with the vector template before I start discussing the most important methods, namely insert and erase.

The underlying array of the vector might contain a number of unused slots at the end. The number of occupied slots is the actual number of elements and it is called the size of the vector. The size of the array is called the capacity of the vector. Allowing unused slots improves the performance of adding and deleting elements from the vector. I will rigorously discuss this in another article. Now I will concentrate on the code that manages such array.

Recall from part 2, the three fields on which the whole implementation of the vector template is based on are _Myfirst, _Mylast, and _Myend. _Myfirst points to the array that will be used to store the elements of the vector. _Mylast points to the first element in the array that has not been initialized (that is, the slot has not been occupied). _Myend points to past-the-last element of the array. All of these fields are actually not defined by vector, but inherited from its grandparent _Vector_val. Now we can easily implement all of the capacity methods.

The size method returns the number of elements in the vector. By definition of _Myfirst and _Mylast, the code follows. The max_size method returns the maximum number of elements that can be stored by the vector instance. This value is determined by the allocator. allocator<_Ty> considers ((size_t)(-1) / sizeof (_Ty)) to be the maximum size. In this example, _Ty is Foo).  By definition of _Myfirst and _Myend, the implementation of capacity follows. And by definition of _Myfirst and _Mylast, the implementation of empty follows.

size_type size() const throw()
    return (this->_Mylast - this->_Myfirst);

size_type max_size() const thorw()
   // _Getal returns a pointer to an allocator instance.
   return (this->_Getal().max_size());

size_type capacity() const throw()
   return (this->_Myend - this->_Myfirst);

bool empty() const throw()
   return (this->_Myfirst == this->_Mylast);

Now let’s move on to resize, reserve, and shrink_to_fit. resize takes a value of type size_type (which is by default size_t) and ensures that the number of elements is equal to the provided argument. It adds elements if the provided value is greater than the current size (this might result in expansion of the underlying array), otherwise if the value is smaller than the current size it destructs elements from the end (the size of the underlying array does not change). Note that if the provided value is equal to the current size then resize does nothing. Here is the implementation:

void resize(size_type _Newsize)
    if (_Newsize < size())
        _Pop_back_n(size() - _Newsize);
    else if (size() < _Newsize)
        _Alty _Alval(this->_Getal());
        _Reserve(_Newsize - size());
        _Uninitialized_default_fill_n(this->_Mylast, _Newsize - size(), _Alval);
        this->_Mylast += _Newsize - size();

This method looks very much similar to the constructor we have seen in part 2. If we are asked to shrink, we call the _Pop_back_n method. If we are asked to expand, we first call _Reserve which is supposed to allocate a new array of the required size and copy all existing elements to it and then by calling _Uninitialized_default_fill_n, the required new elements are constructed. At the end, we update the _Mylast pointer. _Reserve will update the _Myend pointer for us. We have already discussed _Uninitialized_default_fill_n in part 2, therefore we are left with _Pop_back_n and _Reserve. Here is the implementation of _Pop_back_n:

void _Pop_back_n(size_type _Count)
    pointer _Ptr = this->_Mylast - _Count;
    _Destroy(_Ptr, this->_Mylast);
    this->_Mylast = _Ptr;

_Ptr points to the first element that we want to get rid of. So we want to call the destructor of each element from _Ptr to, but not including, _Mylast. This sounds a job for _Destroy, which we have already seen in part 2. Note that the memory is not released, we just destruct the elements and so we don’t need to update _Myend. In Debug mode, _Orphan_range (not shown in the code) is called before calling _Destroy. This method tells each iterator that points to one of the elements that has been destructed that it has become invalid. It is similar to _Orphan_all discussed in footnote 5 of part 2.

_Reserve takes a value of type size_type and ensures that the number of empty slots in the underlying array is at least that much. It is very important now to think about by how much should we expand the array? First, let’s examine the code:

void _Reserve(size_type _Count)
    if (_Unused_capacity() < _Count)
        if (max_size() - size() < _Count)
        _Reallocate(_Grow_to(size() + _Count));

_Unused_capacity is obviously the difference between _Myend and _Mylast and only if it is smaller than the _Count, we attempt is reallocate the underlying array. We have to check first the new size will not exceed the maximum size (throwing a std::length_error exception otherwise) and then the array is actually reallocated by _Reallocate passing the value returned from _Grow_to. The argument passed to _Grow_to is the new required capacity. This method computes the new capacity of the vector:

size_type _Grow_to(size_type _Count) const
    // grow by 50% or at least to _Count
    size_type _Capacity = capacity();
    _Capacity = max_size() - _Capacity / 2 < _Capacity
        ? 0 : _Capacity + _Capacity / 2;  // try to grow by 50%
    if (_Capacity < _Count)
        _Capacity = _Count;
    return (_Capacity);

So we try to increase the capacity by 50%, if this increment causes the new capacity to exceeds the maximum size or if this increment is just not enough to satisfy the required capacity, then we consider the new capacity to be _Count, the same as the required capacity. Everything makes sense except this 50% increment. Why 50%? Why not 49% or 51%? Let’s pretend that this is a good idea and leave the justification for another article.

At this point, _Reallocate will be called passing to it the size of the new array. It is in fact much simpler than what it seems:

void _Reallocate(size_type _Count)
    // First, allocate the new array.
    pointer _Ptr = this->_Getal().allocate(_Count);

    // Second, move the existing elements to the new array.
    _Umove(this->_Myfirst, this->_Mylast, _Ptr);
    this->_Getal().deallocate(_Ptr, _Count);

    // Third, destroy and deallocate the old array.
    size_type _Size = size();
    if (this->_Myfirst != pointer())
        _Destroy(this->_Myfirst, this->_Mylast);
        this->_Getal().deallocate(this->_Myfirst, this->_Myend - this->_Myfirst);

    // Finally, update the pointers.
    this->_Myend = _Ptr + _Count;
    this->_Mylast = _Ptr + _Size;
    this->_Myfirst = _Ptr;

We have already seen the allocate, deallocate, _Destroy, and _Orphan_all methods in part 2. _Umove is the new method and it is actually interesting:

template<class _Iter>
pointer _Umove(_Iter _First, _Iter _Last, pointer _Ptr)
    // move initializing [_First, _Last), using allocator
    _Alty _Alval(this->_Getal());
    return (_Uninitialized_move(_First, _Last, _Ptr, _Alval));

The type _Iter will be pointer, the parameter _First has value _Myfirst, the parameter _Last has value _Mylast, and the parameter _Ptr points to the new array. The most important part of _Uninitialized_move is the following:

for (; _First != _Last; ++_Dest, ++_First)
    _Al.construct(_Dest, (_Valty&&)*_First);

So for each existing element, move it to the corresponding slot in the new array if a move constructor is available and copy it otherwise.

What are the possible exceptions that might be thrown from resize? Many! Let’s start with _Pop_back_n. This method calls the destructor on each element in the specified range. If any of these destructors throws an exception, the vector will be left in an invalid state. _Reserve may throw the std::length_error or std::bad_alloc. Also it either calls the copy or the move constructor on every existing element. In the first case, after copying all the elements the destructor will be called on each element in the old array and if an exception is thrown, the vector is left in an invalid state. In the second, calling the move constructor on the element actually destroys it and therefore if one of the executions of the move constructor has thrown an exception, the previously moved elements are not restored back to the old array and the vector is left in an invalid state. Finally, _Uninitialized_default_fill_n calls the default constructor on every new element which might throw an exception. If this happens, fortunately for us, _Tidy will put the vector in a valid state as discussed in part 2.

The reserve method ensures that the underlying array is of the specified size. It simply calls _Reallocate if expansion is required. Note that any new slots are left uninitialized. The shrink_to_fit method makes the size and the capacity of the vector equal. It does that by simply calling _Reallocate passing to it the returned value from size.

Well, that was easy. Stay tuned for the next part.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s