Dissecting the C++ STL Vector: Part 2 – Constructors


This is the second part of a series about dissecting the vector container in C++. It depends on the first part, so make sure that you have read it before continuing. In this part, I will dissect the constructors of the vector container. You will learn about all the fields of a vector instance, what do they represent, and how they are being initialized. You will be able to compute the size of any vector instance. Also you will see how allocators are involved into construction and the exceptions that might occur along the way.  Here is a list of all instance constructors:

// These constructors do not take an allocator object.
vector();
explicit vector(size_type _Count);
vector(size_type _Count, const value_type& _Val);
vector(const _Myt& _Right);
template<class _Iter, class = typename enable_if<_Is_iterator<_Iter>::value, void>::type>
vector(_Iter _First, _Iter _Last);
vector(_Myt& _Right);

// These constructors do take an allocator object.
explicit vector(const _Alloc& _Al);
vector(size_type _Count, const value_type& _Val, const _Alloc& _Al);
vector(const _Myt& _Right, const _Alloc& _Al);
template<class _Iter, class = typename enable_if<_Is_iterator<_Iter>::value, void>::type>
vector(_Iter _First, _Iter _Last, const _Alloc& _Al);
vector(_Myt&& _Right, const _Alloc& _Al);

// This constructor takes an initializer list (C++11 feature), and an allocator object.
vector(_XSTD initializer_list<value_type> _Ilist, const _Alloc& _Al = allocator_type());

I have divided them into three groups: The first group includes the constructors that do not take an allocator object, the second group includes similar constructors but take an allocator object (_Al), and the third group has just one constructor that takes an initializer list and an allocator object. All constructors of the first group call the default base class constructor. All constructors of the second and third groups call the base class constructor that takes an allocator object. These calls are in the initialization lists. This is the only difference in the code between the constructors in the first group and the corresponding ones in the second group. I will only dissect the first two versions in the first group and leave the rest together with the constructor in the third group as an exercise for you.

The default constructor does nothing but calling the base class constructor (it will not allocate memory). So let’s skip this one and start with the second constructor in our list – the one that has a parameter called _Count of type size_type, which is defined by the allocator. The following code constructs a vector using this constructor:

class Foo
{
public:
    Foo(){ std::cout<<"Constructing a Foo instance...\n"; }
private:
    int value;
};

std::vector<Foo> vec(2);

Since we are using the default allocator allocator<Foo>, size_type will be size_t (you can determine this by taking a look at the allocator definition in xmemory0 source code file), which is unsigned int. The first thing that will happen is executing the initialization list of the constructor which does nothing but calling the default constructor of the base type _Vector_alloc<false, _Vec_base_types<Foo,allocator<Foo>>> which in turn constructs its default argument (calling the default constructor of allocator<Foo>, which does nothing) and calls the default constructor of its base class _Vector_val<typename _Alloc_types::_Val_types> which in turn calls the default constructor of its base class _Container_base0 (defined in xutility source code file) which, fortunately, does not have a base type. In fact, _Container_base0 does nothing at all. However, _Container_base0 is not always the base type. To be more concrete, when compiling in Release mode _Container_base0 is the base type, but when compiling in Debug mode, _Container_base12 will be the base type and everything happens at this level is to support debugging and I will just ignore it (1). So let’s go back to the constructor of _Vector_val.

What is _Alloc_types::_Val_types? Well, does it matter? Yes it does. Take a look at the definition of _Vector_val (in the vector source code file):

template<class _Val_types>
class _Vector_val : public _Container_base
{
public:
    typedef _Vector_val<_Val_types> _Myt;

    typedef typename _Val_types::value_type value_type;
    typedef typename _Val_types::size_type size_type;
    typedef typename _Val_types::difference_type difference_type;
    typedef typename _Val_types::pointer pointer;
    typedef typename _Val_types::const_pointer const_pointer;
    typedef typename _Val_types::reference reference;
    typedef typename _Val_types::const_reference const_reference;

    typedef _Vector_iterator<_Myt> iterator;
    typedef _Vector_const_iterator<_Myt> const_iterator;

    _Vector_val()
    {
        _Myfirst = pointer();
        _Mylast = pointer();
        _Myend = pointer();
    }

    pointer _Myfirst;    // pointer to beginning of array
    pointer _Mylast;     // pointer to current end of sequence
    pointer _Myend;      // pointer to end of array
};

The constructor is initializing three fields of type pointer, which is defined to be _Val_types::pointer. So in order to know what exactly is happening in here, we have to know what is _Val_types. The type parameter here is specified by the derived type:

template<class _Alloc_types>
class _Vector_alloc<false, _Alloc_types> : public _Vector_val<typename _Alloc_types::_Val_types>;

Recall that this is the base type of vector, and in our example it will be _Vector_alloc<false, _Vec_base_types<Foo,allocator<Foo>>>. This means that:

_Val_types ==  _Alloc_types::_Val_types == _Vec_base_types<Foo,allocator<Foo>>::_Val_types

That’s just fantastic. Now we have to look at another type: _Vec_base_types, which is defined in xtr1common. By being patient and finding the definition of _Val_types within _Vec_base_types, it turns out to be simpler than what one might expect:

typedef typename _If<_Is_simple_alloc<_Alty>::value,
    _Simple_types<typename _Alty::value_type>,
    _Vec_iter_types<typename _Alty::value_type,
        typename _Alty::size_type,
        typename _Alty::difference_type,
        typename _Alty::pointer,
        typename _Alty::const_pointer,
        typename _Alty::reference,
        typename _Alty::const_reference>>::type
    _Val_types;

It’s getting simpler and simpler! The expression is of the form _If<part1, part2, part3>::type. _If is a template with a specialization defined in xstddef in the std namespace:

template<bool, class _Ty1, class _Ty2>
struct _If { typedef _Ty2 type; };

template<class _Ty1, class _Ty2>
struct _If<true, _Ty1, _Ty2> { typedef _Ty1 type; };

The first definition is used when the first template parameter is false and the second is used when the parameter is true. It is more like an if statement but on types. So it all depends on whether _Is_simple_alloc<_Alty>::value is true or false. _Is_simple_alloc is defined in xmemory0 and it is derived from _Cat_base which we have already seen in part 1 of this series. _Is_simple_alloc<_Alty>::value is true when the following conditions are satisfied:

_Alty::size_type == size_t
_Alty::difference_type == ptrdiff_t
_Alty::pointer == _Alty::value_type*
_Alty::const_pointer == const _Alty::value_type*
_Alty::reference == _Alty::value_type&
_Alty::const_reference == const _Alty::value_type&

In our example, _Alty is allocator<Foo> (2) and value_type is Foo. This condition is testing whether pointer and pointer arithmetic types are standard, which they are in this case. Therefore _Is_simple_alloc<_Alty>::value is true and _Val_types is _Simple_types<Foo>, which is just a wrapper around the these standard typedefs (3). From all of this we get:

_Vector_val()
{
    _Myfirst = pointer();
    _Mylast = pointer();
    _Myend = pointer();
}

All of these fields are just pointers to Foo and they are initialized to NULL. You will see very soon what these fields mean. After this, we go down to the constructor of_Vector_alloc which does nothing in Release mode and initializes the field of _Container_base12 in Debug mode. At this point, we have finished constructing fields of all base types of vector. Now its actual construction starts:

explicit vector(size_type _Count) : _Mybase()
{
    if (_Buy(_Count))
    {
        _Alty _Alval(this->_Getal());
        _TRY_BEGIN
        _Uninitialized_default_fill_n(this->_Myfirst, _Count, _Alval);
        this->_Mylast += _Count;
        _CATCH_ALL
        _Tidy();
        _RERAISE;
        _CATCH_END
    }
}

The _Buy function returns false when _Count is zero, throws a std::length_error exception when _Count exceeds the maximum size of the vector (which is determined by the allocator. allocator<_Ty> considers ((size_t)(-1) / sizeof (_Ty)) to be the maximum size. In this example, _Ty is Foo), and otherwise it will execute the following:

this->_Myfirst = this->_Getal().allocate(_Capacity);
this->_Mylast = this->_Myfirst;
this->_Myend = this->_Myfirst + _Capacity;
return true;

It’s time to meet _Myfirst, _Mylast, and _Myend (4). _Myfirst points to the array that will be used to store the elements of the vector. _Mylast points to the first element in the array that has not been initialized. _Myend points to past-the-last element of the array.

With no doubt, we need to allocate an array to hold the two elements. This looks like a job for the allocator. _Getal, defined by _Vector_alloc, returns an allocator instance that we can use to allocate memory. The allocate method will execute the following:

_Ty *_Ptr = ::operator new(_Count * sizeof (_Ty)));
return _Ptr;

In this example, we have _Ty is Foo and _Count is 2. Hopefully 8 bytes will be allocated (otherwise, a std::bad_alloc exception is thrown). Note that this version of new does not initialize or call the default constructor. By the end of this process, _Myfirst  as well as _Mylast will be holding the address of our array and _Myend will be pointing at the past-the-last element. At this point, _Buy will return true. Great!

It has become clear now what will happen next – we will initialize the memory that we have allocated inside a try block by calling _Uninitialized_default_fill_n which is defined in xmemory. If this fails, _Tidy is called and the exception is rethrown. Let’s see what each of these methods do.

_Uninitialized_default_fill_n asks the allocator to construct each new element of the array (that is, starting at _Mylast). The default allocator does the following:

::operator new((void *)_Ptr) _Ty();

This version of new does not allocate any memory – it just calls the default constructor on the instance allocated at _Ptr. So by doing this for all elements, we would have all of them properly constructed. If any of the executions of the default constructor throw any exception, all of the elements that has been successfully constructed are destructed by calling the destructor and then the exception is rethrown. If no exception has been thrown, _Uninitialized_default_fill_n would return successfully and _Mylast is updated. Otherwise, _Tidy is called:

void _Tidy()
{
    if (this->_Myfirst != pointer())
    {
        this->_Orphan_all();
        _Destroy(this->_Myfirst, this->_Mylast);
        this->_Getal().deallocate(this->_Myfirst, this->_Myend - this->_Myfirst);
        this->_Myfirst = pointer();
        this->_Mylast = pointer();
        this->_Myend = pointer();
    }
}

Let’s just ignore _Orphan_all (5). All we are doing here is calling the destructor on all of the elements of the array that has been successfully constructed (from _Myfirst up to but not including _Mylast) and then deallocate the whole array. Remember that destruction and deallocation are done by the allocator. As a good practice, we set all pointers to NULL. In other words, _Tidy precludes any further use of the vector instance. That’s it.

  1. Some details for the curious. _Container_base12 has one instance field that is a pointer to a _Container_proxy object. _Container_proxy has two instance fields: a pointer to _Container_base12 and a pointer to _Iterator_base12. _Iterator_base12 is a linked list of iterators. In other words, _Container_proxy associates a list of iterators with the container into which they are pointing. Whenever the container is destroyed, the destructor of _Container_base12 is called which invalidates this association. An iterator cannot be used if its associated container is NULL (doesn’t exist), until it gets adopted by another container.
  2. Not precisely. _Alty is _Wrap_alloc<allocator> which is derived from allocator<Foo>, so we are cool.
  3. I am not able to see why _If is required. It does not improve the performance of the compiler. It seems to me that if we always took the false branch, we will get the same results.
  4. By the way, the vector template itself does not define any fields, it inherits these three fields from _Vector_val, _Alval from _Vector_alloc when the allocator is not “empty”, and _Myproxy from _Container_base12 when compiled in Debug mode. These are all possible instance fields that any vector instance can have. All of them are pointers except _Alval which is of type _Wrap_alloc<_Alloc> and derived from _Alloc. If the size of pointer is P and the size of the allocator is A then the size of a vector instance is 4*P + A in Debug mode, and 3*P + A in Release mode. In our example, the allocator is “empty” and so A = 0, assuming 64-bit platform in Debug mode, P would be 8 bytes and the total size would be 32 bytes. There are no static fields.
  5. Some details for the curious. _Orphan_all depends on whether the base type is _Container_base12 or _Container_base0. In the first case, the linked list of _Iterator_base12 is traversed and for each of its iterators, a pointer to _Myproxy is set to NULL. In other words, we are telling each of these iterators that the container into which they point has been destroyed. This helps in debugging.

1 thought on “Dissecting the C++ STL Vector: Part 2 – Constructors

  1. Pingback: Dissecting the C++ STL Vector: Part 3 – Capacity | Mysterious Mysteries

Leave a comment