Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.Please state the document version you're referring to, as found in the title (in this document: 8.1.0~pre2) and please state chapter and paragraph name or number you're referring to.
All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.
The
Standard Template Library (
STL) is a
general purpose library
consisting of
containers,
generic algorithms,
iterators,
function objects,
allocators,
adaptors and
data structures. The data structures used
in the algorithms are abstract in the sense that the algorithms can be
used on (practically) every data type.
The algorithms can work on these abstract data types due to the fact that they are template based algorithms. In this chapter the construction of templates is not discussed (see chapter 20 for that). Rather, this chapter focuses on the use of these template algorithms.
Several parts of the standard template library have already been discussed in the C++ Annotations. In chapter 12 the abstract containers were discussed, and in section 10.10 function objects were introduced. Also, iterators were mentioned at several places in this document.
The remaining components of the STL will be covered in this and the next chapter. Iterators, adaptors, smart pointers, multi threading and other features of the STL will be discussed in the coming sections. Generic algorithms are covered in the next chapter (19).
Allocators take care of the memory allocation within the STL. The default allocator class suffices for most applications, and is not further discussed in the C++ Annotations.
All elements of the STL are defined in the standard namespace. Therefore, a
using namespace std or comparable directive is required unless it is
preferred to specify the required namespace explicitly. This occurs in at
least one situation: in header files no using directive should be used,
so here the std:: scope specification should always be specified when
referring to elements of the STL.
sort()
expecting two iterators defining the range of objects that should be sorted,
as well as a function object calling the appropriate comparison operator for
two objects. Let's take a quick look at this situation. Assume strings are
stored in a vector, and we want to sort the vector in descending order. In
that case, sorting the vector stringVec is as simple as:
sort(stringVec.begin(), stringVec.end(), greater<std::string>());
The last argument is recognized as a
constructor: it is an
instantiation of the
greater<>() class template, applied to
strings. This object is called as a
function object by the sort()
generic algorithm. It will call the
operator>() of the provided data type
(here
std::string) whenever its
operator()() is called. Eventually,
when sort() returns, the first element of the vector will be the greatest
element.
The operator()() (
function call operator) itself is not visible
at this point: don't confuse the parentheses in greater<string>() with
calling operator()(). When that operator is actually used inside
sort(), it receives two arguments: two strings to compare for
`greaterness'. Internally, the
operator>() of the data type to which the
iterators point (i.e., string) is called by greater<string>'s function
operator (operator()()) to compare the two objects. Since greater<>'s
function call operator is defined
inline, the call itself is not actually
present in the code. Rather, sort() calls string::operator>(),
thinking it called greater<>::operator()().
Now that we know that a constructor is passed as argument to (many) generic
algorithms, we can design our own function objects. Assume we want to sort our
vector case-insensitively. How do we proceed? First we note that the default
string::operator<() (for an incremental sort) is not appropriate, as it
does
case sensitive comparisons. So, we provide our own case_less
class, in which the two strings are compared case insensitively. Using the
standard C function
strcasecmp(), the following program performs the
trick. It sorts its command-line arguments in ascending alphabetic order:
#include <iostream>
#include <string>
#include <algorithm>
using namespace std;
class case_less
{
public:
bool operator()(string const &left, string const &right) const
{
return strcasecmp(left.c_str(), right.c_str()) < 0;
}
};
int main(int argc, char **argv)
{
sort(argv, argv + argc, case_less());
for (int idx = 0; idx < argc; ++idx)
cout << argv[idx] << " ";
cout << endl;
}
The
default constructor of the class case_less is used with
sort()'s final argument. Therefore, the only member function that must be
defined with the class case_less is the function object operator
operator()(). Since we know it's called with string arguments, we
define it to expect two string arguments, which are used in the
strcasecmp() function. Furthermore, the operator()() function is made
inline, so that it does not produce overhead when called by the sort()
function. The sort() function calls the function object with various
combinations of strings, i.e., it thinks it does so. However, in fact
it calls strcasecmp(), due to the inline-nature of
case_less::operator()().
The comparison function object is often a predefined function object, since these are available for many commonly used operations. In the following sections the available predefined function objects are presented, together with some examples showing their use. At the end of the section about function objects function adaptors are introduced. Before predefined function objects can be used the following preprocessor directive must have been specified:
#include <functional>
Predefined function objects are used predominantly with generic
algorithms. Predefined function objects exists for arithmetic, relational, and
logical operations. In section 23.5 predefined function objects are
developed performing
bitwise operations.
plus<Type>
is available. If we set type to size_t
then the + operator for size_t values is used, if we set type to
string, then the + operator for strings is used. For example:
#include <iostream>
#include <string>
#include <functional>
using namespace std;
int main(int argc, char **argv)
{
plus<size_t> uAdd; // function object to add size_ts
cout << "3 + 5 = " << uAdd(3, 5) << endl;
plus<string> sAdd; // function object to add strings
cout << "argv[0] + argv[1] = " << sAdd(argv[0], argv[1]) << endl;
}
/*
Generated output with call: a.out going
3 + 5 = 8
argv[0] + argv[1] = a.outgoing
*/
Why is this useful? Note that the function object can be used with all
kinds of data types (not only with the predefined datatypes), in which the
particular operator has been overloaded. Assume that we want to perform an
operation on a common variable on the one hand and, on the other hand, in turn
on each element of an array. E.g., we want to compute the sum of the elements
of an array; or we want to concatenate all the strings in a text-array. In
situations like these the function objects come in handy. As noted before, the
function objects are heavily used in the context of the generic algorithms, so
let's take a quick look ahead at one of them.
One of the generic algorithms is called
accumulate(). It visits all
elements implied by an iterator-range, and performs a requested binary
operation on a common element and each of the elements in the range, returning
the accumulated result after visiting all elements.
For example, the following program accumulates all command line arguments,
and prints the final string:
#include <iostream>
#include <string>
#include <functional>
#include <numeric>
using namespace std;
int main(int argc, char **argv)
{
string result =
accumulate(argv, argv + argc, string(), plus<string>());
cout << "All concatenated arguments: " << result << endl;
}
The first two arguments define the (iterator) range of elements to visit,
the third argument is string(). This anonymous string object provides an
initial value. It could as well have been initialized to
string("All concatenated arguments: ")
in which case the cout statement could have been a simple
cout << result << endl;
Then, the operator to apply is plus<string>(). Note here that a
constructor is called: it is not plus<string>, but rather
plus<string>(). The final concatenated string is returned.
Now we define our own class Time, in which the
operator+() has been overloaded. Again, we can apply the predefined
function object plus, now tailored to our newly defined datatype, to add
times:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <functional>
#include <numeric>
using namespace std;
class Time
{
friend ostream &operator<<(ostream &str, Time const &time)
{
return cout << time.d_days << " days, " << time.d_hours <<
" hours, " <<
time.d_minutes << " minutes and " <<
time.d_seconds << " seconds.";
}
size_t d_days;
size_t d_hours;
size_t d_minutes;
size_t d_seconds;
public:
Time(size_t hours, size_t minutes, size_t seconds)
:
d_days(0),
d_hours(hours),
d_minutes(minutes),
d_seconds(seconds)
{}
Time &operator+=(Time const &rValue)
{
d_seconds += rValue.d_seconds;
d_minutes += rValue.d_minutes + d_seconds / 60;
d_hours += rValue.d_hours + d_minutes / 60;
d_days += rValue.d_days + d_hours / 24;
d_seconds %= 60;
d_minutes %= 60;
d_hours %= 24;
return *this;
}
};
Time const operator+(Time const &lValue, Time const &rValue)
{
return Time(lValue) += rValue;
}
int main(int argc, char **argv)
{
vector<Time> tvector;
tvector.push_back(Time( 1, 10, 20));
tvector.push_back(Time(10, 30, 40));
tvector.push_back(Time(20, 50, 0));
tvector.push_back(Time(30, 20, 30));
cout <<
accumulate
(
tvector.begin(), tvector.end(), Time(0, 0, 0), plus<Time>()
) <<
endl;
}
/*
produced output:
2 days, 14 hours, 51 minutes and 30 seconds.
*/
Note that all member functions of Time in the above source are inline
functions. This approach was followed in order to keep the example relatively
small and to show explicitly that the operator+=() function may be an
inline function. On the other hand, in real life Time's operator+=()
should probably not be made inline, due to its size.
Considering the previous discussion of the plus function object, the
example is pretty straightforward. The class Time defines a constructor,
it defines an insertion operator and it defines its own operator+(),
adding two time objects.
In main() four Time objects are stored in a
vector<Time> object. Then, the accumulate() generic algorithm is
called to compute the accumulated time. It returns a Time object, which
is inserted in the cout ostream object.
While the first example did show the use of a named function object,
the last two examples showed the use of
anonymous objects which were
passed to the (accumulate()) function.
The following arithmetic objects are available as predefined objects:
plus<>(): as shown, this object's operator()() member calls
operator+() as a
binary operator, passing it its two parameters,
returning operator+()'s return value.
minus<>(): this object's operator()() member calls
operator-() as a binary operator, passing it its two parameters and
returning operator-()'s return value.
multiplies<>(): this object's operator()() member calls
operator*() as a binary operator, passing it its two parameters and
returning operator*()'s return value.
divides<>(): this object's operator()() member calls
operator/(), passing it its two parameters and
returning operator/()'s return value.
modulus<>(): this object's operator()() member calls
operator%(), passing it its two parameters and
returning operator%()'s return value.
negate<>(): this object's operator()() member calls
operator-() as a unary operator, passing it its parameter and
returning the unary operator-()'s return value.
operator-() follows, in
which the
transform() generic algorithm is used to toggle the signs of all
elements in an array. The transform() generic algorithm expects two
iterators, defining the range of objects to be transformed, an iterator
defining the begin of the destination range (which may be the same iterator as
the first argument) and a function object defining a unary operation for the
indicated data type.
#include <iostream>
#include <string>
#include <functional>
#include <algorithm>
using namespace std;
int main(int argc, char **argv)
{
int iArr[] = { 1, -2, 3, -4, 5, -6 };
transform(iArr, iArr + 6, iArr, negate<int>());
for (int idx = 0; idx < 6; ++idx)
cout << iArr[idx] << ", ";
cout << endl;
}
/*
Generated output:
-1, 2, -3, 4, -5, 6,
*/
==, !=, >, >=, < and <=. The
following objects are available:
equal_to<>(): this object's operator()() member calls
operator==() as a binary operator, passing it its two parameters and
returning operator==()'s return value.
not_equal_to<>(): this object's operator()() member calls
operator!=() as a binary operator, passing it its two parameters and
returning operator!=()'s return value.
greater<>(): this object's operator()() member calls
operator>() as a binary operator, passing it its two parameters and
returning operator>()'s return value.
greater_equal<>(): this object's operator()() member calls
operator>=() as a binary operator, passing it its two parameters and
returning operator>=()'s return value.
less<>(): this object's operator()() member calls
operator<() as a binary operator, passing it its two parameters and
returning operator<()'s return value.
less_equal<>(): this object's operator()() member calls
operator<=() as a binary operator, passing it its two parameters and
returning operator<=()'s return value.
sort() is:
#include <iostream>
#include <string>
#include <functional>
#include <algorithm>
using namespace std;
int main(int argc, char **argv)
{
sort(argv, argv + argc, greater_equal<string>());
for (int idx = 0; idx < argc; ++idx)
cout << argv[idx] << " ";
cout << endl;
sort(argv, argv + argc, less<string>());
for (int idx = 0; idx < argc; ++idx)
cout << argv[idx] << " ";
cout << endl;
}
The sort() generic algorithm expects an iterator range and a
comparator of the data type to which the iterators point. The example shows
the
alphabetic sorting of strings and the
reversed sorting of
strings. By passing greater_equal<string>() the strings are sorted in
decreasing order (the first word will be the 'greatest'), by passing
less<string>() the strings are sorted in increasing order (the first
word will be the 'smallest').
Note that the type of the elements of argv is char *, and that the
relational function object expects a string. The relational object
greater_equal<string>() will therefore use the >= operator of strings,
but will be called with char * variables. The promotion from char const
* to string is performed silently.
and, or, and not. The following objects are
available:
logical_and<>(): this object's operator()() member calls
operator&&() as a binary operator, passing it its two parameters and
returning operator&&()'s return value.
logical_or<>(): this object's operator()() member calls
operator||() as a binary operator, passing it its two parameters and
returning operator||()'s return value.
logical_not<>(): this object's operator()() member calls
operator!() as a unary operator, passing it its parameter and
returning the unary operator!()'s return value.
operator!() is provided in the following trivial
program, in which the
transform() generic algorithm is used to transform
the logicalvalues stored in an array:
#include <iostream>
#include <string>
#include <functional>
#include <algorithm>
using namespace std;
int main(int argc, char **argv)
{
bool bArr[] = {true, true, true, false, false, false};
size_t const bArrSize = sizeof(bArr) / sizeof(bool);
for (size_t idx = 0; idx < bArrSize; ++idx)
cout << bArr[idx] << " ";
cout << endl;
transform(bArr, bArr + bArrSize, bArr, logical_not<bool>());
for (size_t idx = 0; idx < bArrSize; ++idx)
cout << bArr[idx] << " ";
cout << endl;
}
/*
generated output:
1 1 1 0 0 0
0 0 0 1 1 1
*/
minus<int>() function object, which is a
binary function object, the
first argument may be bound to 100, meaning that the resulting value will
always be 100 minus the value of the second argument. Either the first or
the second argument may be bound to a specific value. To bind the first
argument to a specific value, the function object
bind1st() is used. To
bind the second argument of a binary function to a specific value
bind2nd() is used. As an example, assume we want to count all elements of
a vector of Person objects that exceed (according to some criterion) some
reference Person object. For this situation we pass the following binder
and
relational function object to the
count_if() generic algorithm:
bind2nd(greater<Person>(), referencePerson)
What would such a binder do? First of all, it's a function object, so it
needs operator()(). Next, it expects two arguments: a reference to another
function object and a fixed operand. Although binders are defined as
templates, it is illustrative to have a look at their implementations,
assuming they were straight functions. Here is such a pseudo-implementation of
a binder:
class bind2nd
{
FunctionObject const &d_object;
Operand const &d_operand;
public:
bind2nd(FunctionObject const &object, Operand const &operand);
ReturnType operator()(Operand const &lvalue);
};
inline bind2nd::bind2nd(FunctionObject const &object,
Operand const &operand)
:
d_object(object),
d_operand(operand)
{}
inline ReturnType bind2nd::operator()(Operand const &lvalue)
{
return d_object(lvalue, d_rvalue);
}
When its operator()() member is called the binder merely passes the
call to the object's operator()(), providing it with two arguments: the
lvalue it itself received and the fixed operand it received via its
constructor. Note the simplicity of these kind of classes: all its members can
usually be implemented inline.
The count_if() generic algorithm visits all the elements in an
iterator range, returning the number of times the
predicate specified as
its final argument returns
true. Each of the elements of the iterator
range is given to the predicate, which is therefore a
unary function. By
using the binder the binary function object greater() is adapted to a
unary function object, comparing each of the elements in the range to the
reference person. Here is, to be complete, the call of the count_if()
function:
count_if(pVector.begin(), pVector.end(),
bind2nd(greater<Person>(), referencePerson))
not1() is
the negator used with
unary function objects,
not2() is the
negator used with
binary function objects.
vector<Person> vector
not exceeding a certain reference person, we may, among other approaches,
use either of the following alternatives:
count_if(pVector.begin(), pVector.end(),
bind2nd(less_equal<Person>(), referencePerson))
not2 combined with the greater() predicate:
count_if(pVector.begin(), pVector.end(),
bind2nd(not2(greater<Person>()), referencePerson))
Note that not2() is a negator negating the truth value of a binary
operator()() member: it must be used to wrap the binary predicate
greater<Person>(), negating its truth value.
not1() combined with the bind2nd() predicate:
count_if(pVector.begin(), pVector.end(),
not1(bind2nd(greater<Person>(), referencePerson)))
Note that not1() is a negator negating the truth value of a unary
operator()() member: it is used to wrap the unary predicate bind2nd(),
negating its truth value.
The following little example illustrates the use of negator function adaptors, completing the section on function objects:
#include <iostream>
#include <functional>
#include <algorithm>
#include <vector>
using namespace std;
int main(int argc, char **argv)
{
int iArr[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
cout << count_if(iArr, iArr + 10, bind2nd(less_equal<int>(), 6)) <<
endl;
cout << count_if(iArr, iArr + 10, bind2nd(not2(greater<int>()), 6)) <<
endl;
cout << count_if(iArr, iArr + 10, not1(bind2nd(greater<int>(), 6))) <<
endl;
}
/*
produced output:
6
6
6
*/
count_if():
operator()() is called;
<= is performed for int values.
not2 negator is used to
negate the truth value of the complementary logicalfunction adaptor, three
actions must be performed for each iteration by count_if():
operator()() is called;
operator()() is called;
> is performed for int values.
not1 negator is used to
negate the truth value of the binder, three
actions must be performed for each iteration by count_if():
operator()() is called;
operator()() is called;
> is performed for int values.
g++ compiler on an old, 166 MHz pentium, performing 3,000,000
count_if() calls for each variant, shows the first approach requiring
about 70% of the time needed by the other two approaches to complete.
However, these differences disappear if the compiler is instructed to
optimize for speed (using the
-O6
compiler
flag). When interpreting these results one should keep in mind that multiple
nested function calls are merged into a single function call if the
implementations of these functions are given inline and if the compiler
follows the suggestion to implement these functions as true inline functions
indeed. If this is happening, the three approaches all merge to a single
operation: the comparison between two int values. It is likely that the
compiler does so when asked to optimize for speed.
== and
!= operators. Note that the ordering operators (e.g., >, <)
normally cannot be used.
iter, *iter represents the object the
iterator points to (alternatively, iter-> can be used to reach the members
of the object the iterator points to).
++iter or iter++ advances the iterator to the next
element. The notion of advancing an iterator to the next element is
consequently applied: several containers have a
reversed_iterator type, in
which the iter++ operation actually reaches a
previous element in a
sequence.
vector and
deque. For these containers iter + 2 points to the second element
beyond the one to which iter points.
#include <vector>
#include <iostream>
using namespace std;
int main()
{
vector<int>::iterator vi;
cout << &*vi << endl; // prints 0
}
iterator) using member functions
begin() and
end() and, in the
case of reversed iterators (type reverse_iterator),
rbegin() and
rend(). Standard practice requires the iterator range to be left
inclusive: the notation
[left, right) indicates that left is an
iterator pointing to the first element that is to be considered, while
right is an iterator pointing just beyond the last element to be
used. The iterator-range is said to be
empty when left == right.
Note that with
empty containers
the begin- and
end-iterators are equal to each other.
The following example shows a situation where all elements of a vector of
strings are written to cout using the iterator range [begin(),
end()), and the iterator range [rbegin(), rend()). The following
example shows that the for-loops for both ranges are
identical. Furthermore it nicely illustrates how the
auto keyword can be
used to define the type of the loop control variable instead of using a much
more verbose explicit variable definition like vector<string>::iterator
(see also section 3.3.5):
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int main(int argc, char **argv)
{
vector<string> args(argv, argv + argc);
for (auto iter = args.begin(); iter != args.end(); ++iter)
cout << *iter << " ";
cout << endl;
for (auto iter = args.rbegin(); iter != args.rend(); ++iter)
cout << *iter << " ";
cout << endl;
return 0;
}
Furthermore, the STL defines const_iterator types to be able to visit
a series of elements in a constant container. Whereas the elements of the
vector in the previous example could have been altered, the elements of the
vector in the next example are immutable, and const_iterators are
required:
#include <iostream>
#include <vector>
#include <string>
using namespace std;
int main(int argc, char **argv)
{
vector<string> const args(argv, argv + argc);
for
(
vector<string>::const_iterator iter = args.begin();
iter != args.end();
++iter
)
cout << *iter << " ";
cout << endl;
for
(
vector<string>::const_reverse_iterator iter = args.rbegin();
iter != args.rend();
++iter
)
cout << *iter << " ";
cout << endl;
return 0;
}
The examples also illustrates that plain
pointers can be used instead of
iterators. The initialization vector<string> args(argv, argv + argc)
provides the args vector with a pair of pointer-based iterators: argv
points to the first element to initialize sarg with, argv + argc
points just beyond the last element to be used, argv++ reaches the next
string. This is a general characteristic of pointers, which is why they too
can be used in situations where iterators are expected.
The STL defines five types of iterators. These types recur in the generic algorithms, and in order to be able to create a particular type of iterator yourself it is important to know their characteristics. In general, iterators must define:
operator==(), testing two iterators for equality,
operator++(), incrementing the iterator, as
prefix operator,
operator*(), to access the element the iterator refers to,
InputIterators can read from a container. The dereference operator is guaranteed to work asrvaluein expressions. Instead of an InputIterator it is also possible to (see below) use a Forward-, Bidirectional- or RandomAccessIterator. With the generic algorithms presented in this chapter. Notations likeInputIterator1andInputIterator2may be observed as well. In these cases, numbers are used to indicate which iterators `belong together'. E.g., the generic functioninner_product()has the following prototype:Type inner_product(InputIterator1 first1, InputIterator1 last1, InputIterator2 first2, Type init);HereInputIterator1 first1andInputIterator1 last1are a set of input iterators defining one range, whileInputIterator2 first2defines the beginning of a second range. Analogous notations like these may be observed with other iterator types.
OutputIterators can be used to write to a container. The dereference operator is guaranteed to work as anlvaluein expressions, but not necessarily asrvalue. Instead of an OutputIterator it is also possible to use, see below, a Forward-, Bidirectional- or RandomAccessIterator.
ForwardIterators combine InputIterators and OutputIterators. They can be used to traverse containers in one direction, for reading and/or writing. Instead of a ForwardIterator it is also possible to use a Bidirectional- or RandomAccessIterator.
BidirectionalIterators can be used to traverse containers in both directions, for reading and writing. Instead of a BidirectionalIterator it is also possible to use a RandomAccessIterator. For example, to traverse a list or a deque a BidirectionalIterator may be useful.
RandomAccessIterators provide
random access to container
elements. An algorithm such as
sort() requires a RandomAccessIterator, and
can therefore not be used with lists or maps, which only provide
BidirectionalIterators.
copy() algorithm
has three parameters, the first two defining the range of visited elements,
and the third parameter defines the first position where the results of the
copy operation should be stored. With the copy() algorithm the number of
elements that are copied are usually available beforehand, since the number is
usually determined using pointer arithmetic. However, there are situations
where pointer arithmetic cannot be used. Analogously, the number of resulting
elements sometimes differs from the number of elements in the initial
range. The generic algorithm unique_copy() is a case in
point: the number of elements which are copied to the destination container is
normally not known beforehand.
In situations like these, an
inserter adaptor function may be used to
create elements in the destination container when they are needed.
There are three types of inserter() adaptors:
back_inserter(): calls the container's
push_back() member to add
new elements at the end of the container. E.g., to copy all elements of
source in reversed order to the back of destination:
copy(source.rbegin(), source.rend(), back_inserter(destination));
front_inserter() calls the container's
push_front() member to add
new elements at the beginning of the container. E.g., to copy all elements of
source to the front of the destination container (thereby also reversing
the order of the elements):
copy(source.begin(), source.end(), front_inserter(destination));
inserter() calls the container's
insert() member to add new
elements starting at a specified starting point. E.g., to copy all elements of
source to the destination container, starting at the beginning of
destination, shifting existing elements beyond the newly inserted
elements:
copy(source.begin(), source.end(), inserter(destination,
destination.begin()));
back_inserter(), this iterator expects the name
of a container having a member push_back(). This member is called by the
inserter's operator()() member. When a class (other than the abstract
containers) supports a push_back() container, its objects can also be
used as arguments of the back_inserter() if the class defines a
typedef DataType const &const_reference;
in its interface, where DataType const & is the type of the parameter
of the class's member function push_back(). For example, the following
program defines a (compilable) skeleton of a class IntStore, whose objects
can be used as arguments of the back_inserter iterator:
#include <algorithm>
#include <iterator>
using namespace std;
class Y
{
public:
typedef int const &const_reference;
void push_back(int const &)
{}
};
int main()
{
int arr[] = {1};
Y y;
copy(arr, arr + 1, back_inserter(y));
}
istream_iterator<Type>() can be used to define an iterator (pair) for
istream objects. The general form of the istream_iterator<Type>()
iterator is:
istream_iterator<Type> identifier(istream &inStream)
Here, Type is the type of the data elements that are read from
the istream stream. Type may be any type for which operator>> is
defined with istream objects.
The default constructor defines the end of the iterator pair, corresponding to end-of-stream. For example,
istream_iterator<string> endOfStream;
Note that the actual stream object which was specified for the
begin-iterator is not mentioned here.
Using a back_inserter() and a set of
istream_iterator<>() adaptors, all strings could be read from cin as
follows:
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>
using namespace std;
int main()
{
vector<string> vs;
copy(istream_iterator<string>(cin), istream_iterator<string>(),
back_inserter(vs));
for
(
vector<string>::iterator from = vs.begin();
from != vs.end();
++from
)
cout << *from << " ";
cout << endl;
return 0;
}
In the above example, note the use of the
anonymous versions of the
istream_iterator adaptors. Especially note the use of the anonymous
default constructor. The following (non-anonymous) construction could have
been used instead of istream_iterator<string>():
istream_iterator<string> eos;
copy(istream_iterator<string>(cin), eos, back_inserter(vs));
Before istream_iterators can be used the following preprocessor
directive must have been specified:
#include <iterator>
This is implied when
iostream is included.
streambuf objects.
Before
istreambuf_iterators can be used the following preprocessor
directive must have been specified:
#include <iterator>
The
istreambuf_iterator is available for reading from streambuf
objects supporting
input operations. The standard operations
that are available for
istream_iterator objects are also available
for istreambuf_iterators. There are three
constructors:
istreambuf_iterator<Type>():This constructor represents the end-of-stream iterator while extracting values of typeTypefrom thestreambuf.
istreambuf_iterator<Type>(istream):This constructor constructs anistreambuf_iteratoraccessing thestreambufof theistreamobject, used as the constructor's argument.
istreambuf_iterator<Type>(streambuf *):This constructor constructs anistreambuf_iteratoraccessing thestreambufwhose address is used as the constructor's argument.
istreambuf_iterators and
ostreambuf_iterators.
ostream_iterator<Type>() can be used to define a destination iterator
for an
ostream object. The general forms
of the ostream_iterator<Type>() iterator are:
ostream_iterator<Type> identifier(ostream &outStream), // and:
ostream_iterator<Type> identifier(ostream &outStream, char const *delim);
Type is the type of the data elements that should be written to the
ostream stream. Type may be any type for which operator<< is defined
in combinations with ostream objects. The latter form of the
ostream_iterators separates the individual Type data elements by
delimiter strings. The former definition does not use any delimiters.
The following example shows how
istream_iterators and an ostream_iterator may
be used to copy information of a file to another file. A subtlety is the
statement in.unsetf(ios::skipws): it resets the
ios::skipws flag. The
consequence of this is that the default behavior of operator>>, to skip
whitespace, is modified. White space characters are simply returned by the
operator, and the file is copied unrestrictedly. Here is the program:
Before
ostream_iterators can be used the following preprocessor
directive must have been specified:
#include <iterator>
ostreambuf_iterator can be used the following preprocessor
directive must have been specified:
#include <iterator>
The
ostreambuf_iterator is available for writing to streambuf
objects supporting
output operations. The standard operations
that are available for
ostream_iterator objects are also available
for ostreambuf_iterators. There are two
constructors:
ostreambuf_iterator<Type>(ostream):This constructor constructs anostreambuf_iteratoraccessing thestreambufof theostreamobject, used as the constructor's argument, to insert values of typeType.
ostreambuf_iterator<Type>(streambuf *):This constructor constructs anostreambuf_iteratoraccessing thestreambufwhose address is used as the constructor's argument.
istreambuf_iterators and an ostreambuf_iterator, showing yet another
way to copy a stream:
#include <iostream>
#include <algorithm>
#include <iterator>
using namespace std;
int main()
{
istreambuf_iterator<char> in(cin.rdbuf());
istreambuf_iterator<char> eof;
ostreambuf_iterator<char> out(cout.rdbuf());
copy(in, eof, out);
return 0;
}
fun(), a memory leak is created by calling
fun(): the allocated int value remains inaccessibly allocated:
void fun()
{
new int;
}
To prevent memory leaks strict bookkeeping is required: the programmer has
to make sure that the memory pointed to by a pointer is deleted just before
the pointer variable goes
out of scope. The above example could easily be
repaired:
void fun()
{
delete new int;
}
Now fun() merely wastes some time.
When a pointer variable points to a single value or object, the
bookkeeping requirements may be relaxed when the pointer variable is defined
as a std::unique_ptr
object. Unique_ptrs are objects,
masquerading as pointers. Since they're objects, their destructors are called
when they go out of scope, and because of that, their destructors will take
the responsibility of deleting the dynamically allocated memory.
Before unique_ptrs can be used the following preprocessor directive must
have been specified:
#include <memory>
Normally, an unique_ptr object is initialized with a dynamically
created value or object.
The following should be considered when using unique_ptrs:
unique_ptr to another move semantics is
used. If move semantics are not available compilation will fail. On the other
hand, if compilation succeeds then the used containers or generic algorithms
support the use of unique_ptrs. Here is an example:
std::unique_ptr<int> up1(new int);
std::unique_ptr<int> up2 = up1; // compilation error
The second definition fails to compile since unique_ptr's copy
constructor is private (the same holds true for the assignment
operator). The unique_ptr class does offer facilities to initialize
and assign from rvalue references:
class unique_ptr // interface partially shown
{
public:
unique_ptr(unique_ptr &&other); // rvalues bind here
private:
unique_ptr(const unique_ptr &other);
};
Consequently, in the above example move semantics should be used. E.g.,
std::unique_ptr<int> up1(new int);
std::unique_ptr<int> up2 = std::move(up3);
unique_ptr object should only point to memory that was made
available dynamically, as only
dynamically allocated memory can be deleted.
unique_ptr objects should not be allowed to point to the
same block of dynamically allocated memory. The unique_ptr's interface was
designed to prevent this from happening. Once an unique_ptr object goes out
of scope, it deletes the memory it points to, immediately changing any other
object also pointing to the allocated memory into a
wild
pointer.
class unique_ptr defines several member functions
to access the pointer itself or to have the unique_ptr point to another
block of memory. These member functions and ways to construct unique_ptr
objects are discussed in the next sections.
A unique_ptr (as well as a shared_ptr, see section 18.4) can
be used as a safe alternative to the now deprecated auto_ptr. It also
augments auto_ptr as it can be used with containers and algorithms; as it
adds customizable deleters and as it adds the possibility to handle arrays.
unique_ptr
objects. Each definition contains the usual <type> specifier between
angle brackets. Concrete examples are given in the coming sections, but an
overview of the various possibilities is presented here:
unique_ptr object to point to a block
of memory allocated by the new operator:
unique_ptr<type> identifier (new-expression [, deleter]);
This form is discussed in section 18.3.2.
unique_ptr object using move
semantics, either supported by the data type itself or forced using
std::move:
// type must support move semantics or std::move(other) must be used
unique_ptr<type> identifier(another unique_ptr for type);
This form is discussed in section 18.3.4.
unique_ptr object that
does not point to a particular block of memory:
unique_ptr<type> identifier;
This form is discussed in section 18.3.5.
unique_ptr
object is to provide its constructor with a block of memory allocated
by
operator new operator. The generic form is:
unique_ptr<type [, deleter_type]> identifier(new-expression
[, deleter = deleter_type()]);
The second (template) argument (deleter(_type)) is optional and may
refer to a class/object handling the destruction of the allocated memory. This
is used, e.g., in situations where a double pointer is allocated and the
destruction must visit each nested pointer to destroy the memory it points
at (see below).
Here is an example initializing an unique_ptr pointing to a string
object:
unique_ptr<string> strPtr(new string("Hello world"));
To initialize an unique_ptr to point to a double value the
following construction can be used:
unique_ptr<double> dPtr(new double(123.456));
Note the use of operator new in the above expressions. Using new
ensures the dynamic nature of the memory pointed to by the unique_ptr
objects and allows the deletion of the memory once unique_ptr objects go
out of scope. Also note that type does not mention the pointer:
the
type used in the unique_ptr construction
is the same as used in new expressions.
The next example shows how an explicitly defined deleter may be used to delete a dynamically allocated array of pointers to strings. Instead of using a template value parameter the deleter's constructor could of course also be given a parameter initializing the deleter with the number of elements to delete:
#include <iostream>
#include <memory>
using namespace std;
template <typename Type, size_t size>
struct D2
{
void operator()(Type * ptr) const
{
for (size_t idx = 0; idx < size; ++idx)
delete ptr[idx];
delete ptr;
}
};
int main()
{
unique_ptr<int *, D2<int *, 10>> sp2(new int *[10]);
D2<int *, 10> &obj = sp2.get_deleter();
}
In the example allocating an int values given in section 18.3,
the memory leak can be avoided using an unique_ptr object:
#include <memory>
using namespace std;
void fun()
{
unique_ptr<int> ip(new int);
}
All
member functions available for
objects allocated by the new expression can be reached via the
unique_ptr as if it was a plain pointer to the dynamically allocated
object. For example, in the following program the text `C++' is inserted
behind the word `hello':
#include <iostream>
#include <memory>
#include <cstring>
using namespace std;
int main()
{
unique_ptr<string> sp(new string("Hello world"));
cout << *sp << endl;
sp->insert(strlen("Hello "), "C++ ");
cout << *sp << endl;
}
/*
produced output:
Hello world
Hello C++ world
*/
unique_ptrs to stored arrays the dereferencing operator makes
little sense. Conversely, when arrays are handled by smart pointers they will
benefit from an index operator.
When using smart pointers (unique_ptr and shared_ptr, see section
18.4) to handle arrays the following additional syntax is available:
[] notation is
used. E.g.,
unique_ptr<int[]> intArr(new int[3]);
intArr[2] = intArr[0];
it() The default deleter will use tt(delete[]).
unique_ptr
may also be initialized
by another unique_ptr object for the same type provided type supports
move semantics or move semantics is forced using
std::move.
The generic form is:
unique_ptr<type> identifier(other unique_ptr object);
For example, to initialize an unique_ptr<string>, given the variable
sp defined in the previous section, the following construction can be
used:
unique_ptr<string> sp2(move(sp));
This move-construction can be used when the data type specified with
unique_ptr implements its move-constructor.
Analogously, the assignment operator can
be
used. An unique_ptr object may be assigned to another unique_ptr
object of the same type, again using move-semantics. For example:
#include <iostream>
#include <memory>
#include <string>
using namespace std;
int main()
{
unique_ptr<string> hello1(new string("Hello world"));
unique_ptr<string> hello2(move(hello1));
unique_ptr<string> hello3;
hello3 = move(hello2);
cout << // *hello1 << endl << // would segfault
// *hello2 << endl << // same
*hello3 << endl;
}
/*
Produced output:
Hello world
*/
Looking at the above example, we see that
hello1 is initialized as described in the previous section.
hello2 is defined, and it receives its value from
hello1 using a
move constructor initialization. This
effectively changes hello1 into a 0-pointer.
hello3 is defined as a default unique_ptr<string>, but
it receives its value through a move-assignment from hello2, which then
becomes a 0-pointer too.
hello1 or hello2 would be inserted
into cout a
segmentation fault would be generated. The reason for
this will now be clear: it is caused by dereferencing 0-pointers. In the end,
only hello3 actually points to a string.
unique_ptr object: Without arguments an empty unique_ptr object is
constructed not pointing to a particular block of memory:
unique_ptr<type> identifier;
In this case the
underlying pointer is set to
0 (zero). Although the unique_ptr object itself is not the pointer,
its value can be compared to 0 to see if it has been
initialized. E.g., code like
unique_ptr<int> ip;
if (!ip)
cout << "0-pointer with a unique_ptr object" << endl;
will produce the shown text. Alternatively, the member
get() is available. This member function,
as well as the other member functions of the class unique_ptr are
described in the next section.
unique_ptr:
unique_ptr &unique_ptr<Type>operator=(unique_ptr<Type> &other):This operator will transfer the memory pointed to by the rvalueunique_ptrobject to the lvalueunique_ptrobject using move semantics. So, the rvalue object loses the memory it pointed at, and turns into a 0-pointer.
unique_ptr &unique_ptr<Type>operator bool() const:This operator returnsfalseif theunique_ptrdoes not point to memory (i.e., itsget()member, see below, returns 0). Otherwise,trueis returned.
Type &unique_ptr<Type>operator*():This operator returns a reference to
the information stored in the unique_ptr object. It acts like a normal
pointer
dereference operator.
Type *unique_ptr<Type>operator->():This operator returns a pointer to the information stored in theunique_ptrobject. Through this operator members of a stored object can be selected. For example:unique_ptr<string> sp(new string("hello")); cout << sp->c_str() << endl;
unique_ptr objects:
Type *unique_ptr<Type>::get():This member does the same asoperator->(): it returns a pointer to the information stored in theunique_ptrobject. This pointer can be inspected: if it's zero theunique_ptrobject does not point to any memory. This member cannot be used to let theunique_ptrobject point to (another) block of memory.
Deleter &unique_ptr<Type>::get_deleter():This member returns a
reference to the
deleter class object used by the unique_ptr.
Type *unique_ptr<Type>::release():This member returns a pointer to the information stored in theunique_ptrobject, which loses the memory it pointed at (and changes into a 0-pointer). The member can be used to transfer the information stored in theunique_ptrobject to a plainTypepointer. It is the responsibility of the programmer to delete the memory returned by this member function.
void unique_ptr<Type>::reset(Type *):This member may also be called without argument, to delete the memory stored in theunique_ptrobject, or with a pointer to a dynamically allocated block of memory, which will thereupon be the memory accessed by theunique_ptrobject. This member function can be used to assign a new block of memory (new content) to anunique_ptrobject.
void unique_ptr<Type>::swap(unique_ptr<Type> &&):This member is used to swap two identically typed unique_ptrs.
std::auto_ptr<Type>. This class did
not support
move semantics although when one auto_ptr object was
assigned to another, the right-hand object lost the information it originally
pointed at.
The std::unique_ptr, available since the C++0x standard does not have
auto_ptr's drawbacks and consequently using auto_ptr is now
deprecated.
The following restrictions
apply to
auto_ptrs:
auto_ptr class does not support move semantics;
auto_ptr object cannot be used to point to
arrays of objects;
auto_ptr cannot be used as a data type of abstract
containers.
No further coverage of the auto_ptr class is offered by the C++
Annotations. Existing software should be modified to use unique_ptrs,
newly designed software should avoid using auto_ptr objects.
std::unique_ptr the C++0x standard offers a reference
counting smart pointer by the name of
std::shared_ptr<Type>.
The shared pointer will automatically destroy its contents once its reference count has decayed to zero.
Different from unique_ptrs shared_ptrs support copy constructors and
standard overloaded assignment operators. Since shared_ptrs share the
memory they point at
move semantics is not a requirement.
As with unique_ptr a shared_ptr may refer to a dynamically allocated
array.
shared_ptr
objects. Each definition contains the usual <type> specifier between
angle brackets. Concrete examples are given in the coming sections, but an
overview of the various possibilities is presented here:
shared_ptr object to point to a block
of memory allocated by the new operator:
shared_ptr<type> identifier (new-expression [, deleter]);
This form is discussed in section 18.4.2.
shared_ptr objects point at the same memory, and the
object's reference count is incremented.
shared_ptr object using move
semantics, either supported by the data type itself or forced using
std::move:
// type must support move semantics or std::move(other) must be used
shared_ptr<type> identifier(a temporary shared_ptr for type);
This form is discussed in section 18.4.3.
shared_ptr object that
does not point to a particular block of memory:
shared_ptr<type> identifier;
This form is discussed in section 18.4.4.
shared_ptr
object is to provide its constructor with a block of memory allocated
by
operator new operator. The generic form is:
shared_ptr<type [, deleter_type]> identifier(new-expression
[, deleter = deleter_type()]);
The second (template) argument (deleter(_type)) is optional and may
refer to a class/object handling the destruction of the allocated memory. It
is used in situations comparable to those encountered with unique_ptr
(cf. section 18.3.2).
Here is an example initializing an shared_ptr pointing to a string
object:
shared_ptr<string> strPtr(new string("Hello world"));
Note the use of operator new in the above expression. The type
specified for the shared_ptr is identical to the type used in new
expression.
All
member functions available for
objects allocated by the new expression can be reached via the
shared_ptr as if it was a plain pointer to the dynamically allocated
object. For example, in the following program the text `C++' is inserted
behind the word `hello'. This example also shows the use of a copy
constructed shared pointer and it shows that following the copy construction
both objects point at the same information:
#include <iostream>
#include <memory>
#include <cstring>
using namespace std;
int main()
{
shared_ptr<string> sp(new string("Hello world"));
shared_ptr<string> sp2(sp);
sp->insert(strlen("Hello "), "C++ ");
cout << *sp << '\n' <<
*sp2 << endl;
}
/*
produced output:
Hello C++ world
Hello C++ world
*/
shared_ptr may also be initialized by an r-value reference (a temporary
value) initialized an returned by a function. E.g.,
std::shared_ptr<std::string> &&makeString(char const *text)
{
return std::move(shared_ptr<string>(new std::string(text)));
}
std::shared_ptr<std::string> shared(makeString("hello world"));
shared_ptr object: Without arguments an empty shared_ptr object is
constructed not pointing to a particular block of memory:
shared_ptr<type> identifier;
In this case the
underlying pointer is set to
0 (zero). Like a std::unique_ptr a shared_ptr object can be
compared to 0 to see if it has been initialized. E.g., code like
shared_ptr<int> ip;
if (!ip)
cout << "0-pointer with a shared_ptr object" << endl;
will produce the shown text. Alternatively, the member
get() is available. This member function,
as well as the other member functions of the class shared_ptr are
described in the next section.
shared_ptr:
shared_ptr &shared_ptr<Type>operator=(shared_ptr<Type> &other):This operator reduces the reference count of the left-hand side object, deleting its memory when its count decays to zero, and sets its pointer to the memory pointed at by right-hand side object, incrementing its reference count.
shared_ptr &shared_ptr<Type>operator bool() const:This operator returnsfalseif theshared_ptrdoes not point to memory (i.e., itsget()member, see below, returns 0). Otherwise,trueis returned.
Type &shared_ptr<Type>operator*():This operator returns a reference to
the information stored in the shared_ptr object. It acts like a normal
pointer
dereference operator.
Type *shared_ptr<Type>operator->():This operator returns a pointer to the information stored in theshared_ptrobject. Through this operator members of a stored object can be selected. For example:shared_ptr<string> sp(new string("hello")); cout << sp->c_str() << endl;
The following
member functions are defined for shared_ptr objects:
Type *shared_ptr<Type>::get():This member does the same asoperator->(): it returns a pointer to the information stored in theshared_ptrobject. This pointer can be inspected: if it's zero theshared_ptrobject does not point to any memory. This member cannot be used to let theshared_ptrobject point to (another) block of memory.
Deleter &shared_ptr<Type>::get_deleter():This member returns a
reference to the
deleter class object used by the shared_ptr.
Type *shared_ptr<Type>::release():This member returns a pointer to the information stored in theshared_ptrobject, which loses the memory it pointed at (and changes into a 0-pointer). The member can be used to transfer the information stored in theshared_ptrobject to a plainTypepointer. It is the responsibility of the programmer to delete the memory returned by this member function.
void shared_ptr<Type>::reset(Type *):This member may also be called without argument, to delete the memory stored in theshared_ptrobject, or with a pointer to a dynamically allocated block of memory, which will thereupon be the memory accessed by theshared_ptrobject. This member function can be used to assign a new block of memory (new content) to anshared_ptrobject.
void shared_ptr<Type>::swap(shared_ptr<Type> &&):This member is used to swap two identically typed shared_ptrs.
bool shared_ptr<Type>unique() const:This member returns.trueif the memory is referenced by the current object only and returnsfalseotherwise
size_t shared_ptr<Type>use_count() const:This member returns the number of objects sharing the memory pointed at by their data pointers.
shared_ptr's main features have been described, consider
the following simple class:
// required #includes
class Map
{
std::map<string, Data> *d_map;
public:
Map(char const *filename) throw(std::exception);
};
The class's constructor Map() performs the following tasks:
std::map object;
Map::Map(char const *fname)
:
d_map(new std::map<std::string, Data>) throw(std::exception)
{
ifstream istr(fname);
if (!istr)
throw std::exception("can't open the file");
fillMap(istr);
}
What's wrong with this implementation? Its main weakness is that it hosts
a potential
memory leak. The memory leak only occurs when the exception is
actually thrown. In all other cases, the function operates perfectly
well. When the exception is thrown, the map has just been dynamically
allocated. However, even though the class's destructor will dutifully call
delete d_map, the destructor is actually never called, as the destructor
will only be called to destroy
objects that were constructed completely. Since the constructor terminates in
an exception, its associated object is not constructed completely, and
therefore that object's destructor is never called.
Shared_ptrs (as well as unique_ptrs, cf. section 18.3)
may be used to prevent these kinds of problems. By defining d_map as
std::shared_ptr<std::map<std::string, Data> >
it suddenly changes into an object. Now, Map's constructor may safely
throw an exception. As d_map is an object itself, its destructor will be
called by the time the (however incompletely constructed) Map object goes
out of scope.
As a
rule of thumb: classes should use shared_ptr or unique_ptr
objects, rather than plain pointers for their pointer data members if there's
any chance that their constructors will end prematurely in an exception.
The Annotations don't aim at introducing the concepts behind multi threading. It is a topic by itself and many good reference sources exist (cf. Nichols, B, et al.'s Pthreads Programming, O'Reilly for a good introduction to multi-threading).
Multi threading facilities are offered through the class
std::thread. Its
constructor and assignment operator accept a function (object) that will
handle the thread created by the thread object.
Thread synchronization is handled by objects of the class
std::mutex and
condition variables are implemented by the class
std::condition_variable.
In order to use multi threading in C++ programs the Gnu g++ compiler
requires the use of the
-pthread
flag. E.g., to compile a multi-threaded program defined in the source file
multi.cc the compiler is called as follows:
g++ --std=c++0x -pthread -Wall multi.cc
Threads in C++ are very much under development. It is likely that in the near future features will be added and possibly redefined. The sections below should therefore be read with this in mind.
std::thread class implements a new thread. It can be constructed empty
(in which case no new thread is started yet) but it may also be given a
function or function object, in which case the thread is started immediately.
Alternatively, a function or function object may be assigned to an
existing thread object causing a new thread to be launched using that
function (object).
There are several ways to end a launched thread. Currently a thread ends when
the function called from the thread object ends. Since the thread
object not only accepts functions but also function objects as its argument,
a
local context may be passed on to the the thread. Here are two examples
of how a thread may be started: in the first example a function is passed on
to the thread, in the second example a function object:
#include <iostream>
#include <thread>
#include <cstdlib>
using namespace std;
void hello()
{
cout << "hello world\n";
}
class Zero
{
int *d_data;
size_t d_size;
public:
Zero(int *data, size_t size)
:
d_data(data),
d_size(size)
{}
void operator()(int arg) const
{
for (int *ptr = d_data + d_size; ptr-- != d_data; )
*ptr = 0;
}
};
int main()
{
int data[30];
Zero zero(data, 30);
int value = 0;
std::thread t1(zero, value);
std::thread t2(hello);
};
Thread objects do not implement a copy constructor, but a move constructor is provided. Threads may be swapped as well, even if they're actually running a thread. E.g.,
std::thread t1(Thread());
std::thread t2(Thread());
t1.swap(t2);
According to the current specifications of the thread class its
constructor should also be able to accept additional arguments (in addition to
the function (object) handled by the thread object), but currently that
facility does not appears to be very well implemented. Any objects otherwise
passed to the function call operator could of course also be passed using
separate members or via the object's constructors, which is probably an
acceptable makeshift solution until multiple arguments can be passed to the
thread constructor itself (using perfect forwarding, cf. section
21.5.2).
The thread's overloaded assigment operator can also be used to start a thread. If the current thread object actually runs a thread it is stopped, and the function (object) assigned to the thread object becomes the new running thread. E.g.,
std::thread thr; // no thread runs from thr yet
thr = Thread(); // a thread is launched
Threads (among which the thread represented by main) may be forced to wait
for another threds's completion by calling the other thread's join()
member. E.g., in the following example main launches two threads and wait
for the completion of both:
int main()
{
std::thread t1(Thread());
std::thread t2(Thread());
t1.join(); // wait for t1 to complete
t2.join(); // same, t2
}
The thread's member detach() can be called to disassociate the current
thread from its starting thread. Ending a thread other than ending the
activities of the function defining the thread's activities appears currently
very well implemented.
std::mutex class a
std::recursive_mutex is
offered: when called multiple times by the same thread it will increase its
lock-count. Before other threads may access the protected data the recursive
mutex must be unlocked again that number of times. In addition
std::timed_mutex and
recursive_timed_mutex is offered: their locks
will time out after a preset amount of time.
In many situations locks will be released at the end of some action
block. To simplify locking additional template classes
std::unique_lock<> and
std::lock_guard<> are provided. As their
constructor locks the data and their destructor unlocks the data they can be
defined as local variables, unlocking their data once their scope terminates.
Here is a simple example showing its use; at the end of safeProcess guard
is destroyed, thereby releasing the lock on data:
std::mutex dataMutex;
Data data;
void safeProcess()
{
std::lock_guard<std::mutex> guard(dataMutex);
process(data);
}
A std::unique_lock is used similarly, but is used when timeouts must be
considered as well:
std::timed_mutex dataMutex;
Data data;
void safeProcess()
{
std::unique_lock<std::timed_mutex>
guard(dataMutex, std::chrono::milliseconds(3));
if (guard)
process(data);
}
In the above example guard tries to obtain the lock during three
milliseconds. If guard's bool conversion operator returns true the
lock was obtained and data can be processed safely.
A common cause of problems with multi threaded programs is the occurrence of
deadlocks. If a deadlock may occur when two locks are required to process
data, but one thread obtains the first lock and another thread obtains the
second lock. The C++0x standard defines a generic
std::lock function that
may be used to prevent problems like these. The std::lock function
can be used to lock multiple mutexes in one atomic action. Here is an example:
struct SafeString
{
std::mutex d_mutex;
std::string d_text;
};
void calledByThread(SafeString &first, SafeString &second)
{
std::unique_lock<std::mutex> // 1
lock_first(first.d_mutex, std::defer_lock);
std::unique_lock<std::mutex> // 2
lock_second(second.d_mutex, std::defer_lock);
std::lock(lock_first, lock_second); // 3
safeProcess(first.d_text, second.d_text);
}
At 1 and 2 unique_locks are created. Locking is deferred until calling
std::lock in 3. Having obtained the lock, the two SafeString text
members can both be safely processed by calledByThread.
Another problematic issue with threads involves initialization. If multiple threads are running and only the first thread encountering the initialization code should perform the initialization then this problem should not be solved by mutexes. Although the synchronization is accomplished, it will needlessly be accomplished time and again once the initialization phase has been completed. The C++0x standard offers several ways to perform a proper initialization:
g++ compiler and its
discussion is therefore postponen.
#include <iostream>
struct Cons
{
Cons()
{
std::cout << "Cons called\n";
}
};
void called(char const *time)
{
std::cout << time << "time called() activated\n";
static Cons cons;
}
int main()
{
std::cout << "Pre-1\n";
called("first");
called("second");
std::cout << "Pre-2\n";
Cons cons;
}
/*
Generated output:
Pre-1
firsttime called() activated
Cons called
secondtime called() activated
Pre-2
Cons called
*/
This feature causes a thread to wait automatically if another thread is
still initializing the static data (note that non-static data never cause
problems, as each non-static local variables have lifes that are completely
restricted to their own threads).
std::call_once and std::once_flag result in the one-time execution of
a specified function as illustrated by the next example:
std::string *global;
std::once_flag globalFlag;
void initializeGlobal()
{
global = new std::string("Hello world (why not?)");
}
void safeUse()
{
std::call_once(globalFlag, initializeGlobal);
process(*global);
}
Mutexes may be used in C++0x after including the header file
mutex.
At some point the producer will therefore have to wait until the client has consumed enough and there's space available in the producer's storage. Similarly, the client will have to wait until the producer has produced some more items.
Locking and polling the amount of available items/storage at fixed time intervals usually isn't a good option as it is in principle a wasteful scheme: threads continue to wait during the full interval even though the condition to continue may already have been met; reducing the interval, on the other hand, isn't an attractive option either as it results in a relative increase of the overhead associated with handling the associated mutexes.
Condition variables may be used to solve these kinds of problems. A thread simply sleeps until it is notified by another thread. In general terms this may be accomplished as follows:
producer loop:
- produce the next item
- wait until there's room to store the item,
then reduce the available room
- store the item
- increment the number of items in store
consumer loop:
- wait until there's an item in store,
then reduce the number of items in store
- remove the item from the store
- increment the number of available storage locations
- do something with the retrieved item
It is important that the two storage administrative tasks (registering the number of available items and available storage locations) are either performed by the client or by the producer. `Waiting' in this case means:
condition_variable is used. The variable
containing the actual count is called semaphore; and it can be protected
by a mutex sem_mutex. In addition a condition_variable condition is
defined. The waiting process is defined in the following function down
implemented as follows:
void down()
{
unique_lock<mutex> lk(sem_mutex); // get the lock
while (semaphore == 0)
condition.wait(lk); // see 1, below.
--semaphore; // dec. semaphore count
} // the lock is released
At 1 the condition variable's
wait()
member internally releases the
lock, wait for a notification to continue, and re-acquires the lock just
before returning. Consequently, the down's code always has complete and
unique control over semaphore.
What about notifying the condition variable? This is handled by the
`increment the number ...' parts in the abovementioned produced and consumer
loops. These parts are defined by the following up() function, implemented
as follows:
void up()
{
lock_guard<std::mutex> lk(sem_mutex); // get the lock
if (semaphore++ == 0)
condition.notify_one(); // see 2, below
} // the lock is released
At 2 semaphore is always incremented. However, by using a postfix
increment it can be tested for being zero at the same time and if it was zero,
initially then semaphore is now one. Consequently, the thread waiting for
semaphore being unequal to zero may now
continue. Condition.notify_one()
will notify a waiting
thread (see down's implementation above). If situations where multiple
threads are waiting `
notify_all()' can be used.
Handling semaphore can very well be implemented in a class
Semaphore, offering members down() and up(). For a more extensive
discussion of semaphores see
Tanenbaum, A.S. (2006)
Structured Computer Organization, Pearson Prentice-Hall.
Using the semaphore facilities, embedded in a class Semaphore whose
constructor expects an initial value of its semaphore data member, the
classic producer and consumer case can now easily be implemented in the
following multi-threaded program (A more elaborate example of the
producer-consumer program is found in the yo/stl/examples/events.cc file
in the C++ Annotations's source archive):
Semaphore available(10);
Semaphore filled(0);
std::queue itemQueue;
void producer()
{
size_t item = 0;
while (true)
{
++item;
available.down();
itemQueue.push(item);
filled.up();
}
}
void client()
{
while (true)
{
filled.down();
size_t item = itemQueue.front();
itemQueue.pop();
available.up();
process(item); // not implemented here
}
}
int main()
{
thread produce(producer);
thread consume(consumer);
produce.join();
return 0;
}
To use condition_variable objects the header file
condition_variable
must be included.
g++ compiler this section will only cover the concept, the proposed
syntax and some examples.
As we've seen the generic algorithms often accept an argument which can either
be a function object or a plain function. Examples are the sort and
find_if generic algorithms. When the function called by the generic
algorithm must remember its state a function object is appropriate, otherwise
a plain function will do.
The function or function object is usually not readily available. Often it must be defined in or near the function using the generic algorithm. Often the software engineer will resort to anonymous namespaces in which a class or function is defined that is thereupon used in the function calling the generic algorithm. If that function is itself a member function the need may be felt to access other members of its class from the function called by the generic algorithm. Often this results in a significant amount of code (defining the class), in complex code (to make available software elements that aren't native to the called function (object)), and -at the level of the source file- code that is irrelevant at the current level of specification. Nested classes don't solve these problems and, moreover, nested classes can't be used in templates.
Solutions to the above problems exist. In section 23.8 a general approach is discussed allowing the use of members and/or local variables in the context of a function or function object called from generic algorithms.
However, the solutions given in section 23.8 are in fact makeshift solutions that need to be used as long as the language doesn't offer lambda functions. A lambda function is an anonymous function. Such a function may be defined on the spot, and will exist only during the lifetime of the statement of which it is a part.
Here is an example of the definition of a lambda function:
[](int x, int y)
{
return x * y;
}
This particular function expects two int arguments and returns their
product. It could be used e.g., in combination with the accumulate generic
algorithm to compute the product of a series of int values stored in a
vector:
cout << accumulate(vi.begin(), vi.end(), 1,
[](int x, int y) { return x * y; });
The above lambda function will use the implicit return
ih(return type: implicit)
type decltype(x * y). An implicit return type can
only be used if the lambda function has a single statement of the form
return expression;.
Alternatively, the return type can be explicitly specified using a late-specified return type, (cf. section 3.3.5):
[](int x, int y) -> int
{
int z = x + y;
return z + x;
}
A lambda function not returning a value (i.e., a void-function) does not
have to specify a return type at all.
Variables having the same scope as the lambda function can be accessed from the lambda function using references. This allows passing the local context to a lambda function. Such variables are called a closure. Here is an example:
void showSum(vector<int> &vi)
{
int total = 0;
for_each(
vi.begin(), vi.end(),
[&total](int x)
{
total += x;
}
);
std::cout << total;
}
The variable int total is passed to the lambda function as a reference
([&total]) and can directly be accessed by the function. Its parameter
list merely defines an int x, which is initialized in turn by each of the
values stored in vi. Once the generic algorithm has completed
showSum's variable total has received a value equal to the sum of all
the vector's values. It has outlived the lambda function and its value is
displayed.
If a closure variable is defined without the reference symbol it becomes a
simple value which is initialized by the local variable when the lambda
function is passed to the generic algorithm. Usually closure variables are
passed by reference. If all local variables are to be passed by reference
the notation
[&] can be used (to pass the full closure by value use
[=]):
void show(vector<int> &vi)
{
int sum = 0;
int prod = 1;
for_each(
vi.begin(), vi.end(),
[&](int x)
{
sum += x;
prod *= x;
}
);
std::cout << sum << ' ' << prod;
}
Combinations of variables passed by value, but some by reference or passed by reference but some by value is possible. In that case the default is followed by a list of variables passed differently. E.g.,
[&, value](int x)
{
total += x * value;
};
In this case total will be passed by reference, value by value.
Within a class context, members may define lambda functions as well. In those cases the lambda function has full access to all the class's members. In other words, it is defined as a friend of the class. For example:
class Data
{
std::vector<std::string> d_names;
public:
void show() const;
{
int count = 0;
std::for_each(d_names.begin(), d_end(),
[this, &count](std::string const &name)
{
std::cout << ++count <<
this->capitalized(name) << endl;
}
)
}
private:
std::string capitalized(std::string const &name);
}
Note the use of the this pointer: inside the lambda function it must
explicitly be used (so,this->capitalized(name) is used rather than
capitalized(name)). In addition, in situations like the above this
will automatically be available when [&] or [=] is used. So, the above
lambda function could have been defined as:
[&](std::string const &name)
{
std::cout << ++count <<
this->capitalized(name) << endl;
}
When lambda functions are compared to, e.g., the function wrappers
discussed in section 23.8 it probably won't come as a surprise
that
lambda functions are function objects. It is possible to assign a
lambda function to a variable. An example of such an assignment (using
auto to define the type of the variable) is:
auto lambdaFun = [this]()
{
this->member();
};
Lambda functions are not yet supported by the g++ compiler.
To accomodate for these situations the C++0x standard introduces polymorphous (function object) wrappers. Polymorphous wrappers can refer to function pointers, member functions or functors, as long as they match in type and number of their parameters.
Polymorphous wrappers are not yet fully supported by the g++ compiler.
Using these distrbutions is based on new randum number generator functions, differing from the traditional rand(3) function provided by the C standard library. These random number generators are used to produce pseudo-random numbers, which are then filtered through a particular distribution to obtain values that are randomly selected from the specified distributino.
| Class template | Integral/Floating point | Quality | Speed | Size of state |
linear_congruential | Integral | Medium | Medium | 1 |
subtract_with_carry | Both | Medium | Fast | 25 |
| mersenne_twister | Integral | Good | Fast | 624 |
The linear_congruential random number generator computes
valuei+1 = a * valuei + c % ma, the additive constant c and the modulo value
m. E.g.,
linear_congruential<int, 10, 3, 13> lc;
The linear_congruential generator may also be seeded by providing its
constructor with a seeding-argument. E.g., lc(time(0)).
The subtract_with_carry random number generator computes
valuei = valuei-s - valuei-r - carryi-1 % mm, the subtractive constants s and r,
respectively. E.g.,
subtract_with_carry<int, 13, 3, 13> sc;
The subtract_with_carry generator may also be seeded by providing its
constructor with a seeding-argument. E.g., sc(time(0)).
The predefined mersenne_twister mt19937 (predefined using a typedef
defined by the random
header file) is used in the
examples below. It can be constructed using
std::mt19937 mt or it can
be seeded using an argument (e.g., std::mt19937 mt(time(0))). Other ways
to initialize the mersenne_twister are byond the scope of the C++
Annotations (but see Lewis
et al. (
Lewis, P.A.W., Goodman, A.S., and Miller, J.M. (1969), A pseudorandom
number generator for the System/360, IBM Systems Journal, 8, 136-146.) (1969)).
The random number generators may also be seeded by calling their members
seed() which accepts an unsigned long or a generator function (e.g.,
lc.seed(lc), lc.seed(mt)).
The random number generators implement members min() and max()
returning, respectively, their minimum and maximum values (inclusive). If a
reduced range is required the generators can be nested in a function or class
changing its range.
All distributions offer the following members (result_type being the
type name of the values returned by the distribution, distribution-name
being the type name of the mathematical distribution, e.g.,
bernoulli_distribution):
result_type operator()(RandomNumberGenerator):
The next randomly generated value is returned.
std::istream &operator<<(std::istream &in,
distribution-name &object):
A parameters of the distribution are extracted from an
std::istream;
std::ostream &operator<<(std::ostream &out,
bernoulli_distribution const &bd):
A parameters of the distribution are inserted into an
std::ostream
Distributions:
bernoulli_distributiontrue values are generated with a certain probability
p.bernoulli_distribution(double p = 0.5)
binomial_distribution<IntType = int, RealType = double>IntType: the type of the generated random value (by default
int). RealType: the probability parameter of the binomial
distribution.binomial_distribution<IntType, RealType>(IntType t,
RealType p =RealType(0.5))
gamma_distribution<Type = double>Type: the parameter of the gamma distribution.gamma_distribution<IntType, Type>(Type alpha = Type(1))
geometric_distribution<IntType = int, RealType = double>IntType: the type of the generated random value (by default
int). RealType: the probability parameter of the geometric
distribution.geometric_distribution<IntType, RealType>(RealType p =
RealType(0.5))
exponential_distribution<Type = double>Type: the parameter of the exponential distribution.exponential_distribution<IntType, Type>(Type lambda =
Type(1))
normal_distribution<Type = double>Type: the type of the generated random value (by default
double). normal_distribution<Type>(ReadType mean = Type(0),
Type sigma = Type(1))
poisson_distribution<IntType = int, RealType = double>IntType: the type of the generated random value (by default
int). RealType: the type of the parameter of the distribution (by default
double). mean.
Constructor:
poisson_distribution<IntType, RealType>(RealType mean =
RealType(1))
uniform_int<Type = int>Type: the type of the generated random value (by default
int). uniform_int<Type>(Type min = 0, Type max = 9)Type operator()(RandomNumberGenerator), Type upper
uniform_real<Type = real>Type: the type of the generated random value (by default
real). uniform_real<Type>(Type min = Type(0), Type max = Type(1))
GRN * (max - min) + min
which might not be what you'd expect.
Some of the distributions mentioned below appear not yet to be fully
operational in the STL. This appears to be the case with the
binomial_distribution, the gamma_distribution, the
normal_distribution, and the poisson_distribution. These distributions
will likely become fulle operational in the near future.
Here is an example showing the outcome of a statistical experiment consisting of throwing an honest coin 20 times:
#include <iostream>
#include <random>
#include <ctime>
using namespace std;
int main()
{
std::mt19937 mt(time(0));
bernoulli_distribution bd;
for (int idx = 0; idx < 20; ++idx)
cout << (bd(mt) ? "heads" : "tail") << (idx + 1 == 10 ? '\n' : ' ');
cout << endl;
}