Continue in Range Based for Loop
The range-based for loop changed in C++17 to allow the begin and end expressions to be of different types. C++20 then added an optional init-statement for initializing variables in the loop scope.
Overview
The range-based for loop (or range-for for short), along with auto, is one of the most significant features added in the C++11 standard. These are some typical uses of the range-based for loop:
// Iterate over STL container
std::vector<int> v{1,2,3,4};
for(const auto& i : v)
    std::cout << i << "\n";

// Over std::string
std::string s{"1234"};
for(auto c : s)
    std::cout << c << "\n";

// over an array
int a[4]{1,2,3,4};
for(auto& i : a)
    std::cout << i << "\n";

// over a brace-init-list (std::initializer_list)
for(auto& i : {1,2,3,4})
    std::cout << i << "\n";
This article, inspired by the cppreference page, explains the internal functioning of the range-for loop. We will start by outlining how C++11/C++14 range-for works, and then briefly describe the changes made to it in C++17 and C++20 in later sections.
Range-for in C++11/C++14
The range-based for loop has the following format:
for (range_declaration : range_expression) { /*loop body*/ }
In C++11/C++14, the above format results in a code similar to the following:
/* modified code from cppreference */
{
    auto&& range = range_expression;
    for (auto b = beginExpr, e = endExpr; b != e; ++b) {
        range_declaration = *b;
        /* loop body */
    }
}
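To make the expansion concrete, here is a minimal sketch that writes the same loop twice: once as a range-for and once by hand following the C++11/C++14 form above. The function names sum_range_for and sum_expanded are hypothetical and chosen only for this illustration.

```cpp
#include <vector>

// Sum the elements with an ordinary range-based for loop.
int sum_range_for(const std::vector<int>& v) {
    int total = 0;
    for (const auto& i : v)
        total += i;
    return total;
}

// The same loop written out by hand, following the C++11/C++14 expansion.
int sum_expanded(const std::vector<int>& v) {
    int total = 0;
    {
        auto&& range = v;  // range_expression
        for (auto b = range.begin(), e = range.end(); b != e; ++b) {
            const auto& i = *b;  // range_declaration = *b
            total += i;          // loop body
        }
    }
    return total;
}
```

Both functions should return the same result for any vector, since the second is only a manual spelling of what the compiler generates for the first.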
These are the focal points regarding the above implementation:
The Universal Reference:
The range is a universal reference (because it is declared as auto&&). A universal reference, also known as a forwarding reference, can bind to either an lvalue or an rvalue expression. This means the range_expression can be anything, including but not limited to a variable, a const reference, or a function call that returns a temporary.
The Nesting Block:
The entire implementation is nested within a block ({}) statement. If the range_expression returns a temporary, binding it to the range reference extends the temporary's lifetime to that of range, which lives until the end of the enclosing block, that is, until the loop finishes.
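The lifetime rule can be observed with a small sketch. The type Tracked, the function make_tracked, and the global lifetime_log are hypothetical names invented for this illustration; the destructor records when the temporary actually goes away.

```cpp
#include <string>
#include <vector>

// Hypothetical log used only to observe the order of events.
std::vector<std::string> lifetime_log;

// A minimal iterable whose destructor records its own destruction.
struct Tracked {
    std::vector<int> data{1, 2};
    ~Tracked() { lifetime_log.push_back("destroyed"); }
    const int* begin() const { return data.data(); }
    const int* end() const { return data.data() + data.size(); }
};

// Returns a temporary that the loop below iterates over directly.
Tracked make_tracked() { return Tracked{}; }

std::vector<std::string> run_loop() {
    lifetime_log.clear();
    for (int i : make_tracked())  // temporary is bound to the hidden range reference
        lifetime_log.push_back("body " + std::to_string(i));
    lifetime_log.push_back("after loop");
    return lifetime_log;
}
```

With a compiler that elides the copy in make_tracked (guaranteed since C++17), the "destroyed" entry is expected to appear after both loop-body entries and before "after loop".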
Types of beginExpr and endExpr:
The beginExpr and the endExpr are of the same type, and they are resolved depending upon the range_expression as follows:
- If the range_expression is an array of N elements, the beginExpr is range and the endExpr is range+N.
- If the range_expression is a class with members begin and end, the beginExpr is range.begin() and the endExpr is range.end(). All the STL containers (e.g., std::vector and std::map) have begin and end methods that return iterators. Note that if the begin and end members are not functions returning an iterator (or a pointer), this results in a compilation error.
- If none of the above applies, the beginExpr is begin(range) and the endExpr is end(range). Note that here begin and end are unqualified function names (e.g., begin instead of std::begin), and they are resolved using Argument Dependent Lookup (ADL). In a nutshell, ADL means that the compiler also looks up an unqualified function name in the namespaces of its arguments.
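The third case can be sketched as follows. The namespace geo and the type Triple are hypothetical; the point is only that the free begin and end functions live in the same namespace as the type, so the range-for finds them through ADL without any qualification.

```cpp
namespace geo {  // hypothetical namespace for illustration

struct Triple {
    int values[3];
};

// Free functions in the same namespace as Triple: found by ADL.
inline const int* begin(const Triple& t) { return t.values; }
inline const int* end(const Triple& t) { return t.values + 3; }

}  // namespace geo

// Triple has no begin/end members, so the loop uses geo::begin and geo::end.
int sum(const geo::Triple& t) {
    int total = 0;
    for (int v : t)
        total += v;
    return total;
}
```

Note that sum is defined outside geo and never names geo::begin or geo::end; the lookup succeeds because the argument type belongs to that namespace.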
A Custom range-for Iterable
Let's take an example of a custom range-for iterable type to bring everything together. Consider a minimal null-terminated custom string class, FixedString, that can only store a fixed number of chars. The FixedString class also has an inner type Iterator and the begin() and end() methods, so it can be used in a range-for loop:
#include <cstring> // strncpy, strlen, size_t

template<size_t S>
class FixedString {
public:
    FixedString() = default;
    // Copies at most S chars; str_ stays null-terminated
    FixedString(const char* str) { if(str) ::strncpy(str_, str, S); }
    const char* c_str() const { return str_; }
    size_t count() const { return ::strlen(str_); }
    const char& operator[](size_t i) const { return str_[i]; }
    // default memberwise copies

    // Minimum required for range-for loop
    template<typename T>
    struct Iterator {
        T* p;
        T& operator*() { return *p; }
        bool operator != (const Iterator& rhs) { return p != rhs.p; }
        void operator ++() { ++p; }
    };

    // auto return requires C++14
    auto begin() const { // const version
        return Iterator<const char>{str_};
    }
    auto end() const { // const version
        return Iterator<const char>{str_+count()};
    }

private:
    char str_[S+1]{}; // '\0' everywhere
};
Although a FixedString object can be assigned a new value, its elements cannot be modified. This keeps things straightforward for our purpose here, as we don't have to define the non-const begin and end methods. Note that instead of the member functions begin/end, we could define free begin() and end() function templates in the same namespace as FixedString to make it iterable in the range-for loop:
template<size_t Size>
const char* begin(const FixedString<Size>& fs) {
    return fs.c_str();
}

template<size_t Size>
const char* end(const FixedString<Size>& fs) {
    return fs.c_str() + fs.count();
}
We have chosen the member-function approach because it helps us explain the C++17 changes in the next section. The FixedString can be used in a range-based loop as follows:
FixedString<12> fs("hello world");

// print all chars
for(auto& c : fs)
    std::cout << c << "\n";
The range-based for loop has undergone some changes since C++11/C++14. The first change was made in C++17 to allow a range_expression's end to be of a different type than its begin. The second and most recent change, from the C++20 standard, adds an optional init-statement for initializing variables in the loop scope. We talk about these changes in the next two sections.
The C++17 Version
Until C++14, the beginExpr and endExpr had to be of the same type, which constrained how the end of a range could be expressed. The structure of the range-for shows that the endExpr only needs to be comparable to beginExpr with the != operator. The C++17 standard lifts the restriction that endExpr be of the same type as beginExpr. Thus, since C++17, the endExpr can also be a sentinel value (e.g., a null byte) or even a predicate. The C++17 composition of the range-for loop is:
{
    auto&& range = range_expression;
    auto b = beginExpr;
    auto e = endExpr;
    for ( ; b != e; ++b) {
        range_declaration = *b;
        /* loop body */
    }
}
So how can we take advantage of this evolution for FixedString? We can modify the FixedString to have an end() method that returns a sentinel null char (\0), instead of an end() method that returns an iterator. The FixedString::Iterator can be made comparable to a sentinel char value instead of its own type:
template<size_t S>
class FixedString {
public:
    .......
    template<typename T>
    struct Iterator {
        T* p;
        T& operator*() { return *p; }
        // compare with the sentinel byte
        bool operator != (char rhs) { return *p != rhs; }
        void operator ++() { ++p; }
    };

    auto begin() const { return Iterator<const char>{str_}; }

    /* end method. Returns sentinel byte. */
    auto end() const { return '\0'; }
    ....
};
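As a self-contained variation on the same idea, the sketch below iterates over a plain null-terminated string. It is not the FixedString class: CStrRange, Sentinel, and count_char are hypothetical names, and an empty tag type is used as the sentinel instead of a char value. It assumes a compiler in C++17 mode or later, since begin() and end() return different types.

```cpp
#include <cstddef>

// Hypothetical view over a null-terminated string (assumes str is not null).
struct CStrRange {
    const char* str;

    struct Sentinel {};  // empty tag type marking "end of string"

    struct Iterator {
        const char* p;
        const char& operator*() const { return *p; }
        Iterator& operator++() { ++p; return *this; }
        // The only comparison the range-for needs: iterator != sentinel.
        bool operator!=(Sentinel) const { return *p != '\0'; }
    };

    Iterator begin() const { return Iterator{str}; }
    Sentinel end() const { return Sentinel{}; }  // different type than begin()
};

// Counts occurrences of `wanted` without ever computing the string length.
std::size_t count_char(const char* s, char wanted) {
    std::size_t n = 0;
    for (char c : CStrRange{s})
        if (c == wanted)
            ++n;
    return n;
}
```

The design point is that end() does no work: the loop stops when the iterator reaches the null byte, so the string is traversed only once.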
Please check out "An Iterable's End May Have a Different Type Than Its Begin" for more details on this C++17 change.
The C++20 Version
The range-for loop since C++20 has the following format:
for (init-statement(optional) range_declaration : range_expression) { /* loop body */ }
To understand the motivation behind adding init-statement, let's consider a range_expression that returns a temporary:
std::string foo() {
    return "This is a test string";
}

for(auto& c : foo()) // OK with temporary
    std::cout << c << "\n";
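That loop works fine because the lifetime of the temporary is extended until the end of the loop. However, only the object bound directly to the range reference gets this extension. If the range_expression contains an intermediate temporary, that temporary is destroyed as soon as range is initialized, and the behavior is undefined: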
class A {
public:
    A(const char* s):str(s){}
    std::string& foo() { return str; }
private:
    std::string str;
};

for(auto& c : A("Hello World").foo()) //!! Undefined behavior
    std::cout << c << "\n";
A few more examples of a temporary within a range_expression that can be very tough to spot:
std::vector<std::vector<int>> foo();
for(auto i : foo().front()) //The parent vector will be destroyed!!
    std::cout << i;

std::shared_ptr<std::string> foo();
for(auto c : *foo()) //The shared_ptr will be destroyed!!
    std::cout << c;
All the above cases can be resolved by introducing a variable before the loop to remove the intermediate temporary, e.g.:
A a("Hello World");
for(auto& c : a.foo())
    std::cout << c << "\n";
Or, we can use C++20's init-statement as an alternative to create a variable in the loop scope:
for(A a("Hello World"); auto& c : a.foo())
    std::cout << c << "\n";
Clearly, the C++20 init-statement offers an elegant way to initialize a local scope variable for a range_expression.
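The init-statement is not limited to fixing lifetime issues. As a further sketch, assuming a compiler in C++20 mode, it can also hold a counter that is scoped to the loop. The function name number_lines is hypothetical and used only for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Prefixes each line with its 1-based position.
std::vector<std::string> number_lines(const std::vector<std::string>& lines) {
    std::vector<std::string> out;
    // n is declared in the init-statement, so it is not visible after the loop.
    for (std::size_t n = 1; const auto& line : lines) {
        out.push_back(std::to_string(n) + ": " + line);
        ++n;
    }
    return out;
}
```

Unlike the classic for loop, there is no increment clause, so the counter has to be advanced inside the loop body.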
Further Reading
Range-based for loop: cppreference
Universal References in C++11: Scott Meyers
Argument-dependent lookup: cppreference
How the new range-based for loop in C++17 helps Ranges TS?: stackoverflow
Range-based for statements with initializer: open-std
Why use non-member begin and end functions in C++11?: stackoverflow
Source: https://www.nextptr.com/tutorial/ta1208652092/how-cplusplus-rangebased-for-loop-works