Part 4
Q: Should my constructors use "initialization lists" or "assignment"?
A: Initialization lists. In fact, constructors should initialize as a rule all member objects in the initialization list. One exception is discussed further down. Consider the following constructor that initializes member object x_ using an initialization list:
Fred::Fred() : x_(whatever) { }.
The most common benefit of doing this is improved performance. For example, if the expression whatever is the same type as member variable x_, the result of the whatever expression is constructed directly inside x_ the compiler does not make a separate copy of the object.
Even if the types are not the same, the compiler is usually able to do a better job with initialization lists than with assignments. The other (inefficient) way to build constructors is via assignment, such as: Fred::Fred() { x_ = whatever; }. In this case the expression whatever causes a separate, temporary object to be created, and this temporary object is passed into the x_ object's assignment operator. Then that temporary object is destructed at the point. That's inefficient.
As if that wasn't bad enough, there's another source of inefficiency when using assignment in a constructor: the member object will get fully constructed by its default constructor, and this might, for example, allocate some default amount of memory or open some default file.
All this work could be for naught if the whatever expression and/or assignment operator causes the object to close that file and/or release that memory (e.g., if the default constructor didn't allocate a large enough pool of memory or if it opened the wrong file).
Conclusion: All other things being equal, your code will run faster if you use initialization lists rather than assignment. Note: There is no performance difference if the type of x_ is some built-in/intrinsic type, such as int or char* or float.
But even in these cases, my personal preference is to set those data members in the initialization list rather than via assignment for consistency.
Another symmetry argument in favor of using initialization lists even for built-in/intrinsic types: non-static const and nonstatic reference data members can't be assigned a value in the constructor, so for symmetry it makes sense to initialize everything in the initialization list. Now for the exceptions. Every rule has exceptions (hmmm; does "every rule has exceptions" have exceptions? reminds me of Gdel's Incompleteness Theorems), and there are a couple of exceptions to the "use initialization lists" rule.
Bottom line is to use common sense: if it's cheaper, better, faster, etc. to not use them, then by all means, don't use them. This might happen when your class has two constructors that need to initialize the this object's data members in different orders. Or it might happen when two data members are self-referential. Or when a datamember needs a reference to the this object, and you want to avoid a compiler warning about using the this keyword prior to the { that begins the constructor's body (when your particular compiler happens to issue that particular warning). Or when you need to do an if/throw test on a variable (parameter, global, etc.) prior to using that variable to initialize one of your this members. This list is not exhaustive; please don't write me asking me to add another "Or when...". The point is simply this: use common sense.
Q: Should you use the this pointer in the constructor?
A: Some people feel you should not use the this pointer in a constructor because the object is not fully formed yet.
However you can use this in the constructor (in the {body} and even in the initialization list) if you are careful.
Here is something that always works: the {body} of a constructor (or a function called from the constructor) can reliably access the data members declared in a base class and/or the data members declared in the constructor's own class.
This is because all those data members are guaranteed to have been fully constructed by the time the constructor's {body} starts executing.
Here is something that never works: the {body} of a constructor (or a function called from the constructor) cannot get down to a derived class by calling a virtual member function that is overridden in the derived class.
If your goal was to get to the overridden function in the derived class, you won't get what you want. Note that you won't get to the override in the derived class independent of how you call the virtual member function: explicitly using the this pointer (e.g., this->method()), implicitly using the this pointer (e.g., method()), or even calling some other function that calls the virtual member function on your this object.
The bottom line is this: even if the caller is constructing an object of a derived class, during the constructor of the base class, your object is not yet of that derived class. You have been warned. Here is something that sometimes works: if you pass any of the data members in this object to another data member's initializer, you must make sure that the other data member has already been initialized.
The good news is that you can determine whether the other data member has (or has not) been initialized using some straightforward language rules that are independent of the particular compiler you're using. The bad news it that you have to know those language rules (e.g., base class sub-objects are initialized first (look up the order if you have multiple and/or virtual inheritance!), then data members defined in the class are initialized in the order in which they appear in the class declaration).
If you don't know these rules, then don't pass any data member from the this object (regardless of whether or not you explicitly use the this keyword) to any other data member's initializer! And if you do know the rules, please be careful.
Q: What is the "Named Constructor Idiom"?
A: A technique that provides more intuitive and/or safer construction operations for users of your class. The problem is that constructors always have the same name as the class.
Therefore the only way to differentiate between the various constructors of a class is by the parameter list. But if there are lots of constructors, the differences between them become somewhat subtle and error prone. With the Named Constructor Idiom, you declare all the class's constructors in the private or protected sections, and you provide public static methods that return an object. These static methods are the so-called "Named Constructors."
In general there is one such static method for each different way to construct an object. For example, suppose we are building a Point class that represents a position on the X-Y plane. Turns out there are two common ways to specify a 2-space coordinate: rectangular coordinates (X+Y), polar coordinates (Radius+Angle). (Don't worry if you can't remember these; the point isn't the particulars of coordinate systems; the point is that there are several ways to create a Point object.) Unfortunately the parameters for these two coordinate systems are the same: two floats. This would create an ambiguity error in the overloaded constructors:
class Point {
public:
Point(float x, float y); // Rectangular coordinates
Point(float r, float a); // Polar coordinates (radius and angle)
// ERROR: Overload is Ambiguous: Point::Point(float,float)
};
int main()
{
Point p = Point(5.7, 1.2); // Ambiguous: Which coordinate system?
...
}
One way to solve this ambiguity is to use the Named Constructor Idiom:
#include // To get sin() and cos()
class Point {
public:
static Point rectangular(float x, float y); // Rectangular coord's
static Point polar(float radius, float angle); // Polar coordinates
// These static methods are the so-called "named constructors"
...
private:
Point(float x, float y); // Rectangular coordinates
float x_, y_;
};
inline Point::Point(float x, float y)
: x_(x), y_(y) { }
inline Point Point::rectangular(float x, float y)
{ return Point(x, y); }
inline Point Point::polar(float radius, float angle)
{ return Point(radius*cos(angle), radius*sin(angle)); }
Now the users of Point have a clear and unambiguous syntax for creating Points in either
coordinate system:
int main()
{
Point p1 = Point::rectangular(5.7, 1.2); // Obviously rectangular
Point p2 = Point::polar(5.7, 1.2); // Obviously polar
...
}
Make sure your constructors are in the protected section if you expect Point to have derived classes. The Named Constructor Idiom can also be used to make sure your objects are always created via new. Note that the Named Constructor Idiom, at least as implemented above, is just as fast as directly calling a constructor modern compilers will not make any extra copies of your object.
A: Initialization lists. In fact, constructors should initialize as a rule all member objects in the initialization list. One exception is discussed further down. Consider the following constructor that initializes member object x_ using an initialization list:
Fred::Fred() : x_(whatever) { }.
The most common benefit of doing this is improved performance. For example, if the expression whatever is the same type as member variable x_, the result of the whatever expression is constructed directly inside x_ the compiler does not make a separate copy of the object.
Even if the types are not the same, the compiler is usually able to do a better job with initialization lists than with assignments. The other (inefficient) way to build constructors is via assignment, such as: Fred::Fred() { x_ = whatever; }. In this case the expression whatever causes a separate, temporary object to be created, and this temporary object is passed into the x_ object's assignment operator. Then that temporary object is destructed at the point. That's inefficient.
As if that wasn't bad enough, there's another source of inefficiency when using assignment in a constructor: the member object will get fully constructed by its default constructor, and this might, for example, allocate some default amount of memory or open some default file.
All this work could be for naught if the whatever expression and/or assignment operator causes the object to close that file and/or release that memory (e.g., if the default constructor didn't allocate a large enough pool of memory or if it opened the wrong file).
Conclusion: All other things being equal, your code will run faster if you use initialization lists rather than assignment. Note: There is no performance difference if the type of x_ is some built-in/intrinsic type, such as int or char* or float.
But even in these cases, my personal preference is to set those data members in the initialization list rather than via assignment for consistency.
Another symmetry argument in favor of using initialization lists even for built-in/intrinsic types: non-static const and nonstatic reference data members can't be assigned a value in the constructor, so for symmetry it makes sense to initialize everything in the initialization list. Now for the exceptions. Every rule has exceptions (hmmm; does "every rule has exceptions" have exceptions? reminds me of Gdel's Incompleteness Theorems), and there are a couple of exceptions to the "use initialization lists" rule.
Bottom line is to use common sense: if it's cheaper, better, faster, etc. to not use them, then by all means, don't use them. This might happen when your class has two constructors that need to initialize the this object's data members in different orders. Or it might happen when two data members are self-referential. Or when a datamember needs a reference to the this object, and you want to avoid a compiler warning about using the this keyword prior to the { that begins the constructor's body (when your particular compiler happens to issue that particular warning). Or when you need to do an if/throw test on a variable (parameter, global, etc.) prior to using that variable to initialize one of your this members. This list is not exhaustive; please don't write me asking me to add another "Or when...". The point is simply this: use common sense.
Q: Should you use the this pointer in the constructor?
A: Some people feel you should not use the this pointer in a constructor because the object is not fully formed yet.
However you can use this in the constructor (in the {body} and even in the initialization list) if you are careful.
Here is something that always works: the {body} of a constructor (or a function called from the constructor) can reliably access the data members declared in a base class and/or the data members declared in the constructor's own class.
This is because all those data members are guaranteed to have been fully constructed by the time the constructor's {body} starts executing.
Here is something that never works: the {body} of a constructor (or a function called from the constructor) cannot get down to a derived class by calling a virtual member function that is overridden in the derived class.
If your goal was to get to the overridden function in the derived class, you won't get what you want. Note that you won't get to the override in the derived class independent of how you call the virtual member function: explicitly using the this pointer (e.g., this->method()), implicitly using the this pointer (e.g., method()), or even calling some other function that calls the virtual member function on your this object.
The bottom line is this: even if the caller is constructing an object of a derived class, during the constructor of the base class, your object is not yet of that derived class. You have been warned. Here is something that sometimes works: if you pass any of the data members in this object to another data member's initializer, you must make sure that the other data member has already been initialized.
The good news is that you can determine whether the other data member has (or has not) been initialized using some straightforward language rules that are independent of the particular compiler you're using. The bad news it that you have to know those language rules (e.g., base class sub-objects are initialized first (look up the order if you have multiple and/or virtual inheritance!), then data members defined in the class are initialized in the order in which they appear in the class declaration).
If you don't know these rules, then don't pass any data member from the this object (regardless of whether or not you explicitly use the this keyword) to any other data member's initializer! And if you do know the rules, please be careful.
Q: What is the "Named Constructor Idiom"?
A: A technique that provides more intuitive and/or safer construction operations for users of your class. The problem is that constructors always have the same name as the class.
Therefore the only way to differentiate between the various constructors of a class is by the parameter list. But if there are lots of constructors, the differences between them become somewhat subtle and error prone. With the Named Constructor Idiom, you declare all the class's constructors in the private or protected sections, and you provide public static methods that return an object. These static methods are the so-called "Named Constructors."
In general there is one such static method for each different way to construct an object. For example, suppose we are building a Point class that represents a position on the X-Y plane. Turns out there are two common ways to specify a 2-space coordinate: rectangular coordinates (X+Y), polar coordinates (Radius+Angle). (Don't worry if you can't remember these; the point isn't the particulars of coordinate systems; the point is that there are several ways to create a Point object.) Unfortunately the parameters for these two coordinate systems are the same: two floats. This would create an ambiguity error in the overloaded constructors:
class Point {
public:
Point(float x, float y); // Rectangular coordinates
Point(float r, float a); // Polar coordinates (radius and angle)
// ERROR: Overload is Ambiguous: Point::Point(float,float)
};
int main()
{
Point p = Point(5.7, 1.2); // Ambiguous: Which coordinate system?
...
}
One way to solve this ambiguity is to use the Named Constructor Idiom:
#include // To get sin() and cos()
class Point {
public:
static Point rectangular(float x, float y); // Rectangular coord's
static Point polar(float radius, float angle); // Polar coordinates
// These static methods are the so-called "named constructors"
...
private:
Point(float x, float y); // Rectangular coordinates
float x_, y_;
};
inline Point::Point(float x, float y)
: x_(x), y_(y) { }
inline Point Point::rectangular(float x, float y)
{ return Point(x, y); }
inline Point Point::polar(float radius, float angle)
{ return Point(radius*cos(angle), radius*sin(angle)); }
Now the users of Point have a clear and unambiguous syntax for creating Points in either
coordinate system:
int main()
{
Point p1 = Point::rectangular(5.7, 1.2); // Obviously rectangular
Point p2 = Point::polar(5.7, 1.2); // Obviously polar
...
}
Make sure your constructors are in the protected section if you expect Point to have derived classes. The Named Constructor Idiom can also be used to make sure your objects are always created via new. Note that the Named Constructor Idiom, at least as implemented above, is just as fast as directly calling a constructor modern compilers will not make any extra copies of your object.
0 comments:
Post a Comment