C++Builder and the VCL

In this chapter you will get a look at the interface between BCB and the Object Pascal code found in the VCL. This is a subject you need to understand if you want to take full advantage of the power of Borland C++Builder.

There are very few occasions when BCB programmers have to think explicitly about the fact that the VCL is written in Object Pascal. Most of the time you can forget its Delphi-based heritage without fear of missing out on anything important. There are, however, a few times when Object Pascal affects your programming. Almost all my descriptions of those occasions are concentrated in this chapter. Throughout most of the rest of this book the subject will never even arise.

The material in this chapter is divided into three main sections:

1. Understanding the VCL

2. Changes to the C++ Language

3. Classes created to imitate Object Pascal simple types that do not exist in C++

Understanding the VCL

All VCL objects are referenced as pointers. There is no such thing as a static or local instance of a VCL object. You are always working with a pointer to a VCL object. This is the result of the VCL's origin in the world of Object Pascal.

Delphi sought to simplify its syntax to the greatest degree possible. However, there were several hurdles that had to be crossed before the language could be made accessible to a wide variety of programmers. In particular, something had to be done about the complexity of dealing with pointers.

For various reasons related to performance and wise use of memory, it is usually best to declare an object on the heap, rather than on the stack, particularly if you are still inhabiting the segmented world of 16-bit programming. As a result, the designers of Object Pascal wanted to make it as simple as possible for their users to work with objects that live on the heap.

Rather than inflict pointer syntax on unwary Object Pascal programmers, the creators of the VCL decided that all objects would necessarily be created on the heap, but would support the syntax associated with local objects. In short, they created a world that eliminated pointer syntax for objects altogether. They could afford to do this because they made it impossible to create an object that was not a pointer. All other language types, such as strings, integers, arrays, structures, and floating-point numbers, could be treated either as pointers or as static objects. This rule applied only to objects.

It might help to illustrate this point with examples. Here is hypothetical code for how an Object Pascal programmer might have treated objects according to the standard rules of Pascal:

var

S: ^TObject;

begin

S := New(TObject, Create);

S^.DoSomething;

S^.Free;

end;

The preceding code will not compile in Delphi, but it is an example of how Delphi code might have looked had the developers not done something to simplify matters. In particular, Delphi eliminated some syntactical clutter by enabling you to write the following:

var

S: TObject;

begin

S := TObject.Create;

S.DoSomething;

S.Free;

end;

Clearly this is an improvement over the first example. However, both samples produce essentially the same underlying machine code. In other words, both examples allocate memory for the object, call a constructor called Create, call a method called DoSomething, call a destructor called Destroy (implicitly, by way of Free), and then free the memory associated with the object. In the second example, all of this can be done without any need to dereference a pointer with the "^" symbol and without making explicit reference to the act of allocating memory. The point is that the compiler knows that the variable S has to be a pointer to an object, because Object Pascal forbids the creation of objects that are not pointers.

Clearly, in the Object Pascal world, it made sense to decide that all objects had to be pointers. There is no significant overhead involved with using pointers, and indeed they are often the fastest way to manipulate memory. So why not make this one hard and fast rule in order to make everyone's life simpler?

Translate this same concept into C++, and suddenly the rule doesn't make quite as much sense. You can only go so far in changing the C++ language to accommodate a new paradigm, so BCB cannot offer the simplified syntax shown in the second of the two preceding code samples. In other words, C++ reaps no benefit from this rule, yet the rule still limits your choices: you are no longer free to create VCL objects locally, but must create them on the heap, whether you want to or not.

I will therefore take it as a given that the need to create VCL objects on the heap is not particularly beneficial to C++ programmers. On the other hand, there is no particular hardship inherent in doing so, nor does it force you to endure any hit on the performance of your application. It is therefore merely a fact of life, neither inherently good nor inherently bad. If you find this unduly irksome, you might consider the many other benefits that BCB brings you, and consider this one limitation as a small price to pay for getting the benefit of this product's strengths.

One final, and not at all unimportant, point: Only VCL objects have to be created on the heap. All standard C++ objects can be handled as you see fit. In other words, you can have either static or dynamic instances of all standard C++ objects. It's only VCL objects that must be addressed with pointers and must be allocated on the heap!
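
To make the distinction concrete, here is a minimal sketch written in standard C++ rather than with real VCL classes. The names Point, HeapOnly, and Demo are hypothetical. Point behaves like any ordinary C++ class, while HeapOnly imitates the VCL rule by hiding its constructor and destructor, which is just one way to make a local instance impossible; it is not how BCB itself enforces the rule.

```cpp
#include <cassert>
#include <string>

// An ordinary C++ class: local and heap instances are both allowed.
struct Point {
    int x;
    int y;
};

// Hypothetical class that, like a VCL object, can only live on the heap.
class HeapOnly {
public:
    static HeapOnly* Create(const std::string& name) { return new HeapOnly(name); }
    void Free() { delete this; }              // plays the role of the VCL's Free
    const std::string& Name() const { return name_; }
private:
    explicit HeapOnly(const std::string& name) : name_(name) {}
    ~HeapOnly() {}                            // private: "HeapOnly h;" will not compile
    std::string name_;
};

// Uses one object of each kind and returns the sum of what it read.
int Demo() {
    Point local = {3, 4};                     // a local instance on the stack
    Point* dynamic = new Point;               // the same class on the heap
    dynamic->x = 5;
    dynamic->y = 6;

    HeapOnly* obj = HeapOnly::Create("Sample");   // the only way to get one
    assert(obj->Name() == "Sample");

    int total = local.x + local.y + dynamic->x + dynamic->y
              + static_cast<int>(obj->Name().size());
    delete dynamic;
    obj->Free();
    return total;
}
```

Notice that the heap-only object is always reached through a pointer and the -> operator, which is exactly how you work with every VCL object in BCB.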

Changes to the C++ Language

It's now time to wrap up the introductory portion of this chapter, which provided an overview of the VCL and BCB programming. The next subject on the agenda involves how the VCL has impacted the C++ language and how the C++ language has affected the VCL. It's perhaps simplest to start with a rundown of the new features in the C++ language.

I am, of course, aware of how controversial it is to add extensions to C++. However, I personally am only interested in good technology and the quality of the products I use. BCB is a good product, and part of what makes it good is the power of the VCL and the power of the component, property, event model of programming. The new extensions have been added to the language to make this kind of programming possible, and so I am in favor of these changes. The C++ committee has done excellent work over the years, but no committee can keep up with the furious pace of change in this industry.

The following extensions have been added to the language in order to support the VCL and the component, property, and delegation model of programming:

_ _declspec(delphiclass | delphireturn)

_ _automated

_ _published

_ _closure

_ _property

_ _classid(class)

_ _fastcall

As you can see, all of these keywords have two underscores prefixed to them. In this instance, I have separated the underscores with a single space to emphasize the fact that not one, but two underscores are needed. In your programs, the underscores should be contiguous, with no space between them.

I am going to go through all these new keywords and make sure that you understand what each means. The purpose of each of these sections is not to explain the how, when, where, and why of a particular keyword, but only to give you a sense of what you can and can't do with them. I will also say something about the relative importance of each new piece of syntax. My goal is to create a handy reference for the new keywords, but I will wait until later to explain most of them in real depth. For instance, a lengthy discussion of the __automated keyword appears in Chapter 27, "Distributed COM," which covers DCOM and Automation, and I discuss properties at length in Chapters 19 through 24, which cover creating components.

Automated

The automated keyword is for use in OLE automation classes. Only classes that descend from TAutoObject or a descendant of TAutoObject would have a use for this keyword.

Here is an example of a class that sports an automated section:

class TMyAutoObject : public TAutoObject

{

private:

public:

virtual __fastcall TMyAutoObject();

__automated:

AnsiString __fastcall GetName(void) { return "MyAutoObject"; }

};

If this class were compiled into a working program, the GetName function would be available to other programs through OLE automation. They can call the function and it will return the word "MyAutoObject".

The GetName function can be accessed through OLE automation because it appears in the automated section of an object. Notice that it is also declared __fastcall. This is necessary with automated functions. It means the compiler will attempt to pass the parameters in registers rather than on the stack.

The __automated directive makes OLE automation programming much easier, but its effect is limited to this one area. It has no far-reaching effect on the whole of the C++ language.

The Published Keyword

In addition to public, protected, and private, C++ now supports a new keyword, __published, used to delineate a section in a class declaration where properties that are to appear in the Object Inspector can be declared. If you are creating a component with new properties that you want to appear in the Object Inspector, you should place those properties in the published section. Properties that appear in the published section have very complete RTTI generated for them by the compiler.

It is only legal to use the __published keyword in VCL classes. Furthermore, though the compiler may or may not enforce the rule, it is generally not sensible to use the __published keyword in objects that do not descend from TComponent or from a descendant of TComponent. The reasoning here is simply that TComponent is a necessary ancestor of all components, and the __published keyword exists because it helps to enhance components. The RTTI associated with the __published section will have some effect on program performance, so you don't want to use it unless it will be helpful. Therefore, use __published only in classes that you want to make into components.

Here is an example of an object that features a published section:

class TMyObject : public TComponent

{

private:

int FSomeInteger;

protected:

int AnotherInteger;

public:

int PublicInteger;

__published:

__property int SomeInteger={read=FSomeInteger, write=FSomeInteger};

};

In this declaration, the property called SomeInteger is published, will be illuminated with RTTI, and will appear in the Object Inspector if you install TMyObject as a component.

The __published directive changes the way I think about classes, and it changes the way I write code. It currently has significance only for VCL classes and for your components. At this time it has no far-reaching implications in regard to the structure of the C++ language. It is just something that aids in the construction of VCL components.

Properties

Properties represent one of the most important, and most beneficial additions to C++ that you will find in BCB. There will be several places in this book where I discuss properties in depth. In this section I simply give you a general outline of the key points regarding this subject.

Properties have four primary purposes:

1. They support a technology that allows you to expose part of an object to visual programmers via the Object Inspector.

2. They provide an excellent means of creating a rigidly defined, and thoroughly correct, interface for an object.

3. They make it easy for you to protect the private data of an object, and the private portions of your implementation.

4. They support a very sophisticated form of RTTI, which allows them both to be seen in the Object Inspector and to be automatically streamed out to and read from disk without intervention from the programmer. The automatic streaming of objects is one of the great features of the VCL.

Properties can be added to either the public or published section of a class declaration. It makes no sense to put them in either the protected or private sections, because their primary purpose is to provide a public interface to your object.

Here is a declaration for a property:

class TMyObject : public TComponent

{

private:

int FSomeInteger;

__published:

__property int SomeInteger={read=FSomeInteger, write=FSomeInteger};

};

In this class, SomeInteger is a property of type int. It serves as the public interface for the private FSomeInteger variable.

To declare a property:

1. Use the __property keyword.

2. Declare the type of the property.

3. Add the name of the property.

4. Add an equal sign and an open and close curly brace followed by a semicolon.

Between the curly braces, properties can have read and write sections. These sections designate how to get or set the value of a property. To declare the read section, write the word read followed by an equal sign and the name of the field or method that supplies the value. Do the same thing with the write section, naming the field or method that receives the value.

It is very common in VCL classes to declare private data with the letter F prefixed to it. The letter F stands for field. It serves as a reminder that this is a private variable and should not be accessed from another class.

You can declare a property to be read only or write only:

__property int SomeInteger={read=FSomeInteger};

__property int SomeInteger={write=FSomeInteger};

The first of these declarations is read only, the second is write only.

Because FSomeInteger is private, you might need to provide some means of accessing it from another object. You can sometimes use the friend notation, but that really violates the whole concept of the private directive. If you want to give someone access to your data, but continue to hide the specific implementation of that data, use properties. The preceding code provides one means of giving someone access to your data while still protecting it. If your property does nothing but refer directly to a variable of the same type in the private section, the compiler generates code that gives you direct access to the variable. In other words, it doesn't take any longer to use a property like SomeInteger than it would to use FSomeInteger directly. The compiler takes care of the details for you, and uses less code than it would with an inline access function.

NOTE: C++ programmers have long used get and set methods to provide access to private data. There is a certain amount of controversy as to whether or not properties add something to the object model outside of their capability to be seen in the Object Inspector.

Here is a second way to state the matter. Given the presence of the Object Inspector, properties clearly play an important role in BCB programming. A second debate, however, would involve the question of whether or not--absent the issue of the Object Inspector--properties add something new to the mix that would not be there if we had only access functions and get and set methods.

One thing I like a great deal about properties is the clarity of the syntax they present to the user. They make it easier to write clear, maintainable code. They also do a very good job of helping you protect the low-level implementation of your object. I recognize, however, that it could be argued that access functions provided sufficient power to accomplish these goals without the aid of properties.

I could safely hide behind the fact that the BCB programming model demands the presence of properties. However, I will stick my neck out on this one and say that I feel properties are an important contribution to C++, and should become part of the language. I recognize, however, that this is the kind of subject that fosters intense controversy, and concede that the debate is not entirely one-sided.

Here is another use of properties that differs from the one just shown:

class MyObject : public TObject

{

private:

int FSomeInteger;

int __fastcall GetSomeInteger() { return FSomeInteger; }

void __fastcall SetSomeInteger(int i) { FSomeInteger = i; }

__published:

__property int SomeInteger={read=GetSomeInteger, write=SetSomeInteger};

};

In this class, SomeInteger has get and set methods. These get and set methods are associated with a property and should always be declared with the __fastcall calling convention.

Get and set methods provide a means of performing calculations or other actions when getting and setting the value of a property. For instance, when you change the Width property in a TShape object, not only does some internal variable get set, but the whole object redraws itself. This is possible because there is a set method for this property that both sets the internal FWidth field to a new value and redraws the object to reflect these changes.

You could perform calculations from inside a get method. For instance, you could calculate the current time in a property designed to return the current time.

The existence of get and set methods represents another important reason for keeping the data of your object private, and for not giving anyone else access to it except through properties. In particular, you may change a value in a set or get method or have side effects that you want to be sure are executed. If someone accesses the data directly, the side effects or other changes will not occur. Therefore, you should keep the data private and let them access it through a function, and let the function itself be accessed through a property.
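
The value of routing every change through a set method can be sketched in standard C++, which has no __property keyword. The class ShapeSketch and its RedrawCount member are hypothetical stand-ins invented for this illustration; they are not part of the VCL, and a real control would repaint itself rather than count.

```cpp
#include <cassert>

// Hypothetical stand-in for a control whose width must trigger a redraw.
class ShapeSketch {
public:
    ShapeSketch() : FWidth(0), FRedrawCount(0) {}

    // Get method: simply reports the stored value.
    int GetWidth() const { return FWidth; }

    // Set method: stores the value and performs the side effect.
    void SetWidth(int value) {
        assert(value >= 0);               // a width cannot be negative
        if (value == FWidth) return;      // nothing changed, nothing to redraw
        FWidth = value;
        Redraw();
    }

    // How many times the side effect has run so far.
    int RedrawCount() const { return FRedrawCount; }

private:
    void Redraw() { ++FRedrawCount; }     // a real control would repaint here
    int FWidth;                           // private, so SetWidth cannot be bypassed
    int FRedrawCount;
};
```

Because FWidth is private, the only way to change it is SetWidth, so the redraw can never be skipped. In BCB a property declared with write=SetWidth gives callers the same guarantee while letting them write a plain assignment.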

Properties can be of a wide variety of types. For instance, they can be declared as AnsiStrings, arrays, sets, objects, or enumerated types. Events, which are a kind of method pointer, are really a type of property, but they behave according to their own peculiar rules and are usually treated as a separate subject from properties.

Properties can be declared to have a default value:

__property int SomeInteger={read=GetSomeInteger, write=SetSomeInteger, default=1};

This syntax is related only tangentially to the concept of setting a property automatically to a particular value. If you want to give a property a predefined value, you should do so in the object's constructor.

Default is used to tell the VCL whether or not a value should be written out to a stream. The issue here is that streaming can result in producing large files, because so many objects have such large numbers of properties. To cut down on the size of your form files, and on the time spent writing the files to disk, the VCL enables you to declare a default value for a property. When it comes time to stream properties to disk, they will not be streamed if they are currently set to the default value. The assumption is that you will initialize them to that value in the object's constructor, so there is no need to save the value to disk. Many properties are declared to be nodefault, which means they should be streamed to disk.

A property can also have the stored directive. Confusingly enough, this is another means of deciding whether or not a property should be streamed. In this case, it gives you the option of changing whether or not a property can be streamed. For instance:

__property TColor Color={read=FColor, write=SetColor,

stored=IsColorStored, default=clWindow};

This property calls a method named IsColorStored to determine whether the color should be stored at this time. For instance, if the property is set to have the same value as its parent, there is no need to store it, and IsColorStored returns False. This property will therefore only be stored if IsColorStored returns True and the color is not set to clWindow.
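
The rule in the last sentence can be written out as a small decision function. This is only a sketch in standard C++ of the logic as it is described here, not the VCL's streaming code; ShouldStream, WriteProperty, and the text format they produce are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true when a property value needs to be written to the stream:
// the stored check must pass and the value must differ from the default.
bool ShouldStream(bool isStored, int value, int defaultValue) {
    return isStored && value != defaultValue;
}

// Appends "Name = value" to out only when the rule above says so.
void WriteProperty(std::ostringstream& out, const std::string& name,
                   bool isStored, int value, int defaultValue) {
    assert(!name.empty());                // every property needs a name
    if (ShouldStream(isStored, value, defaultValue))
        out << name << " = " << value << "\n";
}
```

A property that still holds its default value, or whose stored check fails, adds nothing to the output, which is where the savings in file size come from.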

Say what?

Don't worry, most of the time you don't have to get involved with these tricky storage specifiers and their disarming schemes to save disk space. However, there are times when the matter comes to the front, and now you have a place to look to remember what they mean.

Properties play a big role in VCL programming, and furthermore, they can be used in objects that have nothing to do with the VCL. This subject has tremendous potential scope and will affect the way the C++ language as a whole is handled by BCB programmers.

That is all I'm going to say about properties for now. There is an example of creating and using an array property later in this chapter in the section called "Arrays of AnsiStrings."

Events: Understanding Delegation

Most of the time I will refer to closures as events. Technically, a closure is the type of method pointer used in event properties, but I will generally refer to the whole syntactical structure as an event that supports the delegation model.

As you learned in the last section, an event is really a kind of property. The primary difference between a standard property and an event is that the type of an event is a method pointer.

NOTE: Some people refer to events as closures, but it could be argued that the word has a rather stuffy, academic overtone to it. Certainly it is used in some circles in a very strict manner that does not necessarily conform in all its particulars to the way BCB handles events.

Consider the OnMouseUp event that is supported by many components and declared in CONTROLS.HPP:

__property TMouseEvent OnMouseUp = {read=FOnMouseUp, write=FOnMouseUp};

Like the OnMouseUp property, FOnMouseUp is, naturally, also declared to be of type TMouseEvent:

TMouseEvent FOnMouseUp;

So far the syntax used for events seems pretty much identical to that used for all properties. The new code that is specific to the delegation model is the actual declaration for the method pointer used by an event:

typedef void __fastcall (__closure *TMouseEvent)(System::TObject* Sender,

TMouseButton Button, Classes::TShiftState Shift, int X, int Y);

All OnMouseUp events are of this type. In other words, if an OnMouseUp event is not set to null, it is set to a method pointer of this type. You can then call the event by writing code of this type:

if (FOnMouseUp)

FOnMouseUp(this, Button, Shift, X, Y);

If the event is not set to null, call the method associated with the event and pass parameters of a type that conform with the signature of the relevant method pointer. This is called delegating the event.

NOTE: It is generally considered very bad form to return a value from an event. Trying to do so could cause a compiler error, and you should avoid including code of this type in your own programs. The VCL has gone through several iterations now, and returning values from events has worked in some versions and not in others. The team that wrote the VCL, however, asked the DOC team to state explicitly that events should not return values. It usually turns out badly when you get stuck with legacy code that worked in one version, but directly contradicts the desires of the folks who keep the reins in their own hands.

The method associated with the OnMouseUp event will look something like this:

void __fastcall TForm1::Button1MouseUp(

TObject *Sender,

TMouseButton Button,

TShiftState Shift,

int X,

int Y)

{

}

I generally call a method of this type an event handler. It handles an event when it is delegated by a component or object.

Most of the time an event is assigned to a method automatically by the compiler. However, you can do so explicitly if you desire, and in fact I do this in several places in my own code. If you want an example, see the Music program from Chapter 16, "Advanced InterBase Concepts." Here is how the assignment would be made from inside the Controls unit:

FOnMouseUp = Button1MouseUp;

Code of this type appears everywhere in the VCL, and indeed it is one of the central constructs that drives BCB. This is the heart of the delegation programming model. As I have said several times, BCB is primarily about components, properties, and the delegation model. Take events away, and you have some other product altogether. Events radically change the way C++ programs are written. The key effect of events on C++ seems to me to be the following: "In addition to using inheritance and virtual methods to change the behavior of an object, BCB allows you to customize an object by delegating events. In particular, VCL objects often delegate events to the form on which they reside."
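
For readers who want to see the shape of delegation without the VCL, here is a minimal sketch in standard C++11. The __closure keyword is a Borland extension, so std::function is used here as a rough stand-in for a method pointer. ButtonSketch, FormSketch, MouseEventSketch, and CountHandledClicks are hypothetical names, and the event signature is simplified.

```cpp
#include <cassert>
#include <functional>

// Hypothetical event type: a callable taking the sender's id and a position.
typedef std::function<void(int senderId, int x, int y)> MouseEventSketch;

// Plays the role of a component that delegates its mouse-up event.
class ButtonSketch {
public:
    explicit ButtonSketch(int id) : FId(id) {}
    MouseEventSketch OnMouseUp;           // empty until a handler is assigned

    // Called when the (simulated) mouse button is released.
    void MouseUp(int x, int y) {
        if (OnMouseUp)                    // same test as "if (FOnMouseUp)"
            OnMouseUp(FId, x, y);         // delegate to whoever is listening
    }
private:
    int FId;
};

// Plays the role of the form that receives the delegated event.
class FormSketch {
public:
    FormSketch() : LastX(-1), LastY(-1), Calls(0) {}

    // The event handler: records where the click happened.
    void Button1MouseUp(int /*senderId*/, int x, int y) {
        LastX = x;
        LastY = y;
        ++Calls;
    }

    // Wires the button's event to this form's handler.
    void Attach(ButtonSketch& button) {
        button.OnMouseUp = [this](int id, int x, int y) {
            Button1MouseUp(id, x, y);
        };
    }

    int LastX, LastY, Calls;
};

// Simulates two clicks, the first before any handler is attached.
int CountHandledClicks() {
    ButtonSketch button(1);
    FormSketch form;
    button.MouseUp(5, 5);                 // no handler yet: silently ignored
    form.Attach(button);
    button.MouseUp(10, 20);               // delegated to form.Button1MouseUp
    assert(form.LastX == 10 && form.LastY == 20);
    return form.Calls;
}
```

The button never needs to know what kind of object is listening, and the form customizes the button's behavior without deriving a new button class. That is the essence of the delegation model.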

Like all properties, events work with both standard C++ classes and with VCL classes.

declspec(delphiclass | delphireturn | pascalimplementation)

All the additions to C++ discussed so far are primarily about finding ways to support the VCL. In particular, the new VCL object model has certain benefits associated with it, and so Borland has extended C++ in order to accommodate this new model. The property and event syntax, however, appears to have at least some other implications beyond its simple utilitarian capability to support the VCL. I think you will find the next three changes are very small potatoes compared to properties and events. All they really do is make it possible for BCB to use the VCL, and they don't really have much effect at all on the way we write code.

As you have no doubt gleaned by this time, VCL classes behave in specific ways. For instance, a VCL class can only be instantiated on the heap, and you can only use the __published or __automated keywords with VCL classes.

The majority of the time your own classes can inherit this behavior directly, simply by descending from a VCL class. For instance, here is part of the declaration for TControl:

class __declspec(pascalimplementation) TControl : public Classes::TComponent

{

typedef Classes::TComponent inherited;

private:

TWinControl* FParent;

... // etc

};

As you can see, these classes are declared with the delphiclass or pascalimplementation attribute. This means they are implemented in Pascal, and the headers are the only part that appears in C++ code. If you have an Object Pascal unit that you want to use in a BCB application, the declarations in the unit will automatically be translated and placed in a C++ header file. The class declarations in that header file will be similar to the one shown here.

However, if you create a descendant of TControl, you don't have to bother with any of this, because the attributes will be inherited:

class MyControl : public TControl

{

// My stuff here

};

Class MyControl is automatically a VCL class that can support __published properties and must be created on the heap. It inherits this capability from TControl. As a result, it is implicitly declared pascalimplementation, even though you don't explicitly use the syntax. I don't like to clutter up class declarations with __declspec(pascalimplementation), but it may appeal to some programmers because it makes it clear that the class is a VCL class and has to be treated in a particular manner.

If you want to create a forward declaration for a class with the pascalimplementation attribute, it must have the delphiclass attribute:

class __declspec(delphiclass) TControl;

This is a forward declaration for the TControl class found in CONTROLS.HPP. You do not inherit delphiclass in the same sense that you do pascalimplementation, and so you must use it explicitly in forward declarations!

delphireturn is used to tell the compiler to generate a particular kind of code compatible with the VCL when it returns objects or structures from a function. The only places in BCB where this is done are SYSDEFS.H and DSTRINGS.H, which is where classes like Currency, AnsiString, and Variant are declared. If you need to return objects or structures to the VCL, you may need to declare your class with this directive. However, there are very few occasions when this is necessary, and most programmers can forget about delphireturn altogether.

NOTE: As shown later in this chapter, the TDateTime object is declared delphireturn. This is necessary because instances of the object can be passed to VCL functions such as DateTimeToString.

All of this business about delphireturn and pascalimplementation has little or no effect on the code written by most BCB programmers. delphiclass is slightly more important, because you will likely have occasion to write a forward declaration for one of your own descendants of a VCL class. On the whole, though, this is a case where you can indeed afford to "pay no attention to the man behind the curtain." Most of this stuff is just hand signals passed back and forth between the compiler and the VCL, and you can afford to ignore it. Its impact on C++ programming as a whole is essentially nil.

classid(class)

Use classid if you need to pass the specific type of a class to a function or method used by the VCL RTTI routines. For instance, if you want to call the ClassInfo method of TObject, you need to pass in the type of the class you want to learn about. In Object Pascal you write

ClassInfo(TForm);

which means: Tell me about the TForm class. Unfortunately, the C++ compiler won't accept this syntax. The correct way to pose the question in BCB is

ClassInfo(__classid(TForm));

Again, this is just the compiler and the VCL having a quiet little chat. Pay no attention to the man behind the curtain. RTTI looks one way in the VCL, another in standard C++. This syntax is used to bridge the gap between the two, and it has no far reaching implications for C++, the VCL, or anyone else. The excitement is all centered around the property and event syntax; this is a mere trifle.

__fastcall

The default calling convention in Object Pascal is called fastcall, and it is duplicated in C++Builder with the __fastcall keyword. __fastcall methods or functions usually have their parameters passed in registers, rather than on the stack. For instance, the value of one parameter might be inserted in a register such as EAX and then snagged from that register on the other side.

The __fastcall calling convention passes the first three parameters in the EAX, EDX, and ECX registers, respectively. Additional parameters and parameter data larger than 32 bits (such as doubles passed by value) are pushed on the stack.

Delphi-Specific Classes

Not all the features of C++ are available in Object Pascal, and not all the features of Object Pascal are available in BCB. As a result, the following classes are created to represent features of the Object Pascal language that are not present in C++:

AnsiString

Variant

ShortString

Currency

TDateTime

Set

Most of these classes are implemented or declared in SYSDEFS.H, though the crucial AnsiString class is declared in its own file called DSTRINGS.H.

NOTE: Neither the real nor comp types used in Object Pascal are supported adequately in BCB. Because Pascal programmers often used the comp type to handle currency, they should pay special attention to the section in this chapter on the BCB currency type. The real type is an artifact of the old segmented architecture days when anything and everything was being done to save space and clock cycles. It no longer plays a role in contemporary Object Pascal programming.

I will dedicate at least one section of this chapter to each of the types listed previously. Some important types, such as Sets and AnsiStrings, will receive rather lengthy treatment stretching over several sections.

Many C++ programmers will have questions about the presence of some of these classes. For instance, C++ comes equipped with a great set template and an excellent string class. Why does BCB introduce replacements for these classes? The answer to the question is simply that BCB needed a class that mimicked the specific behavior of Object Pascal sets and Object Pascal strings.

Here are some macros you can use with these classes:

OPENARRAY

ARRAYOFCONST

EXISTINGARRAY

SLICE

These macros are also discussed in the upcoming sections.

Introducing the AnsiString Class

The C language is famous for its null-terminated strings. These strings can be declared in many ways, but they usually look like one of these three examples:

char MyString[100]; // An array with room for 99 characters plus the terminating null.

char *MyString; // A pointer to a string with no memory allocated for it.

LPSTR MyString; // A "portable" Windows declaration for a pointer to a string

Declarations like these are such a deep-rooted and long standing part of the C language that it is somewhat shocking to note that they are not as common as they once were. There is, of course, nothing wrong with these strings, but C++ has come up with a new, and perhaps better, way to deal with strings. (I speak of these strings as being exclusive to C, though of course these same type of strings are also used in Object Pascal, where they are called PChars.)

The "old-fashioned" types of strings shown previously cause people problems because it is easy to make memory allocation errors with them or to accidentally omit the crucial terminating zero ('\0') that marks the end of these strings.
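
To make the terminating-zero problem concrete, here is a small sketch in standard C++. It is not taken from the book's CD-ROM, and the helper name copyTerminated is hypothetical. The standard strncpy function writes no terminator when the source fills the buffer, so the helper adds one by hand:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy src into dest (capacity destSize bytes) and
// always leave dest null-terminated, truncating the text if necessary.
void copyTerminated(char* dest, std::size_t destSize, const char* src)
{
    if (destSize == 0)
        return;                              // no room even for the terminator
    std::strncpy(dest, src, destSize - 1);   // may leave dest unterminated
    dest[destSize - 1] = '\0';               // so terminate it explicitly
}
```

Forgetting the last line of the helper is exactly the kind of slip that string classes are meant to rule out.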

These same strings are also famous for their accompanying string library, which includes a series of cryptic looking functions with names like strcpy, strlen, strcmp, strpbrk, strrchr, and so on. These functions, most of which occur in both C++ and Object Pascal, can be awkward to use at times, and can frequently lead to errors if a programmer gets careless.

In this book, I will try to avoid using any of the "old fashioned" C style strings whenever possible. Instead, I will do things the C++ way and use string classes. Most C++ programmers prefer string classes on the grounds that they are easier to use, easier to read, and much less likely to lead to an error involving memory allocation.

In this book there is a fourth reason for using string classes. In particular, a string class called AnsiString provides compatibility with the underlying strings found in the VCL. C++ programmers will find that AnsiStrings are similar to the standard ANSI C++ string class.

AnsiStrings are very easy to use. In fact, many programmers will find that they have an intuitive logic to them that almost eliminates the need for any kind of in-depth explanation. However, AnsiStrings happen to form one of the key building blocks on which a great deal of C++Builder code is based. It is therefore of paramount importance that users of BCB understand AnsiStrings.

The next few sections of the book take on the task of explaining AnsiStrings. To help illustrate this explanation with ready-made examples, you can turn to the UsingAnsiString program found on the book's CD-ROM. This program provides examples of essential AnsiString syntax and shows several tricks you can use when you add strings to your programs.

Working with the AnsiString Class

The AnsiString class is declared in the DSTRING.H unit from the ..\INCLUDE\VCL directory. This is a key piece of code, and one which all BCB programmers should take at least a few minutes to study.

The declaration for the class in the DSTRING.H unit is broken up into several discrete sections. This technique makes the code easy to read. For instance, one of the sections shows the operators used for assignments:

// Assignments

AnsiString& __fastcall operator =(const AnsiString& rhs);

AnsiString& __fastcall operator +=(const AnsiString& rhs);

Another section shows some comparison operators:

//Comparisons

bool __fastcall operator ==(const AnsiString& rhs) const;

bool __fastcall operator !=(const AnsiString& rhs) const;

bool __fastcall operator <(const AnsiString& rhs) const;

bool __fastcall operator >(const AnsiString& rhs) const;

bool __fastcall operator <=(const AnsiString& rhs) const;

bool __fastcall operator >=(const AnsiString& rhs) const;

Another handles the Unicode-related chores:

//Convert to Unicode

int __fastcall WideCharBufSize() const;

wchar_t* __fastcall WideChar(wchar_t* dest, int destSize) const;

There is, of course, much more to the declaration than what I show here. However, these code fragments should give you some sense of what you will find in DSTRING.H, and of how to start browsing through the code to find AnsiString methods or operators that you want to use.

The rest of the text in this section of the chapter examines most of the key parts of the AnsiString class and shows how to use them. However, you should definitely find time, either now or later, to open up DSTRING.H and to browse through its contents.

AnsiString Class Constructors

AnsiStrings are a class; they are not a simple type like the string type in Object Pascal, which they mimic. Object Pascal or BASIC programmers might find that the next line of code looks a little like a simple type declaration. It is not. Instead, this code calls the constructor for an object:

AnsiString S;

Here are the available constructors for the AnsiString class:

__fastcall AnsiString(): Data(0) {}

__fastcall AnsiString(const char* src);

__fastcall AnsiString(const AnsiString& src);

__fastcall AnsiString(const char* src, unsigned char len);

__fastcall AnsiString(const wchar_t* src);

__fastcall AnsiString(char src);

__fastcall AnsiString(int src);

__fastcall AnsiString(double src);

The simple AnsiString declaration shown at the beginning of this section would call the first constructor shown previously, which initializes to zero a private variable of the AnsiString class. This private variable, named Data, is of type char *. Data is the core C string around which the AnsiString class is built. In other words, the AnsiString class is a wrapper around a simple C string, and the class exists to make it easy to manipulate this string and to make the string compatible with the needs of the VCL.

The following simple declaration would call the second constructor shown in the previous list:

AnsiString S("Sam");

This is the typical method you would use when initializing a variable of type AnsiString.

NOTE: I have included a second example program called UsingAnsiString2, which features a small class that declares a matching constructor for each of the AnsiString constructors and forwards to it:

class MyAnsiString : public AnsiString

{

public:

__fastcall MyAnsiString(void): AnsiString() {}

__fastcall MyAnsiString(const char* src): AnsiString(src) {}

__fastcall MyAnsiString(const AnsiString& src): AnsiString(src) {}

__fastcall MyAnsiString(const char* src, unsigned char len): AnsiString (src, len) {}

__fastcall MyAnsiString(const wchar_t* src): AnsiString(src) {}

__fastcall MyAnsiString(char src): AnsiString(src) {}

__fastcall MyAnsiString(int src): AnsiString(src) {}

__fastcall MyAnsiString(double src): AnsiString(src) {}

};

This class is provided so you can step through the constructors to see which ones are being called. For instance, the three constructors shown immediately after this note have been rewritten in the UsingAnsiString2 program to use MyAnsiStrings rather than AnsiStrings. This gives you an easy-to-use system for explicitly testing which constructors are being called in which circumstance.

When I come up with a unit like this that may be of some general utility in multiple programs, I usually put it in the utils subdirectory located on the same level as the chapter subdirectories. In other words, I move or copy it out of the directory where the files for the current program are stored and place it in a subdirectory called utils that is on the same level as the directories called Chap01, Chap02, and so on.

You might need to add this directory to the include search path for your project, or the program might not be able to find the MyAnsiString.h unit. To set up the compiler for your system, go to Options | Project | Directories/Conditionals and change the Include Path to point to the directory where MyAnsiString.h is stored.

Sometimes I will leave one frozen copy of a unit in the directory where it was first introduced and continue development of the copy of the unit that I place in the utils subdirectory. That way, you can find one copy of the file that looks the way you expect it to look in the same directory as the program in which I introduce it, while continuing to develop the code in a separate unit of the same name found in the utils directory. Check the Readme.txt file on the CD that accompanies this book for further information.

The following simple methods from the UsingAnsiString program show how to create and use AnsiStrings:

void __fastcall TForm1::PassinCString1Click(TObject *Sender)

{

AnsiString S("Sam");

Memo1->Text = S;

}

void __fastcall TForm1::PassInInteger1Click(TObject *Sender)

{

AnsiString MyNum(5);

Memo1->Text = MyNum;

}

void __fastcall TForm1::PassInaDouble1Click(TObject *Sender)

{

AnsiString MyDouble(6.6);

Memo1->Text = MyDouble;

}

This code demonstrates several things. The first, and most important point, is that it shows how you can create an AnsiString object by initializing it with a string, an integer, or a double. In short, these constructors can automatically perform conversions for you. This means you can usually write code like this:

AnsiString S;

int I = 4;

S = I;

ShowMessage(S);
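
If you want to experiment with this conversion behavior outside BCB, the following sketch imitates it in standard C++. The Text type and the assignFromInt function are hypothetical stand-ins, not part of the VCL, and the sketch assumes nothing about how AnsiString formats its numbers internally:

```cpp
#include <string>

// Hypothetical stand-in for AnsiString: each non-explicit constructor
// doubles as an implicit conversion from its argument type.
struct Text {
    std::string value;
    Text() {}
    Text(const char* src) : value(src) {}
    Text(int src) : value(std::to_string(src)) {}
};

// Mirrors "S = I": the int is first converted to a temporary Text by the
// Text(int) constructor, and the temporary is then assigned to s.
std::string assignFromInt(int i)
{
    Text s;
    s = i;
    return s.value;
}
```

The point of the sketch is only that a constructor taking a single argument lets the compiler perform the conversion for you wherever the class type is expected.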

When working with C strings, you always have to be careful that the variable you have been working with has been properly initialized. This is not nearly as big a concern when you are working with AnsiStrings. Consider the following code:

void __fastcall TForm1::InitializetoZero1Click(TObject *Sender)

{

AnsiString S;

Memo1->Text = S.Length();

}

The first line creates an AnsiString object that holds a zero-length string. The second line performs a completely safe and legal call to one of the methods of this AnsiString. This is the type of situation that can be very dangerous in C. Here, for instance, is some code that is likely to blow up on you:

char *S;

int i = strlen(S);

This code appears to do more or less the same thing as the code in the InitializetoZero1Click method. In practice, however, this latter example will most likely raise an access violation, because S is an uninitialized pointer and strlen reads from whatever address it happens to hold, while the AnsiString code example succeeds. The explanation is simply that the declaration of the AnsiString variable created a real instance of an object called S. You can safely call the methods of that object, even if the underlying string has no memory allocated for it! The second example, however, leaves the char * declared in the first line completely impotent, with no memory allocated for it. If you want to avoid trouble, you should not do anything with that variable until you allocate some memory for it.
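
The following miniature class is a sketch of why the object-based version is safe. TinyString is a hypothetical illustration written in standard C++; it is not the real AnsiString implementation, only the same general idea of an object that wraps a char pointer which may be null:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical miniature of the idea: the object always exists, even when
// its internal pointer is null, so its methods can check before reading.
class TinyString {
public:
    TinyString() : data_(nullptr) {}          // like "Data(0)": no buffer yet
    explicit TinyString(const char* src)
        : data_(new char[std::strlen(src) + 1])
    {
        std::strcpy(data_, src);              // buffer was sized for src + '\0'
    }
    ~TinyString() { delete[] data_; }

    // Copying is disabled to keep the sketch short.
    TinyString(const TinyString&) = delete;
    TinyString& operator=(const TinyString&) = delete;

    std::size_t length() const
    {
        return data_ ? std::strlen(data_) : 0; // a null pointer means "empty"
    }

private:
    char* data_;
};
```

A default-constructed TinyString reports a length of zero instead of reading through an invalid pointer, which is the behavior the InitializetoZero1Click method relies on.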

In the last few paragraphs I have outlined an example illustrating what it is I like about the AnsiString class. In short, this class makes strings safe and easy to use.

NOTE: It goes without saying that C strings are generally faster than AnsiStrings, and that they take up less memory. Clearly, these are important features, and obviously I am not stating that C strings are now obsolete.

The reasoning on this issue is a bit like that we undertake when deciding whether to buy a car or a motorcycle. Cars are more expensive than motorcycles, and they don't let you weave back and forth between lanes when there is congested traffic. On the other hand, drivers of cars are much less likely to end up in the hospital, and cars are much more pleasant to be in during inclement weather. In the same way, AnsiStrings aren't as small and flexible as C strings, but they are less likely to crash the system, and they stand up better when you are in a rush or when handled by inexperienced programmers.

This book focuses on ways to quickly write safe, high-performance programs. If that is your goal, use AnsiStrings. If you are trying to write an operating system, a compiler, or the core module for a 3D game engine, you should probably concentrate more on speed than I do in this book and should use AnsiStrings only sparingly.

Please note that my point here is not that you can't use C++Builder to write highly optimized code, but only that this book usually does not focus on that kind of project.

Sticky Constructor Issues

The AnsiString constructors are easy to use, but things can be a bit confusing if you try to think about what is going on behind the scenes. For instance, consider what happens when you pass an AnsiString to a function:

AnsiString S;

MyFunc(S);

If the call to MyFunc is by value, not by reference, the AnsiString copy constructor is going to be called each time you pass the string, so that the function receives its own copy of S. This is not a tremendous burden on your program, but it is probably a bit more significant weight than you had in mind to impose on your code. As a result, you should pass in the address of the variable in most circumstances:

AnsiString S;

MyFunc(&S);

Even better, you should construct methods that declare all their string variables as being passed by reference:

int MyFunc(AnsiString &S)

{

S = "The best minds of my generation...";

return S.Length();

}

This is like a var parameter in Object Pascal in that it lets you pass the string by reference without worrying about pointers:

void __fastcall TForm1::Button1Click(TObject *Sender)

{

AnsiString S;

int i = MyFunc(S);

ShowMessage(S + " Length: " + i);

}

Even when MyFunc changes the string, the result of the changes is reflected in the calling module. This syntax passes a pointer to the function behind the scenes, but makes it seem as though you are working with a local stack-based variable on both the calling and the called sides.
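
You can watch the difference between the two calling styles with a small counting class. The sketch below is standard C++ rather than VCL code, and the names Tracked, byValue, byReference, and overwrite are all hypothetical:

```cpp
#include <cstddef>
#include <string>

// Hypothetical string-like type that counts copy-constructor calls.
struct Tracked {
    static int copies;
    std::string text;
    Tracked(const char* s) : text(s) {}
    Tracked(const Tracked& other) : text(other.text) { ++copies; }
};
int Tracked::copies = 0;

// Pass by value: the argument is copied into t on every call.
std::size_t byValue(Tracked t) { return t.text.size(); }

// Pass by const reference: no copy, and the function cannot modify t.
std::size_t byReference(const Tracked& t) { return t.text.size(); }

// Pass by non-const reference: no copy, and the caller sees the change.
void overwrite(Tracked& t) { t.text = "changed"; }
```

Calling byValue with a named variable bumps Tracked::copies by one, while byReference and overwrite leave the counter alone.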

Consider the following code samples:

MyAnsiString S = "The road to the contagious hospital";

MyAnsiString S1 = AnsiString("If I had a green automobile...");

MyAnsiString S2("All that came out of them came quiet, like the four seasons");

ShowMessage(S + '\r' + S1 + '\r' + S2);

It should be clear from looking at this code that the second example will take longer to execute than the third, because it calls two constructors rather than just one. You might also think that the first takes longer than the third. A logical course of reasoning would be to suppose that, at minimum, it would have to call both a constructor and the equals operator. In fact, when I stepped through the code, it became clear that the compiler simply called the constructor immediately in the first example, and that the machine code executed for the first and third examples was identical.
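
If you do not want to step through machine code, you can count constructor calls instead. The sketch below uses two hypothetical types, Str and MyStr, as standard C++ stand-ins for AnsiString and MyAnsiString. It assumes the compiler elides the copy from a temporary, which is required since C++17 and was a common optimization before that; it says nothing definitive about what any particular version of BCB generates:

```cpp
#include <string>

// Hypothetical stand-ins for AnsiString and MyAnsiString.
struct Str {
    static int constructed;   // bumped by every Str constructor
    std::string text;
    Str(const char* s) : text(s) { ++constructed; }
    Str(const Str& other) : text(other.text) { ++constructed; }
};
int Str::constructed = 0;

struct MyStr : Str {
    MyStr(const char* s) : Str(s) {}
    MyStr(const Str& src) : Str(src) {}
};

// Each function resets the counter, builds one MyStr, and reports how many
// Str constructors ran along the way.
int countCopyInit()   { Str::constructed = 0; MyStr s = "road";      (void)s; return Str::constructed; }
int countViaTemp()    { Str::constructed = 0; MyStr s = Str("road"); (void)s; return Str::constructed; }
int countDirectInit() { Str::constructed = 0; MyStr s("road");       (void)s; return Str::constructed; }
```

With copy elision in effect, countCopyInit and countDirectInit each report one construction, while countViaTemp reports two: one for the explicit temporary and one for the object built from it.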

What is the lesson to be learned here? Unless you are writing a compiler, an operating system, or a 3D game engine, just don't worry about this stuff. Of course, if you love worrying about these issues, then worry away. I suppose someone has to. Otherwise, relax. There is almost no way to tell from looking at the code what the compiler is going to do, and 90 percent of the time the compiler is smart enough to generate optimal code unless you explicitly tell it to go out of its way, as I do in the second example.

If you are a beginning- or intermediate-level programmer, someone is going to tell you that you should pull your hair out worrying about optimization issues like this. Personally, I think there are more important subjects to which you can dedicate your time. Eliminating one constructor call might save a few nanoseconds in execution time, but nobody is going to notice it. Learn to separate the petty issues from the serious issues.

Most importantly of all, wait until you have a program up and running before you decide whether or not you have a performance bottleneck. If the program is too slow, use a profiler to find out where the bottlenecks are. Ninety-eight percent of the time the bottlenecks are in a few isolated places in the code, and fixing those spots clears up the problem. Don't spend two weeks looking for every place there is an extra constructor call only to find it improves your performance by less than one percent. Wait till you have a problem, find the problem, and then fix it. If it turns out that you are in a loop, and are making extra constructor calls in the loop, and that by doing some hand waving to get rid of the calls you improve program performance by 10 percent, then great. But don't spend days working on "optimizations" that end up improving performance by only one or two percent.

AnsiString Comparison Operators

Comparison operators are useful if you want to alphabetize strings. Asking if StringOne is smaller than StringTwo is a means of finding out whether StringOne comes before StringTwo in the alphabet. For instance, if StringOne equals "Dole" and StringTwo equals "Clinton", StringOne is larger than StringTwo because it comes later in the alphabet. (At last, a politically controversial string comparison analysis!)
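
The same idea can be tried with the standard C++ string class. The helper name comesBefore is hypothetical, and the sketch assumes plain character-code ordering, which may differ from the ordering rules AnsiString applies:

```cpp
#include <string>

// Returns true when a sorts before b. std::string compares character codes,
// so every uppercase ASCII letter sorts before every lowercase one.
bool comesBefore(const std::string& a, const std::string& b)
{
    return a < b;
}
```

Under this rule "Clinton" comes before "Dole", but "Zebra" also comes before "apple", which is one reason a case-insensitive comparison is often what you really want.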

Here are the AnsiString class comparison operators and methods:

bool __fastcall operator ==(const AnsiString& rhs) const;

bool __fastcall operator !=(const AnsiString& rhs) const;

bool __fastcall operator <(const AnsiString& rhs) const;

bool __fastcall operator >(const AnsiString& rhs) const;

bool __fastcall operator <=(const AnsiString& rhs) const;

bool __fastcall operator >=(const AnsiString& rhs) const;

int __fastcall AnsiCompare(const AnsiString& rhs) const;

int __fastcall AnsiCompareIC(const AnsiString& rhs) const; //ignorecase

The UsingAnsiString program on this book's CD-ROM shows how to use these operators. For instance, the following method from that program shows how to use the equals operator:

void __fastcall TForm1::Equals1Click(TObject *Sender)

{

AnsiString StringOne, StringTwo;

InputDialog->GetStringsFromUser(&StringOne, &StringTwo);

if (StringOne == StringTwo)

Memo1->Text = "\"" + StringOne + "\" is equal to \"" + StringTwo + "\"";

else

Memo1->Text = "\"" + StringOne + "\" is not equal to \"" + StringTwo + "\"";

}

Here is an example of using the "smaller than" operator:

void __fastcall TForm1::SmallerThan1Click(TObject *Sender)

{

AnsiString String1, String2;

InputDialog->GetStringsFromUser(&String1, &String2);

if (String1 < String2)

Memo1->Text = "\"" + String1 + "\" is smaller than \"" + String2 + "\"";

else

Memo1->Text = "\"" + String1 + "\" is not smaller than \"" + String2 + "\"";

}

The AnsiCompareIC method is useful if you want to ignore the case of words when comparing them. This method returns zero if the strings are equal, a positive number if the string calling the method is larger than the string to which it is being compared, and a negative number if it is smaller:

void __fastcall TForm1::IgnoreCaseCompare1Click(TObject *Sender)

{

AnsiString String1, String2;

InputDialog->GetStringsFromUser(&String1, &String2);

if (String1.AnsiCompareIC(String2) == 0)

Memo1->Text = "\"" + String1 + "\" equals \"" + String2 + "\"";

else if (String1.AnsiCompareIC(String2) > 0)

Memo1->Text = "\"" + String1 + "\" is larger than \"" + String2 + "\"";

else if (String1.AnsiCompareIC(String2) < 0)

Memo1->Text = "\"" + String1 + "\" is smaller than \"" + String2 + "\"";

}

Consider the following chart:

StringOne	StringTwo	Result
England	England	Returns zero
England	France	Returns a negative number
France	England	Returns a positive number
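
The return convention in the chart can be sketched in standard C++ as follows. The function compareIgnoreCase is a hypothetical, ASCII-only analogue; it does not reproduce the locale-aware behavior of the real AnsiCompareIC, only its zero, negative, and positive results:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical ASCII-only analogue of a case-insensitive compare.
// Returns 0 if equal, a negative number if a sorts first, positive otherwise.
int compareIgnoreCase(const std::string& a, const std::string& b)
{
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        // Cast through unsigned char so tolower never sees a negative value.
        const int ca = std::tolower(static_cast<unsigned char>(a[i]));
        const int cb = std::tolower(static_cast<unsigned char>(b[i]));
        if (ca != cb)
            return ca - cb;
    }
    if (a.size() == b.size())
        return 0;
    return a.size() < b.size() ? -1 : 1;   // the shorter string sorts first
}
```

Callers should test only the sign of the result, as the IgnoreCaseCompare1Click method does, rather than rely on a specific nonzero value.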

Using the Pos Method

The Pos method is used to find the offset of a substring within a larger string. Here is a simple example from the UsingAnsiString program of how to use the Pos function:

void __fastcall TForm1::SimpleString1Click(TObject *Sender)

{

AnsiString S = "Sammy";

Memo1->Text = S;

int i = S.Pos("mm");

S.Delete(i + 1, 2);

Memo1->Lines->Add(S);

}

This code creates an AnsiString initialized to the name "Sammy". It then searches through the string for the place where the substring "mm" occurs, and returns that index so that it can be stored in the variable i. I then use the index as a guide when deleting the last two characters from the string, thereby transforming the word "Sammy" to the string "Sam".
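
Note that AnsiString positions are counted from 1, so "mm" is found at position 3 and the call deletes positions 4 and 5. The standard C++ string class counts from 0 instead. The sketch below shows the translation; pos1 and trimSammy are hypothetical names used only for this illustration:

```cpp
#include <string>

// Hypothetical helper using the 1-based convention: returns the 1-based
// offset of sub within s, or 0 when sub does not occur.
int pos1(const std::string& s, const std::string& sub)
{
    const std::string::size_type at = s.find(sub);   // 0-based, or npos
    return at == std::string::npos ? 0 : static_cast<int>(at) + 1;
}

// Reproduces the "Sammy" to "Sam" edit with std::string.
std::string trimSammy()
{
    std::string s = "Sammy";
    const int i = pos1(s, "mm");      // 1-based position of "mm"
    // Delete(i + 1, 2) in 1-based terms is erase((i + 1) - 1, 2) in 0-based terms.
    s.erase((i + 1) - 1, 2);
    return s;
}
```

A result of 0 from pos1 means the substring was not found, so real code should check for it before deleting anything.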

Here is a similar case, except that this time the code searches for a tab rather than an ordinary substring:

void __fastcall TForm1::Pos1Click(TObject *Sender)

{

AnsiString S("Sammy \t Mike");

Memo1->Text = S;

AnsiString Temp = Format("The tab is in the %dth position",

OPENARRAY(TVarRec, (S.Pos('\t'))));

Memo1->Lines->Add(Temp);

}

The code in this example first initializes the AnsiString S to a string that contains a tab, and then searches through the string and reports the offset of the tab: "The tab is in the 7th position." The Format function shown here works in the same fashion as sprintf. In fact, the following code would have the same result as the Pos1Click method shown previously:

AnsiString S("Sammy \x009 Mike");

Memo1->Text = S;

char Temp[75];

sprintf(Temp, "The tab is in the %dth position", S.Pos('\t'));

Memo1->Lines->Add(Temp);
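
One caution about the sprintf version: sprintf does not know how large Temp is. If your compiler's library provides snprintf, you can pass the buffer size and rule out an overrun. The following standard C++ sketch assumes such a library; the function name describeTab is hypothetical, and std::string::find stands in for the Pos method:

```cpp
#include <cstdio>
#include <string>

// Builds the message with snprintf, which never writes past the buffer.
std::string describeTab(const std::string& s)
{
    const std::string::size_type at = s.find('\t');   // 0-based, or npos
    const int position =
        (at == std::string::npos) ? 0 : static_cast<int>(at) + 1; // 1-based
    char temp[75];
    std::snprintf(temp, sizeof temp, "The tab is in the %dth position", position);
    return temp;
}
```

For the string "Sammy \t Mike" this produces the same text as the Pos1Click method.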

Escape Sequences in C++

C++ uses a series of escape sequences to represent special characters such as tabs, backspaces, and so on. If you are not familiar with C++, you might find the following table of escape sequences useful:

Human readable name	Escape sequence	Hex representation
Bell	\a	\x007
Backspace	\b	\x008
Tab	\t	\x009
Newline	\n	\x00A
Form Feed	\f	\x00C
Carriage return	\r	\x00D
Double quote	\"	\x022
Single quote	\'	\x027
Backslash	\\	\x05C

A single escaped character is usually written as a character literal in single quotes, though escape sequences work equally well inside double-quoted strings. In the previous case it would not matter whether I used single or double quotes in the Pos statement.

Using hex values has the exact same effect as using escape sequences. For instance, the following code creates identical output to the Pos1Click method shown previously:

AnsiString S("Sammy \x009 Mike");

Memo1->Text = S;

AnsiString Temp = Format("The tab is in the %dth position",

OPENARRAY(TVarRec, (S.Pos('\x009'))));

Memo1->Lines->Add(Temp);

Though some programmers may find it simpler and more intuitive to use the Hex values shown in the last column of the table, the arcane-looking escape sequences cling on in C++ code because they are portable to platforms other than Windows and DOS.
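
The equivalence between the two notations can be checked at compile time, as in the following standard C++ sketch. It assumes an ASCII-based character set, and the function name tabThenName is hypothetical. The sketch also shows one trap with hex values: a hex escape absorbs every hex digit that follows it, which is why the earlier examples put a space after \x009.

```cpp
#include <string>

// On an ASCII-based platform each escape sequence equals its hex form.
static_assert('\t' == '\x09', "tab escape and hex form should match");
static_assert('\\' == '\x5C', "backslash escape and hex form should match");

// Writing the tab and the name in one literal would make the compiler read
// the B of Bob as part of the hex number. Splitting the literal in two
// ends the hex escape before the B.
std::string tabThenName()
{
    return std::string("\x09" "Bob");   // tab, 'B', 'o', 'b'
}
```

Adjacent string literals are joined by the compiler, so the split costs nothing at run time.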

Arrays of AnsiStrings

In this section you will see how to declare an array of AnsiStrings. The text includes examples of how to access them as standard null-terminated strings, and how to use them in moderately complex circumstances.

The following ChuangTzu program uses some Chinese quotes with a millennium or two of QA under their belt to illustrate the favorite theme of this book:

Easy is right. Begin right

And you are easy.

Continue easy and you are right.

The right way to go easy
