ICU 58.2
unistr.h File Reference

C++ API: Unicode String.

#include "unicode/utypes.h"
#include "unicode/rep.h"
#include "unicode/std_string.h"
#include "unicode/stringpiece.h"
#include "unicode/bytestream.h"
#include "unicode/ucasemap.h"

Data Structures

class  icu::UnicodeString
 UnicodeString is a string class that stores Unicode characters directly and provides functionality similar to that of the Java String and StringBuffer/StringBuilder classes.
 

Namespaces

 icu
 File coll.h.
 

Macros

#define U_COMPARE_CODE_POINT_ORDER   0x8000
 Option bit for u_strCaseCompare, u_strcasecmp, unorm_compare, etc: Compare strings in code point order instead of code unit order.
 
#define U_STRING_CASE_MAPPER_DEFINED
 
#define US_INV   icu::UnicodeString::kInvariant
 Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.
 
#define UNICODE_STRING(cs, _length)   icu::UnicodeString(TRUE, (const UChar *)L ## cs, _length)
 Unicode String literals in C++.
 
#define UNICODE_STRING_SIMPLE(cs)   UNICODE_STRING(cs, -1)
 Unicode String literals in C++.
 
#define UNISTR_FROM_CHAR_EXPLICIT
 This can be defined to be empty or "explicit".
 
#define UNISTR_FROM_STRING_EXPLICIT
 This can be defined to be empty or "explicit".
 
#define UNISTR_OBJECT_SIZE   64
 Desired sizeof(UnicodeString) in bytes.
 

Typedefs

typedef int32_t UStringCaseMapper(const UCaseMap *csm, UChar *dest, int32_t destCapacity, const UChar *src, int32_t srcLength, UErrorCode *pErrorCode)
 Internal string case mapping function type.
 

Functions

int32_t u_strlen (const UChar *s)
 Determine the length of an array of UChar.
 
U_COMMON_API UnicodeString icu::operator+ (const UnicodeString &s1, const UnicodeString &s2)
 Create a new UnicodeString with the concatenation of two others.
 

Detailed Description

C++ API: Unicode String.

Definition in file unistr.h.

Macro Definition Documentation

#define U_COMPARE_CODE_POINT_ORDER   0x8000

Option bit for u_strCaseCompare, u_strcasecmp, unorm_compare, etc: Compare strings in code point order instead of code unit order.

Stable:
ICU 2.2

Definition at line 47 of file unistr.h.
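
To make the difference between the two orders concrete, the following sketch compares UTF-16 strings both ways using only the C++ standard library. It does not call ICU and is not how ICU implements the option; the helper names toCodePoints, compareCodeUnitOrder, and compareCodePointOrder are hypothetical and exist only for this illustration. The two orders disagree only when a surrogate pair (a supplementary code point) is compared against a BMP code point in the range U+E000..U+FFFF.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: decode UTF-16 into code points.
// Unpaired surrogates are passed through unchanged.
std::vector<char32_t> toCodePoints(const std::u16string &s) {
    std::vector<char32_t> out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t c = s[i];
        if (c >= 0xD800 && c <= 0xDBFF && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            // Combine a lead/trail surrogate pair into one code point.
            c = 0x10000 + ((c - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;
        }
        out.push_back(c);
    }
    return out;
}

// Hypothetical helper: compare by 16-bit code units.
// Returns a negative, zero, or positive value.
int compareCodeUnitOrder(const std::u16string &a, const std::u16string &b) {
    return a.compare(b);
}

// Hypothetical helper: compare by decoded code points.
// Returns -1, 0, or 1.
int compareCodePointOrder(const std::u16string &a, const std::u16string &b) {
    const std::vector<char32_t> ca = toCodePoints(a);
    const std::vector<char32_t> cb = toCodePoints(b);
    if (ca < cb) return -1;
    if (cb < ca) return 1;
    return 0;
}
```

With these helpers, U+FF61 sorts after U+10000 in code unit order (0xFF61 > 0xD800) but before it in code point order (0xFF61 < 0x10000).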

#define U_STRING_CASE_MAPPER_DEFINED

Internal:
Do not use. This API is for internal use only.

Definition at line 63 of file unistr.h.

#define UNICODE_STRING(cs, _length)   icu::UnicodeString(TRUE, (const UChar *)L ## cs, _length)

Unicode String literals in C++.

Depending on platform properties, different UnicodeString constructors should be used to create a UnicodeString object from a string literal. The macros are defined for maximum performance. They work only for strings that contain "invariant characters", i.e., only Latin letters, digits, and some punctuation. See utypes.h for details.

The string parameter must be a C string literal. The length of the string, not including the terminating NUL, must be specified as a constant. The U_STRING_DECL macro should be invoked exactly once for one such string variable before it is used.

Stable:
ICU 2.0

Definition at line 120 of file unistr.h.

#define UNICODE_STRING_SIMPLE(cs)   UNICODE_STRING(cs, -1)

Unicode String literals in C++.

Depending on platform properties, different UnicodeString constructors should be used to create a UnicodeString object from a string literal. The macros are defined for improved performance. They work only for strings that contain "invariant characters", i.e., only Latin letters, digits, and some punctuation. See utypes.h for details.

The string parameter must be a C string literal.

Stable:
ICU 2.0

Definition at line 140 of file unistr.h.

#define UNISTR_FROM_CHAR_EXPLICIT

This can be defined to be empty or "explicit".

If explicit, then the UnicodeString(UChar) and UnicodeString(UChar32) constructors are marked as explicit, preventing their inadvertent use.

Stable:
ICU 49

Definition at line 155 of file unistr.h.

#define UNISTR_FROM_STRING_EXPLICIT

This can be defined to be empty or "explicit".

If explicit, then the UnicodeString(const char *) and UnicodeString(const UChar *) constructors are marked as explicit, preventing their inadvertent use.

In particular, this helps prevent accidentally depending on ICU conversion code by passing a string literal into an API with a const UnicodeString & parameter.
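
The effect of such a macro can be sketched without ICU. In the example below, DemoString and DEMO_FROM_STRING_EXPLICIT are hypothetical stand-ins for UnicodeString and UNISTR_FROM_STRING_EXPLICIT; the point is only that a macro expanding to "explicit" keeps direct construction working while rejecting implicit conversion from a string literal. For ICU itself, the macro would presumably be set before the header is included, for example with a compiler option such as -DUNISTR_FROM_STRING_EXPLICIT=explicit.

```cpp
#include <type_traits>

// Hypothetical stand-in for UNISTR_FROM_STRING_EXPLICIT.
// Define it as empty instead to allow implicit conversions again.
#define DEMO_FROM_STRING_EXPLICIT explicit

// Hypothetical stand-in for a string class with a const char * constructor.
class DemoString {
public:
    DEMO_FROM_STRING_EXPLICIT DemoString(const char *text) : text_(text) {}
    const char *text() const { return text_; }

private:
    const char *text_;
};

// Direct construction is still allowed: DemoString s("abc");
static_assert(std::is_constructible<DemoString, const char *>::value,
              "direct construction must work");

// Implicit conversion is rejected, so passing "abc" to a function that
// takes a const DemoString & parameter would fail to compile.
static_assert(!std::is_convertible<const char *, DemoString>::value,
              "implicit conversion must be rejected");
```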

Stable:
ICU 49

Definition at line 175 of file unistr.h.

#define UNISTR_OBJECT_SIZE   64

Desired sizeof(UnicodeString) in bytes.

It should be a multiple of sizeof(pointer) to avoid unusable space for padding. Ideally, the object size is also a multiple of 16 bytes, which is a common granularity for heap allocation.

Any space inside the object beyond sizeof(vtable pointer) + 2 is available for storing short strings inside the object. The larger the object, the longer the string that can be stored inside it without additional heap allocation.

Depending on a platform's pointer size, pointer alignment requirements, and struct padding, the compiler will usually round up sizeof(UnicodeString) to 4 * sizeof(pointer) (or 3 * sizeof(pointer) for P128 data models), to hold the fields for heap-allocated strings. Such a minimum size also ensures that the object is easily large enough to hold at least 2 UChars, for one supplementary code point (U16_MAX_LENGTH).

sizeof(UnicodeString) >= 48 should work for all known platforms.

For example, on a 64-bit machine where sizeof(vtable pointer) is 8, sizeof(UnicodeString) = 64 would leave space for (64 - sizeof(vtable pointer) - 2) / U_SIZEOF_UCHAR = (64 - 8 - 2) / 2 = 27 UChars stored inside the object.

The minimum object size on a 64-bit machine would be 4 * sizeof(pointer) = 4 * 8 = 32 bytes, and the internal buffer would hold up to 11 UChars in that case.
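
The arithmetic in the two examples above can be checked with a small compile-time sketch. The function inlineUCharCapacity is a hypothetical helper that mirrors the formula in the text; it is not part of ICU, and the values 8 (vtable pointer size) and 2 (U_SIZEOF_UCHAR) are the 64-bit assumptions stated above.

```cpp
#include <cstddef>

// Hypothetical helper mirroring the formula in the text:
// (object size - sizeof(vtable pointer) - 2) / U_SIZEOF_UCHAR
constexpr std::size_t inlineUCharCapacity(std::size_t objectSize,
                                          std::size_t vtablePointerSize,
                                          std::size_t ucharSize = 2) {
    return (objectSize - vtablePointerSize - 2) / ucharSize;
}

// 64-byte object on a 64-bit machine: (64 - 8 - 2) / 2 = 27 UChars.
static_assert(inlineUCharCapacity(64, 8) == 27, "64-byte object holds 27 UChars");

// Minimum 32-byte object on a 64-bit machine: (32 - 8 - 2) / 2 = 11 UChars.
static_assert(inlineUCharCapacity(32, 8) == 11, "32-byte object holds 11 UChars");
```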

See also
U16_MAX_LENGTH

Stable:
ICU 56

Definition at line 213 of file unistr.h.

#define US_INV   icu::UnicodeString::kInvariant

Constant to be used in the UnicodeString(char *, int32_t, EInvariant) constructor which constructs a Unicode string from an invariant-character char * string.

For details about invariant characters, see utypes.h. This constructor has no runtime dependency on conversion code and is therefore recommended over the constructors that take a charset name string (where the empty string "" indicates invariant-character conversion).

Stable:
ICU 3.2

Definition at line 98 of file unistr.h.

Typedef Documentation

typedef int32_t UStringCaseMapper(const UCaseMap *csm, UChar *dest, int32_t destCapacity, const UChar *src, int32_t srcLength, UErrorCode *pErrorCode)

Internal string case mapping function type.

Internal:
Do not use. This API is for internal use only.

Definition at line 70 of file unistr.h.