BACKGROUND
Several components of ICU4C/J are data-driven. They take patterns
(short strings, less than a line of text) or rules (typically one or
more lines of text) that parameterize them. Examples include
SimpleDateFormat, DecimalFormat, RuleBasedCollator,
RuleBasedTransliterator, and RuleBasedBreakIterator.
In these rules, a class of characters ('specials') must be escaped or
quoted if they are to be used as literals. Quoting mechanisms include
single quotes and backslash escapes (including \uXXXX).
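To make the single-quote mechanism concrete, here is a minimal sketch.
It assumes the JDK's java.text.DecimalFormat as a stand-in for the
ICU4J class of the same name so that the block is self-contained; the
class name QuotedLiteralSketch and its helper are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class QuotedLiteralSketch {
    /**
     * Formats value with the given pattern. Locale.ROOT symbols are used
     * so the digits do not depend on the default locale.
     */
    static String format(String pattern, long value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.ROOT);
        return new DecimalFormat(pattern, symbols).format(value);
    }
}
```

In the pattern '#'# the first '#' is quoted and is emitted as a literal
prefix, while the second '#' keeps its special meaning as a digit
placeholder.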
Rule-based components typically return their rule string via methods
such as toPattern() or getRules(). The returned string should be
directly usable to create new objects equivalent to the originals.
For this to work, special characters must be escaped in the returned
rule string.
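The round-trip requirement can be sketched as a small check. This
assumes the JDK's java.text.SimpleDateFormat as a stand-in for the
ICU4J class; RoundTripSketch and roundTrips are hypothetical names.

```java
import java.text.SimpleDateFormat;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class RoundTripSketch {
    /**
     * Returns true if the string returned by toPattern() rebuilds a
     * formatter that compares equal to the original one.
     */
    static boolean roundTrips(String pattern) {
        SimpleDateFormat original = new SimpleDateFormat(pattern, Locale.ROOT);
        SimpleDateFormat copy = new SimpleDateFormat(original.toPattern(), Locale.ROOT);
        return copy.equals(original);
    }
}
```

A pattern with quoted literal text, such as yyyy-MM-dd 'at' HH:mm,
only passes this check if the quotes are still present in the returned
string.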
RFE
Add a quoting function that escapes and/or quotes special characters.
This function should take a string and a set of characters that need
quoting, and return the quoted string.
C
/**
 * Quote specials in-place. Return actual length (may be > capacity).
 * @param specials set of characters that need quoting
 */
int32_t u_quoteSpecials(UChar* string, int32_t length, int32_t capacity,
                        UUnicodeSet* specials);
This might be a little cleaner:
/**
 * Quote specials. Return result length (may be > capacity).
 */
int32_t u_quoteSpecials(const UChar* string, int32_t length,
                        UChar* result, int32_t resultCapacity,
                        UUnicodeSet* specials);
Java
/**
 * Quote specials. Return the quoted string.
 */
String quoteSpecials(String string,
                     UnicodeSet specials);
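One possible shape for the Java function is sketched below. It is not
a proposed implementation. It assumes backslash escaping as the only
quoting mechanism, which not every rule syntax accepts, and it
represents the set of specials as a plain String instead of a
UnicodeSet so that the block compiles without ICU4J. The class name
QuoteSpecialsSketch is hypothetical.

```java
// Hypothetical helper class, for illustration only.
class QuoteSpecialsSketch {
    /**
     * Returns a copy of string in which every code point contained in
     * specials is preceded by a backslash. The backslash itself is
     * always escaped so that the result stays unambiguous.
     *
     * Assumption: specials is a String whose code points form the set;
     * the proposal uses a UnicodeSet here.
     */
    static String quoteSpecials(String string, String specials) {
        StringBuilder result = new StringBuilder(string.length() + 8);
        for (int i = 0; i < string.length(); ) {
            int cp = string.codePointAt(i);
            if (cp == '\\' || specials.indexOf(cp) >= 0) {
                result.append('\\');
            }
            result.appendCodePoint(cp);
            // Advance by one or two chars, depending on the code point.
            i += Character.charCount(cp);
        }
        return result.toString();
    }
}
```

With the specials { and }, the input a{b} becomes a\{b\}. Iterating by
code point rather than by char keeps supplementary characters intact.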