Ticket #6656 (closed defect: fixed)

Bug contains 1 commit(s) | SVN Diffs for #6656

 

Opened 1 year ago

Last modified 3 months ago

uregex_replaceAll does not correctly handle U_BUFFER_OVERFLOW_ERROR condition

Reported by: anonymous Assigned to: andy
Priority: medium Milestone: 4.2
Component: regexp Version: 3.6
Keywords: Cc:
Load: Xref:
Java Version: Operating System: Mac OS X 10.5
Project (C/J): ICU4C Weeks:
Review: srl

Description

uregex_replaceAll will not report the correct size to hold the entire replaced string in some circumstances.

The bug exists in uregex_replaceAll itself. The original relevant lines of codes:

int32_t len = 0;
uregex_reset(regexp, 0, status);
while (uregex_findNext(regexp, status)) {

len += uregex_appendReplacement(regexp, replacementText, replacementLength,

&destBuf, &destCapacity, status);

}
len += uregex_appendTail(regexp, &destBuf, &destCapacity, status);

Proposed fix:

int32_t len = 0, overflowed = 0;
uregex_reset(regexp, 0, status);
while (uregex_findNext(regexp, status)) {

len += uregex_appendReplacement(regexp, replacementText, replacementLength,

&destBuf, &destCapacity, status);

if(*status == U_BUFFER_OVERFLOW_ERROR) { overflowed = 1; *status = 0; }

}
if((*status == 0) && (overflowed == 1)) { *status = U_BUFFER_OVERFLOW_ERROR; }
len += uregex_appendTail(regexp, &destBuf, &destCapacity, status);

The bug is a result of the fact that once uregex_appendReplacement sets status to U_BUFFER_OVERFLOW_ERROR, the next call to uregex_find() will 'fail' because status != 0. Thus the size of any remaining replacements is skipped, causing the final calculated buffer replacement size to be short.

In the event of a U_BUFFER_OVERFLOW_ERROR, the returned replacement length should be sufficient to hold the fully replaced string. It should not return with a U_BUFFER_OVERFLOW_ERROR again since the correct buffer size should have been calculated by the first failed call.

The following demonstrates the bug. I coded it in the style of 'source/test/cintltst/reapits.c', and it can be placed in the replaceAll() section. I replicated the first replaceAll test, and placed this one right after it. However, I was working with just the uregex_openC and the first uregex_replaceAll test in a stand alone .c file, so I'm not sure if it can be dropped in completely unmodified in the full test suite. It should be pretty close, though.

As it stands, this test fails assertions 2, 3, 4 and passes assertions 1 and 5. It passes five as a fluke of the particular strings involved which just happen to only require two buffer overflow iterations to get the correct size. Different replacement texts / etc can cause additional buffer overflow errors until a large enough buffer is finally calculated.

status = U_ZERO_ERROR;
u_uastrncpy(replText, "<$1>[$1]", sizeof(replText)/2);
resultSz = uregex_replaceAll(re, replText, -1, buf, 4, &status); // Buffer size of 4 is deliberate.
TEST_ASSERT(status == U_BUFFER_OVERFLOW_ERROR);
TEST_ASSERT(resultSz == (int32_t)strlen("Replace <aa>[aa] <1>[1] <...>...."));
status = U_ZERO_ERROR; // Clear U_BUFFER_OVERFLOW_ERROR error
resultSz = uregex_replaceAll(re, replText, -1, buf, resultSz, &status); // previous replaceAll buffer size to hold all of the replaced string
TEST_ASSERT_SUCCESS(status);
TEST_ASSERT_STRING("Replace <aa>[aa] <1>[1] <...>....", buf, TRUE);
TEST_ASSERT(resultSz == (int32_t)strlen("Replace <aa>[aa] <1>[1] <...>...."));
u_uastrncpy(replText, "<$1>", sizeof(replText)/2); // Reset replText

Attachments

Change History

12/10/08 04:19:52 changed by jengelhart@...

opps, this is in regard to a bug filed with RegexKitLite.

RegexKitLite bug: http://sourceforge.net/tracker/index.php?func=detail&aid=2408447&group_id=204582&atid=990188

Original bug reporter: shogunjp (at) users.sourceforge.net

01/07/09 11:22:54 changed by yoshito

  • owner changed from somebody to andy.
  • weeks changed.
  • xref changed.
  • revw changed.

01/08/09 10:21:15 changed by andy

  • priority changed from assess to medium.
  • project changed from all to ICU4C.
  • status changed from new to closed.
  • resolution set to fixed.
  • milestone changed from UNSCH to 4.2.

01/08/09 10:21:36 changed by andy

  • status changed from closed to reopened.
  • resolution deleted.
  • revw set to srl.

(follow-up: ↓ 9 ) 05/11/09 18:01:19 changed by srl

  • status changed from reopened to closed.
  • resolution set to fixed.

10/07/09 00:11:56 changed by anonymous

10/15/09 21:57:13 changed by anonymous

12/03/09 00:33:55 changed by kareem198625@...

now,<a href="http://www.newpaulsmith.com/paul-smith-shoes-c-518.html"> Paul Smith shoes</a>,<a href="http://www.newpaulsmith.com/paul-smith-sweaters-c-526.html"> paul smith clothing</a>, and<a href="http://www.newpaulsmith.com/paul-smith-sweaters-c-526.html"> paul smith swimwear</a> are beautifully crafted with brilliant attention to detail. Paul Smith socs are an unexpected element of fun to any ensemble. Collections are primarily produced in England and Italy, while the fabrics used are mainly of Italian, French and British origin.

(in reply to: ↑ 5 ) 12/11/09 04:44:11 changed by mario

Replying to srl:

I'm not sure if I use the API in a wrong way or if the bug is just fixed half way (I'm using 4.2.1 with above mentioned fix included).

I use replaceAll two times. First time to determine the needed length of the result buffer and second time to replace the string. Snipped looks like this:

// get needed len for result
len = uregex_replaceAll(expr, , repl_strg, u_strlen(repl_strg), NULL, 0, %ecode);
// thanx to fix 6656, correct len is returned
len = (len+1) * sizeof(UChar);

// allocate result
UChar res[len];

// replace an store in allocated result
u_regex_replaceAll(expr, repl_strg, u_strlen(repl_strg), res, len, &ecode);

The second call to replaceAll still returns a BUFFER_OVERFLOW_ERROR.

Currently I use

ecode = U_ZERO_ERROR

between the two calls and it works fine. Even a reset or close/open operation between the two replaceAll calls lead to the BUFFER_OVERFLOW_ERROR.

I'm I doing something wrong ?

12/17/09 21:35:31 changed by ZacharySkermont@...

<a href="http://www.newpaulsmith.com">cheap paul smith</a> has grown from a small time surf brand, a global leader in luxury sheepskin footwear. Also<a href="http://www.newpaulsmith.com/paul-smith-bags-vertical-stripes-with-chocolate-leather-p-21897.html">paul smith wallets</a> been beyond the range of products include casual shoes, <a href="http://www.newpaulsmith.com/paul-smith-canvas-holdall-bookcase-mini-car-chocolate-leather-p-21887.html">paul smith belts</a>, shirt, as well as weather and Australia to make a fine sheepskin products.


Add/Change #6656 (uregex_replaceAll does not correctly handle U_BUFFER_OVERFLOW_ERROR condition)




Anti spam check: