Ticket #5498 (new defect)

Opened 2 years ago

Last modified 1 month ago

RBBI: if fBreakType not set and dictionary, Bad Things happen

Reported by: goldsmit(at)apple.com
Assigned to: pedberg
Priority: minor
Milestone: 4.2
Component: textbounds
Version: cvs
Keywords:
Cc: andy
Load: ibm:30
Xref:
Java Version:
Operating System: all
Project (C/J): ICU4C
Weeks:
Review:

Description (Last modified by grhoten)

If RuleBasedBreakIterator::fBreakType has its default value of -1, and the dictionary set is not empty, the break engine machinery doesn't cope well. Possible fixes:

1. Make the default value of fBreakType something other than -1.
2. Beef up UnhandledEngine to deal with this better.
3. Turn off the dictionary machinery if fBreakType is not a valid value.
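
As a rough illustration of option 3, the check could look something like the sketch below. It does not use ICU's real code: kBreakTypeCount, isValidBreakType and shouldUseDictionary are hypothetical names, and the count of 5 is an assumed placeholder for however many break types the library defines.

```cpp
#include <cstdint>

// Hypothetical: number of valid break types (an assumed placeholder value).
constexpr int32_t kBreakTypeCount = 5;

// A break type is treated as valid only if it falls in [0, kBreakTypeCount).
// The default of -1 described in this ticket therefore counts as invalid.
inline bool isValidBreakType(int32_t breakType) {
    return breakType >= 0 && breakType < kBreakTypeCount;
}

// Hypothetical gate: only hand text to the dictionary machinery when the
// rules contain dictionary characters AND the break type is usable.
inline bool shouldUseDictionary(int32_t breakType, bool hasDictionaryChars) {
    return hasDictionaryChars && isValidBreakType(breakType);
}
```

With a gate like this, an iterator whose break type was never set would simply skip dictionary-based breaking instead of reaching the break engines with an out-of-range value.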

Attachments

patch-i18npool_source_breakiterator_breakiterator_unicode.cxx (3.3 kB) - added by cokane@FreeBSD.org on 04/17/08 13:20:16.
Patch I made to allow OO.o to access setBreakType without rewriting ICU sources

Change History

04/04/07 13:32:24 changed by deborah

  • load changed.
  • status changed from new to assigned.
  • xref changed.
  • java changed.
  • weeks changed.
  • revw changed.

08/29/07 17:48:14 changed by deborah

  • milestone changed from 3.8 to 4.0.

11/13/07 14:53:17 changed by khong@...

Making setBreakType() public allows external programs to set fBreakType themselves. We have done this in OpenOffice.org for line breaking.

12/11/07 09:56:19 changed by erack@...

See also http://www.openoffice.org/issues/show_bug.cgi?id=84467, which proposes a different approach to fixing the issue.

01/15/08 15:05:18 changed by grhoten

  • cc set to andy.
  • keywords deleted.
  • description changed.
  • priority changed from major to critical.

Ticket 84467 from OpenOffice.org has a better fix than making setBreakType public.

The break type should not be settable by users after construction. OpenOffice.org's temporary fix is causing significant binary compatibility issues for ICU distributors.

03/25/08 23:36:49 changed by pedberg

  • owner changed from deborah to pedberg.
  • status changed from assigned to new.

04/17/08 13:20:16 changed by cokane@...

  • attachment patch-i18npool_source_breakiterator_breakiterator_unicode.cxx added.

Patch I made to allow OO.o to access setBreakType without rewriting ICU sources

04/17/08 13:24:50 changed by cokane@...

I just attached a patch that will hopefully be a good example of how to work around the problem (and allow access to RuleBasedBreakIterator::setBreakType(type)) without going in and munging the rbbi.h header file, which has drastic consequences!

Under FreeBSD, the "fix" produced by the OO.o team has rendered OO.o un-buildable if a user has also installed ICU from ports (as needed with the latest GNOME). The above fix can be applied to the OO.o sources. The check for a public setBreakType in configure can then be removed, and OO.o will build again against the system's ICU (--with-system-icu=yes configure arg).

I recommend that anybody else who's experiencing the same bug should try implementing a similar work-around in their own sources (rather than editing ICU) until the problem is resolved.
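
For readers unfamiliar with this kind of work-around, one common C++ technique is sketched below. It is not taken from the attached patch and may differ from it. It assumes the setter is declared protected (not private) in the library class. LibraryIterator and AppIterator are hypothetical stand-ins, not ICU classes.

```cpp
#include <cstdint>

// Hypothetical stand-in for a library class whose header must not be edited.
class LibraryIterator {
public:
    virtual ~LibraryIterator() = default;
    int32_t breakType() const { return fBreakType; }

protected:
    // Assumption: the setter is protected, so subclasses may call it.
    void setBreakType(int32_t type) { fBreakType = type; }

private:
    int32_t fBreakType = -1;  // same "unset" default as described in this ticket
};

// Application-side subclass. The using-declaration re-exports the inherited
// protected member as public, without touching the library header.
class AppIterator : public LibraryIterator {
public:
    using LibraryIterator::setBreakType;
};
```

Application code would then construct an AppIterator and call setBreakType() on it directly. Because the library header and binary are left unchanged, this style of work-around avoids the binary compatibility problems mentioned earlier in this ticket.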

06/18/08 10:38:35 changed by pedberg

  • priority changed from critical to major.

The original problem that led to the filing of this bug is no longer present. That is, all of the relevant code checks for fBreakType being out of range and handles it appropriately (there is still no way to set fBreakType for a BreakIterator created using "new RuleBasedBreakIterator()", but it is not clear that is desirable anyway).

What does remain as a possible issue is that the break iterator machinery does not cope with fData being NULL, which would happen as a result of doing something like the following:

    UnicodeString text = ...;
    BreakIterator* b = new RuleBasedBreakIterator();
    b->setText(text);
    int32_t p = b->next();

Here, the call to next() will crash. This doesn't really make sense anyway since there are no rules. But if desired it could be fixed with some NULL tests on fData.
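
A minimal sketch of what such a NULL test might look like follows. It is only an illustration of the guard pattern, not ICU's implementation: RuleData and SketchIterator are hypothetical stand-ins, and returning a DONE sentinel when there are no rules is an assumed choice of behavior.

```cpp
#include <cstdint>

// Hypothetical stand-in for the compiled rule data an iterator would hold.
struct RuleData {
    int32_t length;  // number of positions the toy iterator can step through
};

// Hypothetical iterator that mirrors the shape of the problem: fData may be
// null when the object is default-constructed without any rules.
class SketchIterator {
public:
    static constexpr int32_t DONE = -1;  // sentinel meaning "no more boundaries"

    explicit SketchIterator(const RuleData* data = nullptr)
        : fData(data), fPos(0) {}

    int32_t next() {
        // The NULL test: with no rule data there is nothing to iterate over,
        // so report DONE instead of dereferencing a null pointer.
        if (fData == nullptr) {
            return DONE;
        }
        if (fPos >= fData->length) {
            return DONE;
        }
        return ++fPos;
    }

private:
    const RuleData* fData;
    int32_t fPos;
};
```

In this sketch a default-constructed SketchIterator returns DONE from its first call to next(), while one built with rule data steps forward normally until it runs out of positions.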

Based on this, downgrading from critical to major (could go lower).

06/19/08 11:45:34 changed by pedberg

  • milestone changed from 4.0 to 4.2.

07/09/08 16:03:46 changed by yoshito

  • load set to ibm:30.

07/23/08 11:44:45 changed by pedberg

  • priority changed from major to minor.
