Ticket #4038 (closed defect: fixed)

Bug contains 4 commit(s) | SVN Diffs for #4038

 

Opened 4 years ago

Last modified 6 months ago

usearch_last misses matches when pattern length > 1

Reported by: dick(at)ximian.com Assigned to: michaelow
Priority: major Milestone: 4.0
Component: collation Version: 2.6.1
Keywords: Cc:
Load: Xref:
Java Version: Operating System: all
Project (C/J): ICU4C Weeks: 0.4
Review: srl

Description (Last modified by srl)

See the example code below. The UStringSearch iterator misses the pattern when searching backwards. Stepping through the code suggests the cause: pattern.defaultShiftSize is 2, so every match test lands on a window that straddles the matching part of the target instead of lining up with it.

i.e., in this case:

ta != ::
da != ::
...
it != ::
:b != ::
y: != ::

etc. If one character is appended to the target string (and the length passed to usearch_openFromCollator() is adjusted to 23), the match succeeds.
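The alignment problem described above can be reproduced without ICU. The following is a minimal sketch, assuming a scan that starts at the end of the target and always steps back by a fixed shift; it compares plain chars rather than collation elements, and the helper name scan_backward_fixed_shift is hypothetical, not part of ICU.

```c
#include <string.h>

/* Illustration only: scan from the end of text, testing one window per
 * step and always moving left by a fixed shift. This is NOT ICU's
 * algorithm; it only mimics the behaviour described in the report. */
static int scan_backward_fixed_shift(const char *text, int tlen,
                                     const char *pat, int plen, int shift) {
    if (plen <= 0 || shift <= 0 || plen > tlen) {
        return -1;
    }
    for (int start = tlen - plen; start >= 0; start -= shift) {
        /* With tlen = 22, plen = 2, shift = 2 the windows start at
         * 20, 18, ..., 0, so the "::" at offset 9 is never tested. */
        if (memcmp(text + start, pat, (size_t)plen) == 0) {
            return start;
        }
    }
    return -1;
}
```

With the 22-character target "QBitArray::bitarr_data" and a shift of 2, every window starts at an even offset and the match at offset 9 is skipped. Appending one character makes the windows start at odd offsets, which is consistent with the observation that the match then succeeds.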

:; cat usr.c

#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <unicode/utypes.h>
#include <unicode/ustring.h>
#include <unicode/ucol.h>
#include <unicode/usearch.h>

int main(int argc, char **argv) {
        UCollator *coll;
        UErrorCode ec;
        /* value holds "::" plus the terminating NUL that u_uastrcpy() writes */
        UChar usrcstr[32], value[3];
        UStringSearch *search;
        int32_t pos = -1;
        /* no arguments: search forwards; any argument: search backwards */
        int first = (argc == 1);

        ec = U_ZERO_ERROR;
        coll = ucol_open("en_GB", &ec);

        if (U_FAILURE(ec)) {
                printf("ucol_open failed: %s\n", u_errorName(ec));
                exit(-1);
        }

        u_uastrcpy(usrcstr, "QBitArray::bitarr_data");
        u_uastrcpy(value, "::");

        ec = U_ZERO_ERROR;
        ucol_setAttribute(coll, UCOL_STRENGTH, UCOL_PRIMARY, &ec);
        ucol_setAttribute(coll, UCOL_CASE_LEVEL, UCOL_ON, &ec);
        ucol_setAttribute(coll, UCOL_ALTERNATE_HANDLING, UCOL_NON_IGNORABLE, &ec);

        /* pattern length 2, target length 22 */
        search = usearch_openFromCollator(value, 2, usrcstr, 22, coll, NULL, &ec);
        if (U_FAILURE(ec)) {
                printf("usearch_openFromCollator failed: %s\n", u_errorName(ec));
                exit(-1);
        }

        if (first) {
                pos = usearch_first(search, &ec);
        } else {
                pos = usearch_last(search, &ec);
        }

        printf("pos is %d\n", (int)pos);

        /* release the search iterator before the collator it uses */
        usearch_close(search);
        ucol_close(coll);

        return 0;
}

:; ./usr

pos is 9

:; ./usr a

pos is -1

Attachments

Change History

12/31/69 17:29:05 changed by auditor

  • 10/26/04 12:31:05 schererm changed notes2: assign: "" to "weiv", priority: "" to "high", target: "UNSCH" to "3.2", weeks: "" to "0.4",
  • 10/26/04 12:31:05 schererm moved from incoming to collation
  • Wed Sep 28 10:43:27 2005 weiv changed notes2: target: "3.2" to "3.6",
  • Fri Oct 13 18:12:07 2006 andy changed notes2: target: "3.6" to "3.8 Candidate",

10/04/07 12:08:19 changed by grhoten

  • load changed.
  • xref changed.
  • java changed.
  • description changed.
  • owner changed from weiv to michaelow.
  • milestone changed from 3.8 candidate to 4.0.
  • keywords deleted.
  • revw changed.

10/19/07 12:08:13 changed by michaelow

  • revw set to srl.

usearch_last skips over the matching text because, when shifting backwards, the shift function uses the wrong CE (collation element) to shift from. The last CE found has to be used in order to shift correctly, whether backwards or forwards.
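A character-level analogy may help show why the element used to look up the shift matters. This is a sketch under stated assumptions, not ICU's collation-element code: it is a Horspool-style right-to-left scan over plain chars, and the names build_backward_shift, horspool_last and key_offset are hypothetical. When the scan moves leftwards, the shift has to be keyed on the window's first character; keying it on the other end of the window lets the scan jump over a match.

```c
#include <string.h>

#define ALPHABET 256

/* Shift table for a right-to-left scan: for a character c found at the
 * window's first position, move left by the smallest i >= 1 such that
 * pat[i] == c, or by plen if c does not occur in pat[1..plen-1]. */
static void build_backward_shift(const char *pat, int plen,
                                 int shift[ALPHABET]) {
    for (int c = 0; c < ALPHABET; c++) {
        shift[c] = plen;
    }
    /* Descending loop so the smallest index is written last and wins. */
    for (int i = plen - 1; i >= 1; i--) {
        shift[(unsigned char)pat[i]] = i;
    }
}

/* key_offset picks which character of the current window drives the shift:
 * 0 is the window's first character (correct for a backward scan),
 * plen - 1 is its last character (the wrong end). */
static int horspool_last(const char *text, int tlen,
                         const char *pat, int plen, int key_offset) {
    int shift[ALPHABET];
    if (plen <= 0 || plen > tlen || key_offset < 0 || key_offset >= plen) {
        return -1;
    }
    build_backward_shift(pat, plen, shift);
    int start = tlen - plen;
    while (start >= 0) {
        if (memcmp(text + start, pat, (size_t)plen) == 0) {
            return start;
        }
        start -= shift[(unsigned char)text[start + key_offset]];
    }
    return -1;
}
```

For the target and pattern from this ticket, horspool_last(text, 22, "::", 2, 0) reaches the window ":b" at offset 10, sees ':' at its first position, shifts by 1 and finds the match at offset 9. With key_offset set to 1 the same window is keyed on 'b', shifts by 2 and skips the match.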

11/06/07 15:44:35 changed by michaelow

  • status changed from new to assigned.

06/05/08 17:59:29 changed by srl

  • description changed.

06/06/08 12:39:12 changed by srl

  • status changed from assigned to closed.
  • resolution set to fixed.
