Ticket #6612 (closed defect: fixed)

Bug contains 1 commit(s) | SVN Diffs for #6612

 

Opened 1 year ago

Last modified 6 months ago

ICU4J charset converter should not read source data beyond Buffer.limit()

Reported by: yoshito
Assigned to: michaelow
Priority: major
Milestone: 4.1.2
Component: conversion
Version: Current
Keywords:
Cc: son.two@gmail.com
Load:
Xref:
Java Version:
Operating System:
Project (C/J): ICU4J
Weeks: 0.2
Review: yoshito

Description

This problem was reported by Oleg Sukhodolsky to the ICU support mailing list. Below is the original description.

Hi,

I tried to use ICU4J with the JavaMail API to get a converter for
x-mac-cyrillic (which the JDK does not provide), and found that this
simple call

MimeUtility.decodeWord("=?x-mac-cyrillic?B?ODU4OV0=?=");

throws an exception:
java.nio.BufferOverflowException
       at java.nio.charset.CoderResult.throwException(CoderResult.java:259)
       at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:142)
       at java.lang.StringCoding.decode(StringCoding.java:173)
       at java.lang.String.<init>(String.java:444)
       at javax.mail.internet.MimeUtility.decodeWord(MimeUtility.java:834)

After investigating, I found that the problem is caused by the
capacity of the byte array that JavaMail passes to the String
constructor (and thus by the capacity of the ByteBuffer that the
String constructor passes to the decode() method).  It is a little
bigger than the size of the text, i.e. in terms of ByteBuffer,
capacity() > limit().  So I looked at the ICU code and (I think) I
have identified the cause of the problem in
CharsetMBCS.cnvMBCSSingleToBMPWithOffsets():

if (!cr[0].isError() && sourceArrayIndex < source.capacity() && !target.hasRemaining()) {
   /* target is full */
   cr[0] = CoderResult.OVERFLOW;
}

I believe source.limit() should be used instead of source.capacity() here.
So, here is a fix:

Index: src/com/ibm/icu/charset/CharsetMBCS.java
===================================================================
--- src/com/ibm/icu/charset/CharsetMBCS.java    (revision 24923)
+++ src/com/ibm/icu/charset/CharsetMBCS.java    (working copy)
@@ -2684,7 +2684,7 @@
                }
            }

-            if (!cr[0].isError() && sourceArrayIndex < source.capacity() && !target.hasRemaining()) {
+            if (!cr[0].isError() && sourceArrayIndex < source.limit() && !target.hasRemaining()) {
                /* target is full */
                cr[0] = CoderResult.OVERFLOW;
            }


Should I file a separate bug about it?  Or is it a known problem?

With best regards, Oleg.
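
The difference between capacity() and limit() described in the report can be illustrated with a small, self-contained sketch. It does not use ICU4J or JavaMail. The class LimitVersusCapacity and its two methods are hypothetical and only mirror the shape of the condition quoted above (the cr[0].isError() term is left out, and the index is treated as relative to the start of the buffer). The three extra bytes in the backing array are an assumption standing in for the oversized array the report attributes to JavaMail.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative only: this class and its methods are hypothetical, not ICU4J code.
class LimitVersusCapacity {

    // Shape of the check before the fix: the source bound is capacity().
    static CoderResult checkWithCapacity(int sourceIndex, ByteBuffer source, CharBuffer target) {
        if (sourceIndex < source.capacity() && !target.hasRemaining()) {
            return CoderResult.OVERFLOW; // target is full
        }
        return CoderResult.UNDERFLOW;
    }

    // Shape of the check after the fix: the source bound is limit().
    static CoderResult checkWithLimit(int sourceIndex, ByteBuffer source, CharBuffer target) {
        if (sourceIndex < source.limit() && !target.hasRemaining()) {
            return CoderResult.OVERFLOW; // target is full
        }
        return CoderResult.UNDERFLOW;
    }

    public static void main(String[] args) {
        // The Base64 payload of the encoded word from the report.
        byte[] payload = Base64.getDecoder().decode("ODU4OV0=");

        // Assumption: the caller hands over an array larger than the text.
        byte[] backing = new byte[payload.length + 3];
        System.arraycopy(payload, 0, backing, 0, payload.length);

        // wrap(array, offset, length) keeps capacity at the array length
        // while limit marks the end of the readable bytes.
        ByteBuffer source = ByteBuffer.wrap(backing, 0, payload.length);
        System.out.println("capacity=" + source.capacity() + " limit=" + source.limit());

        // Simulate a converter that has consumed every readable byte and
        // exactly filled a target sized for the text (the payload is ASCII).
        CharBuffer target = CharBuffer.allocate(payload.length);
        target.put(new String(payload, StandardCharsets.US_ASCII));
        int sourceIndex = source.limit();

        System.out.println("capacity check: " + checkWithCapacity(sourceIndex, source, target));
        System.out.println("limit check:    " + checkWithLimit(sourceIndex, source, target));
    }
}
```

In this sketch the capacity-based check reports an overflow even though no readable input remains, while the limit-based check reports underflow, i.e. the input was fully consumed.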

Attachments

Change History

11/04/08 11:39:56 changed by yoshito

  • cc set to son.two@gmail.com.

11/04/08 14:04:50 changed by michaelow

  • status changed from new to assigned.
  • revw set to yoshito.

12/15/08 22:20:52 changed by yoshito

  • status changed from assigned to closed.
  • resolution set to fixed.

09/15/09 07:59:48 changed by anonymous

