Ticket #5069 (new enhancement)

Bug contains 1 commit(s) | SVN Diffs for #5069

 

Opened 3 years ago

Last modified 1 year ago

parse UCD XML format

Reported by: markus.icu(at)gmail.com
Assigned to: markus
Priority: trivial
Milestone: UNSCH
Component: tools
Version:
Keywords: tools
Cc:
Load:
Xref:
Java Version:
Operating System: all
Project (C/J): ICU4C
Weeks: 2
Review:

Description

Parse the UCD XML format (the Unicode Character Database in XML). Do not change the ICU binary data file formats, at least not merely because the input format changes.

Use the toolutil XML parser. Use property [value] aliases for property and value parsing. Gather the preparation and generator code from each of the ICU UCD parser/generator tools (genprops, gennorm, etc.) and merge it.

Property [value] aliases may need to be parsed the old way so that they are available with correct enum values during runtime of the UCD XML parser.
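
To illustrate what "parsed the old way" could involve, here is a minimal Python sketch (not ICU tool code, which is C/C++) that reads lines in the style of PropertyValueAliases.txt and maps every alias of a value to its short name. The function name `parse_value_aliases` and the sample lines are assumptions made for illustration; mapping the short names to ICU enum constants is left out.

```python
def parse_value_aliases(lines):
    """Map (property, lowercased alias) -> short value name.

    Assumes the semicolon-separated layout
    'property ; short ; long [; more aliases]' with '#' comments.
    """
    aliases = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 3:
            continue  # not a complete alias line
        prop, short = fields[0], fields[1]
        for name in fields[1:]:
            aliases[(prop, name.lower())] = short
    return aliases


# Hypothetical sample input, written in the style of PropertyValueAliases.txt.
SAMPLE = """\
# sample lines
gc ; Lu ; Uppercase_Letter
gc ; Nd ; Decimal_Number ; digit
nt ; De ; Decimal
"""

table = parse_value_aliases(SAMPLE.splitlines())
print(table[("gc", "uppercase_letter")])  # Lu
```

With such a table loaded first, an XML attribute value such as gc="Lu" can be resolved by name before any numeric constant is assigned.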

Advantage: A single XML file provides all property data. In the old format, multiple files sometimes need to be read for a single property (for example, for numeric values).
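
The following Python sketch shows the single-file idea: numeric type and numeric value are read from the same element as the other properties. It assumes the UCD XML namespace and the attribute names `cp`, `na`, `gc`, `nt` and `nv`; the sample document and the function name `read_properties` are made up for illustration, and ranges and groups are ignored.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of the UCD XML format.
UCD_NS = "{http://www.unicode.org/ns/2003/ucd/1.0}"

# Hypothetical three-character excerpt; a real file carries many more attributes.
SAMPLE_XML = """\
<ucd xmlns="http://www.unicode.org/ns/2003/ucd/1.0">
  <repertoire>
    <char cp="0041" na="LATIN CAPITAL LETTER A" gc="Lu" nt="None" nv="NaN"/>
    <char cp="0035" na="DIGIT FIVE" gc="Nd" nt="De" nv="5"/>
    <char cp="00BD" na="VULGAR FRACTION ONE HALF" gc="No" nt="Nu" nv="1/2"/>
  </repertoire>
</ucd>
"""


def read_properties(xml_text):
    """Return {code point: {attribute name: value}} for single-code-point entries."""
    root = ET.fromstring(xml_text)
    props = {}
    for char in root.iter(UCD_NS + "char"):
        cp = char.get("cp")
        if cp is None:
            continue  # range entries are skipped in this sketch
        props[int(cp, 16)] = dict(char.attrib)
    return props


props = read_properties(SAMPLE_XML)
print(props[0xBD]["nt"], props[0xBD]["nv"])  # Nu 1/2
```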

Timing: Before the UCD XML format is ready for production use, there need to be tools verifying that the XML files reflect the UCD data precisely.
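
A verification tool of this kind could, for example, compare one property at a time between the two formats. The Python sketch below checks only General_Category, reading it from the `gc` attribute in the XML and from the third semicolon-separated field of UnicodeData.txt-style lines. All function names and sample data are hypothetical, and the sample text line for U+0035 has been altered on purpose so that the check has something to report.

```python
import xml.etree.ElementTree as ET

UCD_NS = "{http://www.unicode.org/ns/2003/ucd/1.0}"  # assumed UCD XML namespace


def gc_from_xml(xml_text):
    """General_Category per code point from <char cp=... gc=...> elements."""
    root = ET.fromstring(xml_text)
    return {int(c.get("cp"), 16): c.get("gc")
            for c in root.iter(UCD_NS + "char") if c.get("cp")}


def gc_from_unicode_data(lines):
    """General_Category per code point from UnicodeData.txt-style lines."""
    result = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        result[int(fields[0], 16)] = fields[2]
    return result


def find_mismatches(xml_gc, txt_gc):
    """List (code point, XML value, text value) wherever the two sources differ."""
    return [(cp, xml_gc.get(cp), txt_gc.get(cp))
            for cp in sorted(set(xml_gc) | set(txt_gc))
            if xml_gc.get(cp) != txt_gc.get(cp)]


SAMPLE_XML = """\
<ucd xmlns="http://www.unicode.org/ns/2003/ucd/1.0">
  <repertoire>
    <char cp="0035" gc="Nd"/>
    <char cp="0041" gc="Lu"/>
  </repertoire>
</ucd>
"""

# The gc field of the first line is deliberately wrong (No instead of Nd).
SAMPLE_TXT = """\
0035;DIGIT FIVE;No;0;EN;;5;5;5;N;;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
"""

mismatches = find_mismatches(gc_from_xml(SAMPLE_XML),
                             gc_from_unicode_data(SAMPLE_TXT.splitlines()))
print(mismatches)  # [(53, 'Nd', 'No')]
```

A real tool would also have to cover range entries, group inheritance in the XML, and every other property, all of which this sketch ignores.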

Change History

09/26/07 16:35:59 changed by markus

  • load changed.
  • xref changed.
  • java changed.
  • revw changed.
  • summary changed from RFE: parse UCD XML format to parse UCD XML format.
