unicode - How do I iterate through a UTF16 encoded string character by character?
I have a UTF16 encoded string, theUTF16String. It contains double-byte characters. I would like to iterate through it Unicode character by Unicode character. I understand that chunk expressions only work on single-byte characters?
An example
We have the following string:
abcαβɣ
We want to iterate through it and put each character on a line of its own in a container.
In LiveCode, there are two ways to get a character from a UTF16 string. If the string is displayed in a field, you can do
select char 3 of fld 1
and if you have Russian or Polish text in the field, this will correctly select one character. However, this feature isn't fully developed in LiveCode yet and will fail with many Chinese, Japanese, and Arabic (and other) languages. Therefore, it is better to use bytes for now:
select byte 5 to 6 of fld 1
The latter will be compatible with future versions of LiveCode, while the former may not be.
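The mapping from a character position to a byte range can be sketched as a small helper. This is only an illustration: it assumes every character occupies exactly two bytes, and the names selectUTF16Char, pCharIndex, pFieldNumber and tLastByte are hypothetical, not part of LiveCode.

```livecode
// Hypothetical helper: select the n-th 2-byte character of a field.
// Character n occupies bytes 2n - 1 to 2n, so character 3 is bytes 5 to 6.
command selectUTF16Char pCharIndex, pFieldNumber
   local tLastByte
   put pCharIndex * 2 into tLastByte
   select byte (tLastByte - 1) to tLastByte of field pFieldNumber
end selectUTF16Char
```

Under these assumptions, calling `selectUTF16Char 3, 1` would select bytes 5 to 6 of field 1.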
Anyway, you have the string in a variable, which means that you have to handle the string as bytes (you could use chars, but bytes and chars are dealt with in the same way in this case, because the data is in a variable). You can iterate through the variable in steps of two, i.e. one character at a time:
repeat with x = 1 to the number of bytes of theUTF16String step 2
   put byte x to x+1 of theUTF16String into myChar
   // do something with myChar here, e.g. reverse the bytes?
   put byte 2 of myChar & byte 1 of myChar after myNewString
end repeat
// myNewString now contains all of theUTF16String with the two bytes of each character swapped.
(You could do this in three lines instead of four, but for the purpose of this example I have added the line that stores the bytes in the variable myChar.)
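To get what the question asks for, each character on a line of its own, the same two-byte stepping can be adapted roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the data is little-endian UTF-16 without a byte order mark, and utf16CharsOnLines, pUTF16Data, tNewline and tResult are hypothetical names. Characters outside the Basic Multilingual Plane take four bytes in UTF-16 (a surrogate pair), so this sketch would split them across two lines.

```livecode
// Sketch: copy UTF-16 data so that each 2-byte character sits on its own line.
// Assumes little-endian UTF-16 without a byte order mark.
function utf16CharsOnLines pUTF16Data
   local tResult, tNewline
   // a UTF-16LE line feed is byte 10 followed by byte 0
   put numToByte(10) & numToByte(0) into tNewline
   repeat with x = 1 to the number of bytes of pUTF16Data step 2
      // separate characters with a line feed, without adding a trailing one
      if x > 1 then put tNewline after tResult
      put byte x to (x + 1) of pUTF16Data after tResult
   end repeat
   return tResult
end utf16CharsOnLines
```

It could then be called as `put utf16CharsOnLines(theUTF16String) into myContainer`. The result is still UTF-16 encoded data, so it has to be displayed or decoded as such.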