If a character string consists of valid Unicode UTF-8 or UTF-16 data, the UVALID function returns the value zero. If a character
string contains invalid Unicode data, the UVALID function returns the index of the first invalid element.
The function type is integer.
General Format
FUNCTION UVALID (argument-1)
Arguments
- Argument-1 must be of class alphabetic, alphanumeric, national, or UTF-8.
Returned Values
- The returned value is an integer, which differs based on argument-1:
  - If argument-1 is of class alphabetic, alphanumeric, or UTF-8, and it consists of valid UTF-8 encoded Unicode data, the
    returned value is zero.
  - If argument-1 is of class alphabetic, alphanumeric, or UTF-8, and it contains invalid UTF-8 encoded Unicode data, the
    returned value is the position of the first byte where the invalid UTF-8 data starts.
  - If argument-1 is of class national, and it consists of valid UTF-16 encoded Unicode data, the returned value is zero.
  - If argument-1 is of class national, and it contains invalid UTF-16 encoded Unicode data, the returned value is the
    position of the first UTF-16 encoding unit where the invalid UTF-16 data starts. This position is one plus the number of
    well-formed UTF-16 encoding units that precede the invalid data.
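The return-value rules above can be modeled outside COBOL. The following Python sketch is an illustration only, not the compiler's implementation: the names `uvalid_utf8` and `uvalid_utf16` are hypothetical, and it assumes that Python's strict UTF-8 decoder and a plain surrogate-pairing check match the function's notion of well-formed data.

```python
def uvalid_utf8(data: bytes) -> int:
    """Return 0 for well-formed UTF-8, else the 1-based position of the
    first byte where the invalid data starts (assumed model of UVALID)."""
    try:
        data.decode("utf-8")  # strict decoding rejects ill-formed sequences
    except UnicodeDecodeError as err:
        return err.start + 1  # err.start is a 0-based byte index
    return 0


def uvalid_utf16(units: list[int]) -> int:
    """Return 0 for well-formed UTF-16, else one plus the number of
    well-formed 16-bit encoding units that precede the invalid data."""
    i = 0
    while i < len(units):
        unit = units[i]
        if 0xD800 <= unit <= 0xDBFF:
            # A high surrogate must be followed by a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return i + 1
        if 0xDC00 <= unit <= 0xDFFF:
            # A low surrogate with no preceding high surrogate.
            return i + 1
        i += 1
    return 0


print(uvalid_utf8(b"ABC"))                  # 0
print(uvalid_utf8(b"A\xc3(B"))              # 2
print(uvalid_utf16([0x41, 0xDB40, 0xDD00])) # 0
print(uvalid_utf16([0x41, 0xD800, 0x42]))   # 2
```

In the second call the byte 0xC3 announces a two-byte sequence but is followed by 0x28, so the invalid data starts at byte position 2. In the last call the unpaired high surrogate is the second encoding unit, so the result is one plus the single well-formed unit before it.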
Note: The UVALID function indicates whether the character string contains well-formed Unicode UTF-8 or UTF-16 data. It does not
indicate whether any or all of the Unicode code points represented by the character string are assigned to characters.
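The distinction in this note can be seen with a code point that has no assigned character. The Python sketch below is an illustration under assumptions: it uses U+40000, which lies in a plane with no assigned characters at the time of writing, and it relies on the Unicode database bundled with the Python interpreter.

```python
import unicodedata

# U+40000 is assumed to be unassigned (general category "Cn").
code_point = chr(0x40000)

# The encoding is well-formed UTF-8 even though no character is assigned.
encoded = code_point.encode("utf-8")
round_trip = encoded.decode("utf-8")  # strict decoding succeeds

is_unassigned = unicodedata.category(code_point) == "Cn"

print(encoded.hex())             # f1808080
print(round_trip == code_point)  # True
print(is_unassigned)
```

A validity check of the kind described here would accept these four bytes, because only the encoding form is examined, not the character assignment.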
Comments
This function supports ideographic variation selectors (IVS), which allow font software to select a glyph other than
the default. (If the variation does not exist or is not supported, the font software ignores the selector.) An IVS is a Unicode
character in the range U+E0100 to U+E01EF. In UTF-16 strings these are the surrogate pairs 0xDB40 0xDD00 through 0xDB40 0xDDEF,
and in UTF-8 strings they are the byte sequences 0xF3A08480 through 0xF3A087AF.
A Unicode character followed by an IVS is treated as one character when this function is processed.
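The IVS ranges quoted above follow directly from the UTF-16 and UTF-8 encoding rules. The Python sketch below derives them and shows one way to count a base character plus its IVS as a single character. The helper names `utf16_units` and `count_characters` are hypothetical, and the counting rule is an assumed reading of the statement above, not the compiler's algorithm.

```python
IVS_FIRST, IVS_LAST = 0xE0100, 0xE01EF


def utf16_units(text: str) -> list[int]:
    """Return the 16-bit UTF-16 encoding units of text."""
    raw = text.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def count_characters(text: str) -> int:
    """Count characters, treating each IVS as part of the preceding
    character (assumes every IVS follows a base character)."""
    return sum(1 for ch in text if not IVS_FIRST <= ord(ch) <= IVS_LAST)


for cp in (IVS_FIRST, IVS_LAST):
    units = " ".join(f"{u:04X}" for u in utf16_units(chr(cp)))
    print(f"U+{cp:05X}  UTF-16: {units}  UTF-8: {chr(cp).encode('utf-8').hex().upper()}")
# U+E0100  UTF-16: DB40 DD00  UTF-8: F3A08480
# U+E01EF  UTF-16: DB40 DDEF  UTF-8: F3A087AF

sample = "\u845B\U000E0100"  # an ideograph followed by the first IVS
print(count_characters(sample))      # 1
print(len(utf16_units(sample)))      # 3
print(len(sample.encode("utf-8")))   # 7
```

The sample occupies three UTF-16 encoding units or seven UTF-8 bytes, yet counts as one character under this reading.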