Module `std.utf`

Encode and decode UTF-8, UTF-16 and UTF-32 strings.

UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'.

Functions

Name	Description
`byCodeUnit`	Iterates a range of `char`, `wchar`, or `dchar` elements by code unit.
`codeLength`	Returns the number of code units that are required to `encode` `str` in a string whose character type is `C`. This is particularly useful when slicing one string with the length of another and the two string types use different character types.
`codeLength`	Returns the number of code units that are required to `encode` the code point `c` when `C` is the character type used to `encode` it.
`count`	Returns the total number of code points encoded in `str`.
`decode`	Decodes and returns the code point starting at `str[index]`. `index` is advanced to one past the decoded code point. If the code point is not well-formed, then a `UTFException` is thrown and `index` remains unchanged.
`decodeFront`	`decodeFront` is a variant of `decode` which specifically decodes the first code point. Unlike `decode`, `decodeFront` accepts any input range of code units (rather than just a string or random access range). It also takes the range by `ref` and pops off the elements as it decodes them. If `numCodeUnits` is passed in, it is set to the number of code units in the decoded code point.
`encode`	Encodes `c` in `str`'s encoding and appends it to `str`.
`encode`	Encodes `c` into the static array, `buf`, and returns the actual length of the encoded character (a number between `1` and `4` for `char[4]` buffers and a number between `1` and `2` for `wchar[2]` buffers).
`encode`	Encodes `c` in `str`'s encoding and appends it to `str`.
`isValidDchar`	Returns whether `c` is a valid UTF-32 character.
`stride`	`stride` returns the length of the UTF-32 sequence starting at `index` in `str`.
`stride`	`stride` returns the length of the UTF-16 sequence starting at `index` in `str`.
`stride`	`stride` returns the length of the UTF-8 sequence starting at `index` in `str`.
`stride`	`stride` returns the length of the UTF-16 sequence starting at `index` in `str`.
`strideBack`	`strideBack` returns the length of the UTF-32 sequence ending one code unit before `index` in `str`.
`strideBack`	`strideBack` returns the length of the UTF-16 sequence ending one code unit before `index` in `str`.
`strideBack`	`strideBack` returns the length of the UTF-8 sequence ending one code unit before `index` in `str`.
`toUCSindex`	Given `index` into `str` and assuming that `index` is at the start of a UTF sequence, `toUCSindex` determines the number of UCS characters up to `index`. In other words, `index` is the position of a code unit at the beginning of a code point, and the return value is the number of code points that precede it in the string.
`toUTF16`	Encodes string `s` into UTF-16 and returns the encoded string.
`toUTF16z`	`toUTF16z` is a convenience function for `toUTFz!(const(wchar)*)`.
`toUTF32`	Encodes string `s` into UTF-32 and returns the encoded string.
`toUTF8`	Encodes string `s` into UTF-8 and returns the encoded string.
`toUTFindex`	Given a UCS index `n` into `str`, returns the UTF index. So, `n` is how many code points into the string the code point is, and the array index of the code unit is returned.
`validate`	Checks whether `str` is well-formed Unicode and throws a `UTFException` if it is not.
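
The following is a minimal sketch of how several of the functions above fit together. The sample string and variable names are chosen only for illustration; the comments state the UTF-8 sizes that the assertions rely on.

```d
import std.utf : codeLength, count, decode, encode, stride, toUTF16, toUTF32;

void main()
{
    // In UTF-8: 'a' is 1 code unit, 'é' is 2, '€' is 3.
    string s = "aé€";

    assert(s.length == 6);  // number of UTF-8 code units
    assert(count(s) == 3);  // number of code points

    // stride reports how many code units the sequence at an index occupies.
    assert(stride(s, 0) == 1);
    assert(stride(s, 1) == 2);
    assert(stride(s, 3) == 3);

    // decode returns the code point at index and advances index past it.
    size_t index = 1;
    dchar c = decode(s, index);
    assert(c == 'é');
    assert(index == 3);

    // Encoding into a static buffer returns the number of code units written.
    char[4] buf;
    size_t n = encode(buf, '€');
    assert(n == 3);
    assert(buf[0 .. n] == "€");

    // Encoding can also append to a dynamic array.
    wchar[] w;
    encode(w, 'a');
    encode(w, '\U0001F600'); // outside the BMP, so UTF-16 needs a surrogate pair
    assert(w.length == 3);

    // codeLength answers "how many code units would this take in encoding C?"
    assert(codeLength!wchar(s) == 3);
    assert(codeLength!char('€') == 3);

    // Whole-string transcoding.
    wstring ws = toUTF16(s);
    dstring ds = toUTF32(s);
    assert(ws.length == 3);
    assert(ds.length == 3);
}
```

Note that `s.length` counts code units, not code points, which is why `count` and `codeLength` are needed when mixing string types.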

Classes

Name	Description
`UTFException`	Exception thrown on errors in `std.utf` functions.
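
As a rough illustration of when `UTFException` is thrown, the sketch below feeds a deliberately malformed UTF-8 buffer to `validate` and `decode`. The buffer contents are an assumption made for the example: the byte `0xFF` never appears in well-formed UTF-8.

```d
import std.exception : assertNotThrown, assertThrown;
import std.utf : decode, UTFException, validate;

void main()
{
    // Well-formed input passes validation.
    assertNotThrown!UTFException(validate("héllo"));

    // 0xFF is never valid in UTF-8, so this buffer is malformed.
    char[] bad = ['a', 'b', '\xFF', 'c', 'd'];
    assertThrown!UTFException(validate(bad));

    // decode also throws on the malformed unit and leaves index unchanged.
    size_t index = 2;
    try
    {
        decode(bad, index);
        assert(false, "expected a UTFException");
    }
    catch (UTFException e)
    {
        assert(index == 2);
    }
}
```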

Templates

Name	Description
`byUTF`	Iterates an input range of characters by character type `C`.
`toUTFz`	Returns a C-style zero-terminated string equivalent to `str`. `str` must not contain embedded `'\0'` characters, as any C function will treat the first `'\0'` that it sees as the end of the string. If `str.empty` is `true`, then a string containing only `'\0'` is returned.
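
A short sketch of both templates follows. It assumes `strlen` from `core.stdc.string` as a stand-in for any C function that expects a zero-terminated string; the string values are illustrative only.

```d
import core.stdc.string : strlen;
import std.algorithm.comparison : equal;
import std.utf : byUTF, toUTFz;

void main()
{
    // toUTFz yields a pointer to a zero-terminated string usable from C.
    const(char)* p = toUTFz!(const(char)*)("hello");
    assert(strlen(p) == 5);

    // An empty string still yields a valid pointer to a lone terminator.
    const(wchar)* w = toUTFz!(const(wchar)*)("");
    assert(*w == '\0');

    // byUTF lazily re-encodes: U+00F6 is two code units in UTF-8 ...
    assert("hell\u00F6".byUTF!char().equal(['h', 'e', 'l', 'l', 0xC3, 0xB6]));
    // ... and a single code unit in UTF-32.
    assert("hell\u00F6".byUTF!dchar().equal("hell\u00F6"d));
}
```

Unlike `toUTF8`, `toUTF16`, and `toUTF32`, `byUTF` returns a lazy range rather than a newly built string.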

Enum values

Name	Type	Description
`replacementDchar`		Inserted in place of invalid UTF sequences.

Aliases

Name	Type	Description
`byChar`		Iterate an input range of characters by char, wchar, or dchar. These aliases simply forward to `byUTF` with the corresponding C argument.
`byDchar`		Iterate an input range of characters by char, wchar, or dchar. These aliases simply forward to `byUTF` with the corresponding C argument.
`byWchar`		Iterate an input range of characters by char, wchar, or dchar. These aliases simply forward to `byUTF` with the corresponding C argument.
`UseReplacementDchar`	`Flag!("useReplacementDchar")`	Whether or not to replace invalid UTF with `replacementDchar`.
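
The aliases and `replacementDchar` can be seen together in the sketch below. The inputs are made up for the example, and the malformed buffer relies on the default behaviour of `byUTF`, which substitutes `replacementDchar` instead of throwing. Because the exact number of code units consumed by an invalid sequence is not specified here, the example only checks that a replacement character appears.

```d
import std.algorithm.searching : canFind;
import std.range : walkLength;
import std.utf : byChar, byCodeUnit, byDchar, byWchar, replacementDchar;

void main()
{
    // In UTF-8: 'a' is 1 code unit, '€' is 3.
    string s = "a€";

    assert(s.byCodeUnit.walkLength == 4); // UTF-8 code units, no decoding
    assert(s.byChar.walkLength == 4);     // UTF-8 code units
    assert(s.byWchar.walkLength == 2);    // UTF-16 code units
    assert(s.byDchar.walkLength == 2);    // UTF-32 code units (code points)

    // replacementDchar is U+FFFD.
    assert(replacementDchar == '\uFFFD');

    // 0xFF is never valid UTF-8; by default it is replaced, not rejected.
    char[] bad = ['a', '\xFF', 'b'];
    assert(bad.byDchar.canFind(replacementDchar));
}
```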
