RichTextKit v0.4

Basic Concepts

The page describes some basic concepts that are important to understand when working with RichTextKit.

Rich Strings

The easiest way to work with RichTextKit is with the RichString class which provides a convenient set of methods for constructing richly decorated text.

// Create a RichString
var rs = new RichString()
    .Alignment(TextAlignment.Center)
    .FontFamily("Segoe UI")
    .MarginBottom(20)
    .Add("Welcome To RichTextKit", fontSize:24, fontWeight: 700, fontItalic: true)
    .Paragraph().Alignment(TextAlignment.Left)
    .FontSize(18)
    .Add("This is a test string");

If you want to wrap and/or crop it, set the Max properties:

rs.MaxWidth = 640;
rs.MaxHeight = 480;

And then you can paint it:

// Paint it
rs.Paint(skia_canvas, new SKPoint(50, 50));

You can also get it's measured size:

Console.WriteLine($"Size: {rs.MeasuredWidth} x {rs.MeasuredHeight});

and lots more. See the RichString class for details.

Text Blocks

Text blocks are a lower level concept that describes a single block of text. Internally RichString builds a single text block for each paragraph.

Text blocks can contain forced line breaks with a newline character (\n). These should be considered as in-paragraph "soft returns", as opposed to hard paragraph terminators.

Currently RichTextKit doesn't support non-text inline elements, although this may be added in the future.

Text blocks are represented by the TextBlock class.

Styles

Styles are used to describe the display attributes of text in a text block. A text block is comprised of one of more runs of text each tagged with a single style and these text runs are referred to a "style runs".

Style runs are represented by the StyleRun class.

Styles are only used if dealing with TextBlocks directly and generally don't need to be used if you're using the higher-level RichString interface.

Font Fallback

Font fallback is the process of switching to a different font when the specified font doesn't contain the required glyphs.

Font fallback is often required to display emoji characters (since most fonts don't include emoji glyphs), for most complex scripts (eg: Arabic) and Asian scripts.

RichTextKit uses SkiaSharp's MatchCharacter function to resolve the typeface to use for font fallback.

Text Shaping

For most Latin based languages, rendering text is simply a matter of placing one glyph after another in a left-to-right manner. Other languages however require a more complicated process that often involves drawing multiple glyphs for a single character. This process is called "text shaping".

By default Skia (and therefore ShiaSharp) doesn't do text shaping and often displays text incorrectly. RichTextKit uses HarfBuzz for text shaping.

Font Runs

Font run are derived by splitting style runs into smaller segments when text is wrapped onto a new line, or when a font fallback is required.

Each font run represents a single run of text that uses the same font and other style attributes for every character in the run.

Font runs are represented by the FontRun class.

Style Runs vs Font Runs

Style runs and font runs are similar concepts but serve two different purposes:

  • Style runs describe the logical view of a text block (before layout)
  • Font runs describe the physical view of a text block (after layout)

Lines

After a text block has been laid out it results in a set of lines each consisting of one or more font runs.

Lines are represented by the TextLine class.

Bi-Directional, LTR and RTL Text

Bi-directional text is the process of correctly displaying text for mixed left-to-right and right-to-left based languages.

RichTextKit includes an implementation of the Unicode Bi-directional Text Algorithm (UAX #9) and each text block has a "base direction" that controls the default layout order of text in that paragraph.

The text direction of spans within the text block can be controlled using either:

  • Embedded control characters (as specifed by UAX #9)
  • By setting the text direction of style runs (See the IStyle)

When setting text direction property on style runs, the text is processed in the same manner as an "isolating sequence" as described in UAX #9.

UTF-16, UTF-32, Code Points and Clusters

Internally, RichTextKit works with UTF-32 encoded text. Specifically it uses the int type to represent a character rather than the char type.

To avoid confusion, the API and code base use the term "code point" to refer to a single UTF-32 character.

Since C# strings are encoded as UTF-16, when a string is added to a text block it will be automatically converted to UTF-32. It's important to note that indexes into the converted string may no longer match indicies into the original C# string. (although RichTextKit does provide functions to map between them).

RichTextKit uses the term CodePointIndex to refer to the index of a code point in a UTF-32 buffer.

Also note that line endings \r and \r\n are automatically normalized to \n. This conversion can also affect the mapping of code point indicies to character indicies in the original string.

(Note: \n\r style line endings aren't supported)

Some complex scripts require more that one code point to describe a single user perceived character. These groupings of code points are called "clusters". This is the same concept as used by HarfBuzz.

Measurement and Size Units

All measurements and sizes used by RichTextKit are logical.

The final size of rendered text will depend on the configuration of the target Canvas scaling and it's up to you to determine the appropriate system of measurement to use.