In our application, text with bullet lists is edited using the RichEditControl. The resulting RTF text is read from the "RtfText" property and stored in a database. Later we load that RTF text from the database into an instance of RichEditDocumentServer and obtain plain text by reading the "Text" property.
When a bullet list is pasted into the editor, the underlying RTF for the bullet symbol looks like this:
\u-3913\'3f\tab
The same bullet list symbol when saved from Word 2016 looks like this:
\f3 \'b7\tab
When formatting a bullet list directly in the RichEditControl as opposed to pasting one, the bullet list symbol looks like this:
\u183\'b7\tab
In the first case, the character with code f0b7 is written (-3913 when treated as a signed 16-bit integer), followed by the character with code 3f (the ? character, which serves as the fallback for RTF readers that do not handle \u). No font is defined. Something causes this to be rendered with the correct font anyway, even though most fonts do not have glyphs for characters in that range (it is in the Private Use Area, after all). When converting to plain text, this character comes out unchanged and appears as an "invalid character" symbol in most text editors.
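To make the arithmetic above easy to check, here is a small Python sketch (not part of the attached demo project; the helper names are made up for illustration). It converts the signed argument of \u back to an unsigned code point and tests whether it falls into the Private Use Area of the Basic Multilingual Plane, U+E000 to U+F8FF.

```python
def rtf_u_to_codepoint(n: int) -> int:
    # RTF writes the \u argument as a signed 16-bit integer,
    # so masking with 0xFFFF recovers the unsigned UTF-16 code unit.
    return n & 0xFFFF


def is_private_use(cp: int) -> bool:
    # Private Use Area of the Basic Multilingual Plane.
    return 0xE000 <= cp <= 0xF8FF


# -3913 is the pasted bullet, 183 is the bullet created in the editor.
for n in (-3913, 183):
    cp = rtf_u_to_codepoint(n)
    print(f"\\u{n} -> U+{cp:04X}, private use: {is_private_use(cp)}")
```

This reports U+F0B7 (private use) for the pasted bullet and U+00B7 (not private use) for the bullet created in the editor.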
In the second case, font 3 is selected, which is "Symbol" in this document, followed by the character with code b7, which is the middle dot character. The "Symbol" font has a glyph for that character, and converting to plain text works without problems.
In the third case, no font is defined either, but the middle dot character (b7) is written. Converting to plain text works without problems here as well.
The issue is that pasting into the editor and formatting directly in the editor give different results. I would expect pasting to leave the original character b7 unchanged instead of changing it to the private use area character f0b7.
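Until this is resolved, one possible stopgap on the application side is to post-process the plain text. The Python sketch below only illustrates the idea and is not a RichEdit API; the function name is hypothetical. It assumes that every character in U+F020 to U+F0FF came from a symbol-encoded font and can be replaced by the character whose code is the low byte, which turns f0b7 back into b7. That assumption may not hold for other private use characters in the text.

```python
def fold_symbol_pua(text: str) -> str:
    # Assumption: code points U+F020..U+F0FF are symbol-font characters
    # whose original single-byte code is the low byte of the code point.
    return "".join(
        chr(ord(ch) & 0xFF) if 0xF020 <= ord(ch) <= 0xF0FF else ch
        for ch in text
    )


# Plain text as it comes out for a pasted bullet: U+F0B7, tab, item text.
sample = "\uf0b7\tFirst item"
cleaned = fold_symbol_pua(sample)
print([hex(ord(ch)) for ch in cleaned[:2]])
```

Characters outside that range are left untouched, so text that already contains b7 is not affected.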
A small demo project is attached to demonstrate this.
Just tested with 19.1.3 as well, same behavior.
Hello Markus,
I have reproduced the behavior you described. However, I need additional time to research it. I will contact you as soon as I have any results.