Deltas from the Quill rich text editor seem to represent plain paragraphs as simple newlines in the text with no attributes. Is it possible to have paragraphs represented explicitly, like other block elements, and have the editor accept the revised delta format?
For example:
...
{
attributes: {para:true},
insert: '\n'
}
No. The purpose of attributes is to represent formatting, so if there is no formatting, there is nothing to record. This is also how inline attributes work.
An attribute also does not necessarily represent a block element. For example, text alignment is implemented as a class or inline style.
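To illustrate the point, here is a rough Python sketch (not Quill code) that walks a delta's JSON ops and groups text into lines. It assumes block formats arrive as attributes on an op whose insert is only newlines, which is how the example in the question is shaped; the function name delta_to_lines is made up for this sketch. A plain paragraph simply comes out with an empty attribute dict.

```python
def delta_to_lines(ops):
    """Split delta ops into (text, block_attributes) pairs, one per line."""
    lines = []
    current = ""
    for op in ops:
        text = op.get("insert")
        if not isinstance(text, str):
            continue  # embeds (images etc.) are skipped in this sketch
        # Only an op made purely of newlines is treated as carrying block formats.
        is_newline_op = text != "" and text.strip("\n") == ""
        line_attrs = op.get("attributes", {}) if is_newline_op else {}
        parts = text.split("\n")
        for i, part in enumerate(parts):
            current += part
            if i < len(parts) - 1:  # a newline ends the current line
                lines.append((current, line_attrs))
                current = ""
    if current:
        lines.append((current, {}))  # trailing text without a final newline
    return lines


ops = [
    {"insert": "Title"},
    {"insert": "\n", "attributes": {"header": 1}},
    {"insert": "Plain paragraph\nAnother\n"},
]
for text, attrs in delta_to_lines(ops):
    print(repr(text), attrs)
```

So a consumer that wants explicit paragraphs can treat "line with no block attributes" as the paragraph case without changing the delta format.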
I want to change the color of the date text tag {{_es_:signer:date}}. Is there a way to do that? I have been using EchoSign for 4 days and have not found a way to change the color.
This is what they have in their Text tagging documentation,
"The form field formatting (font size, type, color, etc.) is determined by the format of the first ‘{‘. To ensure correct processing of Text Tags by EchoSign tag definitions should be specified in commonly occurring fonts within the document (Helvetica, Times New Roman, Arial, Verdana or Courier). Text Tag definitions are case sensitive and must be specified in lowercase text."
Or, if you want to set the color of the field dynamically, I guess you'll have to look into Acrobat JavaScript.
This should work:
{{_es_:signer:date:font(color=green)}}
Probably too late, but it might help someone.
I'd like to change the font size of a chunk of RTF without erasing the bold / italic / underline formatting (an issue similar to the one in this question). The accepted answer is to modify the selection of the text box until the SelectionFont property is null, in order to find runs of consistently formatted text which can be modified individually. Sounds reasonable. However, the actual behavior of the RichTextBox control seems to be inconsistent with the documentation.
In the documentation for RichTextBox.SelectionFont MSDN states:
If the current text selection has more than one font specified, this
property is null.
However, this code which uses mixed bold / regular text doesn't behave as you'd expect:
var rtb = new RichTextBox {
    // Verbatim string (@"...") so the RTF backslashes are not treated as escapes
    Rtf = @"{\rtf1 This is \b bold\b0.}"
};
rtb.SelectAll();
// Now you'd expect rtb.SelectionFont to be null,
// but it actually returns a Font object
Is there any other reliable way of formatting the text so that I can change the font size without clobbering the other formatting? (Manipulating the RTF directly is OK; I'm not absolutely set on using WinForms to achieve this.)
I've given up on trying to go through Winforms to fix this. As I'm applying the change to a whole document (rather than just one portion), it turns out that it's not too hard to modify the RTF directly.
In this case I'm interested in the font size, which is represented by the \fs command. So to replace all the 8.5pt text with 10pt text, you can replace \fs17 with \fs20. (Yes, RTF font sizes come in units of half a point, apparently).
This seems to work well enough, although it does feel like one of those "let's mangle our HTML using regular expressions" type solutions, so I'm not convinced that it's very robust.
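For what it's worth, the replacement can be made a little less fragile by requiring that the digits end right after the size, so that \fs17 does not also match \fs170. Below is a minimal sketch of that idea in Python (the same regex works in .NET); the function name resize_rtf_font is made up for illustration. It still does not understand RTF structure, so literal text such as an escaped backslash followed by "fs17" would be changed too.

```python
import re


def resize_rtf_font(rtf: str, old_half_points: int, new_half_points: int) -> str:
    """Replace every \\fs<old> control word with \\fs<new>.

    RTF font sizes are given in half-points, so 8.5pt is 17 and 10pt is 20.
    """
    # (?![0-9]) stops \fs17 from matching the start of \fs170.
    pattern = re.compile(r"\\fs" + str(old_half_points) + r"(?![0-9])")
    # A function replacement avoids backslash-escaping rules in the template.
    return pattern.sub(lambda match: "\\fs" + str(new_half_points), rtf)


sample = r"{\rtf1 \fs17 small \b bold\b0 \fs170 huge}"
print(resize_rtf_font(sample, 17, 20))
# prints {\rtf1 \fs20 small \b bold\b0 \fs170 huge}
```

The bold markers are untouched because only the \fs control word is rewritten.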
Take a look at this:
Changing font for richtextbox without losing formatting
I think it's the same issue. LarsTech's solution is working perfectly for me.
I've seen this link:
http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Content-Extraction-Tika
What I get from Tika is plain text without any styling, which Solr then indexes for search.
Is it possible to get the text back from Solr with its style?
In other words, we need to show the text with its original style in the Solr search results.
If you think about it, what is "original style" in a PDF? What components of the "style" do you want to keep?
It's not just font and weight, it's stroke, fill, angle, path, graphics, tracking, transparency, transformations and more. If you had all that, how would you display it in your UI/Web?
You can't really replicate the original style any way other than displaying the original PDF. So that's the way people usually do it if they want the original formatting.
Otherwise, they just use the pure text.
How can you loop through all the "words" (words are delimited by spaces) in an RTB (WPF control) to see which ones are italicized?
thanks
Well, your task seems to be a quite complicated one.
The contents of a RichTextBox is a FlowDocument which can be found at the property Document. The FlowDocument, in turn, consists of several Blocks.
Each of the Blocks can be a Paragraph, a Section, a Table etc. You'll need to analyze each of them separately.
For the Paragraph, it consists of several Inlines, each of them can be a Span, which in turn may be an Italic. The Italic represents italicized text. The Italic can, in turn, have other inlines, containing other Spans (for example, Hyperlinks, which you may or may not want to include into your result).
You basically need to traverse the whole structure recursively and pick the text out of your Italics. A special case may be words where only a part is italicized; you'll need a strategy for those.
I am unaware of any easier methods to achieve what you want. HTH.
Edit:
Perhaps an easier alternative would be to traverse all the text using a TextPointer from the beginning (richTextBox.Document.ContentStart), moving to the next position with position.GetNextContextPosition(LogicalDirection.Forward), and testing whether your current position is inside an Italic using position.Parent. Take care, however, that the Italic can be a non-immediate parent, so you may need to walk several parents upwards. Disclaimer: I have never tried this idea in my own code.
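TextPointer tp = RTB.Document.ContentStart;
TextRange word = WordBreaker.GetWordRange(tp);
while (word.End.GetNextInsertionPosition(LogicalDirection.Forward) != null)
{
    // GetPropertyValue returns DependencyProperty.UnsetValue when the word mixes styles,
    // so this only matches words that are italic throughout.
    if (FontStyles.Italic.Equals(word.GetPropertyValue(TextElement.FontStyleProperty)))
    {
        // word.Text is an italicized word; handle it here
    }
    word = WordBreaker.GetWordRange(word.End.GetNextInsertionPosition(LogicalDirection.Forward));
}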
with WordBreaker class from
Link
I'm looking to take the output of a WPF RichTextBox which is locked down to only allow certain formatting commands (Bold, Underlined and Italic), and parse it to be plaintext with HTML tags denoting the formatting. This is so that the formatting information can be picked up and parsed by an Oracle Publishing interface.
All other information, such as font sizes and colors, is not important, as it will be handled by the Publishing template further down the line.
Ideally then we would have something like the following, stripping out all other rtf tags:
This is <b>some bold text, with <i>this bit</i> italic as well</b>
Is there a relatively easy way to do this? I've seen some Regex strings, but they always seem to let unwanted RTF material through. I don't really want to use a commercial solution, as it's quite a small problem.
Any ideas?
You should parse RTF and replace necessary control codes with HTML tags. Considering complexity of RTF, I don't think Regex will be enough.
Rich Text Format (RTF) Specification, version 1.6. Syntax is relatively easy, you just need to process control codes like \b for bold etc., I think.
NRTFTree - A class library for RTF processing in C#. Its SAX parser is probably what you need.
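If you would rather see what "process the control codes" looks like before pulling in a library, here is a small hand-rolled sketch in Python; the same structure ports to C#. It is not a full RTF parser and is not how NRTFTree works: it only tracks \b, \i and \ul (with their 0 / \ulnone off forms), restores formatting at group ends, and skips a few header groups. The function name rtf_to_html and the SKIPPED list are my own choices, and \'hh hex escapes and Unicode escapes are not handled.

```python
import re
from html import escape

# One token per match: control word (optional number, one optional trailing
# space), control symbol, group brace, or a run of plain text.
TOKEN = re.compile(r"\\([a-z]+)(-?\d+)? ?|\\(.)|([{}])|([^\\{}]+)", re.S)
TAGS = {"b": "b", "i": "i", "ul": "u"}  # RTF toggle -> HTML tag
SKIPPED = {"fonttbl", "colortbl", "stylesheet", "info"}  # header groups


def rtf_to_html(rtf: str) -> str:
    out, open_tags, saved = [], [], []
    active = set()     # HTML tags that should apply to the next text run
    skip_depth = None  # group depth of a header group being skipped

    def emit(text):
        # Close innermost tags until every tag still open is wanted.
        while any(tag not in active for tag in open_tags):
            out.append(f"</{open_tags.pop()}>")
        for tag in ("b", "i", "u"):
            if tag in active and tag not in open_tags:
                out.append(f"<{tag}>")
                open_tags.append(tag)
        out.append(escape(text, quote=False))

    for word, param, symbol, brace, text in TOKEN.findall(rtf):
        if brace == "{":
            saved.append(set(active))  # formatting is scoped to the group
        elif brace == "}":
            if skip_depth is not None and len(saved) == skip_depth:
                skip_depth = None
            active = saved.pop() if saved else set()
        elif skip_depth is not None:
            continue  # inside a skipped header group
        elif word in SKIPPED or symbol == "*":
            skip_depth = len(saved)  # skip until this group closes
        elif word in TAGS:
            if param == "0":
                active.discard(TAGS[word])
            else:
                active.add(TAGS[word])
        elif word == "ulnone":
            active.discard("u")
        elif word == "par":
            emit("\n")
        elif symbol and symbol in "\\{}":
            emit(symbol)  # escaped literal character
        elif text:
            text = text.replace("\r", "").replace("\n", "")  # raw line breaks are ignored
            if text:
                emit(text)
    while open_tags:
        out.append(f"</{open_tags.pop()}>")
    return "".join(out)


sample = (r"{\rtf1\ansi{\fonttbl{\f0 Arial;}}\f0 This is \b some bold text, "
          r"with \i this bit\i0  italic as well\b0}")
print(rtf_to_html(sample))
# prints This is <b>some bold text, with <i>this bit</i> italic as well</b>
```

Because unknown control words are dropped rather than passed through, stray RTF material cannot leak into the output the way it does with a search-and-replace regex.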