The Unofficial Newsletter of Delphi Users
by Robert Vivrette
--------------------------------------------------------------------------------
Automatic Syntax Highlighting Using a RichEdit Control (Updated with fixes)
Part 1 of "A Tale of Two RichEdits"
by Jon Hogan-Doran - jonhd@hotmail.com
Introduction
As a new programmer to Delphi, with a history in C, C++, Unix shells and a little Awk and Perl, I came to it with a particular goal in mind. I wanted to implement an editor for an Australian 4GL database language called cl4. Having little to no experience in Windows programming, except some dalliance in PowerBuilder and SQLExpress, it seemed a pretty imposing goal.
One is often taught when learning a new language to read a few books, try out the samples and then start some simple coding. Unfortunately I'm a bit of a sado-masochist, and love to jump in with all 10 fingers (and 10 toes) and try to implement the most difficult parts first, and all at once. Being able to print "Hello World" never caught my interest.
I guess I had a bad upbringing. I started programming by jumping into a language called "C" on SCO Xenix in 1986 with no programming experience at all. Just a book by two guys called Kernighan and Ritchie and some C code that wouldn't compile (it was BSD Unix based, from memory). When I managed to get it to compile after 2 weeks, I decided to port the BinkleyTerm FidoNet mailer to Unix, while concurrently getting all the associated programs ported from DOS as well (the Fido mailer, Newsgroup controller, an entire BBS program). I think I had 5 different DOS->UNIX programs compiling on 5 different virtual consoles all at once.
12 Months later and my [DosLib] and [FossilDriver] were done, BinkleyTerm was getting me my first newsgroups (comp.os.xenix.???) from a local friendly Sysop, and I was chewing up $1,000 in overseas phone bills organising a world-wide group of programmers to try out my handiwork.
So with Delphi I started on my next big challenge. I didn't want just any old editor for my old 4GL Unix database language. I wanted it all:
code-highlighting
on the spot syntax highlighting
code completion and suggesting
remote compile capabilities to my Linux box
jump to line with error
even a remote visual debugger
plus lots more goodies.
This is my adventure. It's a story, not a traditional tutorial - the code's there if you're not interested in reading my little tale. But if you are: sit back with a packet of chips and plenty of caffeine. And enjoy....
The Plan
No-one wants to re-invent the wheel. The whole point of OOP (Object oriented programming) is the reusability of objects. The whole point of the Internet was the reusability of someone else's object. Luckily I had read enough about these component "things" to want to find some more. So I connected up to the Internet and went "shopping".
My shopping list:
Internet controls (telnet, remote shell, remote execute, ftp)
Editor with multi-file capabilities
Syntax highlighter or Parser
Automatic Syntax highlighting
I shopped at:
Delphi Super Page
Torri's page
Borland
RxLib's
RzLib's
QuickReport Homepage
Within 4 hours I was off the net with a number of goodies: 23 or so assorted controls, programs and Internet suites (and plenty of stuff that had nothing to do with the current project, but I thought might be of use on my next one). Eventually I whittled the choices down to:
YourPasEdit by D C AL CODA, Ken Hale and Coda Hale
PasToRTF by Martin Waldenburg (included in above)
TntClient by Francois Piette (and associated ICS suite)
RxTools by those great Russian dudes.
YourPasEdit was a great find, and its description said it all:
There are three main features in YourPasEdit. First is the PAS to RTF conversion unit, contained in mwPasToRtf.pas, by Martin Waldenburg. The second shows how to create TabSheets at run time, with a RichEdit. The third are the procedures written by Andrius Adamonis, which allow YourPasEdit to be associated with files and to open those files in the running instance of YourPasEdit, creating a new TabSheet and RichEdit for the newly opened file.
Unfortunately it also told me:
It is not intended to be a full Delphi file editor, because it does not highlight keywords on the fly, like a syntax aware editor.
But I wasn't complaining - I needed some challenge - otherwise it wouldn't be any fun! So I delved into how things were being done in YourPasEdit. To get the best from the following sections you should get your hands on at least YourPasEdit so you can follow and (more importantly) program along.
Syntax highlight (YourPasEdit)
In YourPasEdit, syntax highlighting was done by parsing the plain text file, dividing each line of text into separate "Tokens", working out what TokenType each token was, and formatting them based on preset Font and Color settings. The following tokens were supported, corresponding to the token types in the Delphi 3.0 Editor:
TTokenState = (tsAssembler, tsComment, tsCRLF, tsDirective, tsIdentifier, tsKeyWord, tsNumber, tsSpace, tsString, tsSymbol, tsUnknown);
How these tokens were formatted (that is, with what Color and Attributes) was determined by using the Delphi 3.0 Editor's own settings as they are stored in the Windows Registry. Basically a line of Pascal source such as:
procedure TForm1.FormCreate(Sender: TObject); { Create the Form }
would be divided up into:
Token                  TokenType
procedure              tsKeyWord
(space)                tsSpace
TForm1                 tsIdentifier
.                      tsSymbol
FormCreate             tsIdentifier
(                      tsSymbol
Sender                 tsIdentifier
:                      tsSymbol
TObject                tsIdentifier
);                     tsSymbol
(space)                tsSpace
{ Create the Form }    tsComment
<CR><LF>               tsCRLF
How is it Done?
The RichEdit control normally loads preformatted text from .RTF files by way of the RichEdit.Lines.LoadFromFile() function. YourPasEdit uses the RichEdit.Lines.LoadFromStream() function to load the file from a TPasConversion - a custom TMemoryStream descendant. This stream takes the plain text Pascal source file, loads it into its internal memory buffer, and then converts it from plain text to text impregnated with RTF codes. This way, when it is loaded into the RichEdit control via RichEdit.Lines.LoadFromStream, the Pascal source file appears in the control color-syntax highlighted.
To the main Editor, this process is transparent - the code looks something like this:
begin
  NewRichEdit := TRichEdit.Create(Self);     // owner argument assumed; the listing omitted it
  PasCon.Clear;                              // Prepare the TPasConversion
  PasCon.LoadFromFile(FName);                // Load the file into the memory stream
  PasCon.ConvertReadStream;                  // Convert the stream to RTF format
  NewRichEdit.Lines.BeginUpdate;
  NewRichEdit.Lines.LoadFromStream(PasCon);  // Read from the TPasConversion
  NewRichEdit.Lines.EndUpdate;
  NewRichEdit.Show;
  Result := NewRichEdit;
end;
EXAMPLE - snippet of code from the NewRichEditCreate(Fname) routine
As I said, it is the TMemoryStream derived TPasConversion which does all the hard work:
<SOURCE PASCAL FILE>
|
V
Plain source loaded into memory
(TPasConversion.LoadFromFile)
|
V
Converted internally by parsing the source file
(ConvertReadStream)
|
V
Result made available
(SetMemoryPointer)
|
V
RichEdit.LoadFromStream
Most of the work in TPasConversion is done by the ConvertReadStream procedure. Its purpose is to split up each line of source code into tokens (as shown previously) and then, depending on each token's TokenType, load it into the output buffer preceded by RTF codes to make it a particular Color, Bold, Italics etc. Here's what it looks like:
// Prepare the OutBuff to a certain default size
FOutBuffSize := Size + 3;
ReAllocMem(FOutBuff, FOutBuffSize);
// Initialise the parser to its beginning state
FTokenState := tsUnknown;
FComment := csNo;
FBuffPos := 0;
FReadBuff := Memory;
// Write leading RTF header
WriteToBuffer('{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS SansSerif;}' +
  '{\f1\froman\fcharset2 Symbol;}{\f2\fmodern Courier New;}}' + #13 + #10);
WriteToBuffer('{\colortbl\red0\green0\blue0;}' + #13 + #10);
WriteToBuffer('\deflang1033\pard\plain\f2\fs20 ');
// Create the INSTREAM (FReadBuff) and tokenize it
Result := Read(FReadBuff^, Size);
FReadBuff[Result] := #0;
if Result > 0 then
begin
  Run := FReadBuff;
  TokenPtr := Run;
  while Run^ <> #0 do
  begin
    case Run^ of
      #13: // Deal with CRLFs
        begin
          FComment := csNo;
          HandleCRLF;
        end;
      #1..#9, #11, #12, #14..#32: // Deal with various whitespaces, control codes
        begin
          while Run^ in [#1..#9, #11, #12, #14..#32] do Inc(Run);
          FTokenState := tsSpace;
          TokenLen := Run - TokenPtr;
          SetString(TokenStr, TokenPtr, TokenLen);
          SetRTF;
          WriteToBuffer(Prefix + TokenStr + Postfix);
          TokenPtr := Run;
        end;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~ much code removed ~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  end;
end
EXAMPLE - snippet showing the while loop that breaks up the INSTREAM into recognised tokens
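To get a feel for how a loop like this "breaks off" tokens, here is a much smaller sketch that can be compiled on its own as a console program. It is not the TPasConversion code: the type TSketchKind, the helpers IsWordChar and IsSpaceChar, and the function NextToken are names I made up for illustration, and it only knows three kinds of token (no keywords, comments, strings or numbers).

```pascal
program TokenSplitSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical, reduced set of token kinds (the real TTokenState has more)
  TSketchKind = (skSpace, skWord, skSymbol);

function IsWordChar(C: Char): Boolean;
begin
  Result := ((C >= 'A') and (C <= 'Z')) or ((C >= 'a') and (C <= 'z')) or
            ((C >= '0') and (C <= '9')) or (C = '_');
end;

function IsSpaceChar(C: Char): Boolean;
begin
  Result := (C = ' ') or (C = #9);
end;

// Break one token off S starting at Index (1-based) and move Index past it.
// The caller must make sure Index <= Length(S) on entry.
function NextToken(const S: string; var Index: Integer;
  out Kind: TSketchKind): string;
var
  Start: Integer;
begin
  Start := Index;
  if IsSpaceChar(S[Index]) then
  begin
    Kind := skSpace;
    while (Index <= Length(S)) and IsSpaceChar(S[Index]) do Inc(Index);
  end
  else if IsWordChar(S[Index]) then
  begin
    Kind := skWord;
    while (Index <= Length(S)) and IsWordChar(S[Index]) do Inc(Index);
  end
  else
  begin
    Kind := skSymbol;  // anything else counts as a one-character symbol
    Inc(Index);
  end;
  Result := Copy(S, Start, Index - Start);
end;

const
  KindNames: array[TSketchKind] of string = ('space', 'word', 'symbol');
var
  Line, Tok: string;
  Index: Integer;
  Kind: TSketchKind;
begin
  Line := 'Sender: TObject);';
  Index := 1;
  while Index <= Length(Line) do
  begin
    Tok := NextToken(Line, Index, Kind);
    WriteLn('[', Tok, '] ', KindNames[Kind]);
  end;
end.
```

The first character of each token decides its kind, and the inner while loop then swallows everything that belongs to it, which is the same shape as the case statement in the listing.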
Most of the work is done by the [case Run^ of ... end;] section, which "breaks off" a token from the INSTREAM (FReadBuff) based on the logic in the case statement. The case statement is organised in such a way that it can quickly decipher the input stream into the various TokenTypes by examining each character in turn. Having worked out which TokenType it is, the actual encoding part is relatively easy:
FTokenState:= tsSpace;
TokenLen:= Run - TokenPtr;
SetString(TokenStr, TokenPtr, TokenLen);
ScanForRTF;
SetRTF;
WriteToBuffer(Prefix + TokenStr + Postfix);
EXAMPLE - basic steps in encoding the output stream of the TPasConversion object
What’s happening here is the program:
sets FTokenState to what we believe it is (in this part of the code it is tsSpace which matches any series of Whitespaces)
the length of the token is calculated by working out how far the current memory pointer (Run) has moved since we finished with the last token (TokenPtr).
the token is then copied from the Input buffer from the starting position of it in the memory buffer (TokenPtr) for the length of the token, into the variable TokenStr.
ScanForRTF just checks through the resultant TokenStr to ensure it doesn't have any funny characters that the RichEdit would confuse with RTF commands. If it finds any, it escapes them.
SetRTF looks at the FTokenState to populate two global variables, Prefix and Postfix, with the appropriate RTF codes to give the token the right Color, Font and Boldness.
WriteToBuffer then simply puts the TokenStr with the Prefix and Postfix around it into the output buffer, and the loop then continues on.
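The escape-then-wrap steps above can be tried out away from the component. What follows is only a sketch under my own assumptions: TSketchToken, EscapeRtf and WrapToken are invented names, and the bold/italic codes picked for each token type are arbitrary choices, not the settings Pas2Rtf reads from the Registry.

```pascal
program WrapTokenSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical, cut-down stand-in for TTokenState
  TSketchToken = (stComment, stKeyWord, stSpace);

// Put a backslash in front of the three characters RTF treats specially
function EscapeRtf(const S: string): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
  begin
    if (S[i] = '\') or (S[i] = '{') or (S[i] = '}') then
      Result := Result + '\';
    Result := Result + S[i];
  end;
end;

// Surround the escaped token with RTF codes chosen by token type
function WrapToken(const Token: string; Kind: TSketchToken): string;
var
  Prefix, Postfix: string;
begin
  case Kind of
    stKeyWord: begin Prefix := '\b '; Postfix := '\b0 '; end;  // bold on/off
    stComment: begin Prefix := '\i '; Postfix := '\i0 '; end;  // italic on/off
  else
    begin Prefix := ''; Postfix := ''; end;                    // left plain
  end;
  Result := Prefix + EscapeRtf(Token) + Postfix;
end;

begin
  WriteLn(WrapToken('procedure', stKeyWord));
  WriteLn(WrapToken('{ Create the Form }', stComment));
end.
```

The point to notice is the order: the token is escaped first and wrapped second, so the backslashes in the Prefix and Postfix codes are never escaped themselves.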
Back to the topic: Syntax Highlighting (on-the-fly)
No source code is necessarily 100% applicable to your needs. I was fortunate in that most of the parser applied to my 4GL command syntax (e.g. strings were strings, numbers were numbers, and the keywords were similar). As well, YourPasEdit had implemented most of the basic accessory tasks such as Printing, Find, Find and Replace, and Multi-File editing. It was just a matter of adding in the extras I was after.
PROBLEM #1 - No colours or fonts
One task the Parser didn't fully implement was Colors or different Fonts, or even font sizes. The reason for this (found after some trial and error) was that the SetRTF procedure knew nothing about how to do this. It only used the information regarding [Bold], [Italics] and [Underline] stored in the Win95 Registry for the Delphi Editor's settings to determine how to highlight each token. As for fonts - well, I hadn't realised that the Delphi Editor actually uses only one Font and Fontsize for all the different tokens - so that wasn't Pas2Rtf's fault. I was just being greedy.
Luckily the comments in Pas2Rtf.pas told me what the other values in the Registry coded for, especially where the important foreground color was stored. This meant some changes to:
1. procedure SetDelphiRTF(S: String; aTokenState: TTokenState);
Add after the try;
Font.Color := StrToInt(Ed_List[0]);
2. procedure TPasConversion.SetPreAndPosFix
Add after FPreFix[aTokenState] = '';
FPreFixList[aTokenState] := ColorToRtf(aFont.Color);
The ColorToRtf code is already present, but hadn't been used for some reason. If you try it out you'll understand why. You get absolutely no change except lots of ';' in the wrong place. Change the ';' to ' ' (a space) in ColorToRtf(), and you get rid of the ';' appearing in the RichEdit control, but you get no Colors anyway.
My first thought was that the value in Ed_List[0] didn't convert to a proper Font.Color. The easiest way to test this was to hard code Font.Color := clGreen; and see what happens. Again no luck. The format was consistent with the RTF codes I could see in the RTF header. What the $#%#$%# was wrong with it?
It was about then that I realised I needed a crash course in RTF document structure. For this I rushed off to www.microsoft.com (please forgive me) and found a reference on RTF. After an hour of reading a Microsoft Technical Document I was even more confused. Oh well - this meant it was time to get dirty. Time to get down to real programmer stuff. Time to "cheat".
What did I do? I went into WordPad (which is just a glorified RichEdit version 2.0 on steroids) and saved various files into RTF format. I then opened them in NotePad so I could see the RTF codes and compare what happened in each case: what codes were produced depending on what I did, or didn't do. A similar sort of technique was used back in the 1980s to decipher the first Paradox database format. Sorry, Borland.
blank.rtf - empty, so I could see the "plain" header line
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}
{\colortbl\red0\green0\blue0;}\deflang1033\pard\plain\f2\fs20 \par }
plaintext.rtf - to see how having any text was handled
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0\blue0;}
\deflang1033\pard\plain\f2\fs20 this is plain text
\par }
difffont.rtf - different font, same size, same text
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}{\f3\fswiss\fprq2 Arial;}}{\colortbl\red0\green0\blue0;}
\deflang1033\pard\plain\f3\fs20 plain text different font\plain\f2\fs20
\par }
diffsize.rtf - text set to 18 point in the default font
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0\blue0;}
\deflang1033\pard\plain\f2\fs36 plain text different font\plain\f2\fs20
\par }
diffcolor.rtf - etc. My favourite of course - blue.
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0\blue0;\red0\green0\blue255;}
\deflang1033\pard\plain\f2\fs20\cf1 plain text different font\plain\f2\fs20
\par }
Looking at the resultant codes you can see how the RTF stream is formatted. It comprises:
INITIAL HEADER (\rtf1\.....)
FONTTABLE (\f0\fswiss...)
COLORTABLE (\colortbl)
MISCELLANEOUS
DEFAULT FORMAT (\pard....)
BODY OF THE FILE.
As a result of that I rewrote this code:
WriteToBuffer('{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS SansSerif;}' +
  '{\f1\froman\fcharset2 Symbol;}{\f2\fmodern Courier New;}}' + #13 + #10);
WriteToBuffer('{\colortbl\red0\green0\blue0;}' + #13 + #10);
WriteToBuffer('\deflang1033\pard\plain\f2\fs20 ');
to become:
WriteToBuffer('{\rtf1\ansi\deff0\deftab720');
WriteFontTable;
WriteColorTable;
WriteToBuffer('\deflang1033\pard\plain\f0\fs20 ');
The procedures Write[Font,Color]Table basically create a table of fonts/colors we can reference later on. Each Font and Color is stored by index in an internal TList. It acts as a lookup table - by matching the Font name or Color value we can find the [num] to code into the RTF stream at the required moment:
\f[num] = the index of which Font you want to use, as pre-set in the "on the fly" font table
\fs[num] = font size in half-points (for example 20 = 10 point)
\cf[num] = the index of which Color to use, as pre-set in the "on the fly" color table
\cb[num] = which background color to use (ignored in RichEdit version 2.0)
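To make the lookup idea concrete, here is a small stand-alone sketch of a color table. It is not my WriteColorTable: it uses a fixed array instead of a TList, and the names Colors, ColorIndex, ColorTableRtf and MaxColors are made up for this example. The only outside fact it relies on is that Delphi stores an RGB TColor as $00BBGGRR.

```pascal
program ColorTableSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  MaxColors = 16;  // arbitrary limit chosen for this sketch

var
  // Colors held as $00BBGGRR, the layout Delphi's TColor uses for RGB values
  Colors: array[0..MaxColors - 1] of LongInt;
  ColorCount: Integer = 0;

// Return the table index of AColor, adding it if it is not there yet
function ColorIndex(AColor: LongInt): Integer;
var
  i: Integer;
begin
  for i := 0 to ColorCount - 1 do
    if Colors[i] = AColor then
    begin
      Result := i;
      Exit;
    end;
  if ColorCount >= MaxColors then
    raise Exception.Create('Color table full');
  Colors[ColorCount] := AColor;
  Result := ColorCount;
  Inc(ColorCount);
end;

// Build the {\colortbl ...} group from the colors collected so far
function ColorTableRtf: string;
var
  i: Integer;
begin
  Result := '{\colortbl';
  for i := 0 to ColorCount - 1 do
    Result := Result +
      '\red'   + IntToStr(Colors[i] and $FF) +
      '\green' + IntToStr((Colors[i] shr 8) and $FF) +
      '\blue'  + IntToStr((Colors[i] shr 16) and $FF) + ';';
  Result := Result + '}';
end;

begin
  ColorIndex($000000);                                   // black goes in first
  WriteLn('\cf' + IntToStr(ColorIndex($FF0000)) + ' ');  // blue goes in second
  WriteLn(ColorTableRtf);
end.
```

Registering black first keeps it at index 0, so blue ends up as \cf1, the same arrangement WordPad chose in diffcolor.rtf.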
PROBLEM #2 - Crashes in long comments or text (existing problem)
There is a bug in ScanForRtf. Can you see it?
procedure TPasConversion.AllocStrBuff;
begin
  FStrBuffSize := FStrBuffSize + 1024;
  ReAllocMem(FStrBuff, FStrBuffSize);
  FStrBuffEnd := FStrBuff + 1023;
end; { AllocStrBuff }
procedure TPasConversion.ScanForRtf;
var
  i: Integer;
begin
  RunStr := FStrBuff;
  FStrBuffEnd := FStrBuff + 1023;
  for i := 1 to TokenLen do
  begin
    case TokenStr[i] of
      '\', '{', '}':
        begin
          RunStr^ := '\';
          Inc(RunStr);
        end;
    end;
    if RunStr >= FStrBuffEnd then AllocStrBuff;
    RunStr^ := TokenStr[i];
    Inc(RunStr);
  end;
  RunStr^ := #0;
  TokenStr := FStrBuff;
end; { ScanForRtf }
EXAMPLE - code snippet from Pas2Rtf demonstrating the "long comment" bug
The problem: if FStrBuff is enlarged using AllocStrBuff() (to make it bigger to handle a very long comment), the Windows memory manager probably has to re-allocate it by moving the entire string buffer somewhere else in memory. RunStr, however, is not adjusted for this change and still points to the old memory area, now unallocated.
The fix: Re-point RunStr in the AllocStrBuff routine so it points to the correct place in the new area of memory. Try and fix it yourself, or look at my ghastly spaghetti code in jhdPasToRtf.pas.
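If you want a hint, one way of re-pointing the run pointer is to remember its offset before the buffer grows and rebuild it afterwards. The program below is a stand-alone sketch of that idea only; it is not the code in jhdPasToRtf.pas, and Buff, BuffEnd, RunPtr, GrowBuffer, AppendChar and EscapeToBuffer are hypothetical names standing in for FStrBuff, FStrBuffEnd, RunStr and friends. It also recomputes the end marker from the current buffer size and checks for room before every single character, which are my own choices.

```pascal
program GrowBufferSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

var
  Buff: PAnsiChar = nil;     // start of the buffer (FStrBuff in the article)
  BuffEnd: PAnsiChar = nil;  // last byte, kept free for the closing #0
  BuffSize: Integer = 0;
  RunPtr: PAnsiChar = nil;   // current write position (RunStr in the article)

// Grow the buffer by 1024 bytes and re-point RunPtr into the new block
procedure GrowBuffer;
var
  Offset: Integer;
begin
  Offset := RunPtr - Buff;         // how far we had written so far
  BuffSize := BuffSize + 1024;
  ReallocMem(Buff, BuffSize);      // the block may move to a new address
  BuffEnd := Buff + BuffSize - 1;  // end marker follows the real size
  RunPtr := Buff + Offset;         // same position, new address
end;

// Write one character, growing the buffer first if there is no room
procedure AppendChar(C: AnsiChar);
begin
  if RunPtr >= BuffEnd then GrowBuffer;
  RunPtr^ := C;
  Inc(RunPtr);
end;

// Copy Token into the buffer, escaping \ { } with a leading backslash
procedure EscapeToBuffer(const Token: AnsiString);
var
  i: Integer;
begin
  RunPtr := Buff;
  if Buff = nil then GrowBuffer;   // make sure there is a buffer at all
  for i := 1 to Length(Token) do
  begin
    if Token[i] in ['\', '{', '}'] then
      AppendChar('\');
    AppendChar(Token[i]);
  end;
  RunPtr^ := #0;                   // always fits: one byte is kept in reserve
end;

var
  Token: AnsiString;
begin
  // A "very long comment": 3000 braces, each of which needs escaping
  SetLength(Token, 3000);
  FillChar(Token[1], Length(Token), '{');
  EscapeToBuffer(Token);
  WriteLn('Escaped length: ', RunPtr - Buff);
  FreeMem(Buff);
end.
```

The test token forces the buffer to grow several times, which is exactly the situation that crashed the original routine.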
Automatic Syntax Highlighting (my first implementation)
To understand how Automatic syntax highlighting works, you should have a close look at what happens in the Delphi 3.0 Editor. After all - if Borland was happy with it - who am I to argue?
Take note when the "syntax" changes and what is affected. In retrospect the difficult thing is to implement a highlighter that is:
Fast
Accurate
Doesn't flicker
Isn't obvious ("the someone is chasing me phenomenon".. you'll see)
1. When should we do the re-highlighting ?
In YourPasEdit the highlighting is done as the file is read in. Once this is done, the only way to make use of that technique would be to write out the file every time it changes and read it back in again - obviously a very slow process. In my case, I basically wanted to just reformat the line(s) that have been changed, immediately after the change had been done, i.e. after every new character, DELETE or BACKSPACE or even Paste or DragDrop had been processed. I needed something that was triggered every time the control was affected in such a way.
What I needed then was an [Event].
2. Which event - there's so many to choose from ?
A RichEdit, like any control, has a number of [Events] triggered when you do various things to the control. What is not obvious, is that many events trigger other events in turn. So in choosing which Event(s) to hang your code off you have to ensure that (a) it catches all situations where you need to "fix" the highlighting and (b) it doesn't become re-entrant (i.e. what you do in the [Event], doesn't trigger itself again or any other [Event] that would call the "highlighting code"). From a quick look at the helpfile, I decided that [OnChange] seemed a likely candidate. According to the Delphi 3.0 Helpfile:
Write an OnChange event handler to take specific action whenever the text for the edit control may have changed. Use the Modified property to see if a change actually occurred. The Text property of the edit control will already be updated to reflect any changes. This event provides the first opportunity to respond to modifications that the user types into the edit control.
You may be thinking however: "Heh? What about those other things - like Methods and Properties. Can't they also change the text?" They sure can - but most end up triggering [OnChange] anyhow.
3. Is it what I want? - Rich text controls (from Delphi3 Helpfile)
The rich text component is a memo control that supports rich text formatting. That is, you can change the formatting of individual characters, words, or paragraphs. It includes a Paragraphs property that contains information on paragraph formatting. Rich text controls also have printing and text-searching capabilities.
By default, the rich text editor supports
Font properties, such as typeface, size, color, bold, and italic format
Format properties, such as alignment, tabs, indents, and numbering
Automatic drag-and-drop of selected text
Display in both rich text and plain text formats.
(Set PlainText to True to remove formatting)
type TNotifyEvent = procedure(Sender: TObject) of object;
property OnChange: TNotifyEvent;
4. Is the [OnChange] event the right one?
Live dangerously, let’s give it a go and see...by testing our assumptions out:
So I wrote my first [OnChange] event:
Create a New application
place on it one RichEdit (RichEdit1) and one Edit control (Edit1)
Code the [OnChange] for the RichEdit1 control like this:
procedure TForm1.RichEdit1Change(Sender: TObject);
begin
  TRichEdit(Sender).Tag := TRichEdit(Sender).Tag + 1;
  Edit1.Text := 'Tag=' + IntToStr(TRichEdit(Sender).Tag);
end;
In this case the Sender object is the RichEdit being changed. The code basically uses the RichEdit's Tag property (initially 0) as a handy control-specific variable. Every time the [OnChange] event is called, it increases the Tag by 1 and displays its value in an Edit control as text. You should pre-set the RichEdit control with some text in it, otherwise the following may be confusing!
Compile and Run...
Click in the Control. Nothing...
Move around in it using CursorKeys... Nothing...
Click outside the control.. and then back inside.. Nothing...
Press the [Space Bar].. Tag=1...
Press [Backspace].. Tag=2...
Press return.. Tag=3..
Select some text.. No change..
CTRL-C some text.. No change..
CTRL-X some text.. Tag=4..
CTRL-V some text.. Tag=5..
As it looked good so far I then added to the Form1.OnShow event:
RichEdit1.Lines.LoadFromFile('c:\winzip.log'); { Just a plain text file hanging around }
to see what happened. And guess what - an [OnChange] event was called sometime and "Tag=1" was displayed in the Edit control as the proof when the Form appears for the first time. So we can see that procedures do call Events that apply to what they are doing.
5. What happens in Syntax Highlighting anyhow?
Watch carefully in the Delphi Editor. Now try and reproduce it. Open a WordPad (since WordPad is a souped up RichEdit basically). Read in a source file (e.g any Unit1.pas) and do syntax highlighting manually:
Select a token
Manipulate it using the buttons provided to change Font, Size, Color, and Bold
Move onto the next token
Goto 1
So therefore in [OnChange] we'll try and write code to reproduce what we have done manually.
6. Which text do I want.. and where do I get it ?
Hunting through the Delphi Helpfile on RichEdit controls we find that the actual text information in the RichEdit control is stored (or rather can be accessed from) either:
RichEdit.Text
Text contains a text string associated with the control.
TCaption = type string;
property Text: TCaption;
Description
Use the Text property to read the Text of the control or specify a new string for the Text value. By default, Text is the control name. For edit controls and memos, the Text appears within the control. For combo boxes, the Text is the content of the edit control portion of the combo box.
RichEdit.Lines
Lines contains the individual lines of text in the rich text edit control.
property Lines: TStrings;
Description
Use Lines to manipulate the text in the rich text edit control on a line by line basis. Lines is a TStrings object, so TStrings methods may be used for Lines to perform manipulations such as counting the lines of text, adding lines, deleting lines, or replacing the text in lines.
To work with the text as one chunk, use the Text property. To manipulate individual lines of text, the Lines property works better.
Now Lines seemed to be what I wanted - after all, I wanted the syntax highlighting to work on a line by line basis. So let’s have a look at what's been changed.
Oh.. look at what? How can I tell which line is the one that is changed?
Unlike some events, [OnChange] isn't passed any variables save the identity of the RichEdit control affected. The RichEdit control doesn't have a runtime variable that tells us either. The only variables are SelStart and SelLength - but they're about selecting text, aren't they? I just want to know what line I'm on :-(
It was about then that I re-read the information on the Sel??? properties, and recalled my "concept" code. Selection - I realised - was the name of the game. By manipulating these variables I could reproduce what I was doing manually - selecting text - as program code. Once selected you can then manipulate the attributes of the selected text through the SelAttributes structure.
Let’s get familiar with these variables (in summary):
SelStart       Position of the cursor, or the beginning of the selected text
SelLength      0 if nothing is selected (SelStart = cursor position), or the length of the selected text
SelText        Empty if no text is selected, or the actual text selected
SelAttributes  Default attributes if I was to start typing at the cursor position, OR the actual attributes of the selected text
There is actually no other way to access the attributes of the text already in the Control than by programmatically accessing them via manipulating Sel variables (*if you stick to using the defined properties and methods).
In the end RichEdit.Text and RichEdit.Lines are just plain old strings - not really "rich" at all. The other thing to note is that SelStart is a 0-based index on the first character of RichEdit.Text - so it looks like RichEdit.Lines is out the door.
7. Okay implement: Select a Token
Basically I wanted to start at the beginning of the line, send just that line to PasCon, and read it back in and replace the current line with the result. Trouble is RichEdit doesn't give you access to the 'RTF' representation of a single line. Plus I still can't tell when the beginning or end of the line is. Since the latter seems to be a nagging problem, we better fix that first - trouble is: How?
When all else fails - WinAPI calls of course.
Most visual controls in Delphi are in fact just native Windows controls encapsulated as Delphi types. You can still use Windows API functions to access the control underneath. This way it is possible to access information not accessible through Delphi's public Properties, Methods or Events. Time to delve through Win32.HLP and see what it has to say about RichEdit controls. It's stored in C:\Program Files\Borland\Delphi 3.0\Help if you don't have a shortcut to it.
Open Win32.HLP -> [ Contents ] -> [ RichEdit controls ] -> [ Rich Edit Controls ]
I spent some time getting to know the "full" capabilities of the RichEdit control hidden behind Delphi's implementation of it. Much of what I learned came in handy later on (as you'll see) and as a result I derived my second two Delphi Rules:
Delphi Rule #2: If your Project hinges on the capabilities of a certain control - make sure you know everything about it - from the beginning.
Delphi Rule #3: "Reference" is not the same as "Summary" (also known as Win32.HLP Rule#1)
Eventually I discovered the key in the [ Rich Edit Control Reference ] under "Lines and Scrolling". I had thought this page was simply a summary of the messages discussed in the preceding help pages. Actually it included a number of extra messages not discussed elsewhere - the exact ones I was after!
Lines and Scrolling
EM_LINEFROMCHAR - give it a 0-based character index and it returns the line
EM_LINEINDEX - give it a line and you get the index of its first character
EM_LINELENGTH - give it the index of a character in a line and you get the length of that line
So let's start coding.
(NB: To use the constants (EM_?) you'll have to manually add RichEdit to the uses clause)
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  WasSelStart, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
begin
  MyRe := TRichEdit(Sender);
  WasSelStart := MyRe.SelStart;
  Row := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
  // I didn't use the EM_LINELENGTH message, as the value was available via Delphi
  Edit1.Text := IntToStr(WasSelStart) + '-' +
    IntToStr(Row) + '-' +
    IntToStr(BeginSelStart) + '-' +
    IntToStr(EndSelStart);
end;
To start with I used the Edit1 Control to display the results of all these variables. I then tried manipulating text in the RichEdit to see what values I got. You should do the same. Type slowly in:
1234567890<CR>1234567890<CR>1234567890
and see how the results are reflected in the Edit control as you do so. Then experiment - try adding stuff to the ends of lines, and in the beginning of the line, and middle of lines. You may have to refer back to the Code to work out which number represents which variable.
Okay, now using the variables we have, let's try selecting the text of the current line, and displaying it in a new Edit control (Edit2).
Add the following code to see what happens (don’t forget to add the second edit control and make it as wide as possible):
MyRe.SelStart := BeginSelStart;
MyRe.SelLength := EndSelStart - BeginSelStart;
Edit2.Text := MyRe.SelText;
end;
Run the program and try it out.
OOPS - That doesn't work - the text remains selected and the original cursor position is lost.
We need to reset SelStart and SelLength before we finish in the [OnChange] event. So let’s add at the end:
  MyRe.SelStart := WasSelStart;  // back to where we started
  MyRe.SelLength := 0;           // nothing selected
end;
While playing with text in the edit control I discovered something weird.
If you typed [1] then <CR> then [2] the Edit1 displayed [4-1-3-4].
But there were only two characters in the display.
I made a mistake. It appears that RichEdit.Text can tell you where the beginning and end of line is. Why? Because you can access the <CR><LF> characters in the Text string. So we could have manipulated the Text property of the control to work out the beginning and end of lines by reading back and forward from SelStart to find <CR><LF> characters. We may not have known which line we were on, but we would know where it began and ended. Nevertheless we should keep this in mind, it might come in handy later.
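For the record, the "read back and forward from SelStart" idea can be sketched as a plain string routine, with no control involved. This is only an illustration: LineBounds is a name I made up, and it assumes the caret offset is 0-based and counts <CR> and <LF> as one character each, which is what the [4-1-3-4] result suggests.

```pascal
program LineBoundsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Find the 0-based offsets where the line containing Caret begins and ends.
// LineEnd is the offset just past the last character of the line.
procedure LineBounds(const S: string; Caret: Integer;
  out LineStart, LineEnd: Integer);
begin
  // Keep the caret inside the text
  if Caret < 0 then Caret := 0;
  if Caret > Length(S) then Caret := Length(S);

  // Walk back until the character before LineStart is a line break.
  // Offset k sits in front of S[k + 1], so S[LineStart] is "the one before".
  LineStart := Caret;
  while (LineStart > 0) and (S[LineStart] <> #13) and (S[LineStart] <> #10) do
    Dec(LineStart);

  // Walk forward until the character at LineEnd is a line break
  LineEnd := Caret;
  while (LineEnd < Length(S)) and (S[LineEnd + 1] <> #13) and
        (S[LineEnd + 1] <> #10) do
    Inc(LineEnd);
end;

var
  B, E: Integer;
begin
  // The same text as the experiment: [1] then <CR> then [2], caret at the end
  LineBounds('1'#13#10'2', 4, B, E);
  WriteLn(B, '-', E);
end.
```

With the caret at offset 4 this reports 3 and 4, the same begin and end values the EM_ messages gave in the experiment above.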
But it doesn't matter - the EM_###### messages are a neat way of doing things. And they work. For the moment at least we'll stick with them.
7. Okay implement: Part 2 - Change the format
After the line Edit2.Text := MyRe.SelText, but before the "resetting" part, let's put some logic in to turn lines RED when they are longer than a certain length:
if (MyRe.SelLength > 10) then MyRe.SelAttributes.Color := clRed;
You'll notice two things if you test this out. First - it does work. Second, however, is that if you type a line > 10 characters, press return and type one character - it's in Red. This is because it inherits the attributes of the preceding text. Just like if you have bold on in a word processor, it doesn't reset if you press return. So let's change the line to include an else situation:
else MyRe.SelAttributes.Color := clBlack;
That seems to work - except when you press return in the middle of a > 10 character line you have already typed (which is already Red) to leave a stump < 10 characters on the line above - it remains red. This is because the code leaves you on the next line, and SelStart refers to this new line, not the previous one. In our eventual code, we'll have to take care to ensure this doesn't happen - we have to catch this situation and deal with it. It won't be the only situation I'm sure....
PS: There will be a number of situations where we'll have to be careful. Can you think of any now? Try putting a lot of text in the control (or manipulate a loaded file), selecting some, and using the inherent Drag and Drop (move your mouse over some selected text, press and hold down the left mouse button and then drag away) to move some text. This only triggers one OnChange event. We may also be moving multiple lines along the way. In the future we'll have to put in some code to detect this happening, and ensure the [OnChange] event can deal with the need to reformat in two different locations. That means thinking in the back of the head about how in the future we may have to deal with this kind of situation, and ensuring our code for the simple situation can be adapted - i.e. be "versatile".
8. Basically it all seems to kind-of work.. can't we do some real programming now?
Okay, okay. But first we have a problem. Actually a rather big problem. The problem is PasCon. Why?
First: It returns RTF code.
Problem: We can't use RTF code.
Second: it's designed to work on an entire stream, and then give it back to us again as a whole.
Problem: We actually want greater control over it than this "all or nothing" approach.
OOP to the Rescue
When you have something that works in a situation, and needs to be applied in another situation where it has to do a similar, but subtly different job - you have two choices:
copy the function, and re-write it for the new situation, or
kludge around it (e.g use Pas2Rtf, and then write a RtfCodes2RtfControl procedure).
Modern languages however give you an option: OOP it. "Objectify" it. This is more than just deriving something from an existing object. It is in a sense programming in a "state of mind". Controls should be created so they can be used in a variety of situations - rather than being situation specific. In this case all PasCon can deal with is tokenising the input stream and returning RTF-coded text. What we really need to do is divide it into two entities. We need to separate the [Parsing/Recognise the Token and TokenType] from the [Encode it in RTF codes].
So let's start with ConvertReadStream, editing it so it looks something like this:
function TPasConversion.ConvertReadStream: Integer;
begin
  FOutBuffSize := Size + 3;
  ReAllocMem(FOutBuff, FOutBuffSize);
  FTokenState := tsUnknown;
  FComment := csNo;
  FBuffPos := 0;
  FReadBuff := Memory;
  {Write leading RTF}
  WriteToBuffer('{\rtf1\ansi\deff0\deftab720');
  WriteFontTable;
  WriteColorTable;
  WriteToBuffer('\deflang1033\pard\plain\f2\fs20 ');
  Result := Read(FReadBuff^, Size);
  if Result > 0 then
  begin
    FReadBuff[Result] := #0;
    Run := FReadBuff;
    while Run^ <> #0 do
    begin
      // Recognise the next token, then wrap it in RTF codes
      Run := GetToken(Run, FTokenState, TokenStr);
      ScanForRTF;
      SetRTF;
      WriteToBuffer(PreFix + TokenStr + PostFix);
    end;
    {Write ending RTF}
    WriteToBuffer(#13+#10+'\par }{'+#13+#10);
  end;
  Clear;
  SetPointer(FOutBuff, FBuffPos-1);
end; { ConvertReadStream }
The code for ConvertReadStream is now much smaller, and also easier to understand. We can then take all the code that used to be in ConvertReadStream that did the tokenizing and create a new subroutine - the GetToken function that just does the recognizing and labelling of the individual tokens. In the process we also lose a huge number of repeated lines of code, as well as a number of sub-routines such as HandleBorCom and HandleString.
//
// My Get Token routine
//
function TPasConversion.GetToken(Run: PChar; var aTokenState: TTokenState;
  var aTokenStr: string): PChar;
begin
  aTokenState := tsUnknown;
  aTokenStr := '';
  TokenPtr := Run; // Mark where we started
  case Run^ of
    #13:
      begin
        aTokenState := tsCRLF;
        Inc(Run, 2);
      end;
    #1..#9, #11, #12, #14..#32:
      begin
        while Run^ in [#1..#9, #11, #12, #14..#32] do Inc(Run);
        aTokenState := tsSpace;
      end;
    'A'..'Z', 'a'..'z', '_':
      begin
        aTokenState := tsIdentifier;
        Inc(Run);
        while Run^ in ['A'..'Z', 'a'..'z', '0'..'9', '_'] do Inc(Run);
        TokenLen := Run - TokenPtr;
        SetString(aTokenStr, TokenPtr, TokenLen);
        if IsKeyWord(aTokenStr) then
        begin
          if IsDirective(aTokenStr) then aTokenState := tsDirective
          else aTokenState := tsKeyWord;
        end;
      end;
    '0'..'9':
      begin
        Inc(Run);
        aTokenState := tsNumber;
        while Run^ in ['0'..'9', '.', 'e', 'E'] do Inc(Run);
      end;
    '{':
      begin
        FComment := csBor;
        aTokenState := tsComment;
        while not ((Run^ = '}') or (Run^ = #0)) do Inc(Run);
        // Skip the closing brace, but never step past the terminating #0
        if Run^ = '}' then Inc(Run);
      end;
    '!', '"', '%', '&', '('..'/', ':'..'@', '['..'^', '`', '~':
      begin
        aTokenState := tsUnknown;
        while Run^ in ['!', '"', '%', '&', '('..'/', ':'..'@', '['..'^',
          '`', '~'] do
        begin
          case Run^ of
            '/':
              if (Run + 1)^ = '/' then
              begin
                if (aTokenState = tsUnknown) then
                begin
                  // A "//" comment runs to the end of the line
                  while (Run^ <> #13) and (Run^ <> #0) do Inc(Run);
                  FComment := csSlashes;
                  aTokenState := tsComment;
                  Break;
                end
                else
                begin
                  aTokenState := tsSymbol;
                  Break;
                end;
              end;
            '(':
              if (Run + 1)^ = '*' then
              begin
                if (aTokenState = tsUnknown) then
                begin
                  while (Run^ <> #0) and not ((Run^ = ')') and ((Run - 1)^ = '*')) do Inc(Run);
                  FComment := csAnsi;
                  aTokenState := tsComment;
                  // Skip the closing ")", but never step past the terminating #0
                  if Run^ <> #0 then Inc(Run);
                  Break;
                end
                else
                begin
                  aTokenState := tsSymbol;
                  Break;
                end;
              end;
          end;
          aTokenState := tsSymbol; Inc(Run);
        end;
        if aTokenState = tsUnknown then aTokenState := tsSymbol;
      end;
    #39:
      begin
        aTokenState := tsString;
        FComment := csNo;
        repeat
          case Run^ of
            #0, #10, #13: raise Exception.Create('Invalid string');
          end;
          Inc(Run);
        until Run^ = #39;
        Inc(Run);
      end;
    '#':
      begin
        aTokenState := tsString;
        while Run^ in ['#', '0'..'9'] do Inc(Run);
      end;
    '$':
      begin
        aTokenState := tsNumber;
        while Run^ in ['$', '0'..'9', 'A'..'F', 'a'..'f'] do Inc(Run);
      end;
  else
    if Run^ <> #0 then Inc(Run);
  end;
  TokenLen := Run - TokenPtr;
  SetString(aTokenStr, TokenPtr, TokenLen);
  Result := Run;
end; { GetToken }
ASH - Automatic Syntax highlight (Attempt 2)
[Please note: I have my Delphi Editor colors set-to the [Ocean] colour speed settings for testing purposes. This setting works well on the default RichEdit white background, and most TokenTypes are in different colors from each other]
Okay now to do some real work. Most of the functions have already been written, or thereabouts. As a basis for writing this ASH I'm going to use Project1.dpr which comes out of mpas2rtf.zip in the YourPasEdit zip file yrpasedit.zip. This is because it is much smaller than YourPasEdit, and thus quicker to compile.
I suggest you put the contents of the mpas2rtf.zip into a separate directory. Also copy mwPas2Rtf.pas to testinput.pas using the Explorer shell - we'll be using this file as a sample pascal file for benchmarking.
Open Project1.dpr in Delphi, compile Project1, run it, and open the file testinput.pas by pressing [Button 1] and selecting it in the [OpenFile Dialog]. Do it a number of times, and record the time taken for each once the file has stabilised in the system cache. On my system it averages about 0.47 - 0.41 seconds once it's in the cache (P133 - 16M - Win95b).
Preparing Project1's Unit1.pas
Now replace the contents of mpas2rtf.pas with that code in jhdpas2rtf.pas. Recompile. Now open up the testinput.pas sample file again by using [Button 1]. As you see - we get color - but it takes a "lot" longer: 1.20-1.25 seconds.
Try and speed it up if you like. You can start by commenting out the Pascal code that sets the different Fonts and FontSizes in TPasConversion.SetRtf. Recompile and run again. This time it improves a bit to 1.10-1.15. Now try commenting out the code for different Colors. Wow - the time drops to 0.49 - 0.44.
Hmm. This font and color stuff really packs a punch. We may need to look at this later in more detail if things end up too slow. For the moment we'll leave the code back in full working condition (so you'll need to go back and uncomment the code).
Now put the following base code into the [OnChange] event of the RichEdit1 in Unit1.pas of Project1. Most of this code is just based on what we have already covered elsewhere.
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  WasSelStart, WasRow, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
  MyPBuff: array[0..255] of Char;
begin
  MyRe := TRichEdit(Sender);
  WasSelStart := MyRe.SelStart;
  WasRow := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  Row := WasRow; // Row must be set before it is used below
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
end;
We're going to use the GetToken() function to do all the hard work. We'll need some extra variables to pass to the GetToken function, so add to the var section:
  MyTokenStr: string;
  MyTokenState: TTokenState;
  MyRun: PChar;
  MySelStart: Integer;
These are similar to the variables we used in the ConvertReadStream - in fact we want to do "exactly" the same thing, just one single line at a time. Add this code before the last end;
  StrPCopy(MyPBuff, MyRe.Lines.Strings[Row]);
  MyPBuff[Length(MyRe.Lines.Strings[Row])] := #0;
  MySelStart := BeginSelStart;
  MyRun := MyPBuff;
  while (MyRun^ <> #0) do
  begin
    MyRun := PasCon.GetToken(MyRun, MyTokenState, MyTokenStr);
    //
    // ScanForRtf;
    // SetRtf;
    // WriteToBuffer(Prefix + TokenStr + Postfix);
    //
  end;
end;
NB: As we will be using PasCon you'll have to move it from being a local variable of TForm1.Button1Click to be a global variable. This will mean you'll have to move all the initialising:
PasCon:=TPasConversion.Create;
PasCon.UseDelphiHighlighting(3);
to a TForm1.Show, and the PasCon.Free to TForm1.Close procedure. It will still work if you only move the variable definition - but not for long...
I've left the code from the old ConvertReadStream in the example above to show what we "logically" still need to implement in the current context - that is manipulating the RichEdit Control directly. What we have now is the ability to cut up the current line into different tokens, and know what type they are. We now have to add these tokens to the current line with the right attributes (Fonts, Colors, Bold etc).
But wait. They are already on the line - well the text is anyway, but maybe not in the correct format (Color, Bold etc). So what we actually could do is select each token in its corresponding position in the RichEdit control and just apply the appropriate attributes to it.
We did this back in the beginning, remember? When we set the > 10 character lines to the color red. But how do we do this now? Let's look at what we have in the variables at hand when we hit "// SetRtf" the first time:
(this example uses Unit1.pas as the input file as it's more interesting)
VARIABLES
                   012345678901234567890
Lines.Strings[R0]  unit Unit1;
MyPBuff            unit Unit1;
MyTokenState       tsIdentifier
MyTokenStr         unit
MyRun                   Unit1;
So what we need to do is select the word 'unit' in the RichEdit control, and set its attributes. We do this by setting SelStart to the position of 'unit' in the RichEdit control, and SelLength to the length of the word 'unit'. And since 'unit' is at the beginning of the current line, that position is BeginSelStart (which I conveniently have stored in MySelStart - you'll see why). Let's replace the "pseudo" comment code with the following:
    MyRe.SelStart := MySelStart;
    MyRe.SelLength := Length(MyTokenStr);
    MyRe.SelAttributes.Assign(PasCon.FParseFont[MyTokenState]);
  end;
But remember we are in a loop - when we go around again we'll have the next token in the line, and the variables will look like this:
VARIABLES
                   012345678901234567890
Lines.Strings[R0]  unit Unit1;
MyPBuff            unit Unit1;
MyTokenState       tsSpace
MyTokenStr         (space character)
MyRun                   Unit1;
But (space character) isn't at BeginSelStart (#0) in the RichEdit control. It's further along (at position #4), which just happens to be BeginSelStart + Length('unit'). We need to update MySelStart after we process the preceding token, but before we go around the loop again:
    MySelStart := MySelStart + Length(MyTokenStr);
  end;
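If you want to watch the running offset without a RichEdit in the way, here is a minimal console sketch. It is only an illustration: NextToken is a made-up, much simpler stand-in for PasCon.GetToken (it knows words, blanks and single characters, nothing else), and BeginSelStart is simply assumed to be 0.

```pascal
program TokenOffsets;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

{ Hypothetical stand-in for PasCon.GetToken: splits off one run of
  letters/digits, one run of blanks, or a single other character. }
function NextToken(Run: PChar; var TokenStr: string): PChar;
var
  Start: PChar;
begin
  Start := Run;
  if Run^ in ['A'..'Z', 'a'..'z', '0'..'9', '_'] then
    while Run^ in ['A'..'Z', 'a'..'z', '0'..'9', '_'] do Inc(Run)
  else if Run^ = ' ' then
    while Run^ = ' ' do Inc(Run)
  else if Run^ <> #0 then
    Inc(Run);
  SetString(TokenStr, Start, Run - Start);
  Result := Run;
end;

var
  Line, Tok: string;
  Run: PChar;
  BeginSelStart, MySelStart: Integer;
begin
  Line := 'unit Unit1;';
  BeginSelStart := 0;        // assumed offset of the line's first character
  MySelStart := BeginSelStart;
  Run := PChar(Line);
  while Run^ <> #0 do
  begin
    Run := NextToken(Run, Tok);
    // These are the values we would hand to SelStart and SelLength
    WriteLn(Format('SelStart=%d SelLength=%d [%s]',
      [MySelStart, Length(Tok), Tok]));
    MySelStart := MySelStart + Length(Tok);
  end;
end.
```

For 'unit Unit1;' the four tokens should start at offsets 0, 4, 5 and 10, which matches the ruler in the VARIABLES tables above.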
Okay - this is where we are standing at the moment:
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  WasSelStart, WasRow, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
  MyPBuff: array[0..255] of Char;
  MyTokenStr: string;
  MyTokenState: TTokenState;
  MyRun: PChar;
  MySelStart: Integer;
begin
  MyRe := TRichEdit(Sender);
  WasSelStart := MyRe.SelStart;
  WasRow := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  Row := WasRow;
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
  StrPCopy(MyPBuff, MyRe.Lines.Strings[Row]);
  MyPBuff[Length(MyRe.Lines.Strings[Row])] := #0;
  MySelStart := BeginSelStart;
  MyRun := MyPBuff;
  while (MyRun^ <> #0) do
  begin
    MyRun := PasCon.GetToken(MyRun, MyTokenState, MyTokenStr);
    // Select the token in the control and apply its attributes
    MyRe.SelStart := MySelStart;
    MyRe.SelLength := Length(MyTokenStr);
    MyRe.SelAttributes.Assign(PasCon.FParseFont[MyTokenState]);
    MySelStart := MySelStart + Length(MyTokenStr);
  end;
  // Put the cursor back where the user left it
  MyRe.SelStart := WasSelStart;
  MyRe.SelLength := 0;
end;
Now: put the Debugging code on, do [Build All] and then [Run], and set a breakpoint on the first line of this Event. Open up the testinput.pas. When the debugger stops in the OnChange event, Press <F9> to continue on, and press <F9> again, and again - Do you see? We keep going back into the Event again and again (and again). What’s happening?
Somehow in our event we are triggering off another [OnChange] event. This call to the [OnChange] event code is stored in the message queue. When the event we're currently in is finished, a new one is just waiting on the Event queue, which executes and creates more events... a re-entrant loop.
This behaviour is not surprising - after all we are actually changing the control in the process of our code, so no wonder another [OnChange] event is being triggered.
The way to fix such things is to ensure our actions do not trigger off the Event. We can do this by "temporarily" storing the RichEdit's OnChange property (which contains a reference to call our procedure TForm1.RichEdit1Change) in our own internal variable, and then setting the OnChange property to nil.
We then do all the processing we want to do - if it happens to trigger an [OnChange] event - there is nothing to call as OnChange is nil, and so the Event doesn't go onto the Message queue. When we're finished however we must return the OnChange property to its original value, otherwise the reprocessing won't happen next time around.
If we look at the Delphi Helpfile we see that the OnChange property is of a certain type, the same type we have to make our SaveOnChangeIn variable:
var
  SaveOnChangeIn: TNotifyEvent;
  ~~~~~rest of code
begin
  MyRe := TRichEdit(Sender);
  SaveOnChangeIn := MyRe.OnChange;
  MyRe.OnChange := nil;
  ~~~~~rest of code
  MyRe.OnChange := SaveOnChangeIn;
end;
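One thing worth considering with the save-and-restore trick: if anything between the two assignments raises an exception (GetToken raises one on an unterminated string), OnChange stays nil and the highlighting goes quiet. Below is a minimal console sketch of the same pattern wrapped in try..finally. TFakeEdit and THandler are invented for the illustration and are not part of the VCL; the fake control fires OnChange directly from the property setter, which is an assumption made to keep the demo small.

```pascal
program ReentryGuard;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  { Hypothetical stand-in for TRichEdit: assigning Text fires OnChange. }
  TFakeEdit = class
  private
    FText: string;
    FOnChange: TNotifyEvent;
    procedure SetText(const Value: string);
  public
    property Text: string read FText write SetText;
    property OnChange: TNotifyEvent read FOnChange write FOnChange;
  end;

  THandler = class
  public
    Calls: Integer;
    procedure EditChange(Sender: TObject);
  end;

procedure TFakeEdit.SetText(const Value: string);
begin
  FText := Value;
  if Assigned(FOnChange) then FOnChange(Self);
end;

procedure THandler.EditChange(Sender: TObject);
var
  Ed: TFakeEdit;
  Saved: TNotifyEvent;
begin
  Inc(Calls);
  Ed := TFakeEdit(Sender);
  Saved := Ed.OnChange;
  Ed.OnChange := nil;              // our own edits must not call us again
  try
    Ed.Text := UpperCase(Ed.Text); // stands in for the re-highlighting work
  finally
    Ed.OnChange := Saved;          // restored even if the work raised
  end;
end;

var
  Ed: TFakeEdit;
  H: THandler;
begin
  Ed := TFakeEdit.Create;
  H := THandler.Create;
  try
    Ed.OnChange := H.EditChange;
    Ed.Text := 'unit Unit1;';
    WriteLn(Ed.Text, ' / handler calls: ', H.Calls);
  finally
    H.Free;
    Ed.Free;
  end;
end.
```

The handler changes the text from inside itself, yet it should only be entered once per user change. Take out the line that sets OnChange to nil and the fake control recurses until the stack gives up.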
Try it out!!!
Compile and Run
Open up Unit1.pas in the "editor" we have written
Click in the RichEdit on the first line, in the middle of "unit".
Press the [space bar]
Press the [backspace key]
Arrow to the end of the line
Press [Enter]
Press [BackSpace]
[BackSpace] away the entire line
Re-type the entire line
Result: "functionally" the Control should look the same as it did before we clicked in it. The line "unit Unit1;" should be highlighted properly as per your Delphi 3.0 Editor (save the background colour). However it's slow and flickers a great deal. Try opening up a new line and just type a long phrase - e.g "if (RichEdit = Santa) then GetPresents('box of choclate'); " and you'll agree with me that:
GOOD - It is highlighting properly
BAD - There is flickering
BAD - The longer the line gets, the longer it takes to do the re-highlighting
BAD - you get the "someone is chasing me effect"
The flickering is due to a number of components. We'll have to deal with each separately.
The most obvious is the "selecting" of each Token. Visually the control is just repeating what we were able to do manually - when a piece of text is selected it becomes highlighted by the black stripe. We need to stop this from happening. Back to the helpfile(s) again. Have a search around, and come back after a snack break with some ideas... I'm hungry
Death to the black stripe
Marks: 5/10
Most of you would have found the HideSelection property of the RichEdit control. When it is set to TRUE and the RichEdit loses the focus (the user clicks onto another control) the selection bar (the black stripe) is hidden. In fact if you try it out by selecting some text in the RichEdit1 then clicking in the Edit1 control at the top of the "editor" you'll see the selection disappears! [Tab] back into the RichEdit control and it reappears. Let's do this programmatically:
begin
  Edit1.SetFocus;
  ~~~~~
  MyRe.SetFocus;
end;
Take my word for it, but if you look closely, the black stripe is gone. Pity we got stuck with a new one in the Edit1 control :-(
If you've programmed in Delphi you may know a little trick:
Delphi Rule #4: you can't SetFocus on a disabled Control.
The converse however is also true:
Delphi Rule #4b: a disabled Control is not "Focused"
So instead we can just disable (then enable) the RichEdit control like this:
begin
  MyRe := TRichEdit(Sender);
  MyRe.Enabled := False;
  ~~~~~~
  MyRe.Enabled := True;
end;
Oops. I should have known. After all I said it: a disabled Control is not "Focused" - barely ten lines ago! When the RichEdit is enabled again, we also have to SetFocus back to it. Shees..
begin
  MyRe := TRichEdit(Sender);
  MyRe.Enabled := False;
  ~~~~~~
  MyRe.Enabled := True;
  MyRe.SetFocus;
end;
Try it again. This time things are working better, and we're leaving poor old Edit1 Control alone. That's good practice, as it may have had an [OnFocus] event that does weirder things than what we're trying to do. Maybe not now, but it could in the future!
Marks: 10/10
On the other hand, some of you may have found instead the EM_HIDESELECTION message in the Win32.HLP. If you had delved in, you would have found something very interesting. The Delphi HideSelection property only implements half the capabilities of this message. You can also, by calling it directly, tell it to temporarily hide the black stripe even when the control has the focus. So instead you could use the following lines of code:
begin
  MyRe := TRichEdit(Sender);
  MyRe.Perform(EM_HIDESELECTION, 1, 0);
  ~~~~~~
  MyRe.Perform(EM_HIDESELECTION, 0, 0);
end;
Yummy. Nice clean coding.
Death to the FLICKER
The next major problem is this bloody flicker. You should pop back into Delphi for a second, and type some lines in its editor, to see if it flickers at all. It does. But only when it is changing colors when it recognizes a change has occurred. Otherwise it doesn't bother. Now look at what's happening in our "editor". Do you see?
The problem is that we are not "conserving" what we are doing. If something is still the same TokenType it doesn't need to be re-highlighted because it is already correct on the screen. We need to check if the TokenType of each token has changed since the last time we repainted this line, and only then repaint it.
In fact we don't need to do even that - we can just check whether the SelAttributes (which represents the current selection's attributes) is any different from what we want to change it to, i.e. FParseFont[MyTokenState]. This way even if the TokenType had changed, but the new and old TokenType shared the same display attributes, we would still conserve our drawing.
Actually the problem is that the RichEdit isn't doing the conserving. In the old text based system I used to use, if you printed something to the screen, and it was the same as something already on the screen, in the same position, then the program would not rewrite it to the screen. It would "conserve" the amount of writing it did, as in the old days 1200 baud screens were SLOW, and printing the same characters was a waste of time.
Huh - and people said we have come so far with Windows. Sloppy, Sloppy, Sloppy I say!
So let's replace:
MyRe.SelAttributes.Assign(PasCon.FParseFont[MyTokenState]);
with:
    if MyRe.SelAttributes.Name <> PasCon.FParseFont[MyTokenState].Name then
      MyRe.SelAttributes.Name := PasCon.FParseFont[MyTokenState].Name;
    if MyRe.SelAttributes.Color <> PasCon.FParseFont[MyTokenState].Color then
      MyRe.SelAttributes.Color := PasCon.FParseFont[MyTokenState].Color;
    if MyRe.SelAttributes.Style <> PasCon.FParseFont[MyTokenState].Style then
      MyRe.SelAttributes.Style := PasCon.FParseFont[MyTokenState].Style;
And off you go and try it out... (PS. Yes the last bit of code is bad programming...)
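One possible tidy-up of the three separate tests is to fold them into a single yes/no question and then do one assignment. The console sketch below shows only the shape of that idea. TAttr, TStyleFlag, MakeAttr and NeedsFormat are all invented stand-ins (the real code would compare MyRe.SelAttributes against PasCon.FParseFont[MyTokenState]), and testing the colour first is a guess about which attribute changes most often.

```pascal
program NeedsFormatDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  { Hypothetical stand-ins for the Name/Color/Style of SelAttributes. }
  TStyleFlag = (sfBold, sfItalic, sfUnderline);
  TStyleFlags = set of TStyleFlag;
  TAttr = record
    Name: string;
    Color: Integer;
    Style: TStyleFlags;
  end;

function MakeAttr(const AName: string; AColor: Integer;
  AStyle: TStyleFlags): TAttr;
begin
  Result.Name := AName;
  Result.Color := AColor;
  Result.Style := AStyle;
end;

{ True when what is on screen differs from what we want.
  "or" short-circuits, so the cheapest/likeliest test goes first. }
function NeedsFormat(const Current, Wanted: TAttr): Boolean;
begin
  Result := (Current.Color <> Wanted.Color)
         or (Current.Style <> Wanted.Style)
         or (Current.Name <> Wanted.Name);
end;

var
  OnScreen, KeyWord: TAttr;
  Pass, Repaints: Integer;
begin
  OnScreen := MakeAttr('Courier New', 0, []);
  KeyWord := MakeAttr('Courier New', 0, [sfBold]);
  Repaints := 0;
  for Pass := 1 to 2 do
    if NeedsFormat(OnScreen, KeyWord) then
    begin
      OnScreen := KeyWord;   // one assignment, like SelAttributes.Assign
      Inc(Repaints);
    end;
  WriteLn('Repaints: ', Repaints);
end.
```

The second pass finds nothing to do, so only one "repaint" is counted. In the real control the comparison would still only see the attributes that SelAttributes reports for the selection, so treat this as a starting point rather than a finished fix.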
SUCCESS - (Nearly...)
I think you'll agree we are pretty close. There is just a little bit of flicker. This flicker is the SelStart jumping the Cursor position around the text. We need to hide this. This "Cursor" is also known as a Caret. Looking through Win32.Hlp again we find the lovely, and appropriately named, HideCaret() function.
Let's try this then: every time we change the value of MyRe.SelStart let's call HideCaret(MyRe.Handle) immediately before.
I'll be kind - that doesn't work. I tried 2 x HideCaret(MyRe.Handle), and it still didn't work. Neither did three, four or 25x. So close - but yet - so far. I think it's time for another Delphi Rule:
DELPHI RULE #5 - If you bother to get your way through the atrocious index of the Win32.HLP file to find what you are looking for - make sure you really read what you found properly!
The key was the last paragraph in the description of not HideCaret but ShowCaret (which I had also read as I thought we were going to need it, especially to reverse my 25x HideCaret()). You also need another Delphi Rule to understand it:
The caret is a shared resource; there is only one caret in the system. A window should show a caret only when the window has the keyboard focus or is active.
DELPHI RULE #6 - Everything (basically) is a Window
You see the RichEdit is a windows control and is also.. in a weird sense.. a window. It has a Handle, which is why HideCaret would accept it. So re-reading the last line again we get:
The caret is a shared resource; there is only one caret in the system. A [RichEdit] should show a caret only when the [RichEdit] has the keyboard focus or is active.
So - in the end - we're back to where we started - we have to disable the RichEdit to stop the final bit of flickering. This also (coincidentally) means that EM_HIDESELECTION is not needed anymore (if HideSelection is set properly during Design time). So in the end everyone gets 10/10 for marks!
ASH Version 0.9b
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  SaveOnChangeIn: TNotifyEvent;
  WasSelStart, WasRow, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
  MyPBuff: array[0..255] of Char;
  MyTokenStr: string;
  MyTokenState: TTokenState;
  MyRun: PChar;
  MySelStart: Integer;
begin
  MyRe := TRichEdit(Sender);
  // Stop our own changes from re-triggering this event
  SaveOnChangeIn := MyRe.OnChange;
  MyRe.OnChange := nil;
  // A disabled control shows neither the selection stripe nor the caret
  MyRe.Enabled := False;
  WasSelStart := MyRe.SelStart;
  WasRow := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  Row := WasRow;
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
  StrPCopy(MyPBuff, MyRe.Lines.Strings[Row]);
  MyPBuff[Length(MyRe.Lines.Strings[Row])] := #0;
  MySelStart := BeginSelStart;
  MyRun := MyPBuff;
  while (MyRun^ <> #0) do
  begin
    MyRun := PasCon.GetToken(MyRun, MyTokenState, MyTokenStr);
    MyRe.SelStart := MySelStart;
    MyRe.SelLength := Length(MyTokenStr);
    // Only touch an attribute when it actually differs
    if MyRe.SelAttributes.Name <> PasCon.FParseFont[MyTokenState].Name then
      MyRe.SelAttributes.Name := PasCon.FParseFont[MyTokenState].Name;
    if MyRe.SelAttributes.Color <> PasCon.FParseFont[MyTokenState].Color then
      MyRe.SelAttributes.Color := PasCon.FParseFont[MyTokenState].Color;
    if MyRe.SelAttributes.Style <> PasCon.FParseFont[MyTokenState].Style then
      MyRe.SelAttributes.Style := PasCon.FParseFont[MyTokenState].Style;
    MySelStart := MySelStart + Length(MyTokenStr);
  end;
  MyRe.SelStart := WasSelStart;
  MyRe.SelLength := 0;
  MyRe.OnChange := SaveOnChangeIn;
  MyRe.Enabled := True;
  MyRe.SetFocus;
end;
Towards - ASH Version 1.0b
Couple of problems with the last version if you try it out for size:
It's slightly inefficient in that every time SelAttributes is changed it forces a repaint of the same token in the control. We should instead use some variable (e.g. var DoFormat: Boolean) to decide if we need to reformat, then check the value of DoFormat at the end of this checking, and do it all then with a simple SelAttributes.Assign(FParseFont[MyTokenState]). This means we can also change the separate "if" statements to a single if ... then .. else .. if ... then .. else which should run faster - especially if you put the various tests in order of likelihood (e.g. the font changes less frequently than the color, so should be further down the if..else..if)
For some reason if you type a {style comment} on a line, after about 4-7 characters it reverts to different colours. I can't seem to work out yet why this happens - but I understand why it's not being picked up. SelAttributes returns the value of the "initial" styling of the entire selected block. So if you select text which starts off black and then becomes blue, SelAttributes.Color will equal clBlack. We must also examine SelAttributes.ConsistentAttributes to ensure that the entire selection is consistent in the way it is highlighted. If it isn't - then we want to force it to be rehighlighted - it's obviously not in the correct format.
Multi-line comments are a big pain e.g { words <CR> word <CR> words }. I don't have them in my 4GL so I didn't need to fix this sort of problem. However I do have multi-line strings - so I need to be able to stretch strings across many lines. The trouble is we have to program over a number of lines - but have a look at what happens in Delphi when you place a "{" anywhere in the code. The highlighting can force a repaint of the entire 2,000,000 lines of text in the control. We could catch that situation - i.e. if the last token on the line is a tsComment and it doesn't end in '}' we could increase SelLength until it did or we reach the end of the RichEdit.Lines. (That's basically what the tokeniser does anyway with all that inc(Run).)
That's easy. But what happens if you then delete the "{"? You need to go forward 2,000,000 lines and put the highlighting back again. We could decide to keep going until the if...then..else..list didn't set DoFormat := True. But what happens if we're in a colour mode where Comment highlighting style = KeyWord highlighting style? We would stop prematurely. So this "logic" won't help in all situations.
You can still get the "someone is chasing you effect" - except now its "someone is fleeing from you" effect. It happens when you have (* This is a comment *) and delete the first *-character. The control takes an appreciable time to rehighlight the text.
While looking for a fix for the last problem, I remembered the RichEdit.Lines.BeginUpdate function. But that didn't help either. What we need is a RichEdit.BeginUpdate. What would that do? It would increase an internal counter by one every time it was called. RichEdit.EndUpdate would do the opposite. Then we would create our own WM_PAINT message handler. This message is received every time Windows wants the control to repaint a portion of itself. If we catch this message then we can stop processing these messages until the internal counter = 0 again. Then, and only then, will the Control repaint itself - ditching, we would hope, most of the intervening steps.
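The counter idea can be tried out on its own before going anywhere near a real control. The sketch below is a console-only model: TUpdateGate is a made-up class, RequestPaint merely stands in for an incoming WM_PAINT, and "painting" is just a counter going up. It shows the bookkeeping, not a working message handler.

```pascal
program UpdateGate;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  { Hypothetical helper: counts nested BeginUpdate calls and holds back
    paint requests until the count returns to zero. }
  TUpdateGate = class
  private
    FCount: Integer;
    FPending: Boolean;
  public
    Paints: Integer;            // how many "real" repaints happened
    procedure BeginUpdate;
    procedure EndUpdate;
    procedure RequestPaint;     // stands in for an incoming WM_PAINT
  end;

procedure TUpdateGate.BeginUpdate;
begin
  Inc(FCount);
end;

procedure TUpdateGate.RequestPaint;
begin
  if FCount > 0 then
    FPending := True            // remember it, paint later
  else
    Inc(Paints);                // a real control would paint here
end;

procedure TUpdateGate.EndUpdate;
begin
  if FCount > 0 then Dec(FCount);
  if (FCount = 0) and FPending then
  begin
    FPending := False;
    Inc(Paints);                // one repaint for the whole batch
  end;
end;

var
  Gate: TUpdateGate;
  I: Integer;
begin
  Gate := TUpdateGate.Create;
  try
    Gate.BeginUpdate;
    for I := 1 to 5 do
      Gate.RequestPaint;        // e.g. five tokens re-coloured
    Gate.EndUpdate;
    WriteLn('Repaints: ', Gate.Paints);
  finally
    Gate.Free;
  end;
end.
```

Five requests inside the Begin/End pair collapse into a single repaint. A real WM_PAINT handler would probably also have to validate the update region while it is holding paints back, otherwise Windows keeps sending the message; that part is left out here. Windows also has a WM_SETREDRAW message for switching redrawing off and on, which may be worth a try before writing a handler by hand.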
Fixing the multi-line comments:
My current idea is to use the RichEdit.Lines.Objects property to store the TokenType of the first token on each line. This way we could easily know how far we need to go when re-highlighting multi-line comments. Initially this would be set to nil. I think this will work.
[Editor update: This didn't actually work - as RichEdit.Lines.Objects isn't implemented in the TRichEdit control. It is always nil regardless of what you assign to it]
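Since the control won't hold the per-line information for us, one alternative (an untested idea, not something taken from YourPasEdit) is to keep it in a list of our own alongside the lines. The console sketch below records, for each line, whether it ends inside a { } comment, and after an edit re-scans only until a line's stored end state comes out unchanged. ScanLine and Rescan are invented names, and the scanner deliberately ignores strings, // and (* *) comments to stay short.

```pascal
program LineStates;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

{ Are we inside a brace comment after scanning Line, given whether we
  were inside one when the line began? (Simplified on purpose.) }
function ScanLine(const Line: string; InComment: Boolean): Boolean;
var
  I: Integer;
begin
  for I := 1 to Length(Line) do
    if InComment then
    begin
      if Line[I] = '}' then InComment := False;
    end
    else if Line[I] = '{' then
      InComment := True;
  Result := InComment;
end;

{ Re-scans from FromLine and stops after the first line whose end state
  is unchanged. Returns how many lines would need re-highlighting. }
function Rescan(Lines: TStrings; var EndsInComment: array of Boolean;
  FromLine: Integer): Integer;
var
  I: Integer;
  NewState: Boolean;
begin
  Result := 0;
  for I := FromLine to Lines.Count - 1 do
  begin
    NewState := ScanLine(Lines[I], (I > 0) and EndsInComment[I - 1]);
    Inc(Result);                       // this line gets re-highlighted
    if NewState = EndsInComment[I] then Break;
    EndsInComment[I] := NewState;
  end;
end;

var
  Lines: TStringList;
  States: array of Boolean;
  I: Integer;
begin
  Lines := TStringList.Create;
  try
    Lines.Add('unit Unit1;');
    Lines.Add('{ a comment that');
    Lines.Add('  runs on }');
    Lines.Add('interface');
    Lines.Add('implementation');
    SetLength(States, Lines.Count);
    for I := 0 to Lines.Count - 1 do   // initial full scan
      States[I] := ScanLine(Lines[I], (I > 0) and States[I - 1]);
    Lines[1] := ' a comment that';     // the user deletes the "{"
    WriteLn('Lines to re-highlight: ', Rescan(Lines, States, 1));
  finally
    Lines.Free;
  end;
end.
```

Deleting the brace on the second line means that line and the one after it need repainting, and then the scan stops instead of running to the end of the file. Because it compares comment state rather than colours, it should not stop early when comments and keywords happen to share a style. The catch is that the list has to be kept in step with the control whenever lines are inserted or deleted, which this sketch does not attempt.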
Upgrading to RichEdit98:
I'm also in the process of updating to the RichEdit98 components for Delphi 3.0-4.0. version 1.34 Author Alexander Obukhov, Minsk, Belarus. This control has a number of advances on the standard RichEdit control that ships with Delphi. Included in this are:
BeginUpdate,EndUpdate
Independent background colours
Margins
Hotspots
(Source code in full)
Anyway I hope you have enjoyed the adventure. I'm sorry if not all the examples compile as written. They may need some fixing to compile if you copy straight from the Browser into the Delphi Editor. Please send any comments to jonhd@hotmail.com.
Jon HogDog
by Robert Vivrette
--------------------------------------------------------------------------------
Automatic Syntax Highlighting Using a RichEdit Control (Updated with fixes)
Part 1 of "A Tale of Two RichEdits"
by Jon Hogan-Doran - jonhd@hotmail.com
Introduction
As a new programmer to Delphi, with a history in C, C++, Unix shells and a little Awk and Perl, I came to it with a particular goal in mind. I wanted to implement an Editor for a Australian-4GL Databse language called cl4. Having little to no experience in Windows programming, except some dalliance in Powerbuilder and SQLExpress, it seemed a pretty imposing goal.
One is often taught when learning a new language to read a few books, tryout the samples and then start some simple coding. Unfortunately I'm a bit of a sado-masochist, and love to jump in with all 10 fingers (and 10 toes) and try to implement the most difficult parts first, and all at once. Being able to print "Hello World" never caught my interest.
I guess I had a bad upbringing. I started programming by jumping into a language called "C" on SCO Xenix in 1986 with no programming experience at all. Just a book by two guys called Kerningham and Ritchie and some C code that wouldn't compile (it was BSD Unix based from memory). When I managed to get it to compile after 2 weeks, I decided to port the BinkleyTerm FidoNet mailer to Unix, while concurrently getting all the associated programs ported from Dos as well (the Fido mailer, Newsgroup controller, a entire BBS Program). I think I had 5 different DOS->UNIX programs compiling on 5 different virtual consoles all at once.
12 Months later and my [DosLib] and [FossilDriver] were done, BinkleyTerm was getting me my first newsgroups (comp.os.xenix.???) from a local friendly Sysop, and I was chewing up $1,000 in overseas phone bills organising a world-wide group of programmers to try out my handiwork.
So with Delphi I started on my next big challenge. I didn't want just any old editor for my old 4GL Unix database language. I wanted it all:
code-highlighting
on the spot syntax highlighting
code completion and suggesting
remote compile capabilities to my Linux box
jump to line with error
even a remote visual debugger
plus lots more goodies.
This is my adventure. Its a story not a traditional tutorial - the codes there if your not interested in reading my little tale. But if you are: sit back with a packet of chips and plenty of caffeine. And enjoy....
The Plan
No-one wants to re-invent the wheel. The whole point of OOP (Object oriented programming) is the reusability of objects. The whole point of the Internet was the reusability of someone else's object. Luckily I had read enough about these component "things" to want to find some more. So I connected up to the Internet and went "shopping".
My shopping list:
Internet controls (telnet, remote shell, remote execute, ftp)
Editor with multi-file capabilities
Syntax highlighter or Parser
Automatic Syntax highlighting
I shopped at:
Delphi Super Page
Torri's page
Borland
RxLib's
RzLib's
QuickReport Homepage
Within 4 hour I was off the net with a number of goodies, 23 or so assorted controls, programs and Internet suites, (and plenty of stuff that had nothing to do with the current project, but I thought might be of use on my next one). Eventually I whittled the choices down to:
YourPasEdit by D C AL CODA, Ken Hale and Coda Hale
PasToRTF by Martin Waldenburg (included in above)
TntClient by Francois Piette (and associated ICS suite)
RxTools by those great Russian dudes.
YourPasEdit was a great find, and its description said it all:
There are three main features in YourPasEdit. First is the PAS to RTF conversion unit, contained in mwPasToRtf.pas, by Martin Waldenburg. The second shows how to create TabSheets at run time, with a RichEdit. The third are the procedures written by Andrius Adamonis, which allow YourPasEdit to be associated with files and to open those files in the running instance of YourPasEdit, creating a new TabSheet and RichEdit for the newly opened file.
Unfortunately it also told me:
It is not intended to be a full Delphi file editor, because it does not highlight keywords on the fly, like a syntax aware editor.
But I wasn't complaining - I needed some challenge - otherwise it wouldn't be any fun! So I delved into how things were being done in YourPasEdit. To get the best from the following sections you should get your hands on at least YourPasEdit so you can follow and (more importantly) program along.
Syntax highlight (YourPasEdit)
In YourPasEdit, syntax highlighting was done by parsing the plain text file, dividing each line of text into separate "Tokens", working out what TokenType each token was, and formatting each one based on preset Font and Color settings. The following tokens were supported, corresponding to the token types in the Delphi 3.0 Editor:
TTokenState = (tsAssembler, tsComment, tsCRLF, tsDirective, tsIdentifier, tsKeyWord, tsNumber, tsSpace, tsString, tsSymbol, tsUnknown);
How these tokens were formatted (that is, with what Color and Attributes) was determined by the Delphi 3.0 Editor's own settings, as they are stored in the Windows Registry. Basically a line of Pascal source such as:
procedure TForm1.FormCreate(Sender: TObject); { Create the Form }
would be divided up into:
procedure              tsKeyWord
(space)                tsSpace
TForm1                 tsIdentifier
.                      tsSymbol
FormCreate             tsIdentifier
(                      tsSymbol
Sender                 tsIdentifier
:                      tsSymbol
TObject                tsIdentifier
);                     tsSymbol
(space)                tsSpace
{ Create the Form }    tsComment
<CR><LF>               tsCRLF
How is it Done?
The RichEdit control normally loads preformatted text from .RTF files by way of the RichEdit.Lines.LoadFromFile() function. YourPasEdit uses the RichEdit.Lines.LoadFromStream() function to load the file from a TPasConversion - a custom TMemoryStream descendant. This stream takes the plain text Pascal source file, loads it into its internal memory buffer, and then converts it from plain text to text impregnated with RTF codes. This way, when it is loaded into the RichEdit control via RichEdit.Lines.LoadFromStream, the Pascal source file appears in the control color-syntax highlighted.
To the main Editor, this process is transparent - the code looks something like this:
begin
  NewRichEdit := TRichEdit.Create(Self); // owner argument assumed; the original snippet elides it
  PasCon.Clear; // Prepare the TPasConversion
  PasCon.LoadFromFile(FName); // Load the File into the Memory Stream
  PasCon.ConvertReadStream; // Convert the stream to RTF format
  NewRichEdit.Lines.BeginUpdate;
  NewRichEdit.Lines.LoadFromStream(PasCon); // Read from the TPasConversion
  NewRichEdit.Lines.EndUpdate;
  NewRichEdit.Show;
  Result := NewRichEdit;
end;
EXAMPLE - snippet of code from the NewRichEditCreate(Fname) routine
As I said, it is the TMemoryStream derived TPasConversion which does all the hard work:
<SOURCE PASCAL FILE>
|
V
Plain source loaded into memory
(TPasConversion.LoadFromFile)
|
V
Converted internally by parsing the source file
(ConvertReadStream)
|
V
Result made available
(SetMemoryPointer)
|
V
RichEdit.LoadFromStream
Most of the work in TPasConversion is done by the ConvertReadStream procedure. Its purpose is to split up each line of source code into tokens (as shown previously) and then, depending on the TokenType, load each one into the outbuffer preceded by RTF codes to make it a particular Color, Bold, Italics etc. Here is what it looks like:
// prepare the Outbuf to a certain default size
FOutBuffSize := Size + 3;
ReAllocMem(FOutBuff, FOutBuffSize);
// Initialise the parser to its beginning state
FTokenState := tsUnknown;
FComment := csNo;
FBuffPos := 0;
FReadBuff := Memory;
// Write leading RTF Header
WriteToBuffer('{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS SansSerif;}' +
  '{\f1\froman\fcharset2 Symbol;}{\f2\fmodern Courier New;}}' + #13 + #10);
WriteToBuffer('{\colortbl\red0\green0\blue0;}' + #13 + #10);
WriteToBuffer('\deflang1033\pard\plain\f2\fs20 ');
// Create the INSTREAM (FReadBuff) and tokenize it
Result := Read(FReadBuff^, Size);
FReadBuff[Result] := #0;
if Result > 0 then
begin
  Run := FReadBuff;
  TokenPtr := Run;
  while Run^ <> #0 do
  begin
    case Run^ of
      #13: // Deal with CRLFs
        begin
          FComment := csNo;
          HandleCRLF;
        end;
      #1..#9, #11, #12, #14..#32: // Deal with various whitespaces, control codes
        begin
          while Run^ in [#1..#9, #11, #12, #14..#32] do inc(Run);
          FTokenState := tsSpace;
          TokenLen := Run - TokenPtr;
          SetString(TokenStr, TokenPtr, TokenLen);
          SetRTF;
          WriteToBuffer(Prefix + TokenStr + Postfix);
          TokenPtr := Run;
        end;
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      ~~~~~ much code removed ~~~~~~~~~~
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    end;
  end
EXAMPLE - snippet showing the while loop that breaks up the INSTREAM into recognised tokens
Most of the work is done by the [case Run^ of ... end;] section, which "breaks off" a token from the INSTREAM (FReadBuff) based on the logic in the case statement. The case statement is organised in such a way that it can quickly decipher the input stream into the various TokenTypes by examining each character in turn. Having worked out which TokenType it is, the actual encoding part is relatively easy:
FTokenState:= tsSpace;
TokenLen:= Run - TokenPtr;
SetString(TokenStr, TokenPtr, TokenLen);
ScanForRTF;
SetRTF;
WriteToBuffer(Prefix + TokenStr + Postfix);
EXAMPLE - basic steps in encoding the output stream of the TPasConversion object
What’s happening here is the program:
sets FTokenState to what we believe it is (in this part of the code it is tsSpace which matches any series of Whitespaces)
the length of the token is calculated by working out how far the current memory pointer (Run) has moved since we finished with the last token (TokenPtr).
the token is then copied from the Input buffer from the starting position of it in the memory buffer (TokenPtr) for the length of the token, into the variable TokenStr.
ScanForRtf just checks through the resultant TokenStr to ensure it doesn't have any funny characters that the RichEdit would confuse as RTF commands. If it finds any, it escapes them out.
SetRTF looks at the FTokenState to populate two global variables Prefix and Postfix with the appropriate RTF codes to give the token the right Color,Font,Boldness.
WriteToBuffer then simply puts the TokenStr, with the Prefix and Postfix around it, into the output buffer, and the loop continues on.
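The steps above can be sketched as a small stand-alone console program. This is only a minimal sketch, not the Pas2Rtf code itself: EscapeRtf and EncodeToken are hypothetical names, and the bold/italic codes picked per token type are assumptions for illustration (the real SetRTF takes its formatting from the Delphi Editor settings).

```pascal
program EncodeTokenSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  // Cut-down token set, for illustration only
  TTokenState = (tsComment, tsIdentifier, tsKeyWord, tsSpace, tsSymbol);

// Put a backslash in front of the three characters RTF treats specially.
function EscapeRtf(const S: string): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
  begin
    if S[i] in ['\', '{', '}'] then
      Result := Result + '\';
    Result := Result + S[i];
  end;
end;

// Wrap the escaped token in RTF codes chosen by its token type.
// The codes used here are assumed, not read from any settings.
function EncodeToken(const Token: string; State: TTokenState): string;
var
  Prefix, Postfix: string;
begin
  Prefix := '';
  Postfix := '';
  case State of
    tsKeyWord:
      begin
        Prefix := '\b ';
        Postfix := '\b0 ';
      end;
    tsComment:
      begin
        Prefix := '\i ';
        Postfix := '\i0 ';
      end;
  end;
  Result := Prefix + EscapeRtf(Token) + Postfix;
end;

begin
  WriteLn(EncodeToken('procedure', tsKeyWord));
  WriteLn(EncodeToken('{ Create the Form }', tsComment));
end.
```

Note that the escaping happens before the Prefix and Postfix are added, so the backslashes in the formatting codes themselves are never doubled up.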
Back to the topic: Syntax Highlighting (on-the-fly)
No source code is necessarily 100% applicable to your needs. I was fortunate in that most of the parser applied to my 4GL command syntax (e.g. Strings were strings, Numbers were numbers, and the Keywords were similar). As well, YourPasEdit had implemented most of the basic accessory tasks such as Printing, Find, Find and Replace, and Multi-File editing. It was just a matter of adding in the extras I was after.
PROBLEM #1 - No colours or fonts
One task the Parser didn't fully implement was Colors or different Fonts, or even font sizes. The reason for this (found after some trial and error) was that the SetRTF procedure knew nothing about how to do this. It only used the information regarding [Bold], [Italics] and [Underline] stored in the Win95 Registry for the Delphi Editor's settings to determine how to highlight each token. As for fonts - well, I hadn't realised that the Delphi Editor actually uses only one Font and Font size for all the different tokens - so that wasn't Pas2Rtf's fault. I was just being greedy.
Luckily the comments in Pas2Rtf.pas told me what the other values in the Registry coded for, especially where the important foreground color was stored. This meant some changes to:
1. procedure SetDelphiRTF(S: String; aTokenState: TTokenState);
Add after the try;
Font.Color := StrToInt(Ed_List[0]);
2. procedure TPasConversion.SetPreAndPosFix
Add after FPreFix[aTokenState] = '';
FPreFixList[aTokenState] := ColorToRtf(aFont.Color);
The ColorToRtf code is already present, but hadn't been used for some reason. If you try it out you'll understand why. You get absolutely no change except lots of ';' in the wrong place. Change the ';' to '(space)' in ColorToRtf(), and you get rid of the ';' appearing in the RichEdit control, but you still get no Colors.
My first thought was that the value in Ed_List[0] didn't convert to a proper Font.Color. The easiest way to test this was to hard code Font.Color := clGreen; and see what happens. Again no luck. The format was consistent with the RTF codes I could see in the RTF header. What the $#%#$%# was wrong with it ?
It was about then that I realised I needed a crash course in RTF document structure. For this I rushed off to www.microsoft.com (please forgive me) and found a reference on RTF. After an hour of reading a Microsoft Technical Document I was even more confused. Oh well - this meant it was time to get dirty. Time to get down to real programmer stuff. Time to "cheat".
What did I do? I went into WordPad (which is just a glorified RichEdit version 2.0 on steroids) and saved various files into RTF format. I then opened them in NotePad so I could see the RTF codes and compare what happened in each case: what codes were produced depending on what I did, or didn't do. A similar sort of technique was used back in the 1980s to decipher the first Paradox database format. Sorry, Borland.
blank.rtf - empty - so I could see the "plain" header line
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}
{\colortbl\red0\green0\blue0;}\deflang1033\pard\plain\f2\fs20 \par }
plaintext.rtf - to see how having any text was handled
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0\blue0;}
\deflang1033\pard\plain\f2\fs20 this is plain text
\par }
difffont.rtf - different font, same size, same text
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}{\f3\fswiss\fprq2 Arial;}}{\colortbl\red0\green0\blue0;}
\deflang1033\pard\plain\f3\fs20 plain text different font\plain\f2\fs20
\par }
diffsize.rtf - text set to 18 point in the default font
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0\blue0;}
\deflang1033\pard\plain\f2\fs36 plain text different font\plain\f2\fs20
\par }
diffcolor.rtf - etc. my favourite of course - blue.
{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS Sans Serif;}{\f1\froman\fcharset2 Symbol;}{\f2\froman Times New Roman;}}{\colortbl\red0\green0\blue0;\red0\green0\blue255;}
\deflang1033\pard\plain\f2\fs20\cf1 plain text different font\plain\f2\fs20
\par }
Looking at the resultant codes you see how the RTF stream is formatted. It comprises a:
INITIAL HEADER (\rtf1\.....)
FONTTABLE (\f0\fswiss...)
COLORTABLE (\colortbl)
MISCELLANEOUS
DEFAULT FORMAT (\pard....)
BODY OF THE FILE.
As a result of that I rewrote this code:
WriteToBuffer('{\rtf1\ansi\deff0\deftab720{\fonttbl{\f0\fswiss MS SansSerif;}' +
  '{\f1\froman\fcharset2 Symbol;}{\f2\fmodern Courier New;}}' + #13 + #10);
WriteToBuffer('{\colortbl\red0\green0\blue0;}' + #13 + #10);
WriteToBuffer('\deflang1033\pard\plain\f2\fs20 ');
to become:
WriteToBuffer('{\rtf1\ansi\deff0\deftab720');
WriteFontTable;
WriteColorTable;
WriteToBuffer('\deflang1033\pard\plain\f0\fs20 ');
The procedures WriteFontTable and WriteColorTable create a table of fonts and colors we can reference later on. Each Font and Color is stored by index in an internal TList, which acts as a lookup table: by matching the Font name or Color value we can find the [num] to code into the RTF stream at the required moment:
\f[num] = the index of the Font you want to use, as preset in the "on the fly" font table
\fs[num] = font size in half-points (for example 20 = 10 point)
\cf[num] = the index of the Color to use, as preset in the "on the fly" color table
\cb[num] = which background color to use (ignored in RichEdit version 2.0)
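To make the lookup idea concrete, here is a minimal stand-alone sketch of a color table. It is not the code from jhdPasToRtf.pas: ColorIndex and ColorTableRtf are hypothetical names, a fixed array stands in for the TList, and color values are assumed to be laid out as $00BBGGRR. Because the table is written as part of the header, every color has to be registered before the header is emitted.

```pascal
program ColorTableSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils;

const
  MaxColors = 16;

var
  Colors: array[0..MaxColors - 1] of LongInt; // values assumed to be $00BBGGRR
  ColorCount: Integer = 0;

// Return the table index for AColor, adding it to the table on first use.
// The index is the [num] to put after \cf in the RTF stream.
function ColorIndex(AColor: LongInt): Integer;
var
  i: Integer;
begin
  for i := 0 to ColorCount - 1 do
    if Colors[i] = AColor then
    begin
      Result := i;
      Exit;
    end;
  if ColorCount = MaxColors then
    raise Exception.Create('Color table full');
  Colors[ColorCount] := AColor;
  Result := ColorCount;
  Inc(ColorCount);
end;

// Build the {\colortbl ...} group from the colors registered so far.
function ColorTableRtf: string;
var
  i: Integer;
begin
  Result := '{\colortbl';
  for i := 0 to ColorCount - 1 do
    Result := Result + Format('\red%d\green%d\blue%d;',
      [Colors[i] and $FF, (Colors[i] shr 8) and $FF, (Colors[i] shr 16) and $FF]);
  Result := Result + '}';
end;

begin
  ColorIndex($000000);                 // black takes index 0
  WriteLn('\cf', ColorIndex($FF0000)); // blue takes the next free index
  WriteLn(ColorTableRtf);              // same table as in diffcolor.rtf
end.
```

A font table can be handled the same way, matching on the font name instead of the color value.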
PROBLEM #2 - Crashes in long comments or text (existing problem)
There is a bug in ScanForRtf. Can you see it?
procedure TPasConversion.AllocStrBuff;
begin
  FStrBuffSize := FStrBuffSize + 1024;
  ReAllocMem(FStrBuff, FStrBuffSize);
  FStrBuffEnd := FStrBuff + 1023;
end; { AllocStrBuff }
procedure TPasConversion.ScanForRtf;
var
  i: Integer;
begin
  RunStr := FStrBuff;
  FStrBuffEnd := FStrBuff + 1023;
  for i := 1 to TokenLen do
  begin
    case TokenStr[i] of
      '\', '{', '}':
        begin
          RunStr^ := '\';
          inc(RunStr);
        end
    end;
    if RunStr >= FStrBuffEnd then AllocStrBuff;
    RunStr^ := TokenStr[i];
    inc(RunStr);
  end;
  RunStr^ := #0;
  TokenStr := FStrBuff;
end; { ScanForRtf }
EXAMPLE - code snippet from Pas2Rtf demonstrating the "long comment" bug
The problem: if FStrBuff is enlarged using AllocStrBuff() (to make it big enough to handle a very long comment), the Windows Memory manager probably has to re-allocate it by moving the entire string buffer somewhere else in memory. RunStr, however, is not adjusted for this change and still points to the old memory area, now unallocated.
The fix: Re-point RunStr in the AllocStrBuff routine so it points to the correct place in the new area of memory. Try and fix it yourself, or look at my ghastly spaghetti code in jhdPasToRtf.pas.
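One way to do it is sketched below. This is not the code from jhdPasToRtf.pas, just a stand-alone illustration under a few assumptions: the buffer variables are plain globals rather than fields of TPasConversion, EscapeToBuffer is a hypothetical stand-in for ScanForRtf, and the sketch also moves FStrBuffEnd along with the buffer size, which the original listing does not do. The key step is remembering how far RunStr was into the old block and re-applying that offset to the new one.

```pascal
program ReallocFixSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

var
  FStrBuff: PAnsiChar = nil;
  FStrBuffEnd: PAnsiChar = nil;
  RunStr: PAnsiChar = nil;
  FStrBuffSize: Integer = 0;

procedure AllocStrBuff;
var
  Offset: Integer;
begin
  Offset := RunStr - FStrBuff;                // position relative to the old block
  FStrBuffSize := FStrBuffSize + 1024;
  ReallocMem(FStrBuff, FStrBuffSize);         // the block may move
  FStrBuffEnd := FStrBuff + FStrBuffSize - 1; // last usable byte of the new block
  RunStr := FStrBuff + Offset;                // re-point RunStr into the new block
end;

// Escape '\', '{' and '}' into the shared buffer, growing it as needed.
function EscapeToBuffer(const Token: AnsiString): AnsiString;
var
  i: Integer;
begin
  if FStrBuff = nil then AllocStrBuff;
  RunStr := FStrBuff;
  for i := 1 to Length(Token) do
  begin
    if Token[i] in ['\', '{', '}'] then
    begin
      if RunStr >= FStrBuffEnd then AllocStrBuff;
      RunStr^ := '\';
      Inc(RunStr);
    end;
    if RunStr >= FStrBuffEnd then AllocStrBuff;
    RunStr^ := Token[i];
    Inc(RunStr);
  end;
  RunStr^ := #0;       // always lands on or before FStrBuffEnd
  Result := FStrBuff;  // copies the buffer contents into a string
end;

begin
  // A 3004-character "comment" forces the buffer to grow several times
  WriteLn(Length(EscapeToBuffer('{ ' + StringOfChar('x', 3000) + ' }')));
  FreeMem(FStrBuff);
end.
```

Saving an offset rather than a pointer is the whole trick: an offset stays valid wherever the memory manager decides to put the block.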
Automatic Syntax Highlighting (my first implementation)
To understand how Automatic syntax highlighting works, you should have a close look at what happens in the Delphi 3.0 Editor. After all - if Borland was happy with it - who am I to argue?
Take note when the "syntax" changes and what is affected. In retrospect the difficult thing is to implement a highlighter that is:
Fast
Accurate
Doesn't flicker
Isn't obvious ("the someone is chasing me phenomenon".. you'll see)
1. When should we do the re-highlighting ?
In YourPasEdit the highlighting is done as the file is read in. Once this is done, the only way to make use of that technique would be to write out the file every time it changes and read it back in again - obviously a very slow process. In my case, I basically wanted to just reformat the line(s) that had been changed, immediately after the change had been made, i.e. after every new character, DELETE or BACKSPACE, or even Paste or DragDrop, had been processed. I needed something that was triggered every time the control was affected in such a way.
What I needed then was an [Event].
2. Which event - there's so many to choose from ?
A RichEdit, like any control, has a number of [Events] triggered when you do various things to the control. What is not obvious, is that many events trigger other events in turn. So in choosing which Event(s) to hang your code off you have to ensure that (a) it catches all situations where you need to "fix" the highlighting and (b) it doesn't become re-entrant (i.e. what you do in the [Event], doesn't trigger itself again or any other [Event] that would call the "highlighting code"). From a quick look at the helpfile, I decided that [OnChange] seemed a likely candidate. According to the Delphi 3.0 Helpfile:
Write an OnChange event handler to take specific action whenever the text for the edit control may have changed. Use the Modified property to see if a change actually occurred. The Text property of the edit control will already be updated to reflect any changes. This event provides the first opportunity to respond to modifications that the user types into the edit control.
You may be thinking however: "Heh? What about those other things - like Methods and Properties. Can't they also change the text?" They sure can - but most end up triggering [OnChange] anyhow.
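The re-entrancy worry from point (b) above is commonly handled with a simple guard flag. The sketch below only simulates the situation in a console program: TFakeEdit is a hypothetical stand-in for the RichEdit (it is not a VCL class), and upper-casing the text stands in for "highlighting" that itself modifies the control. It is an assumption about how one might guard the handler, not a description of what YourPasEdit does.

```pascal
program ReentrancyGuardSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

uses
  SysUtils;

type
  // Hypothetical stand-in for a control whose setter fires a change event
  TFakeEdit = class
  private
    FText: string;
    FUpdating: Boolean;
    procedure SetText(const Value: string);
  public
    ChangeCount: Integer;
    procedure Change;
    property Text: string read FText write SetText;
  end;

procedure TFakeEdit.SetText(const Value: string);
begin
  FText := Value;
  Change; // every modification notifies, including our own
end;

procedure TFakeEdit.Change;
begin
  if FUpdating then Exit; // ignore changes made by the handler itself
  FUpdating := True;
  try
    Inc(ChangeCount);
    Text := UpperCase(FText); // the "highlighting" modifies the control again
  finally
    FUpdating := False;
  end;
end;

// Type one piece of text and report how many times the handler really ran.
function CountChanges(const Typed: string): Integer;
var
  Edit: TFakeEdit;
begin
  Edit := TFakeEdit.Create;
  try
    Edit.Text := Typed;
    Result := Edit.ChangeCount;
  finally
    Edit.Free;
  end;
end;

begin
  WriteLn(CountChanges('begin')); // the nested notification is swallowed
end.
```

Without the FUpdating check, the assignment inside Change would call Change again, and so on until the stack ran out.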
3. Is it what I want? - Rich text controls (from Delphi3 Helpfile)
The rich text component is a memo control that supports rich text formatting. That is, you can change the formatting of individual characters, words, or paragraphs. It includes a Paragraphs property that contains information on paragraph formatting. Rich text controls also have printing and text-searching capabilities.
By default, the rich text editor supports
Font properties, such as typeface, size, color, bold, and italic format
Format properties, such as alignment, tabs, indents, and numbering
Automatic drag-and-drop of selected text
Display in both rich text and plain text formats.
(Set PlainText to True to remove formatting)
type TNotifyEvent = procedure(Sender: TObject) of object;
property OnChange: TNotifyEvent;
4. Is [OnChange] the event I want - is it the right one?
Live dangerously, let’s give it a go and see...by testing our assumptions out:
So I wrote my first [OnChange] event:
Create a New application
place on it one RichEdit (RichEdit1) and one Edit control (Edit1)
Code the [OnChange] for the RichEdit1 control like this:
procedure TForm1.RichEdit1Change(Sender: TObject);
begin
TRichEdit(Sender).Tag := TRichEdit(Sender).Tag + 1;
Edit1.Text := 'Tag=' + IntToStr(TRichEdit(Sender).Tag);
end;
In this case the Sender object is the RichEdit being changed. The code basically uses the RichEdit's Tag variable (initially 0) as a handy Control specific variable. Every time the [OnChange] event is called, it increases the Tag by 1, and displays its value in an Edit Control as Text. You should pre-set the RichEdit control with some text in it, otherwise the following may be confusing!
Compile and Run...
Click in the Control. Nothing...
Move around in it using CursorKeys... Nothing...
Click outside the control.. and then back inside.. Nothing...
Press the [Space Bar].. Tag=1...
Press [Backspace].. Tag=2...
Press return.. Tag=3..
Select some text.. No change..
CTRL-C some text.. No change..
CTRL-X some text.. Tag=4..
CTRL-V some text.. Tag=5..
As it looked good so far I then added to the Form1.OnShow event:
RichEdit1.Lines.LoadFromFile('c:\winzip.log'); {Just a plain text file hanging around }
to see what happened. And guess what - an [OnChange] event was called sometime and "Tag=1" was displayed in the Edit control as the proof when the Form appears for the first time. So we can see that procedures do call Events that apply to what they are doing.
5. What happens in Syntax Highlighting anyhow?
Watch carefully in the Delphi Editor. Now try and reproduce it. Open a WordPad (since WordPad is a souped up RichEdit basically). Read in a source file (e.g any Unit1.pas) and do syntax highlighting manually:
Select a token
Manipulate it using the buttons provided to change Font, Size, Color, and Bold
Move onto the next token
Goto 1
So therefore in [OnChange] we'll try and write code to reproduce what we have done manually.
6. Which text do I want.. and where do I get it ?
Hunting through the Delphi Helpfile on RichEdit controls we find that the actual text information in the RichEdit control is stored (or rather can be accessed from) either:
RichEdit.Text
Text contains a text string associated with the control.
TCaption = type string;
property Text: TCaption;
Description
Use the Text property to read the Text of the control or specify a new string for the Text value. By default, Text is the control name. For edit controls and memos, the Text appears within the control. For combo boxes, the Text is the content of the edit control portion of the combo box.
RichEdit.Lines
Lines contains the individual lines of text in the rich text edit control.
property Lines: TStrings;
Description
Use Lines to manipulate the text in the rich text edit control on a line by line basis. Lines is a TStrings object, so TStrings methods may be used for Lines to perform manipulations such as counting the lines of text, adding lines, deleting lines, or replacing the text in lines.
To work with the text as one chunk, use the Text property. To manipulate individual lines of text, the Lines property works better.
Now Lines seemed to be what I wanted - after all I wanted the Syntax highlighting to work on a line by line basis. So let's have a look at what's been changed.
Oh.. look at what? How can I tell which line is the one that is changed?
Unlike some Events, [OnChange] isn't passed any variables save the identity of the RichEdit control affected. The RichEdit Control doesn't have a runtime variable that tells us either. The only variables are SelStart and SelLength - but they're about selecting text, aren't they? I just want to know what line I'm on :-(
It was about then that I re-read the information on the Sel??? properties, and recalled my "concept" code. Selection - I realised - was the name of the game. By manipulating these variables I could reproduce what I was doing manually - selecting text - as program code. Once selected you can then manipulate the attributes of the selected text through the SelAttributes structure.
Let's get familiar with these variables (in summary):
SelStart         Position of the Cursor, or the beginning of the selected text
SelLength        0 if SelStart = Cursor Pos, or length of selected Text
SelText          Empty if no text selected, or actual text selected
SelAttributes    Default attributes if I was to start typing at the Cursor position, OR the actual attributes of the selected text
There is actually no other way to access the attributes of the text already in the Control than by programmatically accessing them via manipulating Sel variables (*if you stick to using the defined properties and methods).
In the end RichEdit.Text and RichEdit.Lines are just plain old strings - not really "rich" at all. The other thing to note is that SelStart is a 0-based index starting at the first character of RichEdit.Text - so it looks like RichEdit.Lines is out the door.
7. Okay implement: Select a Token
Basically I wanted to start at the beginning of the line, send just that line to PasCon, and read it back in and replace the current line with the result. Trouble is RichEdit doesn't give you access to the 'RTF' representation of a single line. Plus I still can't tell when the beginning or end of the line is. Since the latter seems to be a nagging problem, we better fix that first - trouble is: How?
When all else fails - WinAPI calls of course.
Most visual controls in Delphi are in fact just native Windows controls encapsulated as Delphi types. You can still use Windows API functions to access the control underneath. This way it is possible to access information not accessible through Delphi's public Properties, Methods or Events. Time to delve through Win32.HLP and see what it has to say about RichEdit controls. It's stored in C:\Program Files\Borland\Delphi 3.0\Help if you don't have a shortcut to it.
Open Win32.HLP -> [ Contents ] -> [ RichEdit controls ] -> [ Rich Edit Controls ]
I spent some time getting to know the "full" capabilities of the RichEdit control hidden behind Delphi's implementation of it. Much of what I learned came in handy later on (as you'll see) and as a result I derived my second two Delphi Rules:
Delphi Rule #2: If your Project hinges on the capabilities of a certain control - make sure you know everything about it - from the beginning.
Delphi Rule #3: "Reference" is not the same as "Summary" (also known as Win32.HLP Rule#1)
Eventually I discovered the key in the [ Rich Edit Control Reference ] under "Lines and Scrolling". I had thought this page was simply a summary of the messages discussed in the preceding help pages. Actually it included a number of extra messages not discussed elsewhere - the exact ones I was after!
Lines and Scrolling
EM_LINEFROMCHAR - give it a 0-based character index and it returns the line
EM_LINEINDEX - give it a line and you get the index of its first character
EM_LINELENGTH - give it the index of a character in a line and you get the length of that line
So let's start coding.
(NB: To use the constants (EM_???) you'll have to manually add RichEdit in the uses clause)
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  WasSelStart, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
begin
  MyRe := TRichEdit(Sender);
  WasSelStart := MyRe.SelStart;
  Row := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
  // I didn't use the EM_LINELENGTH message, as the value was available via Delphi
  Edit1.Text := IntToStr(WasSelStart) + '-' +
    IntToStr(Row) + '-' +
    IntToStr(BeginSelStart) + '-' +
    IntToStr(EndSelStart);
end;
To start with I used the Edit1 Control to display the results of all these variables. I then tried manipulating text in the RichEdit to see what values I got. You should do the same. Type slowly in:
1234567890<CR>1234567890<CR>1234567890
and see how the results are reflected in the Edit control as you do so. Then experiment - try adding stuff to the ends of lines, and in the beginning of the line, and middle of lines. You may have to refer back to the Code to work out which number represents which variable.
Okay, now using the variables we have, let's try selecting the text of the current line, and display it in a new Edit Control (Edit2).
Add the following code to see what happens (don’t forget to add the second edit control and make it as wide as possible):
MyRe.SelStart := BeginSelStart;
MyRe.SelLength := EndSelStart - BeginSelStart;
Edit2.Text := MyRe.SelText;
end;
Run the program and try it out.
OOPS - That doesn't work - the text remains selected and the original cursor position is lost.
We need to reset SelStart and SelLength before we finish in the [OnChange] event. So let’s add at the end:
MyRe.SelStart := WasSelStart; // back to where we started
MyRe.SelLength := 0; // nothing selected
end;
While playing with text in the edit control I discovered something weird.
If you typed [1] then <CR> then [2] the Edit1 displayed [4-1-3-4].
But there were only two characters in the display.
I made a mistake. It appears that RichEdit.Text can tell you where the beginning and end of line is. Why? Because you can access the <CR><LF> characters in the Text string. So we could have manipulated the Text property of the control to work out the beginning and end of lines by reading back and forward from SelStart to find <CR><LF> characters. We may not have known which line we were on, but we would know where it began and ended. Nevertheless we should keep this in mind, it might come in handy later.
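The scan just described can be sketched without any control at all. This is a minimal sketch under stated assumptions: LineStartOf and LineEndOf are hypothetical helper names, SelStart is taken to be a 0-based offset that counts <CR> and <LF> as separate characters (as the [4-1-3-4] reading above suggests), and it is assumed to lie between 0 and the length of the text.

```pascal
program LineBoundsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

// 0-based offset of the first character of the line holding the caret.
function LineStartOf(const Text: string; SelStart: Integer): Integer;
begin
  Result := SelStart;
  // Text is 1-based, so Text[Result] is the character just before the caret
  while (Result > 0) and not (Text[Result] in [#13, #10]) do
    Dec(Result);
end;

// 0-based offset just past the last character of that line.
function LineEndOf(const Text: string; SelStart: Integer): Integer;
begin
  Result := SelStart;
  // Text[Result + 1] is the character just after the caret
  while (Result < Length(Text)) and not (Text[Result + 1] in [#13, #10]) do
    Inc(Result);
end;

var
  Sample: string;
begin
  // The same text as typed above: [1] then <CR> then [2], caret at the end
  Sample := '1' + #13#10 + '2';
  WriteLn(LineStartOf(Sample, 4), '-', LineEndOf(Sample, 4)); // prints 3-4
end.
```

The two numbers printed correspond to BeginSelStart and EndSelStart in the [OnChange] code, so the last two fields of [4-1-3-4] can be reproduced from the Text string alone.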
But it doesn't matter - the EM_###### messages are a neat way of doing things. And they work. For the moment at least we'll stick with them.
7. Okay implement: Part 2 - Change the format
After the line Edit2.Text := MyRe.SelText, but before the "resetting" part, let's put some logic in to turn lines RED when they are longer than a certain length:
if (MyRe.SelLength > 10) then MyRe.SelAttributes.Color := clRed;
You'll notice two things if you test this out. First - it does work. Second, however, is that if you type a line > 10 characters, press return and type one character - it's in Red. This is because it inherits the Attributes of the preceding text. Just like if you have bold on in a Word processor, it doesn't reset if you press return. So let's change the line to include an else situation:
else MyRe.SelAttributes.Color := clBlack;
That seems to work - except when you press return in the middle of a > 10 character line you have already typed (which is already Red) to leave a stump < 10 characters on the line above - it remains red. This is because the code leaves you on the next line, and SelStart refers to this new line, not the previous one. In our eventual code, we'll have to take care to ensure this doesn't happen - we have to catch this situation and deal with it. It won't be the only situation, I'm sure....
PS: There will be a number of situations where we'll have to be careful. Can you think of any now? Try putting a lot of text in the Control (or manipulate a loaded file), selecting some, and using the inherent Drag and Drop (move your Mouse over some selected text, press and hold down the Left MouseButton and then drag away) to move some text. This only triggers one OnChange Event. We may also be moving multiple lines along the way. In the future we'll have to put in some code to detect this happening, and ensure the [OnChange] event can deal with the need to reformat in two different locations. That means keeping in the back of our heads how we may have to deal with this kind of situation, and ensuring the code that deals with the simple situation can be adapted - i.e. is "versatile".
8. Basically it all seems to kind-of work.. can't we do some real programming now?
Okay, okay. But first we have a problem. Actually a rather big problem. The problem is PasCon. Why?
First: It returns RTF code.
Problem: We can't use RTF code.
Second: it's designed to work on an entire stream, and then give it back to us again as a whole.
Problem: We actually want greater control over it than this "all or nothing" approach.
OOP to the Rescue
When you have something that works in one situation, and needs to be applied in another situation where it has to do a similar, but subtly different job - you have two choices:
copy the function, and re-write it for the new situation, or
kludge around it (e.g use Pas2Rtf, and then write a RtfCodes2RtfControl procedure).
Modern languages however give you an option: OOP it. "Objectify" it. This is more than just deriving something from an existing object. It is in a sense programming in a "state of mind". Controls should be created so they can be used in a variety of situations - rather than being situation specific. In this case all PasCon can deal with is tokenising the input stream and returning RTF-coded text. What we really need to do is divide it into two entities. We need to separate the [Parsing/Recognise the Token and TokenType] from the [Encode it in RTF codes].
So let's start with ConvertReadStream, editing it so it looks something like this:
function TPasConversion.ConvertReadStream: Integer;
begin
  FOutBuffSize := size+3;
  ReAllocMem(FOutBuff, FOutBuffSize);
  FTokenState := tsUnknown;
  FComment := csNo;
  FBuffPos := 0;
  FReadBuff := Memory;
  {Write leading RTF}
  WriteToBuffer('{\rtf1\ansi\deff0\deftab720');
  WriteFontTable;
  WriteColorTable;
  WriteToBuffer('\deflang1033\pard\plain\f2\fs20 ');
  Result := Read(FReadBuff^, Size);
  if Result > 0 then
  begin
    FReadBuff[Result] := #0;
    Run := FReadBuff;
    while Run^ <> #0 do
    begin
      Run := GetToken(Run, FTokenState, TokenStr);
      ScanForRTF;
      SetRTF;
      WriteToBuffer(PreFix + TokenStr + PostFix);
    end;
    {Write ending RTF}
    WriteToBuffer(#13+#10+'\par }{'+#13+#10);
  end;
  Clear;
  SetPointer(FOutBuff, fBuffPos-1);
end; { ConvertReadStream }
The code for ConvertReadStream is now much smaller, and also easier to understand. We can then take all the code that used to be in ConvertReadStream that did the tokenizing and create a new subroutine - the GetToken function, which just does the recognizing and labelling of the individual tokens. In the process we also lose a huge number of repeated lines of code, as well as a number of sub-routines such as HandleBorCom and HandleString.
//
// My Get Token routine
//
function TPasConversion.GetToken(Run: PChar; var aTokenState: TTokenState;
  var aTokenStr: string): PChar;
begin
  aTokenState := tsUnknown;
  aTokenStr := '';
  TokenPtr := Run; // Mark where we started
  case Run^ of
    #13:
      begin
        aTokenState := tsCRLF;
        inc(Run, 2);
      end;
    #1..#9, #11, #12, #14..#32:
      begin
        while Run^ in [#1..#9, #11, #12, #14..#32] do inc(Run);
        aTokenState := tsSpace;
      end;
    'A'..'Z', 'a'..'z', '_':
      begin
        aTokenState := tsIdentifier;
        inc(Run);
        while Run^ in ['A'..'Z', 'a'..'z', '0'..'9', '_'] do inc(Run);
        TokenLen := Run - TokenPtr;
        SetString(aTokenStr, TokenPtr, TokenLen);
        if IsKeyWord(aTokenStr) then
        begin
          if IsDirective(aTokenStr) then aTokenState := tsDirective
          else aTokenState := tsKeyWord;
        end;
      end;
    '0'..'9':
      begin
        inc(Run);
        aTokenState := tsNumber;
        while Run^ in ['0'..'9', '.', 'e', 'E'] do inc(Run);
      end;
    '{':
      begin
        FComment := csBor;
        aTokenState := tsComment;
        while not ((Run^ = '}') or (Run^ = #0)) do inc(Run);
        if Run^ <> #0 then inc(Run); // don't step past the terminator on an unclosed comment
      end;
    '!', '"', '%', '&', '('..'/', ':'..'@', '['..'^', '`', '~':
      begin
        aTokenState := tsUnknown;
        while Run^ in ['!', '"', '%', '&', '('..'/', ':'..'@', '['..'^',
          '`', '~'] do
        begin
          case Run^ of
            '/':
              if (Run + 1)^ = '/' then
              begin
                if (aTokenState = tsUnknown) then
                begin
                  while (Run^ <> #13) and (Run^ <> #0) do inc(Run);
                  FComment := csSlashes;
                  aTokenState := tsComment;
                  break;
                end
                else
                begin
                  aTokenState := tsSymbol;
                  break;
                end;
              end;
            '(':
              if (Run + 1)^ = '*' then
              begin
                if (aTokenState = tsUnknown) then
                begin
                  while (Run^ <> #0) and not ((Run^ = ')') and ((Run - 1)^ = '*')) do inc(Run);
                  FComment := csAnsi;
                  aTokenState := tsComment;
                  if Run^ <> #0 then inc(Run); // same guard for an unclosed (* comment
                  break;
                end
                else
                begin
                  aTokenState := tsSymbol;
                  break;
                end;
              end;
          end;
          aTokenState := tsSymbol; inc(Run);
        end;
        if aTokenState = tsUnknown then aTokenState := tsSymbol;
      end;
    #39:
      begin
        aTokenState := tsString;
        FComment := csNo;
        repeat
          case Run^ of
            #0, #10, #13: raise Exception.Create('Invalid string');
          end;
          inc(Run);
        until Run^ = #39;
        inc(Run);
      end;
    '#':
      begin
        aTokenState := tsString;
        while Run^ in ['#', '0'..'9'] do inc(Run);
      end;
    '$':
      begin
        aTokenState := tsNumber; // was FTokenState, which left the var parameter unset
        while Run^ in ['$', '0'..'9', 'A'..'F', 'a'..'f'] do inc(Run);
      end;
  else
    if Run^ <> #0 then inc(Run);
  end;
  TokenLen := Run - TokenPtr;
  SetString(aTokenStr, TokenPtr, TokenLen);
  Result := Run;
end; { GetToken }
ASH - Automatic Syntax highlight (Attempt 2)
[Please note: I have my Delphi Editor colors set-to the [Ocean] colour speed settings for testing purposes. This setting works well on the default RichEdit white background, and most TokenTypes are in different colors from each other]
Okay, now to do some real work. Most of the functions we need have already been written. As a basis for writing this ASH I'm going to use Project1.dpr, which comes out of mpas2rtf.zip in the YourPasEdit zip file yrpasedit.zip. This is because it is much smaller than YourPasEdit, and thus quicker to compile.
I suggest you put the contents of the mpas2rtf.zip into a separate directory. Also copy mwPas2Rtf.pas to testinput.pas using the Explorer shell - we'll be using this file as a sample pascal file for benchmarking.
Open Project1.dpr in Delphi, compile Project1, run it, and open the file testinput.pas by pressing [Button 1] and selecting it in the [OpenFile Dialog]. Do it a number of times, and record the time taken for each run once the file has settled in the system cache. On my system it averages about 0.47 - 0.41 seconds once it's in the cache (P133 - 16M - Win95b).
Preparing Project1's Unit1.pas
Now replace the contents of mpas2rtf.pas with the code in jhdpas2rtf.pas. Recompile. Now open up the testinput.pas sample file again by using [Button 1]. As you see - we get color - but it takes a "lot" longer: 1.20 - 1.25 seconds.
Try and speed it up if you like. You can start by commenting out the Pascal code that sets the different Fonts and FontSizes in TPasConversion.SetRtf. Recompile and run again. This time it improves a bit to 1.10 - 1.15 seconds. Now try commenting out the code for the different Colors. Wow - the time drops to 0.49 - 0.44 seconds.
Hmm. This font and color stuff really packs a punch. We may need to look at this later in more detail if things end up too slow. For the moment we'll leave the code back in full working condition (so you'll need to go back and uncomment the code).
Now put the following base code into the [OnChange] event of the RichEdit1 in Unit1.pas of Project1. Most of this code is just based on what we have already covered elsewhere.
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  WasSelStart, WasRow, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
  MyPBuff: array[0..255] of Char;
begin
  MyRe := TRichEdit(Sender);
  WasSelStart := MyRe.SelStart;
  WasRow := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  Row := WasRow;  // must be set before it is used below
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
end;
We're going to use the GetToken() function to do all the hard work. We'll need some extra variables to pass to the GetToken function, so add to the var section:
  MyTokenStr: string;
  MyTokenState: TTokenState;
  MyRun: PChar;
  MySelStart: Integer;
These are similar to the variables we used in the ConvertReadStream - in fact we want to do "exactly" the same thing, just one single line at a time. Add this code before the last end;
  StrPCopy(MyPBuff, MyRe.Lines.Strings[Row]);
  MyPBuff[Length(MyRe.Lines.Strings[Row])] := #0;
  MySelStart := BeginSelStart;
  MyRun := MyPBuff;
  while (MyRun^ <> #0) do
  begin
    MyRun := PasCon.GetToken(MyRun, MyTokenState, MyTokenStr);
    //
    // ScanForRtf;
    // SetRtf;
    // WriteBuffer(Prefix + TokenStr + Postfix);
    //
  end;
end;
NB: As we will be using PasCon you'll have to move it from being a local variable of TForm1.Button1Click to be a global variable. This will mean you'll have to move all the initialising:
PasCon:=TPasConversion.Create;
PasCon.UseDelphiHighlighting(3);
to a TForm1.Show, and the PasCon.Free to TForm1.Close procedure. It will still work if you only move the variable definition - but not for long...
I've left the code from the old ConvertReadStream in the example above to show what we "logically" still need to implement in the current context - that is, manipulating the RichEdit Control directly. What we have now is the ability to cut up the current line into different tokens, and know what type they are. We now have to add these tokens to the current line with the right attributes (Fonts, Colors, Bold etc).
But wait. They are already on the line - well, the text is anyway, but maybe not in the correct format (Color, Bold etc). So what we could actually do is select each token at its corresponding position in the RichEdit control and just apply the appropriate attributes to it.
We did this back in the beginning, remember? When we set the >10 character lines to the color red. But how do we do this now? Let's look at what we have in the variables at hand when we hit "// SetRtf" the first time:
(this example uses Unit1.pas as the input file as it's more interesting)
VARIABLES
                    01234567890123456789
Lines.Strings[R0]   unit Unit1;
MyPBuff             unit Unit1;
MyTokenState        tsIdentifier
MyTokenStr          unit
MyRun               Unit1;
So what we need to do is select the word 'unit' in the RichEdit control, and set its attributes. We do this by setting SelStart to the position of 'unit' in the RichEdit control, and SelLength to the length of the word 'unit'. And since 'unit' is at the beginning of the current line, that position is BeginSelStart (which I have conveniently stored in MySelStart - you'll see why). Let's replace the "pseudo" comment code with the following:
MyRe.SelStart := MySelStart;
MyRe.SelLength := Length(MyTokenStr);
MyRe.SelAttributes.Assign(PasCon.FParseFont[MyTokenState]);
end;
But remember we are in a loop - when we go around again we'll have the next token in the line, and the variables will look like this:
VARIABLES
                    01234567890123456789
Lines.Strings[R0]   unit Unit1;
MyPBuff             unit Unit1;
MyTokenState        tsSpace
MyTokenStr          (space character)
MyRun               Unit1;
But (space character) isn't at BeginSelStart (#0) in the RichEdit control. It's further along (at position #4), which just happens to be BeginSelStart + Length('unit'). We need to update MySelStart after we process the preceding token, but before we go around the loop again:
MySelStart := MySelStart + Length(MyTokenStr);
end;
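The running-offset bookkeeping can be checked without a RichEdit at all. The console sketch below is illustrative only: SplitRun is a hypothetical stand-in for PasCon.GetToken (it merely separates runs of spaces from runs of non-spaces), and TokenSelStart is a made-up helper that returns the SelStart the loop would use for a given token.

```pascal
program OffsetSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical stand-in for PasCon.GetToken: a token is either a run of
// spaces or a run of non-space characters.
function SplitRun(Run: PChar; var Tok: string): PChar;
var
  Start: PChar;
begin
  Start := Run;
  if Run^ = ' ' then
  begin
    while Run^ = ' ' do Inc(Run);
  end
  else
  begin
    while (Run^ <> #0) and (Run^ <> ' ') do Inc(Run);
  end;
  SetString(Tok, Start, Run - Start);
  Result := Run;
end;

// Returns the would-be SelStart of the Index-th token (0-based) on a line
// that begins at BeginSelStart, or -1 if the line has fewer tokens.
function TokenSelStart(const Line: string; BeginSelStart, Index: Integer): Integer;
var
  Run: PChar;
  Tok: string;
  SelStart, N: Integer;
begin
  Result := -1;
  SelStart := BeginSelStart;
  Run := PChar(Line);
  N := 0;
  while Run^ <> #0 do
  begin
    Run := SplitRun(Run, Tok);
    if N = Index then
    begin
      Result := SelStart;
      Exit;
    end;
    Inc(SelStart, Length(Tok));   // advance past the token just handled
    Inc(N);
  end;
end;

begin
  // With the line starting at offset 100: 'unit', the space, then 'Unit1;'
  WriteLn(TokenSelStart('unit Unit1;', 100, 0));
  WriteLn(TokenSelStart('unit Unit1;', 100, 1));
  WriteLn(TokenSelStart('unit Unit1;', 100, 2));
end.
```

Because each token's start is the previous start plus the previous token's length, the line never has to be searched for the token text.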
Okay - this is where we are standing at the moment:
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  WasSelStart, WasRow, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
  MyPBuff: array[0..255] of Char;
  MyTokenStr: string;
  MyTokenState: TTokenState;
  MyRun: PChar;
  MySelStart: Integer;
begin
  MyRe := TRichEdit(Sender);
  WasSelStart := MyRe.SelStart;
  WasRow := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  Row := WasRow;
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
  StrPCopy(MyPBuff, MyRe.Lines.Strings[Row]);
  MyPBuff[Length(MyRe.Lines.Strings[Row])] := #0;
  MySelStart := BeginSelStart;
  MyRun := MyPBuff;
  while (MyRun^ <> #0) do
  begin
    MyRun := PasCon.GetToken(MyRun, MyTokenState, MyTokenStr);
    // Select the token in the control and apply its attributes
    MyRe.SelStart := MySelStart;
    MyRe.SelLength := Length(MyTokenStr);
    MyRe.SelAttributes.Assign(PasCon.FParseFont[MyTokenState]);
    MySelStart := MySelStart + Length(MyTokenStr);
  end;
  // Put the caret back where the user left it
  MyRe.SelStart := WasSelStart;
  MyRe.SelLength := 0;
end;
Now: put the Debugging code on, do [Build All] and then [Run], and set a breakpoint on the first line of this Event. Open up the testinput.pas. When the debugger stops in the OnChange event, Press <F9> to continue on, and press <F9> again, and again - Do you see? We keep going back into the Event again and again (and again). What’s happening?
Somehow in our event we are triggering off another [OnChange] event. This call to the [OnChange] event code is stored in the message queue. When the event we're currently in is finished, a new one is just waiting on the Event queue, which executes and creates more events... a re-entrant loop.
This behaviour is not surprising - after all we are actually changing the control in the process of our code, so no wonder another [OnChange] event is being triggered.
The way to fix such things is to ensure our actions do not trigger the Event. We can do this by "temporarily" storing the RichEdit's OnChange property (which contains a reference to our procedure TForm1.RichEdit1Change) in our own internal variable, and then setting the OnChange property to nil.
We then do all the processing we want to do - if it happens to trigger an [OnChange] event, there is nothing to call as OnChange is nil, and so the Event doesn't go onto the Message queue. When we're finished, however, we must return the OnChange property to its original value, otherwise the reprocessing won't happen next time around.
If we look at the Delphi Helpfile we see that the OnChange property is of a certain type, the same type we have to make our SaveOnChangeIn variable:
var
SaveOnChangeIn: TNotifyEvent;
~~~~~rest of code
begin
MyRe := TRichEdit(Sender);
SaveOnChangeIn := MyRe.OnChange;
MyRe.OnChange := nil;
~~~~~rest of code
MyRe.OnChange := SaveOnChangeIn;
end;
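The save / nil / restore pattern can be exercised in a console program. This is a sketch under stated assumptions: TFakeEdit is a hypothetical stand-in for a control that fires OnChange whenever its text is set, and the try..finally wrapper is an addition not present in the article's code, included so the handler is restored even if the processing raises an exception.

```pascal
program GuardSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils, Classes;

type
  // Hypothetical stand-in for a control that fires OnChange on every edit.
  TFakeEdit = class
  private
    FText: string;
    FOnChange: TNotifyEvent;
    procedure SetText(const Value: string);
  public
    property Text: string read FText write SetText;
    property OnChange: TNotifyEvent read FOnChange write FOnChange;
  end;

  THandler = class
  public
    Calls: Integer;   // how many times the handler was entered
    procedure EditChange(Sender: TObject);
  end;

procedure TFakeEdit.SetText(const Value: string);
begin
  FText := Value;
  if Assigned(FOnChange) then FOnChange(Self);
end;

procedure THandler.EditChange(Sender: TObject);
var
  Edit: TFakeEdit;
  Saved: TNotifyEvent;
begin
  Inc(Calls);
  Edit := TFakeEdit(Sender);
  Saved := Edit.OnChange;
  Edit.OnChange := nil;            // our own edit below must not call us again
  try
    Edit.Text := UpperCase(Edit.Text);
  finally
    Edit.OnChange := Saved;        // always restore the handler
  end;
end;

var
  E: TFakeEdit;
  H: THandler;
begin
  E := TFakeEdit.Create;
  H := THandler.Create;
  try
    E.OnChange := H.EditChange;
    E.Text := 'unit';
    WriteLn(E.Text, ' ', H.Calls);
  finally
    H.Free;
    E.Free;
  end;
end.
```

Without the two OnChange assignments, setting Text inside the handler would call the handler again and never return.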
Try it out!!!
Compile and Run
Open up Unit1.pas in the "editor" we have written
Click in the RichEdit on the first line, in the middle of "unit".
Press the [space bar]
Press the [backspace key]
Arrow to the end of the line
Press [Enter]
Press [BackSpace]
[BackSpace] away the entire line
Re-type the entire line
Result: "functionally" the Control should look the same as it did before we clicked in it. The line "unit Unit1;" should be highlighted properly as per your Delphi 3.0 Editor (save the background colour). However it's slow and flickers a great deal. Try opening up a new line and just type a long phrase - e.g. "if (RichEdit = Santa) then GetPresents('box of chocolate'); " and you'll agree with me that:
GOOD - It is highlighting properly
BAD - There is flickering
BAD - The longer the line gets, the longer it takes to do the re-highlighting
BAD - you get the "someone is chasing me effect"
The flickering is due to a number of components. We'll have to deal with each separately.
The most obvious is the "selecting" of each Token. Visually the control is just repeating what we were able to do manually - when a piece of text is selected it becomes highlighted by the black stripe. We need to stop this from happening. Back to the helpfile(s) again. Have a search around, and come back after a snack break with some ideas... I'm hungry
Death to the black stripe
Marks: 5/10
Most of you would have found the HideSelection property of the RichEdit control. When it is set to TRUE and the RichEdit loses the focus (the user clicks onto another control) the selection bar (the black stripe) is hidden. In fact if you try it out by selecting some text in RichEdit1 and then clicking in the Edit1 control at the top of the "editor" you'll see the selection disappear! [Tab] back into the RichEdit control and it reappears. Let's do this programmatically:
begin
Edit1.SetFocus;
~~~~~
MyRe.SetFocus;
end;
Take my word for it, but if you look closely, the black stripe is gone. Pity we got stuck with a new one in the Edit1 control :-(
If you've programmed in Delphi you may know a little trick:
Delphi Rule #4: you can't SetFocus on a disabled Control.
The converse however is also true:
Delphi Rule #4b: a disabled Control is not "Focused"
So instead we can just Disable (then Enable) the RichEdit control like this:
begin
MyRe := TRichEdit(Sender);
MyRe.Enabled := False;
~~~~~~
MyRe.Enabled := True;
end;
Oops. I should have known. After all I said it: a disabled Control is not "Focused" - barely ten lines ago! When the RichEdit is enabled again, we also have to SetFocus back to it. Sheesh..
begin
MyRe := TRichEdit(Sender);
MyRe.Enabled := False;
~~~~~~
MyRe.Enabled := True;
MyRe.SetFocus;
end;
Try it again. This time things are working better, and we're leaving the poor old Edit1 Control alone. That's good practice, as it may have had an [OnFocus] event that does weirder things than what we're trying to do. Maybe not now, but it could in the future!
Marks: 10/10
On the other hand, some of you may have found instead the EM_HIDESELECTION message in the Win32.HLP. If you had delved in, you would have found something very interesting. The Delphi HideSelection property only implements half the capabilities of this message. By sending it directly, you can also tell the control to temporarily hide the black stripe even when the control has the focus. So instead you could use the following lines of code:
begin
MyRe := TRichEdit(Sender);
MyRe.Perform(EM_HIDESELECTION,1,0);
~~~~~~
MyRe.Perform(EM_HIDESELECTION,0,0);
end;
Yummy. Nice clean coding.
Death to the FLICKER
The next major problem is this bloody flicker. You should pop back into Delphi for a second, and type some lines in its editor, to see if it flickers at all. It does. But only when it is changing colors because it recognizes a change has occurred. Otherwise it doesn't bother. Now look at what’s happening in our "editor". Do you see?
The problem is that we are not "conserving" what we are doing. If something is still the same TokenType it doesn't need to be re-highlighted because it is already correct on the screen. We need to check if the TokenType of each token has changed since the last time we repainted this line, and only then repaint it.
In fact we don't need to do even that - we can just check whether SelAttributes (which represents the current selection's attributes) is any different from what we want to change it to, i.e. FParseFont[MyTokenState]. This way even if the TokenType had changed, but the new and old TokenType shared the same display attributes, we would still conserve our drawing.
Actually the problem is that the RichEdit isn't doing the conserving. In the old text based system I used to use, if you printed something to the screen, and it was the same as something already on the screen, in the same position, then the program would not rewrite it to the screen. It would "conserve" the amount of writing it did, as in the old days 1200 baud screens were SLOW, and printing the same characters was a waste of time.
Huh - and people said we have come so far with Windows. Sloppy, Sloppy, Sloppy I say!
So lets replace:
MyRe.SelAttributes.Assign(PasCon.FParseFont[MyTokenState]);
with:
If MyRe.SelAttributes.Name <> PasCon.FParseFont[MyTokenState].Name then
MyRe.SelAttributes.Name := PasCon.FParseFont[MyTokenState].Name;
If MyRe.SelAttributes.Color <> PasCon.FParseFont[MyTokenState].Color then
MyRe.SelAttributes.Color := PasCon.FParseFont[MyTokenState].Color;
if MyRe.SelAttributes.Style <> PasCon.FParseFont[MyTokenState].Style then
MyRe.SelAttributes.Style := PasCon.FParseFont[MyTokenState].Style;
And off you go and try it out... (PS. Yes the last bit of code is bad programming...)
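One way to tidy up the three separate tests is to fold them into a single yes/no decision and only then assign everything at once. The sketch below is a console illustration with made-up types: TMiniAttrs stands in for the Name, Color and Style properties being compared, and the Writes counter stands in for the repaint that a real assignment to SelAttributes would cause. It is not code from the article's project.

```pascal
program ConserveSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  TMiniStyle = (saBold, saItalic);
  TMiniStyles = set of TMiniStyle;

  // Hypothetical stand-in for the three properties compared in the text.
  TMiniAttrs = record
    Name: string;
    Color: Integer;
    Style: TMiniStyles;
  end;

var
  Writes: Integer = 0;   // counts how often we actually "repaint"

// True if any attribute differs. Boolean short-circuiting means the
// comparisons listed first are the only ones made when they already differ.
function NeedsFormat(const Current, Wanted: TMiniAttrs): Boolean;
begin
  Result := (Current.Color <> Wanted.Color)
    or (Current.Style <> Wanted.Style)
    or (Current.Name <> Wanted.Name);
end;

// Assigns all attributes in one go, but only when something changed.
procedure Apply(var Current: TMiniAttrs; const Wanted: TMiniAttrs);
begin
  if NeedsFormat(Current, Wanted) then
  begin
    Current := Wanted;
    Inc(Writes);
  end;
end;

var
  OnScreen, KeyWordFont: TMiniAttrs;
begin
  KeyWordFont.Name := 'Courier New';
  KeyWordFont.Color := 0;
  KeyWordFont.Style := [saBold];

  OnScreen.Name := 'Courier New';
  OnScreen.Color := 0;
  OnScreen.Style := [];

  Apply(OnScreen, KeyWordFont);   // differs in Style: one write
  Apply(OnScreen, KeyWordFont);   // already identical: nothing to do
  WriteLn(Writes);
end.
```

Putting the comparison most likely to differ first keeps the common case cheap, at the cost of one extra function call per token.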
SUCCESS - (Nearly...)
I think you'll agree we are pretty close. There is just a little bit of flicker. This flicker is the SelStart jumping the Cursor position around the text. We need to hide this. This "Cursor" is also known as a Caret. Looking through Win32.Hlp again we find the lovely, and appropriately named, HideCaret() function.
Let's try this then: every time we change the value of MyRe.SelStart let's call HideCaret(MyRe.Handle) immediately before.
I'll be kind - that doesn't work. I tried 2 x HideCaret(MyRe.Handle), and it still didn't work. Neither did three,four or 25x. So close - but yet - so far. I think its time for another Delphi Rule:
DELPHI RULE #5 - If you bother to get your way through the atrocious index of the Win32.HLP file to find what you are looking for - make sure you really read what you found properly!
The key was the last paragraph in the description of not HideCaret but ShowCaret (which I had also read as I thought we were going to need it, especially to reverse my 25x HideCaret()). You also need another Delphi Rule to understand it:
The caret is a shared resource; there is only one caret in the system. A window should show a caret only when the window has the keyboard focus or is active.
DELPHI RULE #6 - Everything (basically) is a Window
You see the RichEdit is a windows control and is also.. in a weird sense.. a window. It has a Handle, which is why HideCaret would accept it. So re-reading the last line again we get:
The caret is a shared resource; there is only one caret in the system. A [RichEdit] should show a caret only when the [RichEdit] has the keyboard focus or is active.
So - in the end - we're back to where we started - we have to disable the RichEdit to stop the final bit of flickering. This also (coincidentally) means that EM_HIDESELECTION is not needed anymore (if HideSelection is set properly at Design time). So in the end everyone gets 10/10 for marks!
ASH Version 0.9b
procedure TForm1.RichEdit1Change(Sender: TObject);
var
  SaveOnChangeIn: TNotifyEvent;
  WasSelStart, WasRow, Row, BeginSelStart, EndSelStart: Integer;
  MyRe: TRichEdit;
  MyPBuff: array[0..255] of Char;  // NB: assumes lines of at most 255 characters
  MyTokenStr: string;
  MyTokenState: TTokenState;
  MyRun: PChar;
  MySelStart: Integer;
begin
  MyRe := TRichEdit(Sender);
  // Detach the handler so our own formatting does not re-trigger it
  SaveOnChangeIn := MyRe.OnChange;
  MyRe.OnChange := nil;
  // Disabling the control hides both the selection stripe and the caret
  MyRe.Enabled := False;
  WasSelStart := MyRe.SelStart;
  WasRow := MyRe.Perform(EM_LINEFROMCHAR, MyRe.SelStart, 0);
  Row := WasRow;
  BeginSelStart := MyRe.Perform(EM_LINEINDEX, Row, 0);
  EndSelStart := BeginSelStart + Length(MyRe.Lines.Strings[Row]);
  StrPCopy(MyPBuff, MyRe.Lines.Strings[Row]);
  MyPBuff[Length(MyRe.Lines.Strings[Row])] := #0;
  MySelStart := BeginSelStart;
  MyRun := MyPBuff;
  while (MyRun^ <> #0) do
  begin
    MyRun := PasCon.GetToken(MyRun, MyTokenState, MyTokenStr);
    MyRe.SelStart := MySelStart;
    MyRe.SelLength := Length(MyTokenStr);
    // Only touch an attribute when it differs from what is on screen
    if MyRe.SelAttributes.Name <> PasCon.FParseFont[MyTokenState].Name then
      MyRe.SelAttributes.Name := PasCon.FParseFont[MyTokenState].Name;
    if MyRe.SelAttributes.Color <> PasCon.FParseFont[MyTokenState].Color then
      MyRe.SelAttributes.Color := PasCon.FParseFont[MyTokenState].Color;
    if MyRe.SelAttributes.Style <> PasCon.FParseFont[MyTokenState].Style then
      MyRe.SelAttributes.Style := PasCon.FParseFont[MyTokenState].Style;
    MySelStart := MySelStart + Length(MyTokenStr);
  end;
  // Put the caret back where the user left it
  MyRe.SelStart := WasSelStart;
  MyRe.SelLength := 0;
  MyRe.OnChange := SaveOnChangeIn;
  MyRe.Enabled := True;
  MyRe.SetFocus;
end;
Towards - ASH Version 1.0b
Couple of problems with the last version if you try it out for size:
It's slightly inefficient in that every time SelAttributes is changed it forces a repaint of the same token in the control. We should instead use some variable (e.g. var DoFormat: Boolean) to decide if we need to reformat, then check the value of DoFormat at the end of this checking, and do it all then with a simple SelAttributes.Assign(FParseFont[MyTokenState]). This means we can also change the separate "if" statements to a single if ... then ... else if ... then ... else chain, which should run faster - especially if you put the various tests in order of how likely they are to occur (e.g. the font changes less frequently than the color, so should be further down the if..else..if).
For some reason if you type a {style comment} on a line, after about 4-7 characters it reverts to different colours. I can't yet work out why this happens - but I understand why it's not being picked up. SelAttributes returns the value of the "initial" styling of the entire selected block. So if you select text which starts off black and then becomes blue, SelAttributes.Color will equal clBlack. We must also examine SelAttributes.ConsistentAttributes to ensure that the entire selection is consistent in the way it is highlighted. If it isn't - then we want to force it to be rehighlighted - it's obviously not in the correct format.
Multi-line comments are a big pain, e.g. { words <CR> word <CR> words }. I don't have them in my 4GL so I didn't need to fix this sort of problem. However I do have multi-line strings - so I need to be able to run strings across many lines. The trouble is that the code then has to work over a number of lines - have a look at what happens in Delphi when you place a "{" anywhere in the code. The highlighting can force a repaint of the entire 2,000,000 lines of text in the control. We could catch that situation - i.e. if the last token on the line is a tsComment and it doesn't end in '}' we could increase SelLength until it did or we reach the end of the RichEdit.Lines. (That's basically what the tokeniser does anyway with all that inc(Run).)
That's easy. But what happens if you then delete the "{"? You need to go forward 2,000,000 lines and put the highlighting back again? We could decide to keep going until the if...then..else..list didn't set DoFormat := True. But what happens if we're in a colour mode where Comment highlighting style = KeyWord highlighting style? We would stop prematurely. So this "logic" won't help in all situations.
You can still get the "someone is chasing you effect" - except now it's the "someone is fleeing from you" effect. It happens when you have (* This is a comment *) and delete the first *-character. The control takes an appreciable time to rehighlight the text.
While looking for a fix for the last problem, I remembered the RichEdit.Lines.BeginUpdate function. But that didn't help either. What we need is a RichEdit.BeginUpdate. What would that do? It would increase an internal counter by one every time it was called. RichEdit.EndUpdate would do the opposite. Then we would create our own WM_PAINT message handler. This message is received every time Windows wants the control to repaint a portion of itself. If we catch this message then we can stop processing these messages until the internal counter = 0 again. Then, and only then, will the Control repaint itself - ditching, we would hope, most of the intervening steps.
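The counter idea in the last item can be modelled in a few lines. This is only a console sketch of the bookkeeping, under the assumption that a paint request arriving while the counter is non-zero is remembered and replayed once: TFakeControl, Invalidate and Paints are hypothetical names, and no real WM_PAINT handling is involved.

```pascal
program UpdateCountSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical stand-in for a control; Paints counts actual repaints.
  TFakeControl = class
  private
    FUpdateCount: Integer;
    FDirty: Boolean;
  public
    Paints: Integer;
    procedure BeginUpdate;
    procedure EndUpdate;
    procedure Invalidate;   // stands in for a paint request
  end;

procedure TFakeControl.BeginUpdate;
begin
  Inc(FUpdateCount);
end;

procedure TFakeControl.EndUpdate;
begin
  if FUpdateCount > 0 then Dec(FUpdateCount);
  // Replay one repaint for everything that was held back
  if (FUpdateCount = 0) and FDirty then
  begin
    FDirty := False;
    Inc(Paints);
  end;
end;

procedure TFakeControl.Invalidate;
begin
  if FUpdateCount > 0 then
    FDirty := True          // remember the request, paint later
  else
    Inc(Paints);            // not locked: paint immediately
end;

var
  C: TFakeControl;
  I: Integer;
begin
  C := TFakeControl.Create;
  try
    C.BeginUpdate;
    for I := 1 to 10 do C.Invalidate;   // ten token changes
    C.EndUpdate;
    WriteLn(C.Paints);                  // repaints actually performed
  finally
    C.Free;
  end;
end.
```

Using a counter rather than a Boolean lets BeginUpdate / EndUpdate pairs nest, so a helper routine can lock the control without knowing whether its caller already did.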
Fixing the multi-line comments:
My current idea is to use RichEdit.Lines.Objects to store the TokenType of the first token on each line. This way we could easily know how far we need to go when re-highlighting multi-line comments. Initially this would be set to nil. I think this will work.
[Editor update: This didn't actually work - as RichEdit.Lines.Objects isn't implemented in the TRichEdit control. It is always nil regardless of what you assigned to it]
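One possible alternative, offered here only as an untested-against-RichEdit sketch and not something the article implements, is to keep the per-line state in a separate array alongside the lines and to stop rescanning as soon as the state carried into the next line matches what is already stored. Comparing states rather than display attributes avoids stopping too early when two token types share a colour. EndsInComment and Rescan are hypothetical helpers, and the scanner is deliberately simplified to { } comments only.

```pascal
program CommentStateSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Deliberately simplified: only { } comments are tracked; strings,
// // comments and (* *) comments are ignored in this sketch.
function EndsInComment(const Line: string; StartsInComment: Boolean): Boolean;
var
  I: Integer;
begin
  Result := StartsInComment;
  for I := 1 to Length(Line) do
    case Line[I] of
      '{': Result := True;
      '}': Result := False;
    end;
end;

// Re-evaluates lines from FirstLine onwards and stops as soon as the state
// carried into the next line matches what is already stored for it.
// Returns the number of lines that had to be looked at.
function Rescan(const Lines: array of string;
  var StartState: array of Boolean; FirstLine: Integer): Integer;
var
  I: Integer;
  Carry: Boolean;
begin
  Result := 0;
  I := FirstLine;
  while I <= High(Lines) do
  begin
    Inc(Result);
    Carry := EndsInComment(Lines[I], StartState[I]);
    if (I = High(Lines)) or (StartState[I + 1] = Carry) then Break;
    StartState[I + 1] := Carry;
    Inc(I);
  end;
end;

var
  Lines: array[0..2] of string;
  State: array[0..2] of Boolean;
begin
  Lines[0] := 'begin';
  Lines[1] := 'x := 1;';
  Lines[2] := 'end.';
  State[0] := False;
  State[1] := False;
  State[2] := False;

  Lines[0] := '{begin';                 // open a comment on the first line
  WriteLn(Rescan(Lines, State, 0));     // all following lines are affected
  Lines[1] := 'x := 1;}';               // close it on the second line
  WriteLn(Rescan(Lines, State, 1));
  Lines[2] := 'end;';                   // ordinary edit, no brace involved
  WriteLn(Rescan(Lines, State, 2));
end.
```

The array would have to be kept in step with line insertions and deletions in the control, which this sketch does not attempt.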
Upgrading to RichEdit98:
I'm also in the process of updating to the RichEdit98 components for Delphi 3.0-4.0. version 1.34 Author Alexander Obukhov, Minsk, Belarus. This control has a number of advances on the standard RichEdit control that ships with Delphi. Included in this are:
BeginUpdate,EndUpdate
Independent background colours
Margins
Hotspots
(Source code in full)
Anyway I hope you have enjoyed the adventure. I'm sorry if not all the examples compile as written. They may need some fixing to compile if you copy straight from the Browser into the Delphi Editor. Please send any comments to jonhd@hotmail.com.
Jon HogDog