Beginning ANTLR4 definition for the Clarion language

Greetings -
Below is a starter ANTLR4 (ANother Tool for Language Recognition, antlr.org) definition for the Clarion language, created by ChatGPT. You can try it with the ANTLR Test Lab.
The template language is probably better handled as a separate grammar.
I hope this helps the folks who want to create a cross-reference.
Regards,
Roberto Artigas


// ==============================
// Parser rules
// ==============================

// Program structure

program
    : programHeader? mapSection? declarationSection* codeSection EOF
    ;

programHeader
    : PROGRAM
    ;

mapSection
    : MAP mapEntry* END
    ;

mapEntry
    : moduleDecl
    | prototypeDecl
    ;

moduleDecl
    : MODULE LPAREN STRING RPAREN prototypeDecl* END
    ;

prototypeDecl
    : ID LPAREN paramList? RPAREN (COMMA returnType)? (COMMA PROC)? NEWLINE?
      // e.g. DoSomething(LONG),LONG,PROC
    ;

paramList
    : param (COMMA param)*
    ;

param
    : typeSpec
    ;

returnType
    : typeSpec
    ;

// ---------------------
// Types
// ---------------------

typeSpec
    : BYTE
    | SHORT
    | USHORT
    | LONG
    | ULONG
    | DECIMAL
    | REAL
    | STRINGKW       // Clarion STRING
    | DATE
    | TIME
    | BOOL
    | QUEUE          // for nested queues
    | GROUP          // for groups
    | FILE           // for files
    | ID             // user-defined / other types
    ;

// ---------------------
// Declarations
// ---------------------

declarationSection
    : dataDecl
    | queueDecl
    | windowDecl
    | procedureDecl
    ;

// ----- DATA declarations -----
//
//   Cust:Name        STRING(40),STATIC,PRIVATE
//   Count            LONG,DIM(10)
//   Alias            LIKE(Cust:Name)
//   FileVar          FILE,DRIVER('TOPSPEED'),PRE(FIL)

dataDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

// Either LIKE(SomeField) or a type with optional args
dataLikeOrType
    : LIKE LPAREN fieldRef RPAREN
    | typeSpec typeArgs?
    ;

// e.g. STRING(40), DECIMAL(7,2)
typeArgs
    : LPAREN argList? RPAREN
    ;

argList
    : expr (COMMA expr)*
    ;

// Attributes after the type:
//   PRE(FIL), USE(?SomeControl), DIM(10), OVER(SomeField),
//   STATIC, PRIVATE, THREAD,
//   generic: DRIVER('TPS'), BINDABLE, NOCASE, etc.
dataAttr
    : PRE LPAREN ID RPAREN                        // PRE(FIL)
    | USE LPAREN fieldRef RPAREN                  // USE(?SomeControl)
    | DIM LPAREN expr (COMMA expr)* RPAREN        // DIM(10) or DIM(10,20)
    | OVER LPAREN fieldRef RPAREN                 // OVER(SomeOtherField)
    | STATIC                                      // STATIC
    | PRIVATE                                     // PRIVATE
    | THREAD                                      // THREAD
    | ID (LPAREN argList? RPAREN)?                // generic attr (DRIVER('TPS'), BINDABLE, ...)
    ;

// ----- QUEUE declarations -----
//
//   Customer         QUEUE,PRE(CUS)
//   Cust:ID            LONG
//   Cust:Name          STRING(40)
//   Cust:Balance       DECIMAL(9,2),OVER(Fil:Balance)
//                    END

queueDecl
    : ID QUEUE (COMMA queueAttr)* NEWLINE
      queueFieldDecl+
      END
    ;

queueAttr
    : PRE LPAREN ID RPAREN
    | ID (LPAREN argList? RPAREN)?
    ;

// Queue fields reuse richer DATA syntax
queueFieldDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

// ----- WINDOW declarations (simplified) -----
//
//   MyWindow   WINDOW('Title'),AT(,,300,200),CENTER
//                LIST,AT(...),USE(?List)
//                BUTTON('Close'),AT(...),USE(?Close)
//              END

windowDecl
    : ID WINDOW windowArgs? NEWLINE
      windowControlDecl*
      END
    ;

windowArgs
    : LPAREN argList? RPAREN (COMMA windowAttr)*
    ;

windowAttr
    : ID LPAREN argList? RPAREN
    | ID
    ;

windowControlDecl
    : ID? controlType controlAttrs NEWLINE
    ;

controlType
    : BUTTON
    | LIST
    | ENTRY
    | STRINGKW       // STRING control
    | OTHERCONTROL
    ;

controlAttrs
    : (COMMA controlAttr)+
    ;

controlAttr
    : ID
    | ID LPAREN argList? RPAREN
    ;

// ----- Procedure declarations -----
//
//   MyProc       PROCEDURE()
//   CODE
//     ...
//   END

procedureDecl
    : ID procedureHeader? NEWLINE
      declarationSection*
      codeSection
    ;

procedureHeader
    : LPAREN paramList? RPAREN (COMMA returnType)? (COMMA PROC)?
    ;

// ---------------------
// CODE section & statements
// ---------------------

codeSection
    : CODE NEWLINE statement*
    ;

statement
    : assignStmt
    | procCallStmt
    | ifStmt
    | caseStmt
    | loopStmt
    | acceptStmt
    | breakStmt
    | returnStmt
    | openCloseStmt
    | NEWLINE           // blank line
    ;

assignStmt
    : fieldRef EQUAL expr NEWLINE
    ;

procCallStmt
    : fieldRef LPAREN argList? RPAREN NEWLINE
    ;

ifStmt
    : IF expr NEWLINE
      statement*
      (ELSE NEWLINE statement*)?
      END
    ;

caseStmt
    : CASE expr NEWLINE
      caseBranch*
      (ELSE NEWLINE statement*)?
      END
    ;

caseBranch
    : OF expr NEWLINE statement*
    ;

loopStmt
    : LOOP NEWLINE statement* END
    ;

acceptStmt
    : ACCEPT NEWLINE statement* END
    ;

breakStmt
    : BREAK NEWLINE
    ;

returnStmt
    : RETURN expr? NEWLINE
    ;

openCloseStmt
    : (OPEN | CLOSE) LPAREN fieldRef RPAREN NEWLINE
    ;

// ---------------------
// Expressions
// ---------------------

expr
    : expr binaryOp expr
    | unaryOp expr
    | literal
    | fieldRef
    | LPAREN expr RPAREN
    ;

binaryOp
    : PLUS
    | MINUS
    | STAR
    | DIV
    | EQUAL
    | NEQ
    | LT
    | LTE
    | GT
    | GTE
    | AND
    | OR
    ;

unaryOp
    : NOT
    | MINUS
    ;

fieldRef
    : ID (COLON ID)*        // Cust:Name, File:Field
    ;

literal
    : INTLIT
    | REALLIT
    | STRING
    | TRUE
    | FALSE
    ;

// ==============================
// Lexer rules
// ==============================

// Keywords

PROGRAM     : 'PROGRAM';
MAP         : 'MAP';
MODULE      : 'MODULE';
END         : 'END';
CODE        : 'CODE';

// Types
BYTE        : 'BYTE';
SHORT       : 'SHORT';
USHORT      : 'USHORT';
LONG        : 'LONG';
ULONG       : 'ULONG';
DECIMAL     : 'DECIMAL';
REAL        : 'REAL';
STRINGKW    : 'STRING';
DATE        : 'DATE';
TIME        : 'TIME';
BOOL        : 'BOOL';

// Complex types / structures
QUEUE       : 'QUEUE';
WINDOW      : 'WINDOW';
GROUP       : 'GROUP';
FILE        : 'FILE';

// Data attributes / modifiers
PRE         : 'PRE';
USE         : 'USE';
LIKE        : 'LIKE';
DIM         : 'DIM';
OVER        : 'OVER';
STATIC      : 'STATIC';
PRIVATE     : 'PRIVATE';
THREAD      : 'THREAD';

// Controls
BUTTON      : 'BUTTON';
LIST        : 'LIST';
ENTRY       : 'ENTRY';
OTHERCONTROL: 'PROMPT' | 'CHECK' | 'RADIO' | 'GROUP' | 'SHEET' | 'TAB';

// Procedures / logic
PROC        : 'PROC';
ACCEPT      : 'ACCEPT';
CASE        : 'CASE';
OF          : 'OF';
ELSE        : 'ELSE';
IF          : 'IF';
LOOP        : 'LOOP';
BREAK       : 'BREAK';
RETURN      : 'RETURN';
OPEN        : 'OPEN';
CLOSE       : 'CLOSE';
TRUE        : 'TRUE';
FALSE       : 'FALSE';
AND         : 'AND';
OR          : 'OR';
NOT         : 'NOT';

// Operators / punctuation

LTE         : '<=';
GTE         : '>=';
NEQ         : '<>';
LT          : '<';
GT          : '>';
EQUAL       : '=';

PLUS        : '+' ;
MINUS       : '-' ;
STAR        : '*' ;
DIV         : '/' ;

LPAREN      : '(' ;
RPAREN      : ')' ;
COMMA       : ',' ;
COLON       : ':' ;

// Literals

INTLIT      : [0-9]+ ;
REALLIT     : [0-9]+ '.' [0-9]+ ;
STRING      : '\'' (~['\\] | '\\' .)* '\'' ;

// Identifiers (no colon inside; colon used as separator in fieldRef)
ID
    : [A-Za-z_][A-Za-z0-9_]*
    ;

// Line handling

NEWLINE
    : [\r\n]+
    ;

// Comments & whitespace

LINE_COMMENT
    : '!' ~[\r\n]* -> skip
    ;

BLOCK_COMMENT
    : '!--' .*? '--!' -> skip
    ;

WS
    : [ \t\f]+ -> skip
    ;

Carl Edit: A few simple ANTLR rules to know. See this Meta Language Spec that has a nice table of symbols.

It borrows a lot of RegEx notation, and it is fairly standard: ? * + (Group) [Char Sets] ~[Not Set] and | for Or.

Tokens start with a capital letter, e.g. PROGRAM and STRINGKW are tokens. Their definitions appear lower down in the file (without them you get an Implicit error):
PROGRAM : 'PROGRAM';
STRINGKW : 'STRING';

Parser rule names always start with a lowercase letter, e.g. program and programHeader. Below is the program rule; it has a programHeader, which is the token PROGRAM. The ? after programHeader is the RegEx for 0 or 1 times, making it optional.

program
    : programHeader? mapSection? declarationSection* codeSection EOF
    ;
programHeader
    : PROGRAM
    ;
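
As a rough illustration of the naming rules above, here is a small Python sketch that lists token names a grammar uses but never defines, which is the situation that triggers the Implicit error. It is not part of ANTLR: the helper name `find_implicit_tokens` is made up for this example, it relies only on the convention that names starting with a capital letter are tokens, and it does not handle /* */ comments.

```python
import re
from typing import Set

# Comments, quoted literals and [character sets] are blanked out first so that
# letters inside them are not mistaken for rule names.
_NOISE = re.compile(r"//[^\n]*|'(?:\\.|[^'\\\n])*'|\[(?:\\.|[^\]\\\n])*\]")


def find_implicit_tokens(grammar: str) -> Set[str]:
    """Return token names that are referenced but have no lexer rule (hypothetical helper)."""
    text = _NOISE.sub(" ", grammar)
    # A definition is a name followed by ':' at the start of a line
    # (the ':' may also sit on the following line).
    defined = set(re.findall(r"^\s*([A-Za-z_]\w*)\s*:", text, flags=re.MULTILINE))
    # By convention every name that starts with a capital letter is a token.
    tokens = set(re.findall(r"\b[A-Z]\w*\b", text))
    return tokens - defined - {"EOF"}  # EOF is built in


sample = """
program       : programHeader? EOF ;
programHeader : PROGRAM STRINGKW ;
PROGRAM       : 'PROGRAM';
ID            : [A-Za-z_][A-Za-z0-9_]* ;
"""
print(find_implicit_tokens(sample))  # prints {'STRINGKW'}
```

Running something like this over a pasted grammar before trying it in the Lab gives a quick list of the lexer rules that still need to be written.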

Hi @RobertoArtigas

You may know that a while back I started down the path of creating an ANTLR definition for the VS Code language extension.

For it to work as one would expect from a language server extension is quite a mammoth task, and I gave up in the end.

You can see where I got to in this branch of the repo: antlr code... does not work · msarson/Clarion-Extension@bacae55 · GitHub. The files are under the server/antlr folder.

I hope this helps, and if you get a full Clarion language version, feel free to pass on the work. I could see about adding it back to the repo and using it :slight_smile:

Mark

Greetings - Thank you for your response. The magnitude of the task is rather large.
These were the results of my playing with ChatGPT today, and I passed them on.
Maybe there are other brave souls out there that want to play with the definition.
Will get back to this eventually, but not today. Got other things in the queue.
Regards,
Roberto Artigas

Time is the limiting factor for most people, along with having the right tools to work with. Text editors are currently pretty much free-form, so we have to remember the rules of any language ourselves. It will take time, because the rules are not even documented in a TL;DR form for quick assimilation.

Updated grammar

Mod edit to split into separate Parser and Lexer blocks for pasting into ANTLR Lab

// ==============================
// Parser rules
// ==============================

// Program structure

program
    : programHeader? mapSection? declarationSection* codeSection EOF
    ;

programHeader
    : PROGRAM
    ;

mapSection
    : MAP mapEntry* (END | DOT)
    ;

mapEntry
    : moduleDecl
    | prototypeDecl
    ;

moduleDecl
    : MODULE LPAREN STRING RPAREN prototypeDecl* (END | DOT)
    ;

// MAP prototype, with optional parameters in angle brackets
prototypeDecl
    : ID LPAREN prototypeParamList? RPAREN (COMMA returnType)? (COMMA PROC)? NEWLINE?
    ;

prototypeParamList
    : prototypeParam (COMMA prototypeParam)*
    ;

// Parameters in MAP: by value, by address (*type), or optional <...>
prototypeParam
    : STAR typeSpec               // by-address parameter: *LONG
    | typeSpec                    // by-value parameter: LONG
    | LT STAR typeSpec GT         // optional by-address: <*STRING>
    | LT typeSpec GT              // optional by-value: <STRING>
    ;

// ---------------------
// Types
// ---------------------

typeSpec
    : BYTE
    | SHORT
    | USHORT
    | LONG
    | ULONG
    | DECIMAL
    | REAL
    | STRINGKW
    | DATE
    | TIME
    | BOOL
    | QUEUE
    | GROUP
    | FILE
    | ID
    ;

// ---------------------
// Declarations
// ---------------------

declarationSection
    : dataDecl
    | queueDecl
    | windowDecl
    | procedureDecl
    ;

// ----- DATA declarations -----

dataDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

dataLikeOrType
    : LIKE LPAREN fieldRef RPAREN
    | typeSpec typeArgs?
    ;

typeArgs
    : LPAREN argList? RPAREN
    ;

argList
    : expr (COMMA expr)*
    ;

dataAttr
    : PRE LPAREN ID RPAREN
    | USE LPAREN fieldRef RPAREN
    | DIM LPAREN expr (COMMA expr)* RPAREN
    | OVER LPAREN fieldRef RPAREN
    | STATIC
    | PRIVATE
    | THREAD
    | ID (LPAREN argList? RPAREN)?
    ;

// ----- QUEUE declarations -----

queueDecl
    : ID QUEUE (COMMA queueAttr)* NEWLINE
      queueFieldDecl+
      (END | DOT)
    ;

queueAttr
    : PRE LPAREN ID RPAREN
    | ID (LPAREN argList? RPAREN)?
    ;

queueFieldDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

// ----- WINDOW declarations -----

windowDecl
    : ID WINDOW windowArgs? NEWLINE
      windowControlDecl*
      (END | DOT)
    ;

windowArgs
    : LPAREN argList? RPAREN (COMMA windowAttr)*
    ;

windowAttr
    : ID LPAREN argList? RPAREN
    | ID
    ;

windowControlDecl
    : ID? controlType controlAttrs NEWLINE
    ;

controlType
    : BUTTON
    | LIST
    | ENTRY
    | STRINGKW
    | GROUP
    | OTHERCONTROL
    ;

controlAttrs
    : (COMMA controlAttr)+
    ;

controlAttr
    : ID
    | ID LPAREN argList? RPAREN
    ;

// ----- Procedure declarations -----

procedureDecl
    : ID procedureHeader? NEWLINE
      declarationSection*
      codeSection
    ;

procedureHeader
    : LPAREN paramList? RPAREN (COMMA returnType)? (COMMA PROC)?
    ;

paramList
    : param (COMMA param)*
    ;

param
    : typeSpec
    ;

// Return type shared by prototypes and procedures
returnType
    : typeSpec
    ;

// ---------------------
// CODE section & statements
// ---------------------

codeSection
    : CODE NEWLINE statement*
    ;

// QUESTION may prefix any core statement (DEBUG-mode marker),
// NEWLINE alone is a blank/empty statement.
statement
    : QUESTION? coreStatement
    | NEWLINE
    ;

coreStatement
    : assignStmt
    | procCallStmt
    | ifStmt
    | caseStmt
    | loopStmt
    | acceptStmt
    | breakStmt
    | returnStmt
    | openCloseStmt
    ;

// "=", "&=", ":=:" assignments
assignStmt
    : fieldRef (EQUAL | AMP_EQUAL | DEEP_ASSIGN) expr NEWLINE
    ;

procCallStmt
    : fieldRef LPAREN argList? RPAREN NEWLINE
    ;

ifStmt
    : IF expr NEWLINE
      statement*
      (ELSE NEWLINE statement*)?
      (END | DOT)
    ;

caseStmt
    : CASE expr NEWLINE
      caseBranch*
      (ELSE NEWLINE statement*)?
      (END | DOT)
    ;

caseBranch
    : OF expr NEWLINE statement*
    ;

loopStmt
    : LOOP NEWLINE statement* (END | DOT)
    ;

acceptStmt
    : ACCEPT NEWLINE statement* (END | DOT)
    ;

breakStmt
    : BREAK NEWLINE
    ;

returnStmt
    : RETURN expr? NEWLINE
    ;

openCloseStmt
    : (OPEN | CLOSE) LPAREN fieldRef RPAREN NEWLINE
    ;

// ---------------------
// EXPRESSIONS (precedence-based)
// ---------------------
//
// Precedence (low → high):
//   OR
//   AND
//   =, <>, &=          (equality / reference equality)
//   <, <=, >, >=
//   +, -, &            (additive + concatenation)
//   *, /, %
//   ^                  (exponentiation, right-associative)
//   unary NOT, ~, unary -
//   primary (with [] slices and {} property/repeat postfixes)

expr
    : orExpr
    ;

orExpr
    : andExpr (OR andExpr)*
    ;

andExpr
    : equalityExpr (AND equalityExpr)*
    ;

equalityExpr
    : relationalExpr ((EQUAL | NEQ | AMP_EQUAL) relationalExpr)*
    ;

relationalExpr
    : additiveExpr ((LT | LTE | GT | GTE) additiveExpr)*
    ;

additiveExpr
    : multiplicativeExpr ((PLUS | MINUS | AMP) multiplicativeExpr)*
    ;

multiplicativeExpr
    : powExpr ((STAR | DIV | PERCENT) powExpr)*
    ;

// Exponentiation: right-associative
powExpr
    : unaryExpr (CARET powExpr)?
    ;

unaryExpr
    : (NOT | TILDE | MINUS) unaryExpr
    | primary
    ;

// Primary with optional slice and property/repeat postfixes
primary
    : primaryBase
      ( LBRACE propertyParamList RBRACE      // Field{PROP:Text,@N3} or '='{10}
      | LBRACE expr RBRACE                   // simple repeat count {n}
      | LBRACKET sliceRange RBRACKET         // S[3:7], S[:5], etc.
      )*
    ;

primaryBase
    : literal
    | fieldRef
    | LPAREN expr RPAREN
    ;

// Slice range: [expr? : expr?]
sliceRange
    : (expr)? SLICE_COLON (expr)?
    ;

propertyParamList
    : expr (COMMA expr)*
    ;

literal
    : INTLIT
    | REALLIT
    | STRING
    | TRUE
    | FALSE
    | PICTURE
    ;

// ---------------------
// Field references
// ---------------------

fieldRef
    : ID (DOT ID)*      // Struct.Member, Win$Form:Field, etc.
    ;

// ==============================
// Lexer rules
// ==============================

// Keywords

PROGRAM     : 'PROGRAM';
MAP         : 'MAP';
MODULE      : 'MODULE';
END         : 'END';
CODE        : 'CODE';

// Types
BYTE        : 'BYTE';
SHORT       : 'SHORT';
USHORT      : 'USHORT';
LONG        : 'LONG';
ULONG       : 'ULONG';
DECIMAL     : 'DECIMAL';
REAL        : 'REAL';
STRINGKW    : 'STRING';
DATE        : 'DATE';
TIME        : 'TIME';
BOOL        : 'BOOL';

// Complex types / structures
QUEUE       : 'QUEUE';
WINDOW      : 'WINDOW';
GROUP       : 'GROUP';
FILE        : 'FILE';

// Data attributes / modifiers
PRE         : 'PRE';
USE         : 'USE';
LIKE        : 'LIKE';
DIM         : 'DIM';
OVER        : 'OVER';
STATIC      : 'STATIC';
PRIVATE     : 'PRIVATE';
THREAD      : 'THREAD';

// Controls
BUTTON      : 'BUTTON';
LIST        : 'LIST';
ENTRY       : 'ENTRY';
OTHERCONTROL
    : 'PROMPT'
    | 'CHECK'
    | 'RADIO'
    | 'SHEET'
    | 'TAB'
    ;

// Procedures / logic
PROC        : 'PROC';
ACCEPT      : 'ACCEPT';
CASE        : 'CASE';
OF          : 'OF';
ELSE        : 'ELSE';
IF          : 'IF';
LOOP        : 'LOOP';
BREAK       : 'BREAK';
RETURN      : 'RETURN';
OPEN        : 'OPEN';
CLOSE       : 'CLOSE';
TRUE        : 'TRUE';
FALSE       : 'FALSE';
AND         : 'AND';
OR          : 'OR';
NOT         : 'NOT';

// Operators / punctuation
// ANTLR's lexer always takes the longest match, so '<=' wins over '<'
// regardless of rule order; order only breaks ties between equal-length matches.

LTE         : '<=';
GTE         : '>=';
NEQ         : '<>';
AMP_EQUAL   : '&=';
DEEP_ASSIGN : ':=:';

SLICE_COLON : ':';      // for [start:stop] slices

LT          : '<';
GT          : '>';
EQUAL       : '=';

CARET       : '^';
PLUS        : '+';
MINUS       : '-';
STAR        : '*';
DIV         : '/';
PERCENT     : '%';
TILDE       : '~';
AMP         : '&';
DOT         : '.';

LPAREN      : '(' ;
RPAREN      : ')' ;
LBRACE      : '{' ;
RBRACE      : '}' ;
LBRACKET    : '[' ;
RBRACKET    : ']' ;
COMMA       : ',' ;

QUESTION    : '?' ;

// Literals

INTLIT      : [0-9]+ ;
REALLIT     : [0-9]+ '.' [0-9]+ ;
STRING      : '\'' (~['\\] | '\\' .)* '\'' ;

// Picture tokens: start with '@' and run until whitespace or punctuation
PICTURE
    : '@' ~[ \t\r\n,;(){}\[\]]+
    ;

// Identifiers with colon and dollar connectors. Clarion implicit variables
// carry a trailing sigil: Name" (STRING), Total$ (REAL), Count# (LONG).
ID
    : [A-Za-z_][A-Za-z0-9_:$]* ["#]?    // a trailing $ is already covered by the character set
    ;

// ==============================
// Line continuation + newline
// ==============================

// "|" at end of line joins with next line
LINE_CONTINUATION
    : '|' [ \t]* [\r\n]+ -> skip
    ;

// NEWLINE also includes semicolon
NEWLINE
    : [\r\n]+
    | ';'
    ;

// ==============================
// Comments & whitespace
// ==============================

LINE_COMMENT
    : '!' ~[\r\n]* -> skip
    ;

BLOCK_COMMENT
    : '/*' .*? '*/' -> skip
    ;

WS
    : [ \t\f]+ -> skip
    ;


Are you looking for some input?

CASE allows OROF, e.g. OF expr OROF expr ...
and allows OF with TO, e.g. OF expr TO expr
and the same with OROF and TO, e.g. OROF expr TO expr

Those can optionally be preceded by a NEWLINE.

Test code from help plus I added a few:

CASE Name[1]     !Get first letter of name
OF 'A' TO 'M'    !1st half of alphabet
OROF 'a' TO 'm'
   DO FirstHalf
OF 'N' TO 'Z' OROF 'n' TO 'z'  !2nd half
   DO SecondHalf
OF CHR(39) OROF '-' OROF '.' OROF ',' 
   DO SpecialChar
OF '"' ; DO QuotedName  !it's "name"
ELSE
    DO Name1Else
END 

CASE MaritalStatus   !Simple CASE without OROF and TO
OF 'S'  ;  DO SinglePerson
OF 'M'  ;  DO MarriedPeople
OF 'H'  ;  DO HeadOfHouse
ELSE    ;  DO MarriedUnknown
END  
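
To make the shapes in the test code above concrete, here is a small Python sketch that breaks a single OF / OROF line into its value ranges. It reads the syntax as roughly "OF range (OROF range)*", where a range is "expr (TO expr)?". That reading is my interpretation of the examples, not an official grammar. The function name `parse_of_clause` is made up for this example, keywords are matched case-insensitively, and anything after a `;` or `!` is ignored.

```python
import re
from typing import List, Optional, Tuple

# A token is a quoted string ('' escapes a quote), a lone ';' or '!',
# or a run of any other non-space characters.
_TOKEN = re.compile(r"'(?:[^']|'')*'|[;!]|[^\s';!]+")


def parse_of_clause(line: str) -> List[Tuple[str, Optional[str]]]:
    """Split "OF a TO b OROF c" into [(a, b), (c, None)] (hypothetical helper)."""
    tokens = _TOKEN.findall(line)
    if not tokens or tokens[0].upper() not in ("OF", "OROF"):
        raise ValueError("clause must start with OF or OROF")
    groups: List[List[str]] = [[]]
    for tok in tokens[1:]:
        if tok in (";", "!"):          # statement separator or comment: stop here
            break
        if tok.upper() == "OROF":      # each OROF starts a new alternative
            groups.append([])
        else:
            groups[-1].append(tok)
    ranges = []
    for group in groups:
        upper = [t.upper() for t in group]
        if "TO" in upper:              # "low TO high" range
            split = upper.index("TO")
            low, high = " ".join(group[:split]), " ".join(group[split + 1:])
        else:                          # single value
            low, high = " ".join(group), None
        if not low or high == "":
            raise ValueError("missing expression around OF / OROF / TO")
        ranges.append((low, high))
    return ranges


print(parse_of_clause("OF 'N' TO 'Z' OROF 'n' TO 'z'  !2nd half"))
```

A parser rule for caseBranch would need the same two levels: a list of alternatives separated by OROF, each of which is a value or a TO range.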

Are you trying this in an editor?

The Tools page lists a plugin for VS Code.

I tried it on my phone in ANTLR Lab. I don’t think it worked; the output is not like the sample, which shows a parse tree. I need to try it on my PC. Note that in the upper-right corner of your spec here there is a Copy button that grabs it all to paste into Lab.

The CASE statement can also be written like this:

CASE name[1]
   OF 'A' to 'Z'
         DO FirstHalfUpper
   OROF 'a' to 'z'
         DO FirstHalfLower
END

It looks like in the Test Lab your file has to be split into 2 parts, the Lexer and the Parser, to paste into 2 separate inputs. I’ll do that on my PC.
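
For anyone who wants to automate that split, here is a minimal Python sketch. It assumes the combined text contains a "Lexer rules" banner comment, as the grammars posted in this thread do, and that the Lab wants the standard ANTLR4 `parser grammar` / `lexer grammar` headers. The grammar name `Clarion` and the function name are placeholders chosen for this example.

```python
from typing import Tuple


def split_combined_grammar(text: str, name: str = "Clarion") -> Tuple[str, str]:
    """Split a combined grammar at its 'Lexer rules' banner (hypothetical helper)."""
    lines = text.splitlines()
    # Locate the banner comment that introduces the lexer section.
    banner = next((i for i, line in enumerate(lines) if "Lexer rules" in line), None)
    if banner is None:
        raise ValueError("no 'Lexer rules' banner found")
    # Keep a preceding '// ====' ruler line with the lexer half.
    if banner > 0 and lines[banner - 1].lstrip().startswith("// ="):
        banner -= 1
    parser_header = f"parser grammar {name}Parser;\noptions {{ tokenVocab={name}Lexer; }}\n"
    lexer_header = f"lexer grammar {name}Lexer;\n"
    parser = parser_header + "\n".join(lines[:banner]).rstrip() + "\n"
    lexer = lexer_header + "\n".join(lines[banner:]).rstrip() + "\n"
    return parser, lexer


combined = (
    "program : PROGRAM EOF ;\n"
    "// ====\n"
    "// Lexer rules\n"
    "// ====\n"
    "PROGRAM : 'PROGRAM';\n"
)
parser_text, lexer_text = split_combined_grammar(combined)
print(parser_text)
print(lexer_text)
```

The two returned strings can then be pasted into the Parser and Lexer inputs respectively.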

Greetings -
Thank you for the feedback here. Please keep going with what you find missing.
I am getting a lot of PM feedback from a few individuals about missing items and incorrect syntax. So I will allocate some time during the weekends and keep posting the syntax here for you folks to take apart and provide feedback.
Please be patient. I will keep adding as I get time for some new ChatGPT sessions.
This is a long term process, since there is a lot of Clarion code in existence. Some of it ancient.
Consider that there will be a need to write a program to read and validate source files from a directory against the parser.
Regards
Roberto Artigas
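
A validation program of the kind mentioned above could start out as something like the following Python sketch. The `check` argument is a stand-in for whatever eventually calls the generated parser and raises on a syntax error, so nothing here depends on a real ANTLR API. The file suffixes and the latin-1 encoding are assumptions about typical Clarion source trees.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def validate_directory(
    root: str,
    check: Callable[[str], None],
    suffixes: Iterable[str] = (".clw", ".inc"),
) -> Dict[str, str]:
    """Run check(source_text) on each source file under root; return {path: error}."""
    wanted = {s.lower() for s in suffixes}
    failures: Dict[str, str] = {}
    for path in sorted(Path(root).rglob("*")):
        # Compare suffixes case-insensitively: old sources are often named FOO.CLW.
        if not path.is_file() or path.suffix.lower() not in wanted:
            continue
        try:
            # latin-1 decodes any byte sequence; the real code page is an assumption.
            check(path.read_text(encoding="latin-1"))
        except Exception as exc:  # the checker signals a problem by raising
            failures[str(path)] = str(exc)
    return failures
```

Collecting the failures in a dictionary rather than stopping at the first one makes it easy to see which constructs the grammar is still missing across a whole directory of old code.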

For those folks who are trying this in the ANTLR Lab or Test Lab, I might need to talk to you about setting that up on my PC. Currently, I am working on getting the syntax as correct as I can.
I am sure that getting this to provide actual output on ANTLR might require some additional steps.
Please feel free to contribute. Thank you.

Greetings - Final posting for today.


// ==============================
// Parser rules
// ==============================

// Program structure

program
    : programHeader? mapSection? declarationSection* codeSection EOF
    ;

programHeader
    : PROGRAM
    ;

// ----- MAP section -----
//
// MAP
//   prototypes
//   [ MODULE( [expr] )
//       prototypes
//     END ]
// END | .
mapSection
    : MAP mapEntry* (END | DOT)
    ;

mapEntry
    : moduleDecl
    | prototypeDecl
    ;

// MODULE inside MAP
moduleDecl
    : MODULE LPAREN expr? RPAREN NEWLINE?
      prototypeDecl*
      (END | DOT)
    ;

// ----- MAP prototypes -----

// MAP prototype, with optional parameters in angle brackets
prototypeDecl
    : ID LPAREN prototypeParamList? RPAREN (COMMA returnType)? (COMMA PROC)? NEWLINE?
    ;

prototypeParamList
    : prototypeParam (COMMA prototypeParam)*
    ;

// Parameters in MAP: by value, by address (*type), or optional <...>
prototypeParam
    : STAR typeSpec               // by-address parameter: *LONG
    | typeSpec                    // by-value parameter: LONG
    | LT STAR typeSpec GT         // optional by-address: <*STRING>
    | LT typeSpec GT              // optional by-value: <STRING>
    ;

// ---------------------
// Types
// ---------------------

// Note: GROUP / FILE / REPORT are block declarations, not simple scalar types

typeSpec
    : BYTE
    | SHORT
    | USHORT
    | LONG
    | ULONG
    | DECIMAL
    | REAL
    | SREAL
    | STRINGKW
    | BSTRING
    | PSTRING
    | BLOB
    | MEMO
    | DATE
    | TIME
    | BOOL
    | QUEUE
    | WINDOW
    | ANY
    | ID
    ;

// Builtin types only (no ID) – for labelDecl "type" lines
builtinType
    : BYTE
    | SHORT
    | USHORT
    | LONG
    | ULONG
    | DECIMAL
    | REAL
    | SREAL
    | STRINGKW
    | BSTRING
    | PSTRING
    | BLOB
    | MEMO
    | DATE
    | TIME
    | BOOL
    | QUEUE
    | WINDOW
    | ANY
    ;

// ---------------------
// Procedure parameter lists for PROCEDURE prototypes/impls
// ---------------------

procedureProtoParamList
    : procedureProtoParam (COMMA procedureProtoParam)*
    ;

procedureProtoParam
    : typeSpec (ID)?       // BYTE xVersion  or just BYTE
    ;

// ---------------------
// Declarations
// ---------------------

declarationSection
    : dataDecl
    | queueDecl
    | windowDecl
    | reportDecl
    | procedureDecl
    | labelDecl
    | enumDecl
    | interfaceDecl
    | classDecl
    | procedureProtoDecl
    | itemizeDecl
    | recordDecl
    | fileDecl
    | groupDecl
    | moduleDecl2
    ;

// ----- DATA declarations -----

dataDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

dataLikeOrType
    : LIKE LPAREN fieldRef RPAREN
    | typeSpec typeArgs?
    ;

typeArgs
    : LPAREN argList? RPAREN
    ;

argList
    : expr (COMMA expr)*
    ;

dataAttr
    : PRE LPAREN ID RPAREN
    | USE LPAREN fieldRef RPAREN
    | DIM LPAREN expr (COMMA expr)* RPAREN
    | OVER LPAREN fieldRef RPAREN
    | STATIC
    | PRIVATE
    | THREAD
    | AUTO
    | BINARYKW
    | ID (LPAREN argList? RPAREN)?
    ;

// ----- QUEUE declarations -----
//
// Label QUEUE( [ group ] )
//   [,PRE] [,STATIC] [,THREAD] [,TYPE] [,BINDABLE] [,EXTERNAL] [,DLL]
//   fieldLabel variable [,NAME( )]
// END | .
queueDecl
    : ID QUEUE queueTypeArgs? (COMMA queueAttr)* NEWLINE
      queueFieldDecl+
      (END | DOT)
    ;

queueTypeArgs
    : LPAREN fieldRef? RPAREN
    ;

queueAttr
    : PRE (LPAREN ID RPAREN)?              // PRE or PRE(ID)
    | STATIC
    | THREAD
    | TYPEKW
    | BINDABLE
    | EXTERNAL
    | DLL (LPAREN argList? RPAREN)?
    | ID (LPAREN argList? RPAREN)?         // other attrs, e.g. NAME(), etc.
    ;

queueFieldDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

// ----- WINDOW declarations -----
//
// label WINDOW('title')  [,AT( )] [,CENTER] [,SYSTEM] [,MAX] [,ICON( )] [,STATUS( )] [,HLP( )]
//                        [,CURSOR( )] [,MDI] [,MODAL] [,MASK] [,FONT( )] [,GRAY][,TIMER( )]
//                        [,ALRT( )] [,ICONIZE] [,MAXIMIZE] [,MSG( )] [,PALETTE( )] [,DROPID( )] [,IMM]
//                        [,AUTO] [,COLOR( )] [,TOOLBOX] [,DOCK( )] [,DOCKED( )] [,LAYOUT( )]
//                        [,TILED] [,HSCROLL] [,DOUBLE] [,CENTERED] [,VSCROLL] [,NOFRAME]
//                        [,HVSCROLL] [,RESIZE]
//   [MENUBAR ... END]
//   [TOOLBAR ... END]
//   Controls
// END | .
windowDecl
    : ID WINDOW windowArgs? NEWLINE
      windowMenuBarSection?
      windowToolBarSection?
      windowControlDecl*
      (END | DOT)
    ;

windowArgs
    : LPAREN argList? RPAREN (COMMA windowAttr)*
    ;

windowAttr
    : ATKW      LPAREN argList? RPAREN
    | FONTKW    LPAREN argList? RPAREN
    | ICONKW    LPAREN argList? RPAREN
    | STATUSKW  LPAREN argList? RPAREN
    | HLPKW     LPAREN argList? RPAREN
    | CURSORKW  LPAREN argList? RPAREN
    | TIMERKW   LPAREN argList? RPAREN
    | ALRTKW    LPAREN argList? RPAREN
    | MSGKW     LPAREN argList? RPAREN
    | PALETTEKW LPAREN argList? RPAREN
    | DROPIDKW  LPAREN argList? RPAREN
    | DOCKKW    LPAREN argList? RPAREN
    | DOCKEDKW  LPAREN argList? RPAREN
    | LAYOUTKW  LPAREN argList? RPAREN
    | COLORKW   LPAREN argList? RPAREN
    | CENTER
    | CENTERED
    | SYSTEMKW
    | MAXKW
    | MDI
    | MODAL
    | MASK
    | GRAY
    | ICONIZE
    | MAXIMIZE
    | IMMKW
    | AUTO
    | TOOLBOX
    | TILED
    | HSCROLL
    | VSCROLL
    | NOFRAME
    | HVSCROLL
    | RESIZEKW
    | DOUBLE
    | ID (LPAREN argList? RPAREN)?
    ;

// MENUBAR ... END
windowMenuBarSection
    : MENUBAR NEWLINE
      windowMenuDecl*
      (END | DOT)
    ;

// Loose "menus and/or items" line
windowMenuDecl
    : ID (LPAREN argList? RPAREN)?
      (COMMA ID (LPAREN argList? RPAREN)?)* NEWLINE
    ;

// TOOLBAR ... END
windowToolBarSection
    : TOOLBAR NEWLINE
      windowControlDecl*
      (END | DOT)
    ;

windowControlDecl
    : ID? controlType controlAttrs NEWLINE
    ;

controlType
    : BUTTON
    | LIST
    | ENTRY
    | STRINGKW
    | GROUP
    | BOX
    | LINE
    | OLE
    | OTHERCONTROL
    ;

controlAttrs
    : (COMMA controlAttr)+
    ;

controlAttr
    : ID
    | ID LPAREN argList? RPAREN
    ;

// ----- REPORT declarations -----
//
// label REPORT([jobname]),
//       AT() [,FONT()] [,PRE()] [,LANDSCAPE] [,PREVIEW] [,PAPER]
//       [,COLOR()]
//       [THOUS | MM | POINTS]
//   [FORM   controls END]
//   [HEADER controls END]
//   label DETAIL controls END
//   [label BREAK(...) controls END]
//   [FOOTER controls END]
// END | .
reportDecl
    : ID REPORTKW LPAREN argList? RPAREN
      COMMA ATKW LPAREN argList? RPAREN
      (COMMA reportAttr)* NEWLINE
      reportSection*
      (END | DOT)
    ;

reportAttr
    : FONTKW   LPAREN argList? RPAREN
    | PRE      LPAREN ID RPAREN
    | LANDSCAPEKW
    | PREVIEWKW
    | PAPERKW
    | COLORKW  LPAREN argList? RPAREN
    | THOUS
    | MMKW
    | POINTSKW
    | ID (LPAREN argList? RPAREN)?
    ;

reportSection
    : reportFormSection
    | reportHeaderSection
    | reportDetailSection
    | reportBreakSection
    | reportFooterSection
    ;

// FORM ... END
reportFormSection
    : FORMKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// HEADER ... END
reportHeaderSection
    : HEADERKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// label DETAIL ... END
reportDetailSection
    : ID DETAILKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// label BREAK(...) ... END
reportBreakSection
    : ID BREAK LPAREN argList? RPAREN NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// FOOTER ... END
reportFooterSection
    : FOOTERKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// reuse same shape as window controls
reportControlDecl
    : ID? controlType controlAttrs NEWLINE
    ;

// ----- Procedure prototypes (no CODE) -----
//
// Name PROCEDURE [ ( param list ) ] [ ,ReturnType ]
procedureProtoDecl
    : ID PROCEDUREKW (LPAREN procedureProtoParamList? RPAREN)?
      (COMMA returnType)?
      NEWLINE
    ;

// ----- Procedure implementations (with CODE) -----
//
// Name PROCEDURE [ ( param list ) ] [ ,ReturnType ]
//   local data
// CODE
//   statements
procedureDecl
    : ID PROCEDUREKW (LPAREN procedureProtoParamList? RPAREN)?
      (COMMA returnType)? NEWLINE
      declarationSection*          // "local data"
      codeSection                  // CODE ... statements
    ;

// Return type shared by prototypes and procedures
returnType
    : typeSpec
    ;

// ----- Standalone MODULE declarations -----
//
// MODULE(sourcefile)
//   prototypeDecl*
// END | .
moduleDecl2
    : MODULE LPAREN expr RPAREN NEWLINE
      prototypeDecl*
      (END | DOT)
    ;

// ----- Label / picture / type / EQUATE lines -----

// label  EQUATE( [ constant ] )
// picture
// type   (builtin types only)
labelDecl
    : ID EQUATE LPAREN expr? RPAREN NEWLINE
    | PICTURE NEWLINE
    | builtinType NEWLINE
    ;

// ----- ENUM declarations -----
//
// Label ENUM
//   Item1
//   ...
//   ItemN
// END | .
enumDecl
    : ID ENUM_KW NEWLINE
      enumItem+
      (END | DOT)
    ;

enumItem
    : ID NEWLINE
    ;

// ----- ITEMIZE declarations -----
//
// [Label] ITEMIZE( [ seed ] ) [,PRE( expr )]
//   ID EQUATE( [ expr ] ) ...
// END | .
itemizeDecl
    : ID? ITEMIZEKW LPAREN expr? RPAREN
      (COMMA PRE LPAREN expr? RPAREN)?
      NEWLINE
      itemizeEquate+
      (END | DOT)
    ;

itemizeEquate
    : ID EQUATE LPAREN expr? RPAREN NEWLINE
    ;

// ----- RECORD declarations -----
//
// Label RECORD [,PRE( )] [,NAME( )]
//   fields
// END | .
recordDecl
    : ID RECORD (COMMA recordAttr)* NEWLINE
      recordFieldDecl+
      (END | DOT)
    ;

recordAttr
    : PRE LPAREN ID RPAREN
    | ID (LPAREN argList? RPAREN)?
    ;

recordFieldDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

// ----- GROUP declarations -----
//
// Label GROUP( [ group ] )
//   [,PRE( )] [,DIM( )] [,OVER( )] [,NAME( )] [,EXTERNAL] [,DLL] [,STATIC]
//   [,THREAD] [,BINDABLE] [,TYPE] [,PRIVATE] [,PROTECTED]
//   declarations
// END | .
groupDecl
    : ID GROUP LPAREN fieldRef? RPAREN (COMMA groupAttr)* NEWLINE
      declarationSection*
      (END | DOT)
    ;

groupAttr
    : PRE LPAREN ID RPAREN
    | DIM LPAREN expr (COMMA expr)* RPAREN
    | OVER LPAREN fieldRef RPAREN
    | ID (LPAREN argList? RPAREN)?      // NAME(), EXTERNAL, DLL, STATIC, THREAD, BINDABLE, TYPE, PRIVATE, PROTECTED
    ;

// ----- FILE declarations -----
//
// Label FILE,DRIVER( ) [,CREATE] ...
//   entries (RECORD, KEY, INDEX, MEMO, BLOB)
// END | .
fileDecl
    : ID FILE (COMMA fileAttr)* NEWLINE
      fileEntry*
      (END | DOT)
    ;

fileAttr
    : PRE LPAREN ID RPAREN
    | ID (LPAREN argList? RPAREN)?
    ;

fileEntry
    : recordDecl
    | fileKeyDecl
    | fileIndexDecl
    | fileMemoDecl
    | fileBlobDecl
    ;

fileKeyDecl
    : ID (KEYKW LPAREN argList? RPAREN)? NEWLINE
    ;

fileIndexDecl
    : ID (INDEXKW LPAREN argList? RPAREN)? NEWLINE
    ;

fileMemoDecl
    : ID (MEMO LPAREN argList? RPAREN)? NEWLINE
    ;

fileBlobDecl
    : ID (BLOB)? NEWLINE
    ;

// ----- INTERFACE declarations -----
//
// label INTERFACE( [ parentinterface ] ) [,TYPE] [,COM]
//   [methods]
// END | .
interfaceDecl
    : ID INTERFACE LPAREN fieldRef? RPAREN
      (COMMA TYPEKW)?
      (COMMA COMKW)?
      NEWLINE
      interfaceMethod*
      (END | DOT)
    ;

interfaceMethod
    : prototypeDecl          // reuse MAP-style prototype syntax for methods
    ;

// ----- CLASS declarations -----
//
// label CLASS( [ parentclass ] )
// [,EXTERNAL] [,IMPLEMENTS] [,DLL( )] [,STATIC] [,THREAD] [,BINDABLE]
// [,MODULE( )] [,LINK( )] [,TYPE] [,DIM(dimension)] [,NETCLASS] [,PARTIAL]
//   [ data members and methods ]
// END | .
classDecl
    : ID CLASSKW LPAREN fieldRef? RPAREN classAttrList? NEWLINE
      classMember*
      (END | DOT)
    ;

classAttrList
    : COMMA classAttr (COMMA classAttr)*
    ;

classAttr
    : EXTERNAL
    | IMPLEMENTS (LPAREN fieldRef RPAREN)?   // IMPLEMENTS(interface)
    | DLL LPAREN expr? RPAREN
    | STATIC
    | THREAD
    | BINDABLE
    | MODULE LPAREN expr? RPAREN
    | LINKKW LPAREN expr? RPAREN
    | TYPEKW
    | DIM LPAREN expr RPAREN
    | NETCLASS
    | PARTIALKW
    ;

classMember
    : dataDecl
    | prototypeDecl
    ;

// ---------------------
// CODE section & statements
// ---------------------

codeSection
    : CODE NEWLINE statement*
    ;

// QUESTION may prefix any core statement (DEBUG-mode marker),
// NEWLINE alone is a blank/empty statement.
statement
    : QUESTION? coreStatement
    | NEWLINE
    ;

coreStatement
    : assignStmt
    | procCallStmt
    | doStmt
    | ifStmt
    | caseStmt
    | executeStmt
    | loopStmt
    | acceptStmt
    | breakStmt
    | cycleStmt
    | exitStmt
    | routineStmt
    | routineDataBlock
    | codeMarkerStmt
    | includeStmt
    | returnStmt
    | openCloseStmt
    ;

// "=", "&=", ":=:" assignments
assignStmt
    : fieldRef (EQUAL | AMP_EQUAL | DEEP_ASSIGN) expr NEWLINE
    ;

procCallStmt
    : fieldRef LPAREN argList? RPAREN NEWLINE
    ;

// DO Label
doStmt
    : DO fieldRef NEWLINE
    ;

// INCLUDE(filename [,section]) [,ONCE]
includeStmt
    : INCLUDEKW LPAREN expr (COMMA expr)? RPAREN (COMMA ONCEKW)? NEWLINE
    ;

// EXECUTE expression ... [BEGIN..END] ... [ELSE ...] END
executeStmt
    : EXECUTE expr NEWLINE
      executeBranch+
      executeElse?
      (END | DOT)
    ;

executeBranch
    : BEGIN NEWLINE statement* (END | DOT)   // block of statements for one index
    | statement                              // single statement for one index
    ;

executeElse
    : ELSE NEWLINE statement*
    ;

// IF logical expression [THEN]
//   statements
// [ ELSIF logical expression [THEN]
//   statements ]*
// [ ELSE
//   statements ]?
// END | .
ifStmt
    : IF expr (THEN)? NEWLINE
      statement*
      elsifClause*
      elseClause?
      (END | DOT)
    ;

elsifClause
    : ELSIF expr (THEN)? NEWLINE
      statement*
    ;

elseClause
    : ELSE NEWLINE
      statement*
    ;

// CASE with OF / OROF labels and ranges
caseStmt
    : CASE expr NEWLINE
      caseBranch*
      (ELSE NEWLINE statement*)?
      (END | DOT)
    ;

caseBranch
    : OF caseLabel (OROF caseLabel)* NEWLINE
      statement*
    ;

caseLabel
    : expr (TO expr)?      // simple value or range: 'A' or 'A' TO 'M'
    ;

// LOOP variants:
//
// [label] LOOP [ loopHead ] NEWLINE
//   statements
// [ loopTail ]
// END | .
//
// loopHead:
//   expr TIMES
//   ID = expr TO expr [ BY expr ]
//   UNTIL expr
//   WHILE expr
//
// loopTail:
//   UNTIL expr
//   WHILE expr
loopStmt
    : (ID)? LOOP loopHead? NEWLINE
      statement*
      loopTail?
      (END | DOT)
    ;

loopHead
    : expr TIMES                        // LOOP 10 TIMES
    | ID EQUAL expr TO expr (BY expr)?  // LOOP i = 1 TO 10 BY 2
    | UNTIL expr                        // LOOP UNTIL condition
    | WHILE expr                        // LOOP WHILE condition
    ;

loopTail
    : UNTIL expr                        // ... UNTIL condition
    | WHILE expr                        // ... WHILE condition
    ;

acceptStmt
    : ACCEPT NEWLINE statement* (END | DOT)
    ;

// BREAK [ label ]
breakStmt
    : BREAK (ID)? NEWLINE
    ;

// CYCLE [ label ]
cycleStmt
    : CYCLE (ID)? NEWLINE
    ;

// EXIT
exitStmt
    : EXIT NEWLINE
    ;

// ROUTINE label line
routineStmt
    : ID ROUTINE NEWLINE
    ;

// ROUTINE-local DATA block inside CODE
routineDataBlock
    : DATAKW NEWLINE
      declarationSection*
    ;

// Inner CODE marker line inside CODE section / ROUTINE
codeMarkerStmt
    : CODE NEWLINE
    ;

returnStmt
    : RETURN expr? NEWLINE
    ;

openCloseStmt
    : (OPEN | CLOSE) LPAREN fieldRef RPAREN NEWLINE
    ;

// ---------------------
// EXPRESSIONS (precedence-based)
// ---------------------
//
// Precedence (low → high):
//   OR
//   AND
//   =, <>, &=
//   <, <=, >, >=
//   +, -, &
//   *, /, %
//   ^
//   unary NOT, ~, unary -
//   primary (with [] and {} postfixes)

expr
    : orExpr
    ;

orExpr
    : andExpr (OR andExpr)*
    ;

andExpr
    : equalityExpr (AND equalityExpr)*
    ;

equalityExpr
    : relationalExpr ((EQUAL | NEQ | AMP_EQUAL) relationalExpr)*
    ;

relationalExpr
    : additiveExpr ((LT | LTE | GT | GTE) additiveExpr)*
    ;

additiveExpr
    : multiplicativeExpr ((PLUS | MINUS | AMP) multiplicativeExpr)*
    ;

multiplicativeExpr
    : powExpr ((STAR | DIV | PERCENT) powExpr)*
    ;

// Exponentiation: right-associative
powExpr
    : unaryExpr (CARET powExpr)?
    ;

unaryExpr
    : (NOT | TILDE | MINUS) unaryExpr
    | primary
    ;

// Primary with optional index/slice and property/repeat postfixes
primary
    : primaryBase
      ( LBRACE propertyParamList RBRACE      // Field{PROP:Text,@N3} or '='{10}
      | LBRACE expr RBRACE                   // simple repeat count {n}
      | LBRACKET indexOrSlice RBRACKET       // Name[1], S[1 : 5], S[ : 5], S[ : ]
      )*
    ;

primaryBase
    : literal
    | fieldRef
    | ENUM_KW LPAREN argList? RPAREN
    | ADDRESS LPAREN argList? RPAREN
    | LIKE LPAREN argList? RPAREN
    | LPAREN expr RPAREN
    ;

// Index or slice
// Style: slices are written with spaces around colon: [start : end]
indexOrSlice
    : expr                     // Name[1]
    | expr SLICE_COLON expr?   // S[1 : 5] or S[1 : ]
    | SLICE_COLON expr?        // S[ : 5] or S[ : ]
    ;

propertyParamList
    : expr (COMMA expr)*
    ;

literal
    : INTLIT
    | REALLIT
    | STRING
    | TRUE
    | FALSE
    | PICTURE
    | NULLKW
    ;

// ---------------------
// Field references
// ---------------------

fieldRef
    : ID (DOT ID)*      // Struct.Member, Win$Form:Field, etc.
    ;

// ==============================
// Lexer rules
// ==============================

// Keywords

PROGRAM     : 'PROGRAM';
MAP         : 'MAP';
MODULE      : 'MODULE';
END         : 'END';
CODE        : 'CODE';
DATAKW      : 'DATA';

// Types
BYTE        : 'BYTE';
SHORT       : 'SHORT';
USHORT      : 'USHORT';
LONG        : 'LONG';
ULONG       : 'ULONG';
DECIMAL     : 'DECIMAL';
REAL        : 'REAL';
SREAL       : 'SREAL';
STRINGKW    : 'STRING';
BSTRING     : 'BSTRING';
PSTRING     : 'PSTRING';
BLOB        : 'BLOB';
MEMO        : 'MEMO';
DATE        : 'DATE';
TIME        : 'TIME';
BOOL        : 'BOOL';
QUEUE       : 'QUEUE';
WINDOW      : 'WINDOW';
FILE        : 'FILE';
GROUP       : 'GROUP';
ANY         : 'ANY';

// Data attributes / modifiers
PRE         : 'PRE';
USE         : 'USE';
LIKE        : 'LIKE';
DIM         : 'DIM';
OVER        : 'OVER';
STATIC      : 'STATIC';
PRIVATE     : 'PRIVATE';
THREAD      : 'THREAD';
AUTO        : 'AUTO';
BINARYKW    : 'BINARY';

// Controls
BUTTON      : 'BUTTON';
LIST        : 'LIST';
ENTRY       : 'ENTRY';
BOX         : 'BOX';
LINE        : 'LINE';
OLE         : 'OLE';
OTHERCONTROL
    : 'PROMPT'
    | 'CHECK'
    | 'RADIO'
    | 'SHEET'
    | 'TAB'
    ;

// Procedures / logic
PROC        : 'PROC';
ACCEPT      : 'ACCEPT';
CASE        : 'CASE';
OF          : 'OF';
OROF        : 'OROF';
ELSE        : 'ELSE';
ELSIF       : 'ELSIF';
IF          : 'IF';
THEN        : 'THEN';
LOOP        : 'LOOP';
BREAK       : 'BREAK';
CYCLE       : 'CYCLE';
EXIT        : 'EXIT';
ROUTINE     : 'ROUTINE';
RETURN      : 'RETURN';
OPEN        : 'OPEN';
CLOSE       : 'CLOSE';
DO          : 'DO';
EXECUTE     : 'EXECUTE';
BEGIN       : 'BEGIN';
TRUE        : 'TRUE';
FALSE       : 'FALSE';
AND         : 'AND';
OR          : 'OR';
NOT         : 'NOT';
TO          : 'TO';
ENUM_KW     : 'ENUM';
ADDRESS     : 'ADDRESS';
EQUATE      : 'EQUATE';
TIMES       : 'TIMES';
WHILE       : 'WHILE';
UNTIL       : 'UNTIL';
BY          : 'BY';
INTERFACE   : 'INTERFACE';
TYPEKW      : 'TYPE';
COMKW       : 'COM';
CLASSKW     : 'CLASS';
EXTERNAL    : 'EXTERNAL';
IMPLEMENTS  : 'IMPLEMENTS';
DLL         : 'DLL';
BINDABLE    : 'BINDABLE';
LINKKW      : 'LINK';
NETCLASS    : 'NETCLASS';
PARTIALKW   : 'PARTIAL';
PROCEDUREKW : 'PROCEDURE';
ITEMIZEKW   : 'ITEMIZE';
INCLUDEKW   : 'INCLUDE';
ONCEKW      : 'ONCE';
KEYKW       : 'KEY';
INDEXKW     : 'INDEX';
NULLKW      : 'NULL';

// Window/report & UI-related
REPORTKW    : 'REPORT';
ATKW        : 'AT';
FONTKW      : 'FONT';
COLORKW     : 'COLOR';

CENTER      : 'CENTER';
CENTERED    : 'CENTERED';
SYSTEMKW    : 'SYSTEM';
MAXKW       : 'MAX';
ICONKW      : 'ICON';
STATUSKW    : 'STATUS';
HLPKW       : 'HLP';
CURSORKW    : 'CURSOR';
MDI         : 'MDI';
MODAL       : 'MODAL';
MASK        : 'MASK';
GRAY        : 'GRAY';
TIMERKW     : 'TIMER';
ALRTKW      : 'ALRT';
ICONIZE     : 'ICONIZE';
MAXIMIZE    : 'MAXIMIZE';
MSGKW       : 'MSG';
PALETTEKW   : 'PALETTE';
DROPIDKW    : 'DROPID';
IMMKW       : 'IMM';
TOOLBOX     : 'TOOLBOX';
DOCKKW      : 'DOCK';
DOCKEDKW    : 'DOCKED';
LAYOUTKW    : 'LAYOUT';
TILED       : 'TILED';
HSCROLL     : 'HSCROLL';
VSCROLL     : 'VSCROLL';
NOFRAME     : 'NOFRAME';
HVSCROLL    : 'HVSCROLL';
RESIZEKW    : 'RESIZE';
DOUBLE      : 'DOUBLE';

MENUBAR     : 'MENUBAR';
TOOLBAR     : 'TOOLBAR';

// Report sections
FORMKW      : 'FORM';
HEADERKW    : 'HEADER';
DETAILKW    : 'DETAIL';
FOOTERKW    : 'FOOTER';
LANDSCAPEKW : 'LANDSCAPE';
PREVIEWKW   : 'PREVIEW';
PAPERKW     : 'PAPER';
THOUS       : 'THOUS';
MMKW        : 'MM';
POINTSKW    : 'POINTS';

// Operators / punctuation
// Multi-character tokens must come before their prefixes.

LTE         : "<=";
GTE         : ">=";
NEQ         : "<>";
AMP_EQUAL   : "&=";
DEEP_ASSIGN : ':=:';   // deep assignment

SLICE_COLON : ':';     // used only in slice grammar

LT          : "<";
GT          : ">";
EQUAL       : "=";

CARET       : '^';
PLUS        : '+';
MINUS       : '-';
STAR        : '*';
DIV         : '/';
PERCENT     : '%';
TILDE       : '~';
AMP         : '&';
DOT         : '.';

LPAREN      : '(' ;
RPAREN      : ')' ;
LBRACE      : '{' ;
RBRACE      : '}' ;
LBRACKET    : '[' ;
RBRACKET    : ']' ;
COMMA       : ',' ;

QUESTION    : '?' ;

// Literals

INTLIT      : [0-9]+ ;
REALLIT     : [0-9]+ '.' [0-9]+ ;
STRING      : '\'' (~['\\] | '\\' .)* '\'' ;

// Picture tokens: start with '@' and run until whitespace or punctuation
PICTURE
    : '@' ~[ \t\r\n,;(){}\[\]]+
    ;

// Identifiers with implicit variable suffixes and colon/dollar in the body
ID
    : [A-Za-z_][A-Za-z0-9_:$]* '#'
    | [A-Za-z_][A-Za-z0-9_:$]* '$'
    | [A-Za-z_][A-Za-z0-9_:$]* '"'
    | [A-Za-z_][A-Za-z0-9_:$]*
    ;

// ==============================
// Line continuation + newline
// ==============================

// "|" at end of line joins with next line
LINE_CONTINUATION
    : '|' [ \t]* [\r\n]+ -> skip
    ;

// NEWLINE also includes semicolon
NEWLINE
    : [\r\n]+
    | ';'
    ;

// ==============================
// Comments & whitespace
// ==============================

LINE_COMMENT
    : '!' ~[\r\n]* -> skip
    ;

BLOCK_COMMENT
    : '/*' .*? '*/' -> skip
    ;

WS
    : [ \t\f]+ -> skip
    ;


Great job - rock on!

I played around with the ANTLR Lab. I really know nothing about ANTLR and have read little. After your prior post I had split the file into 2 blocks, Parser and Lexer. I pasted those into the separate tabs of ANTLR Lab and that did not work well ... at first. Then I figured out each tab needed a line at the top, see below.

#1 In the Lab the Lexer tab needs all the “Sample” contents deleted; you can leave the first line "\\Delete This" as a reminder.

(NB: See below, this turned out to be WRONG. The right way is to keep the line lexer grammar ExprLexer; and paste the Clarion Lexer lines below.)


#2 In the Parser tab leave the 2 lines below at the top.

parser grammar ExprParser;
options { tokenVocab=ExprLexer; }

Without these 2 lines you’ll get many java.lang.UnsupportedOperationException errors. Below those 2 lines paste the Parser text from Roberto (but not the Lexer text).


I found my Lab tests worked best in Edge, though it would also throw exceptions. In Chrome it would error with exceptions or not refresh. But maybe I was doing something wrong. I tried clicking Run with this simple program:

   PROGRAM
   MAP
   END
   CODE

One time I got a result that usefully pointed out that lines with Double Quotes " should use Single quotes '.

So I fixed those Quotes and then got a batch of Lexer errors. (NB: At this point I had it WRONG, with the Lexer text after the Parser text on the Parser tab … keep reading…)

So either that needs to move to the Lexer tab, or I’ll try inserting the line lexer grammar ExprLexer; seen in the sample at line 363:

That didn’t work. The error told me the way I had the Lexer combined with the Parser was not right. There probably is a way but I don’t know it, so I moved (cut) the Lexer specs to the Lexer tab and included the line lexer grammar ExprLexer; at the top, as is seen in the sample when it opens.

Now I get just one error, at line 116:

I didn’t try to look up the RegEx escape rules, so I took '@' ~[ \t\r\n,;(){}\[\]]+ and removed the \[\] to keep testing. I clicked Run and I think it worked, as it shows the Parse Tree with 3 “extraneous input” warnings.

Changed the Parse Tree (above) to Hierarchy view (below):


A little more testing:

   PROGRAM
   MAP
   END
   CODE
   Message('Hello World','Alert',ICON:Asterisk,BUTTON:Ok)
   RETURN
!FYI must often have "New Line" on the last statement e.g. a RETURN alone
!    or get: 6:9 mismatched input '' expecting {'TRUE', 'FALSE', 'NOT', '-', '~', '(', INTLIT, REALLIT, STRING, PICTURE, ID, NEWLINE}

And it parses and shows a Tree.
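
The comment in the sample above says the last statement usually needs a trailing "New Line", which fits the grammar since statement rules such as returnStmt end in NEWLINE. A minimal sketch of guarding against that before pasting, assuming the Clarion source is held in a Python string; ensure_trailing_newline is a hypothetical helper name, not part of ANTLR or the Lab:

```python
def ensure_trailing_newline(source: str) -> str:
    """Return source unchanged if it already ends in a line terminator,
    otherwise append a single newline so the last statement is terminated."""
    if source.endswith(("\n", "\r")):
        return source
    return source + "\n"


if __name__ == "__main__":
    sample = "   PROGRAM\n   MAP\n   END\n   CODE\n   RETURN"
    fixed = ensure_trailing_newline(sample)
    # Show the tail of the text so the added terminator is visible.
    print(repr(fixed[-8:]))
```

This only covers the "RETURN alone on the last line" case from the note; it does not change anything else in the source.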


To recap my post of how to test.

First you need to split Roberto’s spec into 2 parts, Parser and Lexer:

  1. Copy it from his post and paste it into an Editor
  2. Search for “Lexer”. You should find // Lexer rules.
  3. Cut everything from that line down into a separate Editor Tab
  4. This leaves the lines above that are the Parser specs
  5. Include a final CRLF at the bottom of these.

Open ANTLR Lab in Edge


In the Parser Tab paste these 2 lines:

parser grammar ExprParser;
options { tokenVocab=ExprLexer; }

Then paste Roberto’s Parser lines from step 4 above.


In the Lexer Tab paste this line:

lexer grammar ExprLexer;

Then paste Roberto’s Lexer lines from step 3 above.


Now you can paste in Clarion code and click Run.
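
The manual split in the recap above could also be scripted. The following is only a sketch under stated assumptions: the combined grammar is available as one string, the "// Lexer rules" comment appears exactly once as the divider, and the two headers are the ExprParser / ExprLexer lines the Lab sample uses. split_grammar is a hypothetical helper name, and moving the "// ====" banner line along with the lexer half is my own choice, not something the steps require.

```python
from typing import Tuple

PARSER_HEADER = "parser grammar ExprParser;\noptions { tokenVocab=ExprLexer; }\n\n"
LEXER_HEADER = "lexer grammar ExprLexer;\n\n"
MARKER = "// Lexer rules"


def split_grammar(combined: str) -> Tuple[str, str]:
    """Split a pasted grammar at the '// Lexer rules' comment.

    Returns (parser_text, lexer_text), each with its Lab header prepended
    and a final newline at the bottom.
    """
    lines = combined.splitlines()
    marker_idx = next((i for i, line in enumerate(lines) if MARKER in line), None)
    if marker_idx is None:
        raise ValueError("marker %r not found" % MARKER)
    # If a '// ====' banner sits directly above the marker, keep it with
    # the lexer half so no stray banner is left at the end of the parser.
    start = marker_idx
    if start > 0 and lines[start - 1].lstrip().startswith("// ="):
        start -= 1
    parser_body = "\n".join(lines[:start]).rstrip() + "\n"
    lexer_body = "\n".join(lines[start:]).rstrip() + "\n"
    return PARSER_HEADER + parser_body, LEXER_HEADER + lexer_body


if __name__ == "__main__":
    demo = (
        "program : PROGRAM EOF ;\n"
        "// ==============================\n"
        "// Lexer rules\n"
        "// ==============================\n"
        "PROGRAM : 'PROGRAM';\n"
    )
    parser_text, lexer_text = split_grammar(demo)
    print(parser_text)
    print(lexer_text)
```

The two returned strings are what would go into the Parser tab and the Lexer tab respectively.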

Thank you, Mr. Carl.
This shows that it might be possible to get the complete Clarion language defined and to create utilities from it. If you could please post both of your corrected files back, it might help additional folks get more of this working.
Very good instructions and explanations on what you did.

In your final file from today, these lines with Double Quotes " need to be changed to Single quotes ':

// Operators / punctuation
// Multi-character tokens must come before their prefixes.

LTE         : "<=";
GTE         : ">=";
NEQ         : "<>";
AMP_EQUAL   : "&=";
DEEP_ASSIGN : ':=:';   // deep assignment

SLICE_COLON : ':';     // used only in slice grammar

LT          : "<";
GT          : ">";
EQUAL       : "=";
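
For anyone who would rather not fix those quotes by hand, here is a rough sketch of doing it with a script. It assumes the lexer text is in a Python string and that only simple one-line token definitions of the form NAME : "..."; need changing, with no single quotes inside the literal. fix_double_quotes is a hypothetical helper name.

```python
import re

# A double-quoted literal on one line, e.g. "<=" (no embedded double quote).
_DOUBLE_QUOTED = re.compile(r'"([^"\r\n]*)"')
# A token definition that starts with an upper-case name and a double quote.
_TOKEN_LINE = re.compile(r'\s*[A-Z_][A-Z0-9_]*\s*:\s*"')


def fix_double_quotes(lexer_text: str) -> str:
    """Rewrite NAME : "x"; token definitions to use single quotes."""
    fixed_lines = []
    for line in lexer_text.splitlines():
        # Leave everything else alone, including the ID alternative that
        # legitimately contains '"' as an implicit-variable suffix.
        if _TOKEN_LINE.match(line):
            line = _DOUBLE_QUOTED.sub(r"'\1'", line, count=1)
        fixed_lines.append(line)
    return "\n".join(fixed_lines) + "\n"


if __name__ == "__main__":
    broken = 'LTE         : "<=";\nDEEP_ASSIGN : \':=:\';   // deep assignment\n'
    print(fix_double_quotes(broken))
```

Lines that already use single quotes, such as DEEP_ASSIGN, pass through unchanged.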

The rule below throws the error “invalid escape sequence \[”:

// Picture tokens: start with '@' and run until whitespace or punctuation
PICTURE
    : '@' ~[ \t\r\n,;(){}\[\]]+
    ;

The RegEx must not be right. I guessed that the Backslash was not needed on the Open Bracket \[ (just on the closing one), so I took it out, and [ alone worked.

// Picture tokens: start with '@' and run until whitespace or punctuation
PICTURE
    : '@' ~[ \t\r\n,;(){}[\]]+
    ;

The [] surround the character set. Inside a set you cannot have an inner set, so another [ does not need a \ escape.
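
To get a feel for what the corrected PICTURE rule accepts, here is an approximation in Python's re syntax. This is only a sketch: Python's escaping rules differ from ANTLR's (in Python I escape both brackets inside the set), and a regex search finds matches anywhere in the text, whereas the ANTLR lexer only starts a token at a token boundary. find_pictures is a hypothetical helper name.

```python
import re
from typing import List

# Approximation of: PICTURE : '@' ~[ \t\r\n,;(){}[\]]+ ;
# '@' followed by one or more characters that are not whitespace
# or one of , ; ( ) { } [ ]
PICTURE_RE = re.compile(r"@[^ \t\r\n,;(){}\[\]]+")


def find_pictures(text: str) -> List[str]:
    """Return every picture-like token found in text."""
    return PICTURE_RE.findall(text)


if __name__ == "__main__":
    print(find_pictures("ENTRY(@N3),USE(Amount[1])  STRING(@D2)"))
```

In the sample line the picture stops at the closing parenthesis, which is the "run until whitespace or punctuation" behaviour the rule's comment describes.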


Now I’m left with “warning: 488:9: implicit definition of token RECORD in parser”

A search turned up that an “Implicit Definition” warning means the Lexer is missing the token. So I added the line RECORD : 'RECORD'; on the Lexer tab. Now it parses, even in Chrome.

So there are 3 things to fix in your file: the Double Quotes ", the \[ in the @ picture RegEx, and adding RECORD : 'RECORD';
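
The "implicit definition" warning could be checked for ahead of time with a small script. This is a rough sketch, not how ANTLR itself does it: it assumes the parser and lexer halves are separate strings, that token names are all upper case (as in this grammar), and that every lexer rule is written as NAME followed by a colon. undefined_tokens is a hypothetical helper name.

```python
import re
from typing import List

# Lexer rule names: an upper-case name at line start followed by ':'
# (the colon may be on the next line, as with OTHERCONTROL).
TOKEN_DEF = re.compile(r"^[ \t]*([A-Z][A-Z0-9_]*)\s*:", re.MULTILINE)
# Upper-case names referenced inside parser rules.
TOKEN_REF = re.compile(r"\b[A-Z][A-Z0-9_]*\b")
LINE_COMMENT = re.compile(r"//[^\n]*")
LITERAL = re.compile(r"'(?:\\.|[^'\\\n])*'")


def undefined_tokens(parser_text: str, lexer_text: str) -> List[str]:
    """List token names the parser uses but the lexer never defines."""
    defined = set(TOKEN_DEF.findall(lexer_text))
    # Strip comments first (they may contain apostrophes), then literals.
    stripped = LINE_COMMENT.sub("", parser_text)
    stripped = LITERAL.sub("", stripped)
    used = set(TOKEN_REF.findall(stripped))
    # EOF is built in and never defined in the lexer.
    return sorted(used - defined - {"EOF"})


if __name__ == "__main__":
    parser = "recordDecl : ID RECORD NEWLINE ;\nprogram : PROGRAM EOF ;\n"
    lexer = "PROGRAM : 'PROGRAM';\nID\n    : [A-Za-z_]+\n    ;\nNEWLINE : [\\r\\n]+ ;\n"
    print(undefined_tokens(parser, lexer))
```

With the demo inputs the only name reported is RECORD, which matches the warning described above.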

Here’s what I am pasting into ANTLR Lab, based on your “final post for today” with the " quotes, the @ \[ and the implicit RECORD fixed.

Lexer tab:

lexer grammar ExprLexer;

// ==============================
// Lexer rules for Clarion Language 
// ==============================

// Keywords
RECORD      : 'RECORD';    // Carl added 1st, move down after FILE
PROGRAM     : 'PROGRAM';
MAP         : 'MAP';
MODULE      : 'MODULE';
END         : 'END';
CODE        : 'CODE';
DATAKW      : 'DATA';

// Types
BYTE        : 'BYTE';
SHORT       : 'SHORT';
USHORT      : 'USHORT';
LONG        : 'LONG';
ULONG       : 'ULONG';
DECIMAL     : 'DECIMAL';
REAL        : 'REAL';
SREAL       : 'SREAL';
STRINGKW    : 'STRING';
BSTRING     : 'BSTRING';
PSTRING     : 'PSTRING';
BLOB        : 'BLOB';
MEMO        : 'MEMO';
DATE        : 'DATE';
TIME        : 'TIME';
BOOL        : 'BOOL';
QUEUE       : 'QUEUE';
WINDOW      : 'WINDOW';
FILE        : 'FILE';
GROUP       : 'GROUP';
ANY         : 'ANY';

// Data attributes / modifiers
PRE         : 'PRE';
USE         : 'USE';
LIKE        : 'LIKE';
DIM         : 'DIM';
OVER        : 'OVER';
STATIC      : 'STATIC';
PRIVATE     : 'PRIVATE';
THREAD      : 'THREAD';
AUTO        : 'AUTO';
BINARYKW    : 'BINARY';

// Controls
BUTTON      : 'BUTTON';
LIST        : 'LIST';
ENTRY       : 'ENTRY';
BOX         : 'BOX';
LINE        : 'LINE';
OLE         : 'OLE';
OTHERCONTROL
    : 'PROMPT'
    | 'CHECK'
    | 'RADIO'
    | 'SHEET'
    | 'TAB'
    ;

// Procedures / logic
PROC        : 'PROC';
ACCEPT      : 'ACCEPT';
CASE        : 'CASE';
OF          : 'OF';
OROF        : 'OROF';
ELSE        : 'ELSE';
ELSIF       : 'ELSIF';
IF          : 'IF';
THEN        : 'THEN';
LOOP        : 'LOOP';
BREAK       : 'BREAK';
CYCLE       : 'CYCLE';
EXIT        : 'EXIT';
ROUTINE     : 'ROUTINE';
RETURN      : 'RETURN';
OPEN        : 'OPEN';
CLOSE       : 'CLOSE';
DO          : 'DO';
EXECUTE     : 'EXECUTE';
BEGIN       : 'BEGIN';
TRUE        : 'TRUE';
FALSE       : 'FALSE';
AND         : 'AND';
OR          : 'OR';
NOT         : 'NOT';
TO          : 'TO';
ENUM_KW     : 'ENUM';
ADDRESS     : 'ADDRESS';
EQUATE      : 'EQUATE';
TIMES       : 'TIMES';
WHILE       : 'WHILE';
UNTIL       : 'UNTIL';
BY          : 'BY';
INTERFACE   : 'INTERFACE';
TYPEKW      : 'TYPE';
COMKW       : 'COM';
CLASSKW     : 'CLASS';
EXTERNAL    : 'EXTERNAL';
IMPLEMENTS  : 'IMPLEMENTS';
DLL         : 'DLL';
BINDABLE    : 'BINDABLE';
LINKKW      : 'LINK';
NETCLASS    : 'NETCLASS';
PARTIALKW   : 'PARTIAL';
PROCEDUREKW : 'PROCEDURE';
ITEMIZEKW   : 'ITEMIZE';
INCLUDEKW   : 'INCLUDE';
ONCEKW      : 'ONCE';
KEYKW       : 'KEY';
INDEXKW     : 'INDEX';
NULLKW      : 'NULL';

// Window/report & UI-related
REPORTKW    : 'REPORT';
ATKW        : 'AT';
FONTKW      : 'FONT';
COLORKW     : 'COLOR';

CENTER      : 'CENTER';
CENTERED    : 'CENTERED';
SYSTEMKW    : 'SYSTEM';
MAXKW       : 'MAX';
ICONKW      : 'ICON';
STATUSKW    : 'STATUS';
HLPKW       : 'HLP';
CURSORKW    : 'CURSOR';
MDI         : 'MDI';
MODAL       : 'MODAL';
MASK        : 'MASK';
GRAY        : 'GRAY';
TIMERKW     : 'TIMER';
ALRTKW      : 'ALRT';
ICONIZE     : 'ICONIZE';
MAXIMIZE    : 'MAXIMIZE';
MSGKW       : 'MSG';
PALETTEKW   : 'PALETTE';
DROPIDKW    : 'DROPID';
IMMKW       : 'IMM';
TOOLBOX     : 'TOOLBOX';
DOCKKW      : 'DOCK';
DOCKEDKW    : 'DOCKED';
LAYOUTKW    : 'LAYOUT';
TILED       : 'TILED';
HSCROLL     : 'HSCROLL';
VSCROLL     : 'VSCROLL';
NOFRAME     : 'NOFRAME';
HVSCROLL    : 'HVSCROLL';
RESIZEKW    : 'RESIZE';
DOUBLE      : 'DOUBLE';

MENUBAR     : 'MENUBAR';
TOOLBAR     : 'TOOLBAR';

// Report sections
FORMKW      : 'FORM';
HEADERKW    : 'HEADER';
DETAILKW    : 'DETAIL';
FOOTERKW    : 'FOOTER';
LANDSCAPEKW : 'LANDSCAPE';
PREVIEWKW   : 'PREVIEW';
PAPERKW     : 'PAPER';
THOUS       : 'THOUS';
MMKW        : 'MM';
POINTSKW    : 'POINTS';

// Operators / punctuation
// ANTLR lexers pick the longest match, so rule order only breaks ties between equal-length matches.

LTE         : '<=';
GTE         : '>=';
NEQ         : '<>';
AMP_EQUAL   : '&=';
DEEP_ASSIGN : ':=:';   // deep assignment

SLICE_COLON : ':';     // used only in slice grammar

LT          : '<';
GT          : '>';
EQUAL       : '=';

CARET       : '^';
PLUS        : '+';
MINUS       : '-';
STAR        : '*';
DIV         : '/';
PERCENT     : '%';
TILDE       : '~';
AMP         : '&';
DOT         : '.';

LPAREN      : '(' ;
RPAREN      : ')' ;
LBRACE      : '{' ;
RBRACE      : '}' ;
LBRACKET    : '[' ;
RBRACKET    : ']' ;
COMMA       : ',' ;

QUESTION    : '?' ;

// Literals

INTLIT      : [0-9]+ ;
REALLIT     : [0-9]+ '.' [0-9]+ ;
STRING      : '\'' (~['\\] | '\\' .)* '\'' ;

// Picture tokens: start with '@' and run until whitespace or punctuation
PICTURE
    : '@' ~[ \t\r\n,;(){}[\]]+
    ;

// Identifiers with implicit variable suffixes and colon/dollar in the body
ID
    : [A-Za-z_][A-Za-z0-9_:$]* '#'
    | [A-Za-z_][A-Za-z0-9_:$]* '$'
    | [A-Za-z_][A-Za-z0-9_:$]* '"'
    | [A-Za-z_][A-Za-z0-9_:$]*
    ;

// ==============================
// Line continuation + newline
// ==============================

// "|" at end of line joins with next line
LINE_CONTINUATION
    : '|' [ \t]* [\r\n]+ -> skip
    ;

// NEWLINE also includes semicolon
NEWLINE
    : [\r\n]+
    | ';'
    ;

// ==============================
// Comments & whitespace
// ==============================

LINE_COMMENT
    : '!' ~[\r\n]* -> skip
    ;

BLOCK_COMMENT
    : '/*' .*? '*/' -> skip
    ;

WS
    : [ \t\f]+ -> skip
    ;


Parser tab:

parser grammar ExprParser;
options { tokenVocab=ExprLexer; }

// ==============================
// Parser rules for Clarion Language 
// ==============================

// Program structure

program
    : programHeader? mapSection? declarationSection* codeSection EOF
    ;

programHeader
    : PROGRAM
    ;

// ----- MAP section -----
//
// MAP
//   prototypes
//   [ MODULE( [expr] )
//       prototypes
//     END ]
// END | .
mapSection
    : MAP mapEntry* (END | DOT)
    ;

mapEntry
    : moduleDecl
    | prototypeDecl
    ;

// MODULE inside MAP
moduleDecl
    : MODULE LPAREN expr? RPAREN NEWLINE?
      prototypeDecl*
      (END | DOT)
    ;

// ----- MAP prototypes -----

// MAP prototype, with optional parameters in angle brackets
prototypeDecl
    : ID LPAREN prototypeParamList? RPAREN (COMMA returnType)? (COMMA PROC)? NEWLINE?
    ;

prototypeParamList
    : prototypeParam (COMMA prototypeParam)*
    ;

// Parameters in MAP: by value, by address (*type), or optional <...>
prototypeParam
    : STAR typeSpec               // by-address parameter: *LONG
    | typeSpec                    // by-value parameter: LONG
    | LT STAR typeSpec GT         // optional by-address: <*STRING>
    | LT typeSpec GT              // optional by-value: <STRING>
    ;

// ---------------------
// Types
// ---------------------

// Note: GROUP / FILE / REPORT are block declarations, not simple scalar types

typeSpec
    : BYTE
    | SHORT
    | USHORT
    | LONG
    | ULONG
    | DECIMAL
    | REAL
    | SREAL
    | STRINGKW
    | BSTRING
    | PSTRING
    | BLOB
    | MEMO
    | DATE
    | TIME
    | BOOL
    | QUEUE
    | WINDOW
    | ANY
    | ID
    ;

// Builtin types only (no ID) – for labelDecl "type" lines
builtinType
    : BYTE
    | SHORT
    | USHORT
    | LONG
    | ULONG
    | DECIMAL
    | REAL
    | SREAL
    | STRINGKW
    | BSTRING
    | PSTRING
    | BLOB
    | MEMO
    | DATE
    | TIME
    | BOOL
    | QUEUE
    | WINDOW
    | ANY
    ;

// ---------------------
// Procedure parameter lists for PROCEDURE prototypes/impls
// ---------------------

procedureProtoParamList
    : procedureProtoParam (COMMA procedureProtoParam)*
    ;

procedureProtoParam
    : typeSpec (ID)?       // BYTE xVersion  or just BYTE
    ;

// ---------------------
// Declarations
// ---------------------

declarationSection
    : dataDecl
    | queueDecl
    | windowDecl
    | reportDecl
    | procedureDecl
    | labelDecl
    | enumDecl
    | interfaceDecl
    | classDecl
    | procedureProtoDecl
    | itemizeDecl
    | recordDecl
    | fileDecl
    | groupDecl
    | moduleDecl2
    ;

// ----- DATA declarations -----

dataDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

dataLikeOrType
    : LIKE LPAREN fieldRef RPAREN
    | typeSpec typeArgs?
    ;

typeArgs
    : LPAREN argList? RPAREN
    ;

argList
    : expr (COMMA expr)*
    ;

dataAttr
    : PRE LPAREN ID RPAREN
    | USE LPAREN fieldRef RPAREN
    | DIM LPAREN expr (COMMA expr)* RPAREN
    | OVER LPAREN fieldRef RPAREN
    | STATIC
    | PRIVATE
    | THREAD
    | AUTO
    | BINARYKW
    | ID (LPAREN argList? RPAREN)?
    ;

// ----- QUEUE declarations -----
//
// Label QUEUE( [ group ] )
//   [,PRE] [,STATIC] [,THREAD] [,TYPE] [,BINDABLE] [,EXTERNAL] [,DLL]
//   fieldLabel variable [,NAME( )]
// END | .
queueDecl
    : ID QUEUE queueTypeArgs? (COMMA queueAttr)* NEWLINE
      queueFieldDecl+
      (END | DOT)
    ;

queueTypeArgs
    : LPAREN fieldRef? RPAREN
    ;

queueAttr
    : PRE (LPAREN ID RPAREN)?              // PRE or PRE(ID)
    | STATIC
    | THREAD
    | TYPEKW
    | BINDABLE
    | EXTERNAL
    | DLL (LPAREN argList? RPAREN)?
    | ID (LPAREN argList? RPAREN)?         // other attrs, e.g. NAME(), etc.
    ;

queueFieldDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

// ----- WINDOW declarations -----
//
// label WINDOW('title')  [,AT( )] [,CENTER] [,SYSTEM] [,MAX] [,ICON( )] [,STATUS( )] [,HLP( )]
//                        [,CURSOR( )] [,MDI] [,MODAL] [,MASK] [,FONT( )] [,GRAY][,TIMER( )]
//                        [,ALRT( )] [,ICONIZE] [,MAXIMIZE] [,MSG( )] [,PALETTE( )] [,DROPID( )] [,IMM]
//                        [,AUTO] [,COLOR( )] [,TOOLBOX] [,DOCK( )] [,DOCKED( )] [,LAYOUT( )]
//                        [,TILED] [,HSCROLL] [,DOUBLE] [,CENTERED] [,VSCROLL] [,NOFRAME]
//                        [,HVSCROLL] [,RESIZE]
//   [MENUBAR ... END]
//   [TOOLBAR ... END]
//   Controls
// END | .
windowDecl
    : ID WINDOW windowArgs? NEWLINE
      windowMenuBarSection?
      windowToolBarSection?
      windowControlDecl*
      (END | DOT)
    ;

windowArgs
    : LPAREN argList? RPAREN (COMMA windowAttr)*
    ;

windowAttr
    : ATKW      LPAREN argList? RPAREN
    | FONTKW    LPAREN argList? RPAREN
    | ICONKW    LPAREN argList? RPAREN
    | STATUSKW  LPAREN argList? RPAREN
    | HLPKW     LPAREN argList? RPAREN
    | CURSORKW  LPAREN argList? RPAREN
    | TIMERKW   LPAREN argList? RPAREN
    | ALRTKW    LPAREN argList? RPAREN
    | MSGKW     LPAREN argList? RPAREN
    | PALETTEKW LPAREN argList? RPAREN
    | DROPIDKW  LPAREN argList? RPAREN
    | DOCKKW    LPAREN argList? RPAREN
    | DOCKEDKW  LPAREN argList? RPAREN
    | LAYOUTKW  LPAREN argList? RPAREN
    | COLORKW   LPAREN argList? RPAREN
    | CENTER
    | CENTERED
    | SYSTEMKW
    | MAXKW
    | MDI
    | MODAL
    | MASK
    | GRAY
    | ICONIZE
    | MAXIMIZE
    | IMMKW
    | AUTO
    | TOOLBOX
    | TILED
    | HSCROLL
    | VSCROLL
    | NOFRAME
    | HVSCROLL
    | RESIZEKW
    | DOUBLE
    | ID (LPAREN argList? RPAREN)?
    ;

// MENUBAR ... END
windowMenuBarSection
    : MENUBAR NEWLINE
      windowMenuDecl*
      (END | DOT)
    ;

// Loose "menus and/or items" line
windowMenuDecl
    : ID (LPAREN argList? RPAREN)?
      (COMMA ID (LPAREN argList? RPAREN)?)* NEWLINE
    ;

// TOOLBAR ... END
windowToolBarSection
    : TOOLBAR NEWLINE
      windowControlDecl*
      (END | DOT)
    ;

windowControlDecl
    : ID? controlType controlAttrs NEWLINE
    ;

controlType
    : BUTTON
    | LIST
    | ENTRY
    | STRINGKW
    | GROUP
    | BOX
    | LINE
    | OLE
    | OTHERCONTROL
    ;

controlAttrs
    : (COMMA controlAttr)+
    ;

controlAttr
    : ID
    | ID LPAREN argList? RPAREN
    ;

// ----- REPORT declarations -----
//
// label REPORT([jobname]),
//       AT() [,FONT()] [,PRE()] [,LANDSCAPE] [,PREVIEW] [,PAPER]
//       [,COLOR()]
//       [THOUS | MM | POINTS]
//   [FORM   controls END]
//   [HEADER controls END]
//   label DETAIL controls END
//   [label BREAK(...) controls END]
//   [FOOTER controls END]
// END | .
reportDecl
    : ID REPORTKW LPAREN argList? RPAREN
      COMMA ATKW LPAREN argList? RPAREN
      (COMMA reportAttr)* NEWLINE
      reportSection*
      (END | DOT)
    ;

reportAttr
    : FONTKW   LPAREN argList? RPAREN
    | PRE      LPAREN ID RPAREN
    | LANDSCAPEKW
    | PREVIEWKW
    | PAPERKW
    | COLORKW  LPAREN argList? RPAREN
    | THOUS
    | MMKW
    | POINTSKW
    | ID (LPAREN argList? RPAREN)?
    ;

reportSection
    : reportFormSection
    | reportHeaderSection
    | reportDetailSection
    | reportBreakSection
    | reportFooterSection
    ;

// FORM ... END
reportFormSection
    : FORMKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// HEADER ... END
reportHeaderSection
    : HEADERKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// label DETAIL ... END
reportDetailSection
    : ID DETAILKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// label BREAK(...) ... END
reportBreakSection
    : ID BREAK LPAREN argList? RPAREN NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// FOOTER ... END
reportFooterSection
    : FOOTERKW NEWLINE
      reportControlDecl*
      (END | DOT)
    ;

// reuse same shape as window controls
reportControlDecl
    : ID? controlType controlAttrs NEWLINE
    ;

// ----- Procedure prototypes (no CODE) -----
//
// Name PROCEDURE [ ( param list ) ] [ ,ReturnType ]
procedureProtoDecl
    : ID PROCEDUREKW (LPAREN procedureProtoParamList? RPAREN)?
      (COMMA returnType)?
      NEWLINE
    ;

// ----- Procedure implementations (with CODE) -----
//
// Name PROCEDURE [ ( param list ) ] [ ,ReturnType ]
//   local data
// CODE
//   statements
procedureDecl
    : ID PROCEDUREKW (LPAREN procedureProtoParamList? RPAREN)?
      (COMMA returnType)? NEWLINE
      declarationSection*          // "local data"
      codeSection                  // CODE ... statements
    ;

// Return type shared by prototypes and procedures
returnType
    : typeSpec
    ;

// ----- Standalone MODULE declarations -----
//
// MODULE(sourcefile)
//   prototypeDecl*
// END | .
moduleDecl2
    : MODULE LPAREN expr RPAREN NEWLINE
      prototypeDecl*
      (END | DOT)
    ;

// ----- Label / picture / type / EQUATE lines -----

// label  EQUATE( [ constant ] )
// picture
// type   (builtin types only)
labelDecl
    : ID EQUATE LPAREN expr? RPAREN NEWLINE
    | PICTURE NEWLINE
    | builtinType NEWLINE
    ;

// ----- ENUM declarations -----
//
// Label ENUM
//   Item1
//   ...
//   ItemN
// END | .
enumDecl
    : ID ENUM_KW NEWLINE
      enumItem+
      (END | DOT)
    ;

enumItem
    : ID NEWLINE
    ;

// ----- ITEMIZE declarations -----
//
// [Label] ITEMIZE( [ seed ] ) [,PRE( expr )]
//   ID EQUATE( [ expr ] ) ...
// END | .
itemizeDecl
    : ID? ITEMIZEKW LPAREN expr? RPAREN
      (COMMA PRE LPAREN expr? RPAREN)?
      NEWLINE
      itemizeEquate+
      (END | DOT)
    ;

itemizeEquate
    : ID EQUATE LPAREN expr? RPAREN NEWLINE
    ;

// ----- RECORD declarations -----
//
// Label RECORD [,PRE( )] [,NAME( )]
//   fields
// END | .
recordDecl
    : ID RECORD (COMMA recordAttr)* NEWLINE
      recordFieldDecl+
      (END | DOT)
    ;

recordAttr
    : PRE LPAREN ID RPAREN
    | ID (LPAREN argList? RPAREN)?
    ;

recordFieldDecl
    : ID dataLikeOrType? (COMMA dataAttr)* NEWLINE
    ;

// ----- GROUP declarations -----
//
// Label GROUP( [ group ] )
//   [,PRE( )] [,DIM( )] [,OVER( )] [,NAME( )] [,EXTERNAL] [,DLL] [,STATIC]
//   [,THREAD] [,BINDABLE] [,TYPE] [,PRIVATE] [,PROTECTED]
//   declarations
// END | .
groupDecl
    : ID GROUP LPAREN fieldRef? RPAREN (COMMA groupAttr)* NEWLINE
      declarationSection*
      (END | DOT)
    ;

groupAttr
    : PRE LPAREN ID RPAREN
    | DIM LPAREN expr (COMMA expr)* RPAREN
    | OVER LPAREN fieldRef RPAREN
    | ID (LPAREN argList? RPAREN)?      // NAME(), EXTERNAL, DLL, STATIC, THREAD, BINDABLE, TYPE, PRIVATE, PROTECTED
    ;

// ----- FILE declarations -----
//
// Label FILE,DRIVER( ) [,CREATE] ...
//   entries (RECORD, KEY, INDEX, MEMO, BLOB)
// END | .
fileDecl
    : ID FILE (COMMA fileAttr)* NEWLINE
      fileEntry*
      (END | DOT)
    ;

fileAttr
    : PRE LPAREN ID RPAREN
    | ID (LPAREN argList? RPAREN)?
    ;

fileEntry
    : recordDecl
    | fileKeyDecl
    | fileIndexDecl
    | fileMemoDecl
    | fileBlobDecl
    ;

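// Each entry requires its keyword: with the keyword optional, all four
// rules would match a bare "ID NEWLINE" and be ambiguous with each other.
fileKeyDecl
    : ID KEYKW LPAREN argList? RPAREN NEWLINE
    ;

fileIndexDecl
    : ID INDEXKW LPAREN argList? RPAREN NEWLINE
    ;

fileMemoDecl
    : ID MEMO LPAREN argList? RPAREN NEWLINE
    ;

fileBlobDecl
    : ID BLOB NEWLINE
    ;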

// ----- INTERFACE declarations -----
//
// label INTERFACE( [ parentinterface ] ) [,TYPE] [,COM]
//   [methods]
// END | .
interfaceDecl
    : ID INTERFACE LPAREN fieldRef? RPAREN
      (COMMA TYPEKW)?
      (COMMA COMKW)?
      NEWLINE
      interfaceMethod*
      (END | DOT)
    ;

interfaceMethod
    : prototypeDecl          // reuse MAP-style prototype syntax for methods
    ;

// ----- CLASS declarations -----
//
// label CLASS( [ parentclass ] )
// [,EXTERNAL] [,IMPLEMENTS] [,DLL( )] [,STATIC] [,THREAD] [,BINDABLE]
// [,MODULE( )] [,LINK( )] [,TYPE] [,DIM(dimension)] [,NETCLASS] [,PARTIAL]
//   [ data members and methods ]
// END | .
classDecl
    : ID CLASSKW LPAREN fieldRef? RPAREN classAttrList? NEWLINE
      classMember*
      (END | DOT)
    ;

classAttrList
    : COMMA classAttr (COMMA classAttr)*
    ;

classAttr
    : EXTERNAL
    | IMPLEMENTS (LPAREN fieldRef RPAREN)?   // IMPLEMENTS(interface)
    | DLL LPAREN expr? RPAREN
    | STATIC
    | THREAD
    | BINDABLE
    | MODULE LPAREN expr? RPAREN
    | LINKKW LPAREN expr? RPAREN
    | TYPEKW
    | DIM LPAREN expr RPAREN
    | NETCLASS
    | PARTIALKW
    ;

classMember
    : dataDecl
    | prototypeDecl
    ;

// ---------------------
// CODE section & statements
// ---------------------

codeSection
    : CODE NEWLINE statement*
    ;

// QUESTION may prefix any core statement (DEBUG-mode marker),
// NEWLINE alone is a blank/empty statement.
statement
    : QUESTION? coreStatement
    | NEWLINE
    ;

coreStatement
    : assignStmt
    | procCallStmt
    | doStmt
    | ifStmt
    | caseStmt
    | executeStmt
    | loopStmt
    | acceptStmt
    | breakStmt
    | cycleStmt
    | exitStmt
    | routineStmt
    | routineDataBlock
    | codeMarkerStmt
    | includeStmt
    | returnStmt
    | openCloseStmt
    ;

// "=", "&=", ":=:" assignments
assignStmt
    : fieldRef (EQUAL | AMP_EQUAL | DEEP_ASSIGN) expr NEWLINE
    ;

procCallStmt
    : fieldRef LPAREN argList? RPAREN NEWLINE
    ;

// DO Label
doStmt
    : DO fieldRef NEWLINE
    ;

// INCLUDE(filename [,section]) [,ONCE]
includeStmt
    : INCLUDEKW LPAREN expr (COMMA expr)? RPAREN (COMMA ONCEKW)? NEWLINE
    ;

// EXECUTE expression ... [BEGIN..END] ... [ELSE ...] END
executeStmt
    : EXECUTE expr NEWLINE
      executeBranch+
      executeElse?
      (END | DOT)
    ;

executeBranch
    : BEGIN NEWLINE statement* (END | DOT)   // block of statements for one index
    | statement                              // single statement for one index
    ;

executeElse
    : ELSE NEWLINE statement*
    ;

// IF logical expression [THEN]
//   statements
// [ ELSIF logical expression [THEN]
//   statements ]*
// [ ELSE
//   statements ]?
// END | .
ifStmt
    : IF expr (THEN)? NEWLINE
      statement*
      elsifClause*
      elseClause?
      (END | DOT)
    ;

elsifClause
    : ELSIF expr (THEN)? NEWLINE
      statement*
    ;

elseClause
    : ELSE NEWLINE
      statement*
    ;

// CASE with OF / OROF labels and ranges
caseStmt
    : CASE expr NEWLINE
      caseBranch*
      (ELSE NEWLINE statement*)?
      (END | DOT)
    ;

caseBranch
    : OF caseLabel (OROF caseLabel)* NEWLINE
      statement*
    ;

caseLabel
    : expr (TO expr)?      // simple value or range: 'A' or 'A' TO 'M'
    ;

// LOOP variants:
//
// [label] LOOP [ loopHead ] NEWLINE
//   statements
// [ loopTail ]
// END | .
//
// loopHead:
//   expr TIMES
//   ID = expr TO expr [ BY expr ]
//   UNTIL expr
//   WHILE expr
//
// loopTail:
//   UNTIL expr
//   WHILE expr
loopStmt
    : (ID)? LOOP loopHead? NEWLINE
      statement*
      loopTail?
      (END | DOT)
    ;

loopHead
    : expr TIMES                        // LOOP 10 TIMES
    | ID EQUAL expr TO expr (BY expr)?  // LOOP i = 1 TO 10 BY 2
    | UNTIL expr                        // LOOP UNTIL condition
    | WHILE expr                        // LOOP WHILE condition
    ;

loopTail
    : UNTIL expr                        // ... UNTIL condition
    | WHILE expr                        // ... WHILE condition
    ;

acceptStmt
    : ACCEPT NEWLINE statement* (END | DOT)
    ;

// BREAK [ label ]
breakStmt
    : BREAK (ID)? NEWLINE
    ;

// CYCLE [ label ]
cycleStmt
    : CYCLE (ID)? NEWLINE
    ;

// EXIT
exitStmt
    : EXIT NEWLINE
    ;

// ROUTINE label line
routineStmt
    : ID ROUTINE NEWLINE
    ;

// ROUTINE-local DATA block inside CODE
routineDataBlock
    : DATAKW NEWLINE
      declarationSection*
    ;

// Inner CODE marker line inside CODE section / ROUTINE
codeMarkerStmt
    : CODE NEWLINE
    ;

returnStmt
    : RETURN expr? NEWLINE
    ;

openCloseStmt
    : (OPEN | CLOSE) LPAREN fieldRef RPAREN NEWLINE
    ;

// ---------------------
// EXPRESSIONS (precedence-based)
// ---------------------
//
// Precedence (low -> high):
//   OR
//   AND
//   =, <>, &=
//   <, <=, >, >=
//   +, -, &
//   *, /, %
//   ^
//   unary NOT, ~, unary -
//   primary (with [] and {} postfixes)

expr
    : orExpr
    ;

orExpr
    : andExpr (OR andExpr)*
    ;

andExpr
    : equalityExpr (AND equalityExpr)*
    ;

equalityExpr
    : relationalExpr ((EQUAL | NEQ | AMP_EQUAL) relationalExpr)*
    ;

relationalExpr
    : additiveExpr ((LT | LTE | GT | GTE) additiveExpr)*
    ;

additiveExpr
    : multiplicativeExpr ((PLUS | MINUS | AMP) multiplicativeExpr)*
    ;

multiplicativeExpr
    : powExpr ((STAR | DIV | PERCENT) powExpr)*
    ;

// Exponentiation: right-associative
powExpr
    : unaryExpr (CARET powExpr)?
    ;

unaryExpr
    : (NOT | TILDE | MINUS) unaryExpr
    | primary
    ;

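// Primary with optional index/slice and property/repeat postfixes.
// A single-expression brace form ({n}) is already covered by
// propertyParamList, so a separate "LBRACE expr RBRACE" alternative
// would be ambiguous and is not listed.
primary
    : primaryBase
      ( LBRACE propertyParamList RBRACE      // Field{PROP:Text,@N3}, '='{10}, repeat count {n}
      | LBRACKET indexOrSlice RBRACKET       // Name[1], S[1 : 5], S[ : 5], S[ : ]
      )*
    ;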

primaryBase
    : literal
    | fieldRef
    | ENUM_KW LPAREN argList? RPAREN
    | ADDRESS LPAREN argList? RPAREN
    | LIKE LPAREN argList? RPAREN
    | LPAREN expr RPAREN
    ;

// Index or slice
// Style: slices are written with spaces around colon: [start : end]
indexOrSlice
    : expr                     // Name[1]
    | expr SLICE_COLON expr?   // S[1 : 5] or S[1 : ]
    | SLICE_COLON expr?        // S[ : 5] or S[ : ]
    ;

propertyParamList
    : expr (COMMA expr)*
    ;

literal
    : INTLIT
    | REALLIT
    | STRING
    | TRUE
    | FALSE
    | PICTURE
    | NULLKW
    ;

// ---------------------
// Field references
// ---------------------

fieldRef
    : ID (DOT ID)*      // Struct.Member, Win$Form:Field, etc.
    ;


   PROGRAM
   MAP
   END
   CODE
   Message('Hello World','Alert',ICON:Asterisk,BUTTON:Ok)
   RETURN
   

Looking at the parser file

line 10 program

  • appears to require the MAP to always come before the data section;
    in practice they can be mixed together

line 45 prototypeDecl

  • there are two forms: one with the procedure name as a label and the keyword PROCEDURE, and one indented with no keyword
  • PROC is supported, but what about RAW, NAME, DLL(n), C, PASCAL and EXTERNAL?

line 54 prototypeParam

  • CONST is missing
  • default values are missing (granted the one that matters is later, but they do parse here)

line 67 typeSpec and line 91 builtinType

  • CSTRING is missing
  • ? and *? are parameter types that appear to be missing
  • notations for dimensioned arguments appear to be missing
  • BFLOAT4 and BFLOAT8 appear to be missing (can’t say that I’ve used those)
  • PDECIMAL appears to be missing
  • did we ever get USTRING :wink:

line 129 declarationSection

  • has an enumDecl (see line 455) – there are no enums in clarion

line 166 dataAttr

  • has USE – I believe these only apply inside of window structures
  • PRIVATE - only applies inside of a class (when we get to a class, then add PROTECTED)
  • BINARYKW - doesn’t appear to be defined (and when it is, it only applies to a MEMO or BLOB)
  • NAME missing
  • unsure what ID is about

line 206 queueFieldDecl

  • ID is optional
  • dataLikeOrType doesn’t seem to handle Reference variables &Yada

line 297 controlType

  • OTHERCONTROL doesn’t appear to be defined
  • Box and Line are mentioned, ARC and ELLIPSE appear to be missing
  • COMBO appears to be missing
  • STRINGKW doesn’t appear to be defined

(( I skipped over reports, I’m running short on time ))

line 402 procedureProtoDecl

  • leads to line 121 procedureProtoParam
    • missing a number of things, including default values, CONST, ?, *?, dimensioned arguments
  • return type doesn’t appear to support *
  • PROCEDUREKW doesn’t appear to be defined (will need FUNCTION too)

line 455 enumDecl

  • I am unaware of ENUM existing in clarion

line 470 itemizeDecl

  • line 472 uses PRE, which probably needs to be PREKW and defined (reminder: you can add extra ‘:’ in a prefix)

line 509 groupDecl

  • ID is optional

line 587 classDecl

  • CLASSKW doesn’t appear to be defined

line 597 classAttr

  • NETCLASS is not part of clarion for windows
  • PARTIALKW is not part of clarion for windows
  • I think there might be a ,COM attribute (there is one on INTERFACE)

line 653 assignment

  • should there be support for *= += /=

line 721 caseBranch

  • OF 5 to 42 is not quite supported, as line 722 only adds the TO in an OROF

line 755 cycleStmt

  • appears to be used as a coreStatement, but it is only meaningful inside of a LOOP or ACCEPT
  • same for BREAK
  • similar logic for exitStmt, which can only appear inside of a ROUTINE
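
The context rules in the bullets above (BREAK and CYCLE only inside a LOOP or ACCEPT, EXIT only inside a ROUTINE) are awkward to express in the grammar itself and could instead be checked in a separate pass. Here is a rough Python sketch of such a check. It works line by line on source text, not on the ANTLR parse tree. The names `check_context` and `OPENERS` are made up for illustration, and it assumes one statement per line, labels in column 1, and `END` or `.` at the start of a line closing a structure.

```python
import re

# Keywords assumed to open a structure that is closed by END or "."
OPENERS = {"LOOP", "ACCEPT", "IF", "CASE", "EXECUTE", "BEGIN"}
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_:]*|\.")


def check_context(source):
    """Return (line number, keyword) pairs for misplaced BREAK, CYCLE and EXIT."""
    problems = []
    stack = []          # enclosing structures, innermost last
    in_routine = False
    for lineno, raw in enumerate(source.splitlines(), start=1):
        # Crude comment stripping: a '!' inside a string literal also cuts the line.
        words = [w.upper() for w in WORD.findall(raw.split("!")[0])]
        if raw[:1].strip():
            words = words[1:]   # text in column 1 is a label, skip it
        if not words:
            continue
        keyword = words[0]
        if keyword in OPENERS:
            stack.append(keyword)
        elif keyword in ("END", "."):
            if stack:
                stack.pop()
        elif keyword == "ROUTINE":
            in_routine = True
            stack.clear()
        elif keyword in ("PROCEDURE", "FUNCTION"):
            in_routine = False
            stack.clear()
        elif keyword in ("BREAK", "CYCLE"):
            if not any(s in ("LOOP", "ACCEPT") for s in stack):
                problems.append((lineno, keyword))
        elif keyword == "EXIT":
            if not in_routine:
                problems.append((lineno, keyword))
    return problems
```

Single-line forms such as `IF x THEN BREAK.` are not handled here; a real check would walk the parse tree produced by the grammar.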

((( more to dig into, but I’ve gotta run )))

Looking at Mark’s extensive list…

Types UNSIGNED and SIGNED (like BOOL) are in Equates as LONG and commonly used.
More recently added for 64 bit are POINTER_T and COUNT_T as EQUATE(LONG)
BSTRING is missing?
And ASTRING also?
USTRING is currently not to be implemented

Control types missing from controlType, compared to the CREATE:xxx equates in Equates.clw (I left out the report bands):
CHECK
COMBO
ELLIPSE
IMAGE
ITEM
MENU
MENUBAR
OPTION
PANEL
PROGRESS
PROMPT
RADIO (only as a child of OPTION)
REGION
SHEET
SLIDER
SPIN
STATE3
TAB (only as a child of SHEET)
TEXT (SINGLELINE is an attribute of TEXT)

ENUM is for Clarion Sharp? As are some others like NETCLASS?

Thank you for the feedback.

Please note that some of the language is generated from the Clarion documentation (some .NET sneaks in), and I can only do so much in a session since there is a limit. I have not made it through all the controls yet. ChatGPT does more analysis as I add more syntax, and it is taking a bit longer to think about it. As of last night, I have to download the grammar after each change, since it is now too big for ChatGPT to show completely. Eventually this will have to be broken into multiple g4 files.

Again, this is a long-term effort that I work on as I get a chance to experiment. Right now I have some nights and weekends to work on it, but sooner or later I will run out of time, so someone else will have to pick up the baton and continue. Please note that Mr. Mark Sarson also worked on this for a while, so you can look at that and see if you can get it to work.

Sooner or later, some folks with more experience with the Clarion language will have to make changes to the grammar. I will upload another version of the language definition this evening; it will still not be complete.

To do a cross-reference you have to collect all the identifiers, along with the line numbers where they are declared and referenced, so that task may well need a language definition.
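
As a rough idea of what a cross-reference collects, here is a minimal Python sketch that maps each identifier to the lines it appears on. It is a regex scan, not a parser: it cannot tell a declaration from a reference, and it lists keywords alongside labels, which is where a real grammar would come in. The function name `cross_reference` is made up for illustration. The lexical rules are simplified assumptions: single-quoted strings with `''` as an escaped quote, `!` starting a comment, and labels made of letters, digits, `_` and `:`.

```python
import re
from collections import defaultdict

# Order matters: strings and comments are matched first so that words
# inside them are not reported as identifiers.
TOKEN = re.compile(r"'(?:[^']|'')*'|![^\n]*|[A-Za-z_][A-Za-z0-9_:]*")


def cross_reference(source):
    """Map each upper-cased identifier to the line numbers it appears on."""
    xref = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TOKEN.finditer(line):
            text = match.group()
            if text[0] in "'!":
                continue            # skip string literals and comments
            name = text.upper()     # Clarion labels are not case sensitive
            if lineno not in xref[name]:
                xref[name].append(lineno)
    return dict(xref)


sample = """\
   PROGRAM
   MAP
   END
   CODE
   Message('Hello World','Alert',ICON:Asterisk,BUTTON:Ok)
   RETURN
"""

for name, lines in sorted(cross_reference(sample).items()):
    print(name, lines)
```

Run against the Hello World sample from earlier in the thread, this lists `MESSAGE`, `ICON:ASTERISK` and `BUTTON:OK` on line 5 and skips the words inside the string literals.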