dte-syntax

A dte syntax file consists of multiple states. A state consists of optional conditionals and one default action. The best way to understand the syntax is to read through some of the built-in syntax files, which can be printed with dte -b, for example:

dte -b syntax/dte

The basic syntax used is the same as in dterc files, but the available commands are different.

Commands

Main commands

syntax name

Begin a new syntax. One syntax file can contain multiple syntax definitions, but each file should define only one real (non-sub) syntax.

state name [emit-name]

Add a new state. Conditionals (if any) and one default action must follow. The first state in each syntax is the start state.
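
Example (a minimal sketch; the state names start and comment and the emit-name code are hypothetical):

state start code
    char "#" comment
    eat start

state comment
    char "\n" start
    eat comment

Here start is the start state because it comes first. Each state lists its conditionals first and ends with a default action (eat). The comment state has no emit-name, so its own name is used.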

default color name...

Set color as the default (fallback) color for each emitted name listed.

Example:

default numeric oct dec hex

If there is no color defined for oct, dec or hex then color numeric is used instead.

list [-i] name string...

Define a list of strings, for use with the inlist command.

Example:

list keyword if else for while do continue switch case

-i: Make list case-insensitive

Conditionals

Any number of conditionals can appear between a state command and its final default action.

During syntax highlighting, when a state is entered, its conditions are checked in the same order as authored. If a condition is met, the matching text is colored in accordance with the emit-name argument and processing transitions to the destination state.

If the emit-name argument of a conditional is left unspecified, the emit-name (or name) of the destination state is used instead. This can often be used to reduce verbosity.

The special destination state this can be used to jump to the current state.
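
Example (a fragment, assuming a state named start is defined elsewhere; all names are hypothetical):

state string
    # Emit-name given explicitly, so the closing quote is not colored as start
    char "\"" start string
    # Emit-name omitted: the emit-name of escape (special) is used
    char "\\" escape
    # Stay in the current state
    eat this

state escape special
    eat string special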

bufis [-i] string destination [emit-name]

Test if buffered bytes are the same as string. If they are, emit emit-name and jump to destination state.

-i: Case-insensitive
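
Example (a sketch; the state names and the emit-name todo are hypothetical):

state start code
    char -b a-zA-Z word
    eat this

state word code
    char -b a-zA-Z this
    # Reached only once the whole word has been buffered
    bufis -i todo start todo
    noeat start

Because of -i, this would match todo, TODO or Todo.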

char [-bn] characters destination [emit-name]

Test if the current byte appears in characters. If so, emit emit-name and jump to destination state.

Character ranges can be specified by using - as a delimiter. For example, a-f is the same as abcdef and a-d.q-t- is the same as abcd.qrst-.

-b: Add byte to buffer (if matched)
-n: Invert character bitmap (match any byte not in characters)
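
Example (a fragment; ident, number and other are hypothetical states assumed to be defined elsewhere):

state start code
    # Letter or underscore: match and add the byte to the buffer
    char -b a-zA-Z_ ident
    # Any decimal digit
    char 0-9 number
    # Any byte other than space, tab or newline
    char -n " \t\n" other
    eat this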

heredocend destination

Compare the following characters to the heredoc end delimiter (as established by heredocbegin). If they match, go to the destination state.

inlist list destination [emit-name]

Test if the buffered bytes are found in list. If found, emit emit-name and jump to destination state.
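
Example (a sketch with hypothetical state names; the emit-name is given explicitly so the example does not rely on any default):

list keyword if else for while

state start code
    char -b a-z ident
    eat this

state ident
    # Buffer the whole word first
    char -b a-z0-9_ this
    # Then test the buffered word against the list
    inlist keyword start keyword
    noeat start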

str [-i] string destination [emit-name]

Check if the next bytes are the same as string. If so, emit emit-name and jump to the destination state.

-i: Case-insensitive

NOTE: This conditional can be slow, especially if string is longer than two bytes.

Default actions

The last command of every state must be a default action. It represents an unconditional jump to a destination state.

As with conditionals, the special destination state this can be used to re-enter the current state.

eat destination [emit-name]

Consume the current byte, emit the emit-name color and continue to the destination state.

heredocbegin subsyntax return-state

Store the buffered bytes as the heredoc end delimiter and go to subsyntax. The sub-syntax is like any other sub-syntax, but it must contain a heredocend conditional. As with other sub-syntax calls, END in the sub-syntax is replaced by return-state.
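
Example (a rough sketch of how heredocbegin and heredocend might be combined; all syntax, state and emit names are hypothetical and this is not taken from a built-in syntax file):

# Sub-syntax for the heredoc body
syntax .sketch-heredoc

state line-start text
    # Leave the sub-syntax if the stored delimiter follows
    heredocend END
    noeat text

state text
    char "\n" line-start
    eat this

# Main syntax
syntax sketch

state start code
    str "<<" heredoc-op special
    eat this

state heredoc-op special
    char -b A-Za-z_ heredoc-delim
    noeat start

state heredoc-delim special
    # Buffer the delimiter word, for example EOF
    char -b A-Za-z0-9_ this
    # Store the buffer as the end delimiter and enter the sub-syntax
    heredocbegin .sketch-heredoc start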

noeat [-b] destination

Continue to the destination state without emitting a color or consuming a byte.

-b: Don't stop buffering
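
Example (a sketch, assuming the buffer filled by char -b is kept across a noeat -b jump so that a later state can still test it; all names are hypothetical):

list keyword if else while
list constant NULL EOF

state start code
    char -b a-z lower
    char -b A-Z_ ident
    eat this

state lower ident
    char -b a-z this
    char -b A-Z0-9_ ident
    inlist keyword start keyword
    # Not a keyword: hand the buffered word over to ident
    noeat -b ident

state ident
    char -b a-zA-Z0-9_ this
    inlist constant start constant
    # Plain noeat, buffering stops here
    noeat start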

Other commands

recolor color [count]

If count is given, recolor count previous bytes. Otherwise, recolor buffered bytes.

This command should be used sparingly, since it recolors text that was already consumed and colored by the other commands above. In most cases it makes sense to assign the correct color on the first pass, unless doing so is not possible or is considerably more verbose.
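
Example (a fragment, assuming that a recolor placed at the top of a state is applied when the state is entered; comment is a hypothetical state defined elsewhere):

state start code
    char / slash
    eat this

state slash code
    char "*" comment-start comment
    noeat start

state comment-start comment
    # The / was already colored as code; recolor the 2 previous bytes
    recolor comment 2
    noeat comment

This avoids a str "/*" test in the start state, at the cost of coloring the / twice.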

Sub-syntaxes

Sub-syntaxes are useful when the same states are needed in many contexts.

Sub-syntax names must be prefixed with a dot (.). It's recommended to also use the main syntax name in the prefix, for example .c-comment if c is the main syntax.

A sub-syntax is a syntax in which some destination state name is END. END is a special state name that is replaced by the return state specified by the calling syntax.

Example:

# Sub-syntax
syntax .c-comment

state comment
    char "*" star
    eat comment

state star comment
    # END is a special state name
    char / END comment
    noeat comment

# Main syntax
syntax c

state c code
    char " \t\n" c
    char -b a-zA-Z_ ident
    char "\"" string
    char "'" char
    # Call sub-syntax
    str "/*" .c-comment:c
    eat c

# Other states removed

In this example the destination state .c-comment:c is a special syntax for calling a sub-syntax. .c-comment is the name of the sub-syntax and c is the return state defined in the main syntax. The whole sub-syntax tree is copied into the main syntax and all destination states in the sub-syntax whose name is END are replaced with c.