lex

Description: Lexical analyser construction
Latest: lex-1.2.0.20240216.82808.tar (.sig), 2024-Mar-31, 70.0 KiB
Maintainer: Stefan Monnier <monnier@iro.umontreal.ca>
Atom feed: lex.xml
Website: https://elpa.gnu.org/packages/lex.html
Browse repository: CGit or Gitweb

To install this package from Emacs, use package-install or list-packages.

Full description

The regexp format is the same as the one used by `rx' and `sregex'.
Additions:
- (ere RE) specifies a regexp using the ERE (POSIX extended regular
  expression) syntax.
- (inter REs...) (aka `&') makes a regexp that only matches
  if all its branches match.  E.g. (inter (ere ".*a.*") (ere ".*b.*"))
  matches any string that contains both an "a" and a "b", in any order.
- (case-fold REs...) and (case-sensitive REs...) make a regexp that
  is case-insensitive or case-sensitive, respectively, regardless of
  `case-fold-search'.
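
To illustrate what an intersection means (not how this package
implements it), here is a minimal sketch in Python.  It assumes Python's
`re' module as a stand-in for ERE, since the two syntaxes agree on these
simple patterns, and the helper name `matches_inter' is hypothetical.
A text matches the intersection only if every branch matches that same
text.

```python
import re


def matches_inter(branches, text):
    """Return True if every branch regexp matches the whole of TEXT."""
    return all(re.fullmatch(branch, text) is not None for branch in branches)


# Same branches as in the (inter (ere ".*a.*") (ere ".*b.*")) example.
both = [".*a.*", ".*b.*"]
print(matches_inter(both, "xbya"))  # True: contains "b" and "a"
print(matches_inter(both, "aaa"))   # False: no "b"
```

Running each branch separately like this only models the semantics; a
DFA-based lexer would instead track all branches in a single pass.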

Input format of lexers:

ALIST of the form ((RE . VAL) ...)

Format of compiled DFA lexers:

nil                     ; The trivial lexer that fails
(CHAR . LEXER)
(table . CHAR-TABLE)
(stop VAL . LEXER)      ; Match the empty string at point or LEXER.
(check (PREDICATE . ARG) SUCCESS-LEXER . FAILURE-LEXER)
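
The node shapes above can be read as a small state machine.  The
following is a sketch in Python, not the package's code, and it rests on
several assumptions: Lisp nodes are modelled as tagged tuples (with an
explicit "char" tag, since Python has no separate character type), a
CHAR-TABLE is modelled as a dict, a predicate is called as
pred(arg, text, pos), and the walker keeps the longest match.  The name
`run_lexer' is hypothetical.

```python
def run_lexer(lexer, text, pos=0):
    """Walk LEXER over TEXT from POS.

    Return (VAL, END) for the longest match found, or None on failure.
    """
    best = None
    while lexer is not None:          # None models nil, the failing lexer
        tag = lexer[0]
        if tag == "stop":             # (stop VAL . LEXER)
            _, val, lexer = lexer
            best = (val, pos)         # a match ends here; try to extend it
        elif tag == "check":          # (check (PREDICATE . ARG) SUCCESS . FAILURE)
            _, (pred, arg), success, failure = lexer
            lexer = success if pred(arg, text, pos) else failure
        elif pos >= len(text):        # the remaining node kinds need a character
            break
        elif tag == "char":           # (CHAR . LEXER)
            _, char, following = lexer
            lexer = following if text[pos] == char else None
            pos += 1
        elif tag == "table":          # (table . CHAR-TABLE)
            lexer = lexer[1].get(text[pos])
            pos += 1
        else:
            raise ValueError(f"unknown lexer node: {tag!r}")
    return best


# A hand-built lexer: the word "if" yields "keyword", digits yield "number".
digits = ("table", {})
number = ("stop", "number", digits)
for d in "0123456789":
    digits[1][d] = number             # cycle back to accept more digits
keyword = ("char", "f", ("stop", "keyword", None))
lexer = ("table", {"i": keyword, **{d: number for d in "0123456789"}})

print(run_lexer(lexer, "if x"))   # ('keyword', 2)
print(run_lexer(lexer, "42+1"))   # ('number', 2)
print(run_lexer(lexer, "x"))      # None
```

Because a `stop' node records a result and then continues with its
LEXER, the walker naturally prefers longer matches over shorter ones.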

Intermediate NFA nodes may additionally look like:
(or LEXERs...)
(orelse LEXERs...)
(and LEXERs...)
(join CONT . EXIT)
Note: we call those things "NFA"s but they're not really NFAs.

Bugs:

- `inter' doesn't work right.  Matching `join' to the corresponding `and'
  is done incorrectly in some cases.
- since `negate' uses intersections, it doesn't work right either.
- "(\<)*" leads to a DFA that gets stuck in a cycle.

Todo:

- dfa "no-fail" simplifier
- dfa minimization
- dfa compaction (different representation)
- submatches
- backrefs?
- search rather than just match
- extensions:
  - repeated submatches
  - negation
  - lookbehind and lookahead
  - match(&search?) backward
  - agrep

Old versions

lex-1.1.0.20221221.80437.tar.lz, 2022-Dec-21, 15.0 KiB
lex-1.1.0.20221212.224904.tar.lz, 2022-Dec-13, 15.0 KiB
lex-1.1.0.20201201.211755.tar.lz, 2020-Dec-14, 14.9 KiB
lex-1.1.0.20201201.161755.tar.lz, 2021-Oct-09, 14.9 KiB

News