Note: This is taken from the Chicken Wiki, where a more recent version could be available.

Introduction

Provides access to PCRE data structures beyond what the regex unit exposes.

Documentation

There are two extensions in this egg: regex-extras and regex-chardefs.

The regex-extras extension provides access to PCRE data structures:

pcre_extra
pcre_compile options
pcre_fullinfo
pcre_config

The regex-chardefs extension allows an override of the character definitions.

Regex Extras

Usage

(require-extension regex-extras)

regex-version

<procedure>(regex-version)</procedure>

Extra Info

Extra Info Symbols:

match-limit
match-limit-recursion
callout-data
tables

regexp-extra-info-set!

<procedure>(regexp-extra-info-set! REGEXP EXTRA-INFO-SYMBOL EXTRA-INFO-VALUE ...)</procedure>

Sets the compiled PCRE regular expression extra-info structure fields.

EXTRA-INFO-SYMBOL is one of the Extra Info Symbols listed above.

EXTRA-INFO-VALUE is an object of appropriate type.

regexp-extra-info

<procedure>(regexp-extra-info REGEXP EXTRA-INFO-SYMBOL ...)</procedure>

Returns a list of the compiled PCRE regular expression extra-info structure fields.
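The two procedures above might be used together as in the following minimal sketch. It assumes the egg is installed, that `regexp' from the regex unit returns a compiled object these procedures accept, and that an integer is an appropriate value for match-limit. The pattern and the limit of 10000 are arbitrary illustrations.

```scheme
;; Sketch only: assumes regex-extras is installed alongside the regex unit.
(require-extension regex regex-extras)

;; A pattern prone to heavy backtracking (illustrative choice).
(define rx (regexp "(a+)+b"))

;; Set one extra-info field, given as a symbol followed by its value.
;; 10000 is an arbitrary value, assumed to be of the appropriate type.
(regexp-extra-info-set! rx 'match-limit 10000)

;; Read the field back. The exact shape of the returned list is not
;; documented here, so the result is only printed.
(print (regexp-extra-info rx 'match-limit))
```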

pcre_compile Options

Option Symbols (OPTION-SYMBOL):

caseless: Case-insensitive match
multiline: Equivalent to Perl's /m option
dotall: Equivalent to Perl's /s option
extended: Ignore whitespace
anchored: Anchor pattern match
dollar-endonly: `$' metacharacter in the pattern matches only at the end of the subject string
extra: Currently of very little use
notbol: First character of the string is not the beginning of a line
noteol: End of the string is not the end of a line
ungreedy: Inverts the "greediness" of the quantifiers so that they are not greedy by default
notempty: The empty string is not considered to be a valid match
utf8: Pattern and subject are treated as UTF-8 encoded strings
no-auto-capture: Disables the use of numbered capturing parentheses
no-utf8-check: Skip the UTF-8 validity check
auto-callout: Automatically inserts callout items (not defined here)
partial: Partial match is acceptable
firstline: An unanchored pattern is required to match before or at the first newline
dupnames: Names used to identify capturing subpatterns need not be unique
newline-cr: Newline definition is `\r'
newline-lf: Newline definition is `\n'
newline-crlf: Newline definition is `\r\n'
newline-anycrlf: Newline definition is any of `\r', `\n', or `\r\n'
newline-any: Newline definition is any Unicode newline sequence
bsr-anycrlf: `\R' escape sequence matches only CR, LF, or CRLF
bsr-unicode: `\R' escape sequence matches any Unicode newline sequence
dfa-shortest: Ignored
dfa-restart: Ignored

regexp-options-set!

<procedure>(regexp-options-set! REGEXP OPTION-SYMBOL ...)</procedure>

Sets the compiled PCRE regular expression options.

regexp-options

<procedure>(regexp-options REGEXP | OPTION-BITS)</procedure>

Returns a list of the option symbols for the compiled PCRE regular expression REGEXP, or for the options encoded in OPTION-BITS.

OPTION-BITS is an integer.
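A minimal sketch of setting and then listing options, assuming the egg is installed and that `regexp' from the regex unit produces the REGEXP argument expected here. Whether changing options after compilation affects later matches is not stated in this document, so the sketch only inspects the result.

```scheme
;; Sketch only: assumes regex-extras is installed alongside the regex unit.
(require-extension regex regex-extras)

(define rx (regexp "^hello"))

;; Option symbols come from the Option Symbols list above.
(regexp-options-set! rx 'caseless 'multiline)

;; List the options now recorded for the compiled expression.
(print (regexp-options rx))
```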

Fullinfo

Field Symbols:

options: bitfield
size: integer
capturecount: integer
backrefmax: integer
firstbyte: integer
firstchar: integer
firsttable: chardef-table
lastliteral: integer
nameentrysize: integer
namecount: integer
nametable: pointer
studysize: integer
default-tables: pointer
okpartial: boolean
jchanged: boolean
hascrorlf: boolean

regexp-info

<procedure>(regexp-info REGEXP [FULLINFO-SYMBOL] ...)</procedure>

Returns a list of compiled PCRE regular expression fullinfo fields.

regexp-info-nametable

<procedure>(regexp-info-nametable REGEXP)</procedure>

Returns a list of the compiled PCRE regular expression nametable fields, (NAME INDEX CC).
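As a hedged illustration of the two fullinfo procedures, the sketch below compiles a pattern with two named groups and prints a few fields plus the name table. It assumes the egg is installed, that `regexp' from the regex unit returns a suitable REGEXP, and that the linked PCRE accepts the `(?<name>...)' syntax. The group names are made up for the example.

```scheme
;; Sketch only: assumes regex-extras is installed alongside the regex unit.
(require-extension regex regex-extras)

;; Two named capturing groups; "year" and "month" are illustrative names.
(define rx (regexp "(?<year>\\d{4})-(?<month>\\d{2})"))

;; Ask for selected fullinfo fields by symbol.
(print (regexp-info rx 'capturecount 'namecount))

;; Each name table entry is documented above as (NAME INDEX CC).
(print (regexp-info-nametable rx))
```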

Config Info

Config Info Field Symbols:

utf8: boolean
newline: integer
link-size: integer
posix-malloc-threshold: integer
match-limit: integer
stackrecurse: boolean
unicode-properties: boolean
match-limit-recursion: integer
bsr: boolean

regex-build-config-info

<procedure>(regex-build-config-info [CONFIG-INFO-SYMBOL] ...)</procedure>

Returns a list of the PCRE build configuration fields.
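A short sketch of querying the build configuration, assuming the egg is installed. Because the symbol arguments are optional in the signature, the second call presumably returns every field, but that behaviour is an assumption rather than something stated here.

```scheme
;; Sketch only: assumes regex-extras is installed.
(require-extension regex-extras)

;; Query selected build configuration fields by symbol.
(print (regex-build-config-info 'utf8 'newline 'match-limit))

;; With no symbols given; assumed (not confirmed) to report all fields.
(print (regex-build-config-info))
```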

Regex Chardefs

Usage

(require-extension regex-chardefs)

regex-chardef-set!

<procedure>(regex-chardef-set! CHARDEFS-TABLE CHAR/INDEX CHARDEF-VECTOR)</procedure>

Sets the character definition for CHAR, or at INDEX, in the CHARDEFS-TABLE to the new character definition CHARDEF-VECTOR.

The CHARDEF-VECTOR is a character definition vector where:

element 0: the lower-case character or #f
element 1: the flipped-case character or #f
element 2: list of character class names or #f
element 3: list of character type names or #f

Character Class Symbols:

space
xdigit
digit
upper
lower
word
graph
print
punct
cntrl

Character Type Symbols:

space
letter
digit
xdigit
word
meta

regex-chardefs-update!

<procedure>(regex-chardefs-update! CHARDEFS-TABLE CHARDEFS-VECTOR)</procedure>

Updates the CHARDEFS-TABLE with the new character definitions, CHARDEFS-VECTOR.

The CHARDEFS-VECTOR is a character definitions vector. Each of the 256 elements is either #f, for no definition change, or a character definition vector, as above.
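The vector layouts described above can be sketched in plain Scheme. This example only builds the data. It assumes that class and type names are given as symbols from the lists above, and the choice of character code 233 (e-acute in Latin-1) is purely illustrative.

```scheme
;; A character definition vector for character code 233:
;;   element 0: lower-case character
;;   element 1: flipped-case character (code 201, the upper-case form)
;;   element 2: list of character class names (assumed to be symbols)
;;   element 3: list of character type names (assumed to be symbols)
(define e-acute-def
  (vector (integer->char 233)
          (integer->char 201)
          '(lower word graph print)
          '(letter word)))

;; A character definitions vector: 256 elements, where #f means
;; "no definition change" for that index.
(define new-defs (make-vector 256 #f))
(vector-set! new-defs 233 e-acute-def)
```

A vector such as new-defs would then be passed as the CHARDEFS-VECTOR argument of regex-chardefs-update!, and e-acute-def as the CHARDEF-VECTOR argument of regex-chardef-set!. How a CHARDEFS-TABLE is obtained is not covered by this sketch.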

regex-chardefs

<procedure>(regex-chardefs CHARDEFS-TABLE)</procedure>

Returns a character definitions vector, as above.

Examples

Notes

Bugs and Limitations

Author

Kon Lovett

License

Copyright © 2008 Kon Lovett. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the Software), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Requirements

Version history

1.0