Note: This is taken from the Chicken Wiki, where a more recent version could be available.

Introduction

An implementation of RFC 3454, Preparation of Internationalized Strings ("stringprep").

Examples

XMPP Nodeprep profile

(define nodeprep
  (make-stringprepper
    (list appendix-b1 appendix-b2) ; Mappings
    #t ; Normalize
    (char-set-union
      appendix-c ; Forbid everything in Appendix C
      (char-set #\" #\& #\' #\/ #\: #\< #\> #\@)) ; And this stuff
    #t)) ; Bidirectional check

Authors

Adam C. Emerson <azure@umich.edu>

License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Requirements

make-stringprepper

(make-stringprepper mappings normalize? prohibited bidi?)

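make-stringprepper returns a procedure from strings to strings. As in the Nodeprep example, mappings is a list of mapping tables, normalize? says whether to normalize, prohibited is a character set of forbidden characters, and bidi? says whether to run the bidirectional check.

The returned procedure will throw (exn invalid) if the string contains prohibited characters or fails the bidirectionality check.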
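
The following is a minimal sketch of calling a stringprepper and catching the (exn invalid) condition. It assumes the egg is loaded under the name stringprep with CHICKEN 3/4-style require-extension, which this page does not state. simple-prep and prep-or-false are hypothetical names used only for illustration.

```scheme
;; Assumption: the egg is installed and loadable as `stringprep`.
(require-extension stringprep)

;; A small profile: map with B.1 and B.2, normalize, forbid all of
;; Appendix C, and run the bidirectional check.
(define simple-prep
  (make-stringprepper
    (list appendix-b1 appendix-b2)
    #t
    appendix-c
    #t))

;; Hypothetical helper: return the prepared string, or #f when the
;; stringprepper rejects the input with an (exn invalid) condition.
(define (prep-or-false s)
  (condition-case (simple-prep s)
    ((exn invalid) #f)))

(prep-or-false "Example")
(prep-or-false "two words") ; contains U+0020, which is listed in Table C.1.1
```

Returning #f is only one way to handle a rejected string. A caller that needs to report the failure could bind the condition object in the condition-case clause instead.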

appendix-b1

Mapping given in Table B.1 of Appendix B, "Commonly mapped to nothing."

appendix-b2

Mapping given in Table B.2 of Appendix B, "Mapping for case-folding used with NFKC."

appendix-b3

Mapping given in Table B.3 of Appendix B, "Mapping for case-folding used with no normalization."

appendix-c1.1

Character set given in Table C.1.1 of Appendix C, "ASCII space characters."

appendix-c1.2

Character set given in Table C.1.2 of Appendix C, "Non-ASCII space characters."

appendix-c1

Union of appendix-c1.1 and appendix-c1.2.

appendix-c2.1

Character set given in Table C.2.1 of Appendix C, "ASCII control characters."

appendix-c2.2

Character set given in Table C.2.2 of Appendix C, "Non-ASCII control characters."

appendix-c2

Union of appendix-c2.1 and appendix-c2.2.

appendix-c3

Character set given in Table C.3 of Appendix C, "Private use"

appendix-c4

Character set given in Table C.4 of Appendix C, "Non-character code points"

appendix-c5

Character set given in Table C.5 of Appendix C, "Surrogate codes"

appendix-c6

Character set given in Table C.6 of Appendix C, "Inappropriate for plain text"

appendix-c7

Character set given in Table C.7 of Appendix C, "Inappropriate for canonical representation"

appendix-c8

Character set given in Table C.8 of Appendix C, "Change display properties or are deprecated"

appendix-c9

Character set given in Table C.9 of Appendix C, "Tagging characters"

appendix-c

Union of appendix-c1, appendix-c2, appendix-c3, appendix-c4, appendix-c5, appendix-c6, appendix-c7, appendix-c8, and appendix-c9
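
Because the tables are exported individually, a profile does not have to forbid all of Appendix C. The sketch below builds a narrower prohibited set with SRFI 14's char-set-union and pairs Table B.3 with normalization turned off. It assumes the egg loads as stringprep and that the appendix-c values are ordinary SRFI 14 character sets, as the Nodeprep example suggests. narrow-prohibited and narrow-prep are hypothetical names.

```scheme
;; Assumption: the egg is loadable as `stringprep`; srfi-14 supplies
;; char-set and char-set-union.
(require-extension stringprep srfi-14)

;; Forbid only space characters (C.1), control characters (C.2.1 and
;; C.2.2), and one extra character chosen for illustration.
(define narrow-prohibited
  (char-set-union
    appendix-c1
    appendix-c2.1
    appendix-c2.2
    (char-set #\/)))

(define narrow-prep
  (make-stringprepper
    (list appendix-b1 appendix-b3) ; B.3: case folding without normalization
    #f                             ; do not normalize
    narrow-prohibited
    #f))                           ; skip the bidirectional check
```

Whether dropping the bidirectional check or the remaining Appendix C tables is acceptable depends on the profile being implemented. The choices here only illustrate how the pieces combine.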

Version History