Note: This is taken from the Chicken Wiki, where a more recent version could be available.
RFC 3454 Internationalized string preparation
(define nodeprep
  (make-stringprepper
   (list appendix-b1 appendix-b2)                  ; Mappings
   #t                                              ; Normalize
   (char-set-union
    appendix-c                                     ; Forbid everything in Appendix C
    (char-set #\" #\& #\' #\/ #\: #\< #\> #\@))    ; And this stuff
   #t))                                            ; Bidirectional check
Adam C. Emerson <azure@umich.edu>
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
(make-stringprepper mappings normalize? prohibited bidi?)
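make-stringprepper returns a procedure from strings to strings. The mappings argument is a list of mapping tables (such as appendix-b1), normalize? selects whether the string is normalized, prohibited is a character set of forbidden characters, and bidi? selects whether the bidirectionality check is performed.
The returned procedure signals an (exn invalid) condition if the string contains prohibited characters or fails the bidirectionality check.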
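As an illustration of the behaviour described above, the following sketch wraps a stringprepper so that the (exn invalid) condition becomes a #f return value. It assumes CHICKEN 4 with the egg installed under the name stringprep (the egg name and the loading form are assumptions), and nodeprep-or-false is a hypothetical helper, not part of the egg.

```scheme
;; Assumption: CHICKEN 4, with this egg installed as "stringprep".
(require-extension srfi-14 stringprep)

;; Same profile as the nodeprep example on this page.
(define nodeprep
  (make-stringprepper
   (list appendix-b1 appendix-b2)                  ; Mappings
   #t                                              ; Normalize
   (char-set-union
    appendix-c                                     ; Everything in Appendix C
    (char-set #\" #\& #\' #\/ #\: #\< #\> #\@))    ; Plus these characters
   #t))                                            ; Bidirectional check

;; Hypothetical helper: return the prepared string, or #f when the
;; stringprepper signals a condition of kinds exn and invalid.
(define (nodeprep-or-false s)
  (condition-case (nodeprep s)
    ((exn invalid) #f)))

(nodeprep-or-false "Alice")      ; input without prohibited characters
(nodeprep-or-false "alice@home") ; input containing the prohibited #\@
```

Catching only (exn invalid) lets unrelated errors, such as passing a non-string argument, propagate as usual.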
appendix-b1
Mapping given in Table B.1 of Appendix B, "Commonly mapped to nothing."
appendix-b2
Mapping given in Table B.2 of Appendix B, "Mapping for case-folding used with NFKC."
appendix-b3
Mapping given in Table B.3 of Appendix B, "Mapping for case-folding used with no normalization."
appendix-c1.1
Character set given in Table C.1.1 of Appendix C, "ASCII space characters."
appendix-c1.2
Character set given in Table C.1.2 of Appendix C, "Non-ASCII space characters."
appendix-c1
Union of appendix-c1.1 and appendix-c1.2.
appendix-c2.1
Character set given in Table C.2.1 of Appendix C, "ASCII control characters."
appendix-c2.2
Character set given in Table C.2.2 of Appendix C, "Non-ASCII control characters."
appendix-c2
Union of appendix-c2.1 and appendix-c2.2.
appendix-c3
Character set given in Table C.3 of Appendix C, "Private use."
appendix-c4
Character set given in Table C.4 of Appendix C, "Non-character code points."
appendix-c5
Character set given in Table C.5 of Appendix C, "Surrogate codes."
appendix-c6
Character set given in Table C.6 of Appendix C, "Inappropriate for plain text."
appendix-c7
Character set given in Table C.7 of Appendix C, "Inappropriate for canonical representation."
appendix-c8
Character set given in Table C.8 of Appendix C, "Change display properties or are deprecated."
appendix-c9
Character set given in Table C.9 of Appendix C, "Tagging characters."
appendix-c
Union of appendix-c1, appendix-c2, appendix-c3, appendix-c4, appendix-c5, appendix-c6, appendix-c7, appendix-c8, and appendix-c9.
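Because the tables are exported individually, a profile can prohibit a subset of Appendix C instead of all of it. The sketch below defines a hypothetical profile, loose-prep, that forbids only control and private-use characters and skips the bidirectionality check. It assumes CHICKEN 4 with the egg installed as stringprep, and that passing #f for bidi? disables that check; neither the profile nor the helper corresponds to a published stringprep profile.

```scheme
;; Assumption: CHICKEN 4, with this egg installed as "stringprep".
(require-extension srfi-14 stringprep)

;; Hypothetical profile: map with B.1 and B.2, normalize, and prohibit
;; only control characters (C.2) and private-use characters (C.3).
(define loose-prep
  (make-stringprepper
   (list appendix-b1 appendix-b2)             ; Mappings
   #t                                         ; Normalize
   (char-set-union appendix-c2 appendix-c3)   ; Prohibited characters
   #f))                                       ; Assumed: no bidirectional check

;; Hypothetical helper: #f instead of an (exn invalid) condition.
(define (loose-prep-or-false s)
  (condition-case (loose-prep s)
    ((exn invalid) #f)))

;; U+0007 (BEL) is listed in Table C.2.1, so this input exercises the
;; prohibited-character path.
(loose-prep-or-false (string #\a (integer->char 7)))
```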