A programmable filter
Mistie is a programmable filter. Its primary aim is to let the user define a document's markup using Scheme.
By itself, Mistie does not require any style of markup or format of either its input or its output. It simply copies its standard input to standard output as is. E.g.,
csi -R mistie -e '(mistie-main)' < input.doc > output.doc
produces an output.doc that is indistinguishable from input.doc.
mistie-main can be given a
file's name as argument, in which case it reads
that file instead of standard input. Thus, the above
command is equivalent to
csi -R mistie -e '(mistie-main "input.doc")' > output.doc
To make Mistie do something more interesting than copying input verbatim to output, the user must supply a format file. A format file is a Scheme file that describes the markup of the input document in terms of the desired output format. Format files are normal Scheme files and can be loaded with mistie-load. E.g.,
csi -R mistie myformat.mistie -e '(mistie-main)' < input.doc
produces a formatted version of input.doc, the formatting being dictated by the format file myformat.mistie. The formatted version may go either to standard output or to some file, depending on myformat.mistie. We will use the .mistie extension for Scheme files used as format files, but this is just a convention.
In general, a format file will use the Mistie infrastructure to define a particular markup, deciding both what the input document should look like and what kind of output to emit. Format authors are not limited to a specialized sublanguage -- they can use full Scheme, including all the nonstandard features of the particular Scheme dialect they have at their disposal.
Writing a format file requires some Scheme programming skill. If you're already a Scheme programmer, you are all set. If not, you can rely on format files written by people whose taste you trust. If it helps, Mistie is somewhat like TeX in its mode of operation (though not in its domain), with the ``macro'' language being Scheme. The analogy is not perfect though: There are no predefined primitives (everything must be supplied via a format file), and the output style is CFD (completely format dependent) rather than some DVI (device independent). (Hope that wasn't too mistie-rious.)
The distribution includes several sample format files. Format files may be combined in the call to mistie.scm, e.g.,
csi -R mistie plain.mistie footnote.mistie -e '(mistie-main "file.doc")' > file.html
csi -R mistie plain.mistie multipage.mistie -e '(mistie-main "file.doc")'
Alternatively, a new combination format file can be written that loads other format files. E.g., the following format file basic.mistie combines within itself the effects of plain.mistie, scmhilit.mistie, and multipage.mistie:
; File: basic.mistie
(mistie-load "plain.mistie") ;or use `load' with full pathnames
(mistie-load "scmhilit.mistie")
(mistie-load "multipage.mistie")
It is invoked in the usual manner:
csi -R mistie basic.mistie -e '(mistie-main "file.doc")'
Note that the format file multipage.mistie creates a set of .html files whose names are based on the name of the input document. Therefore, when using this format file, whether explicitly or implicitly, redirection of standard input or standard output is inappropriate.
The name Mistie stands for Markup In Scheme That Is Extensible. Possible pronunciations are miss-tea and miss-tie.
(mistie-def-char #\< (lambda () (display "&lt;")))
(mistie-def-char #\> (lambda () (display "&gt;")))
(mistie-def-char #\& (lambda () (display "&amp;")))
(mistie-def-char #\" (lambda () (display "&quot;")))
mistie-def-char takes two arguments: The first is the character that is defined, and the second is the procedure associated with it. Here, the procedure writes the HTML encoded version of the character.
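To make the idea of per-character procedures concrete, here is a minimal sketch of a character-dispatch filter in plain Scheme. It is only an illustration, not Mistie's actual implementation: the names toy-char-table, toy-def-char, toy-filter, and toy-filter-string are hypothetical, and the procedures take explicit ports so the example can run on a string without Mistie installed.

```scheme
;; Toy sketch of character dispatch; NOT Mistie's actual implementation.
;; All toy-* names are hypothetical.
(define toy-char-table '())   ; association list: char -> procedure of one port

(define (toy-def-char c proc)
  ;; Later definitions are consed on the front, so they take precedence.
  (set! toy-char-table (cons (cons c proc) toy-char-table)))

(define (toy-filter in out)
  ;; Copy IN to OUT, calling the associated procedure for defined characters.
  (let loop ()
    (let ((c (read-char in)))
      (if (not (eof-object? c))
          (let ((entry (assv c toy-char-table)))
            (if entry
                ((cdr entry) out)
                (write-char c out))
            (loop))))))

(toy-def-char #\< (lambda (out) (display "&lt;" out)))
(toy-def-char #\& (lambda (out) (display "&amp;" out)))

(define (toy-filter-string s)
  ;; Convenience wrapper: run the filter over a string, return a string.
  (let ((in (open-input-string s))
        (out (open-output-string)))
    (toy-filter in out)
    (get-output-string out)))

(display (toy-filter-string "a < b & c"))   ; prints: a &lt; b &amp; c
(newline)
```

Characters with no entry in the table are copied unchanged, which mirrors the verbatim copying Mistie performs when no format file is supplied.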
Suppose we want a contiguous sequence of blank lines to come out as the paragraph separator, <p>. We could mistie-def-char the newline character as follows:
(mistie-def-char #\newline
  (lambda ()
    (newline)
    (let* ((s (h-read-whitespace))
           (n (h-number-of-newlines s)))
      (if (> n 0)
          (begin (display "<p>") (newline) (newline))
          (display s)))))
This will cause newline to read up all the following whitespace, and then check to see how many further newlines it picked up. If there was at least one, it outputs the paragraph separator, viz., <p> followed by two newlines (added for human readability). Otherwise, it merely prints the picked-up whitespace as is. The helper procedures h-read-whitespace and h-number-of-newlines are ordinary Scheme procedures:
(define h-read-whitespace
  (lambda ()
    (let loop ((r '()))
      (let ((c (peek-char)))
        (if (or (eof-object? c) (not (char-whitespace? c)))
            (list->string (reverse r))
            (loop (cons (read-char) r)))))))

(define h-number-of-newlines
  (lambda (ws)
    (let ((n (string-length ws)))
      (let loop ((i 0) (k 0))
        (if (>= i n) k
            (loop (+ i 1)
                  (if (char=? (string-ref ws i) #\newline) (+ k 1) k)))))))
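As a quick check of the counting helper, the following sketch applies h-number-of-newlines to two sample whitespace strings. The helper is copied from the definition above so that the block runs on its own; the names same-paragraph and new-paragraph are made up for this illustration.

```scheme
;; h-number-of-newlines is copied from the text so this block runs alone.
(define h-number-of-newlines
  (lambda (ws)
    (let ((n (string-length ws)))
      (let loop ((i 0) (k 0))
        (if (>= i n) k
            (loop (+ i 1)
                  (if (char=? (string-ref ws i) #\newline) (+ k 1) k)))))))

;; Whitespace that might follow the newline ending a line of text.
;; Both sample names are hypothetical.
(define same-paragraph (string #\space #\space))             ; indentation only
(define new-paragraph (string #\newline #\space #\newline))  ; two further newlines

(display (h-number-of-newlines same-paragraph)) (newline)   ; prints 0
(display (h-number-of-newlines new-paragraph)) (newline)    ; prints 2
```

With the newline definition given earlier, the first case would print the whitespace as is, and the second would emit the paragraph separator.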
(mistie-def-ctl-seq 'br (lambda () (display "<br>")))
Before a control sequence can be used, we must fix the escape character. The following sets it to backslash:
(set! mistie-escape-char #\\)
We can now invoke the br control sequence as \br.
(mistie-def-ctl-seq 'obeylines
  (lambda ()
    (mistie-push-frame)
    (mistie-def-char #\newline
      (lambda ()
        (display "<br>")
        (newline)))
    (mistie-def-ctl-seq 'endobeylines
      (lambda ()
        (mistie-pop-frame)))))
The obeylines control sequence first pushes a new frame on to the Mistie environment, using the Mistie procedure mistie-push-frame. What this means is that any definitions (whether mistie-def-char or mistie-def-ctl-seq) will shadow existing definitions. The Mistie procedure mistie-pop-frame exits the frame, causing the older definitions to take effect again.
In this case, we create a shadowing mistie-def-char for newline, so that it will emit <br> instead of performing its default action (which, as we described above, was to look for paragraph separation). We also define a control sequence endobeylines which will pop the frame pushed by obeylines. With this definition in place, any text sandwiched between \obeylines and \endobeylines (assuming \ is the escape character) will be output with a <br> at the end of each of its lines.
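The shadowing behavior of frames can be modeled with a stack of association lists. The sketch below is a toy model under that assumption, not Mistie's actual data structure; every toy-* name is hypothetical, and strings stand in for the procedures that real definitions would hold.

```scheme
;; Toy model of frame-based shadowing; NOT Mistie's actual implementation.
;; All toy-* names are hypothetical.
(define toy-env (list '()))   ; a stack of frames; each frame is an alist

(define (toy-push-frame)
  (set! toy-env (cons '() toy-env)))

(define (toy-pop-frame)
  (set! toy-env (cdr toy-env)))

(define (toy-def name value)
  ;; Definitions always go into the topmost frame.
  (set! toy-env
        (cons (cons (cons name value) (car toy-env))
              (cdr toy-env))))

(define (toy-lookup name)
  ;; Search frames from newest to oldest; return #f if undefined.
  (let loop ((frames toy-env))
    (cond ((null? frames) #f)
          ((assq name (car frames)) => cdr)
          (else (loop (cdr frames))))))

(toy-def 'newline "paragraph logic")
(toy-push-frame)                              ; what \obeylines would do
(toy-def 'newline "emit <br>")
(display (toy-lookup 'newline)) (newline)     ; prints: emit <br>
(toy-pop-frame)                               ; what \endobeylines would do
(display (toy-lookup 'newline)) (newline)     ; prints: paragraph logic
```

Because lookup searches the newest frame first, popping the frame is enough to restore the older definition; nothing has to be saved or undone by hand.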
(mistie-def-ctl-seq 'eval (lambda () (eval (read))))
This will cause \eval followed by a Scheme expression to evaluate that Scheme expression. E.g.,
\eval (display (+ 21 21))
will cause 42 to be printed at the point where the \eval statement is placed. Of course, once you have arbitrary access to Scheme within your document, the amount of kooky intertextual stuff you can do is limited only by your imagination. A mundane use for \eval is to reset the escape character at arbitrary locations in the document, should the existing character be needed (temporarily or permanently) for something else.
To load a mistie file, you should use the (mistie-load FILENAME ...) procedure, which will search the working directory and the directory given in the parameter mistie-path (which defaults to the value of (repository-path), the path where chicken-setup will install the initially provided .mistie files).
(mistie-main [FILENAME]) will invoke the filtering process.
mistie.scm -f plain.mistie input.doc > input.html
plain converts the characters <, >, &, and " to their HTML encodings. One or more blank lines are treated as paragraph separation.
plain provides a small set of control sequences geared for manual writing. The default escape character is \ (backslash). Typically, arguments of plain's control sequences are specified within braces ({...}), as in TeX or LaTeX.
\i typesets its argument in italic. E.g., \i{italic} produces italic. Other control sequences in this vein are \b for bold and \small for small print.
\p puts its argument in a monospace (fixed-width) font and is used for program code. If it is not convenient to enclose \p's argument in braces (e.g., the enclosed code contains non-matching braces), then the argument may be specified by placing the same character on each side. (This is like LaTeX's \verb.) Another useful feature of the \p control sequence: If its argument starts with a newline, it is displayed with the linebreaks preserved.
Use \title for specifying a document's title, which is used as both the internal title and the external (bookmarkable) title.
\stylesheet{file.css} causes the resulting HTML file to use the file file.css as its style sheet. A sample style sheet mistie.css is included in the distribution.
\section, \subsection, \subsubsection produce numbered section headers of the appropriate depth. \section*, etc., produce unnumbered sections.
\urlh{URL}{TEXT} typesets TEXT as a link to URL.
\obeylines{...} preserves linebreaks for its argument. Note that this is dissimilar in call, though not in function, to TeX's {\obeylines ...}.
\flushright is like \obeylines, but sets its argument lines flush right.
\input FILE or \input{FILE} includes the contents of FILE.
\eval evaluates the following Scheme expression.
\scmkeyword (prog1 block)
\scmconstant (true false)
A style sheet (see plain.mistie) is used to set the colors. The style sheet mistie.css, provided with this distribution, has the following style class settings:
.scheme { color: brown; }
.scheme .keyword { color: #cc0000; font-weight: bold; }
.scheme .variable { color: navy; }
.scheme .number,.string,.char,.boolean,.constant { color: green; }
.scheme .comment { color: teal; }
The class .scheme specifies the background punctuation style, and the various subclasses -- .keyword, .variable, etc. -- specify the styles for the various syntactic categories. Note that we have combined the subclasses for numbers, strings, etc., into one, but you can separate them out if you want to distinguish between them.
You may wish to modify these settings for your documents. Additionally, there are browser-specific ways you can use to override the settings of other authors' documents.
Navigation bars at the bottom allow the user to travel across the pages.
\bibitem can be used to enumerate bibliographic entries. \cite{BIBKEY} points to the entry introduced by \bibitem{BIBKEY}. \cite's argument can list multiple keys, with comma as the separator.
2.6 timestamp.mistie
This prints the date of last modification at the bottom
of the (first) page.
Copyright (c) 2006, Dorai Sitaram. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.