From 1ffd1da38f779145d6d3685b705fc51e4f90a17b Mon Sep 17 00:00:00 2001
From: Ralph Amissah
Date: Sat, 5 Mar 2011 21:29:24 -0500
Subject: documentation, remove some files

---
 data/doc/sisu/html/_sisu               |    1 -
 data/doc/sisu/html/homepage/index.html |  264 ---
 data/doc/sisu/html/index.html          |    1 -
 data/doc/sisu/html/sisu3.1.html        | 3520 --------------------------------
 4 files changed, 3786 deletions(-)
 delete mode 120000 data/doc/sisu/html/_sisu
 delete mode 100644 data/doc/sisu/html/homepage/index.html
 delete mode 120000 data/doc/sisu/html/index.html
 delete mode 100644 data/doc/sisu/html/sisu3.1.html

(limited to 'data')

diff --git a/data/doc/sisu/html/_sisu b/data/doc/sisu/html/_sisu
deleted file mode 120000
index 0e977351..00000000
--- a/data/doc/sisu/html/_sisu
+++ /dev/null
@@ -1 +0,0 @@
-../sisu_manual/_sisu
\ No newline at end of file
diff --git a/data/doc/sisu/html/homepage/index.html b/data/doc/sisu/html/homepage/index.html
deleted file mode 100644
index 6c55c9c0..00000000
--- a/data/doc/sisu/html/homepage/index.html
+++ /dev/null
@@ -1,264 +0,0 @@
-SiSU information Structuring Universe - Structured
-information, Serialized Units - software for electronic texts,
-documents, books, digital libraries in plaintext, html, xhtml, XML,
-ODF (OpenDocument), LaTeX, pdf, SQL (PostgreSQL and SQLite), and
-for search
-

- - SiSU >> - -

-

- SiSU information Structuring Universe -

-

- Structured information, Serialized Units -

-

-software for electronic texts, document collections, books, digital libraries, and search -

-

- with "atomic search" and text positioning system (shared text citation numbering: "ocn") -

-

-outputs include: plaintext, html, xhtml, XML, ODF (OpenDocument), LaTeX, pdf, SQL (PostgreSQL and SQLite) -

-
-
-

- - SiSU - -

-

- --- -

-

- - What does SiSU do? Summary - -

-

- --- -

-

- - Book Samples and Markup Examples - -

-

- --- -

-

- - Object Citation Numbering - ocn - -

-

(a text positioning system)

-

- --- -

-

-

- - Search - "Atomic" - -

-

- Of interest is the ease of streaming documents to a relational database, at an object (roughly paragraph) level and the potential for increased precision in the presentation of matches that results thereby. The ability to serialise html, LaTeX, XML, SQL, (whatever) is also inherent in / incidental to the design. For a description see the - - abandoned U.S. provisional patent application - -

-

- --- -

-

- - Download - -

-

- --- -

-

- - Changelog - -

-

- --- -

-

- - License - -

-

- Gnu / Linux / Unix -

-

- ============= -

-

- - sisu man pages - -

-

- --- -

-

- document preparation can be on any platform, in any editor: - (syntax highlight support currently for: vim, kate, write, gedit, diakonos) -

-

- - Syntax highlighting - -

-

- ============= -

-

- - * Composite document - -

-

- the composite document is a superset of the following documents: -

-

- - SiSU description - -

-

- - SiSU examples - -

-

- - SiSU chronology - -

-

- - SiSU technical - -

-

- - SiSU FAQ - -

-

- - SiSU download - -

-

- - SiSU changelog - -

-

- - SiSU license - -

-

- - SiSU standard - -

-

- - SiSU abandoned provisional patent - -

-

- Note: the placement of SiSU documents on the Net predates the release of SiSU. -

-
-
-

- For less markup than the most elementary HTML you can have so much more. -

-

SiSU - Structured information, Serialized Units for Electronic Documents, is an information structuring, transforming and publishing framework with the following features:

-

(i) markup syntax: (a) simpler than html, (b) mnemonic, influenced by mail/messaging/wiki markup practices, (c) human readable, and easily writable,

-

(ii) (a) minimal markup requirement, (b) single file marked up for multiple outputs,

-

(iii) (a) multiple outputs include amongst others: html; pdf via LaTeX; (structured) XML; ODF (OpenDocument); sql - currently PostgreSQL (and SQLite); ascii, (also texinfo), (b) takes advantage of the strengths implicit in these very different output types, (e.g. pdfs produced using typesetting of LaTeX, databases populated with documents at an individual object/paragraph level, with implications for search possibilities...)

-

(iv) provides a common object positioning and citation system for all outputs, which is human relevant and machine usable: object citation numbering, all objects (paragraphs, headings, verse, tables etc. and images) are numbered identically, for citation purposes, in all outputs (html, pdf, sql etc.),

-

(v) use of Dublin Core and other meta-tags to permit the addition of some semantic information on documents, and making easy integration of rdf/rss feeds etc.,

-

(vi) creates organised directory/file structure for (file-system) output, easily mapped with its clearly defined structure, with all text objects numbered, you know in advance where in each document output type, a bit of text will be found (eg. from an sql search, you know where to go to find the prepared html output or pdf etc.)... there is more; easy directory management and document associations, the document preparation (sub-)directory may be used to determine output (sub-)directory, the skin used, and the sql database used,

-

(vii) search of document sets, at object/paragraph level, the relational database retains information on the document structure, and citation numbering makes it possible for example to present search matches as an index of documents and locations within the document where the match is found,

-

(viii) "Concordance file" wordmap, consisting of all the words in a document and their (text/ object) locations within the text, (and the possibility of adding vocabularies),

-

(ix) document content certification and comparison considerations: (a) the document and each object within it stamped with an md5 hash making it possible to easily check or guarantee that the substantive content of a document is unchanged, (b) version control, documents integrated with time based source control system, default RCS or CVS with use of $Id$ tag, which SiSU checks -

(x) SiSU's minimalist markup makes for meaningful "diffing" of the substantive content of markup-files,

-

(xi) easily skinnable, document appearance on a project/site wide, directory wide, or document instance level easily controlled/changed,

-

(xii) in many cases a regular expression may be used (once in the document header) to define all or part of a document's structure, obviating or reducing the need to provide structural markup within the document,

-

(xiii) is a batch processor for handling large document sets, ... though once generated they need not be re-generated, unless changes are made to the desired presentation of a particular output type,

-

(xiv) possible to pre-process, which permits: the easy creation of standard form documents, and templates/term-sheets, or; building of composite documents (master documents) from other sisu marked up documents, or marked up parts, i.e. import documents or parts of text into a main document should this be desired

-

(xv) future proofing, a framework for adding further capability or updating existing capability as required: (a) modular, (thanks in no small part to Ruby) another output format required, write another module....(b) easy to update output formats (eg html, xhtml, latex/pdf produced can be updated in program and run against whole document set), (c) easy to add, modify, or have alternative syntax rules for input, should you need to,

-

(xvi) scalability, dependent on your file-system (in my case Reiserfs) and on the relational database used (currently Postgresql and SQLite), and your hardware,

-

(xvii) only marked up files need be backed up, to secure the larger document set produced,

-

(xviii) document management,

-

(xix) use your favourite editor, syntax highlighting files for markup, primarily (g)vim so far,

-

(xx) remote operations: (a) run SiSU on a remote server, (having prepared sisu markup documents locally or on that server, i.e. this solution where sisu is installed on the remote server, would work whatever type of machine you chose to prepare your markup documents on), (b) alternatively, (assuming sisu is available to you locally but not installed on the remote server) configure sisu to securely copy (scp) its output to your remote host and run sisu locally, (c) request a remotely located sisu markup file and process it locally by identifying it by its url.
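
Feature (viii) above describes the concordance as a wordmap of every word in a document together with the text objects in which it occurs. The following is a minimal sketch of that idea only: it is not SiSU's own code (SiSU is written in Ruby), and the `concordance` function, the tokenising regular expression and the sample objects are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def concordance(objects):
    """Map each word to the sorted list of object numbers containing it.

    `objects` is a hypothetical {object_number: text} mapping; it stands in
    for the numbered text objects (paragraphs, headings) of one document.
    """
    wordmap = defaultdict(list)
    for ocn in sorted(objects):
        # Assumption: words are runs of letters, digits and apostrophes.
        words = set(re.findall(r"[a-z0-9']+", objects[ocn].lower()))
        for word in words:
            wordmap[word].append(ocn)
    # Return the words in alphabetical order, as an index would list them.
    return dict(sorted(wordmap.items()))


objects = {
    1: "Markup and search",
    2: "Search the markup",
    3: "Citation numbering",
}
wordmap = concordance(objects)
for word, locations in wordmap.items():
    print(word, locations)
```

Because the locations are object numbers rather than page numbers, the same index entry would point at the same place in every output format.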

-
-

-More information on SiSU provided at www.jus.uio.no/sisu/SiSU

-
-
-

- More information on SiSU provided at: - - www.jus.uio.no/sisu/SiSU - -

-SiSU was developed in relation to legal documents, and is strong across a wide variety of texts (law, literature...(humanities, law and part of the social sciences)). SiSU handles images but is not suitable for formulae/ statistics, or for technical writing at this time.

-

-SiSU has been developed and has been in use for several years. Requirements to cover a wide range of documents within its use domain have been explored.

-

-Some modules are more mature than others, the most mature being Html and LaTeX / pdf. PostgreSQL and search functions are useable and together with ocn unique (to the best of my knowledge). The XML output document set is "well formed" but largely proof of concept, as is the OpenDocument output which is a limited SiSU feature set (SiSU is interested in a very limited ODF feature set).

-

- -ralph@amissah.com - -

-

- -ralph.amissah@gmail.com - -

-

-2007 -

-

-w3 since October 3 1993 -

-
-
diff --git a/data/doc/sisu/html/index.html b/data/doc/sisu/html/index.html
deleted file mode 120000
index c6513ea0..00000000
--- a/data/doc/sisu/html/index.html
+++ /dev/null
@@ -1 +0,0 @@
-../sisu_manual/index.html
\ No newline at end of file
diff --git a/data/doc/sisu/html/sisu3.1.html b/data/doc/sisu/html/sisu3.1.html
deleted file mode 100644
index 905b05ec..00000000
--- a/data/doc/sisu/html/sisu3.1.html
+++ /dev/null
@@ -1,3520 +0,0 @@
-"sisu"("1") manual page
-Table of Contents

- -

Name

-sisu - documents: markup, structuring, publishing in multiple standard -formats, and search -

Synopsis

-sisu [-abcDdFehIiMmNnopqRrSsTtUuVvwXxYyZz0-9] -[filename/wildcard] -

sisu [-P] [language_directory/filename language_directory] - -

sisu [-Ddcv] [instruction] [filename/wildcard] -

sisu [-CcFLSVvW] -

sisu --v2 [operations] -

sisu --v1 [operations] -

SISU - MANUAL, RALPH AMISSAH -

WHAT IS SISU? -

1. INTRODUCTION - WHAT IS SISU? -

SiSU is a framework for document -structuring, publishing (in multiple open standard formats) and search, -comprising: (a) a lightweight document structure and presentation markup -syntax; and (b) an accompanying engine for generating standard document -format outputs from documents prepared in sisu markup syntax, which is -able to produce multiple standard outputs (including the population of -sql databases) that (can) share a common numbering system for the citation -of text within a document. -

SiSU is developed under an open source, software -libre license (GPL3). Its use case for development is to work with medium to -large document sets and to cope with evolving document formats/representation -technologies. Documents are prepared once, and generated as need be to update -the technical presentation or add additional output formats. Various output -formats (including search related output) share a common mechanism for -cross-output-format citation. -

SiSU both defines a markup syntax and provides -an engine that produces open standards format outputs from documents prepared -with SiSU markup. From a single lightly prepared document sisu custom builds -several standard output formats which share a common (text object) numbering -system for citation of content within a document (that also has implications -for search). The sisu engine works with an abstraction of the document’s -structure and content from which it is possible to generate different forms -of representation of the document. Significantly, SiSU markup is more sparse -than html, and the outputs include html, EPUB, LaTeX, landscape and portrait -pdfs, and Open Document Format (ODF), all of which can be added to and updated. -SiSU is also able to populate SQL type databases at an object level, which -means that searches can be made with that degree of granularity. -

Source -document preparation and output generation is a two-step process: (i) document -source is prepared, that is, marked up in sisu markup syntax and (ii) the -desired output is subsequently generated by running the sisu engine against -document source. Output representations if updated (in the sisu engine) -can be generated by re-running the engine against the prepared source. Using -SiSU markup applied to a document, SiSU custom builds (to take advantage -of the strengths of different ways of representing documents) various standard -open output formats including plain text, HTML, XHTML, XML, EPUB, OpenDocument, -LaTeX or PDF files, and populates an SQL database with objects[^1] (equating -generally to paragraph-sized chunks) so searches may be performed and matches -returned with that degree of granularity (e.g. your search criteria are met -by these documents and at these locations within each document). Document -output formats share a common object numbering system for locating content. -This is particularly suitable for "published" works (finalized texts as -opposed to works that are frequently changed or updated) for which it provides -a fixed means of reference of content. -

In preparing a SiSU document you -optionally provide semantic information related to the document in a document -header, and in marking up the substantive text provide information on the -structure of the document, primarily indicating heading levels and footnotes. -You also provide information on basic text attributes where used. The rest -is automatic, sisu from this information custom builds[^2] the different -forms of output requested. -

SiSU works with an abstraction of the document -based on its structure which is comprised of its headings[^3] and objects[^4], -which enables SiSU to represent the document in many different ways, and -to take advantage of the strengths of different ways of presenting documents. -The objects are numbered, and these numbers can be used to provide a common -basis for citing material within a document across the different output -format types. This is significant as page numbers are not well suited to -the digital age, in web publishing, changing a browser’s default font or -using a different browser can mean that text will appear on a different -page; and publishing in different formats, html, landscape and portrait -pdf etc. again page numbers are not useful to cite text. Dealing with documents -at an object level together with object numbering also has implications -for search that SiSU is able to take advantage of. -
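
The numbering idea in the paragraph above can be illustrated with a short sketch. It is not SiSU's implementation (SiSU is written in Ruby and works on its own markup); it simply assumes, for illustration, that objects are blocks of text separated by blank lines, and the names `number_objects` and `sample` are hypothetical.

```python
def number_objects(text):
    """Split text into paragraph-sized objects and number them from 1.

    Assumption: a blank line separates one object from the next.
    """
    blocks = (block.strip() for block in text.split("\n\n"))
    return {n: block for n, block in enumerate((b for b in blocks if b), start=1)}


sample = "A heading\n\nFirst paragraph.\n\nSecond paragraph."
objects = number_objects(sample)

# The same number identifies the same text whatever the output format,
# which is what makes it usable for citation where page numbers are not.
for ocn, body in objects.items():
    print(ocn, body)
```

Since the numbers are assigned from the source structure rather than from any rendering, changing fonts, page sizes or output format leaves them unchanged.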

One of the challenges -of maintaining documents is to keep them in a format that allows use of -them independently of proprietary platforms. Consider issues related to -dealing with legacy proprietary formats today and what guarantee you have -that old proprietary formats will remain (or can be read without proprietary -software/equipment) in 15 years’ time, or the way in which html -has evolved over its relatively short span of existence. SiSU provides the -flexibility of producing documents in multiple non-proprietary open formats -including html, pdf,[^5] ODF,[^6] and EPUB.[^7] Whilst SiSU relies on software, -the markup is uncomplicated and minimalistic which guarantees that future -engines can be written to run against it. It is also easily converted to -other formats, which means documents prepared in SiSU can be migrated to -other document formats. Further security is provided by the fact that the -software itself, SiSU, is available under GPL3, a licence that guarantees -that the source code will always be open, and free as in libre, which means -that that code base can be used, updated and further developed as required -under the terms of its license. Another challenge is to keep up with a moving -target. SiSU permits new forms of output to be added as they become important, -(Open Document Format text was added in 2006 when it became an ISO standard -for office applications and the archival of documents), EPUB was introduced -in 2009; and allows the technical representation of existing output to be -updated (html has evolved and the related module has been updated repeatedly -over the years, presumably when the World Wide Web Consortium (w3c) finalises -html 5 which is currently under development, the html module will again -be updated allowing all existing documents to be regenerated as html 5). - -

The document formats are written to the file-system and available for -indexing by independent indexing tools, whether off the web like Google -and Yahoo or on the site like Lucene and Hyperestraier. -

SiSU also provides -other features such as concordance files and document content certificates, -and the working against an abstraction of document structure has further -possibilities for the research and development of other document representations; -the availability of objects is useful, for example, for topic maps and thesauri, -and together with the flexibility of SiSU offers great possibilities. -

SiSU -is primarily for published works, which can take advantage of the citation -system to reliably reference its documents. SiSU works well in a complementary -manner with such collaborative technologies as Wikis, which can take advantage -of and be used to discuss the substance of content prepared in SiSU -

-<http://www.jus.uio.no/sisu -> -

2. COMMANDS SUMMARY -

2.1 DESCRIPTION -

SiSU -is a document publishing system that, from a simple single marked-up document, -produces multiple output formats including: plaintext, html, xhtml, -XML, epub, odt (odf text), LaTeX, pdf, info, and SQL (PostgreSQL and SQLite), -which share numbered text objects ("object citation numbering") and the -same document structure information. For more see: <http://www.jus.uio.no/sisu -> - -

2.2 DOCUMENT PROCESSING COMMAND FLAGS -

-

- -
-a [filename/wildcard]
-
produces plaintext -with Unix linefeeds and without markup, (object numbers are omitted), has -footnotes at end of each paragraph that contains them [  -A  for  equivalent - dos  (linefeed)  output  file] [see  -e  for (Options include: --endnotes for endnotes ---footnotes for footnotes at the end of each paragraph --unix for unix linefeed -(default) --msdos for msdos linefeed) -

- -
-b [filename/wildcard]
-
see --xhtml -

- -
--color-toggle -[filename/wildcard]
-
screen toggle ansi screen colour on or off depending -on default set (unless -c flag is used: if sisurc colour default is set -to ’true’, output to screen will be with colour, if sisurc colour default -is set to ’false’ or is undefined screen output will be without colour). Alias -c -

- -
--concordance [filename/wildcard]
-
produces concordance (wordmap) a rudimentary -index of all the words in a document. (Concordance files are not generated -for documents of over 260,000 words unless this limit is increased in the -file sisurc.yml). Alias -w -

- -
-C [--init-site]
-
configure/initialise shared output -directory files initialize shared output directory (config files such as -css and dtd files are not updated if they already exist unless modifier -is used). -C --init-site configure/initialise site more extensive than -C on -its own, shared output directory files/force update, existing shared output -config files such as css and dtd files are updated if this modifier is -used. -

- -
-CC
-
configure/initialise shared output directory files initialize -shared output directory (config files such as css and dtd files are not -updated if they already exist unless modifier is used). The equivalent of: --C --init-site configure/initialise site, more extensive than -C on its own, -shared output directory files/force update, existing shared output config -files such as css and dtd files are updated if -CC is used. -

- -
-c [filename/wildcard] -
-
-

see --color-toggle -

- -
--dal [filename/wildcard/url]
-
assumed for most other flags, -creates new intermediate files for processing (document abstraction) that -is used in all subsequent processing of other output. This step is assumed -for most processing flags. To skip it see -n. Alias -m -

- -
--delete [filename/wildcard] -
-
-

see --zap -

- -
-D [instruction] [filename]
-
see --pg -

- -
-d [--db-[database  type  (sqlite|pg)]] ---[instruction] [filename]
-
see --sqlite -

- -
--epub [filename/wildcard]
-
produces -an epub document, [sisu  version  2  only] (filename.epub). Alias -e -

- -
-e [filename/wildcard] -
-
-

see --epub -

- -
-F [--webserv=webrick]
-
see --sample-search-form -

- -
--git [filename/wildcard] -
-
produces or updates markup source file structure in a git repo (experimental -and subject to change). Alias -g -

- -
-g [filename/wildcard]
-
see --git -

- -
--harvest -*.ss[tm]
-
makes two lists of sisu output based on the sisu markup documents -in a directory: list of authors and their works (year and titles), and; -list by topic with titles and author. Makes use of header metadata fields -(author, title, date, topic_register). Can be used with maintenance (-M) -and remote placement (-R) flags. -

- -
--help [topic]
-
provides help on the selected -topic, where topics (keywords) include: list, (com)mands, short(cuts), -(mod)ifiers, (env)ironment, markup, syntax, headers, headings, endnotes, -tables, example, customise, skin, (dir)ectories, path, (lang)uage, db, -install, setup, (conf)igure, convert, termsheet, search, sql, features, - -

license -

- -
--html [filename/wildcard]
-
produces html output, segmented text -with table of contents (toc.html and index.html) and the document in a single -file (scroll.html). Alias -h -

- -
-h [filename/wildcard]
-
see --html -

- -
-I [filename/wildcard] -
-
-

see --texinfo -

- -
-i [filename/wildcard]
-
see --manpage -

- -
-L
-
prints license information. - -

- -
--machine [filename/wildcard/url]
-
see --dal (document abstraction level/layer) - -

- -
--maintenance [filename/wildcard/url]
-
maintenance mode files created for -processing preserved and their locations indicated. (also see -V). Alias -M - -

- -
--manpage [filename/wildcard]
-
produces man page of file, not suitable for -all outputs. Alias -i -

- -
-M [filename/wildcard/url]
-
see --maintenance -

- -
-m [filename/wildcard/url] -
-
see --dal (document abstraction level/layer) -

- -
--no-ocn
-
[with  --html  --pdf  or  --epub] -switches off object citation numbering. Produce output without identifying -numbers in margins of html or LaTeX/pdf output. -

- -
-N [filename/wildcard/url] -
-
document digest or document content certificate ( DCC ) as md5 digest tree -of the document: the digest for the document, and digests for each object -contained within the document (together with information on software versions -that produced it) (digest.txt). -NV for verbose digest output to screen. -

- -
-n -[filename/wildcard/url]
-
skip the creation of intermediate processing files -(document abstraction) if they already exist, this skips the equivalent -of -m which is otherwise assumed by most processing flags. -

- -
--odf [filename/wildcard/url] -
-
-

see --odt -

- -
--odt [filename/wildcard/url]
-
output basic document in opendocument -file format (opendocument.odt). Alias -o -

- -
-o [filename/wildcard/url]
-
see --odt - -

- -
--pdf [filename/wildcard]
-
produces LaTeX pdf (portrait.pdf & landscape.pdf). -Default paper size is set in config file, or document header, or provided -with additional command line parameter, e.g. --papersize-a4 preset sizes include: -’A4’, U.S. ’letter’ and -

- -
--pg [instruction] [filename]
-
database postgresql ( --pgsql -may be used instead) possible instructions, include: --createdb; --create; ---dropall; --import [filename]; --update [filename]; --remove [filename]; see database -section below. Alias -D -

- -
--po [language_directory/filename language_directory] -
-
-

see --po4a -

- -
--po4a [language_directory/filename language_directory]
-
produces -.pot and po files for the file in the languages specified by the language -directory. SiSU markup is placed in subdirectories named with the language -code, e.g. en/ fr/ es/. The sisu config file must set the output directory -structure to multilingual. v3, experimental -

- -
-P [language_directory/filename -language_directory]
-
see --po4a -

- -
-p [filename/wildcard]
-
see --pdf -

- -
--quiet [filename/wildcard] -
-
quiet less output to screen. -

- -
-q [filename/wildcard]
-
see --quiet -

- -
--rsync [filename/wildcard] -
-
copies sisu output files to remote host using rsync. This requires that -sisurc.yml has been provided with information on hostname and username, -and that you have your "keys" and ssh agent in place. Note the behavior -of rsync is different if -R is used with other flags than if used alone. Alone -the rsync --delete parameter is sent, useful for cleaning the remote directory -(when -R is used together with other flags, it is not). Also see --scp. Alias -R -

- -
-R [filename/wildcard]
-
see --rsync -

- -
-r [filename/wildcard]
-
see --scp -

- -
--sample-search-form -[--webserv=webrick]
-
generate examples of (naive) cgi search form for sqlite -and pgsql depends on your already having used sisu to populate an sqlite -and/or pgsql database, (the sqlite version scans the output directories -for existing sisu_sqlite databases, so it is first necessary to create -them, before generating the search form) see -d -D and the database section -below. If the optional parameter --webserv=webrick is passed, the cgi examples -created will be set up to use the default port set for use by the webrick -server, (otherwise the port is left blank and the system setting used, -usually 80). The samples are dumped in the present work directory which -must be writable, (with screen instructions given that they be copied to -the cgi-bin directory). -Fv (in addition to the above) provides some information -on setting up hyperestraier for sisu. Alias -F -

- -
--scp [filename/wildcard]
-
copies -sisu output files to remote host using scp. This requires that sisurc.yml -has been provided with information on hostname and username, and that you -have your "keys" and ssh agent in place. Also see --rsync. Alias -r -

- -
--sqlite ---[instruction] [filename]
-
database type default set to sqlite, (for which ---sqlite may be used instead) or to specify another database --db-[pgsql,  sqlite] -(however see -D) possible instructions include: --createdb; --create; --dropall; ---import [filename]; --update [filename]; --remove [filename]; see database section -below. Alias -d -

- -
--sisupod
-
produces a sisupod a zipped sisu directory of markup -files including sisu markup source files and the directories local configuration -file, images and skins. Note: this only includes the configuration files -or skins contained in ./_sisu not those in ~/.sisu -S [filename/wildcard] -option. Note: (this
- option is tested only with zsh). Alias -S -

- -
--sisupod [filename/wildcard]
-
produces -a zipped file of the prepared document specified along with associated -images, by default named sisupod.zip they may alternatively be named with -the filename extension .ssp This provides a quick way of gathering the relevant -parts of a sisu document which can then for example be emailed. A sisupod -includes sisu markup source file, (along with associated documents if a -master file, or available in multilingual versions), together with related -images and skin. SiSU commands can be run directly against a sisupod contained -in a local directory, or provided as a url on a remote site. As there is -a security issue with skins provided by other users, they are not applied -unless the flag --trust or --trusted is added to the command instruction, it -is recommended that files that are not your own be treated as untrusted. -The directory structure of the unzipped file is understood by sisu, and -sisu commands can be run within it. Note: if you wish to send multiple files, -it quickly becomes more space efficient to zip the sisu markup directory, -rather than the individual files for sending. See the -S option without -[filename/wildcard]. Alias -S -

- -
--source [filename/wildcard]
-
copies sisu markup -file to output directory. Alias -s -

- -
-S
-
see --sisupod -

- -
-S [filename/wildcard] -
-
-

see --sisupod -

- -
-s [filename/wildcard]
-
see --source -

- -
--texinfo [filename/wildcard] -
-
produces texinfo and info file, (view with pinfo). Alias -I -

- -
--txt [filename/wildcard] -
-
produces plaintext with Unix linefeeds and without markup, (object numbers -are omitted), has footnotes at end of each paragraph that contains them -[  -A  for  equivalent  dos  (linefeed)  output  file] [see  -e  for (Options include: ---endnotes for endnotes --footnotes for footnotes at the end of each paragraph ---unix for unix linefeed (default) --msdos for msdos linefeed). Alias -t -

- -
-T [filename/wildcard - (*.termsheet.rb)]
-
standard form document builder, preprocessing feature -

-

- -
-t [filename/wildcard]
-
see --txt -

- -
--urls [filename/wildcard]
-
prints url output -list/map for the available processing flags options and resulting files -that could be requested, (can be used to get a list of processing options -in relation to a file, together with information on the output that would -be produced), -u provides url output mapping for those flags requested for -processing. The default assumes sisu_webrick is running and provides webrick -url mappings where appropriate, but these can be switched to file system -paths in sisurc.yml. Alias -U -

- -
-U [filename/wildcard]
-
see --urls -

- -
-u [filename/wildcard] -
-
provides url mapping of output files for the flags requested for processing, - -

also see -U -

- -
--v1 [filename/wildcard]
-
invokes the sisu v1 document parser/generator. -For use with sisu v1 markup documents. (Markup conversion to v2 involves -the modification of document headers) -

- -
--v2 [filename/wildcard]
-
invokes the -sisu v2 document parser/generator. This is the default and is normally omitted. - -

- -
--verbose [filename/wildcard]
-
provides verbose output of what is being generated, -where output is placed (and error messages if any), as with -u flag provides -a url mapping of files created for each of the processing flag requests. - -

Alias -v -

- -
-V
-
on its own, provides SiSU version and environment information -(sisu --help env) -

- -
-V [filename/wildcard]
-
even more verbose than the -v flag. - -

- -
-v
-
on its own, provides SiSU version information -

- -
-v [filename/wildcard] -
-
-

see --verbose -

- -
--webrick
-
starts ruby’s webrick webserver points at sisu output -directories, the default port is set to 8081 and can be changed in the -resource configuration files. [tip:  the  webrick  server  requires  link  suffixes, - so  html  output  should  be  created  using  the  -h  option  rather  than and search --H  ;  also,  note  -F  webrick  ]. Alias -W -

- -
-W
-
see --webrick -

- -
--wordmap [filename/wildcard] -
-
-

see --concordance -

- -
-w [filename/wildcard]
-
see --concordance -

- -
--xhtml [filename/wildcard] -
-
produces xhtml/XML output for browser viewing (sax parsing). Alias -b -

- -
--xml-dom -[filename/wildcard]
-
produces XML output with deep document structure, in -the nature of dom. Alias -X -

- -
--xml-sax [filename/wildcard]
-
produces XML output -shallow structure (sax parsing). Alias -x -

- -
-X [filename/wildcard]
-
see --xml-dom - -

- -
-x [filename/wildcard]
-
see --xml-sax -

- -
-Y [filename/wildcard]
-
produces a short -sitemap entry for the document, based on html output and the sisu_manifest. ---sitemaps generates/updates the sitemap index of existing sitemaps. (Experimental, -[g,y,m  announcement  this  week]) -

- -
-y [filename/wildcard]
-
produces an html -summary of output generated (hyperlinked to content) and document specific -metadata (sisu_manifest.html). This step is assumed for most processing flags. - -

- -
--zap [filename/wildcard]
-
Zap, if used with other processing flags deletes -output files of the type about to be processed, prior to processing. If --Z is used as the lone processing related flag (or in conjunction with a -combination of -[mMvVq]), will remove the related document output directory. - -

Alias -Z -

- -
-Z [filename/wildcard]
-
see --zap -

-
-3. COMMAND LINE MODIFIERS -

-

- -
--no-ocn -
-
[with --html --pdf or --epub] switches off object citation numbering, producing -output without identifying numbers in the margins of html or LaTeX/pdf output. - 

- -
--no-annotate
-
strips output text of editor endnotes[^*1] denoted by asterisk - -

or dagger/plus sign -

- -
--no-asterisk
-
strips output text of editor endnotes[^*2] - -

denoted by asterisk sign -

- -
--no-dagger
-
strips output text of editor endnotes[^+1] - -

denoted by dagger/plus sign -

-
-4. DATABASE COMMANDS -

dbi - database interface - -

-D or --pgsql: set for postgresql; -d or --sqlite: default, set for sqlite; -d is -modifiable with --db=[database type (pgsql or sqlite)] -

-

- -
--pg -v --createall
-
initial -step, creates required relations (tables, indexes) in an existing postgresql -database (a database should be created manually and given the same name -as the working directory, as requested) (rb.dbi) [ -dv --createall sqlite equivalent]. -It may be necessary to run sisu -Dv --createdb initially. NOTE: at the present -time for postgresql it may be necessary to manually create the database. -The command would be ’createdb [database name]’ where database name would -be SiSU_[present working directory name (without path)]. Please use only -alphanumerics and underscores. -

- -
--pg -v --import
-
[filename/wildcard] imports -data specified to postgresql db (rb.dbi) [  -dv  --import  sqlite  equivalent] - -

- -
--pg -v --update
-
[filename/wildcard] updates/imports specified data to postgresql -db (rb.dbi) [  -dv  --update  sqlite  equivalent] -

- -
--pg --remove
-
[filename/wildcard] -removes specified data from postgresql db (rb.dbi) [ -d --remove sqlite equivalent] - 

- -
--pg --dropall
-
kills data and drops (postgresql or sqlite) db, tables & indexes -[ -d --dropall sqlite equivalent] -

The -v is for verbose output. -

-
-5. SHORTCUTS, -SHORTHAND FOR MULTIPLE FLAGS -

-

- -
--update [filename/wildcard]
-
Checks existing -file output and runs the flags required to update this output. This means -that if only html and pdf output were requested on previous runs, only the --hp flags will be applied, and only these outputs will be generated this time, together -with the summary. This can be very convenient if you offer different outputs -of different files and just want to do the same again. -

- -
-0 to -5 [filename - or  wildcard]
-
Default shorthand mappings (note that the defaults can be -changed/configured in the sisurc.yml file): -

- -
-0
-
-mNhwpAobxXyYv [this is the - default action run when no flags are given, i.e. on ’sisu [filename]’] -

- -
-1
-
-mhewpy -

- -
-2
-
-mhewpaoy - -

- -
-3
-
-mhewpAobxXyY -

- -
-4
-
-mhewpAobxXDyY --import -

- -
-5
-
-mhewpAobxXDyY --update -
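
The default shorthand mappings listed above can be read as a simple lookup table. The following is a minimal Python sketch of that table, not SiSU's own code (SiSU is written in Ruby); the names SHORTHAND_DEFAULTS and expand_shorthand are hypothetical, and the values are copied from the list above, so they do not reflect any changes made in sisurc.yml.

```python
# Default shorthand mappings as listed in this section.
# Assumption: these are the unmodified defaults; sisurc.yml may change them.
SHORTHAND_DEFAULTS = {
    "0": "-mNhwpAobxXyYv",
    "1": "-mhewpy",
    "2": "-mhewpaoy",
    "3": "-mhewpAobxXyY",
    "4": "-mhewpAobxXDyY --import",
    "5": "-mhewpAobxXDyY --update",
}


def expand_shorthand(flag):
    """Return the flag string that a shorthand such as '-2' stands for.

    Hypothetical helper for illustration only.
    """
    key = flag.lstrip("-")
    if key not in SHORTHAND_DEFAULTS:
        raise ValueError("not a shorthand flag: %r" % flag)
    return SHORTHAND_DEFAULTS[key]


if __name__ == "__main__":
    for n in range(6):
        print("-%d" % n, "=>", expand_shorthand("-%d" % n))
```

Read this way, "sisu -2 [filename]" would be equivalent to "sisu -mhewpaoy [filename]" under the default configuration.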

add -v -for verbose mode and -c for color, e.g. sisu -2vc [filename  or -

consider -u - -

for appended url info or -v for verbose output -

-
-5.1 COMMAND LINE WITH FLAGS -- BATCH PROCESSING -

In the data directory run sisu -mh with a filename or wildcard, -e.g. "sisu -h cisg.sst" or "sisu -h *.{sst,ssm}", to produce an html version of all -documents. -

Running sisu (alone without any flags, filenames or wildcards) -brings up the interactive help, as does any sisu command that is not recognised. -Enter to escape. -

6. HELP -

6.1 SISU MANUAL -

The most up to date information -on sisu should be contained in the sisu_manual, available at: -

<http://sisudoc.org/sisu/sisu_manual/ ->
- -

The manual can be generated from source, found respectively, either -within the SiSU tarball or installed locally at: -

./data/doc/sisu/v2/sisu_markup_samples/sisu_manual/
- -

/usr/share/doc/sisu/v2/sisu_markup_samples/sisu_manual/
- -

move to the respective directory and type e.g.: -

sisu sisu_manual.ssm
- -

6.2 SISU MAN PAGES -

If SiSU is installed on your system usual man commands -should be available, try: -

man sisu
- -

man sisu_markup
- -

man sisu_commands
- -

Most SiSU man pages are generated directly from sisu documents that -are used to prepare the sisu manual, the source files for which are located -within the SiSU tarball at: -

./data/doc/sisu/v2/sisu_markup_samples/sisu_manual/
- -

Once installed, directory equivalent to: -

/usr/share/doc/sisu/sisu_manual/
- -

Available man pages are converted back to html using man2html: -

/usr/share/doc/sisu/v2/html/
- -

./data/doc/sisu/v2/html/
- -

An online version of the sisu man page is available here: -

* various -sisu man pages <http://www.jus.uio.no/sisu/man/ -> [^8] -

* sisu.1 <http://www.jus.uio.no/sisu/man/sisu.1.html -> -[^9] -

6.3 SISU BUILT-IN INTERACTIVE HELP -

This is particularly useful for -getting the current sisu setup/environment information: -

sisu --help
- -

sisu --help [subject]
- -

sisu --help commands
- -

sisu --help markup
- -

sisu --help env [for  feedback  on  the  way  your  system  is
- setup  with  regard  to  sisu]
- -

sisu -V [environment  information,  same  as  above  command]
- -

sisu (on its own provides version and some help information)
- -

Apart from real-time information on your current configuration the SiSU -manual and man pages are likely to contain more up-to-date information than -the sisu interactive help (for example on commands and markup). -

NOTE: -Running the command sisu (alone without any flags, filenames or wildcards) -brings up the interactive help, as does any sisu command that is not recognised. -Enter to escape. -

6.4 HELP SOURCES -

For lists of alternative help sources, -see: -

man page -

man sisu_help_sources
- -

man2html -

/usr/share/doc/sisu/v2/html/sisu.1.html
- -

<http://sisudoc.org/sisu/sisu_help_sources/index.html ->
- -

7. INTRODUCTION TO SISU MARKUP[^10] -

7.1 SUMMARY -

SiSU source documents -are plaintext (UTF-8)[^11] files -

All paragraphs are separated by an empty -line. -

Markup is comprised of: -

* at the top of a document, the document -header made up of semantic meta-data about the document and if desired additional -processing instructions (such as an instruction to automatically number headings -from a particular level down) -

* followed by the prepared substantive -text of which the most important single characteristic is the markup of -different heading levels, which define the primary outline of the document -structure. Markup of substantive text includes: -

* heading levels, which define -document structure
- -

* text basic attributes, italics, bold etc.
- -

* grouped text (objects), which are to be treated differently, such -as code
- blocks or poems.
- -

* footnotes/endnotes
- -

* linked text and images
- -

* paragraph actions, such as indent, bulleted, numbered-lists, etc.
- -

Some interactive help on markup is available, by typing sisu and selecting - -

markup or sisu --help markup -

To check the markup in a file: -

sisu --identify -[filename].sst
- -

For a brief descriptive summary of markup history -

sisu --query-history
- -

or for a particular version: -

sisu --query-0.38
- -

7.2 MARKUP EXAMPLES -

7.2.1 ONLINE -

Online markup examples are available -together with the respective outputs produced from <http://www.jus.uio.no/sisu/SiSU/examples.html -> -or from <http://www.jus.uio.no/sisu/sisu_examples/ -> -

There is of course this -document, which provides a cursory overview of sisu markup and the respective -output produced: <http://www.jus.uio.no/sisu/sisu_markup/ -> -

Some example marked -up files are available as html with syntax highlighting for viewing: <http://www.jus.uio.no/sisu/sample/syntax -> - -

an alternative presentation of markup syntax: <http://www.jus.uio.no/sisu/sample/on_markup.txt -> - -

7.2.2 INSTALLED -

With SiSU installed sample skins may be found in: /usr/share/doc/sisu/sisu_markup_samples/dfsg -(or equivalent directory) and if sisu-markup-samples is installed also under: - -

/usr/share/doc/sisu/sisu_markup_samples/non-free -

8. MARKUP OF HEADERS -

- Headers contain either: semantic meta-data about a document, which can -be used by any output module of the program, or; processing instructions. - -

Note: the first line of a document may include information on the markup -version used in the form of a comment. Comments are a percentage mark at -the start of a paragraph (and as the first character in a line of text) -followed by a space and the comment: -

-


-

  % this would be a comment
-
-

8.1 SAMPLE HEADER -

This current document is loaded by a master document -that has a header similar to this one: -

-


-

  % SiSU master 2.0
-  @title: SiSU
-   :subtitle: Manual
-  @creator: :author: Amissah, Ralph
-  @rights: Copyright (C) Ralph Amissah 2007, License GPL 3
-  @classify:
-   :type: information
-   :topic_register: SiSU:manual;electronic documents:SiSU:manual
-   :subject: ebook, epublishing, electronic book, electronic publishing,
-      electronic document, electronic citation, data structure,
-       citation systems, search
-  % used_by: manual
-  @date: :published: 2008-05-22
-   :created: 2002-08-28
-   :issued: 2002-08-28
-   :available: 2002-08-28
-   :modified: 2010-03-03
-  @make: :num_top: 1
-   :breaks: new=C; break=1
-   :skin: skin_sisu_manual
-   :bold: /Gnu|Debian|Ruby|SiSU/
-   :manpage: name=sisu - documents: markup, structuring, publishing
-       in multiple standard formats, and search;
-       synopsis=sisu  [-abcDdeFhIiMmNnopqRrSsTtUuVvwXxYyZz0-9]  [filename/wildcard
- ]
-       . sisu  [-Ddcv]  [instruction]
-       . sisu  [-CcFLSVvW]
-       . sisu --v2  [operations]
-       . sisu --v1  [operations]
-  @links: { SiSU Manual }http://www.jus.uio.no/sisu/sisu_manual/
-    { Book Samples and Markup Examples }http://www.jus.uio.no/sisu/SiSU/examples.html
-    { SiSU @ Wikipedia }http://en.wikipedia.org/wiki/SiSU
-    { SiSU @ Freshmeat }http://freshmeat.net/projects/sisu/
-    { SiSU @ Ruby Application Archive }http://raa.ruby-lang.org/project/sisu/
-    { SiSU @ Debian }http://packages.qa.debian.org/s/sisu.html
-    { SiSU Download }http://www.jus.uio.no/sisu/SiSU/download.html
-    { SiSU Changelog }http://www.jus.uio.no/sisu/SiSU/changelog.html
-    { SiSU help }http://www.jus.uio.no/sisu/sisu_manual/sisu_help/
-    { SiSU help sources }http://www.jus.uio.no/sisu/sisu_manual/sisu_help_sources/
-
-

8.2 AVAILABLE HEADERS -

Header tags appear at the beginning of a document -and provide meta information on the document (such as the Dublin Core), -or information as to how the document as a whole is to be processed. All -header instructions take either the form @headername: or 0~headername. All - -

Dublin Core meta tags are available -

@identifier: information or instructions - 

where the "identifier" is a tag recognised by the program, and the "information" -or "instructions" belong to the tag/identifier specified -

Note: a header -where used should only be used once; all headers apart from @title: are -optional; the @structure: header is used to describe document structure, -and can be useful to know. -

This is a sample header -

-


-

  % SiSU 2.0  [declared  file-type  identifier  with  markup  version]
-
-


-

  @title:  [title  text]  [this  header  is  the  only  one  that  is  mandatory]
-    :subtitle:  [subtitle  if  any]
-    :language: English
-
-


-

  @creator: :author:  [Lastname,  First  names]
-   :illustrator:  [Lastname,  First  names]
-   :translator:  [Lastname,  First  names]
-   :prepared_by:  [Lastname,  First  names]
-
-


-

  @date: :published:  [year  or  yyyy-mm-dd]
-   :created:  [year  or  yyyy-mm-dd]
-   :issued:  [year  or  yyyy-mm-dd]
-   :available:  [year  or  yyyy-mm-dd]
-   :modified:  [year  or  yyyy-mm-dd]
-   :valid:  [year  or  yyyy-mm-dd]
-   :added_to_site:  [year  or  yyyy-mm-dd]
-   :translated:  [year  or  yyyy-mm-dd]
-
-


-

  @rights: :copyright: Copyright (C)  [Year  and  Holder]
-   :license:  [Use  License  granted]
-   :text:  [Year  and  Holder]
-   :translation:  [Name,  Year]
-   :illustrations:  [Name,  Year]
-
-


-

  @classify:
-   :topic_register: SiSU:markup sample:book;book:novel:fantasy
-   :type:
-   :subject:
-   :description:
-   :keywords:
-   :abstract:
-   :isbn:  [ISBN]
-   :loc:  [Library  of  Congress  classification]
-   :dewey:  [Dewey  classification]
-   :pg:  [Project  Gutenberg  text  number]
-
-


-

  @links: { SiSU }http://www.jus.uio.no/sisu/
-    { FSF }http://www.fsf.org
-
-


-

  @make:
-   :skin: skin_name
-     [skins change default settings related to the appearance of documents
-generated]
-   :num_top: 1
-   :headings:  [text  to  match  for  each  level
-     (e.g. PART; Chapter; Section; Article;
-      or another: none; BOOK|FIRST|SECOND; none; CHAPTER;)]
-   :breaks: new=:C; break=1
-   :promo: sisu, ruby, sisu_search_libre, open_society
-   :bold: [regular expression of words/phrases to be made bold]
-   :italics:  [regular  expression  of  words/phrases  to  italicise]
-
-


-

  @original: :language:  [language]
-
-


-

  @notes: :comment:
-   :prefix:  [prefix  is  placed  just  after  table  of  contents]
-
-
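
The header layout shown above (an @name: line, optionally followed by indented :sub-field: lines, with "% " comment lines ignored) can be read into a nested structure. The following Python sketch is an illustration under those assumptions only; it is not SiSU's parser, and parse_header is a hypothetical name. A value given directly after @name: is stored under the empty key, and a line that is not a :sub-field: is treated as a continuation of the previous value.

```python
import re

HEAD = re.compile(r"^@(\w+):\s*(.*)$")   # e.g. "@title: SiSU"
SUB = re.compile(r"^:(\w+):\s*(.*)$")    # e.g. ":subtitle: Manual"


def parse_header(text):
    """Parse SiSU-style header lines into {header: {sub-field: value}}.

    Hypothetical sketch: direct values are stored under the key "".
    """
    headers = {}
    current = None
    last_key = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "%" or line.startswith("% "):
            continue  # skip blank lines and comments
        head = HEAD.match(line)
        if head:
            current = headers.setdefault(head.group(1), {})
            last_key = ""
            line = head.group(2)
            if not line:
                continue
        if current is None:
            continue  # text before any header is ignored
        sub = SUB.match(line)
        if sub:
            last_key = sub.group(1)
            current[last_key] = sub.group(2)
        else:
            # continuation of the previous value
            current[last_key] = (current.get(last_key, "") + " " + line).strip()
    return headers


if __name__ == "__main__":
    sample = "% SiSU 2.0\n\n@title: SiSU\n :subtitle: Manual\n\n@creator: :author: Amissah, Ralph\n"
    print(parse_header(sample))
```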

9. MARKUP OF SUBSTANTIVE TEXT -

9.1 HEADING LEVELS -

Heading levels are -:A~, :B~, :C~, 1~, 2~, 3~ ... :A - :C being part / section headings, followed -by other heading levels, and 1 - 6 being headings followed by substantive -text or sub-headings. :A~ is usually the title; :A~? is a conditional level 1 heading -(used where a stand-alone document may be imported into another) -

:A~ [heading - text] Top level heading [this  usually  has  similar  content  to  the  ] NOTE: -the heading levels described here are in 0.38 notation, see heading -

:B~ -[heading  text] Second level heading [this  is  a  heading  level  divider] -

- :C~ [heading  text] Third level heading [this  is  a  heading  level  divider] - -

1~ [heading  text] Top level heading preceding substantive text of document -or sub-heading 2, the heading level that would normally be marked 1. or 2. -or 3. etc. in a document, and the level on which sisu by default would break -html output into named segments, names are provided automatically if none -are given (a number), otherwise takes the form 1~my_filename_for_this_segment - -

2~ [heading  text] Second level heading preceding substantive text of -document or sub-heading 3, the heading level that would normally be marked -1.1 or 1.2 or 1.3 or 2.1 etc. in a document. -

3~ [heading  text] Third level -heading preceding substantive text of document, that would normally be -marked 1.1.1 or 1.1.2 or 1.2.1 or 2.1.1 etc. in a document -

-


-

  1~filename level 1 heading,
-  % the primary division such as Chapter that is followed by substantive
-text,
-  % and may be further subdivided (this is the level on which by default
-html
-  % segments are made)
-
-

9.2 FONT ATTRIBUTES -

markup example: -

-


-

  normal text,  *{emphasis}*, !{bold text}!, /{italics}/, _{underscore}_,
-"{citation}",
-  ^{superscript}^, ,{subscript},, +{inserted text}+, -{strikethrough}- #{monospace}#
-  normal text
-  *{emphasis}*
-  !{bold text}!
-  _{underscore}_
-  /{italics}/
-  "{citation}"
-  ^{superscript}^
-  ,{subscript},
-  +{inserted text}+
-  -{strikethrough}-
-  #{monospace}#
-
-

resulting output: -

normal text emphasis bold text underscore italics -"citation" ^superscript^ [subscript] ++inserted text++ --strikethrough-- monospace - -

normal text -

emphasis [note:  can  be  configured  to  be  represented  by - bold,  italics  or  underscore] -

bold text -

italics -

underscore -

"citation" - -

^superscript^ -

[subscript] -

++inserted text++ -

--strikethrough-- -

monospace - -

9.3 INDENTATION AND BULLETS -

markup example: -

-


-

  ordinary paragraph
-  _1 indent paragraph one step
-  _2 indent paragraph two steps
-  _9 indent paragraph nine steps
-
-

-

resulting output: -

ordinary paragraph -

indent paragraph one step
- -

indent paragraph two steps
- -

indent paragraph nine steps
- -

markup example: -

-


-

  _* bullet text
-  _1* bullet text, first indent
-  _2* bullet text, two step indent
-
-

resulting output: -

* bullet text -

* bullet text, first indent
- -

* bullet text, two step indent
- -

Numbered List (not to be confused with headings/titles, (document structure)) - -

markup example: -

-


-

  # numbered list                numbered list 1., 2., 3, etc.
-  _# numbered list numbered list indented a., b., c., d., etc.
-
-

9.4 FOOTNOTES / ENDNOTES -

Footnotes and endnotes are not distinguished in -markup. They are automatically numbered. Depending on the output file format -(html, EPUB, odf, pdf etc.), the document output selected will have either -footnotes or endnotes. -

markup example: -

-


-

  ~{ a footnote or endnote }~
-
-

resulting output: -

[^12] -

markup example: -

-


-

  normal text~{ self contained endnote marker & endnote in one }~ continues
-
-

resulting output: -

normal text[^13] continues -
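
The automatic numbering described above can be sketched in a few lines of Python. This is an illustration only, not SiSU's implementation: number_notes is a hypothetical name, the [^n] marker style simply mirrors the output shown in this page, and notes that begin with an asterisk are deliberately left untouched.

```python
import re

# "~{ note text }~", excluding the unnumbered asterisk form "~{* ... }~"
NOTE = re.compile(r"~\{(?!\*)\s*(.*?)\s*\}~", re.DOTALL)


def number_notes(text):
    """Replace each inline note with a numbered marker and collect the notes.

    Returns (text_with_markers, list_of_note_texts). Hypothetical helper.
    """
    notes = []

    def replace(match):
        notes.append(match.group(1))
        return "[^%d]" % len(notes)

    return NOTE.sub(replace, text), notes


if __name__ == "__main__":
    body, notes = number_notes(
        "normal text~{ self contained endnote marker & endnote in one }~ continues"
    )
    print(body)
    print(notes)
```

Whether the collected notes are then placed at the foot of a page or at the end of the document is a property of the output format, not of the markup.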

markup example: -

-


-

  normal text ~{* unnumbered asterisk footnote/endnote, insert multiple
-asterisks if required }~ continues
-  normal text ~{** another unnumbered asterisk footnote/endnote }~ continues
-
-

resulting output: -

normal text [^*] continues -

normal text [^**] continues - -

markup example: -

-


-

  normal text ~[*  editors  notes,  numbered  asterisk  footnote/endnote  series
- ]~ continues
-  normal text ~[+  editors  notes,  numbered  plus  symbol  footnote/endnote  series
- ]~ continues
-
-

resulting output: -

normal text [^*3] continues -

normal text [^+2] continues - -

Alternative endnote pair notation for footnotes/endnotes: -

-


-

  % note the endnote marker
-  normal text~^ continues
-  ^~ endnote text following the paragraph in which the marker occurs
-
-

the standard and pair notation cannot be mixed in the same document -

- -

9.5 LINKS -

9.5.1 NAKED URLS WITHIN TEXT, DEALING WITH URLS -

urls found within -text are marked up automatically. A url within text is automatically hyperlinked -to itself and by default decorated with angled braces, unless it is -contained within a code block (in which case it is passed as normal -text), or escaped by a preceding underscore (in which case the decoration -is omitted). -

markup example: -

-


-

  normal text http://www.jus.uio.no/sisu continues
-
-

resulting output: -

normal text <http://www.jus.uio.no/sisu -> continues -

An - -

escaped url without decoration -

markup example: -

-


-

  normal text _http://www.jus.uio.no/sisu continues
-  deb http://www.jus.uio.no/sisu/archive unstable main non-free
-
-

resulting output: -

normal text <_http://www.jus.uio.no/sisu -> continues -

-deb <_http://www.jus.uio.no/sisu/archive -> unstable main non-free -

where a code -block is used there is neither decoration nor hyperlinking, code blocks - -

are discussed later in this document -

resulting output: -

-


-

  deb http://www.jus.uio.no/sisu/archive unstable main non-free
-  deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
-
-

To link text or an image to a url the markup is as follows -

markup example: - -

-


-

  about { SiSU }http://url.org markup
-
-

9.5.2 LINKING TEXT -

resulting output: -

about SiSU <http://www.jus.uio.no/sisu/ -> - -

markup -

A shortcut notation is available so the url link may also be provided - -

automatically as a footnote -

markup example: -

-


-

  about {~^ SiSU }http://url.org markup
-
-

resulting output: -

about SiSU <http://www.jus.uio.no/sisu/ -> [^14] markup -

- -

9.5.3 LINKING IMAGES -

markup example: -

-


-

  { tux.png 64x80 }image
-  % various url linked images
-  {tux.png 64x80
-  {GnuDebianLinuxRubyBetterWay.png 100x101
-  {~^ ruby_logo.png
-
-

resulting output: -

[ tux.png ] -

tux.png 64x80 -

[  ruby_logo  (png  missing) - ] [^15] -

GnuDebianLinuxRubyBetterWay.png 100x101 and Ruby -

linked url footnote - -

shortcut -

-


-

  {~^  [text  to  link] }http://url.org
-  % maps to: {  [text  to  link] }http://url.org ~{ http://url.org }~
-  % which produces hyper-linked text within a document/paragraph,
-  with an endnote providing the url for the text location used in the hyperlink
-
-
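
The "maps to" rule for the linked url footnote shortcut can be expressed as a text rewrite. The Python sketch below illustrates that rule only and makes no claim about how SiSU performs the expansion; expand_link_shortcut is a hypothetical name, and the url is assumed to run up to the next whitespace.

```python
import re

# "{~^ text }url" where the url runs up to the next whitespace (assumption)
SHORTCUT = re.compile(r"\{~\^\s*(?P<text>[^{}]*?)\s*\}(?P<url>\S+)")


def expand_link_shortcut(line):
    """Rewrite "{~^ text }url" as "{ text }url ~{ url }~". Hypothetical helper."""
    return SHORTCUT.sub(
        lambda m: "{ %s }%s ~{ %s }~" % (m.group("text"), m.group("url"), m.group("url")),
        line,
    )


if __name__ == "__main__":
    print(expand_link_shortcut("about {~^ SiSU }http://url.org markup"))
```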

-


-

  text marker *~name
-
-

note at a heading level the same is automatically achieved by providing -names to headings 1, 2 and 3 i.e. 2~[name] and 3~[name] or in the case of -auto-heading numbering, without further intervention. -

9.6 GROUPED TEXT -

9.6.1 - -

TABLES -

Tables may be prepared in either of two forms -

markup example: - -

-


-

  table{ c3; 40; 30; 30;
-  This is a table
-  this would become column two of row one
-  column three of row one is here
-  And here begins another row
-  column two of row two
-  column three of row two, and so on
-  }table
-
-

resulting output: -

 [table  omitted,  see  other  document  formats]
- -

a second form may be easier to work with in cases where there is not - -

much information in each column -

markup example: [^16] -

-


-

  !_ Table 3.1: Contributors to Wikipedia, January 2001 - June 2005
-  {table~h 24; 12; 12; 12; 12; 12; 12;}
-                                  |Jan. 2001|Jan. 2002|Jan. 2003|Jan. 2004|July
-2004|June 2006
-  Contributors*                   |       10|      472|    2,188|    9,653|
-  25,011|   48,721
-  Active contributors**           |        9|      212|      846|    3,228|
-   8,442|   16,945
-  Very active contributors***     |        0|       31|      190|      692|
-   1,639|    3,016
-  No. of English language articles|       25|   16,000|  101,000|  190,000|
-320,000|  630,000
-  No. of articles, all languages  |       25|   19,000|  138,000|  490,000|
-862,000|1,600,000
-  \* Contributed at least ten times; \** at least 5 times in last month;
-\*** more than 100 times in last month.
-
-

resulting output: -

Table 3.1: Contributors to Wikipedia, January 2001 -- June 2005 -

 [table  omitted,  see  other  document  formats]
- -

* Contributed at least ten times; ** at least 5 times in last month; -*** more than 100 times in last month. -

9.6.2 POEM -

basic markup: -

-


-

  poem{
-    Your poem here
-  }poem
-  Each verse in a poem is given a separate object number.
-
-

markup example: -

-


-

  poem{
-                      ‘Fury said to a
-                     mouse, That he
-                   met in the
-                 house,
-
-                both go to
-                  law:  I will
-                    prosecute
-                      YOU.  --Come,
-                         I’ll take no
-                          denial; We
-                       must have a
-                   trial:  For
-                really this
-             morning I’ve
-            nothing
-           to do.
-             Said the
-               mouse to the
-                 cur,
-                   a trial,
-                     dear Sir,
-                           With
-                       no jury
-                    or judge,
-                  would be
-                wasting
-               our
-                breath.
-
-                   judge, I’ll
-                     be jury,
-                           Said
-                      cunning
-                        old Fury:
-
-                        try the
-                           whole
-                            cause,
-                               and
-                          condemn
-                         you
-                        to
-                         death.
-  }poem
-
-

resulting output: -

’Fury said to a
- mouse, That he
- met in the
- house,
-
- both go to
- law: I will
- prosecute
- YOU. --Come,
- I’ll take no
- denial; We
- must have a
- trial: For
- really this
- morning I’ve
- nothing
- to do.
- Said the
- mouse to the
- cur,
- a trial,
- dear Sir,
- With
- no jury
- or judge,
- would be
- wasting
- our
- breath.
-
- judge, I’ll
- be jury,
- Said
- cunning
- old Fury:
-
- try the
- whole
- cause,
- and
- condemn
- you
- to
- death.
- -

9.6.3 GROUP -

basic markup: -

-


-

  group{
-    Your grouped text here
-  }group
-  A group is treated as an object and given a single object number.
-
-

markup example: -

-


-

  group{
-                      ’Fury said to a
-                     mouse, That he
-                   met in the
-                 house,
-
-                both go to
-                  law:  I will
-                    prosecute
-                      YOU.  --Come,
-                         I’ll take no
-                          denial; We
-                       must have a
-                   trial:  For
-                really this
-             morning I’ve
-            nothing
-           to do.
-             Said the
-               mouse to the
-                 cur,
-                   a trial,
-                     dear Sir,
-                           With
-                       no jury
-                    or judge,
-                  would be
-                wasting
-               our
-                breath.
-
-                   judge, I’ll
-                     be jury,
-                           Said
-                      cunning
-                        old Fury:
-
-                        try the
-                           whole
-                            cause,
-                               and
-                          condemn
-                         you
-                        to
-                         death.
-  }group
-
-

resulting output: -

’Fury said to a
- mouse, That he
- met in the
- house,
-
- both go to
- law: I will
- prosecute
- YOU. --Come,
- I’ll take no
- denial; We
- must have a
- trial: For
- really this
- morning I’ve
- nothing
- to do.
- Said the
- mouse to the
- cur,
- a trial,
- dear Sir,
- With
- no jury
- or judge,
- would be
- wasting
- our
- breath.
-
- judge, I’ll
- be jury,
- Said
- cunning
- old Fury:
-
- try the
- whole
- cause,
- and
- condemn
- you
- to
- death.
- -

9.6.4 CODE -

Code tags are used to escape regular sisu markup, and have -been used extensively within this document to provide examples of SiSU -markup. You cannot however use code tags to escape code tags. They are however -used in the same way as group or poem tags. -

A code-block is treated as -an object and given a single object number. [an option to number each line -of code may be considered at some later time] -

use of code tags instead of -poem compared, resulting output: -

-


-

                      ’Fury said to a
-                     mouse, That he
-                   met in the
-                 house,
-
-                both go to
-                  law:  I will
-                    prosecute
-                      YOU.  --Come,
-                         I’ll take no
-                          denial; We
-                       must have a
-                   trial:  For
-                really this
-             morning I’ve
-            nothing
-           to do.
-             Said the
-               mouse to the
-                 cur,
-                   a trial,
-                     dear Sir,
-                           With
-                       no jury
-                    or judge,
-                  would be
-                wasting
-               our
-                breath.
-
-                   judge, I’ll
-                     be jury,
-                           Said
-                      cunning
-                        old Fury:
-
-                        try the
-                           whole
-                            cause,
-                               and
-                          condemn
-                         you
-                        to
-                         death.
-
-

9.7 BOOK INDEX -

To make an index, append to a paragraph the book index terms that -relate to it, using an equal sign and curly braces. -

Currently two levels -are provided, a main term and if needed a sub-term. Sub-terms are separated -from the main term by a colon. -

-


-

    Paragraph containing main term and sub-term.
-    ={Main term:sub-term}
-
-

The index syntax starts on a new line, but there should not be an empty -line between paragraph and index markup. -

The structure of the resulting -index would be: -

-


-

    Main term, 1
-      sub-term, 1
-
-

Several terms may relate to a paragraph, they are separated by a semicolon. -If the term refers to more than one paragraph, indicate the number of paragraphs. - -

-


-

    Paragraph containing main term, second term and sub-term.
-    ={first term; second term: sub-term}
-
-

The structure of the resulting index would be: -

-


-

    First term, 1,
-    Second term, 1,
-      sub-term, 1
-
-

If multiple sub-terms appear under one paragraph, they are separated under -the main term heading from each other by a pipe symbol. -

-


-

    Paragraph containing main term, second term and sub-term.
-    ={Main term:sub-term+1|second sub-term}
-    A paragraph that continues discussion of the first sub-term
-
-

The plus one in the example provided indicates the first sub-term spans -one additional paragraph. The logical structure of the resulting index would -be: -

-


-

    Main term, 1,
-      sub-term, 1-3,
-      second sub-term, 1,
-
-
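
The separators described in this section (semicolon between terms, colon between a main term and its sub-terms, pipe between sub-terms, and +n for additional paragraphs) can be illustrated with a small parser. This Python sketch is an assumption-laden illustration, not SiSU's code: parse_index_markup is a hypothetical name, and the markup is assumed to be closed with a curly brace as in the earlier examples.

```python
import re


def _split_span(term):
    """Split "sub-term+1" into ("sub-term", 1); no "+n" means 0 extra paragraphs."""
    name, plus, extra = term.strip().partition("+")
    return name.strip(), (int(extra) if plus else 0)


def parse_index_markup(markup):
    """Parse "={...}" into a list of (main term, extra paragraphs, sub-terms).

    Each sub-term is itself a (name, extra paragraphs) pair. Hypothetical helper.
    """
    match = re.fullmatch(r"=\{(.*)\}", markup.strip())
    if match is None:
        raise ValueError("expected markup of the form ={...}")
    entries = []
    for entry in match.group(1).split(";"):
        main, _, subs = entry.partition(":")
        main_name, main_extra = _split_span(main)
        sub_terms = [_split_span(s) for s in subs.split("|") if s.strip()]
        entries.append((main_name, main_extra, sub_terms))
    return entries


if __name__ == "__main__":
    print(parse_index_markup("={Main term:sub-term+1|second sub-term}"))
    print(parse_index_markup("={first term; second term: sub-term}"))
```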

10. COMPOSITE DOCUMENTS MARKUP -

It is possible to build a document by -creating a master document that requires other documents. The documents -required may be complete documents that could be generated independently, -or they could be markup snippets, prepared so as to be easily available -to be placed within another text. If the calling document is a master document -(built from other documents), it should be named with the suffix .ssm. Within -this document you would provide information on the other documents that -should be included within the text. These may be other documents that would -be processed in a regular way (.sst, regular markup file), or markup bits prepared only for inclusion -within a master document (.ssi, insert/information). - 

A secondary file of the composite document is built prior to processing - -

with the same prefix and the suffix ._sst -

basic markup for importing a - -

document into a master document -

-


-

  << filename1.sst
-  << filename2.ssi
-
-
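
To see which files a master document pulls in, the "<< filename" lines can be collected with a short script. The Python sketch below is an illustration under the assumption that each include sits alone on its own line in the basic form shown above; included_files is a hypothetical name and this is not how SiSU itself assembles the composite ._sst file.

```python
import re

# "<< filename.sst" or "<< filename.ssi" alone on a line (basic form only)
INCLUDE = re.compile(r"^<<\s*(\S+\.(?:sst|ssi))$")


def included_files(master_text):
    """Return the file names a master (.ssm) document asks to include, in order."""
    names = []
    for raw in master_text.splitlines():
        match = INCLUDE.match(raw.strip())
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    master = ":A~ A composite document\n\n<< filename1.sst\n\n<< filename2.ssi\n"
    print(included_files(master))
```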

The form described above should be relied on. Within the Vim editor it -results in the text thus linked becoming hyperlinked to the document it -is calling in, which is convenient for editing. Alternative markup for the importation -of documents has been under consideration, and occasionally supported: - 

-


-

  << filename.ssi
-  <<{filename.ssi}
-  % using textlink alternatives
-  << |filename.ssi|@|^|
-
-

MARKUP SYNTAX HISTORY -

11. NOTES RELATED TO FILES-TYPES AND MARKUP SYNTAX - -

0.38 is substantially current, deprecated 0.16 supported, though file - 

names were changed at 0.37 -

* sisu --query=[sisu  version  [0.38] or ’history] - -

provides a short history of changes to SiSU markup -

0.57 (2007w34/4) -SiSU 0.57 is the same as 0.42 with the introduction of a shortcut to -use the headers @title and @creator in the first heading [expanded  using - the  and  @author:] -

-


-

  :A~ @title by @author
-
-

0.52 (2007w14/6) declared document type identifier at start of text/document: - -

.B SiSU
- 0.52 -

or, backward compatible using the comment marker: -

%
- SiSU 0.38 -

variations include ’ SiSU (text|master|insert) [version]’ and ’sisu-[version]’ - -

0.51 (2007w13/6) skins changed (simplified), markup unchanged -

0.42 (2006w27/4) -* (asterisk) type endnotes, used e.g. in relation to author -

SiSU 0.42 is the same as 0.38 with the introduction of some additional endnote types. - -

Introduces some variations on endnotes, in particular the use of the asterisk -

-


-

  ~{* for example for describing an author }~ and ~{** for describing a
-second author }~
-
-

* for example for describing an author -

** for describing a second author - -

and -

-


-

  ~[*  my  note  ]~ or ~[+  another  note  ]~
-
-

which numerically increments an asterisk and plus respectively -

*1 my note - -

+1 another note -

0.38 (2006w15/7) introduced new/alternative notation -for headers, e.g. @title: (instead of 0~title), and accompanying document -structure markup, :A,:B,:C,1,2,3 (maps to previous 1,2,3,4,5,6) -

SiSU -0.38 introduced alternative experimental header and heading/structure markers, - -

-


-

  @headername: and headers :A~ :B~ :C~ 1~ 2~ 3~
-
-

as the equivalent of: -

-


-

  0~headername and headers 1~ 2~ 3~ 4~ 5~ 6~
-
-

The internal document markup of SiSU 0.16 remains valid and standard, though note that SiSU 0.37 introduced a new file naming convention -

SiSU has in effect two sets of levels to be considered. Using 0.38 notation these are the A-C headings/levels, which precede ordinary paragraphs (pre-substantive text), and the 1-3 headings/levels, which are followed by ordinary text. This may be conceptualised as levels A,B,C, 1,2,3, and using such letter number notation, in effect: A must exist, optional B and C may follow in sequence (not strict); 1 must exist, optional 2 and 3 may follow in sequence. That is, there are two independent heading level sequences, A,B,C and 1,2,3 (using the 0.16 standard notation, 1,2,3 and 4,5,6). On the positive side, the 0.38 A,B,C,1,2,3 alternative makes explicit an aspect of structuring documents in SiSU that is not otherwise obvious to the newcomer (though it appears more complicated, it is more in your face and likely to be understood fairly quickly); the substantive text follows levels 1,2,3 and it is ’nice’ to do most work in those levels - -
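The correspondence between the two notations can be written down as a small lookup table. The Ruby sketch below is illustrative only: the constant and method names are assumptions, and the mapping follows the 0.38 history entry, where :A,:B,:C,1,2,3 maps to the previous 1,2,3,4,5,6.

```ruby
# 0.38 heading markers mapped to their 0.16 equivalents.
LEVELS_038_TO_016 = {
  ':A~' => '1~', ':B~' => '2~', ':C~' => '3~',
  '1~' => '4~', '2~' => '5~', '3~' => '6~'
}.freeze

# Hypothetical helper: rewrite the heading marker at the start of a line
# from 0.38 notation to 0.16 notation; other lines pass through unchanged.
def to_016_heading(line)
  line.sub(/\A(:[ABC]~|[123]~)/) { LEVELS_038_TO_016.fetch(Regexp.last_match(1)) }
end

puts to_016_heading(':A~ Book Title')   # 1~ Book Title
puts to_016_heading('1~ Chapter One')   # 4~ Chapter One
puts to_016_heading('ordinary text')    # ordinary text
```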

0.37 (2006w09/7) introduced new file naming convention, .sst (text), .ssm -(master), .ssi (insert), markup syntax unchanged -

SiSU 0.37 introduced new -file naming convention, using the file extensions .sst .ssm and .ssi to replace -.s1 .s2 .s3 .r1 .r2 .r3 and .si
- -

this is captured by the following file ’rename’ instruction: -

-


-

  rename 's/\.s[123]$/.sst/' *.s{1,2,3}
-  rename 's/\.r[123]$/.ssm/' *.r{1,2,3}
-  rename 's/\.si$/.ssi/' *.si
-
-
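For readers without the perl `rename` utility, the same extension mapping can be sketched in Ruby. This is an illustration rather than part of SiSU; `new_sisu_name` is a hypothetical helper that only computes the new name and leaves the actual renaming to the caller.

```ruby
# Hypothetical helper: map a pre-0.37 file name to the 0.37 naming convention.
def new_sisu_name(filename)
  case filename
  when /\.s[123]\z/ then filename.sub(/\.s[123]\z/, '.sst') # regular text
  when /\.r[123]\z/ then filename.sub(/\.r[123]\z/, '.ssm') # master document
  when /\.si\z/     then filename.sub(/\.si\z/, '.ssi')     # insert
  else filename                                             # nothing to change
  end
end

# The sample file names are invented for the example.
%w[book.s1 collection.r2 preface.si notes.sst].each do |old|
  puts "#{old} -> #{new_sisu_name(old)}"
end
```

To rename files in place, each pair could be passed to `File.rename(old, new_sisu_name(old))`.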

The internal document markup remains unchanged, from SiSU 0.16 -

0.35 (2005w52/3) -sisupod, zipped content file introduced -

0.23 (2005w36/2) utf-8 for markup file -

0.22 (2005w35/3) image dimensions may be omitted if rmagick is available to be relied upon -

0.20.4 (2005w33/4) header 0~links -

0.16 (2005w25/2) substantial changes introduced to make markup cleaner, header 0~title type, and headings [1-6]~ introduced, also percentage sign (%) at start of a text line as comment marker -

SiSU 0.16 (0.15 development branch) introduced the use of the header 0~ and headings/structure 1~ 2~ 3~ 4~ 5~ 6~ in place of the 0.1 header, heading/structure notation -

SiSU 0.1 headers and headings structure -represented by header 0{~ and headings/structure 1{ 2{ 3{ 4{~ 5{ 6{ -

12. -SISU FILETYPES -

SiSU has plaintext and binary filetypes, and can process -either type of document. -

12.1 .SST .SSM .SSI MARKED UP PLAIN TEXT -

SiSU documents are prepared as plain-text (utf-8) files with SiSU markup. They may make reference to and contain images (for example), which are stored in the directory beneath them, _sisu/image. SiSU plaintext markup files are of three types that may be distinguished by the file extension used: regular text, .sst; master documents, .ssm, composite documents that incorporate other text, which can be any regular text or text insert; and inserts, the contents of which are like regular text except that these are marked .ssi and are not processed on their own.
- -

SiSU processing can be done directly against SiSU documents, which may be located locally or on a remote server for which a url is provided. - -

SiSU source markup can be shared with the command: -

sisu -s [filename]
- -

12.1.1 SISU TEXT - REGULAR FILES (.SST) -

The most common form of document -in SiSU , see the section on SiSU markup. -

<http://www.jus.uio.no/sisu/sisu_markup -> - -

<http://www.jus.uio.no/sisu/sisu_manual -> -

12.1.2 SISU MASTER FILES (.SSM) -

Composite documents which incorporate other SiSU documents. These may be either regular SiSU text (.sst), which may be generated independently, or inserts prepared solely for the purpose of being incorporated into one or more master documents. - -

The mechanism by which master files incorporate other documents is described as one of the headings under SiSU markup in the SiSU manual. -

Note: -Master documents may be prepared in a similar way to regular documents, -and processing will occur normally if a .sst file is renamed .ssm without -requiring any other documents; the .ssm marker flags that the document may -contain other documents. -

Note: a secondary file of the composite document -is built prior to processing with the same prefix and the suffix ._sst [^17] - -

<http://www.jus.uio.no/sisu/sisu_markup -> -

<http://www.jus.uio.no/sisu/sisu_manual -> - -

12.1.3 SISU INSERT FILES (.SSI) -

Inserts are documents prepared solely for -the purpose of being incorporated into one or more master documents. They -resemble regular SiSU text files except they are ignored by the SiSU processor. -Making a file a .ssi file is a quick and convenient way of flagging that -it is not intended that the file should be processed on its own. -

12.2 SISUPOD, -ZIPPED BINARY CONTAINER (SISUPOD.ZIP, .SSP) -

A sisupod is a zipped SiSU -text file or set of SiSU text files and any associated images that they -contain (this will be extended to include sound and multimedia-files) -

-SiSU plaintext files rely on a recognised directory structure to find contents such as images associated with documents; all images, for example, for all documents contained in a directory are located in the sub-directory _sisu/image. Without the ability to create a sisupod it can be inconvenient to manually identify all other files associated with a document. A sisupod automatically bundles all associated files with the document that is turned into a pod. -

The structure of the sisupod is such that it may for example -contain a single document and its associated images; a master document -and its associated documents and anything else; or the zipped contents -of a whole directory of prepared SiSU documents. -

The command to create -a sisupod is: -

sisu -S [filename]
- -

Alternatively, make a pod of the contents of a whole directory: -

-sisu -S
- -

SiSU processing can be done directly against a sisupod; which may be -located locally or on a remote server for which a url is provided. -

<http://www.jus.uio.no/sisu/sisu_commands -> - -

<http://www.jus.uio.no/sisu/sisu_manual -> -

13. EXPERIMENTAL ALTERNATIVE INPUT -REPRESENTATIONS -

13.1 ALTERNATIVE XML -

SiSU offers alternative XML input -representations of documents as a proof of concept, experimental feature. -They are however not strictly maintained, and incomplete and should be -handled with care. -

convert from sst to simple xml representations (sax, -dom and node): -

sisu --to-sax [filename/wildcard] or sisu --to-sxs [filename/wildcard]
- -

sisu --to-dom [filename/wildcard] or sisu --to-sxd [filename/wildcard]
- -

sisu --to-node [filename/wildcard] or sisu --to-sxn [filename/wildcard]
- -

convert to sst from any sisu xml representation (sax, dom and node): - -

sisu --from-xml2sst [filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

or the same: -

sisu --from-sxml [filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

13.1.1 XML SAX REPRESENTATION -

To convert from sst to simple xml (sax) -representation: -

sisu --to-sax [filename/wildcard] or sisu --to-sxs [filename/wildcard]
- -

To convert from any sisu xml representation back to sst -

sisu --from-xml2sst -[filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

or the same: -

sisu --from-sxml [filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

13.1.2 XML DOM REPRESENTATION -

To convert from sst to simple xml (dom) -representation: -

sisu --to-dom [filename/wildcard] or sisu --to-sxd [filename/wildcard]
- -

To convert from any sisu xml representation back to sst -

sisu --from-xml2sst -[filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

or the same: -

sisu --from-sxml [filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

13.1.3 XML NODE REPRESENTATION -

To convert from sst to simple xml (node) -representation: -

sisu --to-node [filename/wildcard] or sisu --to-sxn [filename/wildcard]
- -

To convert from any sisu xml representation back to sst -

sisu --from-xml2sst -[filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

or the same: -

sisu --from-sxml [filename/wildcard  [.sxs.xml,.sxd.xml,sxn.xml]]
- -

14. CONFIGURATION -

14.1 DETERMINING THE CURRENT CONFIGURATION -

Information -on the current configuration of SiSU should be available with the help -command: -

sisu -v
- -

which is an alias for: -

sisu --help env
- -

Either of these should be executed from within a directory that contains -sisu markup source documents. -

14.2 CONFIGURATION FILES (CONFIG.YML) -

SiSU configuration parameters are adjusted in the configuration file, which can be used to override the defaults set. This includes such things as which directory interim processing should be done in and where the generated output should be placed. -

The SiSU configuration file is a yaml file, which -means indentation is significant. -

SiSU resource configuration is determined -by looking at the following files if they exist: -

./_sisu/sisurc.yml
- -

~/.sisu/sisurc.yml
- -

/etc/sisu/sisurc.yml
- -

The search is in the order listed, and the first one found is used. -

- In the absence of instructions in any of these it falls back to the internal -program defaults. -
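The search order can be sketched as a first-match lookup. The Ruby below is a minimal illustration, not SiSU's code: the helper names are assumptions, and the existence check is passed in so that the example can be run against a pretend file system.

```ruby
# Candidate configuration files, in the search order listed above.
def sisurc_candidates(working_dir, home_dir)
  [
    File.join(working_dir, '_sisu', 'sisurc.yml'),
    File.join(home_dir, '.sisu', 'sisurc.yml'),
    '/etc/sisu/sisurc.yml'
  ]
end

# Hypothetical helper: return the first candidate that exists, or nil
# when none does (the internal program defaults would then apply).
def find_sisurc(candidates, exists = File.method(:exist?))
  candidates.find { |path| exists.call(path) }
end

candidates = sisurc_candidates('/home/ralph/ebook', '/home/ralph')
# Pretend only the per-user and system-wide files exist.
present = ['/home/ralph/.sisu/sisurc.yml', '/etc/sisu/sisurc.yml']
puts find_sisurc(candidates, ->(path) { present.include?(path) })
```

Called with the default second argument, `find_sisurc` checks the real file system instead.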

Configuration determines the output and processing directories -and the database access details. -

If SiSU is installed a sample sisurc.yml may be found in /etc/sisu/sisurc.yml -

15. SKINS -

Skins modify the default -appearance of document output on a document, directory, or site wide basis. -Skins are looked for in the following locations: -

./_sisu/skin
- -

~/.sisu/skin
- -

/etc/sisu/skin
- -

Within the skin directory are the following default sub-directories for document skins: -

./skin/doc
- -

./skin/dir
- -

./skin/site
- -

A skin is placed in the appropriate directory and the file named skin_[name].rb - -

The skin itself is a ruby file which modifies the default appearances -set in the program. -

15.1 DOCUMENT SKIN -

Documents take on a document skin, -if the header of the document specifies a skin to be used. -

-


-

  @skin: skin_united_nations
-
-

15.2 DIRECTORY SKIN -

A directory may be mapped on to a particular skin, -so all documents within that directory take on a particular appearance. -If a skin exists in the skin/dir with the same name as the document directory, -it will automatically be used for each of the documents in that directory, -(except where a document specifies the use of another skin, in the skin/doc -directory). -

A personal habit is to place all skins within the doc directory, -and symbolic links as needed from the site, or dir directories as required. - -

15.3 SITE SKIN -

A site skin modifies the program default skin. -
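How the three kinds of skin interact can be sketched as a simple precedence rule. The text states that a skin named in the document header overrides the directory skin; that a site skin is consulted next, before the program defaults, is an assumption of this sketch, as are the helper name and the data layout.

```ruby
# Hypothetical helper: pick the skin for a document.
# header_skin    - value of the document's @skin: header, or nil
# directory_name - name of the directory holding the document
# available      - hash with :doc, :dir and :site arrays of skin names
def choose_skin(header_skin, directory_name, available)
  return [:doc, header_skin] if header_skin && available[:doc].include?(header_skin)

  dir_skin = "skin_#{directory_name}"
  return [:dir, dir_skin] if available[:dir].include?(dir_skin)

  site_skin = available[:site].first
  site_skin ? [:site, site_skin] : [:default, nil]
end

skins = { doc: ['skin_united_nations'], dir: ['skin_poems'], site: [] }
p choose_skin('skin_united_nations', 'poems', skins)
p choose_skin(nil, 'poems', skins)
p choose_skin(nil, 'sales_law', skins)
```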

15.4 SAMPLE SKINS -

With SiSU installed sample skins may be found in: -

/etc/sisu/skin/doc -and
- /usr/share/doc/sisu/v2/sisu_markup_samples/samples/_sisu/skin/doc
- -

(or equivalent directory) and if sisu-markup-samples is installed also -under: -

/usr/share/doc/sisu-markup-samples/v2/samples/_sisu/skin/doc
- -

Samples of list.yml and promo.yml (which are used to create the right -column list) may be found in: -

/usr/share/doc/sisu/sisu_markup_samples/dfsg/_sisu/skin/yml -(or equivalent
- directory)
- -

16. CSS - CASCADING STYLE SHEETS (FOR HTML, XHTML AND XML) -

CSS files -to modify the appearance of SiSU html, XHTML or XML may be placed in the -configuration directory: ./_sisu/css ; ~/.sisu/css or; /etc/sisu/css and -these will be copied to the output directories with the command sisu -CC. - -

The basic CSS file for html output is html.css, placing a file of that -name in directory _sisu/css or equivalent will result in the default file -of that name being overwritten. -

HTML: html.css -

XML DOM: dom.css -

XML -SAX: sax.css -

XHTML: xhtml.css -

The default homepage may use homepage.css or html.css -

Under consideration is to permit the placement of a CSS file -with a different name in directory _sisu/css directory or equivalent, and -change the default CSS file that is looked for in a skin.[^18] -

17. ORGANISING -CONTENT -

17.1 DIRECTORY STRUCTURE AND MAPPING -

The output directory root can be set in the sisurc.yml file. Under the root, subdirectories are made for each directory in which a document set resides. If you have a directory named poems or conventions, that directory will be created under the output directory root and the output for all documents contained in the directory of a particular name will be generated to subdirectories beneath that directory (poems or conventions). A document will be placed in a subdirectory of the same name as the document with the filetype identifier stripped (.sst .ssm) - -

The last part of a directory path, representing the sub-directory in which -a document set resides, is the directory name that will be used for the -output directory. This has implications for the organisation of document -collections as it could make sense to place documents of a particular subject, -or type within a directory identifying them. This grouping as suggested -could be by subject (sales_law, english_literature); or just as conveniently -by some other classification (X University). The mapping means it is also -possible to place in the same output directory documents that are for organisational -purposes kept separately, for example documents on a given subject of two -different institutions may be kept in two different directories of the -same name, under a directory named after each institution, and these would -be output to the same output directory. Skins could be associated with each -institution on a directory basis and resulting documents will take on the -appropriate different appearance. -
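The mapping described above can be sketched as a small path calculation. This Ruby is illustrative only: `output_directory` is a hypothetical helper, and the institution and file names in the example are invented.

```ruby
# Hypothetical helper: where output for a source file would be placed,
# given the output directory root set in sisurc.yml.
def output_directory(output_root, source_path)
  collection = File.basename(File.dirname(source_path))          # last part of the directory path
  document   = File.basename(source_path).sub(/\.ss[tm]\z/, '')  # strip .sst / .ssm
  File.join(output_root, collection, document)
end

# Two institutions keep documents in separate directories of the same
# name; both map to the same output directory.
puts output_directory('/home/ralph/sisu_www', '/home/ralph/x_university/sales_law/un_contracts.sst')
puts output_directory('/home/ralph/sisu_www', '/home/ralph/y_institute/sales_law/incoterms.ssm')
```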

-

18. HOMEPAGES -

SiSU is about the ability to auto-generate documents. Home pages are regarded as custom built items, and are not created by SiSU. SiSU has a default home page, which will not be appropriate for use with other sites, and the means to provide your own home page instead in one of two ways as part of a site’s configuration, these being: -

1. through placing your home page and other custom built documents in the subdirectory _sisu/home/ (this probably being the easier and more convenient option) -

2. through providing what you want as the home page in a skin. -

Document sets are contained in directories, usually -organised by site or subject. Each directory can/should have its own homepage. -See the section on directory structure and organisation of content. -

18.1 HOME PAGE AND OTHER CUSTOM BUILT PAGES IN A SUB-DIRECTORY -

Custom built pages, including the home page index.html, may be placed within the configuration directory _sisu/home/ in any of the locations that are searched for the configuration directory, namely ./_sisu ; ~/.sisu ; /etc/sisu. From there they are copied to the root of the output directory with the command: -

- sisu -CC
- -

18.2 HOME PAGE WITHIN A SKIN -

Skins are described in a separate section, but basically are a file written in the programming language Ruby that may be provided to change the defaults that are provided with sisu with respect to individual documents, a directory’s contents or for a site. -

- If you wish to provide a homepage within a skin the skin should be in the directory _sisu/skin/dir and have the name of the directory for which it is to become the home page. Documents in the directory commercial_law would have the homepage modified in skin_commercial_law.rb; or the directory poems in skin_poems.rb -

-


-

    class Home
-      def homepage
-        # place the html content of your homepage here, this will become index.html
-        <<-HOME
-  <html>
-  <head></head>
-  <doc>
-  <p>this is my new homepage.</p>
-  </doc>
-  </html>
-  HOME
-      end
-    end
-
-

19. MARKUP AND OUTPUT EXAMPLES -

19.1 MARKUP EXAMPLES -

Current markup examples -and document output samples are provided at <http://www.jus.uio.no/sisu/SiSU/examples.html -> - -

Some markup with syntax highlighting may be found under <http://www.jus.uio.no/sisu/sample/syntax -> -but is not as up to date. -

For some documents hardly any markup is required at all, other than a header and an indication of the levels to be taken into account by the program in generating its output. -

20. -SISU SEARCH - INTRODUCTION -

SiSU output can easily and conveniently be -indexed by a number of standalone indexing tools, such as Lucene, Hyperestraier. - -

Because the document structure of sites created is clearly defined, and -the text object citation system is available hypothetically at least, for -all forms of output, it is possible to search the sql database, and either -read results from that database, or just as simply map the results to the -html output, which has richer text markup. -

In addition to this SiSU has the ability to populate a relational sql type database with documents at an object level, with object numbers that are shared across different output types, which makes them searchable with that degree of granularity. Basically, your match criteria are met by these documents and at these locations within each document, which can be viewed within the database directly or in various output formats. -

21. SQL -

21.1 POPULATING SQL TYPE DATABASES - -

SiSU feeds SiSU marked up documents into sql type databases, PostgreSQL[^19] and/or SQLite[^20], together with information related to document structure. -

This is one of the more interesting output forms, as all the -structural data of the documents are retained (though can be ignored by -the user of the database should they so choose). All site texts/documents -are (currently) streamed to four tables: -

* one containing semantic (and other) headers, including title, author, subject (the Dublin Core...);
- -

* another the substantive texts by individual "paragraph" (or object), along with structural information, each paragraph being identifiable by its paragraph number (if it has one, which almost all of them do), and the substantive text of each paragraph quite naturally being searchable (both in formatted and clean text versions for searching); and
- -

* a third containing endnotes cross-referenced back to the paragraph from which they are referenced (both in formatted and clean text versions for searching).
- -

* a fourth table with a one to one relation with the headers table contains full text versions of output, e.g. pdf, html, xml, and ascii.

There is of course the possibility to add further structures. -

At this -level SiSU loads a relational database with documents chunked into objects, -their smallest logical structurally constituent parts, as text objects, -with their object citation number and all other structural information -needed to construct the document. Text is stored (at this text object level) -with and without elementary markup tagging, the stripped version being -so as to facilitate ease of searching. -
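The idea of storing numbered text objects with both a marked-up and a stripped copy can be sketched in a few lines of Ruby. This is not SiSU's implementation: the `TextObject` struct, the helper names, the blank-line splitting rule and the two markup forms stripped here are all assumptions made for illustration.

```ruby
# One text object: its citation number, the marked-up text, and a
# stripped copy kept for searching.
TextObject = Struct.new(:ocn, :text, :clean)

# Hypothetical helper: split text into blank-line separated objects and
# number them in order. The markup stripped here (*{...}* and /{...}/)
# stands in for "elementary markup tagging" and is an assumption.
def chunk(document_text)
  paragraphs = document_text.split(/\n{2,}/).map(&:strip).reject(&:empty?)
  paragraphs.each_with_index.map do |text, i|
    clean = text.gsub(/[*\/]\{\s*(.*?)\s*\}[*\/]/, '\1')
    TextObject.new(i + 1, text, clean)
  end
end

# Return the citation numbers of objects whose clean text matches.
def search(objects, pattern)
  objects.select { |object| object.clean.match?(pattern) }.map(&:ocn)
end

objects = chunk("First paragraph.\n\nSecond, with *{emphasis}* in it.\n\nThird paragraph.")
p search(objects, /with emphasis/)   # [2]
p search(objects, /paragraph/)       # [1, 3]
```

Because the numbers are assigned once and stored with the text, a match can be reported as a location that holds for every output format sharing the numbering.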

Being able to search a relational database at an object level with the SiSU citation system is an effective way of locating content generated by SiSU. As text objects are stored together with their object numbers, and all versions of the document have the same numbering, complex searches can be tailored to return just the locations of the search results relevant for all available output formats, with live links to the precise locations in the database or in html/xml documents; or, the structural information provided makes it possible to search the full contents of the database and have headings in which search content appears, or to search only headings etc. (as the Dublin Core is incorporated it is easy to make use of that as well). -

22. -POSTGRESQL -

22.1 NAME -

SiSU - Structured information, Serialized Units -- a document publishing system, postgresql dependency package -

22.2 DESCRIPTION - -

Information related to using postgresql with sisu (and related to the -sisu_postgresql dependency package, which is a dummy package to install -dependencies needed for SiSU to populate a postgresql database, this being -part of SiSU - man sisu). -

22.3 SYNOPSIS -

sisu -D [instruction] [filename/wildcard - if  required]
- -

sisu -D --pg --[instruction] [filename/wildcard  if  required]
- -

22.4 COMMANDS -

Mappings to two databases are provided by default, postgresql -and sqlite, the same commands are used within sisu to construct and populate -databases however -d (lowercase) denotes sqlite and -D (uppercase) denotes -postgresql, alternatively --sqlite or --pgsql may be used -

-D or --pgsql may -be used interchangeably. -

22.4.1 CREATE AND DESTROY DATABASE -

-

- -
--pgsql --createall -
-
initial step, creates required relations (tables, indexes) in existing -(postgresql) database (a database should be created manually and given -the same name as working directory, as requested) (rb.dbi) -

- -
sisu -D --createdb -
-
-

creates database where no database existed before -

- -
sisu -D --create
-
creates database tables where no database tables existed before -

- -
sisu -D --Dropall -
-
destroys database (including all its content)! kills data and drops tables, -indexes and database associated with a given directory (and directories -of the same name). -

- -
sisu -D --recreate
-
destroys existing database and builds a new empty database structure -

-
-22.4.2 IMPORT AND REMOVE DOCUMENTS -

-

- -
sisu -D --import -v [filename/wildcard]
-
populates database with the contents of the file. Imports document(s) specified to a postgresql database (at an object level). -

- -
sisu -D --update -v [filename/wildcard]
-
updates file contents in database -

- -
sisu -D --remove -v [filename/wildcard]
-
removes specified document -from postgresql database. -

-
-23. SQLITE -

23.1 NAME -

SiSU - Structured information, -Serialized Units - a document publishing system. -

23.2 DESCRIPTION -

Information -related to using sqlite with sisu (and related to the sisu_sqlite dependency -package, which is a dummy package to install dependencies needed for SiSU -to populate an sqlite database, this being part of SiSU - man sisu). -

23.3 SYNOPSIS -

sisu -d [instruction] [filename/wildcard  if  required]
- -

sisu -d --(sqlite|pg) --[instruction] [filename/wildcard  if
- required]
- -

23.4 COMMANDS -

Mappings to two databases are provided by default, postgresql -and sqlite, the same commands are used within sisu to construct and populate -databases however -d (lowercase) denotes sqlite and -D (uppercase) denotes -postgresql, alternatively --sqlite or --pgsql may be used -

-d or --sqlite may -be used interchangeably. -

23.4.1 CREATE AND DESTROY DATABASE -

-

- -
--sqlite --createall -
-
initial step, creates required relations (tables, indexes) in existing -(sqlite) database (a database should be created manually and given the -same name as working directory, as requested) (rb.dbi) -

- -
sisu -d --createdb -
-
-

creates database where no database existed before -

- -
sisu -d --create
-
creates database tables where no database tables existed before -

- -
sisu -d --dropall -
-
destroys database (including all its content)! kills data and drops tables, -indexes and database associated with a given directory (and directories -of the same name). -

- -
sisu -d --recreate
-
destroys existing database and builds a new empty database structure -

-
-23.4.2 IMPORT AND REMOVE DOCUMENTS -

-

- -
sisu -d --import -v [filename/wildcard]
-
populates database with the contents of the file. Imports document(s) specified to an sqlite database (at an object level). -

- -
sisu -d --update -v [filename/wildcard]
-
updates file contents in database - -

- -
sisu -d --remove -v [filename/wildcard]
-
removes specified document from sqlite -database. -
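Since the two backends share their instructions and differ only in the case of the flag, the command lines can be assembled by one routine. The Ruby below is a sketch under that reading: `sisu_db_command` is a hypothetical helper, only flags listed in the command summaries are used, and the file name is invented.

```ruby
# Backend flags and instructions taken from the command summaries above.
BACKEND_FLAG = { sqlite: '-d', postgresql: '-D' }.freeze
INSTRUCTIONS = %w[createdb create recreate import update remove].freeze

# Hypothetical helper: build the argument list for a sisu database command.
def sisu_db_command(backend, instruction, *files)
  flag = BACKEND_FLAG.fetch(backend)
  raise ArgumentError, "unknown instruction: #{instruction}" unless INSTRUCTIONS.include?(instruction)

  ['sisu', flag, "--#{instruction}", *files]
end

puts sisu_db_command(:sqlite, 'createdb').join(' ')
puts sisu_db_command(:postgresql, 'import', 'un_contracts.sst').join(' ')
```

The resulting array could be handed to `system(*command)` on a machine where sisu is installed.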

-
-24. INTRODUCTION -

24.1 SEARCH - DATABASE FRONTEND SAMPLE, UTILISING -DATABASE AND SISU FEATURES, INCLUDING OBJECT CITATION NUMBERING (BACKEND -CURRENTLY POSTGRESQL) -

Sample search frontend <http://search.sisudoc.org> [^21] A small database and sample query front-end (search form) that makes use of the citation system, object citation numbering, to demonstrate functionality.[^22] - -

SiSU can provide information on which documents are matched and at what -locations within each document the matches are found. These results are -relevant across all outputs using object citation numbering, which includes -html, XML, EPUB, LaTeX, PDF and indeed the SQL database. You can then refer -to one of the other outputs or in the SQL database expand the text within -the matched objects (paragraphs) in the documents matched. -

Note you may -set results either for documents matched and object number locations within -each matched document meeting the search criteria; or display the names -of the documents matched along with the objects (paragraphs) that meet -the search criteria.[^23] -

-

- -
sisu -F --webserv-webrick
-
builds a cgi web search frontend for the database created -

The following is feedback on the setup -on a machine provided by the help command: -

sisu --help sql
- -

-


-

  Postgresql
-    user:             ralph
-    current db set:   SiSU_sisu
-    port:             5432
-    dbi connect:      DBI:Pg:database=SiSU_sisu;port=5432
-  sqlite
-    current db set:   /home/ralph/sisu_www/sisu/sisu_sqlite.db
-    dbi connect       DBI:SQLite:/home/ralph/sisu_www/sisu/sisu_sqlite.db
-
-

Note on databases built -

By default, [unless  otherwise  specified] databases -are built on a directory basis, from collections of documents within that -directory. The name of the directory you choose to work from is used as -the database name, i.e. if you are working in a directory called /home/ralph/ebook -the database SiSU_ebook is used. [otherwise  a  manual  mapping  for  the  collection - is -

-
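The default naming rule for databases can be expressed in one line of Ruby. The helper name `database_name` is an assumption made for this illustration; the rule itself follows the /home/ralph/ebook example given in the text.

```ruby
# Hypothetical helper: database name derived from the working directory,
# following the SiSU_[directory name] pattern described in the text.
def database_name(working_directory)
  "SiSU_#{File.basename(working_directory)}"
end

puts database_name('/home/ralph/ebook')   # SiSU_ebook
```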
-24.2 SEARCH FORM -

-

- -
sisu -F
-
generates a sample search form, which must be copied to the web-server cgi directory -

- -
sisu -F --webserv-webrick
-
generates a sample search form for use with the webrick server, which must be copied to the web-server cgi directory -

- -
sisu -Fv
-
as above, and provides some information on setting up hyperestraier -

- -
sisu -W
-
starts the webrick server which should be available wherever sisu is properly installed -

The generated search form must be copied manually to the webserver directory as instructed -

-

-
-25. HYPERESTRAIER -

See the documentation for hyperestraier: -

<http://hyperestraier.sourceforge.net/ ->
- -

/usr/share/doc/hyperestraier/index.html
- -

man estcmd
- -

NOTE: the examples that follow assume that sisu output is placed in the directory /home/ralph/sisu_www -

(A) to generate the index within the -webserver directory to be indexed: -

estcmd gather -sd [index  name] [directory - path  to  index]
- -

the following are examples that will need to be tailored according to -your needs: -

cd /home/ralph/sisu_www
- -

estcmd gather -sd casket /home/ralph/sisu_www
- -

you may use the ’find’ command together with ’egrep’ to limit indexing to -particular document collection directories within the web server directory: - -

find /home/ralph/sisu_www -type f | egrep '/home/ralph/sisu_www/sisu/.+?.html$' | estcmd gather -sd casket -
- -

Check which directories in the webserver/output directory (~/sisu_www -or elsewhere depending on configuration) you wish to include in the search -index. -

As sisu duplicates output in multiple file formats, it is probably preferable to limit the estraier index to html output, and it may also be desirable to exclude files such as ’plain.txt’, ’toc.html’ and ’concordance.html’, as these duplicate information held in other html output, e.g. -

find /home/ralph/sisu_www -type f | egrep '/sisu_www/(sisu|bookmarks)/.+?.html$' | egrep -v '(doc|concordance).html$' | estcmd gather -sd casket -
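The include and exclude filtering done with egrep can be mirrored in Ruby, which may be easier to adapt when the selection rules grow. This is a sketch, not part of SiSU or Hyper Estraier: the constant and helper names are assumptions, the patterns are escaped versions of the ones used above, and the sample paths are invented.

```ruby
# Directories to include, and html files to leave out of the index.
INCLUDE_PATTERN = %r{/sisu_www/(sisu|bookmarks)/.+\.html\z}
EXCLUDE_PATTERN = /(doc|concordance)\.html\z/

# Hypothetical helper: keep only the paths that should be indexed.
def paths_to_index(paths)
  paths.select { |path| path.match?(INCLUDE_PATTERN) && !path.match?(EXCLUDE_PATTERN) }
end

candidates = [
  '/home/ralph/sisu_www/sisu/sisu_manual/index.html',
  '/home/ralph/sisu_www/sisu/sisu_manual/concordance.html',
  '/home/ralph/sisu_www/sisu/sisu_manual/doc.html',
  '/home/ralph/sisu_www/sisu/sisu_manual/plain.txt',
  '/home/ralph/sisu_www/poems/index.html'
]
puts paths_to_index(candidates)
```

The selected paths, printed one per line, have the form that the find and egrep pipeline passes to estcmd on standard input.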
- -

from your current document preparation/markup directory, you would construct -a rune along the following lines: -

find /home/ralph/sisu_www -type f | egrep '/home/ralph/sisu_www/([specify first directory for inclusion]|[specify second directory for inclusion]|[another directory for inclusion? ...])/.+?.html$' | egrep -v '(doc|concordance).html$' | estcmd gather -sd /home/ralph/sisu_www/casket -
- -

(B) to set up the search form -

(i) copy estseek.cgi to your cgi directory -and set file permissions to 755: -

sudo cp -vi /usr/lib/estraier/estseek.cgi -/usr/lib/cgi-bin
- -

sudo chmod -v 755 /usr/lib/cgi-bin/estseek.cgi
- -

sudo cp -v /usr/share/hyperestraier/estseek.* /usr/lib/cgi-bin
- -

[see  estraier  documentation  for  paths]
- -

(ii) edit estseek.conf, with attention to the lines starting ’indexname:’ -and ’replace:’: -

indexname: /home/ralph/sisu_www/casket
- -

replace: ^file:///home/ralph/sisu_www{{!}}http://localhost -
- -

replace: /index.html?${{!}}/
- -

(C) to test using webrick, start webrick: -

sisu -W
- -

and try opening the url: <http://localhost:8081/cgi-bin/estseek.cgi> -

26. SISU_WEBRICK - -

26.1 NAME -

SiSU - Structured information, Serialized Units - a document publishing system -

26.2 SYNOPSIS -

sisu_webrick [port] -

or -

sisu -W [port] - -

26.3 DESCRIPTION -

sisu_webrick is part of SiSU (man sisu). sisu_webrick starts Ruby’s Webrick web-server and points it to the directories to which SiSU output is written, providing a list of these directories (assuming SiSU is in use and they exist). -

The default port for sisu_webrick is set -to 8081, this may be modified in the yaml file: ~/.sisu/sisurc.yml a sample -of which is provided as /etc/sisu/sisurc.yml (or in the equivalent directory -on your system). -

26.4 SUMMARY OF MAN PAGE -

sisu_webrick may be started on its own with the command: sisu_webrick [port] or using the sisu command with the -W flag: sisu -W [port] -

where no port is given and settings are unchanged the default port is 8081 -

26.5 DOCUMENT PROCESSING COMMAND FLAGS - -

sisu -W [port] starts Ruby Webrick web-server, serving SiSU output directories, on the port provided, or if no port is provided and the defaults have not been changed in ~/.sisu/sisurc.yml then on port 8081 -
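The way the port is chosen can be sketched as a fallback chain. The Ruby below is illustrative: `webrick_port` is a hypothetical helper, and how the configured value is read from sisurc.yml is left out because the relevant key is not given in the text.

```ruby
DEFAULT_PORT = 8081

# Hypothetical helper: the port given on the command line wins, then a
# port read from sisurc.yml, then the default.
def webrick_port(cli_port = nil, configured_port = nil)
  Integer(cli_port || configured_port || DEFAULT_PORT)
end

puts webrick_port            # 8081
puts webrick_port('8123')    # 8123
puts webrick_port(nil, 8090) # 8090
```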

26.6 FURTHER INFORMATION - -

For more information on SiSU see: <http://www.jus.uio.no/sisu -> -

or man sisu - -

26.7 AUTHOR -

Ralph Amissah ralph@amissah.com or ralph.amissah@gmail.com -

- -

26.8 SEE ALSO -

sisu(1) -
- -

sisu_vim(7) -
- -

sisu(8) -
- -

27. REMOTE SOURCE DOCUMENTS -

SiSU processing instructions can be run against remote source documents by providing the url of the documents against which the processing instructions are to be carried out. The remote SiSU documents can either be sisu marked up files in plaintext, .sst or .ssm; or zipped sisu files, sisupod.zip or filename.ssp -

.sst / .ssm - sisu text files - -

SiSU can be run against source text files on a remote machine; provide the processing instruction and the url. The source file and any associated parts (such as images) will be downloaded and generated locally. -

-


-

  sisu -3 http://[provide url to valid .sst or .ssm file]
-
-

Any of the source documents in the sisu examples page can be used in this way, see <http://www.jus.uio.no/sisu/SiSU/examples.html> and use the url for the desired document.

NOTE: to set up a remote machine to serve SiSU documents in this way, images should be in the directory ../_sisu/image, relative to the document source.

sisupod - zipped sisu files -

A sisupod is the zipped content of a sisu marked up text or texts and any other associated parts of the document, such as images.

SiSU can be run against a sisupod on a (local or) remote machine: provide the processing instruction and the url. The sisupod will be downloaded and the documents it contains generated locally.

-


-

  sisu -3 http://[provide url to valid sisupod.zip or .ssp file]
-
-

Any of the source documents in the sisu examples page can be used in this way, see <http://www.jus.uio.no/sisu/SiSU/examples.html> and use the url for the desired document.
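The two kinds of remote source described in this section (.sst / .ssm text files and zipped sisupods) can be told apart by suffix before the url is handed to sisu. The following is a rough sketch only: the function name sisu_remote_kind and the example url are hypothetical, and treating every .zip as a sisupod is an assumption.

```sh
#!/bin/sh
# Hypothetical helper (not part of SiSU): classify a url by its suffix.
sisu_remote_kind() {
  case "$1" in
    *.sst|*.ssm) echo "source" ;;    # sisu marked up text
    *.ssp|*.zip) echo "sisupod" ;;   # zipped sisu files (assumes any .zip is a sisupod)
    *)           echo "unknown" ;;
  esac
}

url="http://www.example.org/doc/sample.sst"   # placeholder, not a real document
if [ "$(sisu_remote_kind "$url")" != "unknown" ]; then
  echo "sisu -3 $url"    # remove the echo to actually run sisu
fi
```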

REMOTE DOCUMENT OUTPUT -

28. REMOTE OUTPUT -

Once properly configured, SiSU output can be posted automatically, as soon as it is generated, to a designated remote machine using either rsync or scp.

In order to do this, an ssh authentication agent and keychain (or similar tool) will need to be configured. Once that is done, placement on a remote host can be done seamlessly with the -r (for scp) or -R (for rsync) flag, which may be used in conjunction with other processing flags, e.g.

-


-

  sisu -3R sisu_remote.sst
-
-

28.1 COMMANDS -

-

- -
-R [filename/wildcard]
copies sisu output files to remote host using rsync. This requires that sisurc.yml has been provided with information on hostname and username, and that you have your ssh authentication agent and keychain (or similar tool) configured. Note that the behaviour differs depending on whether -R is used alone or with other flags: used alone, the rsync --delete parameter is sent, which is useful for cleaning the remote directory; when -R is used together with other flags, it is not. Also see -r

-r [filename/wildcard]
copies sisu output files to remote host using scp. This requires that sisurc.yml has been provided with information on hostname and username, and that you have your ssh authentication agent and keychain (or similar tool) configured. Also see -R

-
28.2 CONFIGURATION

[expand on the setting up of an ssh-agent / keychain]

29. REMOTE SERVERS -

As SiSU is generally operated from the command line, and works within a Unix type environment, the SiSU program and all documents can just as easily be on a remote server to which you are logged on using a terminal. Commands and operations are then much the same as they would be on your local machine.

30. QUICKSTART - GETTING STARTED HOWTO - -

30.1 INSTALLATION -

Installation is currently most straightforward, and best tested, on the Debian platform, as there are packages for the installation of sisu and of everything it requires.

30.1.1 DEBIAN INSTALLATION - -

SiSU is available directly from the Debian Sid and testing archives (and -possibly Ubuntu), assuming your /etc/apt/sources.list is set accordingly: - -

-


-

    aptitude update
    aptitude install sisu-complete
-
-

The following /etc/apt/sources.list setting permits the download of additional -markup samples: -

-


-

  #/etc/apt/sources.list
    deb http://ftp.fi.debian.org/debian/ unstable main non-free contrib
    deb-src http://ftp.fi.debian.org/debian/ unstable main non-free contrib
-
-

The aptitude commands become: -

-


-

    aptitude update
    aptitude install sisu-complete sisu-markup-samples
-
-

If there are newer versions of SiSU upstream of the Debian archives, they will be available by adding the following to your /etc/apt/sources.list:

-


-

  #/etc/apt/sources.list
    deb http://www.jus.uio.no/sisu/archive unstable main non-free
    deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
-
-

Then repeat the aptitude commands:

-


-

    aptitude update
    aptitude install sisu-complete sisu-markup-samples
-
-

Note however that it is not necessary to install sisu-complete if not -all components of sisu are to be used. Installing just the package sisu -will provide basic functionality. -

30.1.2 RPM INSTALLATION -

RPMs are provided, though untested; they are prepared by running alien against the source package and against the debs.

They may be downloaded from: -

<http://www.jus.uio.no/sisu/SiSU/download.html#rpm ->
- -

as root type: -

rpm -i [rpm  package  name]
- -

30.1.3 INSTALLATION FROM SOURCE -

To install SiSU from source check information -at: -

<http://www.jus.uio.no/sisu/SiSU/download.html#current ->
- -

* download the source package -

* Unpack the source -

Two alternative modes of installation from source are provided: setup.rb (by Minero Aoki) and a Rant (by Stefan Lang) built install file. In either case the first steps are the same: download and unpack the source file.

For basic use SiSU is only dependent on the programming language in which it is written, Ruby, and SiSU will be able to generate html, EPUB, various XMLs, including ODF (and will also produce LaTeX). Further actions, such as using a database (postgresql or sqlite)[^24] or converting LaTeX to pdf, rely on the installation of additional dependencies which the source tarball does not take care of.

setup.rb -

This is a standard ruby installer; using setup.rb is a three step process. In the root directory of the unpacked SiSU type:

-


-

      ruby setup.rb config
      ruby setup.rb setup
      #[and as root:]
      ruby setup.rb install
-
-

Further information on setup.rb is available from:

<http://i.loveruby.net/en/projects/setup/>

<http://i.loveruby.net/en/projects/setup/doc/usage.html>
- -

-

In the root directory of the unpacked SiSU, as root type:

ruby install base
- -

or for a more complete installation: -

ruby install
- -

or -

ruby install base
- -

This makes use of Rant (by Stefan Lang) and the provided Rantfile. It has been configured to do post installation setup, configuration and generation of a first test file. Note however, that additional external package dependencies, such as tetex-extra, are not taken care of for you.

Further information on Rant is available from:

<http://make.rubyforge.org/>

<http://rubyforge.org/frs/?group_id=615>
- -

For a list of alternative actions you may type: -

ruby install help
- -

ruby install -T
- -

30.2 TESTING SISU, GENERATING OUTPUT -

To check which version of sisu -is installed: -

sisu -v -

Depending on your mode of installation one or -a number of markup sample files may be found either in the directory: -

- -

or -

-

change directory to the appropriate one: -

cd /usr/share/doc/sisu/sisu_markup_samples/dfsg - -

30.2.1 BASIC TEXT, PLAINTEXT, HTML, XML, ODF, EPUB -

Having moved to the directory that contains the markup samples (see instructions above if necessary), choose a file and run sisu against it:

sisu -NhwoabxXyv free_as_in_freedom.rms_and_free_software.sam_williams.sst

This will generate html including a concordance file, OpenDocument text, plaintext, XHTML and various forms of XML.

30.2.2 LATEX / PDF -

Assuming a LaTeX engine such as tetex or texlive is installed with the required modules (done automatically on selection of sisu-pdf in Debian), and having moved to the directory that contains the markup samples (see instructions above if necessary), choose a file and run sisu against it:

sisu -pv free_as_in_freedom.rms_and_free_software.sam_williams.sst

sisu -3 free_as_in_freedom.rms_and_free_software.sam_williams.sst

The second command should generate most available output formats: html including a concordance file, OpenDocument text, plaintext, XHTML and various forms of XML, and pdf.

30.2.3 RELATIONAL DATABASE - POSTGRESQL, SQLITE - -

Relational databases need some setting up - you must have permission to -create the database and write to it when you run sisu. -

Assuming you have the database installed and the requisite permissions:

sisu --sqlite --recreate - -

sisu --sqlite -v --import free_as_in_freedom.rms_and_free_software.sam_williams.sst - -

sisu --pgsql --recreate -

sisu --pgsql -v --import free_as_in_freedom.rms_and_free_software.sam_williams.sst - -
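The same two-step pattern (recreate, then import) is used for both backends above, so it can be sketched as one helper that prints the commands for a given backend. The function name sisu_db_cmds is hypothetical; only the flags already shown in this section are used, and the commands are printed rather than run.

```sh
#!/bin/sh
# Hypothetical helper (not part of SiSU): print the recreate and import
# commands shown above for the chosen backend ("sqlite" or "pgsql").
sisu_db_cmds() {
  backend="$1"
  file="$2"
  case "$backend" in
    sqlite|pgsql) ;;
    *) echo "unsupported backend: $backend" >&2; return 1 ;;
  esac
  printf 'sisu --%s --recreate\n' "$backend"
  printf 'sisu --%s -v --import %s\n' "$backend" "$file"
}

sisu_db_cmds sqlite free_as_in_freedom.rms_and_free_software.sam_williams.sst
```

Piping the output to sh would execute the commands; that step is left out here because it needs a working database and write permission.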

30.3 GETTING HELP -

30.3.1 THE MAN PAGES -

Type: -

man sisu
- -

The man pages are also available online, though not always kept as up -to date as within the package itself: -

* sisu.1 <http://www.jus.uio.no/sisu/man/sisu.1> [^25]

* sisu.8 <http://www.jus.uio.no/sisu/man/sisu.8> [^26]

* man directory <http://www.jus.uio.no/sisu/man> [^27]

30.3.2 BUILT IN HELP -

sisu --help

sisu --help --env

sisu --help --commands

sisu --help --markup

30.3.3 THE HOME PAGE -

<http://www.jus.uio.no/sisu>

<http://www.jus.uio.no/sisu/SiSU>

30.4 MARKUP SAMPLES - -

A number of markup samples (along with output) are available from:

<http://www.jus.uio.no/sisu/SiSU/examples.html>

Additional markup samples are packaged separately in the file: -

* -

On Debian they are available in non-free;[^28] to include them it is necessary to include non-free in your /etc/apt/sources.list, or obtain them from the sisu home site.

31. EDITOR FILES, SYNTAX HIGHLIGHTING -

The directory: - -

./data/sisu/v2/conf/editor-syntax-etc/
- -

/usr/share/sisu/v2/conf/editor-syntax-etc
- -

contains rudimentary sisu syntax highlighting files for: -

* (g)vim -<http://www.vim.org -> -

package: sisu-vim
- -

status: largely done -

there is a vim syntax highlighting and folds -component
- -

* gedit <http://www.gnome.org/projects/gedit -> -

* gobby <http://gobby.0x539.de/ -> - -

file: sisu.lang
- -

place in: -

/usr/share/gtksourceview-1.0/language-specs
- -

or -

~/.gnome2/gtksourceview-1.0/language-specs
- -

status: very basic syntax highlighting
- -

comments: this editor features display line wrap and is used by Gobby!
- -

* nano <http://www.nano-editor.org -> -

file: nanorc
- -

save as: -

~/.nanorc
- -

status: basic syntax highlighting
- -

comments: assumes dark background; no display line-wrap; does line -breaks
- -

* diakonos (an editor written in ruby) <http://purepistos.net/diakonos -> - -

file: diakonos.conf -

save as: -

~/.diakonos/diakonos.conf
- -

includes: -

status: basic syntax highlighting
- -

comments: assumes dark background; no display line-wrap -

* kate & kwrite -<http://kate.kde.org -> -

file: sisu.xml
- -

place in:
- -

/usr/share/apps/katepart/syntax
- -

or
- -

~/.kde/share/apps/katepart/syntax
- -

[settings::configure kate::{highlighting,filetypes}]

[tools::highlighting::{markup,scripts}:: SiSU]
- -

* nedit <http://www.nedit.org -> -

file: sisu_nedit.pats
- -

nedit -import sisu_nedit.pats
- -

status: a very clumsy first attempt [not  really  done]
- -

comments: this editor features display line wrap
- -

* emacs <http://www.gnu.org/software/emacs/emacs.html -> -

files: sisu-mode.el
- -

to file ~/.emacs add the following 2 lines:
- -

(add-to-list ’load-path
- -

(require ’sisu-mode.el)
- -

[not  done  /  not  yet  included]
- -

* vim & gvim <http://www.vim.org -> -

files:
- -

The sisu-vim package is the most comprehensive sisu syntax highlighting and editor environment provided to date (it is for vim/gvim, and is separate from the contents of this directory)
- -

status: this includes: syntax highlighting; vim folds; some error -checking
- -

comments: this editor features display line wrap
- -

NOTE: -

[SiSU parses files with long lines or line breaks; display linewrap (without line-breaks) is a convenient editor feature to have for sisu markup]

32. HOW DOES SISU WORK? -

SiSU markup is fairly minimalistic. It consists of: a (largely optional) document header, made up of information about the document (such as when it was published, who authored it, and granting what rights) and any processing instructions; and markup within the substantive text of the document, which is related to document structure and typeface. SiSU must be able to discern the structure of a document (text headings and their levels in relation to each other), either from information provided in the document header or from markup within the text (or from a combination of both). Processing is done against an abstraction of the document comprising information on the document's structure and its objects,[2] which the program serializes (providing the object numbers) and which are assigned hash sum values based on their content. This abstraction of information about document structure, objects (and hash sums) provides considerable flexibility in representing documents in different ways and for different purposes (e.g. search, document layout, publishing, content certification, concordance etc.), and makes it possible to take advantage of some of the strengths of established ways of representing documents (or indeed to create new ones).
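The idea of serializing objects and assigning each a hash sum based on its content can be illustrated with standard Unix tools. This is a rough sketch, not SiSU's own algorithm: it assumes that objects are simply blank-line-separated blocks of a plain text file, that sha256sum (GNU coreutils) is available, and the function name number_and_hash is made up for the example.

```sh
#!/bin/sh
# Illustrative only (not SiSU's algorithm): number the blank-line-separated
# blocks of a text file and print a SHA-256 digest for each one.
number_and_hash() {
  n=0
  # awk paragraph mode (RS="") yields one record per block; join its lines.
  awk 'BEGIN { RS = "" } { gsub(/\n/, " "); print }' "$1" |
  while IFS= read -r object; do
    n=$((n + 1))
    digest=$(printf '%s' "$object" | sha256sum | cut -d' ' -f1)
    printf '%d %s\n' "$n" "$digest"
  done
}

tmp=$(mktemp)
printf 'A heading\n\nFirst paragraph,\nwrapped over two lines.\n\nSecond paragraph.\n' > "$tmp"
number_and_hash "$tmp"   # prints three lines: object number, then digest
rm -f "$tmp"
```

Because the digest depends only on the content of a block, two identical blocks get the same digest while keeping different object numbers.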

33. SUMMARY OF FEATURES -

* sparse/minimal markup (clean utf-8 source texts). Documents are prepared in a single UTF-8 file using a minimalistic mnemonic syntax. For typical literature and similar documents, headers are largely optional.

* markup is easily readable/parsable by the human eye (basic markup is simpler and more sparse than the most basic HTML), [this may also be converted to XML representations of the same input/source document].

* markup defines document structure (this may be done once in a header pattern-match description, or for heading levels individually); basic text attributes (bold, italics, underscore, strike-through etc.) as required; and semantic information related to the document (header information, extended beyond the Dublin core and easily further extended as required); the headers may also contain processing instructions. SiSU markup is primarily an abstraction of document structure and document metadata, to permit taking advantage of the basic strengths of existing alternative practical standard ways of representing documents [be that paper publication, sql search etc.] (html, epub, xml, odf, latex, pdf, sql)

* produces reasonably elegant output in established, industry and institutionally accepted, open standard formats,[3] taking advantage of the different strengths of various standard formats for representing documents. Amongst the output formats currently supported are:

* html - both as a single scrollable text and a segmented document

* xhtml

* epub

* XML - both in sax and dom style xml structures for further development as required

* ODF - open document format, the iso standard for document storage

* LaTeX - used to generate pdf

* pdf (via LaTeX)

* sql - population of an sql database (at the same object level that is used to cite text within a document)

Also produces: concordance files; document content certificates (md5 or sha256 digests of headings, paragraphs, images etc.) and html manifests (and sitemaps of content). This takes advantage of the strengths implicit in these very different output types (e.g. PDFs produced using typesetting of LaTeX, databases populated with documents at an individual object/paragraph level, making possible granular search and related possibilities).

* ensuring content can be cited in a meaningful way regardless of selected output format. Online publishing (and publishing in multiple document formats) lacks a useful way of citing text internally within documents (important to academics generally and to lawyers), as page numbers are meaningless across browsers and formats. sisu seeks to provide a common way of pinpointing the text within a document (which can be utilized for citation and by search engines). The outputs share a common numbering system that is meaningful (to man and machine) across all outputs, whether paper, screen, or database oriented (pdf, HTML, EPUB, xml, sqlite, postgresql); this numbering system can be used to reference content.

* Granular search within documents. SQL databases are populated at an object level (roughly headings, paragraphs, verse, tables) and become searchable with that degree of granularity. The output information provides the object/paragraph numbers, which are relevant across all generated outputs; it is also possible to look at just the matching paragraphs of the documents in the database; [output indexing also works well with search indexing tools like hyperestraier].

* long term maintainability of document collections in a world of changing formats, having a very sparsely marked-up source document base. There is a considerable degree of future-proofing: output representations are upgradeable and output formats can be added (the open document text module in 2006, epub in 2009, and html5 output at some point in future), without modification of existing prepared texts

* -SQL search aside, documents are generated as required and static once generated. - -

* documents produced are static files, and may be batch processed, this -needs to be done only once but may be repeated for various reasons as desired -(updated content, addition of new output formats, updated technology document -presentations/representations) -

* document source (plaintext utf-8), if shared on the net, may be used as input and processed locally to produce the different document outputs

* document source may be bundled together (automatically) with associated documents (multiple language versions or master document with inclusions) and images and sent as a zip file called a sisupod; if shared on the net these too may be processed locally to produce the desired document outputs

* generated document outputs may automatically be posted to remote sites.

* for basic document generation, the only software dependency is Ruby, and a few standard Unix tools (this covers plaintext, HTML, EPUB, XML, ODF, LaTeX). To use a database you of course need that, and to convert the LaTeX generated to pdf, a latex processor like tetex or texlive.

* as a developer's tool it is flexible and extensible

Syntax -highlighting for SiSU markup is available for a number of text editors. - -

SiSU is less about document layout than about finding a way, with little markup, to construct an abstract representation of a document that makes it possible to produce multiple representations of it, which may be rather different from each other and used for different purposes, whether layout and publishing, or search of content,

i.e. to be able to take advantage, from this minimal preparation starting point, of some of the strengths of rather different established ways of representing documents for different purposes, whether for search (relational database, or indexed flat files generated for that purpose whether of complete documents, or say of files made up of objects), online viewing (e.g. html, xml, pdf), or paper publication (e.g. pdf).

The solution arrived at is to extract structural information about the document (about headings within the document) and to track objects (which are serialized and also given hash values) in the manner described. It makes possible representations that are quite different from those offered at present. For example, objects could be saved individually and identified by their hashes, with an index of how the objects relate to each other to form a document.
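The closing example (objects saved individually, identified by their hashes, with an index relating them) can be sketched in a few lines of shell. This is a toy illustration under stated assumptions, not something SiSU provides: objects are taken to be blank-line-separated blocks, sha256sum is assumed to be installed, and the names store_objects and rebuild are invented for the example.

```sh
#!/bin/sh
# Illustrative only: save each blank-line-separated block of a text file
# under its SHA-256 digest, and write an index of digests in document order.
store_objects() {
  src="$1"; dir="$2"
  mkdir -p "$dir/objects"
  : > "$dir/index"
  awk 'BEGIN { RS = "" } { gsub(/\n/, " "); print }' "$src" |
  while IFS= read -r object; do
    digest=$(printf '%s' "$object" | sha256sum | cut -d' ' -f1)
    printf '%s\n' "$object" > "$dir/objects/$digest"
    printf '%s\n' "$digest" >> "$dir/index"
  done
}

# Reassemble the text by following the index, one object per paragraph.
rebuild() {
  dir="$1"
  while IFS= read -r digest; do
    cat "$dir/objects/$digest"
    echo
  done < "$dir/index"
}
```

A repeated block is stored once but listed twice in the index, so the index alone carries the document order while the object store holds each distinct piece of content.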

34. HELP SOURCES -

For a summary of alternative -ways to get help on SiSU try one of the following: -

man page -

man sisu_help
- -

man2html -

<http://www.jus.uio.no/sisu/man/sisu_help.1.html ->
- -

sisu generated output - links to html -

<http://sisudoc.org/sisu/sisu_help/index.html ->
- -

help sources lists -

Alternative sources for this help sources page -listed here: -

man sisu_help_sources
- -

<http://sisudoc.org/sisu/sisu_help_sources/index.html ->
- -

34.1 MAN PAGES -

34.1.1 MAN -

man sisu
- -

man 7 sisu_complete
- -

man 7 sisu_pdf
- -

man 7 sisu_postgresql
- -

man 7 sisu_sqlite
- -

man sisu_termsheet
- -

man sisu_webrick
- -

34.2 SISU GENERATED OUTPUT - LINKS TO HTML -

Note: SiSU documentation is prepared in SiSU and output is available in multiple formats including, amongst others, html, pdf, odf and epub, which may also be accessed via the html pages[^29]

34.2.1 WWW.SISUDOC.ORG -

<http://sisudoc.org/sisu/sisu_manual/index.html>

<http://sisudoc.org/sisu/sisu_commands/index.html>
<http://sisudoc.org/sisu/sisu_complete/index.html>
<http://sisudoc.org/sisu/sisu_configuration/index.html>
<http://sisudoc.org/sisu/sisu_description/index.html>
<http://sisudoc.org/sisu/sisu_examples/index.html>
<http://sisudoc.org/sisu/sisu_faq/index.html>
<http://sisudoc.org/sisu/sisu_filetypes/index.html>
<http://sisudoc.org/sisu/sisu_help/index.html>
<http://sisudoc.org/sisu/sisu_help_sources/index.html>
<http://sisudoc.org/sisu/sisu_howto/index.html>
<http://sisudoc.org/sisu/sisu_introduction/index.html>
<http://sisudoc.org/sisu/sisu_markup/index.html>
<http://sisudoc.org/sisu/sisu_output_overview/index.html>
<http://sisudoc.org/sisu/sisu_pdf/index.html>
<http://sisudoc.org/sisu/sisu_postgresql/index.html>
<http://sisudoc.org/sisu/sisu_quickstart/index.html>
<http://sisudoc.org/sisu/sisu_remote/index.html>
<http://sisudoc.org/sisu/sisu_search/index.html>
<http://sisudoc.org/sisu/sisu_skin/index.html>
<http://sisudoc.org/sisu/sisu_sqlite/index.html>
<http://sisudoc.org/sisu/sisu_syntax_highlighting/index.html>
<http://sisudoc.org/sisu/sisu_vim/index.html>
<http://sisudoc.org/sisu/sisu_webrick/index.html>
- -

34.3 MAN2HTML -

34.3.1 LOCALLY INSTALLED -

<file:///usr/share/doc/sisu/v2/html/sisu.1.html> - -

<file:///usr/share/doc/sisu/v2/html/sisu_help.1.html> -

<file:///usr/share/doc/sisu/v2/html/sisu_help_sources.1.html> - -

/usr/share/doc/sisu/v2/html/sisu.1.html
- -

/usr/share/doc/sisu/v2/html/sisu_pdf.7.html
- -

/usr/share/doc/sisu/v2/html/sisu_postgresql.7.html
- -

/usr/share/doc/sisu/v2/html/sisu_sqlite.7.html
- -

/usr/share/doc/sisu/v2/html/sisu_webrick.1.html
- -

34.3.2 WWW.JUS.UIO.NO/SISU -

<http://www.jus.uio.no/sisu/man/sisu.1.html>
<http://www.jus.uio.no/sisu/man/sisu_complete.7.html>
<http://www.jus.uio.no/sisu/man/sisu_pdf.7.html>
<http://www.jus.uio.no/sisu/man/sisu_postgresql.7.html>
<http://www.jus.uio.no/sisu/man/sisu_sqlite.7.html>
<http://www.jus.uio.no/sisu/man/sisu_webrick.1.html>
- -

-

    -.
  2. objects include: headings, paragraphs, verse, tables, images, but not footnotes/endnotes which are numbered separately and tied to the object from which they are referenced.
  3. i.e. the html, pdf, epub, odf outputs are each built individually and optimised for that form of presentation, rather than for example the html being a saved version of the odf, or the pdf being a saved version of the html.
  4. the different heading levels
  5. units of text, primarily paragraphs and headings, also any tables, poems, code-blocks
  6. Specification submitted by Adobe to ISO to become a full open ISO specification <http://www.linux-watch.com/news/NS7542722606.html>
  7. ISO standard ISO/IEC 26300:2006
  8. An open standard format for e-books
    *1. square brackets
    *2. square brackets
    +1. square brackets
  9. <http://www.jus.uio.no/sisu/man/>
  10. <http://www.jus.uio.no/sisu/man/sisu.1.html>
  11. From sometime after SiSU 0.58 it should be possible to describe SiSU markup using SiSU, which though not an original design goal is useful.
  12. files should be prepared using UTF-8 character encoding
  13. a footnote or endnote
  14. self contained endnote marker & endnote in one
    *. unnumbered asterisk footnote/endnote, insert multiple asterisks if required
    **. another unnumbered asterisk footnote/endnote
    *3. editors notes, numbered asterisk footnote/endnote series
    +2. editors notes, numbered asterisk footnote/endnote series
  15. <http://www.jus.uio.no/sisu/>
  16. <http://www.ruby-lang.org/en/>
  17. Table from the Wealth of Networks by Yochai Benkler <http://www.jus.uio.no/sisu/the_wealth_of_networks.yochai_benkler>
  18. .ssc (for composite) is under consideration but ._sst makes clear that this is not a regular file to be worked on, and thus less likely that people will have processing. It may be however that when the resulting file is shared .ssc is an appropriate suffix to use.
  19. <http://www.postgresql.org/> <http://advocacy.postgresql.org/> <http://en.wikipedia.org/wiki/Postgresql>
  20. <http://www.hwaci.com/sw/sqlite/> <http://en.wikipedia.org/wiki/Sqlite>
  21. <http://search.sisudoc.org>
  22. (which could be extended further with current back-end). As regards scaling of the database, it is as scalable as the database (here Postgresql) and hardware allow.
  23. of this feature when demonstrated to an IBM software innovations evaluator in 2004 he said to paraphrase: this could be of interest to us. We have large document management systems, you can search hundreds of thousands of documents and we can tell you which documents meet your search criteria, but there is no way we can tell you without opening each document where within each your matches are found.
  24. There is nothing to stop MySQL support being added in future.
  25. <http://www.jus.uio.no/sisu/man/sisu.1>
  26. <http://www.jus.uio.no/sisu/man/sisu.8>
  27. <http://www.jus.uio.no/sisu/man>
  28. the Debian Free Software guidelines require that everything distributed within Debian can be changed - and the documents are authors' works that while freely distributable are not freely changeable.
  29. named index.html or more extensively through sisu_manifest.html
- -

See Also

sisu(1), sisu-epub(1), sisu-harvest(1), sisu-html(1), sisu-odf(1), sisu-pdf(1), sisu-pg(1), sisu-sqlite(1), sisu-txt(1), sisu_vim(7), sisu(8)

-

Homepage

More information about SiSU can be found at <http://www.jus.uio.no/sisu/>.

-

Author

-SiSU was written by Ralph Amissah <ralph@amissah.com>.

- -

