From eae9bb93fdd2e677c8882bcc96d42b804ac2bafe Mon Sep 17 00:00:00 2001
From: Ralph Amissah
Date: Sun, 18 Nov 2012 21:50:04 -0500
Subject: v4: documentation; markup samples & help

---
 data/doc/sisu/html/sisu4.1.html | 3693 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 3693 insertions(+)
 create mode 100644 data/doc/sisu/html/sisu4.1.html

(limited to 'data/doc/sisu/html/sisu4.1.html')

diff --git a/data/doc/sisu/html/sisu4.1.html b/data/doc/sisu/html/sisu4.1.html
new file mode 100644
index 00000000..345e574b
--- /dev/null
+++ b/data/doc/sisu/html/sisu4.1.html
@@ -0,0 +1,3693 @@
+"sisu"("1") manual page
+
+Table of Contents

+
+ +

Name

+
+sisu - documents: markup, structuring, publishing in multiple standard formats, +and search
+ +

Synopsis

+
+sisu [-short-options|--long-options] [filename/wildcard] +


+sisu [-abCcDdeFGghIikLMmNnoPpQqRrSsTtUuVvWwXxYyZ_0-9] [filename/wildcard] + +


+sisu --txt --html --epub --odt --pdf --wordmap --sqlite --manpage --texinfo --sisupod --source +--qrcode [filename/wildcard] +


+sisu [-Ddcv] [instruction] [filename/wildcard] +


+sisu --pg (--createdb|update [filename/wildcard]|--dropall) +


+sisu [operations] +


+sisu [-CcFLSVvW] +


+sisu (--configure|--webrick|--sample-search-form) +

Sisu - Manual,

+RALPH AMISSAH
+ +

+

What is Sisu?

+
+ +

+

Introduction - What is Sisu?

+
+ +


+SiSU is a framework for document structuring, publishing (in multiple open +standard formats) and search, comprising: (a) a lightweight document +structure and presentation markup syntax; and (b) an accompanying engine +for generating standard document format outputs from documents prepared +in sisu markup syntax, which is able to produce multiple standard outputs +(including the population of sql databases) that (can) share a common numbering +system for the citation of text within a document. +


+SiSU is developed under an open source, software libre license (GPLv3). +It is developed for work with medium to large document sets +and to cope with evolving document formats and representation technologies. Documents +are prepared once, and generated as need be to update the technical presentation +or add output formats. Various output formats (including search +related output) share a common mechanism for cross-output-format citation. + +


+SiSU both defines a markup syntax and provides an engine that produces +open standards format outputs from documents prepared with SiSU markup. +From a single lightly prepared document sisu custom builds several standard +output formats which share a common (text object) numbering system for +citation of content within a document (that also has implications for search). +The sisu engine works with an abstraction of the document’s structure and +content from which it is possible to generate different forms of representation +of the document. Significantly, SiSU markup is sparser than HTML, and +outputs include HTML, EPUB, ODT (Open Document Format text), LaTeX, and +landscape and portrait PDF, all of which can be added to and updated. SiSU +is also able to populate SQL type databases at an object level, which means +that searches can be made with that degree of granularity. +


+Source document preparation and output generation are a two-step process: +(i) document source is prepared, that is, marked up in sisu markup syntax, +and (ii) the desired output is subsequently generated by running the sisu +engine against document source. Output representations, if updated (in the +sisu engine), can be generated by re-running the engine against the prepared +source. Using SiSU markup applied to a document, SiSU custom builds (to +take advantage of the strengths of different ways of representing documents) +various standard open output formats including plain text, HTML, XHTML, +XML, EPUB, ODT, LaTeX or PDF files, and populates an SQL database with objects[^1] +(equating generally to paragraph-sized chunks) so searches may be performed +and matches returned with that degree of granularity (e.g. your search criteria +are met by these documents and at these locations within each document). +Document output formats share a common object numbering system for locating +content. This is particularly suitable for "published" works (finalized +texts as opposed to works that are frequently changed or updated) for which +it provides a fixed means of reference of content. +
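+The second step of the process described above can be sketched as a shell command. This is a minimal sketch, not a definitive invocation: the file name example.sst is hypothetical, the flags are taken from the synopsis of this page, and the command is only printed (dry run) so the sketch does not depend on sisu being installed.

```shell
#!/bin/sh
# Sketch of step (ii): run the sisu engine against an already prepared source file.
# "example.sst" is a hypothetical file name; --html, --epub and --pdf are taken
# from the synopsis of this manual page.

# Compose the command line for a given source file.
sisu_cmd() {
  printf 'sisu --html --epub --pdf %s\n' "$1"
}

# Dry run: print the command. Re-running the same command after an engine
# update regenerates the outputs from the unchanged source.
sisu_cmd "example.sst"
```

+Piping the printed line to sh would execute it on a system where sisu is available.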


+In preparing a SiSU document you optionally provide semantic information +related to the document in a document header, and in marking up the substantive +text provide information on the structure of the document, primarily indicating +heading levels and footnotes. You also provide information on basic text +attributes where used. The rest is automatic: from this information sisu +custom builds[^2] the different forms of output requested. +


+SiSU works with an abstraction of the document based on its structure, which +comprises its headings[^3] and objects[^4]. This enables SiSU to represent +the document in many different ways, and to take advantage of the strengths +of different ways of presenting documents. The objects are numbered, and +these numbers can be used to provide a common basis for citing material +within a document across the different output format types. This is significant +as page numbers are not well suited to the digital age: in web publishing, +changing a browser’s default font or using a different browser can mean +that text will appear on a different page; and when publishing in different +formats (html, landscape and portrait pdf etc.) page numbers are again not +useful for citing text. Dealing with documents at an object level together +with object numbering also has implications for search that SiSU is able +to take advantage of. +
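+The idea of numbering objects independently of page layout can be illustrated with a rough stand-in. The sketch below is an assumption for illustration only and is not SiSU's own numbering algorithm: it treats every blank-line-separated paragraph of plain text as one object and prefixes it with a running number, so the same number identifies the same text however it is later rendered.

```shell
#!/bin/sh
# Number blank-line-separated paragraphs: a rough stand-in for object numbering,
# not the algorithm used by the sisu engine.
number_objects() {
  # RS="" puts awk in paragraph mode: records are separated by blank lines.
  awk 'BEGIN { RS = "" }
       { gsub(/\n/, " "); sub(/ +$/, ""); printf "%d: %s\n", NR, $0 }'
}

# Two paragraphs in, two numbered objects out.
printf 'First paragraph.\n\nSecond paragraph,\nwrapped.\n' | number_objects
```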


+One of the challenges of maintaining documents is to keep them in a format +that allows use of them independently of proprietary platforms. Consider +issues related to dealing with legacy proprietary formats today and what +guarantee you have that old proprietary formats will remain (or can be +read without proprietary software/equipment) in 15 years' time, or +the way in which html has evolved over its relatively short span of existence. +SiSU provides the flexibility of producing documents in multiple non-proprietary +open formats including HTML, EPUB,[^5] ODT,[^6] PDF[^7] and ODF.[^8] Whilst +SiSU relies on software, the markup is uncomplicated and minimalistic, which +guarantees that future engines can be written to run against it. It is also +easily converted to other formats, which means documents prepared in SiSU +can be migrated to other document formats. Further security is provided +by the fact that the software itself, SiSU, is available under GPLv3, a licence +that guarantees that the source code will always be open, and free as in +libre, which means that the code base can be used, updated and further +developed as required under the terms of its license. Another challenge +is to keep up with a moving target. SiSU permits new forms of output to +be added as they become important (Open Document Format text was added +in 2006 when it became an ISO standard for office applications and the +archival of documents; EPUB was introduced in 2009), and allows the technical +representations of existing output to be updated (HTML has evolved and the +related module has been updated repeatedly over the years; presumably when +the World Wide Web Consortium (w3c) finalises HTML 5, which is currently +under development, the HTML module will again be updated, allowing all existing +documents to be regenerated as HTML 5). +


+The document formats are written to the file-system and available for indexing +by independent indexing tools, whether off the web like Google and Yahoo +or on the site like Lucene and Hyperestraier. +


+SiSU also provides other features such as concordance files and document +content certificates. Working against an abstraction of document +structure opens further possibilities for the research and development of +other document representations; the availability of objects is useful, for +example, for topic maps and thesauri, and together with the flexibility of SiSU +offers great possibilities. +


+SiSU is primarily for published works, which can take advantage of the +citation system to reliably reference its documents. SiSU works well in +a complementary manner with such collaborative technologies as Wikis, which +can take advantage of and be used to discuss the substance of content prepared +in SiSU. +


+<http://www.sisudoc.org/> +


+<http://www.jus.uio.no/sisu> +

+

Commands Summary

+
+ +

+

Description

+ +


+SiSU is a document publishing system that, from a simple single marked-up +document, produces multiple output formats including: plaintext, HTML, +XHTML, XML, EPUB, ODT (OpenDocument (ODF) text), LaTeX, PDF, info, and +SQL (PostgreSQL and SQLite), which share text object numbers ("object +citation numbering") and the same document structure information. For more +see: <http://sisudoc.org> or <http://www.jus.uio.no/sisu> +

+

Document Processing +Command Flags

+ +

+

+ +
-a [filename/wildcard]
+
produces plaintext with Unix linefeeds +and without markup, (object numbers are omitted), has footnotes at end +of each paragraph that contains them [  -A  for  output  file] [see  -e  for  endnotes]. +(Options include: --endnotes for endnotes --footnotes for footnotes at the +end of each paragraph --unix for unix linefeed (default) --msdos for msdos +linefeed) +

+ +
-b [filename/wildcard]
+
see --xhtml +

+ +
--by-*
+
see --output-by-* +

+ +
-C
+
configure/initialise +shared output directory files (config +files such as css and dtd files are not updated if they already exist unless +a modifier is used). -C --init-site configures/initialises the site more extensively than +-C on its own: shared output directory files are force updated, so existing shared +output config files such as css and dtd files are updated if this modifier +is used. +

+ +
-CC
+
see --configure +

+ +
-c [filename/wildcard]
+
see --color-toggle +

+ +
--color-toggle +[filename/wildcard]
+
toggles ansi screen colour on or off depending +on the default set (unless the -c flag is used: if sisurc colour default is set +to ’true’, output to screen will be with colour, if sisurc colour default +is set to ’false’ or is undefined screen output will be without colour). Alias -c +

+ +
--configure
+
configure/initialise shared output directory files (config files such as css and dtd files are not +updated if they already exist unless a modifier is used). The equivalent of: +-C --init-site, which configures/initialises the site more extensively than -C on its own: +shared output directory files are force updated, so existing shared output config +files such as css and dtd files are updated if -CC is used. +

+ +
--concordance +[filename/wildcard]
+
produces concordance (wordmap) a rudimentary index +of all the words in a document. (Concordance files are not generated for +documents of over 260,000 words unless this limit is increased in the file +sisurc.yml). Alias -w +

+ +
-D [instruction] [filename]
+
see --pg +

+ +
-d [--db-[database  type + (sqlite|pg)]] --[instruction] [filename]
+
see --sqlite +

+ +
--dal [filename/wildcard/url] +
+
assumed for most other flags, creates new intermediate files for processing +(document abstraction) that are used in all subsequent processing of other +output. This step is assumed for most processing flags. To skip it see -n. + +

Alias -m +

+ +
--delete [filename/wildcard]
+
see --zap +

+ +
--dump[=directory_path] [filename/wildcard] +
+
places output in directory specified, if none is specified in the current +directory (pwd). Compare --redirect +

+ +
-e [filename/wildcard]
+
see --epub +

+ +
--epub +[filename/wildcard]
+
produces an epub document, [sisu  version  >=2  ] (filename.epub). + +

Alias -e +

+ +
--exc-*
+
exclude output feature, overrides configuration settings: +--exc-ocn (exclude object citation numbering, i.e. switches off object citation +numbering; affects html (seg, scroll), epub, xhtml, xml, pdf); --exc-toc +(exclude table of contents; affects html (scroll), epub, pdf); --exc-links-to-manifest, +--exc-manifest-links (exclude links to manifest; affects html (seg, scroll)); +--exc-search-form (exclude search form; affects html (seg, scroll), manifest); +--exc-minitoc (exclude mini table of contents; affects html (seg), concordance, +manifest); --exc-manifest-minitoc (exclude mini table of contents; affects +manifest); --exc-html-minitoc (exclude mini table of contents; affects html +(seg), concordance); --exc-html-navigation (exclude navigation; affects html +(seg)); --exc-html-navigation-bar (exclude navigation bar; affects html (seg)); +--exc-html-search-form (exclude search form; affects html (seg, scroll)); --exc-html-right-pane +(exclude right pane/column; affects html (seg, scroll)); --exc-html-top-band +(exclude top band; affects html (seg, scroll), concordance (minitoc forced +on to provide seg navigation)); --exc-segsubtoc (exclude sub table of contents; +affects html (seg), epub); see also --inc-* +

+ +
-F [--webserv=webrick]
+
see --sample-search-form + +

+ +
-f [optional  string  part  of  filename]
+
see --find +

+ +
--find [optional  string  part + of  filename]
+
without match string, glob all .sst .ssm files in directory +(including language subdirectories). With match string, find files that +match given string in directory (including language subdirectories). Alias +-f, --glob, -G +

+ +
-G [optional  string  part  of  filename]
+
see --find +

+ +
-g [filename/wildcard] +
+
+

see --git +

+ +
--git [filename/wildcard]
+
produces or updates markup source file +structure in a git repo (experimental and subject to change). Alias -g +

+ +
--glob +[optional  string  part  of  filename]
+
see --find +

+ +
-h [filename/wildcard]
+
see --html +

+ +
--harvest *.ss[tm]
+
makes two lists of sisu output based on the sisu +markup documents in a directory: a list of authors and their works (year +and titles), and a list by topic with titles and author. Makes use of header +metadata fields (author, title, date, topic_register). Can be used with +maintenance (-M) and remote placement (-R) flags. +

+ +
--help [topic]
+
provides help +on the selected topic, where topics (keywords) include: list, (com)mands, +short(cuts), (mod)ifiers, (env)ironment, markup, syntax, headers, headings, +endnotes, tables, example, customise, skin, (dir)ectories, path, (lang)uage, +db, install, setup, (conf)igure, convert, termsheet, search, sql, features, +license. +

+ +
--html [filename/wildcard]
+
produces html output, segmented text +with table of contents (toc.html and index.html) and the document in a single +file (scroll.html). Alias -h +

+ +
-I [filename/wildcard]
+
see --texinfo +

+ +
-i [filename/wildcard] +
+
+

see --manpage +

+ +
--inc-*
+
include output feature, overrides configuration settings, +(usually the default if none set), has precedence over --exc-* (exclude output +feature). Some detail provided under --exc-*, see --exc-* +

+ +
-j [filename/wildcard] +
+
copies images associated with a file for use by html, xhtml & xml outputs +(automatically invoked by --dump & --redirect). +

+ +
--keep-processing-files [filename/wildcard/url] +
+
+

see --maintenance +

+ +
-L
+
prints license information. +

+ +
-M [filename/wildcard/url] +
+
+

see --maintenance +

+ +
-m [filename/wildcard/url]
+
see --dal (document abstraction +level/layer) +

+ +
--machine [filename/wildcard/url]
+
see --dal (document abstraction +level/layer) +

+ +
--maintenance [filename/wildcard/url]
+
maintenance mode, interim +processing files are preserved and their locations indicated. (also see +-V). Aliases -M and --keep-processing-files. +

+ +
--manpage [filename/wildcard]
+
produces +man page of file, not suitable for all outputs. Alias -i +

+ +
-N [filename/wildcard/url] +
+
document digest or document content certificate ( DCC ) as md5 digest tree +of the document: the digest for the document, and digests for each object +contained within the document (together with information on software versions +that produced it) (digest.txt). -NV for verbose digest output to screen. +

+ +
-n +[filename/wildcard/url]
+
skip the creation of intermediate processing files +(document abstraction) if they already exist, this skips the equivalent +of -m which is otherwise assumed by most processing flags. +

+ +
--no-*
+
see --exc-* + +

+ +
-o [filename/wildcard/url]
+
see --odt +

+ +
--odf [filename/wildcard/url]
+
see --odt + +

+ +
--odt [filename/wildcard/url]
+
output basic document in opendocument file +format (opendocument.odt). Alias -o +

+ +
--output-by-*
+
select output directory structure +from 3 alternatives: --output-by-language, (language directory (based on language +code) with filetype (html, epub, pdf etc.) subdirectories); --output-by-filetype, +(filetype directories with language code as part of filename); --output-by-filename, +(filename directories with language code as part of filename). This is configurable. +Alias --by-* +

+ +
-P [language_directory/filename  language_directory]
+
see --po4a + +

+ +
-p [filename/wildcard]
+
see --pdf +

+ +
--pdf [filename/wildcard]
+
produces LaTeX +pdf (portrait.pdf & landscape.pdf). Default paper size is set in config file, +or document header, or provided with additional command line parameter, +e.g. --papersize-a4. Preset sizes include: ’A4’, U.S. ’letter’ and ’legal’ and book sizes +’A5’ and ’B5’ (system defaults to A4). Alias -p +

+ +
--pg [instruction] [filename] +
+
database PostgreSQL (--pgsql may be used instead). Possible instructions +include: --createdb; --create; --dropall; --import [filename]; --update [filename]; +--remove [filename]; see database section below. Alias -D +

+ +
--po [language_directory/filename + language_directory]
+
see --po4a +

+ +
--po4a [language_directory/filename  language_directory] +
+
produces .pot and po files for the file in the languages specified by the +language directory. SiSU markup is placed in subdirectories named with the +language code, e.g. en/ fr/ es/. The sisu config file must set the output +directory structure to multilingual. v3, experimental +

+ +
-Q [filename/wildcard] +
+
+

see --qrcode +

+ +
-q [filename/wildcard]
+
see --quiet +

+ +
--qrcode [filename/wildcard] +
+
generate QR code image of metadata (used in manifest). v3 only. +

+ +
--quiet [filename/wildcard] +
+
quiet less output to screen. +

+ +
-R [filename/wildcard]
+
see --rsync +

+ +
-r [filename/wildcard] +
+
+

see --scp +

+ +
--redirect[=directory_path] [filename/wildcard]
+
places output in +subdirectory under specified directory, subdirectory uses the filename +(without the suffix). If no output directory is specified places the subdirectory +under the current directory (pwd). Compare --dump +

+ +
--rsync [filename/wildcard] +
+
copies sisu output files to remote host using rsync. This requires that +sisurc.yml has been provided with information on hostname and username, +and that you have your "keys" and ssh agent in place. Note that the behavior +of rsync differs depending on whether -R is used alone or with other flags. Alone, +the rsync --delete parameter is sent, useful for cleaning the remote directory +(when -R is used together with other flags, it is not). Also see --scp. Alias -R +

+ +
-S
+
see --sisupod +

+ +
-S [filename/wildcard]
+
see --sisupod +

+ +
-s [filename/wildcard] +
+
+

see --source +

+ +
--sample-search-form [--webserv=webrick]
+
generate examples of (naive) +cgi search form for SQLite and PgSQL depends on your already having used +sisu to populate an SQLite and/or PgSQL database, (the SQLite version scans +the output directories for existing sisu_sqlite databases, so it is first +necessary to create them, before generating the search form) see -d -D and +the database section below. If the optional parameter --webserv=webrick is +passed, the cgi examples created will be set up to use the default port +set for use by the webrick server, (otherwise the port is left blank and +the system setting used, usually 80). The samples are dumped in the present +work directory which must be writable, (with screen instructions given +that they be copied to the cgi-bin directory). Alias -F +

+ +
--scp [filename/wildcard] +
+
copies sisu output files to remote host using scp. This requires that sisurc.yml +has been provided with information on hostname and username, and that you +have your "keys" and ssh agent in place. Also see --rsync. Alias -r +

+ +
--sqlite +--[instruction] [filename]
+
database type set to SQLite, this produces one +of two possible databases, without additional database related instructions +it produces a discrete SQLite file for the document processed; with additional +instructions it produces a common SQLite database of all processed documents +that (come from the same document preparation directory and as a result) +share the same output directory base path (possible instructions include: +--createdb; --create; --dropall; --import [filename]; --update [filename]; --remove +[filename]); see database section below. Alias -d +

+ +
--sisupod
+
produces a sisupod, +a zipped sisu directory of markup files including sisu markup source files +and the directory's local configuration file, images and skins. Note: this +only includes the configuration files or skins contained in ./_sisu, not +those in ~/.sisu. See also the -S [filename/wildcard] option. Note: this +option is tested only with zsh. Alias -S +

+ +
--sisupod [filename/wildcard]
+
produces +a zipped file of the prepared document specified along with associated +images, by default named sisupod.zip; they may alternatively be named with +the filename extension .ssp. This provides a quick way of gathering the relevant +parts of a sisu document which can then for example be emailed. A sisupod +includes sisu markup source file (along with associated documents if a +master file, or available in multilingual versions), together with related +images and skin. SiSU commands can be run directly against a sisupod contained +in a local directory, or provided as a url on a remote site. As there is +a security issue with skins provided by other users, they are not applied +unless the flag --trust or --trusted is added to the command instruction; it +is recommended that files that are not your own are treated as untrusted. +The directory structure of the unzipped file is understood by sisu, and +sisu commands can be run within it. Note: if you wish to send multiple files, +it quickly becomes more space efficient to zip the sisu markup directory, +rather than the individual files for sending. See the -S option without +[filename/wildcard]. Alias -S +

+ +
--source [filename/wildcard]
+
copies sisu markup +file to output directory. Alias -s +

+ +
-T [filename/wildcard  (*.termsheet.rb)] +
+
standard form document builder, preprocessing feature +

+ +
-t [filename/wildcard] +
+
+

see --txt +

+ +
--texinfo [filename/wildcard]
+
produces texinfo and info file, (view +with pinfo). Alias -I +

+ +
--txt [filename/wildcard]
+
produces plaintext with Unix +linefeeds and without markup, (object numbers are omitted), has footnotes +at end of each paragraph that contains them [  -A  for  output  file] [see  -e + for  endnotes]. (Options include: --endnotes for endnotes --footnotes for footnotes +at the end of each paragraph --unix for unix linefeed (default) --msdos for +msdos linefeed). Alias -t +

+ +
-U [filename/wildcard]
+
see --urls +

+ +
-u [filename/wildcard] +
+
provides url mapping of output files for the flags requested for processing, +also see -U +

+ +
--urls [filename/wildcard]
+
prints url output list/map for the +available processing flags options and resulting files that could be requested, +(can be used to get a list of processing options in relation to a file, +together with information on the output that would be produced), -u provides +url output mapping for those flags requested for processing. The default +assumes sisu_webrick is running and provides webrick url mappings where +appropriate, but these can be switched to file system paths in sisurc.yml. + +

Alias -U +

+ +
-V
+
on its own, provides SiSU version and environment information +(sisu --help env) +

+ +
-V [filename/wildcard]
+
even more verbose than the -v flag. + +

+ +
-v
+
on its own, provides SiSU version information +

+ +
-v [filename/wildcard] +
+
+

see --verbose +

+ +
--v3 [filename/wildcard]
+
invokes the sisu v3 document parser/generator. +You may run sisu3 instead. +

+ +
--v4 [filename/wildcard]
+
invokes the sisu v4 document +parser/generator. This is the default and is normally omitted. +

+ +
--verbose [filename/wildcard] +
+
provides verbose output of what is being generated, where output is placed +(and error messages if any), as with -u flag provides a url mapping of files +created for each of the processing flag requests. Alias -v +

+ +
-W
+
see --webrick + +

+ +
-w [filename/wildcard]
+
see --concordance +

+ +
--webrick
+
starts Ruby’s webrick web server, +which points at sisu output directories; the default port is set to 8081 and +can be changed in the resource configuration files. [tip:  the  webrick  server + requires  link  suffixes,  so  html output  should  be  created  using  the  -h  option + rather  than  also,  note  -F  webrick  ]. Alias -W +

+ +
--wordmap [filename/wildcard] +
+
+

see --concordance +

+ +
--xhtml [filename/wildcard]
+
produces xhtml/ XML output for +browser viewing (sax parsing). Alias -b +

+ +
--xml-dom [filename/wildcard]
+
produces +XML output with deep document structure, in the nature of dom. Alias -X +

+

+ +
--xml-sax [filename/wildcard]
+
produces XML output shallow structure (sax parsing). + +

Alias -x +

+ +
-X [filename/wildcard]
+
see --xml-dom +

+ +
-x [filename/wildcard]
+
see --xml-sax + +

+ +
-Y [filename/wildcard]
+
produces a short sitemap entry for the document, +based on html output and the sisu_manifest. --sitemaps generates/updates the +sitemap index of existing sitemaps. (Experimental, [g,y,m  announcement  this + week]) +

+ +
-y [filename/wildcard]
+
produces an html summary of output generated +(hyperlinked to content) and document specific metadata (sisu_manifest.html). +This step is assumed for most processing flags. +

+ +
-Z [filename/wildcard]
+
see --zap +

+ +
--zap [filename/wildcard]
+
Zap, if used with other processing flags deletes +output files of the type about to be processed, prior to processing. If +-Z is used as the lone processing related flag (or in conjunction with a +combination of -[mMvVq]), will remove the related document output directory. + +

Alias -Z +

+
+ +

Command Line Modifiers

+
+ +

+

+ +
--no-ocn
+
[with --html --pdf or --epub] switches off object citation numbering. +Produces output without identifying numbers in margins of html or LaTeX/pdf +output. +

+ +
--no-annotate
+
strips output text of editor endnotes[^*1] denoted +by asterisk or dagger/plus sign +

+ +
--no-asterisk
+
strips output text of editor +endnotes[^*2] denoted by asterisk sign +

+ +
--no-dagger
+
strips output text of editor +endnotes[^+1] denoted by dagger/plus sign +

+
+ +

Database Commands

+
+ +


+dbi - database interface +


+-D or --pgsql set for PostgreSQL; -d or --sqlite default set for SQLite; -d is modifiable +with --db=[database type (PgSQL or SQLite)] +

+

+ +
--pg -v --createall
+
initial step, +creates required relations (tables, indexes) in existing PostgreSQL database +(a database should be created manually and given the same name as working +directory, as requested) (rb.dbi) [-dv --createall SQLite equivalent]. It +may be necessary to run sisu -Dv --createdb initially. NOTE: at the present +time for PostgreSQL it may be necessary to manually create the database. +The command would be working  directory  name  (without  path)]. Please use +only alphanumerics and underscores. +

+ +
--pg -v --import
+
[filename/wildcard] imports +data specified to PostgreSQL db (rb.dbi) [-dv --import SQLite equivalent] + +

+ +
--pg -v --update
+
[filename/wildcard] updates/imports specified data to PostgreSQL +db (rb.dbi) [-dv --update SQLite equivalent] +

+ +
--pg --remove
+
[filename/wildcard] +removes specified data from PostgreSQL db (rb.dbi) [-d --remove SQLite equivalent] + +

+ +
--pg --dropall
+
kills data and drops (PostgreSQL or SQLite) db, tables & +indexes [-d --dropall SQLite equivalent] +


+The -v is for verbose output. +
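+The database commands listed above follow one pattern: a database type, the verbose flag, an instruction, and optionally a file. The sketch below composes such command lines as text without executing anything. The helper name db_cmd and the file name example.sst are hypothetical; only the flags come from this page, and -v is added to every command for simplicity.

```shell
#!/bin/sh
# Compose database commands from the forms listed above.
# "db_cmd" and "example.sst" are hypothetical names; nothing is executed here.

# $1: database type (pg or sqlite), $2: instruction, $3: optional file name.
db_cmd() {
  case "$1" in
    pg|sqlite) ;;
    *) echo "unknown database type: $1" >&2; return 1 ;;
  esac
  # ${3:+ $3} appends the file name only when one is given.
  printf 'sisu --%s -v --%s%s\n' "$1" "$2" "${3:+ $3}"
}

db_cmd pg createall
db_cmd pg import example.sst
db_cmd sqlite update example.sst
```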

+
+ +

Shortcuts, Shorthand for Multiple Flags

+
+ +

+

+ +
--update [filename/wildcard]
+
Checks existing file output and runs the flags +required to update this output. This means that if only html and pdf output +was requested on previous runs, only the -hp flags will be applied, and +only these will be generated this time, together with the summary. This +can be very convenient, if you offer different outputs of different files, +and just want to do the same again. +

+ +
-0 to -5 [filename  or  wildcard]
+
Default +shorthand mappings (for v3, note that the defaults can be changed/configured +in the sisurc.yml file): +

+ +
-0
+
-NQhewpotbxXyYv [this  is  the  default  action  run + when  no options  are  given,  i.e.  on  ’sisu  [filename]’] +

+ +
-1
+
-Qhewpoty +

+ +
-2
+
-NQhewpotbxXy + +

+ +
-3
+
-NQhewpotbxXyY +

+ +
-4
+
-NQhewpotbxXDyY --update +

+ +
-5
+
-NQhewpotbxXDyYv --update +


+add -v for verbose mode and -c to toggle color state, e.g. sisu -2vc [filename + or  wildcard] +


+ +

consider -u for appended url info or -v for verbose output +
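+The default shorthand mappings above amount to a lookup table. A minimal sketch of that lookup follows; the function name expand_shortcut is hypothetical, the mappings are copied from the list above, and a local sisurc.yml may define different ones.

```shell
#!/bin/sh
# Expand a numeric shortcut into the flags it stands for, using the default
# mappings listed above (a sisurc.yml file may override these).
expand_shortcut() {
  case "$1" in
    -0) printf '%s\n' "-NQhewpotbxXyYv" ;;
    -1) printf '%s\n' "-Qhewpoty" ;;
    -2) printf '%s\n' "-NQhewpotbxXy" ;;
    -3) printf '%s\n' "-NQhewpotbxXyY" ;;
    -4) printf '%s\n' "-NQhewpotbxXDyY --update" ;;
    -5) printf '%s\n' "-NQhewpotbxXDyYv --update" ;;
    *)  echo "unknown shortcut: $1" >&2; return 1 ;;
  esac
}

# Show what -2 stands for.
expand_shortcut -2
```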

+
+ +

Command Line +with Flags - Batch Processing

+ +


+In the data directory run sisu -mh filename or wildcard, e.g. "sisu -h cisg.sst" +or "sisu -h *.{sst,ssm}" to produce an html version of all documents. +
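+The wildcard form above relies on brace expansion, which shells such as bash and zsh provide but plain POSIX sh does not. As a hedged alternative, the sketch below loops over both suffixes and prints one html command per source file found. The function name batch_html is hypothetical and the commands are printed rather than run.

```shell
#!/bin/sh
# List the html commands for every .sst and .ssm file in a directory.
# Brace expansion such as *.{sst,ssm} is a bash/zsh feature; this loop is a
# POSIX sh alternative. Prints commands only (dry run).
batch_html() {
  for f in "$1"/*.sst "$1"/*.ssm; do
    [ -e "$f" ] || continue   # skip patterns that matched nothing
    printf 'sisu --html %s\n' "$(basename "$f")"
  done
}

# Run against the current directory.
batch_html .
```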


+Running sisu (alone without any flags, filenames or wildcards) brings up +the interactive help, as does any sisu command that is not recognised. Enter +to escape. +

+

Help

+
+ +

+

Sisu Manual

+ +


+The most up to date information on sisu should be contained in the sisu_manual, +available at: +


+ <http://sisudoc.org/sisu/sisu_manual/ +>
+ +


+The manual can be generated from source, found respectively, either within +the SiSU tarball or installed locally at: +


+ ./data/doc/sisu/markup-samples/sisu_manual
+ +


+ /usr/share/doc/sisu/markup-samples/sisu_manual
+ +


+move to the respective directory and type e.g.: +


+ sisu sisu_manual.ssm
+ +

+

Sisu Man Pages

+ +


+If SiSU is installed on your system usual man commands should be available, +try: +


+ man sisu
+ +


+Most SiSU man pages are generated directly from sisu documents that are +used to prepare the sisu manual, the source files for which are located +within the SiSU tarball at: +


+ ./data/doc/sisu/markup-samples/sisu_manual
+ +


+Once installed, directory equivalent to: +


+ /usr/share/doc/sisu/markup-samples/sisu_manual
+ +


+Available man pages are converted back to html using man2html: +


+ /usr/share/doc/sisu/html/
+ +


+ ./data/doc/sisu/html
+ +


+An online version of the sisu man page is available here: +


+* various sisu man pages <http://www.jus.uio.no/sisu/man/> [^9] +


+* sisu.1 <http://www.jus.uio.no/sisu/man/sisu.1.html> [^10] +

+

Sisu Built-in Interactive +Help

+ +


+This is particularly useful for getting the current sisu setup/environment +information: +


+ sisu --help
+ +


+ sisu --help [subject]
+ +


+ sisu --help commands
+ +


+ sisu --help markup
+ +


+ sisu --help env [for  feedback  on  the  way  your  system  is
+ setup  with  regard  to  sisu  ]
+ +


+ sisu -V [environment  information,  same  as  above  command]
+ +


+ sisu (on its own provides version and some help information)
+ +


+Apart from real-time information on your current configuration the SiSU +manual and man pages are likely to contain more up-to-date information than +the sisu interactive help (for example on commands and markup). +


+NOTE: Running the command sisu (alone without any flags, filenames or wildcards) +brings up the interactive help, as does any sisu command that is not recognised. +Enter to escape. +

+

Introduction to Sisu Markup[^11]

+
+ +

+

Summary

+ +


+SiSU source documents are plaintext (UTF-8)[^12] files. +


+All paragraphs are separated by an empty line. +


+Markup comprises: +


+* at the top of a document, the document header made up of semantic meta-data +about the document and if desired additional processing instructions (such +as an instruction to automatically number headings from a particular level +down) +


+* followed by the prepared substantive text of which the most important +single characteristic is the markup of different heading levels, which +define the primary outline of the document structure. Markup of substantive +text includes: +


+ * heading levels, which define document structure
+ +


+ * text basic attributes, italics, bold etc.
+ +


+ * grouped text (objects), which are to be treated differently, such as +code
+ blocks or poems.
+ +


+ * footnotes/endnotes
+ +


+ * linked text and images
+ +


+ * paragraph actions, such as indent, bulleted, numbered-lists, etc.
+ +


+Some interactive help on markup is available, by typing sisu and selecting + +

markup or sisu --help markup +


+To check the markup in a file: +


+ sisu --identify [filename].sst
+ +


+ +

For a brief descriptive summary of markup history: +


+ sisu --query-history
+ +


+or for a particular version: +


+ sisu --query-0.38
+ +

+

Markup Examples

+ +

+

Online

+ +


+Online markup examples are available together with the respective outputs +produced from <http://www.jus.uio.no/sisu/SiSU/examples.html +> or from <http://www.jus.uio.no/sisu/sisu_examples/ +> + +


+There is of course this document, which provides a cursory overview of +sisu markup and the respective output produced: <http://www.jus.uio.no/sisu/sisu_markup/ +> + +


+an alternative presentation of markup syntax: /usr/share/doc/sisu/on_markup.txt.gz + +

+

Installed

+ +


+With SiSU installed, sample skins may be found in: /usr/share/doc/sisu/markup-samples +(or equivalent directory) and, if sisu-markup-samples is installed, also under: + +

/usr/share/doc/sisu/markup-samples-non-free +

+

Markup of Headers

+
+ +


+Headers contain either semantic meta-data about a document, which can be +used by any output module of the program, or processing instructions. +

+
+Note: the first line of a document may include information on the markup +version used in the form of a comment. Comments are a percentage mark at +the start of a paragraph (and as the first character in a line of text) +followed by a space and the comment: +


+

% this would be a comment
+
+

+

Sample Header

+ +


+This current document is loaded by a master document that has a header +similar to this one: +


+

% SiSU master 2.0
+@title: SiSU
+:subtitle: Manual
+@creator:
+:author: Amissah, Ralph
+@publisher: [publisher  name]
+@rights: Copyright (C) Ralph Amissah 2007, part of SiSU documentation,
+License GPL 3
+@classify:
+:type: information
+:topic_register: SiSU:manual;electronic documents:SiSU:manual
+:subject: ebook, epublishing, electronic book, electronic publishing,
+    electronic document, electronic citation, data structure,
+     citation systems, search
+% used_by: manual
+@date:
+:published: 2008-05-22
+:created: 2002-08-28
+:issued: 2002-08-28
+:available: 2002-08-28
+:modified: 2010-03-03
+@make:
+:num_top: 1
+:breaks: new=C; break=1
+:bold: /Gnu|Debian|Ruby|SiSU/
+:home_button_text: {SiSU}http://sisudoc.org; {git}http://git.sisudoc.org
+:footer: {SiSU}http://sisudoc.org; {git}http://git.sisudoc.org
+:manpage: name=sisu - documents: markup, structuring, publishing in multiple
+standard formats, and search;
+     synopsis=sisu [-abcDdeFhIiMmNnopqRrSsTtUuVvwXxYyZz0-9] [filename/wildcard
+ ]
+     . sisu [-Ddcv] [instruction]
+     . sisu [-CcFLSVvW]
+     . sisu --v4 [operations]
+     . sisu --v3 [operations]
+@links:
+{ SiSU Homepage }http://www.sisudoc.org/
+{ SiSU Manual }http://www.sisudoc.org/sisu/sisu_manual/
+{ Book Samples & Markup Examples }http://www.jus.uio.no/sisu/SiSU/examples.html
+{ SiSU Download }http://www.jus.uio.no/sisu/SiSU/download.html
+{ SiSU Changelog }http://www.jus.uio.no/sisu/SiSU/changelog.html
+{ SiSU Git repo }http://git.sisudoc.org/?p=code/sisu.git;a=summary
+{ SiSU List Archives }http://lists.sisudoc.org/pipermail/sisu/
+{ SiSU @ Debian }http://packages.qa.debian.org/s/sisu.html
+{ SiSU Project @ Debian }http://qa.debian.org/developer.php?login=sisu@lists.sisudoc.org
+{ SiSU @ Wikipedia }http://en.wikipedia.org/wiki/SiSU
+
+

+

Available Headers

+ +


+Header tags appear at the beginning of a document and provide meta information +on the document (such as the Dublin Core), or information as to how the +document as a whole is to be processed. All header instructions take the +form @headername: or, on the next line and indented by one space, :subheadername: + +

All Dublin Core meta tags are available +


+@identifier: information or instructions +


+where the "identifier" is a tag recognised by the program, and the "information" +or "instructions" belong to the tag/identifier specified. +


+Note: a header where used should only be used once; all headers apart from +@title: are optional; the @structure: header is used to describe document +structure, and can be useful to know. +


+ +

This is a sample header +


+

% SiSU 2.0 [declared  file-type  identifier  with  markup  version]
+
+


+

@title: [title  text] [this  header  is  the  only  one  that  is  mandatory]
+  :subtitle: [subtitle  if  any]
+  :language: English
+
+


+

@creator:
+:author: [Lastname,  First  names]
+:illustrator: [Lastname,  First  names]
+:translator: [Lastname,  First  names]
+:prepared_by: [Lastname,  First  names]
+
+


+

@date:
+:published: [year  or  yyyy-mm-dd]
+:created: [year  or  yyyy-mm-dd]
+:issued: [year  or  yyyy-mm-dd]
+:available: [year  or  yyyy-mm-dd]
+:modified: [year  or  yyyy-mm-dd]
+:valid: [year  or  yyyy-mm-dd]
+:added_to_site: [year  or  yyyy-mm-dd]
+:translated: [year  or  yyyy-mm-dd]
+
+


+

@rights:
+:copyright: Copyright (C) [Year  and  Holder]
+:license: [Use  License  granted]
+:text: [Year  and  Holder]
+:translation: [Name,  Year]
+:illustrations: [Name,  Year]
+
+


+

@classify:
+:topic_register: SiSU:markup sample:book;book:novel:fantasy
+:type:
+:subject:
+:description:
+:keywords:
+:abstract:
+:loc: [Library  of  Congress  classification]
+:dewey: Dewey classification
+
+


+

@identify:
+:isbn: [ISBN]
+:oclc:
+
+


+

@links: { SiSU }http://www.sisudoc.org
+  { FSF }http://www.fsf.org
+
+


+

@make:
+:num_top: 1
+:headings: [text  to  match  for  each  level  (e.g.  PART;  Chapter;  Section;  Article;  or  another:  none;  BOOK|FIRST|SECOND;  none;  CHAPTER;)]
+:breaks: new=:C; break=1
+:promo: sisu, ruby, sisu_search_libre, open_society
+:bold: [regular  expression  of  words/phrases  to  be  made  bold]
+:italics: [regular  expression  of  words/phrases  to  italicise]
+:home_button_text: {SiSU}http://sisudoc.org; {git}http://git.sisudoc.org
+:footer: {SiSU}http://sisudoc.org; {git}http://git.sisudoc.org
+
+


+

@original:
+:language: [language]
+
+


+

@notes:
+:comment:
+:prefix: [prefix  is  placed  just  after  table  of  contents]
+
+
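The header layout described above (an @headername: line, optionally followed by :subheadername: lines) can be read into a nested mapping. What follows is a minimal Python sketch under stated assumptions, not SiSU's own parser: the function name parse_header is hypothetical, the "_" key for a value given directly on the @headername: line is an invention of this example, and continuation lines and the @links: list form are ignored.

```python
import re

def parse_header(text):
    """Collect @header: and :subheader: lines into a nested dict (illustrative only)."""
    headers = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("% "):
            continue  # skip blank lines and comment lines
        top = re.match(r"@(\w+):\s*(.*)", line)
        sub = re.match(r":(\w+):\s*(.*)", line)
        if top:
            current = top.group(1)
            # "_" (hypothetical key) holds a value written on the @header: line itself
            headers[current] = {"_": top.group(2)} if top.group(2) else {}
        elif sub and current is not None:
            headers[current][sub.group(1)] = sub.group(2)
    return headers

sample = """% SiSU 2.0
@title: SiSU
 :subtitle: Manual
@creator:
 :author: Amissah, Ralph
"""
print(parse_header(sample))
```

Keeping sub-headers under their parent mirrors the way the sample header groups, for example, :author: under @creator:.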

+

Markup of Substantive Text

+
+ +

+

Heading Levels

+ +


+Heading levels are :A~, :B~, :C~, 1~, 2~, 3~ ... with :A - :C being part / section +headings, followed by other heading levels, and 1 - 6 being headings followed +by substantive text or sub-headings. :A~ is usually the title; :A~? is a conditional +level 1 heading (used where a stand-alone document may be imported into +another). +


+:A~ [heading  text] Top level heading [this  usually  has  similar  content + to  the  title  @title:  ] NOTE: the heading levels described here are in 0.38 +notation, see heading +


+:B~ [heading  text] Second level heading [this  is  a  heading  level  divider] + +


+:C~ [heading  text] Third level heading [this  is  a  heading  level  divider] + +


+1~ [heading  text] Top level heading preceding substantive text of document +or sub-heading 2, the heading level that would normally be marked 1. or 2. +or 3. etc. in a document, and the level on which sisu by default would break +html output into named segments, names are provided automatically if none +are given (a number), otherwise takes the form 1~my_filename_for_this_segment + +


+2~ [heading  text] Second level heading preceding substantive text of document +or sub-heading 3 , the heading level that would normally be marked 1.1 or +1.2 or 1.3 or 2.1 etc. in a document. +


+3~ [heading  text] Third level heading preceding substantive text of document, +that would normally be marked 1.1.1 or 1.1.2 or 1.2.1 or 2.1.1 etc. in a document + +


+

1~filename level 1 heading,
+% the primary division such as Chapter that is followed by substantive
+text, and may be further subdivided (this is the level on which by default
+html segments are made)
+
+
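The heading markers described above define the outline of a document. As a rough illustration only (SiSU does this internally; the names LEVELS, HEADING and outline are hypothetical, and the conditional :A~? form is not handled), the following Python sketch extracts the outline from a piece of markup, including the optional segment name that may follow the tilde:

```python
import re

# Depth of each heading marker, following the order :A~ :B~ :C~ 1~ 2~ 3~ ...
LEVELS = [":A", ":B", ":C", "1", "2", "3", "4", "5", "6"]
HEADING = re.compile(r"^(:[A-C]|[1-6])~(\S*)\s+(.*)$")

def outline(source):
    """Return (depth, segment_name, heading_text) for each heading line."""
    found = []
    for line in source.splitlines():
        m = HEADING.match(line)
        if m:
            marker, name, text = m.groups()
            found.append((LEVELS.index(marker), name or None, text))
    return found

doc = """:A~ SiSU Manual
:B~ Markup
1~headings Heading Levels
some substantive text
2~ Sub-heading
"""
for depth, name, text in outline(doc):
    print("  " * depth + text + (f"  [{name}]" if name else ""))
```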

+

Font Attributes

+ +


+markup example: +


+

normal text,  *{emphasis}*, !{bold text}!, /{italics}/, _{underscore}_,
+"{citation}",
+^{superscript}^, ,{subscript},, +{inserted text}+, -{strikethrough}-, #{monospace}#
+normal text
+*{emphasis}* [note:  can  be  configured  to  be  represented  by  bold,  italics
+ or  underscore]
+!{bold text}!
+/{italics}/
+_{underscore}_
+"{citation}"
+^{superscript}^
+,{subscript},
++{inserted text}+
+-{strikethrough}-
+#{monospace}#
+
+


+resulting output: +


+normal text, emphasis, bold text , italics, underscore, "citation", ^superscript^, +[subscript], ++inserted text++, --strikethrough--, monospace +


+ +

normal text +


+emphasis [note:  can  be  configured  to  be  represented  by  bold,  italics +or  underscore] +


+ +

bold text +


+ +

italics +


+ +

underscore +


+"citation" +


+^superscript^ +


+[subscript] +


+++inserted text++ +


+--strikethrough-- +


+ +

monospace +
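Each font attribute above uses the same pattern: a delimiter character, the text in curly braces, and the same delimiter again. A minimal Python sketch of that pattern follows. The mapping to HTML elements is an assumption made for this illustration and is not taken from the SiSU html output module; TAGS, INLINE and to_html are hypothetical names.

```python
import re

# Delimiter character -> HTML element chosen for this illustration
TAGS = {"*": "em", "!": "b", "/": "i", "_": "u", '"': "cite",
        "^": "sup", ",": "sub", "+": "ins", "-": "del", "#": "code"}
INLINE = re.compile(r'([*!/_"^,+\-#])\{(.+?)\}\1')

def to_html(text):
    """Replace x{...}x spans with the HTML element mapped to delimiter x."""
    def replace(m):
        tag = TAGS[m.group(1)]
        return f"<{tag}>{m.group(2)}</{tag}>"
    return INLINE.sub(replace, text)

print(to_html("normal text, *{emphasis}*, !{bold text}!, ,{subscript},"))
```

The back-reference \1 requires the closing delimiter to match the opening one, so ordinary braces in running text are left alone.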

+

Indentation and Bullets

+ +


+markup example: +


+

ordinary paragraph
+_1 indent paragraph one step
+_2 indent paragraph two steps
+_9 indent paragraph nine steps
+
+


+resulting output: +


+ +

ordinary paragraph +


+ indent paragraph one step
+ +


+ indent paragraph two steps
+ +


+ indent paragraph nine steps
+ +


+markup example: +


+

_* bullet text
+_1* bullet text, first indent
+_2* bullet text, two step indent
+
+


+resulting output: +


+* bullet text +


+ * bullet text, first indent
+ +


+ * bullet text, two step indent
+ +


+Numbered List (not to be confused with headings/titles, (document structure)) + +


+markup example: +


+

# numbered list                numbered list 1., 2., 3, etc.
+_# numbered list numbered list indented a., b., c., d., etc.
+
+
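The paragraph prefixes shown above (_1 to _9 for indents, _* and _1* for bullets, # and _# for numbered lists) can be recognised from the start of a paragraph. The following Python sketch is illustrative only: classify is a hypothetical name, the (kind, indent, text) result is a convention of this example, and hanging indents are not handled.

```python
import re

def classify(paragraph):
    """Return (kind, indent, text) for the leading marker of one paragraph."""
    if paragraph.startswith("# "):
        return ("numbered", 0, paragraph[2:])
    if paragraph.startswith("_# "):
        return ("numbered", 1, paragraph[3:])
    m = re.match(r"_([1-9]?)(\*?) (.*)", paragraph, re.DOTALL)
    if m and (m.group(1) or m.group(2)):
        kind = "bullet" if m.group(2) else "indent"
        return (kind, int(m.group(1) or 0), m.group(3))
    return ("plain", 0, paragraph)

for p in ["ordinary paragraph", "_1 indent paragraph one step",
          "_* bullet text", "_2* bullet text, two step indent"]:
    print(classify(p))
```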

+

Hanging Indents

+ +


+markup example: +


+

_0_1 first line no indent,
+rest of paragraph indented one step
+_1_0 first line indented,
+rest of paragraph no indent
+in each case level may be 0-9
+
+


+resulting output: +


+ first line no indent, rest of paragraph indented one step
+ +


+first line indented, rest of paragraph no indent +


+ +

in each case level may be 0-9 +

+

Footnotes / Endnotes

+ +


+Footnotes and endnotes are marked up at the location where they would be +indicated within a text. They are automatically numbered. The output type +determines whether footnotes or endnotes will be produced. +


+markup example: +


+

~{ a footnote or endnote }~
+
+


+resulting output: +


+[^13] +


+markup example: +


+

normal text~{ self contained endnote marker & endnote in one }~ continues
+
+


+resulting output: +


+normal text[^14] continues +


+markup example: +


+

normal text ~{* unnumbered asterisk footnote/endnote, insert multiple asterisks
+if required }~ continues
+normal text ~{** another unnumbered asterisk footnote/endnote }~ continues
+
+


+resulting output: +


+normal text [^*] continues +


+normal text [^**] continues +


+markup example: +


+

normal text ~[*  editors  notes,  numbered  asterisk  footnote/endnote  series
+ ]~ continues
+normal text ~[+  editors  notes,  numbered  plus  symbol  footnote/endnote  series
+ ]~ continues
+
+


+resulting output: +


+normal text [^*3] continues +


+normal text [^+2] continues +


+Alternative endnote pair notation for footnotes/endnotes: +


+

% note the endnote marker "~^"
+normal text~^ continues
+^~ endnote text following the paragraph in which the marker occurs
+
+


+ +

the standard and pair notation cannot be mixed in the same document +
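The inline note forms above can be illustrated with a short Python sketch that pulls notes out of a paragraph and leaves a marker in their place. It covers only the ~{ ... }~ and ~{* ... }~ forms, not the editor's note series or the pair notation; NOTE and extract_notes are hypothetical names, and the [^n] marker style simply copies the way notes are shown in this document.

```python
import re

NOTE = re.compile(r"~\{(\*+)?\s*(.*?)\s*\}~", re.DOTALL)

def extract_notes(text):
    """Replace inline ~{ ... }~ notes with markers; return (text, notes)."""
    notes = []
    counter = 0

    def replace(match):
        nonlocal counter
        stars, body = match.groups()
        if stars:                      # ~{* ... }~ : unnumbered asterisk note
            marker = stars
        else:                          # ~{ ... }~ : automatically numbered note
            counter += 1
            marker = str(counter)
        notes.append((marker, body))
        return f"[^{marker}]"

    return NOTE.sub(replace, text), notes

body, notes = extract_notes(
    "normal text~{ self contained endnote marker & endnote in one }~ continues, "
    "normal text ~{* unnumbered asterisk footnote/endnote }~ continues"
)
print(body)
print(notes)
```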

+

Links

+ +

+ +

Naked Urls Within Text, Dealing with Urls

+ +


+urls found within text are marked up automatically. A url within text is +automatically hyperlinked to itself and by default decorated with angle +brackets, unless it is contained within a code block (in which case it +is passed as normal text), or escaped by a preceding underscore (in which +case the decoration is omitted). +


+markup example: +


+

normal text http://www.sisudoc.org/ continues
+
+


+resulting output: +


+normal text <http://www.sisudoc.org/ +> continues +


+ +

An escaped url without decoration +


+markup example: +


+

normal text _http://www.sisudoc.org/ continues
+deb _http://www.jus.uio.no/sisu/archive unstable main non-free
+
+


+resulting output: +


+normal text http://www.sisudoc.org/ continues +


+deb http://www.jus.uio.no/sisu/archive unstable main non-free +


+where a code block is used there is neither decoration nor hyperlinking; +code blocks are discussed later in this document. +


+resulting output: +


+

deb http://www.jus.uio.no/sisu/archive unstable main non-free
+deb-src http://www.jus.uio.no/sisu/archive unstable main non-free
+
+
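The rule for naked urls described in this section (decorate with angle brackets unless escaped by a preceding underscore) can be sketched in Python as follows. This is an illustration of the rule only: URL and decorate_urls are hypothetical names, hyperlinking is left out, and the sketch knows nothing about code blocks or about urls that follow linked text in curly braces.

```python
import re

URL = re.compile(r"(_?)(https?://[^\s<>]+)")

def decorate_urls(text):
    """Wrap naked urls in angle brackets; a leading underscore suppresses this."""
    def replace(match):
        escaped, url = match.groups()
        return url if escaped else f"<{url}>"
    return URL.sub(replace, text)

print(decorate_urls("normal text http://www.sisudoc.org/ continues"))
print(decorate_urls("deb _http://www.jus.uio.no/sisu/archive unstable main non-free"))
```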

+

Linking Text

+ +


+ +

To link text or an image to a url the markup is as follows +


+markup example: +


+

about { SiSU }http://url.org markup
+
+


+resulting output: +


+about SiSU <http://www.sisudoc.org/ +> markup +


+ +

A shortcut notation is available so the url link may also be provided automatically + +

as a footnote +


+markup example: +


+

about {~^ SiSU }http://url.org markup
+
+


+resulting output: +


+about SiSU <http://www.sisudoc.org/ +> [^15] markup +


+Internal document links to a tagged location, including an ocn +


+markup example: +


+

about { text links }#link_text
+
+


+resulting output: +


+about ⌠text links⌡⌈link_text⌋ +


+ +

Shared document collection link +


+markup example: +


+

about { SiSU book markup examples }:SiSU/examples.html
+
+


+resulting output: +


+about ⌠ SiSU book markup examples⌡⌈:SiSU/examples.html⌋ +
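The two url link forms above, { text }url and the footnote shortcut {~^ text }url, can be sketched in Python. The sketch assumes html anchors as the target and expands the shortcut into a link followed by a ~{ url }~ note; internal (#name) and shared collection (:path) links are not handled, and LINK and link_to_html are hypothetical names.

```python
import re

LINK = re.compile(r"\{(~\^)?\s*(.+?)\s*\}(https?://\S+)")

def link_to_html(text):
    """Turn { text }url into an anchor; {~^ text }url also appends the url as a note."""
    def replace(match):
        as_note, label, url = match.groups()
        anchor = f'<a href="{url}">{label}</a>'
        # the shortcut form adds the url again as an inline note
        return anchor + (f" ~{{ {url} }}~" if as_note else "")
    return LINK.sub(replace, text)

print(link_to_html("about { SiSU }http://url.org markup"))
print(link_to_html("about {~^ SiSU }http://url.org markup"))
```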

+

Linking +Images

+ +


+markup example: +


+

{ tux.png 64x80 }image
+% various url linked images
+{tux.png 64x80 "a better way" }http://www.sisudoc.org/
+{GnuDebianLinuxRubyBetterWay.png 100x101 "Way Better - with Gnu/Linux, Debian
+and Ruby" }http://www.sisudoc.org/
+{~^ ruby_logo.png "Ruby" }http://www.ruby-lang.org/en/
+
+


+resulting output: +


+[ tux.png ] +


+tux.png 64x80 "Gnu/Linux - a better way" <http://www.sisudoc.org/ +> +


+GnuDebianLinuxRubyBetterWay.png 100x101 "Way Better - with Gnu/Linux, Debian +and Ruby" <http://www.sisudoc.org/ +> +


+ruby_logo.png 70x90 "Ruby" <http://www.ruby-lang.org/en/ +> [^16] +


+ +

linked url footnote shortcut +


+

{~^ [text  to  link] }http://url.org
+% maps to: { [text  to  link] }http://url.org ~{ http://url.org }~
+% which produces hyper-linked text within a document/paragraph, with an
+endnote providing the url for the text location used in the hyperlink
+
+


+

text marker *~name
+
+


+note at a heading level the same is automatically achieved by providing +names to headings 1, 2 and 3 i.e. 2~[name] and 3~[name] or in the case of +auto-heading numbering, without further intervention. +

+

Link Shortcut for +Multiple Versions of a Sisu Document in the Same Directory

+TREE +


+markup example: +


+

!_ /{"Viral Spiral"}/, David Bollier
+{ "Viral Spiral", David Bollier [3sS]}viral_spiral.david_bollier.sst
+
+


+ Viral Spiral, David Bollier +

"Viral Spiral", David Bollier <http://corundum/sisu_manual/en/manifest/viral_spiral.david_bollier.html +> + document manifest <http://corundum/sisu_manual/en/manifest/viral_spiral.david_bollier.html +>
+ ⌠html, segmented text⌡「http://corundum/sisu_manual/en/html/viral_spiral.david_bollier.html」 +
+ ⌠html, scroll, document in one⌡「http://corundum/sisu_manual/en/html/viral_spiral.david_bollier.html」 +
+ ⌠epub⌡「http://corundum/sisu_manual/en/epub/viral_spiral.david_bollier.epub」 +
+ ⌠pdf, landscape⌡「http://corundum/sisu_manual/en/pdf/viral_spiral.david_bollier.pdf」 +
+ ⌠pdf, portrait⌡「http://corundum/sisu_manual/en/pdf/viral_spiral.david_bollier.pdf」 +
+ ⌠odf: odt, open document text⌡「http://corundum/sisu_manual/en/odt/viral_spiral.david_bollier.odt」 +
+ ⌠xhtml scroll⌡「http://corundum/sisu_manual/en/xhtml/viral_spiral.david_bollier.xhtml」 +
+ ⌠xml, sax⌡「http://corundum/sisu_manual/en/xml/viral_spiral.david_bollier.xml」 +
+ ⌠xml, dom⌡「http://corundum/sisu_manual/en/xml/viral_spiral.david_bollier.xml」 +
+ ⌠concordance⌡「http://corundum/sisu_manual/en/html/viral_spiral.david_bollier.html」 +
+ ⌠dcc, document content certificate (digests)⌡「http://corundum/sisu_manual/en/digest/viral_spiral.david_bollier.txt」 +
+ ⌠markup source text⌡「http://corundum/sisu_manual/en/src/viral_spiral.david_bollier.sst」 +
+ ⌠markup source (zipped) pod⌡「http://corundum/sisu_manual/en/pod/viral_spiral.david_bollier.sst.zip」 +
+ +

+

Grouped Text

+ +

+

Tables

+ +


+ +

Tables may be prepared in either of two forms +


+markup example: +


+

table{ c3; 40; 30; 30;
+This is a table
+this would become column two of row one
+column three of row one is here
+And here begins another row
+column two of row two
+column three of row two, and so on
+}table
+
+


+resulting output: +

This is a table             | this would become column two of row one | column three of row one is here
+And here begins another row | column two of row two                   | column three of row two, and so on +


+ +

a second form may be easier to work with in cases where there is not much + +

information in each column +


+markup example: [^18] +


+

!_ Table 3.1: Contributors to Wikipedia, January 2001 - June 2005
+{table~h 24; 12; 12; 12; 12; 12; 12;}
+                                |Jan. 2001|Jan. 2002|Jan. 2003|Jan. 2004|July 2004|June 2006
+Contributors*                   |       10|      472|    2,188|    9,653|   25,011|   48,721
+Active contributors**           |        9|      212|      846|    3,228|    8,442|   16,945
+Very active contributors***     |        0|       31|      190|      692|    1,639|    3,016
+No. of English language articles|       25|   16,000|  101,000|  190,000|  320,000|  630,000
+No. of articles, all languages  |       25|   19,000|  138,000|  490,000|  862,000|1,600,000
+* Contributed at least ten times; ** at least 5 times in last month; *** more than 100 times in last month.
+
+


+resulting output: +


+Table 3.1: Contributors to Wikipedia, January 2001 - June 2005 +

                                |Jan. 2001|Jan. 2002|Jan. 2003|Jan. 2004|July 2004|June 2006
+Contributors*                   |       10|      472|    2,188|    9,653|   25,011|   48,721
+Active contributors**           |        9|      212|      846|    3,228|    8,442|   16,945
+Very active contributors***     |        0|       31|      190|      692|    1,639|    3,016
+No. of English language articles|       25|   16,000|  101,000|  190,000|  320,000|  630,000
+No. of articles, all languages  |       25|   19,000|  138,000|  490,000|  862,000|1,600,000 +


+* Contributed at least ten times; ** at least 5 times in last month; *** +more than 100 times in last month. +
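The first table form can be read mechanically. The sketch below is an interpretation of the example above and not a statement of the SiSU table grammar: it assumes that cN on the opening line gives the number of columns, that the numbers after it are column widths, and that each non-empty line up to }table is one cell. The name parse_table is hypothetical.

```python
def parse_table(block):
    """Parse the table{ cN; w1; w2; ... form into (column_widths, rows)."""
    lines = [ln.strip() for ln in block.strip().splitlines()]
    spec = [s.strip() for s in lines[0][len("table{"):].split(";") if s.strip()]
    columns = int(spec[0].lstrip("c"))          # "c3" -> 3 columns (assumed meaning)
    widths = [int(w) for w in spec[1:]]         # one width per column (assumed meaning)
    cells = [ln for ln in lines[1:] if ln and ln != "}table"]
    rows = [cells[i:i + columns] for i in range(0, len(cells), columns)]
    return widths, rows

example = """table{ c3; 40; 30; 30;
This is a table
this would become column two of row one
column three of row one is here
And here begins another row
column two of row two
column three of row two, and so on
}table"""

widths, rows = parse_table(example)
for row in rows:
    print(" | ".join(row))
```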

+

Poem

+ +


+basic markup: +


+

poem{
+  Your poem here
+}poem
+Each verse in a poem is given an object number.
+
+


+markup example: +


+

poem{
+                    ‘Fury said to a
+                   mouse, That he
+                 met in the
+               house,
+            "Let us
+              both go to
+                law:  I will
+                  prosecute
+                    YOU.  --Come,
+                       I’ll take no
+                        denial; We
+                     must have a
+                 trial:  For
+              really this
+           morning I’ve
+          nothing
+         to do."
+           Said the
+             mouse to the
+               cur, "Such
+                 a trial,
+                   dear Sir,
+                         With
+                     no jury
+                  or judge,
+                would be
+              wasting
+             our
+              breath."
+               "I’ll be
+                 judge, I’ll
+                   be jury,"
+                         Said
+                    cunning
+                      old Fury:
+                     "I’ll
+                      try the
+                         whole
+                          cause,
+                             and
+                        condemn
+                       you
+                      to
+                       death."’
+}poem
+
+


+resulting output: +

‘Fury said to a
+ mouse, That he
+ met in the
+ house,
+ "Let us
+ both go to
+ law: I will
+ prosecute
+ YOU. --Come,
+ I’ll take no
+ denial; We
+ must have a
+ trial: For
+ really this
+ morning I’ve
+ nothing
+ to do."
+ Said the
+ mouse to the
+ cur, "Such
+ a trial,
+ dear Sir,
+ With
+ no jury
+ or judge,
+ would be
+ wasting
+ our
+ breath."
+ "I’ll be
+ judge, I’ll
+ be jury,"
+ Said
+ cunning
+ old Fury:
+ "I’ll
+ try the
+ whole
+ cause,
+ and
+ condemn
+ you
+ to
+ death."’
+ +

+

Group

+ +


+basic markup: +


+

group{
+  Your grouped text here
+}group
+A group is treated as an object and given a single object number.
+
+


+markup example: +


+

group{
+                    ‘Fury said to a
+                   mouse, That he
+                 met in the
+               house,
+            "Let us
+              both go to
+                law:  I will
+                  prosecute
+                    YOU.  --Come,
+                       I’ll take no
+                        denial; We
+                     must have a
+                 trial:  For
+              really this
+           morning I’ve
+          nothing
+         to do."
+           Said the
+             mouse to the
+               cur, "Such
+                 a trial,
+                   dear Sir,
+                         With
+                     no jury
+                  or judge,
+                would be
+              wasting
+             our
+              breath."
+               "I’ll be
+                 judge, I’ll
+                   be jury,"
+                         Said
+                    cunning
+                      old Fury:
+                     "I’ll
+                      try the
+                         whole
+                          cause,
+                             and
+                        condemn
+                       you
+                      to
+                       death."’
+}group
+
+


+resulting output: +

‘Fury said to a
+ mouse, That he
+ met in the
+ house,
+ "Let us
+ both go to
+ law: I will
+ prosecute
+ YOU. --Come,
+ I’ll take no
+ denial; We
+ must have a
+ trial: For
+ really this
+ morning I’ve
+ nothing
+ to do."
+ Said the
+ mouse to the
+ cur, "Such
+ a trial,
+ dear Sir,
+ With
+ no jury
+ or judge,
+ would be
+ wasting
+ our
+ breath."
+ "I’ll be
+ judge, I’ll
+ be jury,"
+ Said
+ cunning
+ old Fury:
+ "I’ll
+ try the
+ whole
+ cause,
+ and
+ condemn
+ you
+ to
+ death."’
+ +

+

Code

+ +


+Code tags code{ ... }code (used as with other group tags described above) +are used to escape regular sisu markup, and have been used extensively +within this document to provide examples of SiSU markup. You cannot however +use code tags to escape code tags. They are however used in the same way +as group or poem tags. +


+A code-block is treated as an object and given a single object number. [an +option  to  number  each  line  of  code  may  be  considered  at some  later  time] + +


+for comparison, the same text with code tags used instead of poem tags, resulting output: +


+

                    ‘Fury said to a
+                   mouse, That he
+                 met in the
+               house,
+            "Let us
+              both go to
+                law:  I will
+                  prosecute
+                    YOU.  --Come,
+                       I’ll take no
+                        denial; We
+                     must have a
+                 trial:  For
+              really this
+           morning I’ve
+          nothing
+         to do."
+           Said the
+             mouse to the
+               cur, "Such
+                 a trial,
+                   dear Sir,
+                         With
+                     no jury
+                  or judge,
+                would be
+              wasting
+             our
+              breath."
+               "I’ll be
+                 judge, I’ll
+                   be jury,"
+                         Said
+                    cunning
+                      old Fury:
+                     "I’ll
+                      try the
+                         whole
+                          cause,
+                             and
+                        condemn
+                       you
+                      to
+                       death."’
+
+


+From SiSU 2.7.7 on you can number codeblocks by placing a hash after the +opening code tag code{# as demonstrated here: +


+

1  |                    ‘Fury said to a
+2  |                   mouse, That he
+3  |                 met in the
+4  |               house,
+5  |            "Let us
+6  |              both go to
+7  |                law:  I will
+8  |                  prosecute
+9  |                    YOU.  --Come,
+10 |                       I’ll take no
+11 |                        denial; We
+12 |                     must have a
+13 |                 trial:  For
+14 |              really this
+15 |           morning I’ve
+16 |          nothing
+17 |         to do."
+18 |           Said the
+19 |             mouse to the
+20 |               cur, "Such
+21 |                 a trial,
+22 |                   dear Sir,
+23 |                         With
+24 |                     no jury
+25 |                  or judge,
+26 |                would be
+27 |              wasting
+28 |             our
+29 |              breath."
+30 |               "I’ll be
+31 |                 judge, I’ll
+32 |                   be jury,"
+33 |                         Said
+34 |                    cunning
+35 |                      old Fury:
+36 |                     "I’ll
+37 |                      try the
+38 |                         whole
+39 |                          cause,
+40 |                             and
+41 |                        condemn
+42 |                       you
+43 |                      to
+44 |                       death."’
+
+
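The numbered code block output shown above prefixes each line with a left-aligned line number and a bar. A minimal Python sketch that reproduces this layout follows; number_lines is a hypothetical name, and the three character number column is read off the sample above rather than taken from the SiSU source.

```python
def number_lines(code):
    """Prefix each line with a left-aligned line number and a bar."""
    return "\n".join(
        f"{n:<3}|{line}" for n, line in enumerate(code.splitlines(), 1)
    )

print(number_lines("first line\n  second line, indented\nthird line"))
```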

+

Additional Breaks - Linebreaks Within Objects, Column and Page-breaks

+ +

+

Line-breaks

+ +

+
+To break a line within a "paragraph object", two backslashes \\ with a space +before and a space or newline after them may be used. +


+

To break a line within a "paragraph object",
+two backslashes \\ with a space before
+and a space or newline after them \\
+may be used.
+
+


+The html break br enclosed in angle brackets (though undocumented) is available +in versions prior to 3.0.13 and 2.9.7 (it remains available for the time being, +but is deprecated). +

+

Page Breaks

+ +


+Page breaks are only relevant and honored in some output formats. A page +break or a new page may be inserted manually using the following markup +on a line on its own: +


+page new =\\= or <:pn> breaks the page, starts a new page. +


+page break -\\- or <:pb> breaks a column, starts a new column, if using columns, +else breaks the page, starts a new page. +


+

-\\-
+or
+<:pb>
+
+


+ +

or +


+

=\\=
+or
+<:pn>
+
+

+

Book Index

+ +


+To make an index, append to a paragraph the book index terms that relate to it, +using an equal sign and curly braces. +


+Currently two levels are provided, a main term and if needed a sub-term. +Sub-terms are separated from the main term by a colon. +


+

  Paragraph containing main term and sub-term.
+  ={Main term:sub-term}
+
+


+The index syntax starts on a new line, but there should not be an empty +line between paragraph and index markup. +


+The structure of the resulting index would be: +


+

  Main term, 1
+    sub-term, 1
+
+


+Several terms may relate to a paragraph, they are separated by a semicolon. +If the term refers to more than one paragraph, indicate the number of paragraphs. + +


+

  Paragraph containing main term, second term and sub-term.
+  ={first term; second term: sub-term}
+
+


+The structure of the resulting index would be: +


+

  First term, 1,
+  Second term, 1,
+    sub-term, 1
+
+


+If multiple sub-terms appear under one paragraph, they are separated under +the main term heading from each other by a pipe symbol. +


+

  Paragraph containing main term, second term and sub-term.
+  ={Main term:sub-term+1|second sub-term}
+  A paragraph that continues discussion of the first sub-term
+
+


+The plus one in the example provided indicates the first sub-term spans +one additional paragraph. The logical structure of the resulting index would +be: +


+

  Main term, 1,
+    sub-term, 1-2,
+    second sub-term, 1,
+
+
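The index markup described above has a small grammar: terms separated by semicolons, sub-terms introduced by a colon and separated from each other by a pipe, and an optional +N giving the number of additional paragraphs spanned. A Python sketch of that grammar follows; split_span and parse_index are hypothetical names and the tuple layout is a convention of this example only.

```python
import re

def split_span(term):
    """Split 'term+N' into (term, N); N is the number of extra paragraphs spanned."""
    m = re.fullmatch(r"\s*(.*?)\s*(?:\+(\d+))?\s*", term)
    return m.group(1), int(m.group(2) or 0)

def parse_index(markup):
    """Parse ={term:sub+N|sub; term2} into [(main, extra, [(sub, extra), ...])]."""
    inner = markup.strip()[2:-1]                 # drop the leading ={ and trailing }
    entries = []
    for entry in inner.split(";"):
        main, _, subs = entry.partition(":")
        entries.append((*split_span(main),
                        [split_span(s) for s in subs.split("|") if s.strip()]))
    return entries

print(parse_index("={Main term:sub-term+1|second sub-term}"))
print(parse_index("={first term; second term: sub-term}"))
```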

+

Composite Documents Markup

+
+ +


+It is possible to build a document by creating a master document that requires +other documents. The documents required may be complete documents that could +be generated independently, or they could be markup snippets, prepared +so as to be easily available to be placed within another text. If the calling +document is a master document (built from other documents), it should be +named with the suffix .ssm. Within this document you would provide information +on the other documents that should be included within the text. These may +be other documents that would be processed in a regular way (.sst, regular markup +files), or markup bits prepared only for inclusion within a master document +(.ssi, insert/information files). A secondary file of the composite document +is built prior to processing with the same prefix and the suffix ._sst +

+
+ +

basic markup for importing a document into a master document +


+

<< filename1.sst
+<< filename2.ssi
+
+


+The form described above should be relied on. Within the Vim editor it results +in the text thus linked becoming hyperlinked to the document it is calling +in, which is convenient for editing. +
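The import mechanism can be pictured as a simple text substitution: each "<< filename" line in the master document is replaced by the contents of the named file. The Python sketch below illustrates the idea only. SiSU itself builds the secondary ._sst file as part of its own processing; here expand_master is a hypothetical name, the result is returned as a string rather than written to disk, and files are assumed to sit in the same directory as the master, with no nested imports.

```python
from pathlib import Path

def expand_master(master_path):
    """Return the master document text with each '<< filename' line replaced
    by the contents of that file (resolved relative to the master)."""
    master = Path(master_path)
    out = []
    for line in master.read_text(encoding="utf-8").splitlines():
        if line.startswith("<< "):
            included = master.parent / line[3:].strip()
            out.append(included.read_text(encoding="utf-8").rstrip("\n"))
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

A call such as expand_master("sisu_manual.ssm") would then return the combined markup as one string.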

+

Sisu Filetypes

+
+ +


+SiSU has plaintext and binary filetypes, and can process either type of +document. +

+

.sst .ssm .ssi Marked Up Plain Text

+ +

+

+ +
SiSU documents are prepared +as plain-text (utf-8) files with SiSU markup. They may make reference to and +contain images (for example), which are stored in the directory beneath +them _sisu/image. SiSU plaintext markup files are of three types that +may be distinguished by the file extension used: regular text .sst; master +documents .ssm, composite documents that incorporate other text, which can be +any regular text or text insert; and inserts .ssi, the contents of which are +like regular text except that they are not processed on their own.
+ +


+SiSU processing can be done directly against sisu documents, which may +be located locally or on a remote server for which a url is provided. +


+SiSU source markup can be shared with the command: +


+ sisu -s [filename]
+ +

+
+ +

Sisu Text - Regular Files (.sst)

+ +


+The most common form of document in SiSU, see the section on SiSU markup. + +

+

Sisu Master Files (.ssm)

+ +


+Master files are composite documents that incorporate other SiSU documents, which may be +either regular SiSU text (.sst) that can also be generated independently, or inserts +prepared solely for the purpose of being incorporated into one or more +master documents. +


+The mechanism by which master files incorporate other documents is described +as one of the headings under SiSU markup in the SiSU manual. +


+Note: Master documents may be prepared in a similar way to regular documents, +and processing will occur normally if a .sst file is renamed .ssm without +requiring any other documents; the .ssm marker flags that the document may +contain other documents. +


+Note: a secondary file of the composite document is built prior to processing +with the same prefix and the suffix ._sst [^19] +

+

Sisu Insert Files (.ssi)

+ +

+
+Inserts are documents prepared solely for the purpose of being incorporated +into one or more master documents. They resemble regular SiSU text files +except they are ignored by the SiSU processor. Making a file a .ssi file +is a quick and convenient way of flagging that it is not intended that +the file should be processed on its own. +

+

Sisupod, Zipped Binary Container +(sisupod.zip, .ssp)

+ +


+A sisupod is a zipped SiSU text file or set of SiSU text files and any +associated images that they contain (this will be extended to include sound +and multimedia-files) +

+

+ +
SiSU plaintext files rely on a recognised directory +structure to find contents such as images associated with documents, but +all images (for example) for all documents contained in a directory are located +in the sub-directory _sisu/image. Without the ability to create a sisupod +it can be inconvenient to manually identify all other files associated +with a document. A sisupod automatically bundles all associated files with +the document that is turned into a pod. +


+The structure of the sisupod is such that it may for example contain a +single document and its associated images; a master document and its associated +documents and anything else; or the zipped contents of a whole directory +of prepared SiSU documents. +


+The command to create a sisupod is: +


+ sisu -S [filename]
+ +


+Alternatively, make a pod of the contents of a whole directory: +


+ sisu -S
+ +


+SiSU processing can be done directly against a sisupod, which may be located +locally or on a remote server for which a url is provided. +


+<http://www.sisudoc.org/sisu/sisu_commands +> +


+<http://www.sisudoc.org/sisu/sisu_manual +> +

+
+ +

Configuration

+
+ +

+

Configuration Files

+ +

+

Config.yml

+ +


+SiSU configuration parameters are adjusted in the configuration file, which +can be used to override the defaults set. This includes such things as which +directory interim processing should be done in and where the generated +output should be placed. +


+The SiSU configuration file is a yaml file, which means indentation is +significant. +


+SiSU resource configuration is determined by looking at the following files +if they exist: +


+ ./_sisu/v4/sisurc.yml
+ +


+ ./_sisu/sisurc.yml
+ +


+ ~/.sisu/v4/sisurc.yml
+ +


+ ~/.sisu/sisurc.yml
+ +


+ /etc/sisu/v4/sisurc.yml
+ +


+ /etc/sisu/sisurc.yml
+ +


+The search is in the order listed, and the first one found is used. +


+In the absence of instructions in any of these it falls back to the internal +program defaults. +
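The lookup described here (first file found wins, otherwise internal defaults) can be sketched in Python as below. The list of paths is the one given in this section. The function name `first_config` and the `exists` parameter are illustrative assumptions, not part of SiSU.

```python
import os

# Candidate locations, in the search order listed in this section.
CANDIDATES = [
    "./_sisu/v4/sisurc.yml",
    "./_sisu/sisurc.yml",
    "~/.sisu/v4/sisurc.yml",
    "~/.sisu/sisurc.yml",
    "/etc/sisu/v4/sisurc.yml",
    "/etc/sisu/sisurc.yml",
]


def first_config(candidates=CANDIDATES, exists=os.path.isfile):
    """Return the first candidate that exists, or None to signal 'use built-in defaults'."""
    for path in candidates:
        expanded = os.path.expanduser(path)  # resolve a leading "~" to the home directory
        if exists(expanded):
            return expanded
    return None


print(first_config() or "no sisurc.yml found; internal defaults apply")
```

The `exists` parameter is there only so the search order can be exercised without touching the filesystem.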


+Configuration determines the output and processing directories and the +database access details. +


+ +

If SiSU is installed a sample sisurc.yml may be found in /etc/sisu/sisurc.yml + +

+

Sisu_document_make

+ +


+Most sisu document headers relate to metadata; the exception is the @make: +header, which provides processing related information. The default contents +of the @make header may be set by placing them in a file sisu_document_make. + +


+The search order is as for resource configuration: +


+ ./_sisu/v4/sisu_document_make
+ +


+ ./_sisu/sisu_document_make
+ +


+ ~/.sisu/v4/sisu_document_make
+ +


+ ~/.sisu/sisu_document_make
+ +


+ /etc/sisu/v4/sisu_document_make
+ +


+ /etc/sisu/sisu_document_make
+ +


+A sample sisu_document_make can be found in the _sisu/ directory, +along with the provided sisu markup samples. +

+

Css - Cascading Style Sheets +(for Html, Xhtml and Xml)

+
+ +


+CSS files to modify the appearance of SiSU html, XHTML or XML may be placed +in the configuration directory: ./_sisu/css; ~/.sisu/css; or /etc/sisu/css +and these will be copied to the output directories with the command sisu +-CC. +


+The basic CSS file for html output is html.css; placing a file of that +name in directory _sisu/css or equivalent will result in the default file +of that name being overwritten. +


+HTML: html.css +


+XML DOM: dom.css +


+XML SAX: sax.css +


+XHTML: xhtml.css +


+The default homepage may use homepage.css or html.css +


+Under consideration is to permit the placement of a CSS file with a different +name in the _sisu/css directory or equivalent.[^20] +

+

Organising Content +- Directory Structure and Mapping

+
+ +


+SiSU v3 has new options for the source directory tree, and output directory +structures of which there are 3 alternatives. +

+

Document Source Directory

+ +

+
+The document source directory is the directory in which sisu processing +commands are given. It contains the sisu source files (.sst .ssm .ssi), or +(for sisu v3 may contain) subdirectories with language codes which contain +the sisu source files, so all English files would go in subdirectory en/, +French in fr/, Spanish in es/ and so on. ISO 639-1 codes are used (as varied +by po4a). A list of available languages (and possible sub-directory names) +can be obtained with the command "sisu --help lang". The list of languages +is limited to languages supported by XeTeX polyglossia. +

+

General Directories

+ +

+
+

% files stored at this level e.g. sisu_manual.sst or
+% for sisu v3 may be under language sub-directories
+% e.g.
+ ./subject_name/en
+ ./subject_name/fr
+ ./subject_name/es
+ ./subject_name/_sisu
+ ./subject_name/_sisu/css
+ ./subject_name/_sisu/image
+
+

+

Document Output Directory Structures

+ +

+

Output Directory Root

+ +


+The output directory root can be set in the sisurc.yml file. Under the root, +subdirectories are made for each directory in which a document set resides. +If you have a directory named poems or conventions, that directory will +be created under the output directory root and the output for all documents +contained in the directory of a particular name will be generated to subdirectories +beneath that directory (poems or conventions). A document will be placed +in a subdirectory of the same name as the document with the filetype identifier +stripped (.sst .ssm). +


+The last part of a directory path, representing the sub-directory in which +a document set resides, is the directory name that will be used for the +output directory. This has implications for the organisation of document +collections as it could make sense to place documents of a particular subject, +or type within a directory identifying them. This grouping as suggested +could be by subject (sales_law, english_literature); or just as conveniently +by some other classification (X University). The mapping means it is also +possible to place in the same output directory documents that are for organisational +purposes kept separately, for example documents on a given subject of two +different institutions may be kept in two different directories of the +same name, under a directory named after each institution, and these would +be output to the same output directory. Skins could be associated with each +institution on a directory basis and resulting documents will take on the +appropriate different appearance. +
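The mapping from source directory to output directory can be sketched as follows. This is a minimal Python illustration of the rule stated here (only the last path component is used). The function name `output_subdir` and the example paths are hypothetical.

```python
import os


def output_subdir(output_root, source_dir):
    """Join the output root with the last component of the source directory path."""
    name = os.path.basename(os.path.normpath(source_dir))
    return os.path.join(output_root, name)


# Two institutions keeping a subject in separate trees still share one output directory.
a = output_subdir("/var/www", "/home/user/x_university/sales_law")
b = output_subdir("/var/www", "/home/user/y_university/sales_law/")
print(a, a == b)
```

Because only the final component matters, directories of the same name under different parents map to the same output directory, which is the behaviour the paragraph describes.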

+

Alternative Output Structures

+ +


+There are 3 possible output structures, described as being by language, +by filetype or by filename; the selection is made in sisurc.yml +


+

#% output_dir_structure_by: language; filetype; or filename
+output_dir_structure_by: language   #(language & filetype, preferred?)
+#output_dir_structure_by: filetype
+#output_dir_structure_by: filename  #(default, closest to original v1 & v2)
+
+

+

by Language

+ +


+ +



+The by language directory structure separates output files by language +code (all files of a given language), and within the language directory +by filetype. +


+ +

Its selection is configured in sisurc.yml +


+output_dir_structure_by: language +


+

    |-- en
+    |-- epub
+    |-- hashes
+    |-- html
+    | |-- viral_spiral.david_bollier
+    | |-- manifest
+    | |-- qrcode
+    | |-- odt
+    | |-- pdf
+    | |-- sitemaps
+    | |-- txt
+    | |-- xhtml
+    | ‘-- xml
+    |-- po4a
+    | ‘-- live-manual
+    |     |-- po
+    |     |-- fr
+    |     ‘-- pot
+    ‘-- _sisu
+        |-- css
+        |-- image
+        |-- image_sys -> ../../_sisu/image_sys
+        ‘-- xml
+            |-- rnc
+            |-- rng
+            ‘-- xsd
+
+


+#by: language subject_dir/en/manifest/filename.html +

+

by Filetype

+ +


+The by filetype directory structure separates output files by filetype, +all html files in one directory, pdfs in another and so on. Filenames are +given a language extension. +


+ +

Its selection is configured in sisurc.yml +


+output_dir_structure_by: filetype +


+

    |-- epub
+    |-- hashes
+    |-- html
+    |-- viral_spiral.david_bollier
+    |-- manifest
+    |-- qrcode
+    |-- odt
+    |-- pdf
+    |-- po4a
+    |-- live-manual
+    |     |-- po
+    |     |-- fr
+    |     ‘-- pot
+    |-- _sisu
+    | |-- css
+    | |-- image
+    | |-- image_sys -> ../../_sisu/image_sys
+    | ‘-- xml
+    |     |-- rnc
+    |     |-- rng
+    |     ‘-- xsd
+    |-- sitemaps
+    |-- txt
+    |-- xhtml
+    ‘-- xml
+
+


+#by: filetype subject_dir/html/filename/manifest.en.html +

+

by Filename

+ +


+The by filename directory structure places most output of a particular +file (the different filetypes) in a common directory. +


+ +

Its selection is configured in sisurc.yml +


+output_dir_structure_by: filename +


+

    |-- epub
+    |-- po4a
+    |-- live-manual
+    |     |-- po
+    |     |-- fr
+    |     ‘-- pot
+    |-- _sisu
+    | |-- css
+    | |-- image
+    | |-- image_sys -> ../../_sisu/image_sys
+    | ‘-- xml
+    |     |-- rnc
+    |     |-- rng
+    |     ‘-- xsd
+    |-- sitemaps
+    |-- src
+    |-- pod
+    ‘-- viral_spiral.david_bollier
+
+


+#by: filename subject_dir/filename/manifest.en.html +
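The three "#by:" sample lines in the by language, by filetype and by filename sections can be compared side by side with a small Python sketch. It reproduces only those three sample patterns. The function name `manifest_path` and its parameters are assumptions made for illustration.

```python
def manifest_path(structure, subject_dir, filename, lang="en"):
    """Build the manifest location following the three '#by:' sample lines."""
    if structure == "language":
        return f"{subject_dir}/{lang}/manifest/{filename}.html"
    if structure == "filetype":
        return f"{subject_dir}/html/{filename}/manifest.{lang}.html"
    if structure == "filename":
        return f"{subject_dir}/{filename}/manifest.{lang}.html"
    raise ValueError(f"unknown output_dir_structure_by value: {structure}")


for choice in ("language", "filetype", "filename"):
    print(choice, manifest_path(choice, "subject_dir", "viral_spiral.david_bollier"))
```

The sketch covers the manifest file only; where the other output types land under each structure is shown by the directory trees in each section.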

+

Remote Directories

+ +


+

% containing sub_directories named after the generated files from which
+they are made
+ ./subject_name/src
+% contains shared source files text and binary e.g. sisu_manual.sst and sisu_manual.sst.zip
+ ./subject_name/_sisu
+% configuration file e.g. sisurc.yml
+ ./subject_name/_sisu/skin
+% skins in various skin directories doc, dir, site, yml
+ ./subject_name/_sisu/css
+ ./subject_name/_sisu/image
+% images for documents contained in this directory
+ ./subject_name/_sisu/mm
+
+

+

Sisupod

+ +


+

% files stored at this level e.g. sisu_manual.sst
+ ./sisupod/_sisu
+% configuration file e.g. sisurc.yml
+ ./sisupod/_sisu/skin
+% skins in various skin directories doc, dir, site, yml
+ ./sisupod/_sisu/css
+ ./sisupod/_sisu/image
+% images for documents contained in this directory
+ ./sisupod/_sisu/mm
+
+
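As a rough illustration of bundling a document with its images, the Python sketch below writes a zip archive using the directory names from the sisupod listing. It assumes documents sit directly under sisupod/ and images under sisupod/_sisu/image. The function name `make_pod` and the file `logo.png` are hypothetical, and the sketch is not the format produced by `sisu -S`.

```python
import io
import zipfile


def make_pod(documents, images):
    """Zip documents and images into the assumed sisupod directory layout.

    documents and images map file names to bytes; the zip is returned as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as pod:
        for name, data in documents.items():
            pod.writestr(f"sisupod/{name}", data)
        for name, data in images.items():
            pod.writestr(f"sisupod/_sisu/image/{name}", data)
    return buffer.getvalue()


pod_bytes = make_pod({"sisu_manual.sst": b"document text"}, {"logo.png": b"\x89PNG"})
with zipfile.ZipFile(io.BytesIO(pod_bytes)) as pod:
    print(pod.namelist())
```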

+

Organising Content

+ +

+

Homepages

+
+ +


+SiSU is about the ability to auto-generate documents. Home pages are regarded +as custom built items, and are not created by SiSU. More accurately, SiSU +has a default home page, which will not be appropriate for use with other +sites, and the means to provide your own home page instead in one of two +ways as part of a site’s configuration, these being: +


+1. through placing your home page and other custom built documents in the +subdirectory _sisu/home/ (this probably being the easier and more convenient +option) +


+2. through providing what you want as the home page in a skin. +


+Document sets are contained in directories, usually organised by site or +subject. Each directory can/should have its own homepage. See the section +on directory structure and organisation of content. +

+

Home Page and Other +Custom Built Pages in a Sub-directory

+ +


+Custom built pages, including the home page index.html may be placed within +the configuration directory _sisu/home/ in any of the locations that are +searched for the configuration directory, namely ./_sisu ; ~/.sisu ; /etc/sisu. +From there they are copied to the root of the output directory with the +command: +


+ sisu -CC
+ +

+

Markup and Output Examples

+
+ +

+

Markup Examples

+ +


+Current markup examples and document output samples are provided off <http://sisudoc.org +> +or <http://www.jus.uio.no/sisu +> and in the sisu-markup-samples package available +off <http://sources.sisudoc.org +> +


+For some documents hardly any markup is required at all, other than +a header and an indication of the levels to be taken into account by +the program in generating its output. +

+

Sisu Markup Samples

+ +


+A few additional sample books prepared as sisu markup samples, output formats +to be generated using SiSU, are contained in a separate package, sisu-markup-samples. +sisu-markup-samples contains books (prepared using sisu markup) that were +released by their authors under various licenses, mostly different Creative Commons +licences, that do not permit inclusion in the Debian Project as they have +requirements that do not meet the Debian Free Software Guidelines for various +reasons, most commonly that they require that the original substantive +text remain unchanged, and sometimes that the works be used only non-commercially. + +


+Accelerando, Charles Stross (2005) accelerando.charles_stross.sst +


+Alice’s Adventures in Wonderland, Lewis Carroll (1865) alices_adventures_in_wonderland.lewis_carroll.sst + +


+CONTENT, Cory Doctorow (2008) content.cory_doctorow.sst +


+Democratizing Innovation, Eric von Hippel (2005) democratizing_innovation.eric_von_hippel.sst + +


+Down and Out in the Magic Kingdom, Cory Doctorow (2003) down_and_out_in_the_magic_kingdom.cory_doctorow.sst + +


+For the Win, Cory Doctorow (2010) for_the_win.cory_doctorow.sst +


+Free as in Freedom - Richard Stallman’s Crusade for Free Software, Sam Williams +(2002) free_as_in_freedom.richard_stallman_crusade_for_free_software.sam_williams.sst + +


+Free as in Freedom 2.0 - Richard Stallman and the Free Software Revolution, +Sam Williams (2002), Richard M. Stallman (2010) free_as_in_freedom_2.richard_stallman_and_the_free_software_revolution.sam_williams.richard_stallman.sst + +


+Free Culture - How Big Media Uses Technology and the Law to Lock Down Culture +and Control Creativity, Lawrence Lessig (2004) free_culture.lawrence_lessig.sst + +


+Free For All - How Linux and the Free Software Movement Undercut the High +Tech Titans, Peter Wayner (2002) free_for_all.peter_wayner.sst +


+GNU GENERAL PUBLIC LICENSE v2, Free Software Foundation (1991) gpl2.fsf.sst + +


+GNU GENERAL PUBLIC LICENSE v3, Free Software Foundation (2007) gpl3.fsf.sst + +


+Gulliver’s Travels, Jonathan Swift (1726 / 1735) gullivers_travels.jonathan_swift.sst + +


+Little Brother, Cory Doctorow (2008) little_brother.cory_doctorow.sst +


+The Cathedral and the Bazaar, Eric Raymond (2000) the_cathedral_and_the_bazaar.eric_s_raymond.sst + +


+The Public Domain - Enclosing the Commons of the Mind, James Boyle (2008) + +

the_public_domain.james_boyle.sst +


+The Wealth of Networks - How Social Production Transforms Markets and Freedom, +Yochai Benkler (2006) the_wealth_of_networks.yochai_benkler.sst +


+Through the Looking Glass, Lewis Carroll (1871) through_the_looking_glass.lewis_carroll.sst + +


+Two Bits - The Cultural Significance of Free Software, Christopher Kelty +(2008) two_bits.christopher_kelty.sst +


+UN Contracts for International Sale of Goods, UN (1980) un_contracts_international_sale_of_goods_convention_1980.sst + +


+Viral Spiral, David Bollier (2008) viral_spiral.david_bollier.sst +

+

Sisu Search +- Introduction

+
+ +


+SiSU output can easily and conveniently be indexed by a number of standalone +indexing tools, such as Lucene or Hyperestraier. +


+Because the document structure of sites created is clearly defined, and +the text object citation system is available hypothetically at least, for +all forms of output, it is possible to search the sql database, and either +read results from that database, or just as simply map the results to the +html output, which has richer text markup. +


+In addition to this SiSU has the ability to populate a relational sql type +database with documents at an object level, with object numbers that are +shared across different output types, which makes them searchable with that +degree of granularity. Basically, your match criteria are met by these documents +and at these locations within each document, which can be viewed within +the database directly or in various output formats. +

+

Sql

+
+ +

+

Populating Sql Type Databases

+ +


+SiSU feeds sisu marked up documents into sql type databases (PostgreSQL [^21] +and/or SQLite [^22]) together with information related to document +structure. +


+This is one of the more interesting output forms, as all the structural +data of the documents are retained (though can be ignored by the user of +the database should they so choose). All site texts/documents are (currently) +streamed to four tables: +


 * one containing semantic (and other) headers, including title, author,
+ subject (the Dublin Core...);
+ +


 * another the substantive texts by individual "paragraph" (or object),
+ along with structural information, each paragraph being identifiable by its
+ paragraph number (if it has one, which almost all of them do), and the
+ substantive text of each paragraph quite naturally being searchable (both in
+ formatted and clean text versions for searching);
+ +


 * a third containing endnotes cross-referenced back to the paragraph from
+ which they are referenced (both in formatted and clean text versions for
+ searching); and
+ +


 * a fourth table with a one to one relation with the headers table, containing
+ full text versions of output, e.g. pdf, html, xml, and ascii.
+ +


+There is of course the possibility to add further structures. +


+At this level SiSU loads a relational database with documents chunked into +objects, their smallest logical structurally constituent parts, as text +objects, with their object citation number and all other structural information +needed to construct the document. Text is stored (at this text object level) +with and without elementary markup tagging, the stripped version being +kept to make searching easier. +


+Being able to search a relational database at an object level with the +SiSU citation system is an effective way of locating content generated +by SiSU. As individual text objects of a document are stored (and indexed) together +with object numbers, and all versions of the document have the same numbering, +complex searches can be tailored to return just the locations of the search +results relevant for all available output formats, with live links to the +precise locations in the database or in html/xml documents; or, the structural +information provided makes it possible to search the full contents of the +database and have headings in which search content appears, or to search +only headings etc. (as the Dublin Core is incorporated it is easy to make +use of that as well). +
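Object-level storage and search can be illustrated with Python's built-in sqlite3 module. The two-table layout below is a deliberately simplified, hypothetical schema (table and column names are invented for this sketch, and the `<b>`/`<i>` tags stand in for markup). It is not SiSU's actual database structure.

```python
import sqlite3

# Hypothetical layout loosely following the description in this section;
# table and column names are illustrative, not SiSU's actual schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (doc_id INTEGER PRIMARY KEY, title TEXT, author TEXT);
CREATE TABLE objects (
    doc_id INTEGER REFERENCES metadata(doc_id),
    ocn    INTEGER,   -- object number, shared across output formats
    body   TEXT,      -- text with elementary markup
    clean  TEXT       -- markup stripped, used for searching
);
""")
conn.execute("INSERT INTO metadata VALUES (1, 'Sample Document', 'A. Author')")
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?, ?)",
    [
        (1, 1, "<b>Free</b> software matters", "Free software matters"),
        (1, 2, "An unrelated paragraph", "An unrelated paragraph"),
        (1, 3, "More on <i>free</i> culture", "More on free culture"),
    ],
)


def search(term):
    """Return (title, object number) pairs whose clean text contains the term."""
    rows = conn.execute(
        "SELECT m.title, o.ocn FROM objects o JOIN metadata m USING (doc_id) "
        "WHERE o.clean LIKE ? ORDER BY o.ocn",
        (f"%{term}%",),
    )
    return rows.fetchall()


print(search("free"))
```

The result is a list of document titles with object numbers. Because the numbering is shared across outputs, those numbers can be used to locate the same text in any generated format.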

+

Postgresql

+
+ +

+

Name

+ +


+SiSU - Structured information, Serialized Units - a document publishing system, + +

postgresql dependency package +

+

Description

+ +


+Information related to using postgresql with sisu (and related to the sisu_postgresql +dependency package, which is a dummy package to install dependencies needed +for SiSU to populate a postgresql database, this being part of SiSU - man +sisu) . +

+

Synopsis

+ +


+ sisu -D [instruction] [filename/wildcard  if  required]
+ +


+ sisu -D --pg --[instruction] [filename/wildcard  if  required]
+ +

+

Commands

+ +


+Mappings to two databases are provided by default, postgresql and sqlite. +The same commands are used within sisu to construct and populate databases; +however, -d (lowercase) denotes sqlite and -D (uppercase) denotes postgresql. +Alternatively --sqlite or --pgsql may be used. +


+-D or --pgsql may be used interchangeably. +

+

Create and Destroy Database

+ +

+

+ +
--pgsql +--createall
+
initial step, creates required relations (tables, indexes) in +existing (postgresql) database (a database should be created manually and +given the same name as working directory, as requested) (rb.dbi) +

+ +
sisu -D +--createdb
+
creates database where no database existed before +

+ +
sisu -D --create +
+
+

creates database tables where no database tables existed before +

+ +
sisu -D +--Dropall
+
destroys database (including all its content)! kills data and drops +tables, indexes and database associated with a given directory (and directories +of the same name). +

+ +
sisu -D --recreate
+
destroys existing database and builds + +

a new empty database structure +

+
+ +

Import and Remove Documents

+ +

+

+ +
sisu -D --import +-v [filename/wildcard]
+
populates database with the contents of the file. +Imports document(s) specified to a postgresql database (at an object level). + +

+ +
sisu -D --update -v [filename/wildcard]
+
updates file contents in database + +

+ +
sisu -D --remove -v [filename/wildcard]
+
removes specified document from postgresql +database. +

+
+ +

Sqlite

+
+ +

+

Name

+ +


+SiSU - Structured information, Serialized Units - a document publishing system. + +

+

Description

+ +


+Information related to using sqlite with sisu (and related to the sisu_sqlite +dependency package, which is a dummy package to install dependencies needed +for SiSU to populate an sqlite database, this being part of SiSU - man sisu) +. +

+

Synopsis

+ +


+ sisu -d [instruction] [filename/wildcard  if  required]
+ +


+ sisu -d --(sqlite|pg) --[instruction] [filename/wildcard  if
+ required]
+ +

+

Commands

+ +


+Mappings to two databases are provided by default, postgresql and sqlite. +The same commands are used within sisu to construct and populate databases; +however, -d (lowercase) denotes sqlite and -D (uppercase) denotes postgresql. +Alternatively --sqlite or --pgsql may be used. +


+-d or --sqlite may be used interchangeably. +

+

Create and Destroy Database

+ +

+

+ +
--sqlite +--createall
+
initial step, creates required relations (tables, indexes) in +existing (sqlite) database (a database should be created manually and given +the same name as working directory, as requested) (rb.dbi) +

+ +
sisu -d --createdb +
+
+

creates database where no database existed before +

+ +
sisu -d --create
+
creates + +

database tables where no database tables existed before +

+ +
sisu -d --dropall +
+
destroys database (including all its content)! kills data and drops tables, +indexes and database associated with a given directory (and directories +of the same name). +

+ +
sisu -d --recreate
+
destroys existing database and builds + +

a new empty database structure +

+
+ +

Import and Remove Documents

+ +

+

+ +
sisu -d --import +-v [filename/wildcard]
+
populates database with the contents of the file. +Imports document(s) specified to an sqlite database (at an object level). + +

+ +
sisu -d --update -v [filename/wildcard]
+
updates file contents in database + +

+ +
sisu -d --remove -v [filename/wildcard]
+
removes specified document from sqlite +database. +

+
+ +

Introduction

+
+ +

+

Search - Database Frontend Sample, Utilising Database and Sisu Features,

+INCLUDING +OBJECT CITATION NUMBERING (BACKEND CURRENTLY POSTGRESQL) +


+Sample search frontend <http://search.sisudoc.org +> [^23] A small database and +sample query front-end (search form) that makes use of the citation system, +object citation numbering, to demonstrate functionality.[^24] +


+SiSU can provide information on which documents are matched and at what +locations within each document the matches are found. These results are +relevant across all outputs using object citation numbering, which includes +html, XML, EPUB, LaTeX, PDF and indeed the SQL database. You can then refer +to one of the other outputs or in the SQL database expand the text within +the matched objects (paragraphs) in the documents matched. +


+Note you may set results either for documents matched and object number +locations within each matched document meeting the search criteria; or +display the names of the documents matched along with the objects (paragraphs) +that meet the search criteria.[^25] +

+

+ +
sisu -F --webserv-webrick
+
builds a cgi web + +

search frontend for the database created +


+The following is feedback on the setup on a machine provided by the help +command: +


+ sisu --help sql
+ +


+

Postgresql
+  user:             ralph
+  current db set:   SiSU_sisu
+  port:             5432
+  dbi connect:      DBI:Pg:database=SiSU_sisu;port=5432
+sqlite
+  current db set:   /home/ralph/sisu_www/sisu/sisu_sqlite.db
+  dbi connect       DBI:SQLite:/home/ralph/sisu_www/sisu/sisu_sqlite.db
+
+


+ +

Note on databases built +


+By default (unless otherwise specified), databases are built on a directory +basis, from collections of documents within that directory. The name of +the directory you choose to work from is used as the database name, i.e. +if you are working in a directory called /home/ralph/ebook the database +SiSU_ebook is used. [otherwise a manual mapping for the collection is +
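The default naming rule can be written out as a one-line Python function. This is a sketch of the rule as described in the note (directory name prefixed with "SiSU_"); the function name `database_name` is an assumption, and manual mappings are not covered.

```python
import os


def database_name(working_dir):
    """Prefix the last path component with 'SiSU_', as in the example in the note."""
    return "SiSU_" + os.path.basename(os.path.normpath(working_dir))


print(database_name("/home/ralph/ebook"))
```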

+
+ +

Search +Form

+ +

+

+ +
sisu -F
+
generates a sample search form, which must be copied to the + +

web-server cgi directory +

+ +
sisu -F --webserv-webrick
+
generates a sample search +form for use with the webrick server, which must be copied to the web-server + +

cgi directory +

+ +
sisu -W
+
starts the webrick server which should be available + +

wherever sisu is properly installed +


+ +

The generated search form must be copied manually to the webserver directory + +

as instructed +

+
+ +

Sisu_webrick

+
+ +

+

Name

+ +


+SiSU - Structured information, Serialized Units - a document publishing system + +

+

Synopsis

+ +


+sisu_webrick [port] +


+ +

or +


+sisu -W [port] +

+

Description

+ +


+sisu_webrick is part of SiSU (man sisu). sisu_webrick starts the Ruby Webrick +web-server and points it to the directories to which SiSU +output is written, providing a list of these directories (assuming SiSU +is in use and they exist). +


+The default port for sisu_webrick is set to 8081; this may be modified +in the yaml file ~/.sisu/sisurc.yml, a sample of which is provided as /etc/sisu/sisurc.yml +(or in the equivalent directory on your system). +

+

Summary of Man Page

+ +


+sisu_webrick may be started on its own with the command: sisu_webrick +[port] or using the sisu command with the -W flag: sisu -W [port] +


+ +

where no port is given and settings are unchanged the default port is 8081 + +

+

Document Processing Command Flags

+ +


+sisu -W [port] starts Ruby Webrick web-server, serving SiSU output directories, +on the port provided, or if no port is provided and the defaults have not + +

been changed in ~/.sisu/sisurc.yml then on port 8081 +

+

Summary of Features

+
+ +


+* sparse/minimal markup (clean utf-8 source texts). Documents are prepared +in a single UTF-8 file using a minimalistic mnemonic syntax. Typical literature, +documents like "War and Peace" require almost no markup, and most of the +headers are optional. +


+* markup is easily readable/parsable by the human eye (basic markup is +simpler and more sparse than the most basic HTML); this may also be +converted to XML representations of the same input/source document. +

+
+* markup defines document structure (this may be done once in a header +pattern-match description, or for heading levels individually); basic text +attributes (bold, italics, underscore, strike-through etc.) as required; +and semantic information related to the document (header information, extended +beyond the Dublin core and easily further extended as required); the headers +may also contain processing instructions. SiSU markup is primarily an abstraction +of document structure and document metadata to permit taking advantage +of the basic strengths of existing alternative practical standard ways +of representing documents, be that browser viewing, paper publication, +sql search etc. (html, epub, xml, odf, latex, pdf, sql) +


+* for output, produces reasonably elegant output of established industry +and institutionally accepted open standard formats,[3] and takes advantage of +the different strengths of various standard formats for representing documents; +amongst the output formats currently supported are: +


+* HTML - both as a single scrollable text and a segmented document +


+* XHTML +


+* EPUB +


+* XML - both in sax and dom style xml structures for further development + +

as required +


+* ODT - Open Document Format text, the iso standard for document storage + +


+* LaTeX - used to generate pdf +


+* PDF (via LaTeX ) +


+* SQL - population of an sql database (PostgreSQL or SQLite), at the +same object level that is used to cite text within a document +


+Also produces: concordance files; document content certificates (md5 or +sha256 digests of headings, paragraphs, images etc.) and html manifests +(and sitemaps of content). It takes advantage of the strengths implicit +in these very different output types (e.g. PDFs produced using the typesetting +of LaTeX, and databases populated with documents at an individual object/paragraph +level, making possible granular search (and related possibilities)). +


+* ensuring content can be cited in a meaningful way regardless of selected +output format. Online publishing (and publishing in multiple document formats) +lacks a useful way of citing text internally within documents (important +to academics generally and to lawyers) as page numbers are meaningless +across browsers and formats. sisu seeks to provide a common way of pinpointing +the text within a document (which can be utilized for citation and by +search engines). The outputs share a common numbering system that is meaningful +(to man and machine) across all digital outputs whether paper, screen, +or database oriented (pdf, HTML, EPUB, xml, sqlite, postgresql); this +numbering system can be used to reference content. +


+* Granular search within documents. SQL databases are populated at an object +level (roughly headings, paragraphs, verse, tables) and become searchable +with that degree of granularity, the output information provides the object/paragraph +numbers which are relevant across all generated outputs; it is also possible +to look at just the matching paragraphs of the documents in the database; +[output  indexing  also  work +


+* long term maintainability of document collections in a world of changing +formats, having a very sparsely marked-up source document base. There is +a considerable degree of future-proofing: output representations are "upgradeable", +and new document formats may be added, e.g. the addition of the odf (open document +text) module in 2006, epub in 2009, and html5 output sometime +in future, without modification of existing prepared texts +


+* SQL search aside, documents are generated as required and static once +generated. +


+* documents produced are static files, and may be batch processed, this +needs to be done only once but may be repeated for various reasons as desired +(updated content, addition of new output formats, updated technology document +presentations/representations) +


+* document source ( plaintext utf-8) if shared on the net may be used as + +

input and processed locally to produce the different document outputs +

+
+* document source may be bundled together (automatically) with associated +documents (multiple language versions or master document with inclusions) +and images and sent as a zip file called a sisupod; if shared on the net, +these too may be processed locally to produce the desired document outputs +


+* generated document outputs may automatically be posted to remote sites. + +


+* for basic document generation, the only software dependencies are Ruby +and a few standard Unix tools (this covers plaintext, HTML, EPUB, XML, +ODF, LaTeX). To use a database you of course need the database software, and to convert +the generated LaTeX to pdf, a LaTeX processor like tetex or texlive. +


+* as a developer's tool it is flexible and extensible +
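+The object-level search described in the list above can be illustrated with a minimal sketch. It is not SiSU's own code or database schema: the table name `doc_objects`, its columns, the helper names and the sample text are all assumptions made for illustration. Each object (heading or paragraph) gets a number from its position in the document, one row is stored per object, and a search returns the numbers of the matching objects rather than just the name of a matching document.

```python
import sqlite3

# Hypothetical sample document: (kind, text) pairs in reading order.
SAMPLE_OBJECTS = [
    ("heading", "Introduction"),
    ("paragraph", "Documents are prepared once and generated as needed."),
    ("paragraph", "Outputs share a common numbering system for citation."),
    ("heading", "Search"),
    ("paragraph", "Databases are populated at an object level."),
]


def build_db(objects):
    """Store one row per object; obj_num is the object's 1-based position."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE doc_objects "
        "(obj_num INTEGER PRIMARY KEY, kind TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO doc_objects (obj_num, kind, body) VALUES (?, ?, ?)",
        [(n, kind, body) for n, (kind, body) in enumerate(objects, start=1)],
    )
    conn.commit()
    return conn


def search(conn, term):
    """Return (obj_num, body) for every object whose text contains term."""
    rows = conn.execute(
        "SELECT obj_num, body FROM doc_objects "
        "WHERE body LIKE ? ORDER BY obj_num",
        (f"%{term}%",),
    )
    return rows.fetchall()


conn = build_db(SAMPLE_OBJECTS)
hits = search(conn, "numbering")
```

+Because the number is assigned once, from the source, the same number can label the matching paragraph in any generated output. A real deployment would use the SQLite or PostgreSQL databases that SiSU populates rather than this toy table.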


+Syntax highlighting for SiSU markup is available for a number of text editors. + +


+SiSU is less about document layout than about finding a way, with little +markup, to construct an abstract representation of a document +from which multiple representations can be produced; these +may be rather different from each other and used for different purposes, +whether layout and publishing, or search of content. +


+i.e. to be able, from this minimal-preparation starting +point, to take advantage of some of the strengths of rather different established ways of +representing documents for different purposes, whether for search (relational +database, or indexed flat files generated for that purpose whether of complete +documents, or say of files made up of objects), online viewing (e.g. html, +xml, pdf), or paper publication (e.g. pdf) ... +
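+As a rough illustration of one abstract representation feeding several outputs, the sketch below numbers a list of objects once and renders it both as plain text and as HTML, carrying the same object numbers into each. It is not how SiSU is implemented: the function names, the `o<number>` anchor format and the sample content are assumptions chosen for the example.

```python
from html import escape

# Hypothetical abstract document: (kind, text) pairs in reading order.
DOC = [
    ("heading", "What is SiSU?"),
    ("paragraph", "Documents are prepared once & generated as needed."),
]


def number_objects(objects):
    """Attach a 1-based object number to every (kind, text) pair."""
    return [(n, kind, text) for n, (kind, text) in enumerate(objects, start=1)]


def to_text(numbered):
    """Plain-text rendering: headings in capitals, number in brackets."""
    lines = []
    for n, kind, text in numbered:
        body = text.upper() if kind == "heading" else text
        lines.append(f"{body} [{n}]")
    return "\n\n".join(lines)


def to_html(numbered):
    """HTML rendering: the same number becomes the element's id."""
    parts = []
    for n, kind, text in numbered:
        tag = "h1" if kind == "heading" else "p"
        parts.append(f'<{tag} id="o{n}">{escape(text)}</{tag}>')
    return "\n".join(parts)


numbered = number_objects(DOC)
text_output = to_text(numbered)
html_output = to_html(numbered)
```

+Each renderer is written separately for its target, yet a reference to object 2 resolves to the same paragraph in both outputs.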


+The solution arrived at is to extract structural information about the +document (about headings within the document) and to track objects (which +are serialized and also given hash values) in the manner described. It makes +possible representations that are quite different from those offered at +present. For example, objects could be saved individually and identified +by their hashes, with an index of how the objects relate to each other +to form a document. +
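+The idea of saving objects individually, identified by their hashes, with an index that reassembles them into a document, can be sketched as follows. This is an illustration under stated assumptions, not SiSU's storage format: the use of SHA-256, the in-memory dictionary store and the function names are all hypothetical choices.

```python
import hashlib


def object_digest(text):
    """Hash value identifying an object by its content (SHA-256 assumed)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def store_document(objects):
    """Save each object under its digest and record the document order.

    Returns (store, index): store maps digest -> text, and index is an
    ordered list of (object_number, digest) pairs describing how the
    objects relate to each other to form the document.
    """
    store = {}
    index = []
    for number, text in enumerate(objects, start=1):
        digest = object_digest(text)
        store[digest] = text  # identical objects are stored only once
        index.append((number, digest))
    return store, index


def rebuild(store, index):
    """Reassemble the document text from the index and the object store."""
    return [store[digest] for _, digest in index]


# Hypothetical sample: the first and last objects are identical.
SAMPLE = ["A heading", "First paragraph.", "A heading"]
store, index = store_document(SAMPLE)
restored = rebuild(store, index)
```

+In this sketch a changed paragraph yields a new digest while untouched objects keep theirs, so the index alone shows which parts of a document differ between versions.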

+

    +.
  1. objects include: headings, paragraphs, verse, tables, +images, but not footnotes/endnotes which are numbered separately and tied +to the object from which they are referenced. +


    +

  2. .
  3. i.e. the HTML, PDF, EPUB and ODT +outputs are each built individually and optimised for that form of presentation, +rather than for example the html being a saved version of the odf, or the +pdf being a saved version of the html. +


    +

  4. .
  5. +

    the different heading levels +


    +

  6. .
  7. units of text, primarily paragraphs and headings, also any tables, poems, +code-blocks +


    +

  8. .
  9. +

    An open standard format for e-books +


    +

  10. .
  11. Open Document Format (ODF) text +


    +

  12. .
  13. +

    Specification submitted by Adobe to ISO to become a full open ISO specification + +


    +<http://www.linux-watch.com/news/NS7542722606.html +> +


    +

  14. .
  15. +

    ISO standard ISO/IEC 26300:2006 +


    + + +

    *1.
    +
    square brackets +


    +

    + +
    *2.
    +
    square brackets +


    +

    + +
    +1.
    +
    square brackets +


    +

  16. .
  17. <http://www.jus.uio.no/sisu/man/ +> +


    +

  18. .
  19. <http://www.jus.uio.no/sisu/man/sisu.1.html +> +


    +

  20. .
  21. From sometime after SiSU 0.58 it should be possible to describe SiSU markup +using SiSU, which though not an original design goal is useful. +


    +

  22. .
  23. +

    files should be prepared using UTF-8 character encoding +


    +

  24. .
  25. +

    a footnote or endnote +


    +

  26. .
  27. self contained endnote marker & endnote in one +


    + + +

    *.
    +
    unnumbered asterisk footnote/endnote, insert multiple asterisks if required + +


    +

    + +
    **.
    +
    another unnumbered asterisk footnote/endnote +


    +

    + +
    *3.
    +
    editors notes, numbered asterisk footnote/endnote series +


    +

    + +
    +2.
    +
    editors notes, numbered asterisk footnote/endnote series +


    +

  28. .
  29. <http://www.sisudoc.org/ +> +


    +

  30. .
  31. <http://www.ruby-lang.org/en/ +> +


    +

  32. .
  33. +

    Table from the Wealth of Networks by Yochai Benkler +


    +<http://www.jus.uio.no/sisu/the_wealth_of_networks.yochai_benkler +> +


    +

  34. .
  35. .ssc (for composite) is under consideration but ._sst makes clear that this +is not a regular file to be worked on, and thus less likely that people +will have "accidents", working on a .ssc file that is overwritten by subsequent +processing. It may be however that when the resulting file is shared .ssc +is an appropriate suffix to use. +


    +

  36. .
  37. <http://www.postgresql.org/ +> +


    +<http://advocacy.postgresql.org/ +> +


    +<http://en.wikipedia.org/wiki/Postgresql +> +


    +

  38. .
  39. <http://www.hwaci.com/sw/sqlite/ +> +


    +<http://en.wikipedia.org/wiki/Sqlite +> +


    +

  40. .
  41. <http://search.sisudoc.org +> +


    +

  42. .
  43. (which could be extended further with current back-end). As regards scaling +of the database, it is as scalable as the database (here Postgresql) and +hardware allow. +


    +

  44. .
  45. of this feature, when demonstrated to an IBM software innovations evaluator +in 2004, he said, to paraphrase: this could be of interest to us. We have +large document management systems; you can search hundreds of thousands +of documents and we can tell you which documents meet your search criteria, +but there is no way we can tell you, without opening each document, where +within each your matches are found. +


    + +

  46. +
+ +

See Also

+ sisu(1) +,
+ sisu-epub(1) +,
+ sisu-harvest(1) +,
+ sisu-html(1) +,
+ sisu-odf(1) +,
+ sisu-pdf(1) +,
+ sisu-pg(1) +,
+ sisu-sqlite(1) +,
+ sisu-txt(1) +,
+ sisu_vim(7) +
+ +

Homepage

+ More information about SiSU can be found at <http://www.sisudoc.org/ +> +or <http://www.jus.uio.no/sisu/ +>
+ +

Source

+ <http://sources.sisudoc.org/ +>
+ +

Author

+ SiSU is written by Ralph Amissah <ralph@amissah.com>
+

+ +


+Table of Contents

+

+ + -- cgit v1.2.3