
SiSU

Commands [0.58]

Ralph Amissah


 


 

1. Introduction - What is SiSU?

  4

SiSU is a system for document markup, publishing (in multiple open standard formats) and search

  5

SiSU [1] is a [2] framework for document structuring, publishing and search. It comprises (a) a lightweight markup syntax for document structure and presentation and (b) an accompanying engine that generates standard document format outputs from documents prepared in sisu markup syntax. The multiple outputs it produces can share a common numbering system for citing text within a document.

  6

SiSU is developed under an open source, software libre license (GPL3). It was developed to cope with large document sets amid evolving markup-related technologies, where you want multiple output formats, a common mechanism for citation across output formats, and search.

  7

SiSU both defines a markup syntax and provides an engine that produces open standards format outputs from documents prepared with SiSU markup. From a single lightly prepared document, sisu custom builds several standard output formats, which share a common (text object) numbering system for citing content within a document (this also has implications for search). The sisu engine works with an abstraction of the document's structure and content, from which it can generate different representations of the document. Significantly, SiSU markup is sparser than html, and the outputs, which include html, LaTeX, landscape and portrait pdfs and Open Document Format (ODF), can all be added to and updated. SiSU is also able to populate SQL type databases at an object level, which means that searches can be made with that degree of granularity. Matching objects (primarily paragraphs and headings) can be viewed directly in the database, or just the object numbers can be shown: your search criteria are met in these documents and at these locations within each document.

  8
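As a rough illustration of object-level search, the sketch below stores one row per numbered object in an in-memory SQLite database and returns document and object-number pairs for a search term. The table layout, column names and sample rows are assumptions made for this example; they are not SiSU's actual database schema or search interface.

```python
import sqlite3

# Hypothetical schema: one row per numbered text object.
# This is an illustration only, not the schema SiSU itself uses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE objects (doc TEXT, ocn INTEGER, kind TEXT, body TEXT)")

# Invented sample data: (document name, object number, object kind, text).
rows = [
    ("doc_a", 1, "heading", "Introduction"),
    ("doc_a", 2, "para", "Open formats help documents outlive their software."),
    ("doc_a", 3, "para", "Object numbers give a stable way to cite text."),
    ("doc_b", 1, "heading", "Citation"),
    ("doc_b", 2, "para", "Page numbers change between formats; object numbers do not."),
]
conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?)", rows)


def search(term):
    """Return (doc, ocn) pairs for every object whose text contains term."""
    cur = conn.execute(
        "SELECT doc, ocn FROM objects WHERE body LIKE ? ORDER BY doc, ocn",
        (f"%{term}%",),
    )
    return cur.fetchall()


# Each hit names a document and the location (object number) within it.
print(search("object numbers"))
```

Because each row is a single object, a match identifies a document and a location within it rather than only the document as a whole.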

Source document preparation and output generation are a two step process: (i) the document source is prepared, that is, marked up in sisu markup syntax, and (ii) the desired output is subsequently generated by running the sisu engine against the document source. If an output representation is updated (in the sisu engine), it can be regenerated by re-running the engine against the prepared source. Using SiSU markup applied to a document, SiSU custom builds various standard open output formats including plain text, HTML, XHTML, XML, OpenDocument, LaTeX and PDF files, and populates an SQL database with objects [3] (equating generally to paragraph-sized chunks) so searches may be performed and matches returned with that degree of granularity (e.g. your search criteria are met by these documents and at these locations within each document). Document output formats share a common object numbering system for locating content. This is particularly suitable for "published" works (finalized texts as opposed to works that are frequently changed or updated), for which it provides a fixed means of referring to content.

  9
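The idea of numbering paragraph-sized objects can be sketched in a few lines. The example below splits a source text on blank lines and numbers the resulting blocks in order. The `# ` heading marker is a placeholder invented for this sketch, not SiSU markup, and footnotes (which SiSU numbers separately) are ignored; the real engine does considerably more.

```python
def number_objects(source):
    """Split source on blank lines and number each block sequentially from 1."""
    blocks = [b.strip() for b in source.split("\n\n") if b.strip()]
    objects = []
    for ocn, text in enumerate(blocks, start=1):
        # "# " is a placeholder heading marker for this sketch, not SiSU syntax.
        if text.startswith("# "):
            objects.append({"ocn": ocn, "kind": "heading", "text": text[2:]})
        else:
            objects.append({"ocn": ocn, "kind": "para", "text": text})
    return objects


source = "# Introduction\n\nFirst paragraph.\n\nSecond paragraph.\n"
for obj in number_objects(source):
    print(obj["ocn"], obj["kind"], obj["text"])
```

Numbers assigned this way depend only on the order of objects in the source, which is why they suit finalized texts: inserting a paragraph later would renumber everything after it.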

In preparing a SiSU document you optionally provide semantic information related to the document in a document header, and in marking up the substantive text you provide information on the structure of the document, primarily indicating heading levels and footnotes. You also provide information on basic text attributes where used. The rest is automatic: from this information sisu custom builds [4] the different forms of output requested.

  10

SiSU works with an abstraction of the document based on its structure, which comprises its frame [5] and the objects [6] it contains. This enables SiSU to represent the document in many different ways and to take advantage of the strengths of different ways of presenting documents. The objects are numbered, and these numbers provide a common base for citing material within a document across the different output format types. This is significant because page numbers are not suited to the digital age. In web publishing, changing a browser's default font or using a different browser means that text appears on different pages; and when publishing in different formats (html, landscape and portrait pdf, etc.) page numbers are again of no use for citing text in a manner that holds across the different output types. Dealing with documents at an object level, together with object numbering, also has implications for search.

  11
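To make the point about a common citation base concrete, the sketch below renders one list of numbered objects into two output forms, each carrying the same object numbers. The `id="o2"` anchor style and the trailing `[2]` marker are conventions invented for this example and are not taken from SiSU's actual output.

```python
import html

# One abstract representation: (object number, kind, text). Sample data only.
objects = [
    (1, "heading", "Introduction"),
    (2, "para", "Page numbers differ between formats & devices."),
    (3, "para", "Object numbers stay the same."),
]


def to_html(objs):
    """Render objects as HTML, exposing each object number as an element id."""
    lines = []
    for ocn, kind, text in objs:
        tag = "h1" if kind == "heading" else "p"
        lines.append(f'<{tag} id="o{ocn}">{html.escape(text)}</{tag}>')
    return "\n".join(lines)


def to_text(objs):
    """Render objects as plain text, appending each object number in brackets."""
    return "\n\n".join(f"{text} [{ocn}]" for ocn, _kind, text in objs)


print(to_html(objects))
print(to_text(objects))
```

A citation to object 2 then resolves to the same paragraph in both outputs, whatever the page layout of either.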

One of the challenges of maintaining documents is to keep them in a format that allows users to use them without depending on whatever proprietary software is popular at the time. Consider how hard it is to deal with legacy proprietary formats today, and what guarantee you have that old proprietary formats will remain readable (or readable without proprietary software or equipment) in 15 years' time; or consider the way in which html has evolved over its relatively short span of existence. SiSU provides the flexibility of outputting documents in multiple non-proprietary open formats including html, pdf [7] and the ISO standard ODF. [8] Whilst SiSU relies on software, the markup is uncomplicated and minimalistic, which means that future engines can be written to run against it. It is also easily converted to other formats, which means documents prepared in SiSU can be migrated to other document formats. Further security is provided by the fact that the software itself, SiSU, is available under GPL3, a licence that guarantees that the source code will always be open, and free as in libre, which means that the code base can be used, updated and further developed as required under the terms of its license. Another challenge is to keep up with a moving target. SiSU permits new forms of output to be added as they become important (Open Document Format text was added in 2006), and existing outputs to be updated. For example, html has evolved and the related module has been updated repeatedly over the years; presumably, when the World Wide Web Consortium (w3c) finalises html 5, which is currently under development, the html module will again be updated, allowing all existing documents to be regenerated as html 5.

  12

The document formats are written to the file-system and are available for indexing by independent indexing tools, whether web search engines such as Google and Yahoo or on-site tools such as Lucene and Hyperestraier.

  13

SiSU also provides other features such as concordance files and document content certificates. Working against an abstraction of document structure opens further possibilities for the research and development of other document representations. The availability of objects is useful, for example, for topic maps; and the commercial law thesaurus by Vikki Rogers and Al Kritzer, together with the flexibility of SiSU, offers great possibilities.

  14

SiSU is primarily for published works, which can take advantage of the citation system to be referenced reliably. SiSU works well in a complementary manner with collaborative technologies such as wikis, which can take advantage of, and be used to discuss, the substance of content prepared in SiSU.

  15

<http://www.jus.uio.no/sisu>

  16


 1. "SiSU information Structuring Universe" or "Structured information, Serialized Units"; also chosen for the meaning of the Finnish term "sisu".

 

 2. Unix command line oriented

 

 3. objects include: headings, paragraphs, verse, tables, images, but not footnotes/endnotes which are numbered separately and tied to the object from which they are referenced.

 

 4. i.e. the html, pdf, odf outputs are each built individually and optimised for that form of presentation, rather than for example the html being a saved version of the odf, or the pdf being a saved version of the html.

 

 5. the different heading levels

 

 6. units of text, primarily paragraphs and headings, also any tables, poems, code-blocks

 

 7. Specification submitted by Adobe to ISO to become a full open ISO specification; see <http://www.linux-watch.com/news/NS7542722606.html>

 

 8. ISO/IEC 26300:2006

 
 

Output generated by SiSU 0.59.0 2007-09-23 (2007w38/0)
SiSU Copyright © Ralph Amissah 1997, current 2007. All Rights Reserved.
SiSU is software for document structuring, publishing and search,
www.jus.uio.no/sisu and www.sisudoc.org
w3 since October 3 1993 ralph@amissah.com

SiSU using:
Standard SiSU markup syntax,
Standard SiSU meta-markup syntax, and the
Standard SiSU object citation numbering and system, (object/text positioning system)
Copyright © Ralph Amissah 1997, current 2007. All Rights Reserved.

GPLv3

SiSU is released under GPLv3 or later, <http://www.gnu.org/licenses/gpl.html>

SiSU, developed using Ruby on Debian/Gnu/Linux software infrastructure, with the usual GPL (or OSS) suspects.
Better - "performance, reliability, scalability, security & total cost of ownership" [not to mention flexibility & choice] use of and adherence to open standards (where practical and fair) and it is software libre.
Get With the Future Way Better!


