From 306aed5b8a559aad2fb944a946ffdda9713f07ec Mon Sep 17 00:00:00 2001
From: Ralph Amissah
Date: Sat, 6 Mar 2010 09:47:55 -0500
Subject: introducing version 2, major patch, (version 1 libraries retained)

---
 README | 117 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 99 insertions(+), 18 deletions(-)

(limited to 'README')

diff --git a/README b/README
index d1bd6870..d3294b8f 100644
--- a/README
+++ b/README
@@ -1,6 +1,8 @@
-SiSU 1.0 2009
+%% SiSU versions 1 & 2, 2010
 Homepage:
-* README CHANGELOG
+* README CHANGELOG CHANGELOG_v1 CHANGELOG_v2
+
+Herein (this package) reside SiSU versions 1 and 2.

 %% Description
 ---------------
@@ -11,10 +13,10 @@ Homepage:
  formats.

  With minimal preparation of a plain-text (UTF-8) file using its native
- markup-syntax, SiSU produces: plain-text, HTML, XHTML, XML, ODF:ODT
- (Opendocument), LaTeX, PDF, and populates an SQL database (PostgreSQL or
- SQLite) with text objects, roughly, paragraph sized chunks so that document
- searches are done at this level of granularity.
+ markup-syntax, SiSU produces: plain-text, HTML, XHTML, XML, EPUB (v2 only)
+ ODF:ODT (Opendocument), LaTeX, PDF, and populates an SQL database (PostgreSQL
+ or SQLite) with text objects, roughly, paragraph sized chunks so that
+ document searches are done at this level of granularity.

  Outputs share a common citation numbering system, associated with text
  objects and any semantic meta-data provided about the document.
@@ -47,6 +49,56 @@ Homepage:

  Homepage:

+%% Take 2
+---------
+
+The ideas behind SiSU evolved working with managing static, published documents
+that needed to be citable, ideally searchable and preferably available in
+multiple formats over a period of time with a rapidly changing World Wide Web.
+Initial experience was in 1993, one issue being that the document content
+remained the same, but presentation needed to be updated with changing formats,
+html in particular has really changed since then.
+
+So the idea was to provide a minimal markup requirement for documents that
+remained the same, and a generator to convert that markup custom producing
+various output types. This made it possible to:
+
+* have a marked-up document set and continue improving the presentation, as the
+generators code was updated, e.g. update HTML as it evolves, and improve upon
+LaTeX driven pdf output
+
+* have available new document formats/ output types as they came to be of
+interest, e.g. version 2 includes EPUB
+
+* produce a citation system that is available across different output types,
+text based on objects (rather than page numbers), i.e. you can accurately and
+reliably cite text within a document regardless of the document format version
+that is being looked at
+
+* take advantage of the strengths of disparate technologies representing text,
+each output type being custom generated for that format, the object citation
+system lends itself as a result is that there is little necessity that one
+output type should be based on or related to another, just that the content is
+preserved and presented in a way that is well suited to the output type in
+question
+
+* produce consistent quality presentation for material, suitable where
+substance/content is more important than appearance, there is some sacrifice of
+flexibility and no concept of wysiwyg, e.g. there is no attempt to make pdf
+output identical to html, rather the system attempts to take advantage of
+making the best presentation it can in each output format taking advantage of
+the strengths of that format available to it given the minimal markup (sisu
+document preparation); the citation system ensures you can pinpoint the same
+text
+
+SiSU works best:
+
+* with published works (e.g. books, articles), static documents the content of
+which is changed rarely, and ideally when they do in the form of a new edition.
+
+* for literature and law related content
+
+SiSU uses Unicode, utf-8 where it is available,
 -----

 SiSU - simple information structuring universe, is a publishing tool, document
@@ -64,9 +116,9 @@ Amongst it's characteristics are:

 * simple mnemonoic markup style,

-* the ability to produce multiple output formats, including html, XML, LaTeX,
-pdf (via LaTeX), stream to a relational database whilst retaining document
-structure - Postgresql and Sqlite,
+* the ability to produce multiple output formats, including html, XML, EPUB,
+LaTeX, pdf (via LaTeX), stream to a relational database whilst retaining
+document structure - Postgresql and Sqlite,

 * that all share a common citation system (a simple idea from which much good),
 possibly most exciting, the following: if fed into a relational database (as it
@@ -93,17 +145,17 @@ Once set up it is simple to use.

 Within the SiSU tarball:

- ./data/doc/sisu/v1/sisu_markup_samples/sisu_manual
+ ./data/doc/sisu/v2/sisu_markup_samples/sisu_manual

 Once installed, directory equivalent to:

-
+

 Available man pages are converted back to html using man2html:

-
+

- ./data/doc/sisu/v1/html/
+ ./data/doc/sisu/v2/html/

 %% Online Information, places to look
 ---------------
@@ -285,7 +337,7 @@ the first document).

 After installation of sisu-complete, move to the document samples directory

- cd /usr/share/doc/sisu/v1/sisu_markup_samples/dfsg
+ cd /usr/share/doc/sisu/v2/sisu_markup_samples/dfsg

 and run

@@ -426,10 +478,10 @@ and
 Sample marked up document are provided with the download tarball in the
 directory:

- ./data/doc/sisu/v1/sisu_markup_samples/dfsg
+ ./data/doc/sisu/v2/sisu_markup_samples/dfsg

 These are installed on the system usually at:
- /usr/share/doc/sisu/v1/sisu_markup_samples/dfsg
+ /usr/share/doc/sisu/v2/sisu_markup_samples/dfsg

 More markup samples are available in the package sisu-markup-samples

@@ -442,11 +494,40 @@ Many more are available online off:
 There is syntax support for some editors provided (together with a README file)
 in

- ./data/sisu/v1/conf/editor-syntax-etc
+ ./data/sisu/v2/conf/editor-syntax-etc

 usually installed to:

- /usr/share/sisu/v1/conf/editor-syntax-etc
+ /usr/share/sisu/v2/conf/editor-syntax-etc
+
+v1, v2 Changes
+---------------
+
+See changelogs
+
+From a developer's perspective the substantive change between the two versions
+is to the middle layer, (the document abstraction, the intermediate document
+representation used in processing). Version 1 uses strings and relies on
+regular expressions to identify document objects, while Version 2 uses ruby
+objects. The version 1 approach whilst programming language neutral offers less
+control, and leads to complicated code; version 2 approach takes advantage of
+features within the ruby language suited to what the application does.
+Development is currently on version 2, version 1 is likely to remain for some
+time as a reference implementation.
+
+%% v1, v2 Compatibility Notes
+---------------
+
+Versions 1 and 2 are not quite compatible, version 1 and version 2 will run
+against each other's documents but document metadata, and processing
+instructions may be lost.
+
+On the input side, version 1 and 2 headers are different, version 2 headers
+have been tidied, see document markup samples provided
+
+On the output side, the sql databases produced if search is to be implemented
+are not the same and a database must be generated for each version, most other
+differences should be relatively cosmetic.

 %% License
 ---------------
--
cgit v1.2.3