From 97603990aeadd04dea20b1d4e0b294ae307ce1a3 Mon Sep 17 00:00:00 2001
From: Ralph Amissah
Date: Mon, 29 Apr 2024 21:29:23 -0400
Subject: sisudoc homepage, update sync with actual

---
 org/spine-bespoke-output-homepage-html.org | 105 ++++++++++++++++++-----------
 1 file changed, 67 insertions(+), 38 deletions(-)

diff --git a/org/spine-bespoke-output-homepage-html.org b/org/spine-bespoke-output-homepage-html.org
index 35c916b..64665f3 100644
--- a/org/spine-bespoke-output-homepage-html.org
+++ b/org/spine-bespoke-output-homepage-html.org
@@ -35,15 +35,30 @@ formats & search

ℹ - A short description

- SiSU is an object-centric, lightweight markup based, document structuring, parser, publishing and search tool for document collections. It is command line -oriented and generates static content that is currently made searchable at an -object level through an SQL database. -Markup helps define (delineate) objects (primarily various types of text block) -which are tracked in sequence, substantive objects being numbered sequentially -by the program for object citation. +oriented and generates static content that is made searchable at an object level +through an SQL database. +

+

+SiSU markup helps define (delineate) objects (primarily various types of text +block) which are tracked in sequence, substantive objects being numbered +sequentially by the program for object citation. Breaking a document into numbered +objects provides interesting possibilities. These object numbers provide the +possibility of citing/locating text precisely across different document formats +and different languages (assuming the document has been translated). For search +it also makes it possible to identify precisely where within each document +search criteria are met in the form of an index. Additionally the use of objects +(and that objects are numbered) frees the possibility to represent the document +in the manner considered most suitable to a specific document format (whilst +retaining its structural (and citation) integrity). +

+ +

+Objects, which include their inherent associated properties (which vary by type +of object), constitute the building blocks of a document from which alternative +representations of that document can be built (or imagined).

Δ - SiSU project source

@@ -81,10 +96,11 @@ by the program for object citation. To give an idea of how this works here is a small collection of documents marked up for and generated by the software. The curation of topics for a collection of specialized related documents would benefit from a consistently applied bespoke -ontology or thesaurus.
The documents presented are documents that have been -released under various creative commons licences, in the public domain, or the -author's work, with the exception of one that is under GPL and the old abandoned -Debian live-manual +ontology or thesaurus. +
+The documents presented have been released under various +Creative Commons licences, are in the public domain, or are the author's own work, with the +exception of one that is under the GPL and the old abandoned Debian live-manual

@@ -134,7 +150,6 @@ Debian live-manual

ℹ - SiSU description

- SiSU is an object-centric, lightweight markup based, document structuring, parser, publishing and search tool for document collections. It is command line oriented and generates static content that is currently made searchable at an @@ -142,10 +157,9 @@ object level through an SQL database. Markup helps define (delineate) objects (primarily various types of text block) which are tracked in sequence, substantive objects being numbered sequentially by the program for object citation. -

-

+

Summary. An object is a unit of text within a document the most common being a paragraph. Objects include individual headings, paragraphs, tables, grouped text of various types such as code blocks and within poems, verse. @@ -159,24 +173,21 @@ themselves, rather belonging to the object from which they are referenced, and following their own numbering sequence. From heading objects (linked) tables of content may be generated, and if additional metadata is provided book type indexes can be generated that link back to the objects to which they relate. -

-

+

Unpacking this a bit further. SiSU as a concept independent of its markup language and the parsers that have been implemented, is based on the following ideas: -

-

+

Object-Centricity. On objects: In SiSU objects are the fundamental unit from which larger constructs within a document and the document itself is built. Breaking the document into objects provides interesting possibilities. -

-

+

Objects are fundamental building blocks: Conceptually within SiSU, objects are the building blocks or individual units of construction of a document. Objects are usually blocks of text, the most common of which is the @@ -187,21 +198,20 @@ formatted and placed as needed, providing flexibility and enabling multiple types of representation across disparate formats and text receptacles, examples including html, epub, latex (in the past mind-maps) and sql (populated at an object level, and thereby providing search with that degree of granularity). -

-

+

Sequential. Objects have sequence: That objects have sequence, goes largely without saying, this follows authorship, it is part of the definition of a document and how a document is written to convey meaning. -

-

+

Object Numbers & Citation. Substantive objects are numbered for citation purposes: Most objects within a document are meant by the author to be a substantive part of the document. All such objects are numbered sequentially and can be referenced thereby for citation purposes. +
Object numbers provide the possibility of citing/locating text precisely across different document formats and different languages (assuming the document has been translated). For search it also makes it possible to identify precisely @@ -211,8 +221,8 @@ interest. Additionally the use of objects (and that objects are numbered) frees the possibility to represent the document in the manner considered most suitable to a specific document format (whilst retaining its structural (and citation) integrity). -

+

Characteristics. Objects have properties and attributes: Objects have properties (and may have attributes). By properties I here refer to the @@ -221,10 +231,9 @@ Attributes extend further and may include other things that one might wish to associate with the object (examples not necessarily currently available/ implemented in SiSU might include, formatting whether it is indented, or metadata e.g. the associated language, or programming language for a code block) -

-

+

Document structure. Heading objects hold document structure: Heading objects hold document structure through their heading level property. The types of document of interest to SiSU have structure that is captured by the heading @@ -233,44 +242,42 @@ additional properties that (i) they may be regarded as containing the other objects following them sequentially (until the next heading of a similar or higher level), heading objects may include other headings (sub-headings), and (ii) that they have a hierarchy, the root "heading" being the document -title.
A complication was introduced to provide greater flexibility across -document output formats. Headings have two sets of levels, the level under which +title. +
+A complication was introduced to provide greater flexibility across document +output formats. Headings have two sets of levels, the level under which substantive text occurs, this would be a chapter or segment level, and above that in the hierarchy if needed are document section separators, book, section, part. -

-

+

Non-objects. Most but not all parts of a document are treated as objects. Notably footnotes are not objects in themselves, rather belonging to the object from which they are referenced, and following their own numbering sequence. From heading objects (linked) tables of content may be generated, and if additional metadata is provided book type indexes can be generated that link back to the objects to which they relate. -

-

+

The Document Header. SiSU documents have headers which contain document metadata, at a minimum the document title and author. In addition the document header may contain markup instructions (e.g. how to identify headings within the document, in which case those headings need not be found and treated accordingly) -

-

+

SiSU parsers have now been implemented in different programming paradigms and languages a couple of times, the chosen markup has been left unchanged though the document headers have been modified. - +
This is the core of SiSU, beyond which there is more but largely in the form of choices based on ... existing output formats and of implementation detail, deciding what attributes of objects, or within objects should be supported, extending markup to allow for the generation of book indexes if tagging is provided. -

ℹ - SiSU Historical Descriptions

@@ -363,8 +370,9 @@ software evaluator in London June 2004 that came about through a chance encounter with an IBM manager at a Linux Expo, who was curious about my interest in Gnu/Linux with my legal background... on hearing that I also wrote software, he suggested, maybe IBM should have a look at it. I was interested, the meeting -was set up... with an IBM, Software Innovations evaluator
His response after -the meeting: +was set up... with an IBM, Software Innovations evaluator +
+His response after the meeting:

@@ -439,6 +447,27 @@ process. more up to date output format representations.

+

ℹ - Some Observations

+ +

+SiSU is more suited to finalized/stratified/published writings (writings, +articles, books), that are to remain and be referenced as published, +representing a work or ideas, set at a given time. (As opposed to the +increasingly prevalent and important forms of fluid text). +

+ +

+Trained AI likely could assist in the preparation of documents (with SiSU +markup), with resulting deterministic and reproducible outputs (for substantive +document objects). Caveats: Where text objects may be in blocks (or not) there +is some room for discretion and ambiguity in the markup with resulting +possibility of differences in the resulting presentation of a document. Book +indexes are another area that if desired is markup intensive and unless +following an already published index, can be prepared differently and possibly +improved over time, and for specialised collections on a subject area could +potentially be prepared against a thesaurus. +

+

ralph.amissah www since 1993 ;-) -- cgit v1.2.3
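
Note on the object model described in the patched text: the following is a minimal sketch of sequential object numbering, not SiSU's actual implementation. It assumes a hypothetical `DocObject` record with a `substantive` flag and an `ocn` field (both names invented here for illustration), and it assumes that every heading, including the title, is numbered. Footnotes are kept out of the object sequence and given their own running numbers, as the description says.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DocObject:
    """Hypothetical stand-in for a document object; not SiSU's data model."""
    kind: str                             # e.g. "heading", "para", "code"
    text: str
    heading_level: Optional[int] = None   # only meaningful for headings
    substantive: bool = True              # assumption: the caller decides
    footnotes: List[str] = field(default_factory=list)
    ocn: Optional[int] = None             # object number, assigned below


def number_objects(objects: List[DocObject]) -> List[Tuple[int, Optional[int], str]]:
    """Number substantive objects in document order.

    Footnotes are not objects: they follow their own sequence and stay
    attached to the object that references them.
    Returns (footnote number, owning object number, footnote text) tuples.
    """
    next_ocn = 1
    next_note = 1
    notes = []
    for obj in objects:
        if obj.substantive:
            obj.ocn = next_ocn
            next_ocn += 1
        for note in obj.footnotes:
            notes.append((next_note, obj.ocn, note))
            next_note += 1
    return notes


def table_of_contents(objects: List[DocObject]) -> List[Tuple[Optional[int], str, Optional[int]]]:
    """Collect (heading level, heading text, object number) from headings."""
    return [(o.heading_level, o.text, o.ocn) for o in objects if o.kind == "heading"]


doc = [
    DocObject("heading", "A Title", heading_level=1),
    DocObject("para", "First paragraph.", footnotes=["A note."]),
    DocObject("heading", "Chapter One", heading_level=2),
    DocObject("para", "Second paragraph."),
]
notes = number_objects(doc)
toc = table_of_contents(doc)
```

Because the numbers are assigned from document order alone, the same number would identify the same object in any output format built from this list, which is the property the description relies on for citation and search.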