author Ralph Amissah <ralph@amissah.com> 2008-03-22 17:06:21 +0000
committer Ralph Amissah <ralph@amissah.com> 2008-03-22 17:06:21 +0000
commit d8823ad680d74d53bc324115ac97515551fd9535
tree 9ea6c02ea90306a3317b76ca89773cd02159881e
parent Updated sisu-0.66.0
parent tex to pdf, xetex (utf8) added as alternative to pdftex
Merge branch 'upstream' into debian/sid
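
The upstream change merged here adds xetex as an alternative to pdftex. As a rough illustration only, the selection logic can be sketched in standalone Ruby as below. The constant ENGINES and the helpers program_on_path?, pick_engine and tex_command are hypothetical names introduced for this sketch and are not part of SiSU; the preference order and the -interaction/-fmt arguments are copied from the sysenv.rb hunk in this commit.

```ruby
# A minimal sketch, assuming the preference order used in the sysenv.rb hunk.
# All names here are illustrative and do not exist in SiSU.
ENGINES = %w[xetex xelatex pdflatex pdfetex pdftex].freeze

# True if an executable file called `program` exists in a PATH directory.
def program_on_path?(program, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, program)
    File.file?(candidate) && File.executable?(candidate)
  end
end

# Return the first engine, in preference order, that the availability
# check accepts, or nil if none does. A block may replace the PATH lookup.
def pick_engine(engines = ENGINES, &available)
  available ||= method(:program_on_path?)
  engines.find { |engine| available.call(engine) }
end

# Build the command line for an engine; nil for an unrecognised engine.
def tex_command(engine, input, mode = 'batchmode')
  case engine
  when 'xetex'
    "xetex -interaction=#{mode} -fmt=xelatex #{input}"
  when 'pdftex', 'pdfetex'
    "#{engine} -interaction=#{mode} -fmt=pdflatex #{input}"
  when 'xelatex', 'pdflatex'
    "#{engine} -interaction=#{mode} #{input}"
  end
end
```

Separating the lookup from the command construction lets the caller warn when pick_engine returns nil instead of passing an empty command to system.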
-rw-r--r-- CHANGELOG 18
-rw-r--r-- data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_faq.sst 44
-rw-r--r-- lib/sisu/v0/shared_xml.rb 110
-rw-r--r-- lib/sisu/v0/sysenv.rb 48
-rw-r--r-- lib/sisu/v0/texpdf_format.rb 472
5 files changed, 474 insertions, 218 deletions
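
Much of the shared_xml.rb change renames the abbreviations used for semantic tags (for example a becomes au, t becomes ti). The substitution those abbreviations drive can be sketched roughly as follows. SEM_TAGS and expand_sem_tags are hypothetical names for this illustration, the table is a small subset of the one in the diff, and the real method additionally handles paired and recursive blocks that this sketch leaves out.

```ruby
# A minimal sketch of abbreviation-driven tagging; not the SiSU implementation.
# SEM_TAGS is an illustrative subset of the table in shared_xml.rb.
SEM_TAGS = { au: 'author', ti: 'title', y: 'year', org: 'organization' }.freeze

# Expand ":{text}:abbr" (depth "one") and ";{text};abbr" (depth "zero")
# into <sem:NAME> elements. Unknown abbreviations are used as the tag name.
def expand_sem_tags(para, tags = SEM_TAGS)
  out = para.dup
  out.gsub!(/:\{(.+?)\}:([a-z]+)\b/m) do
    text = Regexp.last_match(1)
    abbr = Regexp.last_match(2)
    name = tags.fetch(abbr.to_sym, abbr)
    %(<sem:#{name} depth="one">#{text}</sem:#{name}>)
  end
  out.gsub!(/;\{([^}]+)\};([a-z]+)\b/m) do
    text = Regexp.last_match(1)
    abbr = Regexp.last_match(2)
    name = tags.fetch(abbr.to_sym, abbr)
    %(<sem:#{name} depth="zero">#{text}</sem:#{name}>)
  end
  out
end

puts expand_sem_tags(';{Some Title};ti by :{A. Writer}:au')
```

A single lookup table keeps the opening and closing tag names in step, which avoids the kind of mismatch that hand-written per-tag substitutions can introduce.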
diff --git a/CHANGELOG b/CHANGELOG
index 9f1e2c45..cf2ab222 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -9,11 +9,23 @@ Reverse Chronological:
%% STABLE MANIFEST
+%% sisu_0.66.1.orig.tar.gz (2008-03-22:11/6)
+http://www.jus.uio.no/sisu/pkg/src/sisu_0.66.1.orig.tar.gz
+ sisu_0.66.1.orig.tar.gz
+ sisu_0.66.1-1.dsc
+ sisu_0.66.1-1.diff.gz
+
+ * tex to pdf, xetex (utf8) added as alternative to pdftex
+ [for now special character processing is separate, consider merging common
+ parts, that is, most of it]
+
+ * debian [add] texlive-xetex
+
%% sisu_0.66.0.orig.tar.gz (2008-02-24:07/7)
http://www.jus.uio.no/sisu/pkg/src/sisu_0.66.0.orig.tar.gz
- sisu_0.66.0.orig.tar.gz
- sisu_0.66.0-1.dsc
- sisu_0.66.0-1.diff.gz
+ b45d81d949590a9b24924589bc98032b 1492653 sisu_0.66.0.orig.tar.gz
+ 3d02ba34822075bea890eaa3ff666ef9 629 sisu_0.66.0-1.dsc
+ 161a19d61d48713be4890bc9d00bed18 146339 sisu_0.66.0-1.diff.gz
* ruby identify program files as utf-8
# coding: utf-8
diff --git a/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_faq.sst b/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_faq.sst
index 795367d3..f7fead86 100644
--- a/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_faq.sst
+++ b/data/doc/sisu/sisu_markup_samples/sisu_manual/sisu_faq.sst
@@ -6,7 +6,7 @@
@creator: Ralph Amissah
-@rights: Copyright (C) Ralph Amissah 2007, part of SiSU documentation, License GPL 3
+@rights: Copyright (C) Ralph Amissah 2008, part of SiSU documentation, License GPL 3
@type: information
@@ -18,9 +18,9 @@
@date.issued: 2006-09-06
-@date.modified: 2007-09-16
+@date.modified: 2008-03-12
-@date: 2007-09-16
+@date: 2008-03-12
@level: new=C; break=1; num_top=1
@@ -132,6 +132,16 @@ Where there are large document sets, it provides consistency in appearance in ea
The excuse for going this way is, it is a waste of time to think much about appearance when working on substantive content, it is the substantive content that is relevant, not the way it looks beyond the basic informational tags - and yet you want to be able to take advantage of as many useful different ways of representing documents as are available, and for various types of output to be/look as good as it can for each medium/format in which it is presented, (with different mediums having different focuses) and SiSU tries to achieve this from minimal markup.
+2~ Can the SiSU markup be used to prepare for a LaTeX automatic building of an index to the work?
+
+It has not been, though it is of interest. The question in introducing such possibilities is how to keep them as unobtrusive as possible, and as generically relevant as possible to other output formats (which is why the focus is on object numbers). Unobtrusive refers both to the markup (where there is no big problem with introducing optional extras) and, more challengingly, to how to minimise the impact on competing ideas/interests, such as allowing the addition of semantic tags which could be tied to objects and mapped against the objects that contain them (permitting mapping and mining of content in various ways that would be largely agnostic of output format, object numbering being an attempt to move beyond output-format-based content locators such as page numbers). The desire is to be a meta markup, to maintain agnosticism as to what is being generated, and in development to favor solutions of that nature. Keep bridging LaTeX, XML, SQL ... make use of objects and serialisation for mapping, whether against content or meta-content (such as semantic [or additional structural] markers).
+
+2~ Can the conversion from SiSU to LaTeX be modified if we have special needs for the LaTeX, or do we need to modify the LaTeX manually?
+
+Should be possible to modify code, it is GPLv3, should be possible either to modify existing modules or write an independent module for generating bespoke latex. Generic improvements are welcome for inclusion/incorporation in the existing code base.
+
+If there are tools to generate mathematical/scientific formula from latex to images (jpg, png), the latex parser could conceivably be used to make these available to other output formats.
+
2~ How do I create GIN or GiST index in Postgresql for use in SiSU
This at present needs to be done "manually" and it is probably necessary to alter the sample search form. The following is a helpful response from one of the contributors of GiN to Postgresql Oleg Bartunov 2006-12-06:
@@ -175,11 +185,33 @@ Now you can search:
select lid, metadata_tid, rank_cd(fts, q,2)as rank from document, plainto_tsquery('markup syntax') q where q @@ fts order by rank desc limit 10;
+2~ Are there some examples of using Ferret Search with a SiSU repository?
+
+Heard good things about Ferret, but have not used it. The output directory structure and content produced by SiSU is very uniform. Have looked at a couple of other engines (hyperestraier, lucene). There it was enough to identify the files that needed to be indexed and pass them to the search indexing tool. Some Unix rune doing the job, such as:
+
+code{
+
+find /home/ralph/sisu_www -type f | \
+egrep '/sisu_www/(sisu|document_archive)/.+?.html$' | \
+egrep -v '(doc|concordance).html$' | \
+estcmd gather -sd casket -
+
+}code
+
+you would have to experiment with what gives the desired result, the file doc.html is the complete text in html (there are additional smaller html segments), and plain.txt the document as a text file. It may be possible to index the text file and return the html document.
+
+
+2~ Have you had any reports of building SiSU from tar on Mac OS 10.4?
+
+None. In the early days of its release a Mac friend built and ran the ruby code part that did not rely on system calls to bits like the latex engine. That is already some years back. He was not into writing or document markup, and did it as a favour at the time. I have not followed up that thread of development.
+
+It should however be possible, much of the output relies on plain ruby, and the system commands to latex etc. could be made appropriate for the underlying OS.
+
2~ Where is version 1.0?
-SiSU works pretty well as it is supposed to.
-Version 1.0 will have the current markup, and directory structure.
-At this point it is largely a matter of choice as to when the name change is made.
+Most of SiSU is mature and stable.
+Version 1.0 will be based on the current markup, (more likely with optional additions rather than significant changes) and directory structure.
+At this point (semantic tagging apart) it is largely a matter of choice as to when the version change is made.
The feature set for html,~{ html w3c compliance has been largely met. }~ LaTeX/pdf and opendocument is in place.
XML, and plaintext are in order.
diff --git a/lib/sisu/v0/shared_xml.rb b/lib/sisu/v0/shared_xml.rb
index abc6cc1a..c93eff5b 100644
--- a/lib/sisu/v0/shared_xml.rb
+++ b/lib/sisu/v0/shared_xml.rb
@@ -161,35 +161,46 @@ module SiSU_XML_munge
@dp=SiSU_Env::Info_env.new.digest.pattern
@url_brace=SiSU_Viz::Skin.new.url_decoration
if @md.sem_tag
+ #@ab ||=SiSU_Viz::Skin.new.semantic_tags.default
@ab ||=semantic_tags.default
end
end
def semantic_tags
def default
{
- :pub => 'publication',
- :ref => 'reference',
- :desc => 'description',
- :conv => 'convention',
- :vol => 'volume',
- :pg => 'page',
- :ct => 'cite',
- :cty => 'city',
- :org => 'organization',
- :d => 'date',
- :t => 'title',
- :a => 'author',
- :n => 'name',
- :fn => 'firstname',
- :f => 'firstname',
- :mn => 'middlename',
- :m => 'middlename',
- :ln => 'lastname',
- :l => 'lastname',
- :i => 'initials',
- :q => 'quote',
- :y => 'year',
- :ab => 'abreviation',
+ :pub => 'publication',
+ :conv => 'convention',
+ :vol => 'volume',
+ :pg => 'page',
+ :cty => 'city',
+ :org => 'organization',
+ :uni => 'university',
+ :dept => 'department',
+ :fac => 'faculty',
+ :inst => 'institute',
+ :co => 'company',
+ :com => 'company',
+ :conv => 'convention',
+ :dt => 'date',
+ :y => 'year',
+ :m => 'month',
+ :d => 'day',
+ :ti => 'title',
+ :au => 'author',
+ :ed => 'editor', #editor?
+ :v => 'version', #edition
+ :n => 'name',
+ :fn => 'firstname',
+ :mn => 'middlename',
+ :ln => 'lastname',
+ :in => 'initials',
+ :qt => 'quote',
+ :ct => 'cite',
+ :ref => 'reference',
+ :ab => 'abreviation',
+ :def => 'define',
+ :desc => 'description',
+ :trans => 'translate',
}
end
self
@@ -460,7 +471,7 @@ module SiSU_XML_munge
para
end
def xml_sem_block_paired(matched) # colon depth: many, recurs
- matched.gsub!(/\b(a):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:a]} depth="many">\\2</sem:#{@ab[:a]}>}) # sem :
+ matched.gsub!(/\b(au):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:au]} depth="many">\\2</sem:#{@ab[:au]}>}) # sem :
matched.gsub!(/\b(vol):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:vol]} depth="many">\\2</sem:#{@ab[:vol]}>}) # sem :
matched.gsub!(/\b(pub):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:pub]} depth="many">\\2</sem:#{@ab[:pub]}>}) # sem :
matched.gsub!(/\b(ref):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:ref]} depth="many">\\2</sem:#{@ab[:ref]}>}) # sem :
@@ -469,7 +480,7 @@ module SiSU_XML_munge
matched.gsub!(/\b(ct):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:ct]} depth="many">\\2</sem:#{@ab[:ct]}>}) # sem :
matched.gsub!(/\b(cty):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:cty]} depth="many">\\2</sem:#{@ab[:cty]}>}) # sem :
matched.gsub!(/\b(org):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:org]} depth="many">\\2</sem:#{@ab[:org]}>}) # sem :
- matched.gsub!(/\b(d):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:d]} depth="many">\\2</sem:#{@ab[:d]}>}) # sem :
+ matched.gsub!(/\b(dt):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:dt]} depth="many">\\2</sem:#{@ab[:dt]}>}) # sem :
matched.gsub!(/\b(n):\{(.+?)\}:\1\b/m, %{<sem:#{@ab[:n]} depth="many">\\2</sem:#{@ab[:n]}>}) # sem :
matched.gsub!(/([a-z]+(?:[_:.][a-z]+)*)(?::\{(.+?)\}:\1)/m,'<sem:\1 depth="many">\2</sem:\1>') # sem :
end
@@ -479,28 +490,37 @@ module SiSU_XML_munge
para.gsub!(/([a-z]+(?:[_:.][a-z]+)*)(?::\{(.+?)\}:\1)/m) {|c| xml_sem_block_paired(c) } # sem :
para.gsub!(/([a-z]+(?:[_:.][a-z]+)*)(?::\{(.+?)\}:\1)/m) {|c| xml_sem_block_paired(c) } # sem :
#colon one / single / flat / shallow
- para.gsub!(/:\{(.+?)\}:a\b/m, %{<sem:#{@ab[:a]} depth="one">\\1</sem:#{@ab[:a]}>}) # sem :
- para.gsub!(/:\{(.+?)\}:n\b/m, %{<sem:#{@ab[:n]} depth="one">\\1</sem:#{@ab[:n]}>}) # sem :
- para.gsub!(/:\{(.+?)\}:t\b/m, %{<sem:#{@ab[:t]} depth="one">\\1</sem:#{@ab[:t]}>}) # sem :
- para.gsub!(/:\{(.+?)\}:ref\b/m, %{<sem:#{@ab[:ref]} depth="one">\\1</sem:#{@ab[:ref]}>}) # sem :
- para.gsub!(/:\{(.+?)\}:desc\b/m, %{<sem:#{@ab[:desc]} depth="one">\\1</sem:#{@ab[:desc]}>}) # sem :
- para.gsub!(/:\{(.+?)\}:cty\b/m, %{<sem:#{@ab[:cty]} depth="one">\\1</sem:#{@ab[:cty]}>}) # sem :
- para.gsub!(/:\{(.+?)\}:org\b/m, %{<sem:#{@ab[:org]} depth="one">\\1</sem:#{@ab[:org]}>}) # sem :
+ para.gsub!(/:\{(.+?)\}:au\b/m, %{<sem:#{@ab[:au]} depth="one">\\1</sem:#{@ab[:au]}>}) # sem :
+ para.gsub!(/:\{(.+?)\}:n\b/m, %{<sem:#{@ab[:n]} depth="one">\\1</sem:#{@ab[:n]}>}) # sem :
+ para.gsub!(/:\{(.+?)\}:ti\b/m, %{<sem:#{@ab[:ti]} depth="one">\\1</sem:#{@ab[:ti]}>}) # sem :
+ para.gsub!(/:\{(.+?)\}:ref\b/m, %{<sem:#{@ab[:ref]} depth="one">\\1</sem:#{@ab[:ref]}>}) # sem :
+ para.gsub!(/:\{(.+?)\}:desc\b/m, %{<sem:#{@ab[:desc]} depth="one">\\1</sem:#{@ab[:desc]}>}) # sem :
+ para.gsub!(/:\{(.+?)\}:cty\b/m, %{<sem:#{@ab[:cty]} depth="one">\\1</sem:#{@ab[:cty]}>}) # sem :
+ para.gsub!(/:\{(.+?)\}:org\b/m, %{<sem:#{@ab[:org]} depth="one">\\1</sem:#{@ab[:org]}>}) # sem :
para.gsub!(/:\{(.+?)\}:([a-z]+(?:[_:.][a-z]+)*)/m,'<sem:\2 depth="one">\1</sem:\2>') # sem :
#semicolon zero / none
- para.gsub!(/;\{([^}]+(?![;]))\};t\b/m, %{<sem:#{@ab[:t]} depth="zero">\\1</sem:#{@ab[:t]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};q\b/m, %{<sem:#{@ab[:q]} depth="zero">\\1</sem:#{@ab[:q]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};ref\b/m, %{<sem:#{@ab[:ref]} depth="zero">\\1</sem:#{@ab[:ref]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};desc\b/m,%{<sem:#{@ab[:desc]} depth="zero">\\1</sem:#{@ab[:desc]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};y\b/m, %{<sem:#{@ab[:y]} depth="zero">\\1</sem:#{@ab[:y]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};ab\b/m, %{<sem:#{@ab[:ab]} depth="zero">\\1</sem:#{@ab[:ab]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};pg\b/m, %{<sem:#{@ab[:pg]} depth="zero">\\1</sem:#{@ab[:pg]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};fn?\b/m, %{<sem:#{@ab[:fn]} depth="zero">\\1</sem:#{@ab[:fn]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};mn?\b/m, %{<sem:#{@ab[:mn]} depth="zero">\\1</sem:#{@ab[:mn]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};ln?\b/m, %{<sem:#{@ab[:ln]} depth="zero">\\1</sem:#{@ab[:ln]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};i\b/m, %{<sem:#{@ab[:i]} depth="zero">\\1</sem:#{@ab[:i]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};org\b/m, %{<sem:#{@ab[:org]} depth="zero">\\1</sem:#{@ab[:org]}>}) # sem ;
- para.gsub!(/;\{([^}]+(?![;]))\};cty\b/m, %{<sem:#{@ab[:cty]} depth="zero">\\1</sem:#{@ab[:cty]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};ti\b/m, %{<sem:#{@ab[:ti]} depth="zero">\\1</sem:#{@ab[:ti]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};qt\b/m, %{<sem:#{@ab[:qt]} depth="zero">\\1</sem:#{@ab[:qt]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};ref\b/m, %{<sem:#{@ab[:ref]} depth="zero">\\1</sem:#{@ab[:ref]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};ed\b/m, %{<sem:#{@ab[:ed]} depth="zero">\\1</sem:#{@ab[:ed]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};v\b/m, %{<sem:#{@ab[:v]} depth="zero">\\1</sem:#{@ab[:v]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};desc\b/m, %{<sem:#{@ab[:desc]} depth="zero">\\1</sem:#{@ab[:desc]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};def\b/m, %{<sem:#{@ab[:def]} depth="zero">\\1</sem:#{@ab[:def]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};trans\b/m, %{<sem:#{@ab[:trans]} depth="zero">\\1</sem:#{@ab[:trans]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};y\b/m, %{<sem:#{@ab[:y]} depth="zero">\\1</sem:#{@ab[:y]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};ab\b/m, %{<sem:#{@ab[:ab]} depth="zero">\\1</sem:#{@ab[:ab]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};pg\b/m, %{<sem:#{@ab[:pg]} depth="zero">\\1</sem:#{@ab[:pg]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};fn?\b/m, %{<sem:#{@ab[:fn]} depth="zero">\\1</sem:#{@ab[:fn]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};mn?\b/m, %{<sem:#{@ab[:mn]} depth="zero">\\1</sem:#{@ab[:mn]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};ln?\b/m, %{<sem:#{@ab[:ln]} depth="zero">\\1</sem:#{@ab[:ln]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};in\b/m, %{<sem:#{@ab[:in]} depth="zero">\\1</sem:#{@ab[:in]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};uni\b/m, %{<sem:#{@ab[:uni]} depth="zero">\\1</sem:#{@ab[:uni]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};fac\b/m, %{<sem:#{@ab[:fac]} depth="zero">\\1</sem:#{@ab[:fac]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};inst\b/m, %{<sem:#{@ab[:inst]} depth="zero">\\1</sem:#{@ab[:inst]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};dept\b/m, %{<sem:#{@ab[:dept]} depth="zero">\\1</sem:#{@ab[:dept]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};org\b/m, %{<sem:#{@ab[:org]} depth="zero">\\1</sem:#{@ab[:org]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};com?\b/m, %{<sem:#{@ab[:com]} depth="zero">\\1</sem:#{@ab[:com]}>}) # sem ;
+ para.gsub!(/;\{([^}]+(?![;]))\};cty\b/m, %{<sem:#{@ab[:cty]} depth="zero">\\1</sem:#{@ab[:cty]}>}) # sem ;
para.gsub!(/;\{([^}]+(?![;]))\};([a-z]+(?:[_:.][a-z]+)*)/m,'<sem:\2 depth="zero">\1</sem:\2>') # sem ;
end
para
diff --git a/lib/sisu/v0/sysenv.rb b/lib/sisu/v0/sysenv.rb
index 9cf14507..816c72b7 100644
--- a/lib/sisu/v0/sysenv.rb
+++ b/lib/sisu/v0/sysenv.rb
@@ -1,4 +1,4 @@
-# coding: utf-8
+# coding: utf-8
=begin
* Name: SiSU
@@ -647,30 +647,36 @@ module SiSU_Env
else puts "\tWARN: #{program} is not installed #{program_ref}"
end
end
- def latex2pdf #convert from latex to pdf
- prog=[]
- prog=['pdflatex','pdfetex','pdftex']
- program_ref="\n\t\tSee http://www.tug.org/applications/pdftex/\n\t\tOn Debian this is is included in tetex-extra"
+ def tex2pdf_engine
+ prog=['xetex','xelatex','pdflatex','pdfetex','pdftex']
@pdfetex_flag=false
@cmd ||=''
- tell=if @cmd =~/[MVv]/; ''
- else '> /dev/null'
- end
- mode='batchmode'
- #mode='nonstopmode'
+ @texpdf=nil
prog.each do |program|
if program_found?(program)
- case program
- when /pdflatex/; system("#{program} -interaction=#{mode} #@input #{tell}\n")
- when /pdfetex/; system("#{program} -interaction=#{mode} -fmt=pdflatex #@input #{tell}\n") # debian specific paramters ?
- #system("#{program} -interaction=batchmode -progname=pdflatex #@input\n")
- when /pdftex/; system("#{program} -interaction=#{mode} -fmt=pdflatex #@input #{tell}\n")
- end
+ @texpdf=program if program =~/xetex|xelatex|pdftex|pdflatex/
@pdfetex_flag=true
break
end
- unless @pdfetex_flag; puts "\tWARN: none of the following programs are installed: #{program[0]}, #{program[1]}, #{program[2]} is installed. #{program_ref}"
+ end
+ @texpdf
+ end
+ def latex2pdf #convert from latex to pdf
+ tell=if @cmd =~/[MVv]/; ''
+ else '> /dev/null'
+ end
+ mode='batchmode'
+ #mode='nonstopmode'
+ program_ref="\n\t\tSee http://www.tug.org/applications/pdftex/\n\t\tOn Debian this is included in tetex-extra"
+ texpdf=tex2pdf_engine
+ if @pdfetex_flag;
+ texpdf_cmd=case texpdf
+ when /xetex/; "#{texpdf} -interaction=#{mode} -fmt=xelatex #@input #{tell}\n"
+ when /pdftex/; "#{texpdf} -interaction=#{mode} -fmt=pdflatex #@input #{tell}\n"
+ when /xelatex|pdflatex/; "#{texpdf} -interaction=#{mode} #@input #{tell}\n"
end
+ system(texpdf_cmd)
+ else puts "\tWARN: none of the following programs is installed: xetex, xelatex, pdflatex, pdfetex, pdftex. #{program_ref}"
end
end
def makeinfo #texinfo
@@ -2558,11 +2564,11 @@ WOK
end
def images
unless FileTest.directory?("#{@env.path.output}/_sisu")
- mkdir_p("#{@env.path.output}/_sisu")
+ mkdir_p("#{@env.path.output}/_sisu")
end
unless File.exist?("#{@env.path.output}/_sisu/image_sys") \
or File.symlink?("#{@env.path.output}/_sisu/image_sys")
- File.symlink("../../_sisu/image_sys", "#{@env.path.output}/_sisu/image_sys")
+ File.symlink("../../_sisu/image_sys", "#{@env.path.output}/_sisu/image_sys")
end
end
def man_forms
@@ -2657,7 +2663,7 @@ WOK
def dbi
if psql.host =~/(?:\S{1,3}\.){3}\S{1,3}|\S+?\.\S+/
"DBI:Pg:database=#{psql.db};host=#{psql.host};port=#{psql.port}"
- else "DBI:Pg:database=#{psql.db};port=#{psql.port}"
+ else "DBI:Pg:database=#{psql.db};port=#{psql.port}"
end
end
self
@@ -3138,7 +3144,7 @@ fns_array=unless fns =~/\.ssm.sst$/
IO.readlines(fns,'')
else IO.readlines(fns,'r:utf-8')
end
-else
+else
if RUBY_VERSION < '1.9'
IO.readlines("#{path.composite_file}/#{fns}",'')
else IO.readlines("#{path.composite_file}/#{fns}",'r:utf-8')
diff --git a/lib/sisu/v0/texpdf_format.rb b/lib/sisu/v0/texpdf_format.rb
index 03bdd184..9e7fccde 100644
--- a/lib/sisu/v0/texpdf_format.rb
+++ b/lib/sisu/v0/texpdf_format.rb
@@ -284,6 +284,7 @@ WOK
@dp=@@dp ||=SiSU_Env::Info_env.new.digest.pattern
@tx=SiSU_Env::Get_init.instance.tex
@url_brace=SiSU_Viz::Skin.new.url_decoration
+ @tex2pdf=@@tex2pdf ||=SiSU_Env::System_call.new.tex2pdf_engine
end
def longtable_landscape
@end_table='\end{longtable}'
@@ -432,14 +433,14 @@ WOK
end
@string
end
- def special_characters_1(para) # ~ ^ $ & % _ { } #LaTeX special characters - KEEP list
+ def pdftex_special_characters_1(string) # ~ ^ $ & % _ { } #LaTeX special characters - KEEP list
#p @@utf_8.list
#@string=Iconv.conv('ISO-8859-1', 'UTF-8', @string)
- word=@string.scan(/\S+|\n/) #unless line =~/^(?:0~\S|%+\s)/
+ word=string.scan(/\S+|\n/) #unless line =~/^(?:0~\S|%+\s)/
para_array=[]
- if word
+ string=if word
word.each do |w| # _ - / # | : ! ^ ~
- unless para =~/^(?:0~|%+ |<!Th?¡ )/um
+ unless string =~/^(?:0~|%+ |<!Th?¡ )/um
w.gsub!(/[\\]?~/,'<=tilde>') unless w=~/^[1-6]~|~\{|\}~|~\[|\]~|^\^~\s|~\^|\*~\S+|~#|\{t~|<~\d+;(?:[ohmu]|[0-6]:)\d+;\w\d+>/
w.gsub!(/&#(?:126|152);/,'<=tilde>') #126 usual
#w.gsub!(/&#(?:126|152);/,'<=tilde>') unless w=~/https?:\/\/\S+/ #126 usual
@@ -447,162 +448,334 @@ WOK
end
para_array << w
end
- para=para_array.join(' ')
- @string=para.strip
+ string=para_array.join(' ')
+ string=string.strip
+ string
+ else ''
end
- @string.gsub!(/<~\d+;(?:\w|[0-6]:)\d+;[umdv]\d+><#@dp:#@dp>/,'')
- @string.gsub!(/.+?<-#>/,'')
- @string.gsub!(/<EOF>/,'')
- @string.gsub!(/<ENDNOTES?>/,'')
+ string.gsub!(/<~\d+;(?:\w|[0-6]:)\d+;[umdv]\d+><#@dp:#@dp>/,'')
+ string.gsub!(/.+?<-#>/,'')
+ string.gsub!(/<EOF>/,'')
+ string.gsub!(/<ENDNOTES?>/,'')
#problem sequence ->
- @string.gsub!(/&(?:nbsp);/,'<=hardspace>') # < SiSU special character also LaTeX
- @string.gsub!(/&(?:lt|#060);/,'<=lt>') # < SiSU special character also LaTeX
- @string.gsub!(/&(?:gt|#062);/,'<=gt>') # > SiSU special character also LaTeX
- @string.gsub!(/&#123;/,'<=curlyopen>') # { SiSU special character also LaTeX
- @string.gsub!(/&#125;/,'<=curlyclose>') # } SiSU special character also LaTeX
- @string.gsub!(/&#(?:126|152);/,'<=tilde>') # ~ SiSU special character also LaTeX
- @string.gsub!(/&#035;/,'\#') # # SiSU special character also LaTeX
- @string.gsub!(/&#033;/,'!') # ! SiSU not really special sisu character but done, also LaTeX
- @string.gsub!(/&#042;/,'*') # * should you wish to escape astrisk e.g. describing \*{bold}*
- @string.gsub!(/&#045;/,'-') # - SiSU special character also LaTeX
- @string.gsub!(/&#043;/,'+') # + SiSU special character also LaTeX
- @string.gsub!(/&#044;/,',') # + SiSU special character also LaTeX
- @string.gsub!(/&#038;/,'<=amp>') #unless @string=~/<:code>/ # / SiSU special character also LaTeX
- @string.gsub!(/&#047;/,'<=slash>') # / SiSU special character also LaTeX
- @string.gsub!(/&#092;/,'<=backslash>') # \ SiSU special character also LaTeX
- @string.gsub!(/&#095;/,'<=underscore>') # _ SiSU special character also LaTeX
- @string.gsub!(/&#124;/,'|') # | SiSU not really special sisu character but done, also LaTeX
- @string.gsub!(/&#058;/,':') # : SiSU not really special sisu character but done, also LaTeX
- @string.gsub!(/&#094;|\^/,'<=caret>') # ^ SiSU not really special sisu character but done, also LaTeX
- @string.gsub!(/\#/,'<=hash>')
+ string.gsub!(/&(?:nbsp);/,'<=hardspace>') # < SiSU special character also LaTeX
+ string.gsub!(/&(?:lt|#060);/,'<=lt>') # < SiSU special character also LaTeX
+ string.gsub!(/&(?:gt|#062);/,'<=gt>') # > SiSU special character also LaTeX
+ string.gsub!(/&#123;/,'<=curlyopen>') # { SiSU special character also LaTeX
+ string.gsub!(/&#125;/,'<=curlyclose>') # } SiSU special character also LaTeX
+ string.gsub!(/&#(?:126|152);/,'<=tilde>') # ~ SiSU special character also LaTeX
+ string.gsub!(/&#035;/,'\#') # # SiSU special character also LaTeX
+ string.gsub!(/&#033;/,'!') # ! SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/&#042;/,'*') # * should you wish to escape asterisk e.g. describing \*{bold}*
+ string.gsub!(/&#045;/,'-') # - SiSU special character also LaTeX
+ string.gsub!(/&#043;/,'+') # + SiSU special character also LaTeX
+ string.gsub!(/&#044;/,',') # + SiSU special character also LaTeX
+ string.gsub!(/&#038;/,'<=amp>') #unless @string=~/<:code>/ # / SiSU special character also LaTeX
+ string.gsub!(/&#047;/,'<=slash>') # / SiSU special character also LaTeX
+ string.gsub!(/&#092;/,'<=backslash>') # \ SiSU special character also LaTeX
+ string.gsub!(/&#095;/,'<=underscore>') # _ SiSU special character also LaTeX
+ string.gsub!(/&#124;/,'|') # | SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/&#058;/,':') # : SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/&#094;|\^/,'<=caret>') # ^ SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/\#/,'<=hash>')
##watch placement, problem sequence ^
- @string.gsub!(/<sup><font face=symbol>&atild;<\/font><\/sup>/,' ')
- @string.gsub!(/<:pb>/,'\newpage')
- @string.gsub!(/<:pn>/,'\clearpage')
- @string.gsub!(/\\copy(right|mark)?/,'<=copymark>') # ok problem with superscript
- end
- def special_characters_2(para)
- @string.gsub!(/&#156;/,'\oe ')
- @string.gsub!(/\$/,'\$')
- @string.gsub!(/\#/,'\#')
- @string.gsub!(/\%/,'\%')
- @string.gsub!(/\~/,'\~') #revist, should not be necessary to mark remaining tildes
- if @string !~/^\s*<:image|\}:image\s/
- @string.gsub!(/_/,'\_')
+ string.gsub!(/<sup><font face=symbol>&atild;<\/font><\/sup>/,' ')
+ string.gsub!(/<:pb>/,'\newpage')
+ string.gsub!(/<:pn>/,'\clearpage')
+ string.gsub!(/\\copy(right|mark)?/,'<=copymark>') # ok problem with superscript
+ string
+ end
+ def pdftex_special_characters_2(string)
+ string.gsub!(/&#156;/,'\oe ')
+ string.gsub!(/\$/,'\$')
+ string.gsub!(/\#/,'\#')
+ string.gsub!(/\%/,'\%')
+ string.gsub!(/\~/,'\~') #revisit, should not be necessary to mark remaining tildes
+ if string !~/^\s*<:image|\}:image\s/
+ string.gsub!(/_/,'\_')
end
- @string.gsub!(/\{/,'\{')
- @string.gsub!(/\}/,'\}')
- @string.gsub!(/&nbsp;/,'~') # ~ character for hardspace
+ string.gsub!(/\{/,'\{')
+ string.gsub!(/\}/,'\}')
+ string.gsub!(/&nbsp;/,'~') # ~ character for hardspace
# sequence important must appear after removal of { and }
- @string.gsub!(/&\S+?;/,'') #hmmm
+ string.gsub!(/&\S+?;/,'') #hmmm
# sequence important place before removal of &
- if @string=~/<:code>/; @@flag_code=true
- elsif @string=~/<:code-end>/; @@flag_code=false
+ if string=~/<:code>/; @@flag_code=true
+ elsif string=~/<:code-end>/; @@flag_code=false
end
- if @@flag_code; @string.gsub!(/&/,'{\\\&}')
- else @string.gsub!(/(\s+&\s+)/,' and ')
+ if @@flag_code; string.gsub!(/&/,'{\\\&}')
+ else string.gsub!(/(\s+&\s+)/,' and ')
end
- @string.gsub!(/§/u,'\S') #latex: space between next character not preserved? #@string.gsub!(/§ /,'\S ')
- @string.gsub!(/£/u,'\pounds')
- @string.gsub!(/&\S+?;/,' ')
- @string.gsub!(/<a href=".+?">/,' ')
- @string.gsub!(/<\/a>/,' ')
- @string.gsub!(/[^\}>_]((?:https?|file|ftp):\/\/\S+?)(<\/\S>)/,' \begin{scriptsize}\href{\1}{\1} \end{scriptsize}\2') #special case
- @string.gsub!(/((?:^|\s)[}])((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?:\s|$))/,'\1\begin{scriptsize}\\href{\2}{\2}\end{scriptsize}\3') #special case \{ e.g. \}http://url
- @string.gsub!(/\B(?:\\_|\\)((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?:\s|$))/,'\begin{scriptsize}\\href{\1}{\1}\end{scriptsize}\2') #specially escaped url no decoration
+ string.gsub!(/§/u,'\S') #latex: space between next character not preserved? #string.gsub!(/§ /,'\S ')
+ string.gsub!(/£/u,'\pounds')
+ string.gsub!(/&\S+?;/,' ')
+ string.gsub!(/<a href=".+?">/,' ')
+ string.gsub!(/<\/a>/,' ')
+ string.gsub!(/[^\}>_]((?:https?|file|ftp):\/\/\S+?)(<\/\S>)/,' \begin{scriptsize}\href{\1}{\1} \end{scriptsize}\2') #special case
+ string.gsub!(/((?:^|\s)[}])((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?:\s|$))/,'\1\begin{scriptsize}\\href{\2}{\2}\end{scriptsize}\3') #special case \{ e.g. \}http://url
+ string.gsub!(/\B(?:\\_|\\)((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?:\s|$))/,'\begin{scriptsize}\\href{\1}{\1}\end{scriptsize}\2') #specially escaped url no decoration
unless @@flag_code
- @string.gsub!(/(^|\s)((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?=\s|$))/,"\\1#{@url_brace.tex_open}\\begin{scriptsize}\\href{\\2}{\\2}\\end{scriptsize}#{@url_brace.tex_close}\\3") #url matching with decoration <url> positive lookahead, sequence issue with { linked }http://url cannot use \b at start
+ string.gsub!(/(^|\s)((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?=\s|$))/,"\\1#{@url_brace.tex_open}\\begin{scriptsize}\\href{\\2}{\\2}\\end{scriptsize}#{@url_brace.tex_close}\\3") #url matching with decoration <url> positive lookahead, sequence issue with { linked }http://url cannot use \b at start
else #code-block: angle brackets special characters, note _ already escaped
- @string.gsub!(/\\_</,'{\UseTextSymbol{OML}{<}}')
- @string.gsub!(/\\_>/,'{\UseTextSymbol{OML}{>}}')
+ string.gsub!(/\\_</,'{\UseTextSymbol{OML}{<}}')
+ string.gsub!(/\\_>/,'{\UseTextSymbol{OML}{>}}')
end
- @string.gsub!(/<:ee>/,'')
- @string.gsub!(/<!>/,' ')
+ string.gsub!(/<:ee>/,'')
+ string.gsub!(/<!>/,' ')
#proposed change, insert, but may be redundant
- @string.gsub!(/ \/><:i[12]>(.+?)(?:\}~|<br)/,' \begin{ParagraphIndent}{0.01\columnwidth}\1\end{ParagraphIndent} ') # footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
- @string.gsub!(/<(br|p)>|<\/\s*(br|p)>|<(br|p)\s*\/>/," #{@@tex_backslash*2} ") # Work Area
- @string.gsub!(/<b>(.+?)<\/b>/,'\begin{bfseries}\1 \end{bfseries}')
- @string.gsub!(/<em>(.+?)<\/em>/,'\begin{bfseries}\1 \end{bfseries}')
- @string.gsub!(/<(bold|strong)>(.+?)<\/(bold|strong)>/,'\begin{bfseries}\1 \end{bfseries}')
- @string.gsub!(/<h\d+>(.+?)<\/h\d+>/,'\begin{bfseries}\1 \end{bfseries}')
- @string.gsub!(/<i>(.+?)<\/i>/,'\emph{\1}')
- @string.gsub!(/<italic>(.+?)<\/italic>/,'\emph{\1}')
- @string.gsub!(/<u>(.+?)<\/u>/,'\uline{\1}') # ulem
- @string.gsub!(/<cite>(.+?)<\/cite>/,"``\\1''") # quote
- @string.gsub!(/<ins>(.+?)<\/ins>/,'\uline{\1}') # ulem
- @string.gsub!(/<del>(.+?)<\/del>/,'\sout{\1}') # ulem
- @string.gsub!(/<sub>(.+?)<\/sub>/,"\$_{\\textrm{\\1}}\$")
- @string.gsub!(/<sup>(.+?)<\/sup>/,"\$^{\\textrm{\\1}}\$")
+ string.gsub!(/ \/><:i[12]>(.+?)(?:\}~|<br)/,' \begin{ParagraphIndent}{0.01\columnwidth}\1\end{ParagraphIndent} ') # footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
+ string.gsub!(/<(br|p)>|<\/\s*(br|p)>|<(br|p)\s*\/>/," #{@@tex_backslash*2} ") # Work Area
+ string.gsub!(/<b>(.+?)<\/b>/,'\begin{bfseries}\1 \end{bfseries}')
+ string.gsub!(/<em>(.+?)<\/em>/,'\begin{bfseries}\1 \end{bfseries}')
+ string.gsub!(/<(bold|strong)>(.+?)<\/(bold|strong)>/,'\begin{bfseries}\1 \end{bfseries}')
+ string.gsub!(/<h\d+>(.+?)<\/h\d+>/,'\begin{bfseries}\1 \end{bfseries}')
+ string.gsub!(/<i>(.+?)<\/i>/,'\emph{\1}')
+ string.gsub!(/<italic>(.+?)<\/italic>/,'\emph{\1}')
+ string.gsub!(/<u>(.+?)<\/u>/,'\uline{\1}') # ulem
+ string.gsub!(/<cite>(.+?)<\/cite>/,"``\\1''") # quote
+ string.gsub!(/<ins>(.+?)<\/ins>/,'\uline{\1}') # ulem
+ string.gsub!(/<del>(.+?)<\/del>/,'\sout{\1}') # ulem
+ string.gsub!(/<sub>(.+?)<\/sub>/,"\$_{\\textrm{\\1}}\$")
+ string.gsub!(/<sup>(.+?)<\/sup>/,"\$^{\\textrm{\\1}}\$")
unless @@flag_code
- @string.gsub!(/"(.+?)"/,"``\\1''") # quote marks / quotations open & close " need condition exclude for code
- @string.gsub!(/\s+"/,' ``') # open "
- @string.gsub!(/^([1-6-]#{@@tilde}\S*|<.+?>)?\s*"/,'\1``') # open "
- @string.gsub!(/"(\s|\.|,|:|;)/,"''\\1") # close "
- @string.gsub!(/"([1-6-]#{@@tilde}\S*|<.+?>)?\s*$/,"''\\1") # close "
- @string.gsub!(/"(\.|,)/,"''") # close "
- @string.gsub!(/\s+'/,' `') # open '
- @string.gsub!(/^([1-6-]#{@@tilde}\S*|<.+?>)?\s*'/,'\1`') # open '
+ string.gsub!(/"(.+?)"/,'“\1”') # quote marks / quotations open & close " need condition exclude for code
+ string.gsub!(/\s+"/,' “') # open "
+ string.gsub!(/^([1-6-]#{@@tilde}\S*|<.+?>)?\s*"/,'\1“') # open "
+ string.gsub!(/"(\s|\.|,|:|;)/,'”\1') # close "
+ string.gsub!(/"([1-6-]#{@@tilde}\S*|<.+?>)?\s*$/,'”\1') # close "
+ string.gsub!(/"(\.|,)/,'”') # close "
+ string.gsub!(/\s+'/,' `') # open '
+ string.gsub!(/^([1-6-]#{@@tilde}\S*|<.+?>)?\s*'/,'\1`') # open '
end
- @string.gsub!(/^(<:i[1-9]>)?\s*\\_\*\s*/,'\1 \begin{math} \bullet \end{math}~~') #bullets - added 2004w17 watch \\_
- @string.gsub!(/(<font.*?>|<\/font>)/,'')
- @string.gsub!(/\s*<sup>(\S+?)<\/sup>/,'^\1')
- @string.gsub!(/(<sup>|<\/sup>)/,'')
- @string
+ string.gsub!(/^(<:i[1-9]>)?\s*\\_\*\s*/,'\1 \begin{math} \bullet \end{math}~~') #bullets - added 2004w17 watch \\_
+ string.gsub!(/(<font.*?>|<\/font>)/,'')
+ string.gsub!(/\s*<sup>(\S+?)<\/sup>/,'^\1')
+ string.gsub!(/(<sup>|<\/sup>)/,'')
+ string
+ end
+ def pdftex_special_characters_3(string)
+ string.gsub!(/<br(\s*[^\/][^>])/,'\1') # clean up, incredibly messy :-( footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
+ string.gsub!(/([^<][^b][^r]\s+)\/>/,'\1') # clean up, incredibly messy :-( footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
+ #problem sequence (another kludge) ->
+ string.gsub!(/<=lt>/,'{\UseTextSymbol{OML}{<}}')
+ string.gsub!(/<=gt>/,'{\UseTextSymbol{OML}{>}}')
+ #string.gsub!(/<=lt>/,'\<')
+ #string.gsub!(/<=gt>/,'\>')
+ string.gsub!(/<=underscore>/,'\_')
+ string.gsub!(/(\\href\{http:\/\/\S+?)(?:(?:<=tilde>)(\S+))+\}/,'\1\~\2}') #tildes in urls \href treated differently from text
+ string.gsub!(/<=tilde>/,'{\~~}')
+ string.gsub!(/<=pipe>/,'{\textbar}')
+ string.gsub!(/<=caret>/,'{\^{~}}')
+ #string.gsub!(/<=caret>/,'\^{}')
+ string.gsub!(/<=exclaim>/,'\Verbatim{!}')
+ string.gsub!(/<=hash>/,'{\#}')
+ #string.gsub!(/<=hash>/,'{\UseTextSymbol{OT1}{#}}')
+ #string.gsub!(/<=slash>/,'{\slash}')
+ string.gsub!(/<=hardspace>/,'{~}') #changed ... 2005
+ string.gsub!(/<=amp>/,'{\\\&}') #changed ... 2005
+ #string.gsub!(/<=amp>/,'{\UseTextSymbol{OT1}{&}}')
+ string.gsub!(/<=slash>/,'{/}')
+ string.gsub!(/<=backslash>/,'{\textbackslash}')
+ #string.gsub!(/<=asterisk>/,'*')
+ #string.gsub!(/<=exclaim>/,'!')
+ #string.gsub!(/<=asterisk>/,'{\ast}')
+ #string.gsub!(/<=copymark>/,"^{\\copyright} ") # watch has been problematic
+ #copymark='{\\begin{small}\\raisebox{1ex}{\\copyright}\\end{small}} '
+ string.gsub!(/<=copymark>\s*(.+)?\s+(<\\~\d+;\w(?:[0-6]:)?\d+;\w\d+><#@dp:#@dp>)/,"^\\copyright \\textnormal{\\1} \\2") # watch likely to be problematic
+ string
end
- def special_characters_3(para)
- @string.gsub!(/<br(\s*[^\/][^>])/,'\1') # clean up, incredibly messy :-( footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
- @string.gsub!(/([^<][^b][^r]\s+)\/>/,'\1') # clean up, incredibly messy :-( footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
+ def xetex_special_characters_1(string) # ~ ^ $ & % _ { } #LaTeX special characters - KEEP list
+ #p @@utf_8.list
+ #string=Iconv.conv('ISO-8859-1', 'UTF-8', @string)
+ word=string.scan(/\S+|\n/) #unless line =~/^(?:0~\S|%+\s)/
+ para_array=[]
+ string=if word
+ word.each do |w| # _ - / # | : ! ^ ~
+ unless string =~/^(?:0~|%+ |<!Th?¡ )/um
+ w.gsub!(/[\\]?~/,'<=tilde>') unless w=~/^[1-6]~|~\{|\}~|~\[|\]~|^\^~\s|~\^|\*~\S+|~#|\{t~|<~\d+;(?:[ohmu]|[0-6]:)\d+;\w\d+>/
+ w.gsub!(/&#(?:126|152);/,'<=tilde>') #126 usual
+ #w.gsub!(/&#(?:126|152);/,'<=tilde>') unless w=~/https?:\/\/\S+/ #126 usual
+ w.gsub!(/\\?\||&#124;/,'<=pipe>') #unless w=~/<~\d+;(?:[ohmu]|[0-6]:)\d+;\w\d+>/ # | SiSU not really special sisu character but done, also LaTeX
+ end
+ para_array << w
+ end
+ string=para_array.join(' ')
+ string=string.strip
+ string
+ else ''
+ end
+ string.gsub!(/<~\d+;(?:\w|[0-6]:)\d+;[umdv]\d+><#@dp:#@dp>/,'')
+ string.gsub!(/.+?<-#>/,'')
+ string.gsub!(/<EOF>/,'')
+ string.gsub!(/<ENDNOTES?>/,'')
+ #problem sequence ->
+ string.gsub!(/&(?:nbsp);/,'<=hardspace>') # < SiSU special character also LaTeX
+ string.gsub!(/&(?:lt|#060);/,'<=lt>') # < SiSU special character also LaTeX
+ string.gsub!(/&(?:gt|#062);/,'<=gt>') # > SiSU special character also LaTeX
+ string.gsub!(/&#123;/,'<=curlyopen>') # { SiSU special character also LaTeX
+ string.gsub!(/&#125;/,'<=curlyclose>') # } SiSU special character also LaTeX
+ string.gsub!(/&#(?:126|152);/,'<=tilde>') # ~ SiSU special character also LaTeX
+ string.gsub!(/&#035;/,'\#') # # SiSU special character also LaTeX
+ string.gsub!(/&#033;/,'!') # ! SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/&#042;/,'*') # * should you wish to escape asterisk e.g. describing \*{bold}*
+ string.gsub!(/&#045;/,'-') # - SiSU special character also LaTeX
+ string.gsub!(/&#043;/,'+') # + SiSU special character also LaTeX
+ string.gsub!(/&#044;/,',') # , SiSU special character also LaTeX
+ string.gsub!(/&#038;/,'<=amp>') #unless @string=~/<:code>/ # & SiSU special character also LaTeX
+ string.gsub!(/&#047;/,'<=slash>') # / SiSU special character also LaTeX
+ string.gsub!(/&#092;/,'<=backslash>') # \ SiSU special character also LaTeX
+ string.gsub!(/&#095;/,'<=underscore>') # _ SiSU special character also LaTeX
+ string.gsub!(/&#124;/,'|') # | SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/&#058;/,':') # : SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/&#094;|\^/,'<=caret>') # ^ SiSU not really special sisu character but done, also LaTeX
+ string.gsub!(/\#/,'<=hash>')
+ ##watch placement, problem sequence ^
+ string.gsub!(/<sup><font face=symbol>&atild;<\/font><\/sup>/,' ')
+ string.gsub!(/<:pb>/,'\newpage')
+ string.gsub!(/<:pn>/,'\clearpage')
+ string.gsub!(/\\copy(right|mark)?/,'<=copymark>') # ok problem with superscript
+ string
+ end
+ def xetex_special_characters_2(string)
+ string.gsub!(/&#156;/,'\oe ')
+ string.gsub!(/\$/,'\$')
+ string.gsub!(/\#/,'\#')
+ string.gsub!(/\%/,'\%')
+ string.gsub!(/\~/,'\~') #revisit, should not be necessary to mark remaining tildes
+ if string !~/^\s*<:image|\}:image\s/
+ string.gsub!(/_/,'\_')
+ end
+ string.gsub!(/\{/,'\{')
+ string.gsub!(/\}/,'\}')
+ string.gsub!(/&nbsp;/,'~') # ~ character for hardspace
+ # sequence important must appear after removal of { and }
+ string.gsub!(/&\S+?;/,'') #hmmm
+ # sequence important place before removal of &
+ if string=~/<:code>/; @@flag_code=true
+ elsif string=~/<:code-end>/; @@flag_code=false
+ end
+ if @@flag_code; string.gsub!(/&/,'{\\\&}')
+ else string.gsub!(/(\s+&\s+)/,' and ')
+ end
+ string.gsub!(/§/u,'\S') #latex: space between next character not preserved? #string.gsub!(/§ /,'\S ')
+ string.gsub!(/£/u,'\pounds')
+ string.gsub!(/&\S+?;/,' ')
+ string.gsub!(/<a href=".+?">/,' ')
+ string.gsub!(/<\/a>/,' ')
+ string.gsub!(/[^\}>_]((?:https?|file|ftp):\/\/\S+?)(<\/\S>)/,' \begin{scriptsize}\href{\1}{\1} \end{scriptsize}\2') #special case
+ string.gsub!(/((?:^|\s)[}])((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?:\s|$))/,'\1\begin{scriptsize}\\href{\2}{\2}\end{scriptsize}\3') #special case \{ e.g. \}http://url
+ string.gsub!(/\B(?:\\_|\\)((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?:\s|$))/,'\begin{scriptsize}\\href{\1}{\1}\end{scriptsize}\2') #specially escaped url no decoration
+ unless @@flag_code
+ string.gsub!(/(^|\s)((?:https?|file|ftp):\/\/\S+?\.[^'"><\s]+?)([;.,]?(?=\s|$))/,"\\1#{@url_brace.tex_open}\\begin{scriptsize}\\href{\\2}{\\2}\\end{scriptsize}#{@url_brace.tex_close}\\3") #url matching with decoration <url> positive lookahead, sequence issue with { linked }http://url cannot use \b at start
+ else #code-block: angle brackets special characters, note _ already escaped
+ string.gsub!(/\\_</,'{\UseTextSymbol{OML}{<}}')
+ string.gsub!(/\\_>/,'{\UseTextSymbol{OML}{>}}')
+ end
+ string.gsub!(/<:ee>/,'')
+ string.gsub!(/<!>/,' ')
+ #proposed change, insert, but may be redundant
+ string.gsub!(/ \/><:i[12]>(.+?)(?:\}~|<br)/,' \begin{ParagraphIndent}{0.01\columnwidth}\1\end{ParagraphIndent} ') # footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
+ string.gsub!(/<(br|p)>|<\/\s*(br|p)>|<(br|p)\s*\/>/," #{@@tex_backslash*2} ") # Work Area
+ string.gsub!(/<b>(.+?)<\/b>/,'\begin{bfseries}\1 \end{bfseries}')
+ string.gsub!(/<em>(.+?)<\/em>/,'\begin{bfseries}\1 \end{bfseries}')
+ string.gsub!(/<(bold|strong)>(.+?)<\/(bold|strong)>/,'\begin{bfseries}\2 \end{bfseries}') # \2 is the tag content, \1 is only the tag name
+ string.gsub!(/<h\d+>(.+?)<\/h\d+>/,'\begin{bfseries}\1 \end{bfseries}')
+ string.gsub!(/<i>(.+?)<\/i>/,'\emph{\1}')
+ string.gsub!(/<italic>(.+?)<\/italic>/,'\emph{\1}')
+ string.gsub!(/<u>(.+?)<\/u>/,'\uline{\1}') # ulem
+ string.gsub!(/<cite>(.+?)<\/cite>/,"``\\1''") # quote
+ string.gsub!(/<ins>(.+?)<\/ins>/,'\uline{\1}') # ulem
+ string.gsub!(/<del>(.+?)<\/del>/,'\sout{\1}') # ulem
+ string.gsub!(/<sub>(.+?)<\/sub>/,"\$_{\\textrm{\\1}}\$")
+ string.gsub!(/<sup>(.+?)<\/sup>/,"\$^{\\textrm{\\1}}\$")
+ unless @@flag_code
+ string.gsub!(/"(.+?)"/,'“\1”') # quote marks / quotations open & close " need condition exclude for code
+ string.gsub!(/\s+"/,' “') # open "
+ string.gsub!(/^([1-6-]#{@@tilde}\S*|<.+?>)?\s*"/,'\1“') # open "
+ string.gsub!(/"(\s|\.|,|:|;)/,'”\1') # close "
+ string.gsub!(/"([1-6-]#{@@tilde}\S*|<.+?>)?\s*$/,'”\1') # close "
+ string.gsub!(/"(\.|,)/,'”') # close "
+ string.gsub!(/\s+'/,' `') # open '
+ string.gsub!(/^([1-6-]#{@@tilde}\S*|<.+?>)?\s*'/,'\1`') # open '
+ end
+ #string.gsub!(/^(<:i[1-9]>)?\s*\\_\*\s*/,'\1 \begin{math} \bullet \end{math}~~') #bullets - added 2004w17 watch \\_
+ string.gsub!(/^(<:i[1-9]>)?\s*\\_\*\s*/,'\1 ● ~~')
+ string.gsub!(/(<font.*?>|<\/font>)/,'')
+ string.gsub!(/\s*<sup>(\S+?)<\/sup>/,'^\1')
+ string.gsub!(/(<sup>|<\/sup>)/,'')
+ string
+ end
+ def xetex_special_characters_3(string)
+ string.gsub!(/<br(\s*[^\/][^>])/,'\1') # clean up, incredibly messy :-( footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
+ string.gsub!(/([^<][^b][^r]\s+)\/>/,'\1') # clean up, incredibly messy :-( footnote indents, problems if match exists in ordinary paragraphs? check! Work Area 200501 a bit tricky as must be able to match multiple times, and to clean remainder
#problem sequence (another kludge) ->
- @string.gsub!(/<=lt>/,'{\UseTextSymbol{OML}{<}}')
- @string.gsub!(/<=gt>/,'{\UseTextSymbol{OML}{>}}')
- #@string.gsub!(/<=lt>/,'\<')
- #@string.gsub!(/<=gt>/,'\>')
- @string.gsub!(/<=underscore>/,'\_')
- @string.gsub!(/(\href\{http:\/\/\S+?)(?:(?:<=tilde>)(\S+))+\}/,'\1\~\2}') #tildes in urls \href treated differently from text
- @string.gsub!(/<=tilde>/,'{\~~}')
- @string.gsub!(/<=pipe>/,'{\textbar}')
- @string.gsub!(/<=caret>/,'{\^{~}}')
- #@string.gsub!(/<=caret>/,'\^{}')
- @string.gsub!(/<=exclaim>/,'\Verbatim{!}')
- @string.gsub!(/<=hash>/,'{\#}')
- #@string.gsub!(/<=hash>/,'{\UseTextSymbol{OT1}{#}}')
- #@string.gsub!(/<=slash>/,'{\slash}')
- @string.gsub!(/<=hardspace>/,'{~}') #changed ... 2005
- @string.gsub!(/<=amp>/,'{\\\&}') #changed ... 2005
- #@string.gsub!(/<=amp>/,'{\UseTextSymbol{OT1}{&}}')
- @string.gsub!(/<=slash>/,'{/}')
- @string.gsub!(/<=backslash>/,'{\textbackslash}')
- #@string.gsub!(/<=asterisk>/,'*')
- #@string.gsub!(/<=exclaim>/,'!')
- #@string.gsub!(/<=asterisk>/,'{\ast}')
- #@string.gsub!(/<=copymark>/,"^{\\copyright} ") # watch has been problematic
+ string.gsub!(/<=lt>/,'{\UseTextSymbol{OML}{<}}')
+ string.gsub!(/<=gt>/,'{\UseTextSymbol{OML}{>}}')
+ #string.gsub!(/<=lt>/,'\<')
+ #string.gsub!(/<=gt>/,'\>')
+ string.gsub!(/<=underscore>/,'\_')
+ string.gsub!(/(\\href\{http:\/\/\S+?)(?:(?:<=tilde>)(\S+))+\}/,'\1\~\2}') #tildes in urls \href treated differently from text
+ string.gsub!(/<=tilde>/,'{\~~}')
+ string.gsub!(/<=pipe>/,'{\textbar}')
+ string.gsub!(/<=caret>/,'{\^{~}}')
+ #string.gsub!(/<=caret>/,'\^{}')
+ string.gsub!(/<=exclaim>/,'\Verbatim{!}')
+ string.gsub!(/<=hash>/,'{\#}')
+ #string.gsub!(/<=hash>/,'{\UseTextSymbol{OT1}{#}}')
+ #string.gsub!(/<=slash>/,'{\slash}')
+ string.gsub!(/<=hardspace>/,'{~}') #changed ... 2005
+ string.gsub!(/<=amp>/,'{\\\&}') #changed ... 2005
+ #string.gsub!(/<=amp>/,'{\UseTextSymbol{OT1}{&}}')
+ string.gsub!(/<=slash>/,'{/}')
+ string.gsub!(/<=backslash>/,'{\textbackslash}')
+ #string.gsub!(/<=asterisk>/,'*')
+ #string.gsub!(/<=exclaim>/,'!')
+ #string.gsub!(/<=asterisk>/,'{\ast}')
+ #string.gsub!(/<=copymark>/,"^{\\copyright} ") # watch has been problematic
#copymark='{\\begin{small}\\raisebox{1ex}{\\copyright}\\end{small}} '
- @string.gsub!(/<=copymark>\s*(.+)?\s+(<\\~\d+;\w(?:[0-6]:)?\d+;\w\d+><#@dp:#@dp>)/,"^\\copyright \\textnormal{\\1} \\2") # watch likely to be problematic
- @string
+ string.gsub!(/<=copymark>\s*(.+)?\s+(<\\~\d+;\w(?:[0-6]:)?\d+;\w\d+><#@dp:#@dp>)/,"^\\copyright \\textnormal{\\1} \\2") # watch likely to be problematic
+ string
end
- def special_characters_curly(para)
- @string.gsub!(/<=curlyopen>/,'\{')
- @string.gsub!(/<=curlyclose>/,'\}')
- @string
+ def special_characters_curly(string)
+ string.gsub!(/<=curlyopen>/,'\{')
+ string.gsub!(/<=curlyclose>/,'\}')
+ string
end
- def special_characters_unsafe_1(para) #depreciated, make obsolete
+
+
+ def special_characters_unsafe_1(string) #deprecated, make obsolete
# some substitutions are sequence sensitive, rearrange with care.
- @string.gsub!(/\\backslash (copyright|clearpage|newpage)/,"\\\\\\1") #kludge bad solution, find out where tail is sent through specChar !
- end
- def special_characters_unsafe_2(para)
- end
- def special_characters_unsafe_3(para)
+ string.gsub!(/\\backslash (copyright|clearpage|newpage)/,"\\\\\\1") #kludge bad solution, find out where tail is sent through specChar !
+ string
end
def special_characters #special characters - some substitutions are sequence sensitive, rearrange with care.
- special_characters_1(@string)
- special_characters_unsafe_1(@string)
- special_characters_2(@string)
- special_characters_3(@string)
+ string=@string
+ case @tex2pdf
+ when /pdf/
+ string=pdftex_special_characters_1(string) unless string.nil?
+ string=special_characters_unsafe_1(string) unless string.nil? #pdftex_special_characters_unsafe_1(@string)
+ string=pdftex_special_characters_2(string) unless string.nil?
+ string=pdftex_special_characters_3(string) unless string.nil?
+ when /xe/
+ string=xetex_special_characters_1(string) unless string.nil?
+ string=special_characters_unsafe_1(string) unless string.nil? #xetex_special_characters_unsafe_1(@string)
+ string=xetex_special_characters_2(string) unless string.nil? #issues with xetex
+ string=xetex_special_characters_3(string) unless string.nil?
+ end
+ @string=string
end
def special_characters_safe #special characters - some substitutions are sequence sensitive, rearrange with care.
- special_characters_1(@string)
- special_characters_2(@string)
- #special_characters_3(@string)
+ string=@string
+ case @tex2pdf
+ when /pdf/
+ string=pdftex_special_characters_1(string) unless string.nil?
+ string=pdftex_special_characters_2(string) unless string.nil? # pass the result of step 1 on, not the original @string
+ #special_characters_3(@string)
+ when /xe/
+ string=xetex_special_characters_1(string) unless string.nil?
+ string=xetex_special_characters_2(string) unless string.nil? # remove this to start with, causes issues
+ end
+ @string=string
end
def heading_major(para,lev)
title=@md.title
@@ -947,17 +1120,27 @@ WOK
end
end
def tex_head_encode
- case @md.file_encoding
- when /iso-?8859/i #% iso8859
- <<WOK
-\\usepackage[latin1]{inputenc}
+ case @tex2pdf
+ when /xe/
+ <<WOK
+\\usepackage{babel}
+\\usepackage{ucs}
+\\usepackage{fontspec}
+\\usepackage{xunicode}
WOK
- else #% utf-8 assumed
- <<WOK
+ when /pdf/
+ if @md.file_encoding =~ /iso-?8859/i #% iso8859
+ <<WOK
+% \\usepackage[latin1]{inputenc}
+\\usepackage{fontspec}
+WOK
+ else #% utf-8 assumed
+ <<WOK
\\usepackage{babel}
\\usepackage{ucs}
\\usepackage[utf8x]{inputenc}
WOK
+ end
end
end
def tex_head_info
@@ -1099,7 +1282,7 @@ WOK
\\usepackage{url}
\\usepackage{alltt}
\\usepackage{thumbpdf}
-\\usepackage[pdftex,
+\\usepackage[#{@tex2pdf},
#{color.strip}
pdftitle={#@string1},
% pdftitle={Untitled},
@@ -1125,6 +1308,9 @@ WOK
pdfstartview=FitH
]
{hyperref}
+%% trace lost characters
+% \\tracinglostchars = 1
+% \\tracingonline = 1
\\usepackage[usenames]{color}
\\definecolor{myblack}{rgb}{0,0,0}
\\definecolor{myred}{rgb}{0.75,0,0}