Gly file format

Gly files are used for storing (bitmapped) glyph collections (aka fonts)
   to make them available to client applications,
   to support client-side font measurement, selection, and rendering.

Gly file format supports storing shades of gry (aka grey or gray),
   so renderings of vector fonts can also be stored,
   but gly fonts can not be scaled like a vector font.

Gly file format consists of
* a glyheader struct
* a fontmeta struct
* a list of glyphmeta structs
* a list of pixel data structures, each of which is
   either a bwmap, or a grymap, or a bwtoggle struct or a grytoggle struct.

Glyheader

typedef struct {
   uint8_t sig[4] ; /* "gly0" */
   uint32_t order ; /* 0x01020304 */
   } glyheader ;
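
A reader would normally check glyheader before trusting rest of file. The sketch below is one possible check, not part of the format definition. It assumes that order is written in writer's native byte order, so that a reader on a machine with opposite byte order sees 0x04030201 ; the names gly_check_header, GLY_BAD, GLY_NATIVE and GLY_SWAPPED are made up for this example.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
   uint8_t sig[4] ;   /* "gly0" */
   uint32_t order ;   /* 0x01020304 */
   } glyheader ;

enum { GLY_BAD = -1, GLY_NATIVE = 0, GLY_SWAPPED = 1 } ;   /* hypothetical names */

/* Classify a header: bad signature, native byte order, or byte-swapped.
   ASSUMPTION: a file written on a machine of opposite byte order
   shows up here with order == 0x04030201. */
static int gly_check_header(const glyheader *h)
{
   if (memcmp(h->sig, "gly0", 4) != 0) return GLY_BAD ;
   if (h->order == 0x01020304u) return GLY_NATIVE ;
   if (h->order == 0x04030201u) return GLY_SWAPPED ;
   return GLY_BAD ;
}
```

If result is GLY_SWAPPED, every multi-byte member in rest of file would have to be byte-swapped before use.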


Fontmeta

typedef struct {
   int32_t xid ; /* negative means that it is not loaded by X */
   uchar xlfd[102] ;
   uchar foundry[51] ; /* there are two 'terminal' fonts, from different foundries */
   uchar style[51] ; /* usually called 'fontname'. includes alt/bold/italic */
   uint32_t firstchar ; /* Only valid if cglyphs is nonzero */
   uint32_t lastchar ;
   uint32_t cglyphs ; /* countof non-default glyphs. is size of array of glyphmetas */
   uint32_t glyphs ; /* offset of array of glyphmeta */
   uint16_t unused3 ;
   uint16_t nomheight ; /* 'pixelsize' as reported by xlfd */
   uint16_t fontheight ; /* advised height, from fontabove+fontbelow. */
   uint16_t avgstride ; /* averaged over all chars in font. */
   uint16_t fontabove ; /* non-negative */
   uint16_t fontbelow ; /* non-negative */
   int16_t inkhighest ;
   int16_t inklowest ;
   int16_t inkleftest ;
   int16_t inkrightest ;
   uint16_t maxstride ;
   uint16_t minstride ;
   uint8_t resx ; /* from xlfd */
   uint8_t unused1 ;
   uint8_t weight ; /* weight averaged over all glyphs. really float. range [0;1] */
   int8_t slant ; /* x/y ratio of slant of 'verticals'. really a float. range [-1;1] */
   uint8_t serif ; /* seriflength/inkboxheight of 'I'. float. range [0;1] */
   uint8_t beauty ; /* beauty-contest not formalized yet. range [0;255] */
   uint8_t usecount ;
   uint8_t unused2:3 ;
   uint8_t bold:1 ; /* bool, from xlfd */
   uint8_t italic:1 ; /* bool, from xlfd */
   uint8_t equiwi:1 ; /* 0: variable-width font. 1: fixed-width font. */
   uint8_t encoding:2 ; /* 1:unicode, 2:iso8859-1, 0: other */
   } fontmeta ;   /* 256 byte */

This fontmeta struct is not only intended for use in a gly file,
   but also for use in a fontselect program.
Debian has several fontselect programs already,
   but none of these allows you to select a font that is not already installed.
So I plan to have time to make gly files of all fonts that Debian provides
   and make a jpeg of first few characters of each font,
   for a manual fontselection program.
Also, if a webserver encounters characters that it can not display
   because X has not loaded a font that contains a glyph for it
   such a fontselect could automatically load a suitable font.

A shortcoming of X is that
   it does not tell which glyphs in a range of characters a font does not have,
   which results in a 'default' character being displayed in that case,
   without application being aware of that.
For example : for font MiscFixed20re75, X reports that it contains 65000 characters,
   but in reality it contains only 5012 non-default glyphs.
My program that grabs glyph shapes from X and puts them in a gly file
   uses a simple heuristic to detect default shapes,
   and does not include these glyphs for these characters.
(This program is not part of canvas source, and is work in sometimes progress).

This glyphmeta struct is spin-off of one of my other projects,
   and is preliminary.
Problems that still need to be addressed are :
* how to encode glyphs that are not in basic multilingual plane of unicode
   (probably by increasing datasize of 'character' member of glyphmeta),
* how to deal with fonts that have other known encodings, eg iso8859-2
   (probably by converting everything to unicode),
* how to support different drawing directions
   (probably by indicating them in one of unused members,
   and updating documentation to tell what meanings of fields then are).

xid is meant for applications that let X do text drawing.
xlfd is meant to make it easy to make X load this font (after it is selected).
firstchar and lastchar make it easy to see which range of characters this font has ;
   to find out whether font has a specific character in that range,
   that character's glyphmeta must be found, which can be done by a binary search ;
   if that glyphmeta is not present, then this font does not have that glyph.
glyphs contains offset of array of glyphmeta, from start of file, in bytes.
inkhighest/lowest/leftest/rightest are relative to 'origin' of glyph
   where y-position is zero at baseline,
   and x-position is zero at position arrived at after stride from previous glyph.
maxstride makes it possible to make a fast but crude prediction of string length.
equiwi indicates whether font was designed as fixed-width or not ;
   that is not necessarily way that it would be used,
   because it is easy to use a variable-width font as fixed-width
   and now that glyphs are available client-side, optimum stride can be computed.
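
The lookup described for firstchar and lastchar could look roughly like this. It is a sketch only : it assumes that array of glyphmeta is sorted by ascending charno (which a binary search needs), and gly_find is a made-up name. The glyphmeta struct is copied from its definition in this document.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {   /* same members as glyphmeta in this document */
   uint16_t inkboxwi ;
   uint16_t inkboxhi ;
   int16_t inkboxoffx ;
   int16_t inkboxoffy ;
   uint32_t logiwi:12 ;
   uint32_t gry:1 ;
   uint32_t toggle:19 ;
   uint32_t charno ;
   uint32_t data ;
   } glyphmeta ;

/* Return glyphmeta for charno, or NULL if font has no glyph for it.
   ASSUMPTION: gm[0..cglyphs-1] is sorted by ascending charno. */
static const glyphmeta *gly_find(const glyphmeta *gm, uint32_t cglyphs, uint32_t charno)
{
   uint32_t lo = 0, hi = cglyphs ;   /* search half-open range [lo;hi) */
   while (lo < hi) {
      uint32_t mid = lo + (hi - lo) / 2 ;
      if (gm[mid].charno == charno) return &gm[mid] ;
      if (gm[mid].charno < charno) lo = mid + 1 ;
      else hi = mid ;
   }
   return NULL ;
}
```

A caller would first reject characters outside firstchar..lastchar, and treat a NULL result as 'this font has only default shape for this character'.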

Glyphmeta

typedef struct {   /* 20 byte */
   uint16_t inkboxwi ; /* if glyph has no ink, then inkboxwi/hi are zero */
   uint16_t inkboxhi ;
   int16_t inkboxoffx ; /* x/y offsets of topleft of inkbox */
   int16_t inkboxoffy ; /* relative to (0;0) of glyph. */
      /* in GRAPHDRAWING coords (ie higher pos has higher y) */
   uint32_t logiwi:12 ;
   uint32_t gry:1 ; /* dataformat: is determined by gry/b&w and toggle/inkmap */
   uint32_t toggle:19 ; /* countof toggles, for easy compute storagesize */
   uint32_t charno ; /* unicode number of character */
   uint32_t data ; /* inkless glyph has data 0 */
   } glyphmeta   ; /* data and charno MUST be last, for comparing glyphs */

logiwi is same as 'stride'.
data is offset of glyphdata, from start of file, in bytes.

gry and toggle specify format of glyphdata ;
   if gry is zero and toggle nonzero, then glyphdata is a list of bwtoggles,
   if gry is one and toggle nonzero, then glyphdata is a list of grytoggles,
   if gry is zero and toggle is zero, then glyphdata is a bwmap,
   if gry is one and toggle is zero, then glyphdata is a grymap.
All these storage types are shown further on.
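
The four cases above map directly onto a small helper. This is an illustrative sketch ; the enum and function names are invented here. The byte count for toggle data assumes sizes of toggle structs as declared in this document (1 byte for bwtoggle, 2 bytes for grytoggle).

```c
#include <stdint.h>

/* hypothetical names for four glyphdata formats */
typedef enum { GLY_BWMAP, GLY_GRYMAP, GLY_BWTOGGLE, GLY_GRYTOGGLE } glyformat ;

/* Map gry and toggle members of a glyphmeta to format of its glyphdata. */
static glyformat gly_format(uint32_t gry, uint32_t toggle)
{
   if (toggle != 0) return gry ? GLY_GRYTOGGLE : GLY_BWTOGGLE ;
   return gry ? GLY_GRYMAP : GLY_BWMAP ;
}

/* Size in bytes of toggle data. Returns 0 for inkmaps,
   whose size follows from inkbox sizes instead.
   ASSUMPTION: sizeof(bwtoggle) is 1 and sizeof(grytoggle) is 2. */
static uint32_t gly_toggle_bytes(uint32_t gry, uint32_t toggle)
{
   return toggle * (gry ? 2u : 1u) ;
}
```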

Inkmaps

Inkmaps do not have a struct, but are arrays ;
   number of elements in array can be found from inkbox sizes.
bwmap is an array of bits, value 1 indicating 'inked',
   stored in an array of uchar
   where most significant bit represents leftmost pixel.
grymap is an array of 4bit values, stored in uchar,
   value indicating inkedness,
   highest valued nibble containing inkedness of leftmost pixel.
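
Reading a single pixel from an inkmap might be done as below. Text above does not say whether each scanline starts on a fresh byte, so this sketch assumes that pixels are packed row after row without padding ; if rows turn out to be padded, index computation must change. Function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Inkedness (0 or 1) of pixel (x;y) in a bwmap, y counted from top row of inkbox.
   ASSUMPTION: rows are packed without padding, so pixel index is y*wi+x. */
static unsigned bwmap_get(const uint8_t *map, unsigned wi, unsigned x, unsigned y)
{
   size_t i = (size_t)y * wi + x ;
   return (map[i / 8] >> (7 - i % 8)) & 1u ;   /* most significant bit is leftmost */
}

/* Inkedness (0..15) of pixel (x;y) in a grymap, same packing assumption. */
static unsigned grymap_get(const uint8_t *map, unsigned wi, unsigned x, unsigned y)
{
   size_t i = (size_t)y * wi + x ;
   uint8_t b = map[i / 2] ;
   return (i % 2 == 0) ? (unsigned)(b >> 4) : (unsigned)(b & 0x0F) ;   /* high nibble is leftmost */
}
```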

Toggles

A 'toggle' contains an inkedness value and a repeat count.
Thus each toggle represents inkedness of a series of identically inked pixels.
Just as with inkmaps, there are as many toggles as are needed to fill inkbox.

Bwtoggle

typedef struct {    
   uint8_t val:1 ; /* 1: was ink ; 0 : was no ink . (ie inkedness before this pixel) */
   uint8_t rep:7 ; /* offset from pixel in previous bwtoggle (or from 0) along glyphscanline */
   } bwtoggle ;

Grytoggle

typedef struct {    
   uint16_t val:4 ; /* gryscale value of borderpoint */
   uint16_t rep:12 ;
   } grytoggle ;
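
Because toggles are a run-length format, they can be expanded on the fly. The sketch below expands a list of grytoggles into one byte per pixel. It follows general description of toggles (a value plus a repeat count) and assumes that runs simply continue from one scanline of inkbox into next ; comments in bwtoggle struct suggest a slightly different reading (offsets along a scanline), so treat this as an illustration, not as definition of format. grytoggle_expand is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
   uint16_t val:4 ;   /* gryscale value */
   uint16_t rep:12 ;  /* repeat count */
   } grytoggle ;

/* Expand ctoggles grytoggles into out[], one byte (0..15) per pixel.
   ASSUMPTION: each toggle stands for rep consecutive pixels of value val,
   and runs continue across scanlines of inkbox.
   Returns number of pixels written ; never writes more than cpixels. */
static size_t grytoggle_expand(const grytoggle *tg, size_t ctoggles,
                               uint8_t *out, size_t cpixels)
{
   size_t n = 0 ;
   for (size_t t = 0 ; t < ctoggles ; t++)
      for (unsigned r = 0 ; r < tg[t].rep && n < cpixels ; r++)
         out[n++] = (uint8_t)tg[t].val ;
   return n ;
}
```

Here cpixels would be inkboxwi * inkboxhi of glyph, and ctoggles its toggle member.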


A few words about storage size

For black&white fonts,
   bwmap is smallest storage size,
   for fonts whose average character width is not more than 32 pixels.
A full unicode BMP font of 32-pixels-wide characters is circa 8 MB .
For larger fonts, bwtoggle is smaller.
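
The figure of circa 8 MB can be reproduced with a little arithmetic, under assumption (not stated above) that every one of 65536 BMP characters has a glyph whose inkbox is 32 pixels high as well as 32 pixels wide. Names below are invented for this example.

```c
#include <stdint.h>

enum { BMP_CHARS = 65536, GLYPHMETA_BYTES = 20 } ;

/* Bytes of bwmap pixel data for one glyph with a wi x hi inkbox
   (one bit per pixel, rounded up to whole bytes). */
static uint32_t bwmap_bytes(uint32_t wi, uint32_t hi)
{
   return (wi * hi + 7) / 8 ;
}

/* Estimated bytes of glyphmetas plus bwmaps for a full-BMP font ;
   glyheader and fontmeta are ignored.
   ASSUMPTION: all glyphs have same inkbox size. */
static uint32_t bmp_font_bytes(uint32_t wi, uint32_t hi)
{
   return BMP_CHARS * (GLYPHMETA_BYTES + bwmap_bytes(wi, hi)) ;
}
```

With a 32x32 inkbox each bwmap takes 128 bytes, so pixel data alone is 65536 * 128 = 8388608 bytes (8 MiB) ; glyphmetas add 65536 * 20 bytes (1.25 MiB) on top of that.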

For gry-scale fonts,
   grytoggle is smallest storage size,
   for fonts whose average character width is not more than 32 pixels.
A full unicode BMP font of 32-pixels-wide characters is circa 18 MB .
For larger fonts, ttf fileformat is smaller
   (if size of kerning information is small compared to glyphdata).

Grymap is thus not smallest format for any font size.

Sizes mentioned above are for font files,
   but this is not quite same as storage size of a font that is loaded by X.
Some applications use glyph caches,
   so I assume that they store valuemaps of glyph shapes
     which are not smaller (and probably larger) than grytoggles
   to save time on rendering font shapes from vector descriptions.
Here grytoggle has an advantage, because it is an on-the-fly decompressible format.
This also makes gly files easy to use :
   to load a font, all you need to do is to mmap it
   (using a private mmap, preferably, to not risk accidentally overwriting it).
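
Loading by mmap could be sketched as follows, using standard POSIX calls. gly_map is a made-up helper name ; error handling is kept minimal. With MAP_PRIVATE, any write through returned pointer goes to a private copy and never reaches file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a gly file privately. Returns start of mapping and stores its
   length in *len, or returns NULL on failure.
   Caller releases mapping with munmap(p, *len). */
static void *gly_map(const char *path, size_t *len)
{
   int fd = open(path, O_RDONLY) ;
   if (fd < 0) return NULL ;
   struct stat st ;
   if (fstat(fd, &st) != 0 || st.st_size <= 0) { close(fd) ; return NULL ; }
   /* MAP_PRIVATE: writes are copy-on-write and do not change file */
   void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE, fd, 0) ;
   close(fd) ;   /* mapping stays valid after descriptor is closed */
   if (p == MAP_FAILED) return NULL ;
   *len = (size_t)st.st_size ;
   return p ;
}
```

After mapping, glyheader is at start of mapping, and members such as glyphs (in fontmeta) and data (in glyphmeta) can be used as byte offsets from that start.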

For bitmapped fonts, filesize of gly files is less than half of that of bdf files,
   but after gzipping, difference is much smaller.