About GN4 hyphenation

Build 1501 on 14/Nov/2017  This topic last edited on: 29/Mar/2016, at 16:23

GN4 uses the Liang algorithm for hyphenation, combined with exception dictionaries. This is the algorithm used by the public-domain composition program TeX (see 'The TeXbook' by Donald E. Knuth, Addison-Wesley). The algorithm was implemented for GN4 by Tera. The Liang algorithm is based on patterns: the same code can generate different hyphenation, even for different languages, depending on the patterns used. Patterns are currently available for English (US and UK versions), French, German, Italian, Spanish, Portuguese and Croatian.
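The pattern mechanism can be illustrated with a minimal Python sketch. This is not the GN4 implementation. The function names, the left_min/right_min limits, and the tiny pattern list are assumptions chosen for illustration; the patterns are a small subset in TeX notation, just enough to hyphenate one word. Digits in a pattern score the gap between letters, the highest score at each gap wins, and an odd score permits a break.

```python
import re

# Illustrative subset of TeX-style patterns (assumption: not the GN4 pattern set).
PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split 'hy3ph' into its letters 'hyph' and gap scores [0, 0, 3, 0, 0]."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)  # score for the gap before letter number pos
        else:
            pos += 1
    return letters, scores


def build_table(patterns):
    """Map pattern letters to their gap scores for fast lookup."""
    return dict(parse_pattern(p) for p in patterns)


def hyphenate(word, table, left_min=2, right_min=3):
    """Return the word split at every break the patterns allow."""
    padded = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)
    # Try every substring; keep the maximum score seen at each gap.
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            pattern_scores = table.get(padded[start:end])
            if pattern_scores is not None:
                for offset, value in enumerate(pattern_scores):
                    idx = start + offset
                    scores[idx] = max(scores[idx], value)
    # scores[k + 1] is the gap before word[k]; odd values allow a break.
    parts, last = [], 0
    for k in range(left_min, len(word) - right_min + 1):
        if scores[k + 1] % 2 == 1:
            parts.append(word[last:k])
            last = k
    parts.append(word[last:])
    return parts


table = build_table(PATTERNS)
print("-".join(hyphenate("hyphenation", table)))  # hy-phen-ation
```

Swapping in a different pattern list changes the result without touching the code, which is why one implementation can serve several languages.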

System administrators can modify the patterns. Editing patterns requires an excellent knowledge of the language concerned.

Hyphenation patterns are stored in the database; there are no local copies. The algorithm generates perfect hyphenations in languages with regular pronunciation rules, such as Italian, and it does a good job in highly irregular languages like English. Its main advantage over a vocabulary-based system is that it can generate reasonable hyphenations for words that are not in the vocabulary.

It is possible to create external plug-ins for specific hyphenation when patterns for the Liang algorithm are not available or do not give a correct result. Currently there are two such plug-ins, developed by third parties, for Hungarian and Dutch.

It is possible to override the hyphenation generated by the system by inserting discretionary hyphens, using the GNML tag >dh< or the replacement style >-<, at the points where a hyphenation can occur. This kind of hyphenation overrides both exception dictionaries and hyphenation rules. This makes it possible to completely disable the hyphenation of a word or to define custom hyphenation points.
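The order of precedence described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not GN4 code: the soft hyphen character U+00AD stands in for the discretionary hyphen tag, the exception dictionary is a plain dict with '-' marking break points, and the rule that exceptions take priority over patterns is assumed from common practice rather than stated in this topic. All names are hypothetical.

```python
DISCRETIONARY = "\u00ad"  # assumption: stand-in for a discretionary hyphen in the text


def split_like(word, template_parts):
    """Cut word into pieces with the same lengths as template_parts, keeping its case."""
    pieces, pos = [], 0
    for part in template_parts:
        pieces.append(word[pos:pos + len(part)])
        pos += len(part)
    return pieces


def resolve_breaks(word, exceptions, pattern_hyphenate):
    """Return the word pieces, honouring the most specific source of breaks."""
    # 1. Discretionary hyphens typed by the user win over everything else.
    if DISCRETIONARY in word:
        return word.split(DISCRETIONARY)
    # 2. Exception dictionary (assumed to take priority over patterns).
    entry = exceptions.get(word.lower())
    if entry is not None:
        return split_like(word, entry.split("-"))
    # 3. Fall back to pattern-based hyphenation.
    return pattern_hyphenate(word)


exceptions = {"record": "rec-ord"}          # hypothetical dictionary entry
fallback = lambda w: [w[:2], w[2:]]         # hypothetical stand-in for the pattern engine

print(resolve_breaks("re\u00adcord", exceptions, fallback))  # ['re', 'cord']
print(resolve_breaks("Record", exceptions, fallback))        # ['Rec', 'ord']
print(resolve_breaks("table", exceptions, fallback))         # ['ta', 'ble']
```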

It is possible to use multiple different hyphenations in the same story, down to the single-word level. Both Ted4 and Fred4 support up to 8 different hyphenation languages active at the same time (in one text or in all the text linked to a page). Liang and plug-in based hyphenations can be used at the same time.

It is also possible to define the maximum number of consecutive hyphens, enable/disable the use of letter spacing, and specify a list of characters where breaking a line is always allowed (like '-' or '/'). These parameters are specified inside formats.
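As a rough illustration of two of these parameters, the Python sketch below checks a consecutive-hyphen limit and finds positions after always-breakable characters. The function names and the data shapes are assumptions made for the example; the actual format settings in GN4 are not shown here.

```python
def may_hyphenate_next(line_ends_hyphenated, max_consecutive):
    """Return True if the next line may end with a hyphen.

    line_ends_hyphenated: list of booleans, one per line already set,
    True when that line ends with a hyphen.
    """
    run = 0
    for flag in reversed(line_ends_hyphenated):
        if not flag:
            break
        run += 1  # count the trailing run of hyphenated lines
    return run < max_consecutive


def always_break_positions(text, break_chars="-/"):
    """Return the indexes just after each character where a line break is always allowed."""
    # The last character is skipped: a break after it would be the end of the text.
    return [i + 1 for i, ch in enumerate(text[:-1]) if ch in break_chars]


print(may_hyphenate_next([True, True], 2))          # False
print(may_hyphenate_next([True, False, True], 2))   # True
print(always_break_positions("and/or"))             # [4]
```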