The Language File

Now that you have a clear picture of how your language is structured, it’s time to start writing the files that will highlight it. The first file that you write is known as the language file. It specifies the context tree as well as any keywords, symbols, etc. that will be highlighted in your language. You’d think that this was all that a syntax highlighter would need to do, and indeed if it were, 1.2 would have been released ages ago :). But as you’ll see later, there are ways of taking syntax highlighting well beyond what most IDEs and the old GeSHi could do.

Language files live in the geshi/languages/[your-lang]/ directory. [your-lang] is a “short” version of your language name, e.g. “css”, “java”, “php”, “html” etc. The default language has the filename [your-lang].php. So for example, if you were developing language files for language “foo”, then the language file you would want to make would be geshi/languages/foo/foo.php 1). Note that language names should only include letters and digits (at this time).
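The path rule above can be sketched in a few lines of PHP. This is only an illustration: the helper name sketch_language_file_path is hypothetical and not part of GeSHi, and applying the letters-and-digits rule to dialect names as well as language names is an assumption.

```php
<?php
/**
 * Hypothetical helper, not part of GeSHi: builds the path of a language
 * file from a language name and an optional dialect name.
 */
function sketch_language_file_path($language, $dialect = null)
{
    // With no dialect given, the default file is named after the language.
    if ($dialect === null) {
        $dialect = $language;
    }
    foreach (array($language, $dialect) as $name) {
        // Letters and digits only (assumed to hold for dialect names too).
        if (!preg_match('/^[A-Za-z0-9]+$/', $name)) {
            throw new InvalidArgumentException("Invalid language name: $name");
        }
    }
    return 'geshi/languages/' . $language . '/' . $dialect . '.php';
}

echo sketch_language_file_path('foo'), "\n";           // geshi/languages/foo/foo.php
echo sketch_language_file_path('foo', 'subfoo'), "\n"; // geshi/languages/foo/subfoo.php
```

The second call corresponds to the foo/subfoo dialect mentioned in the footnote.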

The following is the CSS language file as of 2006/06/24; the latest form may differ:

 * GeSHi - Generic Syntax Highlighter
 * <pre>
 *   File:   geshi/languages/css/css.php
 *   Author: Nigel McNie
 *   E-mail:
 * </pre>
 * For information on how to use GeSHi, please consult the documentation
 * found in the docs/ directory, or online at
 * This program is part of GeSHi.
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301 USA
 * @package    geshi
 * @subpackage lang
 * @author     Nigel McNie <>
 * @license GNU GPL
 * @copyright  (C) 2004 - 2006 Nigel McNie
 * @version    $Id: css.php,v 1.7 2006/06/02 09:43:43 oracleshinoda Exp $

 * @access private

/** Get common functions for CSS */
require_once GESHI_LANGUAGES_ROOT . 'css' . GESHI_DIR_SEP . 'common.php';

function geshi_css_css (&$context)
    $context->addChild('inline_media', 'code');
    $context->addKeywordGroup('@font-face', 'at_rule/start');

function geshi_css_css_inline_media (&$context)
    $context->addDelimiters('REGEX#@media\s+\w+\s+\{#', '}');

function geshi_css_css_rule (&$context)
    $context->addDelimiters('{', '}');
    $context->addChild('css/css/string', 'string');

    // Attributes
        'azimuth', 'background', 'background-attachment', 'background-color', 'background-image',
        'background-position', 'background-repeat', 'border', 'border-bottom', 'border-bottom-color',
        'border-bottom-style', 'border-bottom-width', 'border-collapse', 'border-color', 'border-left',
        'border-left-color', 'border-left-style', 'border-left-width', 'border-right', 'border-right-color',
        'border-right-style', 'border-right-width', 'border-spacing', 'border-style', 'border-top',
        'border-top-color', 'border-top-style', 'border-top-width', 'border-width', 'bottom',
        'caption-side', 'clear', 'clip', 'color', 'content', 'counter-increment', 'counter-reset', 'cue',
        'cue-after', 'cue-before', 'cursor', 'direction', 'display', 'elevation', 'empty-cells', 'float',
        'font', 'font-face', 'font-family', 'font-size', 'font-size-adjust', 'font-stretch', 'font-style',
        'font-variant', 'font-weight', 'height', 'left', 'letter-spacing', 'line-height', 'list-style',
        'list-style-image', 'list-style-keyword', 'list-style-position', 'list-style-type', 'margin',
        'margin-bottom', 'margin-left', 'margin-right', 'margin-top', 'marker-offset', 'max-height',
        'max-width', 'min-height', 'min-width', 'orphans', 'outline', 'outline-color', 'outline-style',
        'outline-width', 'overflow', 'padding', 'padding-bottom', 'padding-left', 'padding-right',
        'padding-top', 'page', 'page-break-after', 'page-break-before', 'page-break-inside', 'pause',
        'pause-after', 'pause-before', 'pitch', 'pitch-range', 'play-during', 'position', 'quotes',
        'richness', 'right', 'size', 'speak', 'speak-header', 'speak-numeral', 'speak-punctuation',
        'speech-rate', 'stress', 'table-layout', 'text-align', 'text-decoration', 'text-decoration-color',
        'text-indent', 'text-shadow', 'text-transform', 'top', 'unicode-bidi', 'vertical-align',
        'visibility', 'voice-family', 'volume', 'white-space', 'widows', 'width', 'word-spacing',
        'z-index', 'konq_bgpos_x', 'konq_bgpos_y', 'unicode-range', 'units-per-em', 'src', 'panose-1',
        'stemv', 'stemh', 'slope', 'cap-height', 'x-height', 'ascent', 'descent', 'widths', 'bbox',
        'definition-src', 'baseline', 'centerline', 'mathline', 'topline', '!important'
    ), 'attribute');
    // Attributes that take arguments
        'url', 'attr', 'rect', 'rgb', 'counter', 'counters', 'local', 'format'
    ), 'paren');
    // Colours
        'aqua', 'black', 'blue', 'fuchsia', 'gray', 'green', 'lime', 'maroon', 'navy', 'olive',
        'purple', 'red', 'silver', 'teal', 'white', 'yellow', 'ActiveBorder', 'ActiveCaption',
        'AppWorkspace', 'Background', 'ButtonFace', 'ButtonHighlight', 'ButtonShadow', 'ButtonText',
        'CaptionText', 'GrayText', 'Highlight', 'HighlightText', 'InactiveBorder', 'InactiveCaption',
        'InactiveCaptionText', 'InfoBackground', 'InfoText', 'Menu', 'MenuText', 'Scrollbar',
        'ThreeDDarkShadow', 'ThreeDFace', 'ThreeDHighlight', 'ThreeDLightShadow', 'ThreeDShadow',
        'Window', 'WindowFrame', 'WindowText'
    ), 'color');
    // Types
        'inherit', 'none', 'hidden', 'dotted', 'dashed', 'solid', 'double', 'groove', 'ridge', 'inset',
        'outset', 'xx-small', 'x-small', 'small', 'medium', 'large', 'x-large', 'xx-large', 'smaller',
        'larger', 'italic', 'oblique', 'small-caps', 'normal', 'bold', 'bolder', 'lighter', 'light',
        'transparent', 'repeat', 'repeat-x', 'repeat-y', 'no-repeat', 'baseline', 'sub', 'super', 'top',
        'text-top', 'middle', 'bottom', 'text-bottom', 'left', 'right', 'center', 'justify', 'konq-center',
        'disc', 'circle', 'square', 'decimal', 'decimal-leading-zero', 'lower-roman', 'upper-roman', 'lower-greek',
        'lower-alpha', 'lower-latin', 'upper-alpha', 'upper-latin', 'hebrew', 'armenian', 'georgian',
        'cjk-ideographic', 'hiragana', 'katakana', 'hiragana-iroha', 'katakana-iroha', 'inline', 'block',
        'list-item', 'run-in', 'compact', 'marker', 'table', 'inline-table', 'table-row-group',
        'table-header-group', 'table-footer-group', 'table-row', 'table-column-group', 'table-column',
        'table-cell', 'table-caption', 'auto', 'crosshair', 'default', 'pointer', 'move', 'e-resize',
        'ne-resize', 'nw-resize', 'n-resize', 'se-resize', 'sw-resize', 's-resize', 'w-resize', 'text',
        'wait', 'help', 'above', 'absolute', 'always', 'avoid', 'below', 'bidi-override', 'blink', 'both',
        'capitalize', 'caption', 'close-quote', 'collapse', 'condensed', 'crop', 'cross', 'embed', 'expanded',
        'extra-condensed', 'extra-expanded', 'fixed', 'hand', 'hide', 'higher', 'icon', 'inside', 'invert',
        'landscape', 'level', 'line-through', 'loud', 'lower', 'lowercase', 'ltr', 'menu', 'message-box', 'mix',
        'narrower', 'no-close-quote', 'no-open-quote', 'nowrap', 'open-quote', 'outside', 'overline', 'portrait',
        'pre', 'relative', 'rtl', 'scroll', 'semi-condensed', 'semi-expanded', 'separate', 'show', 'small-caption',
        'static', 'static-position', 'status-bar', 'thick', 'thin', 'ultra-condensed', 'ultra-expanded', 'underline',
        'uppercase', 'visible', 'wider', 'break', 'serif', 'sans-serif', 'cursive', 'fantasy', 'monospace'
    ), 'type');

    // Symbols
            ':', ';', '(', ')', ','
    ), 'symbol');

    // Values
     ), '', array(
            1 => array('value', false)

function geshi_css_css_comment (&$context)
    $context->addDelimiters('/*', '*/');
    //$this->_contextStyleType = GESHI_STYLE_COMMENTS;

function geshi_css_css_string (&$context)
    $context->addDelimiters('"', '"');
    $context->addDelimiters("'", "'");

    // @todo possible bug where " will be escapable in ' strings etc (need two string contexts for this)
    $context->setCharactersToEscape(array('\\', 'A', '"', "'"));
    //$this->_contextStyleType = GESHI_STYLE_STRINGS;

function geshi_css_css_attribute_selector (&$context)
    $context->addDelimiters('[', ']');
    $context->addChild('css/css/string', 'string');

function geshi_css_css_at_rule (&$context)
    $context->addDelimiters(array('@import', '@charset'), ';');
    $context->addChild('css/css/string', 'string');
    $context->addKeywordGroup('url', 'paren');
        ':', ';', '(', ')'
    ), 'symbol');



As you can see, the file largely consists of function definitions. Each part of the file is explained below.

The Header Comment

The first part of the file is the header comment. This comment is in PHPDoc form. It declares the file as being licensed under the GNU GPL. All files in GeSHi are licensed under the GNU GPL; you do not have a choice in this matter.

You may change the author to yourself. Please include an e-mail address where somebody can contact you; this is a requirement for the file to be distributed with GeSHi. Before you complain about spam, think about how many times Linus Torvalds’ e-mail address appears on kernel changelogs :)

Change the copyright line to read (C) <year you wrote the file> <your name>. Alternatively, you may assign the copyright to me if you choose.

Change the version line to read * @version $Id$. The CVS server expands this into the long version string you can see in the file above.
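Put together, the header of a new language file might look something like the template below. It is a sketch for the hypothetical language “foo”, modelled on the CSS header; the angle-bracket placeholders are to be filled in by you, and the licence notice is abbreviated to a single placeholder line here.

```php
<?php
/**
 * GeSHi - Generic Syntax Highlighter
 * <pre>
 *   File:   geshi/languages/foo/foo.php
 *   Author: <your name>
 *   E-mail: <your e-mail address>
 * </pre>
 *
 * (GNU GPL notice, as in the CSS file above)
 *
 * @package    geshi
 * @subpackage lang
 * @author     <your name> <your e-mail address>
 * @license    GNU GPL
 * @copyright  (C) <year you wrote the file> <your name>
 * @version    $Id$
 */
```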

The require Line

There are a couple of lines of code that read:

/** Get common functions for CSS */
require_once GESHI_LANGUAGES_ROOT . 'css' . GESHI_DIR_SEP . 'common.php';

You may create a ‘common.php’ file in your language directory, where functions common to all dialects of a language can be defined. PHP is an example: it has three dialects - PHP, PHP4 and PHP5 - and the structure of the language is the same for all of them, so some functions that define the structure of PHP are put in common.php, while the keywords for each dialect are in the language files themselves.

If you’re not planning on making any dialects of your language, just leave these lines out. In the future, GeSHi itself may include the file automatically if it exists.
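As a rough sketch of how a shared common.php could be used, the snippet below creates one for the hypothetical language “foo” in a throwaway directory and then includes it the same way the CSS file does. The function names geshi_foo_common_keywords and sketch_foo2_keywords are made up for illustration, and the fallback constant definitions exist only so the snippet runs without GeSHi, which normally defines them.

```php
<?php
// Assumption: GeSHi normally defines these constants. The fallbacks point
// at a throwaway directory so the sketch can run on its own.
if (!defined('GESHI_DIR_SEP')) {
    define('GESHI_DIR_SEP', DIRECTORY_SEPARATOR);
}
if (!defined('GESHI_LANGUAGES_ROOT')) {
    define(
        'GESHI_LANGUAGES_ROOT',
        sys_get_temp_dir() . GESHI_DIR_SEP . 'geshi_sketch_languages' . GESHI_DIR_SEP
    );
}

// Write a tiny common.php holding a function shared by every dialect of "foo".
$fooDir = GESHI_LANGUAGES_ROOT . 'foo';
if (!is_dir($fooDir)) {
    mkdir($fooDir, 0777, true);
}
file_put_contents(
    $fooDir . GESHI_DIR_SEP . 'common.php',
    "<?php\nfunction geshi_foo_common_keywords()\n{\n    return array('if', 'else', 'while');\n}\n"
);

// What a dialect file would do at the top, mirroring the CSS file.
require_once GESHI_LANGUAGES_ROOT . 'foo' . GESHI_DIR_SEP . 'common.php';

// A dialect then adds its own keywords on top of the shared ones.
function sketch_foo2_keywords()
{
    return array_merge(geshi_foo_common_keywords(), array('foreach'));
}
```

The point of the split is that the shared function is written once, while each dialect file only lists what differs.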

The Function Definitions

The rest of the file consists of function definitions. Basically, you need to define one function for each context that your language contains. Each function describes to GeSHi how the context is defined. These functions have a special naming convention, and take one important parameter by reference - the context to describe.
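Judging from the CSS file above, the function names appear to follow the pattern geshi_[language]_[dialect] for the root context and geshi_[language]_[dialect]_[context] for every other context. The sketch below applies that pattern to a hypothetical language “foo”. SketchContext is a stand-in that only records calls, not GeSHi's real context class, and the meaning of the addChild arguments is assumed from how the CSS file uses them.

```php
<?php
// Hypothetical stand-in for the context object GeSHi passes in. It only
// records the calls made on it so the sketch can run without GeSHi.
class SketchContext
{
    public $calls = array();

    public function addDelimiters($start, $end)
    {
        $this->calls[] = array('addDelimiters', $start, $end);
    }

    public function addChild($name, $type)
    {
        $this->calls[] = array('addChild', $name, $type);
    }
}

// Hypothetical helper: composes a function name the way the names in the
// CSS file appear to be built (geshi_css_css, geshi_css_css_string, ...).
function sketch_context_function_name($language, $dialect, $context = '')
{
    $name = 'geshi_' . $language . '_' . $dialect;
    return $context === '' ? $name : $name . '_' . $context;
}

// Root context of language "foo", dialect "foo".
function geshi_foo_foo(&$context)
{
    $context->addChild('foo/foo/string', 'string');
}

// The "string" context of the same dialect.
function geshi_foo_foo_string(&$context)
{
    $context->addDelimiters('"', '"');
}

// Look the function up by name and let it describe the context.
$context  = new SketchContext();
$function = sketch_context_function_name('foo', 'foo', 'string');
$function($context);
```

Because the context is taken by reference, each function describes the object it is handed rather than returning a new one.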

Rather than looking at the CSS language file and trying to point out the important parts, in the next part we will build a language file for a language and see how it all fits together.


1) This allows GeSHi to support “dialects” - e.g. if you called the file “subfoo.php” then you could use the foo/subfoo language.
lang/dev/tutorial/3.txt · Last modified: 2011/09/01 13:03