Building a Language File

So now you understand what is involved in building a language file, and you know roughly what they look like. Let’s build an example language file now, so you can see what is involved.

The Language

Of course, we need a language before we can build a language file for it. For simplicity’s sake, let’s pick PHP, as you’ll need to be familiar with it anyway to write language files.

The file we build here won’t be the same as the PHP file in GeSHi currently, but it will give you a good idea of how they work.

File structure

Apart from the header comment, language files contain just function definitions. A stub function definition looks like this:

function geshi_lang_lang (&$context)
{
}

The functions all have the name geshi_lang_lang[_*]. This is to prevent namespace collisions (and make it easy for the GeSHi parser to find them :). lang in this case is php, so all our functions will start with geshi_php_php. There are a few cases where a function might just start with geshi_php, but they will be covered later (hint: think about the common.php file, and also that some functions may apply to all dialects of a language).

The functions all take a parameter, $context, by reference (the & before $context does this). This is an object on which you make API calls to define its behaviour. For example, you can make calls to say where the context starts and ends, and what keywords it has inside it.

You must always write at least one function, which has the name geshi_lang_lang. This is the function that is called by GeSHi to set up the root context - which is the context a language is in when parsing begins.
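
The naming convention and the by-reference parameter can be sketched as follows. This is only an illustration: DemoContext and demo_root_function_name are hypothetical stand-ins invented for this example, and the lookup shown is an assumption about how a parser could find the function, not GeSHi's actual code.

```php
<?php
// Hypothetical stand-in for the context object GeSHi passes in.
// It is NOT GeSHi's real class; it only records that it was set up.
class DemoContext
{
    public $name;
    public $configured = false;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

// Root context function for the language "php", following the
// geshi_lang_lang naming convention.
function geshi_php_php(&$context)
{
    // A real language file would make API calls on $context here.
    $context->configured = true;
}

// Hypothetical helper: build the function name from the convention.
function demo_root_function_name($language, $dialect)
{
    return 'geshi_' . $language . '_' . $dialect;
}

$function = demo_root_function_name('php', 'php');
$context = new DemoContext('php/php');

if (function_exists($function)) {
    // Variable function call; $context is passed by reference.
    $function($context);
}

echo $function, ': ', $context->configured ? 'configured' : 'untouched', "\n";
```

Because the name is predictable, a caller only needs the language name to locate the setup function, which is the point of the convention.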

For simplicity’s sake, let’s pretend that the PHP we are writing this language file for doesn’t require the <?php and ?> markers - so when the file begins we are already in the PHP context. So what should we put in the function?

I suggest you start with some of the non-tree stuff. So don’t worry about adding any child contexts yet; let’s just add some keyword groups:

function geshi_php_php (&$context)
{
    // Keywords in all PHP versions that have manual entries
    $context->addKeywordGroup(array(
        'array', 'die', 'echo', 'empty', 'eval', 'exit', 'include',
        'include_once', 'isset', 'list', 'print', 'require', 'require_once',
        'return', 'unset'
    ), 'keyword', false, '{FNAME}');

    // Keywords in all PHP versions with no manual entry
    $context->addKeywordGroup(array(
        'and', 'as', 'break', 'case', 'class', 'continue', 'declare',
        'default', 'do', 'else', 'elseif', 'enddeclare', 'endfor',
        'endforeach', 'endif', 'endswitch', 'endwhile', 'extends', 'for',
        'foreach', 'function', 'global', 'if', 'or', 'parent', 'static',
        'switch', 'new', 'use', 'var', 'while', 'xor'
    ), 'keyword');
}

Breaking this down:

  • addKeywordGroup is a method you can call on the $context object to add keyword highlighting. In this case we want to add highlighting for PHP keywords, but some of them have manual entries and some do not, so we create two groups.
  • The first parameter of addKeywordGroup is an array of the keywords that are in the group. They are specified exactly in the form that they appear in your language source files 1). There is no rule that says that keywords have to be alphanumeric, or indeed have any alphanumeric characters in them at all, although it’s best to stick to such constraints when writing a language file. You will see how to highlight symbols later.
  • The second parameter is a name for the keyword group. In this case we will be boring and call it “keyword” in both cases. Things in the same group get highlighted the same when it comes to outputting the result, so in this case it makes sense, as the only reason that we are doing two groups is so that one can have a URL attached to it.
  • The third parameter is whether the keywords are case sensitive or not. The default is that they are not. If you say that they are, then the keywords will be checked for based on the case that they appear in the first parameter array (so if your keywords are case sensitive and always upper case, you should write them in the array as upper case).
  • The fourth parameter is a URL for the keywords in the group. In GeSHi 1.0.X you could specify a URL with an {FNAME} placeholder, which would be replaced by the keyword where it appeared in the source. For PHP this is all we need, as the resulting URL gets redirected to the correct location. However, in some cases this proved not to be enough, so now you can specify a function name to call here as well. If you want to call a function, add () to the end of the parameter. For example, we might decide that we want to call geshi_php_url, so we would put “geshi_php_url()”.
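
To make the third and fourth parameters more concrete, here is a minimal sketch of how case sensitivity and the URL parameter could be interpreted. Everything in it is an assumption for illustration: demo_keyword_url and demo_keyword_matches are hypothetical helpers, the example.com address is a placeholder, and the signature given to geshi_php_url (keyword in, URL out) is a guess, since the tutorial does not show what GeSHi passes to such a function.

```php
<?php
// Hypothetical callback; the signature GeSHi really expects is not
// shown in this tutorial, so "keyword in, URL out" is an assumption.
function geshi_php_url($keyword)
{
    return 'http://example.com/manual/' . rawurlencode(strtolower($keyword));
}

// Hypothetical helper: turn the fourth parameter into a URL for one keyword.
function demo_keyword_url($urlSpec, $keyword)
{
    // A trailing "()" means "call this function" rather than "use this URL".
    if (substr($urlSpec, -2) === '()') {
        $function = substr($urlSpec, 0, -2);
        return function_exists($function) ? $function($keyword) : '';
    }
    // Otherwise replace the {FNAME} placeholder with the keyword.
    return str_replace('{FNAME}', $keyword, $urlSpec);
}

// Hypothetical helper: match a word against a keyword group,
// honouring the case sensitivity flag (the third parameter).
function demo_keyword_matches($word, array $keywords, $caseSensitive)
{
    if ($caseSensitive) {
        return in_array($word, $keywords, true);
    }
    return in_array(strtolower($word), array_map('strtolower', $keywords), true);
}

echo demo_keyword_url('http://example.com/manual/{FNAME}', 'echo'), "\n";
echo demo_keyword_url('geshi_php_url()', 'ECHO'), "\n";
var_dump(demo_keyword_matches('ECHO', array('echo', 'print'), false));
var_dump(demo_keyword_matches('ECHO', array('echo', 'print'), true));
```

With a case-insensitive group, ECHO matches the keyword echo; with a case-sensitive group it does not, which is why case-sensitive keywords must be written in the array exactly as they appear in source code.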

So as you can see, we have created two keyword groups, one with a URL for linking to documentation and the other without. They both have the same name, so when it comes to highlighting they will be highlighted the same.

So maybe you’re now tempted to use GeSHi on this file. So, give it a go and see what happens...

// my test file
echo 'hello world!';
if (substr("hello", 0, 1) == "e") {
}

Well that’s weird - there are no pretty colours in the output!

As you saw above, nothing in the addKeywordGroup call let you specify colours, fonts or other styles for the keywords. This is because GeSHi gets the styles from elsewhere - a theme file.

1) This contrasts with GeSHi 1.0.X, where the keywords had to be specified as if they had been run through htmlspecialchars.
lang/dev/tutorial/4.txt · Last modified: 2011/09/01 13:03