GeSHi - Generic Syntax Highlighter Syntax Coloriser for PHP

ID: 0000018
Category: [GeSHi] core
Severity: feature
Reproducibility: N/A
Date Submitted: 10-28-05 04:46
Last Update: 11-03-05 15:53
Reporter: BenBE
View Status: public
Assigned To: nigel
Priority: low
Resolution: open
Status: assigned
Product Version: 1.1.1alpha1
Summary: 0000018: Support for configurable language features
Description: In some environments it is quite useful to be able to disable some features (e.g. highlighting of certain "function names" that is not required) or to enable others (e.g. highlighting of OpenGL function and type names on an OpenGL-related page).
Additional Information: The problem is less the creation of such extensions (they can simply be added by code manipulation); the technical problem is rather the requirement for fast language file loading.

For example, on the OpenGL website mentioned above nobody would want to use ASM, as ASM with OpenGL would be crap (I know, although not tested yet ;D). In addition, this site would want to load an extra set of "function and type names" for use with OpenGL.

The point is that such an "extension" will not be for one single language only, but will be loaded into several at once. I.e. that OpenGL extension will have to be loaded into the /delphi, /cpp and other (yeah, OpenGL with /visualbasic :P) contexts at the same time. Loading the data for all those contexts should then be done by saying "load it into /delphi, /cpp AND /visualbasic" using "that" information.

The issue is closely related to 0000002, as it can take advantage of those features.
A way is also required for the language files to know which "extensions" should be loaded, to reduce unnecessary parsing of special lines (i.e. /delphi/asm could be excluded from loading if the language file knew that in advance).
- Relationships
related to 0000002 (closed, nigel): Should be possible to tweak context data dynamically

- Notes
11-01-05 13:59 nigel (note 0000039)

Well this is something worth discussing.

You want to have the ability to add extra function names into a context. Funnily enough, GeSHi 1.0.X already has this: there's a method called add_keywords_to_keyword_group or similar (I can't remember its name) that lets you do this on the user's end.

Are you looking for something like this:

$geshi =& new GeSHi($source, 'delphi');

Or something inside the language files themselves?

Or something else?
11-02-05 08:44 BenBE (note 0000045)

Well both ;-P

On the one hand I need the possibility (from within the context) to check what to load, in order to disable unused settings (as mentioned, to skip loading of unused subcontexts). This is the more important one.

The other function, to "add" extra subcontexts, would be nice to have, but should be easily manageable with the discussed functions for context tweaking (cf. relations).

The discussed functionality would therefore more likely look like

$g =& new GeSHi($source, 'delphi', '+opengl, -asm');
$g->ParseCode ...

What matters here is the constructor parameter that says "this should be loaded, that shouldn't be" ... UC?
11-02-05 14:33 nigel (note 0000046)

Well, currently GeSHi does attempt to strip out contexts that will be useless, by checking whether any of their starters appear in the code. This is limited, though, because the context already has to be loaded before we can decide whether to use it. And there's no sensible way to get around this without having to store the opening delimiters somewhere twice...
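
To make the limitation concrete, here is a minimal sketch of starter-based pruning. It is not GeSHi's actual code: the function name `pruneContexts`, the context names and the starter lists are all made up for illustration, and it uses current PHP syntax rather than the PHP 4 style seen elsewhere in this thread. Note that the `$contexts` table has to exist in full before anything can be pruned, which is exactly the problem described above.

```php
<?php
// Hypothetical sketch: keep only the contexts whose opening delimiter
// ("starter") occurs somewhere in the source. All names are illustrative
// assumptions, not GeSHi internals.

/**
 * @param string $source   Code to be highlighted.
 * @param array  $contexts Map of context name => list of literal starters.
 * @return array Names of the contexts that could be entered at all.
 */
function pruneContexts(string $source, array $contexts): array
{
    $usable = [];
    foreach ($contexts as $name => $starters) {
        foreach ($starters as $starter) {
            // Case-sensitive literal search; a real language file may need more.
            if (strpos($source, $starter) !== false) {
                $usable[] = $name;
                break; // one hit is enough to keep the context
            }
        }
    }
    return $usable;
}

// The whole table is already loaded here, before any pruning can happen.
$contexts = [
    'delphi/asm'     => ['asm'],
    'delphi/comment' => ['//', '{', '(*'],
    'delphi/string'  => ["'"],
];
$source = "begin\n  WriteLn('hi'); // greet\nend.";

print_r(pruneContexts($source, $contexts));
```

With this input, 'delphi/asm' is dropped because the source never contains its starter, while the comment and string contexts are kept.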

What you suggest as the example is quite interesting though... let the _user_ decide what to load and what not to load. I'd be inclined to use a multidimensional array for it though, not a string, but that's just detail.
11-03-05 05:03 BenBE (note 0000047)

Well, stripping out unused contexts is only half the story. Why should they be loaded if the user doesn't even want to use them? I'd still let GeSHi make the final decision about its data, but give the user the opportunity to say "don't highlight OpenGL commands, even if they occur" or, for ASM, "look out for ASM, because if it appears it must be highlighted accordingly". Both things might occur AND both cases should be supported.

Concerning the opening delimiters: I doubt that stripping the data out is much of a problem for "in-context" operations (e.g. strip out standard identifiers if none of them is used), as the source has already been parsed. Removing them from memory then just gains some space ... The bigger issue is avoiding the parsing of unused contexts that the user is already sure he won't need for his source. Parsing always takes time that GeSHi could already spend highlighting if it were smart enough to listen to the user ;-)

@parameter format: Why not support both styles? Not everyone likes arrays. You could convert the string to an array in only one or two lines in case a string was given.

One small issue I can think of would be registration of extensions within GeSHi, but I guess you'll find a suitable way for this.
11-03-05 11:12 nigel (note 0000048)

Well it's funny that you mention the "attempt to highlight even if you're not sure it's there" case, because I just added that functionality in the last release. I was having problems with the string_javascript context of HTML not working properly. The problem I found was that string_javascript's starters were both regexes that started with "^", so when GeSHi tried to find if they were in the source, it tried only at the very beginning of the source. Thus I added that functionality to get around that problem.

And adding functionality to remove-no-matter-what will be trivial after that change. So it's something I can get on to. I'm planning to change the third parameter of the constructor to be an option array anyway (remember that before it was a path), so I can add it in there.
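
A rough sketch of how the two overrides could sit on top of the starter check, and why a "^"-anchored starter regex never matches in the middle of the source. Everything here is an assumption made for illustration: the function `selectContexts`, the context names and the regexes are invented, not taken from GeSHi.

```php
<?php
// Hypothetical sketch: probe regex starters against the source, then let
// user rules override the automatic decision. Not GeSHi's real code.

/**
 * @param string $source         Code to be highlighted.
 * @param array  $starterRegexes Map of context name => list of starter regexes.
 * @param array  $always         Contexts to keep even if no starter matches.
 * @param array  $never          Contexts to drop even if a starter matches.
 * @return array Names of the contexts to use.
 */
function selectContexts(string $source, array $starterRegexes, array $always = [], array $never = []): array
{
    $selected = [];
    foreach ($starterRegexes as $name => $regexes) {
        if (in_array($name, $never, true)) {
            continue; // removed no matter what
        }
        $found = in_array($name, $always, true); // kept no matter what
        foreach ($regexes as $regex) {
            if ($found) {
                break;
            }
            $found = preg_match($regex, $source) === 1;
        }
        if ($found) {
            $selected[] = $name;
        }
    }
    return $selected;
}

$starters = [
    // Anchored with "^": tested against the whole source, these can only
    // match at offset 0, not where the attribute string actually begins.
    'html/string_javascript' => ['/^"/', "/^'/"],
    'html/comment'           => ['/<!--/'],
];
$source = '<a onclick="go()">x</a>';

print_r(selectContexts($source, $starters));                             // nothing detected
print_r(selectContexts($source, $starters, ['html/string_javascript'])); // forced in
```

The first call finds no contexts, because the source starts with "<" and the anchored regexes are only tried there; the second call keeps the string context because the caller asked for it.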

As for converting the string, well first of all for BC reasons that third parameter will have to be an array. If it's a string it might be a path to the language files. So an array, like for PEAR::Pager or PEAR::DB:

$geshi =& new GeSHi($source, $language, array(
  'option' => 'value',

Of course, one of the option-value pairs could be:

  'contextRules' => '+opengl -asm'

or:

  'contextRules' => array('+' => array('opengl'), '-' => array('asm'))

I guess the string version could be done as long as conversion was quick (split by whitespace, for each look at first char for +/-, apply rule on rest of string). Though I'd still like to keep the array one because it has a strict, easy-to-use format.
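
The conversion described above (split by whitespace, look at the first character, apply the rule to the rest) might look roughly like this. The function name `parseContextRules` is hypothetical, and accepting commas as separators is an extra assumption so that the earlier '+opengl, -asm' spelling also works.

```php
<?php
// Hypothetical sketch of the string-to-array conversion; not GeSHi code.

/**
 * Turn '+opengl -asm' into array('+' => array('opengl'), '-' => array('asm')).
 * Tokens without a leading + or - are ignored.
 */
function parseContextRules(string $rules): array
{
    $parsed = ['+' => [], '-' => []];
    // Split on runs of whitespace and/or commas, dropping empty pieces.
    $tokens = preg_split('/[\s,]+/', $rules, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($tokens as $token) {
        $sign = $token[0];
        $name = substr($token, 1);
        if (($sign === '+' || $sign === '-') && $name !== '') {
            $parsed[$sign][] = $name;
        }
    }
    return $parsed;
}

print_r(parseContextRules('+opengl -asm'));
```

Since the result has the same shape as the array form, the rest of the code would only ever need to handle arrays.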

And for extensions, well I've already added support for adding "render" extensions (see geshi/classes/renderers directory), and support for the "codeparser" extensions that delphi and php use already. If you have other ideas, let me know.
11-03-05 11:53 BenBE (note 0000049)

@Good you mentioned: ;-) LOL Just think of everything and nothing ... I didn't know you just implemented it elsewhere ;-) Seems as if there's not much against this feature ^^

Regarding
  'contextRules' => array('+' => array('opengl'), '-' => array('asm'))
I'd just like to add that I think swapping would be nicer:
  'contextRules' => array('opengl' => '+', 'asm' => '-')
or, to use opengl for delphi only:
  'contextRules' => array('opengl' => array('delphi' => '+', 'cpp' => '-'), 'asm' => '-')
But everything seems just fine.
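
As a sketch of how the swapped format could be reduced to the '+'/'-' grouping for the language currently in use: the function name `resolveContextRules` is hypothetical, and silently skipping an extension that has no entry for the current language is an assumption, not something settled in this thread.

```php
<?php
// Hypothetical sketch: resolve "extension => rule" pairs (optionally with a
// per-language array) into the '+'/'-' grouping. Not GeSHi code.

function resolveContextRules(array $rules, string $language): array
{
    $resolved = ['+' => [], '-' => []];
    foreach ($rules as $extension => $rule) {
        if (is_array($rule)) {
            // Per-language form: pick the rule for the current language,
            // or skip the extension if there is none (assumed behaviour).
            if (!isset($rule[$language])) {
                continue;
            }
            $rule = $rule[$language];
        }
        if ($rule === '+' || $rule === '-') {
            $resolved[$rule][] = $extension;
        }
    }
    return $resolved;
}

$rules = ['opengl' => ['delphi' => '+', 'cpp' => '-'], 'asm' => '-'];

print_r(resolveContextRules($rules, 'delphi')); // opengl added, asm removed
print_r(resolveContextRules($rules, 'cpp'));    // opengl and asm both removed
```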

@Registration of extensions: Do you know what I mean by this? Maybe you can suggest some ways we could solve it.
11-03-05 15:53 nigel (note 0000051)

Swapping might be nicer. Though the example with the languages isn't a good idea... we figure that if they're highlighting using delphi then they can just specify '+opengl -asm' or whatever, and they're not interested in what c++ wants. If they call setLanguage they could then call setOption or similar to change these things.

As for extensions, before having extensions we need to ask where they would be used/useful. And I thought that they would be useful for looking at the stuff generated by the context tree parsing, and at the time when that data is to be used for generating output.

So to that end, there is the CodeParser class, which can be extended to handle looking at what is generated, and Renderer which can be extended to generate output. If you look in the geshi/classes/delphi directory, you'll see an extension of CodeParser for more correct highlighting, and in geshi/classes/renderers you'll see the HTML renderer. I plan to add other renderers such as PDF, ODF, dummy etc.

So then we ask:

  * When else might they be needed?
  * If Joe Blow, third party, reckons they've written a good renderer, how do we let them use it without putting their stuff in with the other, officially supported renderers?


$geshi =& new GeSHi($source, 'delphi');
$geshi->accept(new MyGeSHiRenderer());
echo $geshi->parseCode();

Or something else?

What about use cases for other places where extensions can be used?
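
One possible reading of the accept() idea, sketched with stand-in classes. Nothing here reflects GeSHi's real Renderer or CodeParser classes: `TokenRenderer`, `UppercaseKeywordRenderer` and `MiniHighlighter` are invented for illustration, and the "parsed" tokens are hard-coded instead of coming from a context tree.

```php
<?php
// Hypothetical sketch of a pluggable third-party renderer. All class and
// method names are assumptions; GeSHi's real classes may look different.

interface TokenRenderer
{
    public function renderToken(string $text, string $contextName): string;
}

// A third-party renderer living outside the officially supported ones.
class UppercaseKeywordRenderer implements TokenRenderer
{
    public function renderToken(string $text, string $contextName): string
    {
        return $contextName === 'keyword' ? strtoupper($text) : $text;
    }
}

// Stand-in for the highlighter: holds already-parsed (text, context) pairs.
class MiniHighlighter
{
    private $tokens;
    private $renderer = null;

    public function __construct(array $tokens)
    {
        $this->tokens = $tokens;
    }

    // Any object implementing the interface is accepted, wherever it lives.
    public function accept(TokenRenderer $renderer): void
    {
        $this->renderer = $renderer;
    }

    public function parseCode(): string
    {
        $out = '';
        foreach ($this->tokens as [$text, $contextName]) {
            $out .= $this->renderer !== null
                ? $this->renderer->renderToken($text, $contextName)
                : $text; // no renderer registered: pass the text through
        }
        return $out;
    }
}

$highlighter = new MiniHighlighter([['begin', 'keyword'], [' x ', 'code'], ['end', 'keyword']]);
$highlighter->accept(new UppercaseKeywordRenderer());
echo $highlighter->parseCode(), "\n"; // prints "BEGIN x END"
```

The point of the interface is that the highlighter only depends on the contract, so a third-party class never has to be placed in the directory of official renderers.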

- Issue History
Date Modified Username Field Change
10-28-05 04:46 BenBE New Issue
10-28-05 04:46 BenBE Status new => assigned
10-28-05 04:46 BenBE Assigned To  => nigel
10-28-05 04:46 BenBE Relationship added related to 0000002
10-28-05 04:54 BenBE version  => 1.1.1alpha1
11-01-05 13:59 nigel Note Added: 0000039
11-02-05 08:44 BenBE Note Added: 0000045
11-02-05 14:33 nigel Note Added: 0000046
11-03-05 05:03 BenBE Note Added: 0000047
11-03-05 11:12 nigel Note Added: 0000048
11-03-05 11:53 BenBE Note Added: 0000049
11-03-05 15:53 nigel Note Added: 0000051

