GeSHi - Generic Syntax Highlighter Syntax Coloriser for PHP
  

Viewing Issue Simple Details

ID: 0000031
Category: [GeSHi] lang
Severity: minor
Reproducibility: sometimes
Date Submitted: 12-03-05 14:07
Last Update: 02-18-06 11:50
Reporter: BenBE
View Status: public
Assigned To: BenBE
Priority: normal
Resolution: fixed
Status: closed
Product Version: 1.1.1alpha2
Summary: 0000031: Incorrect ender handling with delphi/asm
Description: When an "end" is part of the raw ASM context and not in a comment, the asm context is terminated in error. When the "end" is part of a comment, everything works fine.

See the examples.
Additional Information: The issue seems to be on your side ;-)
- Notes
(0000097)
nigel
12-05-05 09:51

Actually, this is your bug.

The parser will stop as soon as it encounters an "end" that isn't otherwise in another context, or hidden in a regex or similar.

You probably need to define new contexts or regular expressions to stop this. The regular expression for the end of ASM might have to change, or you could capture @@[whatever], though I'm not sure whether that will stop the context from ending.
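
The "first occurrence wins" behaviour described in this note can be illustrated with a small PHP sketch. It is not GeSHi's parser; `firstEnderOffset` is a hypothetical helper that just searches for the first case-insensitive "end", assuming that is roughly what a plain-string ender does.

```php
<?php
// Hypothetical helper, not part of GeSHi: return the offset of the first
// case-insensitive "end", ignoring whether it is a word of its own.
function firstEnderOffset(string $source)
{
    return stripos($source, 'end');
}

$asm = "asm\n  JNE @@FamilyEnd\nend;";
$offset = firstEnderOffset($asm);

// The hit lands inside the label "@@FamilyEnd", before the real "end;".
echo substr($asm, $offset - 6, 9), "\n";
```

Under these assumptions the context would close in the middle of the label, which is the symptom reported in the description.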
 
(0000098)
BenBE
12-05-05 13:39

The ender of the ASM context simply has to be a separate word.

I.e. FinishEnd is one identifier, but Finish;end is several tokens.

What confused me was that there's no direct way to say "needs to be a separate word" instead of "first appearance, independent of circumstances" ...
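
The "separate word" requirement can be sketched with the PCRE word-boundary assertion `\b`. This is an illustration only, not GeSHi code; `hasSeparateEnd` is a hypothetical name.

```php
<?php
// Hypothetical check, not GeSHi code: does "end" occur as a word of its own?
// \b is zero-width and matches between a word character ([A-Za-z0-9_])
// and a non-word character or the edge of the text.
function hasSeparateEnd(string $text): bool
{
    return preg_match('/\bend\b/i', $text) === 1;
}

var_dump(hasSeparateEnd('FinishEnd'));  // bool(false): one identifier
var_dump(hasSeparateEnd('Finish;end')); // bool(true): "end" is its own token
```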
 
(0000100)
nigel
12-05-05 15:53

You can use "/\send/" or something similar as an ender.
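
A quick way to see what "/\send/" does and does not cover is to run it against a few inputs. This is a sketch assuming plain `preg_match`; `matchesEnder` is a hypothetical helper, and the sample lines are modelled on the cases discussed in this issue.

```php
<?php
// Hypothetical helper: true if the pattern matches anywhere in the text.
function matchesEnder(string $pattern, string $text): bool
{
    return preg_match($pattern, $text) === 1;
}

$ender = '/\send/';

$results = [
    'NOP end;'          => matchesEnder($ender, 'NOP end;'),          // whitespace before "end"
    'NOP;end;'          => matchesEnder($ender, 'NOP;end;'),          // ";" before "end"
    'end;'              => matchesEnder($ender, 'end;'),              // "end" at start of text
    'JNZ @@Family ende' => matchesEnder($ender, 'JNZ @@Family ende'), // "end" is only a prefix
];

var_dump($results);
```

The pattern requires a whitespace character in front of "end" and places no condition on what follows, so it misses ";end" and a leading "end" while still matching the first three letters of "ende".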
 
(0000106)
BenBE
12-06-05 06:16
edited on: 12-06-05 06:50

I played around a bit with the starters and enders, but can't seem to find a suitable one:

$this->_contextDelimiters = array(
    0 => array(
        0 => array('REGEX#[^a-z0-9_]asm[^a-z0-9_]#i'),
        1 => array('REGEX#[^a-z0-9_]end[^a-z0-9_]#i'),
        2 => false
    )
);

This basically works, except for the following cases ('' denotes the start and end of the text):

'asm test end' (both undetected, no asm highlighting at all)
' asm test end' (end undetected, unfinished context)
'asm test end ' (asm undetected, end --> keyword)
' asm test end ' (both detected, highlighting as wanted, except ...)

... except that the characters required to detect end as a separate keyword become part of the ender too. The same goes for the characters surrounding asm.

Thus the semicolon after end becomes delphi/delphi/asm/end ... That's the problem I was faced with.
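
The "surrounding characters are part of the ender" effect can be reproduced outside GeSHi. The sketch below assumes plain `preg_match`; the second pattern, which uses zero-width lookarounds, is an illustrative alternative and not one of the patterns posted here.

```php
<?php
$text = ' asm test end;';

// Character classes consume the neighbours: the match is " end;", not "end".
preg_match('#[^a-z0-9_]end[^a-z0-9_]#i', $text, $consuming);

// Lookarounds only assert what the neighbours are; the match stays "end".
preg_match('#(?<![a-z0-9_])end(?![a-z0-9_])#i', $text, $zeroWidth);

// With nothing after "end", the consuming pattern finds no match at all.
$atEdges = preg_match('#[^a-z0-9_]end[^a-z0-9_]#i', 'asm test end');

var_dump($consuming[0]); // string(5) " end;"
var_dump($zeroWidth[0]); // string(3) "end"
var_dump($atEdges);      // int(0)
```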

Did I mention that I'm not familiar with regexps? :P

Update: Changed the regexps to fix the SOC\EOC problems. The second problem persists.

$this->_contextDelimiters = array(
    0 => array(
        0 => array('REGEX#^asm|[^a-z0-9_]asm[^a-z0-9_]|^asm$#i'),
        1 => array('REGEX#[^a-z0-9_]end$|[^a-z0-9_]end[^a-z0-9_]#i'),
        2 => false
    )
);

 
(0000108)
nigel
12-06-05 10:03

I posted a regex for the first problem in bug 28.

Yes, unfortunately the character before will be gobbled using my regex, and you have this as problem 2.

I'm not sure what to do about that problem, but I'll think of something soon.
 
(0000109)
BenBE
12-06-05 11:24

The bug should be fixed. The only thing I learned from this is that I hate regexps :P

I finally found something that matches correctly ...
$this->_contextDelimiters = array(
    0 => array(
        0 => array(
            'REGEX#^asm$#im',
            'REGEX#^asm(?<=\b)#im',
            'REGEX#(?<![a-z0-9_])asm$#im',
            'REGEX#(?<![a-z0-9_])asm(?<=\b)#im',
        ),
        1 => array(
            'REGEX#^end$#im',
            'REGEX#^end(?<=\b)#im',
            'REGEX#(?<![a-z0-9_])end$#im',
            'REGEX#(?<![a-z0-9_])end(?<=\b)#im',
        ),

        2 => false
    )
);
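
The ender patterns above can be exercised on their own with a short sketch. It assumes that dropping the `REGEX` prefix leaves an ordinary PCRE pattern that `preg_match` accepts; `firstEnder` is a hypothetical helper and does not reflect how GeSHi itself evaluates delimiters.

```php
<?php
// The four ender patterns from the delimiter array, copied verbatim.
$enders = [
    'REGEX#^end$#im',
    'REGEX#^end(?<=\b)#im',
    'REGEX#(?<![a-z0-9_])end$#im',
    'REGEX#(?<![a-z0-9_])end(?<=\b)#im',
];

// Hypothetical helper: offset of the first ender match in $line, or null.
function firstEnder(array $delimiters, string $line)
{
    foreach ($delimiters as $delimiter) {
        // Assumption: the text after "REGEX" is a complete PCRE pattern.
        $pattern = substr($delimiter, strlen('REGEX'));
        if (preg_match($pattern, $line, $m, PREG_OFFSET_CAPTURE) === 1) {
            return $m[0][1];
        }
    }
    return null;
}

$lines = ['    JNE @@FamilyEnd //', '    JNZ @@Family;Ende', 'eend', '    End;'];
foreach ($lines as $line) {
    var_dump(firstEnder($enders, $line));
}
```

In this sketch only `    End;` yields an offset (4); the lines where "end" is glued to other word characters return null.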

Next time I'd prefer a "separate word" option ;-)

The following test cases succeed:
---
    Asm
    JNE @@FamilyEnd //
    End;
    Asm
    JMP @@FamilyEnd
    End;
    Asm
    JNZ @@Family End //
    End;
    Asm
    JNZ @@Family End
    End;
    Asm
    JNZ @@Family;End //
    End;
    Asm
    JNZ @@Family;End
    End;
    Asm
    JNE @@FamilyEnd; //
    End;
    Asm
    JMP @@FamilyEnd;
    End;
    Asm
    JNZ @@Family End; //
    End;
    Asm
    JNZ @@Family End;
    End;
    Asm
    JNZ @@Family;End; //
    End;
    Asm
    JNZ @@Family;End;
    End;

    Asm
    JNE @@FamilyEnde //
    End;
    Asm
    JMP @@FamilyEnde
    End;
    Asm
    JNZ @@Family Ende //
    End;
    Asm
    JNZ @@Family Ende
    End;
    Asm
    JNZ @@Family;Ende //
    End;
    Asm
    JNZ @@Family;Ende
    End;
    Asm
    JNE @@FamilyEnde; //
    End;
    Asm
    JMP @@FamilyEnde;
    End;
    Asm
    JNZ @@Family Ende; //
    End;
    Asm
    JNZ @@Family Ende;
    End;
    Asm
    JNZ @@Family;Ende; //
    End;
    Asm
    JNZ @@Family;Ende;
    End;
    Asm

ende
eend
end;
    asm
end
    End;
---

As well as
---
asm
NOP end;

 asm
NOP end;

 asm;
NOP end;

masm
NOP end;

masm;
NOP end;

asma
NOP end;

 asma
NOP end;

 asma;
NOP end;
---

Fixing any cases where these patterns fail is up to you, nigel!

Greets,
BenBE.
 
(0000110)
nigel
12-06-05 11:45

Actually, I found what I was looking for re. the second problem, which should make it much easier.

Regular expressions support "lookahead" and "lookbehind".

So all we need to do is something like this:

[match non-word character as lookbehind]asm[match non-word character as lookahead]

A non-word character, I discovered, can be encoded as \W.

So we have:

/(?<=\W|^)asm(?=\W|$)/

Try that one and see what you get.
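
The lookbehind/lookahead idea can be tried out with a small sketch. It writes the leading assertion as a true lookbehind, `(?<=\W|^)`, and for comparison runs a variant whose leading assertion is a lookahead, `(?=\W|^)`, which can only succeed at the very start of the text. `countStarters` is a hypothetical helper; nothing here is GeSHi code.

```php
<?php
// Hypothetical helper: number of matches of $pattern in $text.
function countStarters(string $pattern, string $text): int
{
    return (int) preg_match_all($pattern, $text);
}

// Leading lookbehind, trailing lookahead: neighbours are checked, not consumed.
$lookbehind = '/(?<=\W|^)asm(?=\W|$)/i';
// Leading lookahead: the assertion inspects the "a" of "asm" itself.
$lookahead = '/(?=\W|^)asm(?=\W|$)/i';

var_dump(countStarters($lookbehind, 'asm'));    // int(1)
var_dump(countStarters($lookbehind, ' asm;'));  // int(1)
var_dump(countStarters($lookbehind, 'masm'));   // int(0)
var_dump(countStarters($lookbehind, 'asma'));   // int(0)
var_dump(countStarters($lookahead, ' asm;'));   // int(0)
```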
 
(0000335)
nigel
02-18-06 11:50

Issue closed.
 

- Issue History
Date Modified Username Field Change
12-03-05 14:07 BenBE New Issue
12-05-05 09:51 nigel Note Added: 0000097
12-05-05 09:51 nigel Assigned To  => BenBE
12-05-05 09:51 nigel Status new => assigned
12-05-05 13:39 BenBE Note Added: 0000098
12-05-05 15:53 nigel Note Added: 0000100
12-06-05 06:16 BenBE Note Added: 0000106
12-06-05 06:50 BenBE Note Edited: 0000106
12-06-05 10:03 nigel Note Added: 0000108
12-06-05 11:24 BenBE Status assigned => resolved
12-06-05 11:24 BenBE Fixed in Version  => 1.1.1alpha3
12-06-05 11:24 BenBE Resolution open => fixed
12-06-05 11:24 BenBE Note Added: 0000109
12-06-05 11:45 nigel Note Added: 0000110
02-18-06 11:50 nigel Status resolved => closed
02-18-06 11:50 nigel Note Added: 0000335

  

