
    •  
      CommentAuthorskeleton
    • CommentTimeFeb 10th 2007
     permalink

    Whenever I declare a double-quote character as a char in Java, the syntax coloring gets messed up.

    For example after the following line:
    char c = '"';
    everything is colored as a String. I think the reason is that it cannot find a closing double quote.

    Here is the screenshot.
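To make the suspected cause concrete, here is a minimal sketch of a scanner that only knows a begin/end rule for double quotes, next to one that consumes single-quoted char literals first. The class and method names (`QuoteScanDemo`, `closing`, `stringSpans`) are made up for illustration; this is not how Intype's grammar engine is implemented, just the same idea in plain Java.

```java
import java.util.ArrayList;
import java.util.List;

final class QuoteScanDemo {

    // Index just past the closing quote, or src.length() if the quote never closes.
    static int closing(String src, int open, char quote) {
        for (int i = open + 1; i < src.length(); i++) {
            char ch = src.charAt(i);
            if (ch == '\\') {
                i++;                      // skip the escaped character
            } else if (ch == quote) {
                return i + 1;
            }
        }
        return src.length();              // unterminated: runs to end of input
    }

    // Hypothetical helper: the text a highlighter would color as a double-quoted string.
    static List<String> stringSpans(String src, boolean knowsCharLiterals) {
        List<String> spans = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char ch = src.charAt(i);
            if (ch == '"') {
                int end = closing(src, i, '"');
                spans.add(src.substring(i, end));
                i = end;
            } else if (knowsCharLiterals && ch == '\'') {
                i = closing(src, i, '\''); // consume the char literal as a whole
            } else {
                i++;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String src = "char c = '\"';\nint x = 1;";
        System.out.println(stringSpans(src, false)); // double-quote rule only
        System.out.println(stringSpans(src, true));  // char literals handled first
    }
}
```

With only the double-quote rule, the span that starts inside `'"'` never finds a closing quote and runs to the end of the input, which matches the behavior described above. Once single-quoted literals are consumed first, no string span is reported at all.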

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007 edited
     permalink

    I’ve tried to forget java and it seems to be working. Are single quotes reserved for char data?
    Anyway, to fix this I had to add another pattern/variable to the grammar repository.
    Change: Line 284
    string-quoted-double: { begin: /"/ end: /"/ name: 'string.quoted.double.java' swallow: '\\.' }
    To:
    string-quoted-double: { begin: /"/ end: /"/ name: 'string.quoted.double.java' patterns: [ { match: /\\./ } ] }
    string-quoted-single: { begin: /'/ end: /'/ name: 'string.quoted.single.java' patterns: [ { match: /\\./ } ] }
    also add this:
    { include: '#string-quoted-single' }
    after line 223.

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007
     permalink

    If single quotes are reserved for char data it would be better to give the repository variable a more descriptive name.

    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007
     permalink

    So THAT’s what I didn’t do right….. I thought I could have two includes in a single {} section…such as:
    { include: '#string-quoted-double' include: '#string-quoted-single' }
    ...but, obviously, that doesn’t work. :P

    However, unless I’m wrong (which I might be), isn’t the single quote for (single) chars only? If so, not only is this incorrect in Java, but C/C++ doesn’t handle it properly either. Since you’ve been working with the REGEX a bit more…maybe you’d know how to modify that to work? I’m thinking something more like the following:

    string-quoted-single: { match: /'.'/ name: 'string.quoted.single.java' }
    So, if I’m correct on the single quote thing, then this is the correct solution.

    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007 edited
     permalink

    ...I took a look at how C/C++ handles this. It completely confuses me. When I was programming in C++, only single characters were allowed…oh, wait, it’s allowing for hex characters and such — therefore, mine actually WON’T work as expected…but there’s still a bug in the REGEX since I can simply output an entire sentence within single quotes (under C/C++). I’ll keep playing with it, but a C/C++ expert might be helpful to let us know what is, and is not allowed.

    ...ok, the REGEX is too confusing. I’m lost.

    Regex in question:
    string_escaped_char – line 383, C.itGrammar

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007 edited
     permalink

    It matches octal, hex, and escaped chars like \n inside single quotes.
    regex: /\\ ( \\|[abefnprtv'"?]|[0-3]\d{,2}|[4-7]\d?|x[a-fA-F0-9]{,2})/
    matches: \ | \|escaped chars|octal|octal|hex
    examples: \ | \|\n|\000 (max two digits after 0-3)|\40 (zero or one digit after)|\x93 (max two hex chars after x)
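For anyone who wants to poke at that pattern outside the editor, here is a rough port to `java.util.regex`. It is a sketch under assumptions: the spaces are dropped, `{,2}` is written out as `{0,2}`, and `CEscapeDemo` / `isEscape` are names invented for this example. Java's regex engine is not the one Intype uses, so edge cases may differ.

```java
import java.util.regex.Pattern;

final class CEscapeDemo {

    // Assumption: {,2} from the grammar is written here in the explicit form {0,2}.
    static final Pattern ESCAPE = Pattern.compile(
        "\\\\(\\\\|[abefnprtv'\"?]|[0-3]\\d{0,2}|[4-7]\\d?|x[a-fA-F0-9]{0,2})");

    // True when the whole input is exactly one escape sequence.
    static boolean isEscape(String s) {
        return ESCAPE.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Each sample starts with a literal backslash (doubled in Java source).
        String[] samples = { "\\\\", "\\n", "\\000", "\\40", "\\x93", "\\q" };
        for (String s : samples) {
            System.out.println(s + " -> " + isEscape(s));
        }
    }
}
```

Two things stand out when testing it this way: the octal branches use `\d` rather than `[0-7]`, so `\09` passes, and the hex branch allows zero digits, so a bare `\x` passes too.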

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007
     permalink

    If it is supposed to be this way in Java, then you would need to adapt the fix above to something like the C grammar.

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007 edited
     permalink

    I have updated the java.itGrammar file with the fix.
    This has it the C way.

    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007
     permalink

    tstrokes, I was pretty sure it matched special reserved characters, such as newline, carriage return, tab, beep, etc… The problem is that it doesn’t actually work as it’s intended to under Intype’s REGEX engine. It matched a junk character string such as ‘sggfdsfdsafd’ ... I tested this under the C++ grammar just to be sure.

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007 edited
     permalink

    What scope does it show for the junk character string?
    If you have an escaped char in the string does it have the correct scope?

    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007 edited
     permalink

    tstrokes: Oh come on, can’t you try it too? :P

    Change to C++ mode. Type the following code:
    char someChar = 'fdsafdsfsd';

    The scope within the junk character string is, as expected: string.quoted.single.c | source.c++ where the pipe denotes a newline (I think that differentiates parents/children)

    What it should do is to leave that scope after the first identified/matched single character reference.

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007
     permalink

    Yeah, I did try in all three grammars java(my version), C, and C++.
    What I should have asked was what it should do which you kindly explained.
    Thanks.

    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007
     permalink

    tstrokes: Yeah, I did try in all three grammars java(my version), C, and C++.
    What I should have asked was what it should do which you kindly explained.
    Thanks.

    Heh…sorry, and…no problem! :P

    • CommentAuthori
    • CommentTimeFeb 11th 2007 edited
     permalink

    In Java, character literals can be:

    1. a single character
    2. a unicode value in the form \u004E
    3. an octal value in the range \0 to \377
    4. an escaped character

    Here’s my regex, it works for all four cases, and marks invalids:

    string-quoted-single: { begin: /'/ end: /'/ name: 'string.quoted.single.java' patterns: [ { match: /(?<=')\\([bfnrt"'\\]|u[0-9a-fA-F]{4}|[0-7][0-7]?|[0-3][0-7]{2})(?='\s*;)/ name: 'constant.character.escape' } { match: /(?<!'\\u)(.{2,}|\\)(?='\s*;)/ name: 'invalid.illegal' } ] }
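The four cases listed above can also be checked outside the grammar. Below is a minimal sketch in plain Java that validates a whole char literal, quotes included. It is not the grammar rule itself: `CharLiteralCheck` and `isCharLiteral` are hypothetical names, it only covers the four forms from the list, and it ignores anything else the language specification may allow.

```java
import java.util.regex.Pattern;

final class CharLiteralCheck {

    // 1. a single character (not a quote, backslash, or line break)
    private static final String SINGLE = "[^'\\\\\\r\\n]";
    // 4. an escaped character: b f n r t " ' or backslash
    private static final String SIMPLE = "\\\\[bfnrt\"'\\\\]";
    // 2. a unicode value: backslash, u, four hex digits
    private static final String UNICODE = "\\\\u[0-9a-fA-F]{4}";
    // 3. an octal value from 0 to 377
    private static final String OCTAL = "\\\\(?:[0-3][0-7]{2}|[0-7][0-7]?)";

    static final Pattern CHAR_LITERAL = Pattern.compile(
        "'(?:" + SINGLE + "|" + SIMPLE + "|" + UNICODE + "|" + OCTAL + ")'");

    // True when the input is exactly one well-formed char literal.
    static boolean isCharLiteral(String s) {
        return CHAR_LITERAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "'a'", "'\"'", "'\\n'", "'\\u004E'", "'\\377'",   // expected to pass
            "'\\400'", "'fdsafdsfsd'", "'\\'", "''"           // expected to fail
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isCharLiteral(s));
        }
    }
}
```

Matching the whole literal in one go sidesteps the lookbehind and lookahead on `'` and `;`, at the cost of not being usable as a nested pattern inside a begin/end rule.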

    Edit: Made the escape sequence regex more specific, and added one more subpattern for single slash (invalid).
    Edit: Okay, I believe I have gotten all covered… and made it shorter

    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007
     permalink

    Three questions:
    1. On your third regex: /(?<!'\\u)..... Is that supposed to be similar to the above two?
    2. (?<=') Is this a lookbehind?
    3. I’m guessing this ( (?='\s*;) ) is a lookahead, but what else is it doing?

    Some of the more advanced features of regular expressions I still have to learn (like, anything beyond the basics).

    • CommentAuthori
    • CommentTimeFeb 11th 2007
     permalink

    I’ve looked around the web for information sources on C++ escape sequences, and it’s kinda inconsistent. E.g., some sites give the hex escape as “\xdd”, while others say the hex sequence can be as long as you want, as in “\xdddd…”.

    So, the question is whether we should target a specific standard or try to cover all possibilities?

    •  
      CommentAuthortstrokes
    • CommentTimeFeb 11th 2007
     permalink

    Thanks idyllrain. :)

    • CommentAuthori
    • CommentTimeFeb 11th 2007
     permalink

    BrendonKoz:

    1. (?<!'\\u) is a negative lookbehind. I’m basically asserting that the characters '\u do not appear immediately before the match of 2 or more characters.
    2. Yes, (?<=') is a positive lookbehind, asserting that before matching, there is a single quote character to the left of the match.
    3. You’re right, that’s a positive lookahead, asserting that after the match, there is a single quote character, followed by zero or more whitespace characters, and a semicolon.
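If it helps to see those three assertions in action, here is a small sketch using `java.util.regex`, which supports the same lookbehind and lookahead syntax. The class and helper names (`LookaroundDemo`, `firstMatch`) are made up, and the patterns are cut-down pieces of the grammar regex rather than the full thing. The point is that the quote and the semicolon are checked but never become part of the match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaroundDemo {

    // Hypothetical helper: the first matched text, or null when nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Positive lookbehind (?<=') and positive lookahead (?='\s*;):
        // a quote must sit to the left, and quote + optional spaces + semicolon to the right.
        String escape = "(?<=')\\\\n(?='\\s*;)";
        System.out.println(firstMatch(escape, "char c = '\\n' ;"));
        System.out.println(firstMatch(escape, "char c = '\\n'"));

        // Negative lookbehind: 004E only matches when it is NOT
        // directly preceded by a quote, a backslash, and the letter u.
        String notUnicode = "(?<!'\\\\u)004E";
        System.out.println(firstMatch(notUnicode, "'\\u004E'"));
        System.out.println(firstMatch(notUnicode, "x004E"));
    }
}
```

In the first call the returned text is just the backslash and the `n`: the surrounding quotes and the semicolon are asserted, not consumed. The second call finds nothing because the semicolon is missing, and the third finds nothing because the negative lookbehind rejects the only candidate position.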

    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007 edited
     permalink
    *YAWN*

    I'll load up Visual Studio .NET 2005 tomorrow (later today, my time...it's early now) and take a look both within the syntax highlighting, and the MSDN documentation.
    Thank you very much for explaining those expressions!
    • CommentAuthori
    • CommentTimeFeb 11th 2007 edited
     permalink
    :) Np.

    This is the MSDN reference on character escape sequences: http://msdn2.microsoft.com/en-us/library/6aw8xdf2(VS.80).aspx . There's also this reference, which has a list of syntaxes: http://www.csci.csusb.edu/dick/c++std/cd2/gram.html .

    There's also the matter of the current grammar matching \e, which I have not seen before (along with some other characters listed there).
    • CommentAuthorBrendonKoz
    • CommentTimeFeb 11th 2007 edited
     permalink

    Wonderful… Visual Studio’s syntax highlighting has the same effect (any junk character string is matched). :P

    I didn’t find anything decent in my local MSDN documentation. I think the best source to structure a REGEX from would be your online-based Microsoft documentation link (in the case of C++). I am uncertain whether or not the base language of C will allow for all of those characters; I would think not. Then again, who’s to say that gcc supports all of those as well? Crap, who was the C/C++ programmer from the Intro thread? :P svenax?

    tstrokes: If the book mention was towards me, I own the Perl bible book. There’s an entire (large) chapter devoted to regular expressions. I’ve read through it once, I should probably do it again. :)

    • CommentAuthori
    • CommentTimeFeb 11th 2007 edited
     permalink

    My one and only C++ reference book from ages ago also does not mention support for all those characters. And I’m rather unclear on the most current language specifications for it. Anyhow, if C doesn’t support it while C++ does, we can just put in separate definitions in the respective grammar files.

    Speaking of regex books, does anyone here have Mastering Regular Expressions? Is it as good as everyone seems to be saying it is? I’m planning to buy it, but I might be tempted to skip that book for Edward Tufte’s Envisioning Information. Heh…

    •  
      CommentAuthordflock
    • CommentTimeFeb 12th 2007
     permalink

    I’ve got both books and they’re both excellent. By the time you’ve finished reading Friedl’s Regex book, you’ll be dreaming in Regular Expressions – it’s pretty comprehensive and focused. However, to be fair, you can look most regex stuff up online when you need something, so it’s maybe not a book you need.

    If you’re only going to get one, I think I would get the Tufte book – his books are stellar and more broadly applicable to lots of information architecture topics, rather than being focussed on one thing.

    • CommentAuthori
    • CommentTimeFeb 14th 2007
     permalink

    Kinda broke down on Valentine’s Day and went to purchase the only copy of Envisioning Information in my whole country… superb book!

    •  
      CommentAuthordflock
    • CommentTimeFeb 14th 2007
     permalink

    Heh – good for you. At least that way you know that the gift will be appreciated!! :)