-
Notifications
You must be signed in to change notification settings - Fork 8k
Context Sensitive Lexer #1054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Context Sensitive Lexer #1054
Conversation
e3c8519 to
94a4100
Compare
Zend/zend_language_scanner.l
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in_class is an unsigned int ... I would be very surprised if it became smaller than zero ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a test with static unsigned int in_class = -1; and it caused strange bugs without errors here :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the unsigned and kept the assertion because it's currently not very obvious to detect bugs if somebody decrements in_class in wrong conditions.
8f862e7 to
ddea7e8
Compare
4c384aa to
636e210
Compare
|
It's great idea! |
a8effa2 to
1fe57cd
Compare
|
I have a question though, does this also apply to classes, interfaces or traits, or only to properties, methods and constants? Lets say if scalar type hinting is passed and the scalar names become reserved words, while I have an Int class and a String class. Will this RFC solve the problem? |
|
Hi,
No. I tried a more ambitious RFC before that targeted namespaces, classes, A pure lexical approach, without significant changes on the lexer Anyway, the current solution is a great start and already mitigates a lot
Hope that's useful information. Thanks, |
|
I see, thanks for your response, I will be looking forward to the more ambitious case-sensitive patch too. It's unfortunate that scalar names will become reserved words, I have mixed feelings about scalar type hinting. Part of me like it since its a step towards strongly typed PHP I was hoping for, but reserving scalar types makes it inconvenient for me. I know Levi Morrison proposed reserve some types in PHP 7, I am okay with his proposal though as I fully understand why this may be necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the distinction between the individual reserved keywords really relevant here? It look to me like it would be good enough to just return a boolean from the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could be useful in case one wants to check for a specific identifier to give more specialized err messages, like: https://github.com/php/php-src/pull/1054/files#diff-3a8139128d4026ce0cb0c86beba4e6b9R4511
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've seen the error message check, but the distinction seems really weird there ... it's pretty much the same message with slightly different wording. That does not seem worthwhile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about const FOO = [1, isset(BAR[0])]; will the isset be lexed as T_ISSET or as T_STRING? It should be a T_ISSET. We currently don't support any language constructors in constants I think, but that's just because they don't happen to be implemented, we may want to add them in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
T_ISSET because the ST_LOOKING_FOR_SR_NAME pops itself after first match so in const FOO = [1, isset(BAR[0])]; only FOO will be the T_STRING.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand your response. In that case in_class and in_const_list should be 1, so that would push LOOKING_FOR_SR_NAME after the , - or not? How does it distinguish between A = 1, B = 2 and A = [1, B] for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅
OK. Fixed with marcioAlmada@a8b1f21. Thanks ^^
|
superseded by #1158 |
This aims to be a consistent implementation of semi-reserved words grammar on OO context and OO access context.
RFC: https://wiki.php.net/rfc/context_sensitive_lexer