Tags: nicoSWD/php-rule-parser

v1.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Major refactor: recursive descent parser, arithmetic, unary operators, and token system overhaul (#26)

* Add proper AST and lexer

* Refactor

* Refactor

* Refactor

* Add highlighter

* Update README.md

* Add check for duplicate regex modifiers

* Add support for string concatenation with "+" and addition for numeric operations

* Add support for doing math operations

* Add support for returning actual results instead of just true/false

* Simplify highlighter

* Rename classes

* Add support for unary minus and logical NOT operators

* Fix constructor consistency

* Replace match expressions in AstEvaluator with Interpreter pattern

Each AST node now implements evaluate(EvaluationContext): mixed,
eliminating the two massive match expressions in AstEvaluator
that violated the Open/Closed Principle.

- Added EvaluationContext DTO (wraps TokenStream + TokenFactory)
- Added evaluate() to Node base class
- Each concrete node implements its own evaluation logic
- AstEvaluator is now a thin orchestrator with zero match expressions
- All 263 tests pass
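
A minimal sketch of the Interpreter pattern described above. The
EvaluationContext here is a simplified stand-in that only carries variables
(the real DTO wraps TokenStream + TokenFactory), and VariableNode and
AdditionNode are hypothetical names used for illustration:

```php
<?php
declare(strict_types=1);

// Simplified stand-in for the EvaluationContext DTO (assumption: variables only).
final class EvaluationContext
{
    /** @param array<string, mixed> $variables */
    public function __construct(public readonly array $variables = [])
    {
    }
}

abstract class Node
{
    // Every node knows how to evaluate itself; no central match expression.
    abstract public function evaluate(EvaluationContext $context): mixed;
}

final class IntegerNode extends Node
{
    public function __construct(private readonly int $value)
    {
    }

    public function evaluate(EvaluationContext $context): mixed
    {
        return $this->value;
    }
}

// Hypothetical node: looks a variable up in the context.
final class VariableNode extends Node
{
    public function __construct(private readonly string $name)
    {
    }

    public function evaluate(EvaluationContext $context): mixed
    {
        return $context->variables[$this->name] ?? null;
    }
}

// Hypothetical node: evaluates both children, then adds the results.
final class AdditionNode extends Node
{
    public function __construct(
        private readonly Node $left,
        private readonly Node $right,
    ) {
    }

    public function evaluate(EvaluationContext $context): mixed
    {
        return $this->left->evaluate($context) + $this->right->evaluate($context);
    }
}

$ast = new AdditionNode(new IntegerNode(2), new VariableNode('foo'));
$result = $ast->evaluate(new EvaluationContext(['foo' => 40]));
```

Adding a new node type only requires a new class with its own evaluate(),
which is the Open/Closed benefit mentioned above.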

* Fix: eliminate duplicated node evaluation logic in AstEvaluator

Made evaluate() delegate to resolve() instead of both methods
independently calling $node->evaluate(). This removes the latent
bug risk where one method could be modified without updating the other.

* Add .DS_Store to .gitignore

* Fix: replace fragile $afterValue boolean with stack-based token inspection

The $afterValue boolean flag was fragile because it required manual
maintenance at every token emission point. Adding new token types
required remembering to update the flag, and whitespace/newline
tokens implicitly relied on the previous state persisting.

Replaced it with lastTokenIsValue() which walks backwards through
the token stack, skipping ignorable tokens (whitespace/comments),
and checks the actual token type hierarchy:
- Does the token implement the Value interface?
- Is it a closing parenthesis or closing array?

This is more robust because:
- Context is derived from actual emitted tokens, not a manual flag
- New token types implementing Value automatically work
- Whitespace/comments are naturally skipped via canBeIgnored()
- The logic is self-documenting and testable
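
The backwards walk could look roughly like this. The token classes are
hypothetical stand-ins, and only the Value interface, canBeIgnored() and
lastTokenIsValue() are names taken from the description above:

```php
<?php
declare(strict_types=1);

// Marker interface for tokens that represent a value.
interface Value
{
}

abstract class BaseToken
{
    public function canBeIgnored(): bool
    {
        return false;
    }
}

// Hypothetical token classes, named for illustration only.
final class NumberToken extends BaseToken implements Value
{
}

final class MinusToken extends BaseToken
{
}

final class ClosingParenthesisToken extends BaseToken
{
}

final class ClosingArrayToken extends BaseToken
{
}

final class WhitespaceToken extends BaseToken
{
    public function canBeIgnored(): bool
    {
        return true;
    }
}

/** @param list<BaseToken> $tokens */
function lastTokenIsValue(array $tokens): bool
{
    // Walk backwards, skipping whitespace/comments, and inspect the
    // first significant token that was actually emitted.
    for ($i = count($tokens) - 1; $i >= 0; $i--) {
        $token = $tokens[$i];
        if ($token->canBeIgnored()) {
            continue;
        }

        return $token instanceof Value
            || $token instanceof ClosingParenthesisToken
            || $token instanceof ClosingArrayToken;
    }

    return false; // nothing significant has been emitted yet
}
```

A lexer can use the result to decide, for example, whether a following "-"
is a binary operator (after a value) or a unary one (anywhere else).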

* Fix string escape sequences not being unescaped in TokenEncapsedString

The getValue() method was stripping surrounding quotes but not unescaping
the content, so escape sequences like \n, \t, \\, etc. were being treated
as literal characters instead of their actual control character equivalents.

Added an unescape() method that handles common JavaScript escape sequences:
\n, \r, \t, \\, \", \', \$, \0. Unknown sequences are preserved as-is.

Added 9 unit tests in LexerTest.php and 1 integration test in ScalarTest.php.
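
One possible shape for such an unescape() helper, written as a free function
for brevity. The mapping table is an assumed reading of the list above, and
the regex-based approach is an illustration, not necessarily what
TokenEncapsedString does:

```php
<?php
declare(strict_types=1);

// Assumed escape table: character after the backslash => replacement.
const ESCAPES = [
    'n'  => "\n",
    'r'  => "\r",
    't'  => "\t",
    '0'  => "\0",
    '\\' => '\\',
    '"'  => '"',
    "'"  => "'",
    '$'  => '$',
];

function unescape(string $content): string
{
    // Match a backslash followed by any single character. Known sequences
    // are replaced; unknown ones are returned unchanged ($m[0]).
    $result = preg_replace_callback(
        '/\\\\(.)/s',
        static fn (array $m): string => ESCAPES[$m[1]] ?? $m[0],
        $content
    );

    return $result ?? $content; // preg error: fall back to the raw input
}
```

Because the pattern consumes the backslash and the following character
together, an escaped backslash followed by "n" stays a literal backslash
plus "n" instead of turning into a newline.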

* Fix division by zero not handled in Evaluator

DivisionNode and ModuloNode now check for a zero divisor before
performing arithmetic operations, following JavaScript semantics:

- Division by zero returns INF, -INF, or NAN (matching JS Infinity/NaN)
- Modulo by zero returns NAN (matching JS NaN)

Added 5 test methods in OperatorsTest.php covering all edge cases.
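
The zero-divisor rules can be sketched as two standalone functions. This is
an illustration of the semantics, not the DivisionNode/ModuloNode code, and
it ignores negative zero, which JavaScript treats specially:

```php
<?php
declare(strict_types=1);

function divide(int|float $left, int|float $right): int|float
{
    if ($right == 0) {
        // 0 / 0 and NaN / 0 are NaN; otherwise the sign of the dividend decides.
        if ($left == 0 || is_nan((float) $left)) {
            return NAN;
        }

        return $left > 0 ? INF : -INF;
    }

    return $left / $right;
}

function modulo(int|float $left, int|float $right): float
{
    if ($right == 0) {
        return NAN;
    }

    // fmod() keeps the sign of the dividend, like the JS % operator.
    return fmod((float) $left, (float) $right);
}
```

Checking the divisor first avoids PHP's DivisionByZeroError, which the "/"
and "%" operators would otherwise throw.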

* Remove dead getNativeValue() method from AST value nodes

The getNativeValue() method was defined in the abstract ValueNode class and
implemented in all 6 concrete subclasses (BoolNode, FloatNode, IntegerNode,
NullNode, RegexNode, StringNode), but was never called anywhere in the
codebase. The evaluate() method already returns the native PHP value directly,
making getNativeValue() entirely redundant.

* Simplify token system: replace 27 token classes with GenericToken + TokenKind enum

- Created TokenKind enum with all token types
- Created GenericToken class backed by TokenKind to replace single-purpose token classes
- Added abstract getKind(): TokenKind to BaseToken
- Made getType() concrete in BaseToken, deriving TokenType from getKind()
- Removed isOfType() method (unused)
- Removed kindToType() from TokenFactory (duplicated in BaseToken::getType())
- Updated canBeIgnored() to use getKind() directly
- Updated ExpressionFactory to use TokenKind matching instead of instanceof
- Removed 27 old token class files
- Updated all tests accordingly

All 278 tests pass.
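
A reduced sketch of the GenericToken + TokenKind design. The enum cases and
the TokenType grouping shown here are a small assumed subset, chosen only to
show how getType() can be derived from getKind():

```php
<?php
declare(strict_types=1);

// Assumed subset of token kinds.
enum TokenKind
{
    case Integer;
    case Variable;
    case Plus;
    case Equal;
    case Space;
}

// Assumed coarse grouping of kinds.
enum TokenType
{
    case Value;
    case Operator;
    case Whitespace;
}

abstract class BaseToken
{
    public function __construct(protected readonly mixed $value)
    {
    }

    abstract public function getKind(): TokenKind;

    // Concrete in the base class: the type is derived from the kind.
    public function getType(): TokenType
    {
        return match ($this->getKind()) {
            TokenKind::Integer, TokenKind::Variable => TokenType::Value,
            TokenKind::Plus, TokenKind::Equal => TokenType::Operator,
            TokenKind::Space => TokenType::Whitespace,
        };
    }

    public function canBeIgnored(): bool
    {
        return $this->getKind() === TokenKind::Space;
    }

    public function getValue(): mixed
    {
        return $this->value;
    }
}

// One class for every token that needs no behaviour of its own.
final class GenericToken extends BaseToken
{
    public function __construct(private readonly TokenKind $kind, mixed $value)
    {
        parent::__construct($value);
    }

    public function getKind(): TokenKind
    {
        return $this->kind;
    }
}

$plus = new GenericToken(TokenKind::Plus, '+');
```

Callers then compare kinds (for example $token->getKind() === TokenKind::Plus)
instead of using instanceof against one class per token.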

* refactor: eliminate TokenStream God object and clean up dependency tree

- Split TokenStream into VariableRegistry, FunctionRegistry, MethodRegistry
- Made VariableRegistry mutable to avoid recreating dependency tree on each parse()
- Inlined TokenIteratorFactory in RuleEngine constructor
- Made defaultVariables a promoted constructor property
- Removed the lazy initialization flag from FunctionRegistry and MethodRegistry
- Restored readonly on RuleEngine

* Consolidate InternalFunction and InternalMethod into single InternalCallable DTO

* Remove dead code

* Fix AST cache never used (double parse) in Rule.php

The cached AST in Rule::isTrue() and Rule::result() was never actually used
because they called RuleEngine::evaluate() and RuleEngine::result() with the
raw rule string, which re-parsed it every time.

- Added RuleEngine::evaluateNode(Node) and RuleEngine::resolveNode(Node)
  to allow evaluating a pre-parsed AST without re-parsing
- Updated Rule::isTrue() to use evaluateNode() with the cached AST
- Updated Rule::result() to use resolveNode() with the cached AST
- Refactored RuleEngine::evaluate() and RuleEngine::result() to delegate
  to the new node-based methods for consistency

* Fix infinite recursion in Rule.php property hook

The get accessor on the $error property was returning $this->error,
which triggered the get accessor again, causing infinite recursion.
Since the get hook had no transformation logic, the entire property
hook was redundant and has been replaced with a simple typed property.

* Fix wrong data structure in TokenCollection: replace SplObjectStorage with ArrayObject

* Remove intersection types and empty marker interfaces

- Replace BaseToken & Operator intersection type with BaseToken in
  ExpressionFactory, ExpressionFactoryInterface, EvaluableExpression,
  and ParserException
- Remove implements Operator from GenericToken
- Remove implements Method from TokenMethod
- Delete unused empty marker interfaces: Logical, Method, Operator,
  Parenthesis, Whitespace
- Keep Value interface as it's still used with instanceof in Lexer

* Fix: Eliminate new instance creation on every function call in FunctionRegistry

Refactored FunctionRegistry to store class strings instead of Closures,
with instance caching so each function class is instantiated only once
per registry lifetime. Updated FunctionCallNode and TokenIterator to use
CallableInterface::call() instead of Closure invocation.
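
A sketch of the class-string registry with instance caching. The
call(mixed ...$args) signature, the register()/get() method names and the
Sum class are assumptions made for this example:

```php
<?php
declare(strict_types=1);

// Assumed signature; the real CallableInterface may differ.
interface CallableInterface
{
    public function call(mixed ...$args): mixed;
}

// Hypothetical function class that counts how often it is instantiated.
final class Sum implements CallableInterface
{
    public static int $instances = 0;

    public function __construct()
    {
        self::$instances++;
    }

    public function call(mixed ...$args): mixed
    {
        return array_sum($args);
    }
}

final class FunctionRegistry
{
    /** @var array<string, class-string<CallableInterface>> */
    private array $classes = [];

    /** @var array<string, CallableInterface> */
    private array $instances = [];

    /** @param class-string<CallableInterface> $class */
    public function register(string $name, string $class): void
    {
        $this->classes[$name] = $class;
        unset($this->instances[$name]); // drop any stale cached instance
    }

    public function get(string $name): ?CallableInterface
    {
        if (!isset($this->classes[$name])) {
            return null;
        }

        if (!isset($this->instances[$name])) {
            $class = $this->classes[$name];
            $this->instances[$name] = new $class(); // instantiated once
        }

        return $this->instances[$name];
    }
}

$registry = new FunctionRegistry();
$registry->register('sum', Sum::class);
$first = $registry->get('sum')->call(1, 2, 3);
$second = $registry->get('sum')->call(4, 5);
```

Storing class strings keeps registration cheap, and the cache ensures that
repeated calls reuse the same object for the lifetime of the registry.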

* Fix Lexer mutable state / reentrancy issue

Extracted all mutable scanning state ($input, $pos, $length) into a new
LexerContext value object. The Lexer no longer holds any mutable state as
instance properties - a LexerContext is created locally inside tokenize()
and passed to all private helper methods. This makes the Lexer fully
reentrant: multiple calls to tokenize() on the same Lexer instance will
not interfere with each other.

- Created src/Tokenizer/LexerContext.php with peek(), current(), advance(),
  isValid(), and startsWith() convenience methods
- Refactored Lexer.php to remove $input, $pos, $length instance properties
- All private methods now accept LexerContext instead of accessing $this->input/$this->pos/$this->length
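
A reduced sketch of the idea. The method names match the list above, but
their exact signatures are assumptions, and DigitLexer is a hypothetical toy
lexer that only extracts runs of digits:

```php
<?php
declare(strict_types=1);

final class LexerContext
{
    private int $pos = 0;
    private readonly int $length;

    public function __construct(private readonly string $input)
    {
        $this->length = strlen($input);
    }

    public function isValid(): bool
    {
        return $this->pos < $this->length;
    }

    public function current(): string
    {
        return $this->isValid() ? $this->input[$this->pos] : '';
    }

    public function peek(int $offset = 1): string
    {
        $index = $this->pos + $offset;

        return $index < $this->length ? $this->input[$index] : '';
    }

    public function advance(int $steps = 1): void
    {
        $this->pos += $steps;
    }

    public function startsWith(string $needle): bool
    {
        return substr($this->input, $this->pos, strlen($needle)) === $needle;
    }
}

// Hypothetical toy lexer: holds no scanning state between calls.
final class DigitLexer
{
    /** @return list<string> */
    public function tokenize(string $input): array
    {
        $context = new LexerContext($input); // all scanning state lives here
        $tokens = [];

        while ($context->isValid()) {
            if (ctype_digit($context->current())) {
                $tokens[] = $this->readNumber($context);
            } else {
                $context->advance();
            }
        }

        return $tokens;
    }

    private function readNumber(LexerContext $context): string
    {
        $number = '';
        while ($context->isValid() && ctype_digit($context->current())) {
            $number .= $context->current();
            $context->advance();
        }

        return $number;
    }
}

$lexer = new DigitLexer();
$a = $lexer->tokenize('12 + 345');
$b = $lexer->tokenize('6*7');
```

Since each tokenize() call builds its own context, the second call cannot
observe a leftover position or input from the first.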

* refactor: eliminate God constructor in RuleEngine by moving dependency wiring to RuleEngineBuilder

The RuleEngine constructor was a classic God Constructor - it created and wired
all internal dependencies (TokenFactory, Lexer, VariableRegistry, FunctionRegistry,
MethodRegistry, ObjectMethodCallerFactory, TokenIteratorFactory, Parser,
AstEvaluator) directly inside the constructor body.

Changes:
- RuleEngine now accepts fully-wired dependencies (Parser, AstEvaluator,
  VariableRegistry) via constructor injection
- RuleEngineBuilder.build() now handles all object graph assembly
- Rule.php updated to use builder instead of direct RuleEngine instantiation
- Updated test that directly instantiated RuleEngine with old constructor

* Refactor fragile method prefix fallback loop in ObjectMethodCaller

Replaced the do...while loop with manual index tracking in
findCallableMethod() with a cleaner two-step approach:
1. Check exact method name first with is_callable()
2. Use foreach loop over prefixes for fallback matching

This preserves the same behavior and prefix priority order
('', 'get', 'is', 'get_', 'is_') while being more maintainable
and less error-prone.
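
The two-step lookup might be sketched like this. It is written as a free
function that returns the resolved method name; the real method lives on
ObjectMethodCaller and its signature may differ, and the User class is
purely hypothetical:

```php
<?php
declare(strict_types=1);

function findCallableMethod(object $object, string $methodName): ?string
{
    // Step 1: the exact name wins (this is the '' prefix).
    if (is_callable([$object, $methodName])) {
        return $methodName;
    }

    // Step 2: fall back to the prefixes in priority order.
    foreach (['get', 'is', 'get_', 'is_'] as $prefix) {
        $candidate = $prefix . $methodName;
        if (is_callable([$object, $candidate])) {
            return $candidate;
        }
    }

    return null;
}

// Hypothetical object used to exercise the lookup.
final class User
{
    public function id(): int
    {
        return 1;
    }

    public function getName(): string
    {
        return 'nico';
    }

    public function is_admin(): bool
    {
        return true;
    }
}

$user = new User();
$exact = findCallableMethod($user, 'id');
$getter = findCallableMethod($user, 'name');
$snake = findCallableMethod($user, 'admin');
```

PHP method names are case-insensitive, so the candidate "getname" resolves
to getName() without any extra case handling.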

* Remove dead code: Expression directory, EvaluableExpression, and related test

- Removed entire src/Expression/ directory (BaseExpression, all comparison
  expression classes, ExpressionFactory, ExpressionFactoryInterface)
- Removed src/Parser/EvaluableExpression.php and EvaluableExpressionFactory.php
  which depended on the Expression classes
- Removed tests/unit/Expression/ExpressionFactoryTest.php which tested dead code

All comparison logic is handled by ComparisonNode + ComparisonOperator enum.

* Consolidate Token and TokenKind enums to reduce duplication

- Remove the redundant Token enum (src/TokenStream/Token/Token.php)
- Update TokenFactory::createFromToken() to accept TokenKind instead of Token
- Remove the 40-line tokenToKind() mapping function
- Update Lexer to use TokenKind instead of Token throughout
- Update test to use TokenKind directly

* Remove unused TokenIterator methods (getStack, getVariable, getFunction, getMethod)

These methods were only referenced in tests and never used in production code.
Also cleaned up the now-unused constructor parameters from TokenIterator and
TokenIteratorFactory, and updated RuleEngineBuilder accordingly.

* refactor: convert Lexer::tokenize() from array-based to generator-based streaming

Replace the O(n) array storage in tokenize() with a PHP generator (yield),
eliminating the need to hold all tokens in memory simultaneously.

- Remove the token array and ArrayIterator wrapping
- Replace lastTokenIsValue(array) with isTokenValue(?BaseToken)
  for O(1) context tracking instead of O(n) stack walk
- Track last non-ignorable token via a dedicated variable
- Remove unused ArrayIterator import
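
A toy version of generator-based tokenizing with O(1) context tracking.
Tokens are plain arrays here rather than BaseToken objects, the grammar
(integers, + - * / and parentheses) is an assumption, and parentheses are
lumped in with operators for brevity:

```php
<?php
declare(strict_types=1);

/** @return Generator<int, array{kind: string, text: string}> */
function tokenize(string $input): Generator
{
    $offset = 0;
    $length = strlen($input);
    $last = null; // last non-ignorable token; replaces the backwards walk

    while ($offset < $length) {
        // \G anchors the match at the current offset.
        if (preg_match('/\G(?:\s+|\d+|[-+*\/()])/', $input, $match, 0, $offset) !== 1) {
            throw new InvalidArgumentException("Unexpected character at offset {$offset}");
        }

        $text = $match[0];
        $offset += strlen($text);

        if (ctype_space($text)) {
            yield ['kind' => 'space', 'text' => $text];
            continue; // whitespace never updates $last
        }

        if (ctype_digit($text)) {
            $kind = 'number';
        } elseif ($text === '-' && !isTokenValue($last)) {
            $kind = 'unary_minus';
        } else {
            $kind = 'operator';
        }

        $last = ['kind' => $kind, 'text' => $text];
        yield $last;
    }
}

/** @param array{kind: string, text: string}|null $token */
function isTokenValue(?array $token): bool
{
    return $token !== null
        && ($token['kind'] === 'number' || $token['text'] === ')');
}

$kinds = array_column(iterator_to_array(tokenize('-1 - 2'), false), 'kind');
```

Tokens are produced one at a time as the consumer iterates, so only the
current token and $last need to be held by the lexer.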

* refactor: remove TokenIteratorFactory, remove peekRaw(), drop readonly from TokenIterator

- Remove TokenIteratorFactory (pointless abstraction — single create() method
  that just does 'new TokenIterator(...)')
- Inline TokenIterator creation directly in Parser::parse()
- Remove peekRaw() from TokenIterator (identical to current(), misleading name)
- Replace all peekRaw() calls with current() in Parser
- Remove readonly from TokenIterator (implements Iterator with mutating methods)
- Update RuleEngineBuilder and ParserTest to remove factory dependency

* Fix anti-pattern: eliminate double evaluation in isValid/getError

isValid() was duplicating the full parse+evaluate pipeline that getError()
already performed. Refactored isValid() to delegate to getError() by
checking if the error string is empty, eliminating the code duplication
while preserving identical behavior.

* Fix anti-pattern: rename TokenizerInterface abstract class to Tokenizer

The class 'TokenizerInterface' was actually an abstract class, not an
interface. Renamed it to 'Tokenizer' to accurately reflect its nature,
and changed its formerly public property to protected to fix the
encapsulation issue. Updated all references in Lexer, Parser,
RuleEngineBuilder, and ParserTest.

* Refactor: rename Tokenizer namespace to Lexer

- Create abstract Lexer class at src/Lexer/Lexer.php
- Rename concrete class to DefaultLexer at src/Lexer/DefaultLexer.php
- Update LexerContext namespace from Tokenizer to Lexer
- Update all imports and references in Parser, RuleEngineBuilder, Highlighter
- Update test files (ParserTest, LexerTest)
- Remove old src/Tokenizer/ directory
- Rename tests/unit/Tokenizer/ to tests/unit/Lexer/
- All 261 tests pass

* Refactor: Replace old Parser with recursive descent parser and consolidate token classes

- Replace the old stack-based Parser with a clean recursive descent parser
  that builds an AST directly, with documented operator precedence levels
- Consolidate 15+ single-purpose token classes into GenericToken with
  TokenKind enum for type identification
- Delete 10 unused token class files (TokenArray, TokenBoolFalse,
  TokenBoolTrue, TokenFloat, TokenFunction, TokenInteger, TokenNull,
  TokenObject, TokenRegex, TokenVariable)
- Remove dead Value marker interface
- Update all grammar method implementations to use isOfKind() instead
  of instanceof checks
- Update TokenFactory to create GenericToken instances
- Fix TokenBool to extend GenericToken
- Update tests to use GenericToken instead of deleted token classes
- All 261 tests pass with 947 assertions
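
For readers unfamiliar with the technique, a self-contained toy recursive
descent parser follows. Its grammar and precedence levels (additive, then
multiplicative, then unary minus, then primary) are assumptions and far
smaller than the library's. MiniParser and evaluateAst() are hypothetical
names, and the AST is built from nested arrays instead of Node classes:

```php
<?php
declare(strict_types=1);

final class MiniParser
{
    /** @var list<string> */
    private array $tokens = [];
    private int $pos = 0;

    public function parse(string $input): array
    {
        // Toy tokenizer: anything that is not a number or operator is skipped.
        preg_match_all('/\d+|[-+*\/()]/', $input, $matches);
        $this->tokens = $matches[0];
        $this->pos = 0;

        $node = $this->parseAdditive();
        if ($this->peek() !== null) {
            throw new RuntimeException('Unexpected token ' . $this->peek());
        }

        return $node;
    }

    // Level 1 (lowest precedence): + and -
    private function parseAdditive(): array
    {
        $left = $this->parseMultiplicative();
        while (in_array($this->peek(), ['+', '-'], true)) {
            $operator = $this->next();
            $left = ['binary', $operator, $left, $this->parseMultiplicative()];
        }

        return $left;
    }

    // Level 2: * and /
    private function parseMultiplicative(): array
    {
        $left = $this->parseUnary();
        while (in_array($this->peek(), ['*', '/'], true)) {
            $operator = $this->next();
            $left = ['binary', $operator, $left, $this->parseUnary()];
        }

        return $left;
    }

    // Level 3: unary minus
    private function parseUnary(): array
    {
        if ($this->peek() === '-') {
            $this->next();

            return ['negate', $this->parseUnary()];
        }

        return $this->parsePrimary();
    }

    // Level 4 (highest precedence): integers and parenthesised expressions
    private function parsePrimary(): array
    {
        $token = $this->next();
        if ($token === '(') {
            $node = $this->parseAdditive();
            if ($this->next() !== ')') {
                throw new RuntimeException('Expected ")"');
            }

            return $node;
        }
        if ($token !== null && ctype_digit($token)) {
            return ['int', (int) $token];
        }

        throw new RuntimeException('Unexpected token ' . var_export($token, true));
    }

    private function peek(): ?string
    {
        return $this->tokens[$this->pos] ?? null;
    }

    private function next(): ?string
    {
        $token = $this->peek();
        if ($token !== null) {
            $this->pos++;
        }

        return $token;
    }
}

function evaluateAst(array $node): int|float
{
    return match ($node[0]) {
        'int' => $node[1],
        'negate' => -evaluateAst($node[1]),
        'binary' => match ($node[1]) {
            '+' => evaluateAst($node[2]) + evaluateAst($node[3]),
            '-' => evaluateAst($node[2]) - evaluateAst($node[3]),
            '*' => evaluateAst($node[2]) * evaluateAst($node[3]),
            '/' => evaluateAst($node[2]) / evaluateAst($node[3]),
        },
    };
}

$value = evaluateAst((new MiniParser())->parse('2 + 3 * -(4 - 1)'));
```

Each precedence level is one method that calls the next-higher level, so
the call structure itself encodes operator precedence and no explicit
operator stack is needed.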

* Pass raw PHP values to callables instead of wrapping in tokens

- Change CallableInterface::call() signature from ?BaseToken to mixed
- Update CallableFunction to accept mixed token and parseParameter() return type
- Remove token wrapping in FunctionCallNode::resolveArguments()
- Remove token wrapping in MethodCallNode::resolveArguments()
- Pass raw object value to MethodRegistry::get() for non-object/non-regex types
- Update ObjectMethodCaller::call() to accept mixed params, remove getTokenValues()
- Update all 12 method and 2 function implementations to work with raw values

* Use raw PHP values

* Update README.md

* Cleanup

* Fix md format

* Update coding standard from PSR-2 to PSR-12

- Updated README.md to reference PSR-12 instead of PSR-2
- Updated .styleci.yml preset from psr2 to psr12
- Fixed 338 PSR-12 coding style violations across 109 PHP files
- Moved file-level docblocks before declare(strict_types=1) per PSR-12
- Added phpcs:ignore annotations for intentional snake_case test methods
- Added squizlabs/php_codesniffer as dev dependency
- All 261 tests continue to pass

* Update composer dependencies to latest versions

- phpunit/phpunit: ^12.3 → ^13.1 (12.3.15 → 13.1.7)
- mockery/mockery: ^1.6 (1.6.12, already up to date)
- squizlabs/php_codesniffer: ^4.0 (4.0.1, already up to date)

All transitive dependencies updated accordingly.

v0.8

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Use PHP 8.4 features (#22)

* Use PHP 8.1 features

* Use PHP 8.1 features

* Use PHP 8.1 features

* Use PHP 8.1 features

* Use PHP 8.1 features

* Use PHP 8.1 features

* Use PHP 8.1 features

* Use PHP 8.1 features

* Change email

* Update PHPUnit and fix tests

* Refactor

* Fix type

* Update for v12

0.7.1

Run only on PHP 8

0.7.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Release for PHP 8 (#20)

* Require PHP >= 8.0
* Add "not in" token type
* Separate booleans into two different tokens to fix token overriding bug
* Cleanup
* Fix phpstan issues

0.6.1

Update readme

0.6.0

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Update README.md

0.6.0-alpha

Unverified

No user is associated with the committer email.
Add support for custom object calls

0.5.7

Unverified

No user is associated with the committer email.
Require phpunit 5.4|6.0

0.5.6

Unverified

No user is associated with the committer email.
Merge branch 'master' of github.com:nicoSWD/php-rule-parser

0.5.5

Update .travis.yml