Tags: nicoSWD/php-rule-parser
Tags
Major refactor — recursive descent parser, arithmetic, unary operator… …s, and token system overhaul (#26) * Add proper AST and lexer * Refactor * Refactor * Refactor * Add highlighter * Update README.md * Add check for duplicate regex modifiers * Add support for string concatenation with "+" and addition for numeric operations * Add support for doing math operations * Add support for returning actual results instead of just true/false * Simplify highlighter * Rename classes * Add support for unary minus and logical NOT operators * Fix constructor consistency * Replace match expressions in AstEvaluator with Interpreter pattern Each AST node now implements evaluate(EvaluationContext): mixed, eliminating the two massive match expressions in AstEvaluator that violated the Open/Closed Principle. - Added EvaluationContext DTO (wraps TokenStream + TokenFactory) - Added evaluate() to Node base class - Each concrete node implements its own evaluation logic - AstEvaluator is now a thin orchestrator with zero match expressions - All 263 tests pass * Fix: eliminate duplicated node evaluation logic in AstEvaluator Made evaluate() delegate to resolve() instead of both methods independently calling $node->evaluate(). This removes the latent bug risk where one method could be modified without updating the other. * Add .DS_Store to .gitignore * Fix: replace fragile $afterValue boolean with stack-based token inspection The $afterValue boolean flag was fragile because it required manual maintenance at every token emission point. Adding new token types required remembering to update the flag, and whitespace/newline tokens implicitly relied on the previous state persisting. Replaced it with lastTokenIsValue() which walks backwards through the token stack, skipping ignorable tokens (whitespace/comments), and checks the actual token type hierarchy: - Does the token implement the Value interface? - Is it a closing parenthesis or closing array? This is more robust because: - Context is derived from actual emitted tokens, not a manual flag - New token types implementing Value automatically work - Whitespace/comments are naturally skipped via canBeIgnored() - The logic is self-documenting and testable * Fix string escape sequences not being unescaped in TokenEncapsedString The getValue() method was stripping surrounding quotes but not unescaping the content, so escape sequences like \n, \t, \, etc. were being treated as literal characters instead of their actual control character equivalents. Added an unescape() method that handles common JavaScript escape sequences: \n, \r, \t, \, ", \', $, \0. Unknown sequences are preserved as-is. Added 9 unit tests in LexerTest.php and 1 integration test in ScalarTest.php. * Fix division by zero not handled in Evaluator DivisionNode and ModuloNode now check for a zero divisor before performing arithmetic operations, following JavaScript semantics: - Division by zero returns INF, -INF, or NAN (matching JS Infinity/NaN) - Modulo by zero returns NAN (matching JS NaN) Added 5 test methods in OperatorsTest.php covering all edge cases. * Remove dead getNativeValue() method from AST value nodes The getNativeValue() method was defined in the abstract ValueNode class and implemented in all 6 concrete subclasses (BoolNode, FloatNode, IntegerNode, NullNode, RegexNode, StringNode), but was never called anywhere in the codebase. The evaluate() method already returns the native PHP value directly, making getNativeValue() entirely redundant. * Simplify token system: replace 27 token classes with GenericToken + TokenKind enum - Created TokenKind enum with all token types - Created GenericToken class backed by TokenKind to replace single-purpose token classes - Added abstract getKind(): TokenKind to BaseToken - Made getType() concrete in BaseToken, deriving TokenType from getKind() - Removed isOfType() method (unused) - Removed kindToType() from TokenFactory (duplicated in BaseToken::getType()) - Updated canBeIgnored() to use getKind() directly - Updated ExpressionFactory to use TokenKind matching instead of instanceof - Removed 27 old token class files - Updated all tests accordingly All 278 tests pass. * refactor: eliminate TokenStream God object and clean up dependency tree - Split TokenStream into VariableRegistry, FunctionRegistry, MethodRegistry - Made VariableRegistry mutable to avoid recreating dependency tree on each parse() - Inlined TokenIteratorFactory in RuleEngine constructor - Made defaultVariables a promoted constructor property - Removed lazy initialization ( flag) from FunctionRegistry and MethodRegistry - Restored readonly on RuleEngine * Consolidate InternalFunction and InternalMethod into single InternalCallable DTO * Remove dead code * Fix AST cache never used (double parse) in Rule.php The cached AST in Rule::isTrue() and Rule::result() was never actually used because they called RuleEngine::evaluate() and RuleEngine::result() with the raw rule string, which re-parsed it every time. - Added RuleEngine::evaluateNode(Node) and RuleEngine::resolveNode(Node) to allow evaluating a pre-parsed AST without re-parsing - Updated Rule::isTrue() to use evaluateNode() with the cached AST - Updated Rule::result() to use resolveNode() with the cached AST - Refactored RuleEngine::evaluate() and RuleEngine::result() to delegate to the new node-based methods for consistency * Fix infinite recursion in Rule.php property hook The get accessor on the property was returning ->error, which triggered the get accessor again, causing infinite recursion. Since the get hook had no transformation logic, the entire property hook was redundant and has been replaced with a simple typed property. * Fix wrong data structure in TokenCollection: replace SplObjectStorage with ArrayObject * Remove intersection types and empty marker interfaces - Replace BaseToken & Operator intersection type with BaseToken in ExpressionFactory, ExpressionFactoryInterface, EvaluableExpression, and ParserException - Remove implements Operator from GenericToken - Remove implements Method from TokenMethod - Delete unused empty marker interfaces: Logical, Method, Operator, Parenthesis, Whitespace - Keep Value interface as it's still used with instanceof in Lexer * Fix: Eliminate new instance creation on every function call in FunctionRegistry Refactored FunctionRegistry to store class strings instead of Closures, with instance caching so each function class is instantiated only once per registry lifetime. Updated FunctionCallNode and TokenIterator to use CallableInterface::call() instead of Closure invocation. * Fix Lexer mutable state / reentrancy issue Extracted all mutable scanning state ($input, $pos, $length) into a new LexerContext value object. The Lexer no longer holds any mutable state as instance properties - a LexerContext is created locally inside tokenize() and passed to all private helper methods. This makes the Lexer fully reentrant: multiple calls to tokenize() on the same Lexer instance will not interfere with each other. - Created src/Tokenizer/LexerContext.php with peek(), current(), advance(), isValid(), and startsWith() convenience methods - Refactored Lexer.php to remove $input, $pos, $length instance properties - All private methods now accept LexerContext instead of accessing $this->input/$this->pos/$this->length * refactor: eliminate God constructor in RuleEngine by moving dependency wiring to RuleEngineBuilder The RuleEngine constructor was a classic God Constructor - it created and wired all internal dependencies (TokenFactory, Lexer, VariableRegistry, FunctionRegistry, MethodRegistry, ObjectMethodCallerFactory, TokenIteratorFactory, Parser, AstEvaluator) directly inside the constructor body. Changes: - RuleEngine now accepts fully-wired dependencies (Parser, AstEvaluator, VariableRegistry) via constructor injection - RuleEngineBuilder.build() now handles all object graph assembly - Rule.php updated to use builder instead of direct RuleEngine instantiation - Updated test that directly instantiated RuleEngine with old constructor * Refactor fragile method prefix fallback loop in ObjectMethodCaller Replaced the do...while loop with manual index tracking in findCallableMethod() with a cleaner two-step approach: 1. Check exact method name first with is_callable() 2. Use foreach loop over prefixes for fallback matching This preserves the same behavior and prefix priority order ('', 'get', 'is', 'get_', 'is_') while being more maintainable and less error-prone. * Remove dead code: Expression directory, EvaluableExpression, and related test - Removed entire src/Expression/ directory (BaseExpression, all comparison expression classes, ExpressionFactory, ExpressionFactoryInterface) - Removed src/Parser/EvaluableExpression.php and EvaluableExpressionFactory.php which depended on the Expression classes - Removed tests/unit/Expression/ExpressionFactoryTest.php which tested dead code All comparison logic is handled by ComparisonNode + ComparisonOperator enum. * Consolidate Token and TokenKind enums to reduce duplication - Remove the redundant Token enum (src/TokenStream/Token/Token.php) - Update TokenFactory::createFromToken() to accept TokenKind instead of Token - Remove the 40-line tokenToKind() mapping function - Update Lexer to use TokenKind instead of Token throughout - Update test to use TokenKind directly * Remove unused TokenIterator methods (getStack, getVariable, getFunction, getMethod) These methods were only referenced in tests and never used in production code. Also cleaned up the now-unused constructor parameters from TokenIterator and TokenIteratorFactory, and updated RuleEngineBuilder accordingly. * refactor: convert Lexer::tokenize() from array-based to generator-based streaming Replace the O(n) array storage in tokenize() with a PHP generator (yield), eliminating the need to hold all tokens in memory simultaneously. - Remove array and ArrayIterator wrapping - Replace lastTokenIsValue(array ) with isTokenValue(?BaseToken ) for O(1) context tracking instead of O(n) stack walk - Track last non-ignorable token via variable - Remove unused ArrayIterator import * refactor: remove TokenIteratorFactory, remove peekRaw(), drop readonly from TokenIterator - Remove TokenIteratorFactory (pointless abstraction — single create() method that just does 'new TokenIterator(...)') - Inline TokenIterator creation directly in Parser::parse() - Remove peekRaw() from TokenIterator (identical to current(), misleading name) - Replace all peekRaw() calls with current() in Parser - Remove readonly from TokenIterator (implements Iterator with mutating methods) - Update RuleEngineBuilder and ParserTest to remove factory dependency * Fix anti-pattern: eliminate double evaluation in isValid/getError isValid() was duplicating the full parse+evaluate pipeline that getError() already performed. Refactored isValid() to delegate to getError() by checking if the error string is empty, eliminating the code duplication while preserving identical behavior. * Fix anti-pattern: rename TokenizerInterface abstract class to Tokenizer The class 'TokenizerInterface' was actually an abstract class, not an interface. Renamed it to 'Tokenizer' to accurately reflect its nature, and changed the property from public to protected to fix the encapsulation issue. Updated all references in Lexer, Parser, RuleEngineBuilder, and ParserTest. * Refactor: rename Tokenizer namespace to Lexer - Create abstract Lexer class at src/Lexer/Lexer.php - Rename concrete class to DefaultLexer at src/Lexer/DefaultLexer.php - Update LexerContext namespace from Tokenizer to Lexer - Update all imports and references in Parser, RuleEngineBuilder, Highlighter - Update test files (ParserTest, LexerTest) - Remove old src/Tokenizer/ directory - Rename tests/unit/Tokenizer/ to tests/unit/Lexer/ - All 261 tests pass * Refactor: Replace old Parser with recursive descent parser and consolidate token classes - Replace the old stack-based Parser with a clean recursive descent parser that builds an AST directly, with documented operator precedence levels - Consolidate 15+ single-purpose token classes into GenericToken with TokenKind enum for type identification - Delete 10 unused token class files (TokenArray, TokenBoolFalse, TokenBoolTrue, TokenFloat, TokenFunction, TokenInteger, TokenNull, TokenObject, TokenRegex, TokenVariable) - Remove dead Value marker interface - Update all grammar method implementations to use isOfKind() instead of instanceof checks - Update TokenFactory to create GenericToken instances - Fix TokenBool to extend GenericToken - Update tests to use GenericToken instead of deleted token classes - All 261 tests pass with 947 assertions * Pass raw PHP values to callables instead of wrapping in tokens - Change CallableInterface::call() signature from ?BaseToken to mixed - Update CallableFunction to accept mixed token and parseParameter() return type - Remove token wrapping in FunctionCallNode::resolveArguments() - Remove token wrapping in MethodCallNode::resolveArguments() - Pass raw object value to MethodRegistry::get() for non-object/non-regex types - Update ObjectMethodCaller::call() to accept mixed params, remove getTokenValues() - Update all 12 method and 2 function implementations to work with raw values * Use raw PHP values * Update README.md * Cleanup * Fix md format * Update coding standard from PSR-2 to PSR-12 - Updated README.md to reference PSR-12 instead of PSR-2 - Updated .styleci.yml preset from psr2 to psr12 - Fixed 338 PSR-12 coding style violations across 109 PHP files - Moved file-level docblocks before declare(strict_types=1) per PSR-12 - Added phpcs:ignore annotations for intentional snake_case test methods - Added squizlabs/php_codesniffer as dev dependency - All 261 tests continue to pass * Update composer dependencies to latest versions - phpunit/phpunit: ^12.3 → ^13.1 (12.3.15 → 13.1.7) - mockery/mockery: ^1.6 (1.6.12, already up to date) - squizlabs/php_codesniffer: ^4.0 (4.0.1, already up to date) All transitive dependencies updated accordingly.
Use PHP 8.4 features (#22) * Use PHP 8.1 features * Use PHP 8.1 features * Use PHP 8.1 features * Use PHP 8.1 features * Use PHP 8.1 features * Use PHP 8.1 features * Use PHP 8.1 features * Use PHP 8.1 features * Change email * Update PHPUnit and fix tests * Refactor * Fix type * Update for v12
PreviousNext