Commit a3f778c
ARROW-13190: [C++] [Gandiva] Change behavior of INITCAP function
The current behavior of the INITCAP function is to uppercase the first character of each word and leave the remaining characters unchanged.
The desired behavior is to uppercase the first letter of each word and lowercase the remaining letters. Any character other than a [lowercase letter](https://www.compart.com/en/unicode/category/Ll), an [uppercase letter](https://www.compart.com/en/unicode/category/Lu), or a [decimal number](https://www.compart.com/en/unicode/category/Nd) should be treated as a word separator.
This behavior is based on the following database systems:
- [Oracle](https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions065.htm)
- [Postgres](https://w3resource.com/PostgreSQL/initcap-function.php)
- [Redshift](https://docs.aws.amazon.com/redshift/latest/dg/r_INITCAP.html)
- [Splice Machine](https://doc.splicemachine.com/sqlref_builtinfcns_initcap.html)
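
As a rough illustration of the rule described above, here is a minimal ASCII-only sketch in C++. It is not the code from this commit: `initcap_ascii` is a hypothetical name, and the sketch approximates the Unicode categories Ll, Lu, and Nd with the ASCII ranges `a-z`, `A-Z`, and `0-9`. Every other byte is treated as a word separator and copied through unchanged.

```cpp
#include <string>

// Hypothetical helper for illustration only; not part of Gandiva.
// Assumption: ASCII input. Letters and digits form words; anything else
// separates words.
std::string initcap_ascii(const std::string& input) {
  std::string out;
  out.reserve(input.size());
  bool at_word_start = true;  // true before the first character of a word
  for (char c : input) {
    const bool is_lower = c >= 'a' && c <= 'z';
    const bool is_upper = c >= 'A' && c <= 'Z';
    const bool is_digit = c >= '0' && c <= '9';
    if (is_lower || is_upper || is_digit) {
      if (at_word_start && is_lower) {
        c = static_cast<char>(c - 'a' + 'A');  // first letter: uppercase
      } else if (!at_word_start && is_upper) {
        c = static_cast<char>(c - 'A' + 'a');  // later letters: lowercase
      }
      at_word_start = false;
    } else {
      at_word_start = true;  // separator: the next letter starts a new word
    }
    out.push_back(c);
  }
  return out;
}
```

Under these assumptions, `initcap_ascii("hello WORLD")` yields `"Hello World"`, whereas the previous behavior would have left `WORLD` untouched. A digit counts as part of a word, so `"1x"` stays `"1x"`.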
Closes apache#10604 from anthonylouisbsb/fixbug/fix-initcap-behavior and squashes the following commits:
68a4399 <Anthony Louis> Change call to get_char_len
8e05abe <Anthony Louis> Add force inline option for MSVC compiler
9146c01 <Anthony Louis> Remove GANDIVA_EXPORT for helper functions
ca0b0d0 <Anthony Louis> Add FORCE_INLINE in functions
1f4cfc7 <Anthony Louis> Add tests to modified letters
4a1a584 <Anthony Louis> Add more tests for other characters groups
32a2c2d <Anthony Louis> Fix java tests for function
4445e51 <Anthony Louis> Fix tests after changes in function
faa2169 <Anthony Louis> Change comments for is space
c98db7a <Anthony Louis> Change initcap function behavior
Authored-by: Anthony Louis <anthony@simbioseventures.com>
Signed-off-by: Praveen <praveen@dremio.com>
1 parent 5fcd4d5 commit a3f778c
4 files changed
Lines changed: 80 additions & 47 deletions
File tree
- cpp/src/gandiva
- java/gandiva/src/test/java/org/apache/arrow/gandiva/evaluator
Lines changed: 4 additions & 4 deletions