-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Output custom object with headers when no values are present #25096
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…wershell into fixNoValueCSVInput
| if (values.Count == 1 && string.IsNullOrEmpty(values[0]) && !(_cmdlet is ConvertFromCsvCommand)) | ||
| { | ||
| // skip the blank lines | ||
| // Skip empty lines when Import-CSV but keep empty lines when ConvertFromCSV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug (#24698) affects both ConvertFrom-Csv and Import-Csv; I think the fix should be applied to both commands for consistency. ConvertFrom-Csv was used throughout the original issue for demonstration purposes only.
Do we know if there's a specific reason why the check for values.Count == 1 && string.IsNullOrEmpty(values[0]) was included in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comes from Windows PowerShell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I disable it for Import-CSV as well, `n's are captured as an extra element. Is this the behavior we want?
Describe "Import-Csv with different newlines" -Tags "CI" {
It "Test import-csv with '<name>' newline" -TestCases @(
@{ name = "CR"; newline = "`r" }
@{ name = "LF"; newline = "`n" }
@{ name = "CRLF"; newline = "`r`n" }
) {
param($newline)
$csvFile = Join-Path $TestDrive -ChildPath $((New-Guid).Guid)
$delimiter = ','
"h1,h2,h3$($newline)11,12,13$($newline)21,22,23$($newline)" | Out-File -FilePath $csvFile
$returnObject = Import-Csv -Path $csvFile -Delimiter $delimiter
$returnObject.Count | Should -Be 2
$returnObject[0].h1 | Should -Be 11
$returnObject[0].h2 | Should -Be 12
$returnObject[0].h3 | Should -Be 13
$returnObject[1].h1 | Should -Be 21
$returnObject[1].h2 | Should -Be 22
$returnObject[1].h3 | Should -Be 23
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the behavior we want?
No, it isn't.
This is more about ensuring whatever fix is chosen for #24698 is applied to both ConvertFrom-Csv and Import-Csv. In other words, the example CSV data below should deserialize consistently, irrespective of the source being a file or a string.
"P1"
""
""
""Can #24698 be fixed for both commands without introducing the newline issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking it can be deserialized consistently but would result in just one empty string value per header.
For example,
"P1"
""
""
""
# Would produce [pscustomobject] @{ P1 = '' } ignoring the other empty cells@'
"P1","P2"
"",
"",
'@ | ConvertFrom-Csv
# Would produce [pscustomobject] @{ P1 = ''; P2 = '' } ignoring the other empty cells againThis way we can produce a header only output consistently.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not what I had in mind when I filed the issue. For each non-empty line in input data, an object should be emitted by ConvertFrom-Csv/Import-Csv.
The following (using either command) produces 3 objects:
@'
"P1"
"a"
"b"
"c"
'@ | ConvertFrom-CsvTherefore, the following (using either command) should also produce 3 objects:
@'
"P1"
""
""
""
'@ | ConvertFrom-CsvLikewise, this produces 2 objects, because the empty line is correctly ignored:
@'
"P1"
"a"
"c"
'@ | ConvertFrom-CsvTherefore, this should also produce 2 objects:
@'
"P1"
""
""
'@ | ConvertFrom-Csv|
@jshigetomi Root of the issue in ParseNextRecord method. It does not distinguish between an input string without characters and a string with two double quotes PowerShell/src/Microsoft.PowerShell.Commands.Utility/commands/utility/CsvCommands.cs Lines 1609 to 1613 in bfdc498
Here PowerShell/src/Microsoft.PowerShell.Commands.Utility/commands/utility/CsvCommands.cs Lines 1390 to 1394 in bfdc498
But we get to Zugzwang - sometimes we want to, sometimes we don't. Of course, it would be possible to add more conditions, but this would require a more significant code change in same methods. My suggestion is not to complicate the code. If there are concerns, then make this change an experimental feature. |
|
@jshigetomi I checked my guess. I had to make a few more changes. You can see the result here: |
|
@iSazonov I'm seeing test failure for Import-CSV Maybe a warning to the user would be better for no value csvs? |
|
I don't see errors in my branch. |
|
Hmm I'm having some trouble reproducing your results. How are you resolving the brackets here? |
|
@jshigetomi Please look #25120 |
|
/azp run |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
@surfingoldelephant @iSazonov I added the proposed tests but there seems to be a slight difference in behavior between I think the test cases where the CSV is written to a file then read from the file using |
|
@jshigetomi I believe that's because the test file has an empty line at the end (default You should see the same result with # Import-Csv:
$file = [IO.Path]::GetTempFileName()
@'
"P1","P2"
,
a,
'@ | Set-Content -LiteralPath $file # Includes empty line at end
$result = Import-Csv -LiteralPath $file
$null -eq $result[1].P2 # False
# ConvertFrom-Csv:
$result = @'
"P1","P2"
,
a,
'@ | ConvertFrom-Csv
$null -eq $result[1].P2 # FalseNote this is also the existing behavior (without this PR's fix). If the same CSV data is used without an empty line at the end, both commands should deserialize the last cell as # Import-Csv:
$file = [IO.Path]::GetTempFileName()
@'
"P1","P2"
,
a,
'@ | Set-Content -LiteralPath $file -NoNewline # No empty line at end
$result = Import-Csv -LiteralPath $file
$null -eq $result[1].P2 # True
# ConvertFrom-Csv:
$result = @'
"P1","P2"
,
a,
'@ | ConvertFrom-Csv
$null -eq $result[1].P2 # True |
I think so too. It would be nice to test both cases with last empty line and without it. We could directly use .Net classes to create such files. One more test I think for Import-Csv if there is a source file with header only. |
Providing See the |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@surfingoldelephant @iSazonov Thanks for the pointer for Set-Content. I was not aware of the newline behavior. I made the ConvertFrom-CSV and Import-CSV tests the same and both behave the same. |
|
|
||
| Describe "Import-Csv with empty and null values" { | ||
|
|
||
| Context 'Empty CSV Fields' { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the PR is about edge cases I think about additional tests:
- only header presents without newline
- header presents with one empty newline
And more:
- empty file (
-Headerparameter presents) - file with one empty line (
-Headerparameter presents)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, we're expecting those tests to produce no output (zero custom objects).
Let's also include Import-Csv/ConvertFrom-Csv tests with multiple empty lines.
-
Header only
@' P1,P2 '@ | ConvertFrom-Csv
-
Header followed by one empty line
@' P1,P2 '@ | ConvertFrom-Csv
-
Header followed by multiple empty lines
@' P1,P2 '@ | ConvertFrom-Csv
-
Empty input and
-Headerspecified'' | ConvertFrom-Csv -Header P1, P2
-
One empty line and
-Headerspecified@' '@ | ConvertFrom-Csv -Header P1, P2
-
Multiple empty lines and
-Headerspecified@' '@ | ConvertFrom-Csv -Header P1, P2
Perhaps additional -Header tests to cover the fix:
@'
,
A1,A2
B1,B2
,
'@ | ConvertFrom-Csv -Header P1, P2
# Expected:
[pscustomobject] @{ P1 = ''; P2 = '' }
[pscustomobject] @{ P1 = 'A1'; P2 = 'A2' }
[pscustomobject] @{ P1 = 'B1'; P2 = 'B2' }
[pscustomobject] @{ P1 = ''; P2 = $null }
surfingoldelephant
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also need to take a look at no detected delimiter scenarios.
# Detected delimiter:
@'
P1,P2,P3
A1,,
B1,,
C1,,
'@ | ConvertFrom-Csv
# Current/expected behavior:
[pscustomobject] @{ P1 = 'A1'; P2 = ''; P3 = '' }
[pscustomobject] @{ P1 = 'B1'; P2 = ''; P3 = '' }
[pscustomobject] @{ P1 = 'C1'; P2 = ''; P3 = $null }# No detected delimiter:
@'
P1,P2,P3
A1
B1
C1
'@ | ConvertFrom-Csv
# Current/expected behavior:
[pscustomobject] @{ P1 = 'A1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'B1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'C1'; P2 = $null; P3 = $null }
@'
P1;P2;P3
A1
B1
C1
'@ | ConvertFrom-Csv -Delimiter ';'
# Current/expected behavior:
[pscustomobject] @{ P1 = 'A1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'B1'; P2 = $null; P3 = $null }
[pscustomobject] @{ P1 = 'C1'; P2 = $null; P3 = $null }Notice when there is no (detected) delimiter, the resultant value is always $null. Can we add tests for this?
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run PowerShell-Windows-Packaging-CI, PowerShell-CI-linux-packaging |
|
Azure Pipelines could not run because the pipeline triggers exclude this branch/path. |


PR Summary
Fixes: #24698
This pull request includes changes to improve the handling of empty and null values in the
ConvertFrom-Csvcommand and updates the corresponding unit tests to verify the new behavior. The most important changes include modifying theImportmethod inCsvCommands.csand adding new unit tests inConvertFrom-Csv.Tests.ps1.Enhancements to
ConvertFrom-Csvcommand:src/Microsoft.PowerShell.Commands.Utility/commands/utility/CsvCommands.cs: Updated theImportmethod to skip empty lines only when usingImport-CSV, but to keep them when usingConvertFromCSV.Unit tests updates:
test/powershell/Modules/Microsoft.PowerShell.Utility/ConvertFrom-Csv.Tests.ps1: Added new unit tests to verify thatConvertFrom-Csvhandles empty and null values correctly.PR Context
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.- [ ] Issue filed:
(which runs in a different PS Host).