-
Notifications
You must be signed in to change notification settings - Fork 8.1k
Simplify getting Encoding in TranscriptionOption.FlushContentToDisk
#13910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify getting Encoding in TranscriptionOption.FlushContentToDisk
#13910
Conversation
Utils.GetEncoding was using a complicated implementation that could be easily replaced by StreamReader.GetCurrentEncoding. Used that implementation in the FlushContentToDisk method
|
@iSazonov I think that's what you had in mind. I didn't remove the GetEncoding method from Utils because I thought it was still cleaner to have that code outside of the FlushContentToDisk method, but it's greatly simplified. I've removed the Encoding property and simply call the GetEncoding method at the start of the FlushContentToDisk, I don't think that should have an impact on performances and I think it's a bit cleaner than opening a file in the setter of a property. I decided to open a separate PR from the #13899 because I thought this, while related, is maybe a bit more open for discussion, hope that's OK for you. |
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
Remove the GetEncoding from Utils and two related static readonly fields. Add the default to utf8NoBom in the stream reader Put the stream reader in the try catch.
a928560 to
4390247
Compare
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
baae118 to
6c42ba2
Compare
| Encoding currentEncoding; | ||
|
|
||
| using (StreamReader reader = new StreamReader(this.Path, Utils.utf8NoBom, detectEncodingFromByteOrderMarks: true)) | ||
| { | ||
| _ = reader.Read(); | ||
| currentEncoding = reader.CurrentEncoding; | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this is actually interspersing two pieces of logic:
- Determining the encoding we want to use
- Creating the file stream writer
If that's the case, it might be nicer to factor the first into its own method, similar to the pre-existing code:
private Encoding GetPathEncoding()
{
using (var reader = new StreamReader(Path, Utils.utf8NoBom, detectEncodingFromByteOrderMarks: true))
{
_ = reader.Read();
return reader.CurrentEncoding;
}
}
internal void FlushContentToDisk()
{
lock (OutputBeingLogged)
{
try
{
// Does this dispose check need to be within the lock at all?
if (disposed)
{
return;
}
if (_contentWriter != null)
{
return;
}
try
{
_contentWriter = new StreamWriter(
new FileStream(Path, FileMode.OpenOrCreate, FileAccess.ReadWrite),
GetPathEncoding());
// Worth adding a comment here to explain what's happening
_contentWriter.BaseStream.Seek(0, SeekOrigin.End);
}
catch (IOException)
{
_contentWriter = new StreamWriter(
new FileStream(Path, FileMode.Append, FileAccess.Write, FileShare.Read),
Utils.utf8NoBom);
}
}
finally
{
OutputBeingLogger.Clear();
}
}
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I had initially, with the GetPathEncoding defined in the Utils class, but I think @iSazonov didn't like the method there.
In the end, it will be used only in this FlushContentToDisk, maybe it should be a local function? This way it is factored, but still available only in this method?
Otherwise I can put it back into Utils, in the end it's still valid, even more so if we just let the IOException bubble if there's an issue reading the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a short method is used once I don't see a point to have it (especially since it should be in
try-catch block). We could add a comment before the code block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rjmholt what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I don't like making you go around in circles. So I'm happy to leave the code as it is.
@iSazonov the current PowerShell code is full of long tracts of statements, and breaking things into methods is an easy way to get readability and also improves reusability later on. Honestly I just find it a useful tool to simplify the process for writing and reviewing code. That's where my preference comes from. And given the complexity of some parts of our codebase, I don't see much argument against it.
especially since it should be in try-catch block
This is the kind of clarity that putting something into a method provides. As is, it's not clear what the full intent and purpose of the try/catch is (why do we fall back to a totally different setting in the catch block?). Once you separate the logic into two pieces and must give them a human-readable name, it's easier to assess what each operation's error semantics should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind changing things, in the end the goal was to make the code more readable / simple, and I tend to agree that it would probably be clearer in a method, which makes it clear why we're creating this StreamReader here.
Since what seems to bother @iSazonov is the fact that the method would be used only in this method, I think it's a good candidate for a local function, it groups the logic while keeping it "inside" the method and usable only there. Do you think this would be an acceptable proposal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rjmholt Thanks for clarify!
I see your point and agree that a method with an informative name adds readability. But in this case it seemed to me unnecessary. We have a request to create a cmdlet like Get-FileEncoding but I don't think this piece of code is worth worrying about reuse.
As for this try-catch block, I was afraid that we could change something in the behavior and get regression now or in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, as discussed I did a test with a local function. I think it's a good middle ground between your two opinions, we have the code factored and with an understandable name and at the same time it's available only in the FlushContentToDisk method.
Waiting on your remarks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a local function sounds ideal 🙂
Use a local function to factor the path encoding code.
208271c to
64c1407
Compare
| OutputBeingLogged.Clear(); | ||
| } | ||
|
|
||
| Encoding getPathEncoding() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please moved the local function to start of the method.
Also please use Pascal name case - local function is de-facto a method and could be move in another scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 😄. Interesting that you prefer the local function at the start. I haven't seen any recommendation for local functions yet. Do you happen to have coding conventions available for them?
In my team we prefer to put it at the bottom, this way you have "pseudo-code" at the beginning with only the calls and you can just scroll down to see the implementation details.
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Ilya <darpa@yandex.ru>
src/System.Management.Automation/engine/hostifaces/MshHostUserInterface.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Ilya <darpa@yandex.ru>
|
This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days. |
|
@Gimly Thanks for your contribution! |
|
🎉 Handy links: |
PR Summary
Simplify getting encoding of existing file in the
FlushContentToDiskmethod of theTranscriptionOptionclass called when the transcription file is flushed to disk.PR Context
The
FlushContentToDiskmethod was calling a complexGetEncodingutils that was trying to read the file encoding "by hand". Through a discussion with @iSazonov in #13677 and #13899 it was decided that this could be simplified by using a StreamReader and taking advantage of theCurrentEncodingproperty.This allows to reduce code complexity and should also improve compatibility with different encoding. This PR should have no functional impact except improve recognition of encoding of existing files in some cases not covered by the existing code.
PR Checklist
.h,.cpp,.cs,.ps1and.psm1files have the correct copyright headerWIP:or[ WIP ]to the beginning of the title (theWIPbot will keep its status check atPendingwhile the prefix is present) and remove the prefix when the PR is ready.