Modify ConvenientLine to work with unicode#844
Conversation
This changes how `ConvenientLines` are calculated with a breaking API change. `ConvenientLine` used to assume that all content coming back was ASCII characters and a uniform byte length. This is no longer the case. Now we use the byte length coming back from libgit2 to extract the string from a buffer using the correct byte length. This will give back the correct line everytime. The breaking API change is that we no longer return the new line character with the line nor do we include that character in the `contentLen` calculation either. I'm on the fence about that change but it seems like it's what we ought to be doing instead of force all of the consumers of the library to account for that everywhere.
|
This fixes #785 |
|
cc @nodegit/owners anybody want to chime in on the line ending stuff? |
|
Seems like a pretty large breaking change, major version bump? If this brings us closer in alignment to libgit2 (I think you're saying it is since we're using the byte length returned), then I'm 👍. |
|
@tbranyen so the byte count calc is definitely in line with what we should be doing. The removal of the We might want to do a major version bump at this point anyways since it would be nice to adhere to semver and not break people's code with a release. Especially if we going to start releasing more often. So I just argued myself out of cutting off the line endings. I'll change that and push it. If there are no objections from @nodegit/owners then I'll go ahead and merge this. |
This required a modification to how we're staging lines since that was expecting no new line char at the end and would append it itself. That's no longer needed. The tests have been updated to reflect the change.
|
Wait so did this not already trim the newline here? 74c19f8#diff-3af6ec85ce097f4106d101a2f0f41c9cL71 |
|
Ah nvm, it's then manually reappended each time. Now we just get the utf8 string and don't trim that off, but don't re-add it. I'm all for it 👍 |
Modify `ConvenientLine` to work with unicode
This changes how
ConvenientLinesare calculated with a breaking API change.ConvenientLineused to assume that all content coming back was ASCII characters anda uniform byte length. This is no longer the case. Now we use the byte length coming
back from libgit2 to extract the string from a buffer using the correct byte length. This
will give back the correct line everytime.
The breaking API change is that we no longer return the new line character with the linenor do we include that character in the
contentLencalculation either. I'm on the fenceabout that change but it seems like it's what we ought to be doing instead of force all of
the consumers of the library to account for that everywhere.
Edit: I reverted the change to returning the new line characters.