Skip to content

Conversation

@upils
Copy link

@upils upils commented Nov 21, 2025

  • Have you signed the CLA?

This PR solves 2 distinct issues:

  • When the --arch option was not used when calling chisel cut, no architecture was passed down to the selector, resulting in slices using v3-essential with an Arch value to be ignored.
  • Slices using v3-essential with an Arch value where ignored during validation of the release, possibly leading to undetected conflicts (i.e. loops).

letFunny and others added 7 commits November 20, 2025 13:41
Signed-off-by: Paul Mars <paul.mars@canonical.com>
Signed-off-by: Paul Mars <paul.mars@canonical.com>
Signed-off-by: Paul Mars <paul.mars@canonical.com>
Signed-off-by: Paul Mars <paul.mars@canonical.com>
Signed-off-by: Paul Mars <paul.mars@canonical.com>
Signed-off-by: Paul Mars <paul.mars@canonical.com>
@github-actions
Copy link

github-actions bot commented Nov 21, 2025

Command Mean [s] Min [s] Max [s] Relative
BASE 8.916 ± 0.052 8.843 9.013 1.00
HEAD 8.916 ± 0.053 8.856 9.011 1.00 ± 0.01

@letFunny
Copy link
Collaborator

letFunny commented Nov 21, 2025

It is also useful to look at the processes to see how we failed to catch the first bug before releasing a new Chisel version.

How was the feature tested? Why was the bug not found?

For this feature, I did manual testing in my computer always using the --arch CLI option, which is why I did not see the bug. Because I wanted to see it tested against the real packages in case I missed anything, I asked the ROCKS team to create PR and test this feature in the CI. The CI did not find the issue because spread tests were not added to the PR and the logs were not manually inspected (they are in fact trimmed from the CI output). Lastly, there are also shortcomings in how we test Chisel because the CLI parsing and the glue is only tested when using spread with a limited set of inputs.

How was the bug found?

I only found the bug when reviewing the PR and seeing the logs where missing. I then tested it locally out of paranoia and found that the essentials were not installed.

How to prevent this from happening in the future?

In terms of processes, we should discuss adopting a strategy where new features are tested for a while before releasing a new version, similar to the "unstable" channel in snaps. At the same time, for every new feature, I think we should make QA part of the process and define the expectations more formally. For example, to create a PR using the new feature with the appropriate spread tests as a necessary step before tagging a new version from main.

@letFunny letFunny added Priority Look at me first Bug An undesired feature ;-) labels Nov 21, 2025
@letFunny letFunny requested a review from niemeyer November 21, 2025 10:41
Signed-off-by: Paul Mars <paul.mars@canonical.com>
Copy link
Contributor

@niemeyer niemeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for the fix.

Paul, please note that the summary for the PR and commit is not following our standard conventions.

@niemeyer niemeyer changed the title fix: Properly taking into account the architecture in v3-essential fix: define default arch in cut command Nov 21, 2025
Copy link
Contributor

@niemeyer niemeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, there's actually an issue worth talking about here.

logf("Selecting slices...")

if arch == "" {
return nil, fmt.Errorf(`cannot select slices: "arch" is unset`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I actually have a question here: there's something suspect about this PR, which is the whole logic was actually handling the empty arch just fine until now, and suddenly because we introduced a v3-essential feature, the behavior is changing and we're unable to let the architecture empty, while before we could. It looks like there's something being missed here. Clearly we had a default architecture before. Where was it, and why is this PR not touching that, and also why isn't that place collaborating with the existing fix?

Copy link
Collaborator

@letFunny letFunny Nov 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arch was not needed prior to v3-essential because the dependency set that Select would generate would not depend on it. When we introduced v3-essential we changed the signature of this function so the user needs to supply an arch. Paul and me discussed today whether selecting slices with an empty arch makes sense or not and we think it doesn't.

We also discussed calling deb.InferArch here which might be the default you are hinting to. In general we thought it was better to return an error and let the upper layer either infer the arch or to supply something else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arch was not needed prior to v3-essential because

If that was true we would not have an arch parameter in select. It's hard to believe it's taken in, and not taken into account for anything. If it is taken into account for something, that something is now working differently from before.

Maybe that's okay, but this is ignoring the point I made instead of addressing it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prior to v3-essential, Select was not taking in the arch parameter (see https://github.com/canonical/chisel/pull/246/files#diff-f61b5f82e446ae57f46ae7c811a56addc74662f37f0dd6440c99c5d869b651d1R465), because the order function was not either.

This parameter was added so the arch-specific essentials could be ignored if the arch was not matching. However the implementation of #246 did not consider the case of an empty arch parameter, which happens during the release validation. And this led to not considering slices with v3-essential with a Arch entry. The fix is also addressing this issue (see second bullet point of the PR description).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously the default arch was "", and now its the result of InferArch which returns non-empty string or error: https://github.com/canonical/chisel/pull/256/changes#diff-8867bd2aaddbb2642dd6d8fee78a80c836af5eaabbd059da7fe716a378603203R61-R68

having said that, this feels like a great place/use of enums

logf("Selecting slices...")

if arch == "" {
return nil, fmt.Errorf(`cannot select slices: "arch" is unset`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously the default arch was "", and now its the result of InferArch which returns non-empty string or error: https://github.com/canonical/chisel/pull/256/changes#diff-8867bd2aaddbb2642dd6d8fee78a80c836af5eaabbd059da7fe716a378603203R61-R68

having said that, this feels like a great place/use of enums

}
arch := cmd.Arch
if arch == "" {
darch, err := deb.InferArch()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: this is returning the debArch field of archPair is that the correct one to use? maybe in the future we should do an internal refactor for goArch and debArch to be enums to make it clear which one we're using where.. 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This debArch is then used as a filter for values defined in path so this is the one we want. The purpose of deb.InferArch() is to return the current runtime deb architecture. The fact it is inferred from the GOARCH is an implementation details I think.

What do you think if the source of confusion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, just when reading the code there is both goArch and debArch, which are both strings so its not explicit which one should go where, and the name of InferArch does not say which one it returns, so i could imagine someone making a mistake of thinking it returns the other one -- something which would not be caught by the compiler


if test.selslices != nil {
selection, err := setup.Select(release, test.selslices, "")
selection, err := setup.Select(release, test.selslices, "amd64")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

magic value. should be a constant imo.

these are tests, so its maybe less important here, but nonetheless it does introduce a notion of a default architecture. maybe we should define that and the return it here:
https://github.com/canonical/chisel/pull/256/changes#diff-8867bd2aaddbb2642dd6d8fee78a80c836af5eaabbd059da7fe716a378603203R65 ??

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think it introduces the notion of a default architecture from the point of view of setup.Slice().
Here in the test we use a value coherent with our tests cases. We set amd64 because that was previously the implicit value for the test cases.

The behavior of falling back to inferring the arch is consistent with how the release is inferred in obtainRelease. There is no "default value" but a heuristic to get a value from the building environment if none was explicitly provided (I see a nuance here, let me know if this is not clear).

})

if test.arch == "" {
test.arch = "amd64"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see this comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug An undesired feature ;-) Priority Look at me first

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants