component/module refactoring pass #9211
Conversation
missed one detail. Not a blocker - just nice to have
lgirdwood
left a comment
@marcinszkudlinski good to merge? It looks like the get-by-ID call can be added incrementally?
@lgirdwood it can be added later, also in #8919, so we can merge
@andyross also a build failure reported by CI (testbench build) but maybe in others too.
@lgirdwood
The build failure is real, just something that doesn't build in any of the targets I poked (FWIW: is there a simple way to "run CI" on a local build host?). Need to swap hardware this morning (juggling AEC and a big array of MTK boards for Zephyr work and the desk is running out of space) and will have a fix up, along with the API additions requested.
Although I wouldn't use it on a corp machine.
Oh sure, but that requires knowing the right GitHub workflow to run. And I can read .github/workflows anyway. Both of those are ~= in complexity to just reading the build failure in CI anyway. What I meant was something in the SOF source tree like "scripts/run-ci.sh" or whatever that developers can use to easily check a PR before submission. Like, in Zephyr, almost[1] everything[2] you want to test is exercised by just running "scripts/twister" with no arguments. Would be nice to have a similar smoke test in SOF, though the needed tooling setup would be a lot of complexity.

[1] Well, except for stuff that requires tooling not distributed in the SDK, so it doesn't do the doc build or renode runs or clang builds. There's an increasingly long list of "Oh, yeah, that" cases in CI failures, but really that's a bug in the SDK strategy and not the tooling.

[2] With the caveat that a bare twister run now takes upwards of an hour and a half on a 64-core build box. That too is a real bug, but not with the tooling. Enumerating the cartesian product of "Tests X Platforms" is running into severe scaling problems right now.
@marc-hb is there a script for something like this? I know a lot of the CI just invokes scripts, might not be the worst to have a user ready one |
Force-pushed 5350c7a to 35d25b1
Think this should be good now? The checkpatch failure is a false positive, it's complaining (legitimately, though the failure was already present) about multiply expanded arguments in a macro that merely got a term refactored. The sparse warnings seem to be entirely about upstream Zephyr code, I don't think I see anything specific to this PR and am pretty sure that's just how they look. The "zmain" failures are indeed screaming about the change to the fatal error handling that just merged; that does need to be fixed but obviously not in this PR.
@andyross in Internal Intel CI System/merge/builds, when we try to bind peakvol with kpb we get an FW exception, Module: 5
I don't know how to interpret that? Is there a stack trace or fault address or something you can match to the output binary? Is there a way I can reproduce the test, or at least see how it's built? Is it built entirely from code in the tree or is there more source code in your test I should be looking for? Really this shouldn't be changing anything behaviorally. App-visible changes are limited to:
Yeah, I'm a little worried here. Looking at that dump, none of the notable strings (e.g. "FwExceptionNotification", "Proc_domain", "Extended_init") appear in the SOF tree at all. I'm worried this is not just a test, but a big blob of out-of-tree source code?
This is an exception: CPU 0 EXCCAUSE 13 (load/store PIF data error). FW log from this test and map file attached for backtrace decoding.
That PC is in ipc_comp_connect, but the map file has no disassembly. Seems reasonable that it's instantiating a kpb component given the logs. Is that not exercised by any tests I can run locally? And can you repeat that build with CONFIG_OUTPUT_DISASSEMBLY=y and provide the resulting zephyr.lst file? (I hate that Zephyr turned this off by default; it takes time in CI but is irreplaceable for stuff like this.) You can also get it with "objdump -D" on zephyr.elf (the objdump that matches the toolchain in use, obviously).
Also when you guys have bandwidth, it would be helpful to try the test with each of the final two patches in this PR, which both stand alone, to see which is causing the failure (the first is just dead code removal and surely uninvolved -- it would cause a build failure, not a runtime failure, if something was using it).
@andyross dstmod is NULL, and is being dereferenced on the very next line. EDIT:
well... it just has. It was working only because KPB used dev->priv_data as it was originally intended to be used - as a container for a private data set - and the pointer wasn't NULL. And it was mapped to struct processing_module :(
@andyross I'll provide a fix for this. |
Ouch. So if I'm understanding you:
Ouch, again. So it's actually a nice catch for that patch, I guess, the only issue being that the test that detected the failure was out-of-tree. This doesn't seem like something that should be in a vendor test? Any chance you can commit that somewhere so it can be exercised by other folks? I'd totally have fixed this myself if I'd seen it, I promise. :) Regardless: any possibility of merging this ahead of the kpb (or test? seems like the real bug is wherever that kpb is getting created, not in the device code itself) fix and filtering/whitelisting the failure on your end, given that it's been root caused and isn't an in-tree test? |
This is dead code: just a wrapper around module_adapter_new(), which is a public API already in use. Presumably a forgotten relic. Remove. Signed-off-by: Andy Ross <andyross@google.com>
The current SOF architecture splits a "module" across two structs, the legacy comp_dev and a newer struct processing_module. The latter contains a pointer to the former, but in many places existing code has needed a backpointer to recover the module from a component. So far this has abused the drvdata mechanism to store this as a void*, but that's fragile as many components are already (!) using drvdata for other purposes and will clobber the setting. The fact that it worked seems to be just luck. That pointer is for the exclusive use of the comp_driver code associated with a component to store its own data. We can't be touching it from the global module code. Just give the pointer a properly-typed field of its own and make sure the two are initialized in tandem. Longer term, SOF really needs to fix this bifurcation and unify the two structs. Signed-off-by: Andy Ross <andyross@google.com>
The pipeline_id has historically been part of the comp_buffer struct, but that is being deprecated as a public API. So move it down into the contained sof_audio_stream_params struct where it can be found by new style sink/source code. Note that the actual value of the pipeline ID is a little ambiguous: on IPC3, the buffer is defined by the user in the .tplg file as part of a specific pipeline with a known ID. With IPC4 topology, the buffers are implicitly created and will be assigned the ID of their source (!) component. It is legal to define a connection across two pipelines, and there's no ability here to recover both pipeline IDs. Signed-off-by: Andy Ross <andyross@google.com>
Expose the new pipeline_id field in sof_audio_stream_params to a cleaner sink/source API for use by module code. Longer term this may want to be indirected by newer backends. Signed-off-by: Andy Ross <andyross@google.com>
Force-pushed 35d25b1 to 029758c
Easy enough, seems clean to me in cyclic review. Everyone else should check the new patch too. Note that I left it in place at the end of the series rather than re-refactor it twice, which is technically a bisection issue because the crash is introduced by an earlier patch. I argue that's OK since the crash isn't visible using in-tree tests, and because the root cause bug was pre-existing, it just wasn't a crash.
kv2019i
left a comment
Looks great! Some typos, but nothing to warrant an update of the patch series, so if nobody objects, let's merge this today.
The legacy API that is not using module_adapter is now deprecated, but there are still some modules that use it, so all common code must work properly with both types of modules. This commit fixes a crash in the bind operation when binding a legacy module. There are also some comments added in potentially similar places, but where legacy modules cannot be used. Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
Force-pushed 029758c to d38f9cd
Took the liberty of fixing up @marcinszkudlinski's comment per review above; hope that's OK. Also rebased.


Some API refactoring, no behavioral changes.