-
Notifications
You must be signed in to change notification settings - Fork 349
Cross core Virtual heaps, invalidation of cross core buffers in unbind call #10044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cross core Virtual heaps, invalidation of cross core buffers in unbind call #10044
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements a fix for cross‐core virtual heaps and buffer invalidation during unbind calls by refactoring the virtual heap management and converting all module binding/unbinding functions to use strongly typed structures (bind_info and unbind_info). In addition, a new cache invalidation hook is added for ring buffers to ensure proper data consistency on cross‐core operations.
- Refactored virtual heap allocation to use a single pointer rather than a per-core array.
- Updated API signatures across the codebase (including IDC, component, module adapter, tester, and copier layers) to use bind_info/unbind_info structures.
- Introduced on_bind/on_unbind hooks for source/sink APIs and added cache invalidation in the ring buffer unbind function.
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| zephyr/lib/alloc.c | Refactored virtual heap pointer from an array to a single pointer for cross-core memory management. |
| zephyr/include/sof/lib/regions_mm.h | Updated VMH heap API signatures and removed the core_id field from the vmh_heap structure. |
| src/ipc/ipc4/helper.c | Changed binding functions to use bind_info/unbind_info structures in place of void pointers. |
| src/include/sof/audio/module_adapter/module/generic.h | Updated module bind/unbind function signatures to use structured bind_info/unbind_info parameters. |
| src/include/sof/audio/component_ext.h | Modified component extension API functions to use the new binding/unbinding parameter types. |
| src/include/sof/audio/component.h | Changed component operation function pointers to accept bind_info/unbind_info types. |
| src/include/module/module/interface.h | Updated module interface function signatures for bind/unbind to support the new structures. |
| src/include/module/audio/source_api.h | Added on_bind/on_unbind hooks and a bound_module field for tracking source API bindings. |
| src/include/module/audio/sink_api.h | Added on_bind/on_unbind hooks and a bound_module field for sink API bindings. |
| src/idc/idc.c | Updated IDC binding functions to cast payloads into bind_info/unbind_info types. |
| src/debug/tester/tester.h | Modified tester interface bind/unbind function signatures to utilize bind_info/unbind_info. |
| src/debug/tester/tester.c | Updated tester binding functions according to the new API signature. |
| src/audio/module_adapter/module_adapter_ipc4.c | Adapted module adapter bind/unbind functions to use the new structured parameters. |
| src/audio/module_adapter/module/generic.c | Revised module bind/unbind implementations to include source/sink hook calls with bind_info/unbind_info. |
| src/audio/mixin_mixout/mixin_mixout.c | Updated mixout binding functions to operate with the new bind_info/unbind_info types. |
| src/audio/kpb.c | Modified KPB binding functions to use the new structured parameters (note an incorrect debug log). |
| src/audio/dai-zephyr.c | Updated DAI unbind signature to accept unbind_info instead of a generic void pointer. |
| src/audio/copier/dai_copier.h | Revised DAI copier unbind signature for consistency with new unbind_info parameter type. |
| src/audio/copier/copier.c | Updated copier binding/unbinding functions to use bind_info/unbind_info structures. |
| src/audio/buffers/ring_buffer.c | Introduced an on_unbind hook that invalidates cache for ring buffers upon disconnection. |
Comments suppressed due to low confidence (1)
src/audio/kpb.c:392
- The debug message in the kpb_unbind() function incorrectly states 'kpb_bind()'. It should be updated to 'kpb_unbind()' to accurately reflect the operation.
comp_dbg(dev, "kpb_bind()");
|
|
||
| if (bind_data->as_source) { | ||
| ret = source_bind(bind_data->as_source, mod); | ||
| if (ret) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't need to unbind sink here? And in the potential error case below? Same in module_unbind()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, sink/src API is specific to a single component only. "Bind" for sink/source means a connection to a single component and nothing more.
Bind/unbind will be called separately for both components in question, each call on proper core in case of cross core connection. So will be bind/unbind for sink/src APIs
As to error patch: it will NEVER be situation where as_source != NULL && as_sink!=NULL
maybe it will be better to do this in more stright way:
struct bind_info {
struct ipc4_module_bind_unbind *ipc4_data;
bool is_source; // will think about better name
union {
struct sof_source *source;
struct sof_sink *sink;
}
}
what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcinszkudlinski aha, so as_source and as_sink are mutually exclusive, they cannot both be non-NULL at the same time? Then yes, would be good to make that explicit, at least with an else. And yes, using a union would be another possibility too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api changed
| * core that have been using sink API | ||
| */ | ||
| ring_buffer_invalidate_shared(ring_buffer, ring_buffer->_data_buffer, | ||
| ring_buffer->data_buffer_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be possible to also free the buffer here? might need to unbind it from the other side too if not done yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea. Now free is performed when a module is deleted.
OOOh! just got an idea! It has always been a memory leak!
Bind (create bufer) -> unbind (do nothng - don't delete) --> bind (create another)....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcinszkudlinski No, I don't think it's called at all. In IPC3, there's just a call to pipeline_disconnect() (for both buffers connected to a component) and this internally calls buffer_detach().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I checked, buffer free is called in unbind.
It wasn't till this PR because "unbind" call was ignored in most cases
ret = comp_unbind(src, &unbind_data);
unbind_data.bind_type = COMP_BIND_TYPE_SOURCE;
unbind_data.source = audio_buffer_get_source(&buffer->audio_buffer);
ret1 = comp_unbind(sink, &unbind_data);
ll_unblock(cross_core_unbind, flags);
buffer_free(buffer);
zephyr/lib/alloc.c
Outdated
| tr_err(&zephyr_tr, "Unable to init virtual heap for core %d!", core); | ||
|
|
||
| virtual_buffers_heap[core] = heap; | ||
| virtual_buffers_heap = vmh_init_heap(&static_hp_buffers, MEM_REG_ATTR_CORE_HEAP, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, wasn't the actual meaning of CORE_HEAP that it was per core? Maybe it isn't needed any more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
obsolete name - used to be per heap, now is global, Will change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to MEM_REG_ATTR_SHARED_HEAP
MEM_REG_ATTR_* are defined in Zephyr. Zephyr creates one virtual memory space per each core, one as "shared", one for other purposes (MEM_REG_ATTR_OPPORTUNISTIC_MEMORY ??).
Anyway,
- all spaces are pure virtual and don't use any physical memory banks till memory is allocated, so they can stay in Zephyr.
- In Zephyr all regions are treated in the very same way - in fact no difference at all which region is used
- SOF used to use 3 "per core" areas, now is using one "shared"
- "shared" size is 0x100000 == 1MB, and it is hard-coded in zephyr (should be enough)
maybe it is worth to introduce some kconfig options enabling regions and adjusting sizes, but it requires a PR to Zephyr.
zephyr/test/vmh.c
Outdated
|
|
||
| struct vmh_heap *heap = vmh_init_heap(NULL, virtual_memory_region[i].attr, | ||
| i, true); | ||
| struct vmh_heap *heap = vmh_init_heap(NULL, virtual_memory_region[i].attr, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but here in line 239/238 num_regions still counts all cores, which then becomes the iteration count for this loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcinszkudlinski this seems to still be open - this shouldn't be called in a loop any more, should it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not? All of virtual memory tables are still in Zephyr, This test uses all of them and this is fine.
There's no kconfig option in zephyr that turns off any of the virtual memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcinszkudlinski how will it use all of them now, that you've modified vmh_init_heap() to only use heap 0 for all core heaps?
kv2019i
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nits inline. I think the major issues related to last commit and whether all logic related to per-core heaps are removed (see Guennadi's comment to vmh.c).
This is quite a bit change, but at least I welcome making virtual heap usable across cores.
| } | ||
|
|
||
| comp_err(dev, "No sink buffer found for src_queue = %u", src_queue_id); | ||
| return -ENODEV; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, Something odd with all the commit messages. They are line-wrapped to under 50 columns...? Look a bit odd in git history compared to other commits in SOF.
| * core that have been using sink API | ||
| */ | ||
| ring_buffer_invalidate_shared(ring_buffer, ring_buffer->_data_buffer, | ||
| ring_buffer->data_buffer_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcinszkudlinski No, I don't think it's called at all. In IPC3, there's just a call to pipeline_disconnect() (for both buffers connected to a component) and this internally calls buffer_detach().
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good improvements - almost ready, a few opens still.
52fab21 to
08e2284
Compare
|
Step 1 - change bind/unbind API to be more clear |
08e2284 to
d861bfc
Compare
|
step2 - rebase & conflict resolve |
| /* in case of disconnection, invalidate all cache. This method is guaranteed be called on | ||
| * core that have been using sink API | ||
| */ | ||
| ring_buffer_invalidate_shared(ring_buffer, ring_buffer->_data_buffer, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name of this function is actually somewhat confusing: usually in SOF "shared" means "uncached"
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense to use the enum for sink/source type but test results from @jsarha are passing so fix is validated !
e5a1b79 to
e208721
Compare
lyakh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
concerning full testing of the Zephyr memory blocks API - I think we need a twister test for that. Maybe test/vmh.c could be moved to Zephyr, that would improve testing, I think
zephyr/lib/regions_mm.c
Outdated
| i < CONFIG_MP_MAX_NUM_CPUS + VIRTUAL_REGION_COUNT; i++) { | ||
| /* Search for matching attribute so we place heap on shared virtual region */ | ||
| for (i = CONFIG_MP_MAX_NUM_CPUS; | ||
| i < CONFIG_MP_MAX_NUM_CPUS + VIRTUAL_REGION_COUNT; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like this would break with zephyrproject-rtos/zephyr#91666 . I think we need a Zephyr function to find a region by the attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true! CONFIG_MP_MAX_NUM_CPUS + VIRTUAL_REGION_COUNT as a table size - instead of a single macro - should have never been merged :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can proceed with Zephyr changes
- I'll add a function finding a region by attribute and core num to zephyr.
- I'll create a PR to SOF using the zephyr function
- PR must be merged together with moving to new zephyr head
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sys_mm_drv_query_memory_regions is a standard zephyr function. The API assumes that at the end the array of regions should be freed via sys_mm_drv_query_memory_regions_free. The end of the array is marked with an empty element, so the loop doesn't have to rely on knowledge of its size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcinszkudlinski do you want to do the Zephyr changes incrementally, i.e. we merge this and give it a a few days under CI testing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this PR should go first, it works with current zephyr
than, zephyr change should be merged
as the last step - move SOF to new zephyr + changes in SOF (in one PR)
Currently bind api call passes IPC4 header only. This commit extends API by adding pointers to sink/source of the entity that is being bound. Note that in pipeline 2.0 there's a possibility to bind a module to a module, therefore pointers to sink/src need to be provided instead of a pointer to a buffer Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
Currently unbind api call passes IPC4 header only. This commit extends API by adding pointers to sink/source of the entity that is being bound. Note that in pipeline 2.0 there's a possibility to bind a module to a module, therefore pointers to sink/src need to be provided instead of a pointer to a buffer Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
This commit adds bind/unbind hook to sink/source API It is guaranteed that the hooks will be called on core that the API will be / was used on, what makes them a perfect place for cache operations needed at start/end of usage. Note that free method is not a good place for this, as in case of shared buffers it may happen on either of the cores the buffer connects Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
Unbind used to be ignored for components in the same pipeline But there are some code that depends on the event and need to have it in all situations, i.e. dai_zephyr_unbind or cross-core DP connections in the same pipeline Propagate the event in all cases Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
unbind hook in buffer is guaranteed to be called on the core that was using the API. Free, however, may be called on other core Add cache invalidation, that must be called on core that was producing data (user of sink API) in unbind hook. Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
struct vmh_heap is a private structure and should not be used outside of regions_mm.c file structure moved Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
In case of cross core buffers a buffer may be allocated and freed by different core, depending on which component is deleted second. As all control structures of virtual heaps are stored in uncached aliases, there's no technical problems with allowing virtual heaps to work cross core. The only consideration is that in case of cross core allocate/free the cache invalidation MUST be performed on the core that was storing data. It is up to the memory user to ensure this Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
virtual heap may be called from multiple threads, all operations should be protected my mutex Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
struct sof_audio_buffer is a "base class" in a meaning of C++ of all buffers in SOF (currently comp_buffer and ring_buffer). Freeing of memory was based on indirect assumption that struct sof_audio_buffer is a very first field of the buffer structure this commit moves freeing of buffer to a method specific for each buffer type, enforcing that proper pointer will be used regardless of buffer structure Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
e208721 to
f7d9991
Compare
|
faiures because: docker: failed to register layer: write /usr/lib/x86_64-linux-gnu/libicuuc.a: no space left on device |
ack - think we've root caused to Zephyr SDK growing to 11G when GH action limit is 14G |
Jenkins passing - the fix for Zephyr SDK will take longer, we should merge things in the mean time. |
Zephyr provides a special macro, SYS_MM_DRV_MEMORY_REGION_FOREACH for iteration through all memory regions. This commit use this macro instead of a for loop that requires information about number of virtual memory regions provided by Zephyr. Signed-off-by: Marcin Szkudlinski <marcin.szkudlinski@intel.com>
|
added one commit - use SYS_MM_DRV_MEMORY_REGION_FOREACH macro for virtual memory regions iterating. Now the Zephyr changes are 100% independent |
this is fix for #10024
Rootcause:
In case of cross core buffers, there's no guarantee that buffer is freed on the same core that it was created by. It depends which end of cross-core buffer (which component) is freed as second - the buffer will be freed by the core of second component.
buffer memory is managed by a feature called "virtual heaps" that enforces freeing of memory on the same core that it was allocated ====> EXCEPTION
even if virtual heaps were cross-core, there's still a problem with invalidation of cache of memory being freed. It MUST be performed on a core that was producing data - and there's no way to override it. If the cache is not invalidated, it will be written back by the core itself leading either to data corruption or to exception in case the virtual memory has been unmapped.
SOLLUTION: