Skip to content

Conversation

@btian1
Copy link
Contributor

@btian1 btian1 commented Nov 3, 2022

Heap memory profiling is based on runtime zephyr API.
It will print out each memory allocation with allocated bytes
and free bytes, display as below:
zephyr: heap allocatd: 22c0 free: 152a90

Signed-off-by: Baofeng Tian baofeng.tian@intel.com

@kv2019i
Copy link
Collaborator

kv2019i commented Nov 3, 2022

Btw, one more existing mechanism (that is arguably not so widely used) is the IPC3 src/ipc/ipc3/handler.c:ipc_glb_test_mem_usage() . This relies on OS implementing (heap_info()), which we don't do for Zephyr SOF builds. The nice thing with this was that this can be queries from kernel with an IPC message, so no need to parse the data from FW logs, the info is available even if FW logging is not, and so forth. I think this goes beyond this PR though, as this approach will be IPC dependent, but in the long term, we do want to expose Zephyr memory runtime data in this model.

FYI to @lgirdwood @mwasko and @marcinszkudlinski

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a debug overlay @kv2019i ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense if put is as default debug overlay, since its functionality is for stack overflow detection.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry not following how the heap state helps with stack overflow detection, is this a different Zephyr option ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are separate functionality, no relation between heap and stack overflow detection.
CONFIG_SYS_HEAP_RUNTIME_STATS is used to collect heap allocated/free space, total is around 1.4MB.

CONFIG_STACK_SENTINEL this is used to enable stack overflow detection by check a magic number in the stack bottom, after enable this, once stack overflow happened, Zephyr logging will print "Stack overflow" with other information, so move it to debug build config is also good, please let me know, if you want to move it.

Copy link
Contributor Author

@btian1 btian1 Nov 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#6530
this already added, @kv2019i , do you think I need remove here?
once remove, I need add -d option for perf build.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@btian1 see above, we need these 2 Kconfigs to allow the developers to tune the output depending on what they are debugging.

Copy link
Contributor Author

@btian1 btian1 Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgirdwood , I did not find in current SOF code base, add it from new and then add code to make these config work?

I also want to add this at the beginning, however, due to wrapped too much layer, I did not try with this.
It may need introduce new parameters for each function and code change may be big, any better ideas?

rmalloc/rzalloc...., need add pipeline and component info for each memory alloc and free.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config SOF_HEAP_RUNTIME_ALLOC

@lgirdwood Maybe add "DEBUG" or "PRINT" somewhere? Otherwise this option looks to me like it enables heap allocations and without it rmalloc() won't work. Same for SOF_HEAP_RUNTIME_PIPELINE perhaps

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also suspect that those options would have quite a large run-time / logging impact, so I'd really rather have them off by default and only enable them for debugging. The former would supposedly "only" print one line for each allocations, but we allocate quite a lot. We have many instances where we initialise a single functionality but use several allocations for it. Hopefully we don't allocate any memory for each .copy() / .process() iteration, but even when one pipeline is running and another one processes IPCs which will print excessive additional information for memory allocations - that can affect performance of the running pipeline.
Same holds for heap state dumps for each pipeline creation and destruction. Especially since we now have multiple pipelines for most or all streams. I know we have those dumps with XTOS and I find them quite intrusive and excessively polluting logs too (see #4744 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RanderWang , Rander today introduced that close fw alloc memory based on pipeline, this is much more easy to do data collection and analysis.

For SOF, allocation based on rmalloc, any code call this module can get a buffer with specific size, the difficult part is we have to add all called place with pipe and component information.

@lgirdwood
Copy link
Member

Btw, one more existing mechanism (that is arguably not so widely used) is the IPC3 src/ipc/ipc3/handler.c:ipc_glb_test_mem_usage() . This relies on OS implementing (heap_info()), which we don't do for Zephyr SOF builds. The nice thing with this was that this can be queries from kernel with an IPC message, so no need to parse the data from FW logs, the info is available even if FW logging is not, and so forth. I think this goes beyond this PR though, as this approach will be IPC dependent, but in the long term, we do want to expose Zephyr memory runtime data in this model.

FYI to @lgirdwood @mwasko and @marcinszkudlinski

Ack, I think we do need to extend IPC for development. i.e. debug, memory, status info, thread info, and a Zephyr shell. This could all come under a debug Kconfig menu.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i any comments ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry not following how the heap state helps with stack overflow detection, is this a different Zephyr option ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please enable this config option in a separate patch as this is totally different logical change.

Also, please imperative mood in the commit message.

e.g: memory: Add support for heap memory profiling

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is enabled in another patch:#6530, so I will drop here, and when profiling build, just need add -d option is fine

@btian1
Copy link
Contributor Author

btian1 commented Nov 10, 2022

Btw, one more existing mechanism (that is arguably not so widely used) is the IPC3 src/ipc/ipc3/handler.c:ipc_glb_test_mem_usage() . This relies on OS implementing (heap_info()), which we don't do for Zephyr SOF builds. The nice thing with this was that this can be queries from kernel with an IPC message, so no need to parse the data from FW logs, the info is available even if FW logging is not, and so forth. I think this goes beyond this PR though, as this approach will be IPC dependent, but in the long term, we do want to expose Zephyr memory runtime data in this model.
FYI to @lgirdwood @mwasko and @marcinszkudlinski

Ack, I think we do need to extend IPC for development. i.e. debug, memory, status info, thread info, and a Zephyr shell. This could all come under a debug Kconfig menu.

Yes I am fully agree, IPC implementation will be more convenient, no need print is great, I also feel difficult for this adding, without adding, we don't know the heap, if added, print a lot of fw logs, take a 48k as example, it finally allocated 0x9698 37k bytes.
without it, we don't know the exact heap allocation status.

Please suggest next step, if we want to add ipc in near future, I am ok with that solution.
If we don't have in near future, I would suggest add this into code for now.
In future, once IPC implementation is done, we can revert this PR.

Thanks
Tim

@btian1
Copy link
Contributor Author

btian1 commented Nov 11, 2022

@kv2019i , your test remind me whether we need add one more functionality with this PR.
This PR only collect data allocated and free number, how about add code to rfree to check data free, then compare the allocated and free size to check memory leakage?

Let me know what do you think?

Thanks
Tim

zephyr/wrapper.c Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for "normal" byte-counts decimal sizes are easier to read, i.e. I'd use %u. Or at least put 0x in front or use %#x

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change to which format both ok, some developer have a brain can quick know real size with hex.
Ok, I can also change it to decimal, it is more straight forward to most people

Heap memory profiling is based on runtime zephyr API.
It will print out each memory allocation with allocated bytes
and free bytes, display as below:
zephyr: heap allocatd: 22c0 free: 152a90 max allocated: 1000

Signed-off-by: Baofeng Tian <baofeng.tian@intel.com>
@lgirdwood lgirdwood merged commit 371d351 into thesofproject:main Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants