Skip to content

Conversation

@keyonjie
Copy link
Contributor

dw-dma: change to use runtime_shared heap for lli buffer

The lli structs are used by both DSP and DMAC, allocate them from the
runtime_shared heap to avoid being corrupted by cache writing back.

This helps to fix a lot of DMA xrun issue according to the validation.

after this fix, the previous merged commit that enlarging the lli struct size is not needed any more. @lgirdwood FYI.

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kv2019i can you check this with Zephyr. Thanks !

@lgirdwood
Copy link
Member

@aiChaoSONG are you able to update cmake to 3.20 for Zephyr build CI ?

[ 84%] Building C object CMakeFiles/rimage.dir/src/adsp_config.c.o
[ 92%] Building C object CMakeFiles/rimage.dir/tomlc99/toml.c.o
[100%] Linking C executable rimage
[100%] Built target rimage
/workdir/zephyrproject
+ west build --build-dir build-apl --board intel_adsp_cavs15 zephyr/samples/subsys/audio/sof
/workdir/zephyrproject
-- west build: generating a build system
CMake Error at CMakeLists.txt:3 (cmake_minimum_required):
  CMake 3.20.0 or higher is required.  You are running version 3.18.3

@aiChaoSONG
Copy link
Collaborator

aiChaoSONG commented Aug 23, 2021

are you able to update cmake to 3.20 for Zephyr build CI ?

@lgirdwood For internal CI, CMake is updated 3.21.1, we are good to build and test now. For github actions, as no Ubuntu is shipped with a higher version CMake, you will see zephyr build failure in github actions before we can fix it, I will talk with Marc on this.

Before we can fix the github action, you can check sof-ci/jenkins/pr-fw-build as an alternative.
image

You can check all firmware build log in the Details.

image

@keyonjie
Copy link
Contributor Author

SOFCI TEST

@keyonjie
Copy link
Contributor Author

Looks this introduces regression on APL, let me take a look to it.

@keyonjie keyonjie marked this pull request as draft August 23, 2021 05:43
@lgirdwood
Copy link
Member

@lgirdwood For internal CI, CMake is updated 3.21.1, we are good to build and test now. For github actions, as no Ubuntu is shipped with a higher version CMake, you will see zephyr build failure in github actions before we can fix it, I will talk with Marc on this.

@aiChaoSONG ubuntu CI can be forced to use a newer version of cmake by adding a private ppa for cmake. @marc-hb may have other or better ways to achieve this.

@kv2019i
Copy link
Collaborator

kv2019i commented Aug 23, 2021

@lgirdwood Test running with Zephyr, will report results today. UPDATE: tests pass cleanly on Zephyr (TGL).

@marc-hb
Copy link
Collaborator

marc-hb commented Aug 23, 2021

Zephyr Github Actions fixed by @aiChaoSONG in #4668

@keyonjie
Copy link
Contributor Author

well, there is a point that even though for APL single core, it is preferred to use uncached lli address, this requires change for our cAVS heap definition @lgirdwood

/* heap block memory map */
struct mm {
	/* system heap - used during init cannot be freed */
	struct mm_heap system[PLATFORM_HEAP_SYSTEM];
	/* system runtime heap - used for runtime system components */
	struct mm_heap system_runtime[PLATFORM_HEAP_SYSTEM_RUNTIME];
-- #if CONFIG_CORE_COUNT > 1
	/* object shared between different cores - used during init cannot be freed */
	struct mm_heap system_shared[PLATFORM_HEAP_SYSTEM_SHARED];
	/* object shared between different cores */
	struct mm_heap runtime_shared[PLATFORM_HEAP_RUNTIME_SHARED];
-- #endif
	/* general heap for components */
	struct mm_heap runtime[PLATFORM_HEAP_RUNTIME];
	/* general component buffer heap */
	struct mm_heap buffer[PLATFORM_HEAP_BUFFER];

	struct mm_info total;
	uint32_t heap_trace_updated;	/* updates that can be presented */
	spinlock_t lock;	/* all allocs and frees are atomic */
};

@keyonjie
Copy link
Contributor Author

@lgirdwood let me fix this with the heap refinement together, so let's pick it in this order:

  1. heap refinement.
  2. enlarge buffer heap.
  3. DW DMA fixes.

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this PR actually fix #4568? If yes - it should mention it, perhaps. Also, when fixing such hard to debug bugs it would be best to indicate which specific change fixes the problem. Also, to re-iterate, it would really be much better if we could figure out what exactly was broken...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do shared heaps have the CACHE capability set? Both .runtime_shared and .system_shared heaps return uncached aliases, don't they?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was aeda015 a guess or a documented feature? How did we come up with 128 bytes on TGL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was aeda015 a guess or a documented feature? How did we come up with 128 bytes on TGL?

Unfortunately, It was a guess only.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does it have to be zeroed? It wasn't zeroed before. If that was causing problems because we expect zeroes somewhere, then that should be fixed explicitly! Otherwise rmalloc() should do just fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't need alignment any longer? That seems very odd for hardware descriptors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't need alignment any longer? That seems very odd for hardware descriptors?

@plbossart the rmalloc() internal guarantees the return pointer is PLATFORM_DCACHE_ALIGN aligned, please see the source here:

return get_ptr_from_heap(heap, flags, caps, bytes, PLATFORM_DCACHE_ALIGN);

@lgirdwood
Copy link
Member

@keyonjie looks like we need this to be well commented in the code due to the complexity of the problem along side the 1,2 and 3 fixes. Ok do do this a single PR that way we will keep your 1,22 3 ordering ?

This reverts commit aeda015.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
@keyonjie keyonjie changed the title Dw dma fixes Heap refinement Part2 -- Dw dma fixes Sep 8, 2021
@keyonjie keyonjie marked this pull request as ready for review September 8, 2021 10:33
The lli structs are used by both DSP and DMAC, allocate them from the
runtime_shared heap to avoid being corrupted by cache writing back.

This helps to fix a lot of DMA xrun issue according to the validation.

Signed-off-by: Keyon Jie <yang.jie@linux.intel.com>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't need alignment any longer? That seems very odd for hardware descriptors?

@keyonjie
Copy link
Contributor Author

keyonjie commented Sep 9, 2021

Let me close this, will submit others first.

@keyonjie keyonjie closed this Sep 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants