
OpenMP OpenCL DSP Heap Management

From Texas Instruments Wiki

Memory requests (malloc, calloc, etc.) within OpenMP target regions or OpenCL kernels are satisfied by allocating portions from DSP heaps. The OpenMP/OpenCL implementation specifies fixed sizes and placements for private DSP heaps as well as the heap that is shared by all the DSPs. The DSP runtime provides additional APIs to initialize and manage heaps in order to afford the user more flexibility to control the size and location of heaps.

DSP Heaps in Shared Memory (DDR or MSMC)

The DSP runtime provides the following APIs to initialize and manage heaps in shared memory.

Heap Initialization API

The heap initialization functions __heap_init_[ddr|msmc] must be called by one of the DSP cores to initialize internal heap data structures before any memory management calls such as __malloc_[ddr|msmc] are made. Once initialized, the heaps are accessible by all the DSP cores. These APIs are thread safe under the OpenMP and OpenCL programming models on the DSP (each DSP runs a single thread of execution).

Note: If data allocated on the heap is shared across DSP cores, the programmer is responsible for keeping the allocated memory cache-consistent across cores. If OpenMP is used to parallelize the program, cache consistency is managed by the OpenMP runtime.

<source lang="c" strict enclose="div" header="DSP heap initialization APIs" footer="">
void __heap_init_ddr(void *ptr, int size);
void __heap_init_msmc(void *ptr, int size);
</source>

Note that ptr is a pointer to the underlying memory to be configured as a user-controlled heap. Therefore, the underlying memory must be allocated before calling the heap initialization function. Initialized heaps persist across target regions and kernels until their underlying memory regions are deallocated.
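Both initialization functions take the heap size as an int, while buffer sizes are often carried around as size_t (as in the wrappers later on this page). A minimal sketch of a guard that could run before the call is shown below. The helper name heap_size_fits_int is hypothetical and is not part of the DSP runtime.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not part of the TI runtime): returns 1 when a
 * size_t byte count is nonzero and can be passed as the int size
 * argument of __heap_init_ddr / __heap_init_msmc without truncation,
 * and 0 otherwise. */
static int heap_size_fits_int(size_t bytes)
{
    return bytes > 0 && bytes <= (size_t)INT_MAX;
}
```

A caller would check heap_size_fits_int(bytes) and only then pass (int)bytes to the initialization function.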

Via OpenMP

The following code snippets illustrate how to allocate memory for heaps and call the initialization functions from OpenMP.

<source lang="c" strict enclose="div" header="OpenMP heap initialization and memory allocation" footer="">
/*-----------------------------------------------------------------------------
 * User-controlled DSP heaps are initialized within a target region. The call
 * to __heap_init_xxx can be included within any target region. However, the
 * initialization function must be called before any __malloc_xxx calls are
 * made.
 * User-controlled DSP heaps can be persistent across target regions as long as
 * the underlying memory (i.e., the buffers pointed to by p) is not deallocated.
 *----------------------------------------------------------------------------*/
void heap_init_ddr(char *p, size_t bytes)
{
#pragma omp target map(to: bytes, p[0:bytes])
   {
      __heap_init_ddr(p, bytes);
   }
}

void heap_init_msmc(char *p, size_t bytes)
{
#pragma omp target map(to: bytes, p[0:bytes])
   {
      __heap_init_msmc(p, bytes);
   }
}

/*-----------------------------------------------------------------------------
 * The DSP core executing the enclosed target region will allocate from the
 * heaps and then free the memory.
 *----------------------------------------------------------------------------*/
void alloc_and_free(size_t bytes)
{
#pragma omp target map(to: bytes)
   {
      char *p1 = (char *) __malloc_ddr(bytes);
      char *p2 = (char *) __malloc_msmc(bytes);
      if (!p1 || !p2)
         printf("Error\n");
      else
      {
         printf("DDR  heap pointer is 0x%08x\n", p1);
         printf("MSMC heap pointer is 0x%08x\n", p2);
      }
      __free_ddr(p1);
      __free_msmc(p2);
   }
}
</source>

<source lang="c" strict enclose="div" header="OpenMP host code" footer="">
/*------------------------------------------------------------------------
 * From the host, create the underlying memory store for the heaps
 *-----------------------------------------------------------------------*/
int   ddr_heap_size  = 16 << 20;
int   msmc_heap_size = 1 << 20;
char *HeapDDR  = (char *) __malloc_ddr(ddr_heap_size);
char *HeapMSMC = (char *) __malloc_msmc(msmc_heap_size);

/*------------------------------------------------------------------------
 * Initialize the pre-allocated buffers as new DDR and MSMC heaps
 * accessible to DSP cores.
 *-----------------------------------------------------------------------*/
heap_init_ddr (HeapDDR,  ddr_heap_size);
heap_init_msmc(HeapMSMC, msmc_heap_size);

/*------------------------------------------------------------------------
 * On each DSP core, alloc memory from both ddr and msmc and then free it.
 *-----------------------------------------------------------------------*/
alloc_and_free(1024);
</source>


Via OpenCL

The following code snippets illustrate how to allocate memory for heaps and call the initialization functions from OpenCL. The OpenCL example dspheap contains complete source.

<source lang="c" strict enclose="div" header="OpenCL Kernel Code" footer="">
/*-----------------------------------------------------------------------------
 * These kernels initialize user-controlled heaps. They do not have to be
 * separate kernels. The call to __heap_init_xxx can be rolled into an existing
 * kernel and called before any __malloc_xxx calls are made.
 * These heaps can be persistent across kernel boundaries as long as the
 * underlying memory (i.e., the buffers pointed to by p) is not deallocated.
 *----------------------------------------------------------------------------*/
kernel void heap_init_ddr(void *p, size_t bytes)
   { __heap_init_ddr(p, bytes); }

kernel void heap_init_msmc(void *p, size_t bytes)
   { __heap_init_msmc(p, bytes); }

/*-----------------------------------------------------------------------------
 * This kernel will allocate from the heaps and then free the memory.
 *----------------------------------------------------------------------------*/
kernel void alloc_and_free(int bytes)
{
   char *p1 = __malloc_ddr(bytes);
   char *p2 = __malloc_msmc(bytes);
   if (!p1 || !p2) return;
   printf("DDR  heap pointer is 0x%08x\n", p1);
   printf("MSMC heap pointer is 0x%08x\n", p2);
   __free_ddr(p1);
   __free_msmc(p2);
}
</source>

<source lang="c" strict enclose="div" header="OpenCL Host Code" footer="">
/*------------------------------------------------------------------------
 * Create the underlying memory store for the heaps with OpenCL Buffers
 *-----------------------------------------------------------------------*/
int ddr_heap_size  = 16 << 20; // 16MB
int msmc_heap_size = 1 << 20;  // 1MB
Buffer HeapDDR (context, CL_MEM_READ_WRITE, ddr_heap_size);
Buffer HeapMSMC(context, CL_MEM_READ_WRITE|CL_MEM_USE_MSMC_TI, msmc_heap_size);

...

/*------------------------------------------------------------------------
 * Create a command queue and kernelfunctors for all kernels in our program
 *-----------------------------------------------------------------------*/
CommandQueue Q(context, devices[0]);
KernelFunctor heap_init_ddr  = Kernel(program, "heap_init_ddr") .bind(Q, NDRange(1), NDRange(1));
KernelFunctor heap_init_msmc = Kernel(program, "heap_init_msmc").bind(Q, NDRange(1), NDRange(1));
KernelFunctor alloc_and_free = Kernel(program, "alloc_and_free").bind(Q, NDRange(8), NDRange(1));

/*------------------------------------------------------------------------
 * Call kernels to initialize a DDR-based and an MSMC-based heap. The init
 * step only needs to run once and on 1 core only. See the functor
 * mapping above that defines the global size to be 1.
 *-----------------------------------------------------------------------*/
heap_init_ddr (HeapDDR,  ddr_heap_size) .wait();
heap_init_msmc(HeapMSMC, msmc_heap_size).wait();

/*------------------------------------------------------------------------
 * On each core, alloc memory from both ddr and msmc and then free it.
 *-----------------------------------------------------------------------*/
alloc_and_free(1024).wait();
</source>

Via a DATA_SECTION pragma in DSP C code

<source lang="c" strict enclose="div" header="Specifying storage for MSMC Heap" footer="">
/* Array is already aligned on a 64b boundary. No need for DATA_ALIGN */
#define MSMC_HEAP_SIZE (16*1024)
#pragma DATA_SECTION(msmc_heap, ".mem_msm")
char msmc_heap[MSMC_HEAP_SIZE];

...
void foo()
{
   __heap_init_msmc ((void *)msmc_heap, MSMC_HEAP_SIZE);
   ...
   double *p = (double *)__malloc_msmc(sizeof(double)*256);
   ...
   __free_msmc(p);
}
</source>


Dynamic Memory Management APIs

After the DDR and/or MSMC heap is initialized by one of the DSP cores using the API described in the Heap Initialization API section above, the following APIs are available from all DSP cores for dynamic memory management:

Heap in DDR

<source lang="c" strict enclose="div" header="Allocation from DDR" footer="">
void *__malloc_ddr   (size_t size);
void *__calloc_ddr   (size_t num, size_t size);
void *__realloc_ddr  (void *ptr, size_t size);
void  __free_ddr     (void *ptr);
void *__memalign_ddr (size_t alignment, size_t size);
</source>

Heap in MSMC

<source lang="c" strict enclose="div" header="Allocation from MSMC" footer="">
void *__malloc_msmc   (size_t size);
void *__calloc_msmc   (size_t num, size_t size);
void *__realloc_msmc  (void *ptr, size_t size);
void  __free_msmc     (void *ptr);
void *__memalign_msmc (size_t alignment, size_t size);
</source>
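Because the DDR and MSMC allocation functions share the same signatures, application code can treat them interchangeably. The following is a minimal sketch of one possible pattern, not something the DSP runtime provides: try one heap first and fall back to the other if the request fails. The names heap_alloc_fn, heap_kind, alloc_with_fallback and the two host_stub_ functions are hypothetical. The stubs exist only so the sketch can be exercised on a host, and on the DSP the caller would pass __malloc_msmc and __malloc_ddr instead.

```c
#include <stddef.h>
#include <stdlib.h>

/* Signature shared by __malloc_ddr and __malloc_msmc. */
typedef void *(*heap_alloc_fn)(size_t);

/* Hypothetical tag recording which allocator served a request, so the
 * caller can later pick the matching free function. */
typedef enum { HEAP_NONE = 0, HEAP_FIRST = 1, HEAP_SECOND = 2 } heap_kind;

/* Try the first allocator, then fall back to the second one. */
static void *alloc_with_fallback(heap_alloc_fn first, heap_alloc_fn second,
                                 size_t bytes, heap_kind *served_by)
{
    void *p = first(bytes);
    if (p) {
        *served_by = HEAP_FIRST;
        return p;
    }
    p = second(bytes);
    *served_by = p ? HEAP_SECOND : HEAP_NONE;
    return p;
}

/* Host-side stand-ins (assumptions for illustration only). */
static void *host_stub_fail(size_t bytes) { (void)bytes; return NULL; }
static void *host_stub_ok(size_t bytes)   { return malloc(bytes); }
```

The sketch assumes that memory must be released with the free function of the heap that served it, which is why it reports the heap through served_by.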

DSP Heap in Local Memory (L2SRAM)

The DSP runtime provides a minimal API to initialize a heap in L2 SRAM and allocate from it. This heap is local to the core that initialized it.

Heap Initialization API

A heap can be initialized in L2 SRAM via the following API:

<source lang="c" strict enclose="div" header="DSP heap creation APIs" footer="">
void __heap_init_l2(void *ptr, int size);
</source>

The storage associated with the heap must start on a 64-bit boundary. Unlike DDR and MSMC heaps, heaps initialized in L2 SRAM do not persist across target regions or kernels. Underlying storage for DSP heaps in local memory can be set up in one of the following ways:
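The 64-bit boundary requirement can be checked, or enforced, with ordinary pointer arithmetic. The helpers below are a minimal sketch under that assumption. The names is_aligned_64bit and align_up_64bit are hypothetical and are not part of the DSP runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when ptr lies on a 64-bit (8-byte) boundary, 0 otherwise. */
static int is_aligned_64bit(const void *ptr)
{
    return ((uintptr_t)ptr & (uintptr_t)7) == 0;
}

/* Rounds ptr up to the next 8-byte boundary and shrinks *size by the
 * number of bytes skipped. Returns NULL (leaving *size untouched) when
 * the buffer is too small to reach an aligned address. */
static void *align_up_64bit(void *ptr, size_t *size)
{
    uintptr_t addr = (uintptr_t)ptr;
    size_t pad = (size_t)((8u - (addr & 7u)) & 7u);

    if (pad > *size)
        return NULL;
    *size -= pad;
    return (unsigned char *)ptr + pad;
}
```

With helpers like these, a buffer of unknown alignment could be adjusted first, and the adjusted pointer and size then passed to __heap_init_l2.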

Via the local map type in OpenMP

TI's OpenMP implementation includes a TI-specific local map type that allows data to be allocated in a DSP's L2 SRAM. This allocated buffer can be used to initialize the heap.

<source lang="c" strict enclose="div" header="Specifying storage for L2 Heap using local clause" footer="">
void l2_alloc_and_free(char *p, size_t bytes)
{
   //p is actually just a dummy buffer. It will not be copied to the DSPs.
#pragma omp target map(to: bytes) map(local: p[0:bytes])
   {
      //p gets allocated in DSP L2 SRAM at the start of the target region
      char *p1;
      __heap_init_l2(p, bytes);
      p1 = (char *) __malloc_l2(bytes);
      if (!p1)
         printf("Error\n");
      else
         printf("L2SRAM  heap pointer is 0x%08x\n", p1);
   }
}
</source>


Via the __local keyword in OpenCL

Use the OpenCL __local address space qualifier to allocate a chunk of memory in L2. This chunk can be used to initialize the heap:

<source lang="c" strict enclose="div" header="OpenCL Kernel" footer="">
kernel void L2MallocLocal(local void *ptr, int size)
{
   __heap_init_l2(ptr, size);
   ...
   ... __malloc_l2(sizeof(double));
}
</source>
<source lang="c" strict enclose="div" header="OpenCL Host Code" footer="">
Kernel kernel2(program, "L2MallocLocal");
kernel2.setArg(0, __local(L2_HEAP_SIZE));
kernel2.setArg(1, L2_HEAP_SIZE);
</source>

Via a DATA_SECTION pragma in DSP C code

<source lang="c" strict enclose="div" header="Specifying storage for L2 Heap" footer="">
/* Array is already aligned on a 64b boundary. No need for DATA_ALIGN */
#define L2_HEAP_SIZE (256)
#pragma DATA_SECTION(l2_heap, ".mem_l2")
char l2_heap[L2_HEAP_SIZE];

...
void foo()
{
   __heap_init_l2 ((void *)l2_heap, L2_HEAP_SIZE);
   ...
   ... __malloc_l2(sizeof(double));
}
</source>

Dynamic Memory Management APIs

After a DSP core initializes its L2 heap with the __heap_init_l2 call, the only API available is an allocation call:

<source lang="c" strict enclose="div" header="Allocation from DDR" footer="">
void *__malloc_l2 (size_t size); /* Pointer returned is aligned to a 64-bit boundary */
</source>
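The L2 interface therefore has a simple shape: adopt caller-provided storage, then hand out 64-bit-aligned pieces of it, with no call to release an individual allocation. The code below is a host-side model of an allocator with that shape, offered only as an illustration. It is not TI's implementation, and the names bump_heap, bump_heap_init and bump_heap_alloc are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an allocate-only heap over caller storage. */
typedef struct {
    unsigned char *base;  /* start of caller-provided storage */
    size_t         size;  /* total bytes available            */
    size_t         used;  /* bytes handed out so far          */
} bump_heap;

/* Same shape as __heap_init_l2(ptr, size): adopt the caller's buffer.
 * The buffer is assumed to start on an 8-byte boundary. */
static void bump_heap_init(bump_heap *h, void *storage, size_t size)
{
    h->base = (unsigned char *)storage;
    h->size = size;
    h->used = 0;
}

/* Same shape as __malloc_l2(size): each request is rounded up to a
 * multiple of 8 bytes so every returned pointer stays 8-byte aligned.
 * Returns NULL when the remaining storage is too small. */
static void *bump_heap_alloc(bump_heap *h, size_t bytes)
{
    size_t rounded = (bytes + 7u) & ~(size_t)7u;
    void *p;

    if (rounded < bytes || rounded > h->size - h->used)
        return NULL;  /* arithmetic overflow or heap exhausted */
    p = h->base + h->used;
    h->used += rounded;
    return p;
}
```

In this model, calling bump_heap_init again is the only way to reclaim space, which loosely mirrors the way an L2 heap has to be initialized again in each target region or kernel because it does not persist.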
