feat[cartesian]: DaCe optimal for/map schedule #2628

Open
FlorianDeconinck wants to merge 1 commit into GridTools:main from FlorianDeconinck:feat/dace_optimal_loop_schedule

Conversation

@FlorianDeconinck

Contributor

Description

Reworked the base scheduling of the loops in the OIR -> TreeIR lowering to simplify the code and introduce an optimal schedule for parallelism:

  • merge all map/for schedule code into a single _resolve_loop_schedule
  • respect syntax for sequential loops
  • default parallelism to maximize local parallelization (OpenMP parallel for on CPU and a plain kernel on CUDA)
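
For readers skimming the list above, here is a minimal sketch of the decision it describes. It is an illustration only and not the code in this PR: every name in it (`LoopOrder`, `Device`, `Schedule`, `resolve_loop_schedule`) is hypothetical and does not come from gt4py or DaCe. It assumes that sequential loops are those declared forward or backward, and that everything else may be parallelized.

```python
from enum import Enum, auto


class LoopOrder(Enum):
    """Hypothetical iteration order requested by the stencil syntax."""
    PARALLEL = auto()
    FORWARD = auto()
    BACKWARD = auto()


class Device(Enum):
    """Hypothetical target device."""
    CPU = auto()
    GPU = auto()


class Schedule(Enum):
    """Hypothetical schedule choices, standing in for real DaCe schedules."""
    SEQUENTIAL_FOR = auto()   # plain for loop, order preserved
    CPU_OPENMP_MAP = auto()   # parallel map lowered to OpenMP on CPU
    GPU_KERNEL_MAP = auto()   # parallel map lowered to a plain CUDA kernel


def resolve_loop_schedule(order: LoopOrder, device: Device) -> Schedule:
    """Single entry point for both for-loop and map scheduling (assumed shape)."""
    # Sequential syntax is always respected, regardless of the device.
    if order in (LoopOrder.FORWARD, LoopOrder.BACKWARD):
        return Schedule.SEQUENTIAL_FOR
    # Otherwise default to the most parallel schedule for the device.
    if device is Device.GPU:
        return Schedule.GPU_KERNEL_MAP
    return Schedule.CPU_OPENMP_MAP
```

Keeping the decision in one function means the sequential case is checked first and cannot be overridden by a device-specific default.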

This should be covered by the current unit tests.

FlorianDeconinck requested review from romanc and twicki June 7, 2026 18:05
@romanc

romanc commented Jun 8, 2026

Contributor

I think the failing gpu-tests are due to a CUDA codegen issue in DaCe. From what I understand, this test case triggers the code path here, which inserts a global grid sync into a merged kernel. The grid sync insertion is successful, but the __gbar symbol isn't defined because it isn't passed as an argument to the nested SDFG. Similarly, the (recreated) conditions inserted after the grid sync might (and in this case do) use undefined symbols that would need to be passed as arguments to the nested SDFG function. Neither happens, which is why nvcc fails.

We could work around this by slightly changing the test scenario.
