microblaze_invalidate_dcache_range() can spin forever for valid ranges touching the final cache line #381

@chrissbarr

Summary

microblaze_invalidate_dcache_range() can loop indefinitely when asked to invalidate a valid range that touches the final cache line in the 32-bit address space.

This appears to affect the MicroBlaze write-through D-cache path, where XPAR_MICROBLAZE_DCACHE_USE_WRITEBACK == 0. In that configuration, the implementation walks cache-line addresses upward and terminates only once the current address has advanced past the final aligned address. If the final aligned address is the last cache line of the 32-bit address space (0xFFFFFFF0 for a 16-byte line), the next increment wraps to 0x00000000, and the loop never terminates.

The relevant file in this repository appears to be:

lib/bsp/standalone/src/microblaze/microblaze_invalidate_dcache_range.S

Minimal Reproducer

For a MicroBlaze configuration with a 16-byte D-cache line:

#include "mb_interface.h"
#include "xil_types.h"

void repro(void)
{
    microblaze_invalidate_dcache_range((UINTPTR)0xFFFFFFF0u, 1u);
}

Equivalently, through the public cache API:

#include "xil_cache.h"

void repro(void)
{
    Xil_DCacheInvalidateRange((UINTPTR)0xFFFFFFF0u, 1u);
}

This is a valid range. It invalidates one byte at 0xFFFFFFF0, which is within the 32-bit address space. However, it causes microblaze_invalidate_dcache_range() to spin indefinitely on the write-through path.

Expected Behavior

The function should invalidate the cache line containing the specified byte range and return.

Actual Behavior

The function invalidates the final cache line, increments the current address by one cache line, wraps to 0x00000000, and then continues looping indefinitely.

Mechanics Of The Problem

The implementation in:

lib/bsp/standalone/src/microblaze/microblaze_invalidate_dcache_range.S

computes an inclusive end address:

end = addr + len - 1

It then aligns both the start and inclusive end addresses down to cache-line boundaries.

In the write-through path, the loop effectively does this:

current = aligned_start;
end = aligned_end;

while (current <= end) {
    invalidate_cache_line(current);
    current += line_size;
}

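That loop shape requires the address after the final cache line to be representable. If aligned_end is the final cache line, for example 0xFFFFFFF0 with a 16-byte line size, the next increment wraps to 0x00000000. In the loop as sketched above (assuming an unsigned comparison), current is then 0x00000000, which is still <= end, so the walk starts over from the bottom of the address space. No aligned 32-bit address is greater than 0xFFFFFFF0, so the condition current <= end can never become false.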

This affects any valid range whose aligned inclusive end address is the final cache line. With a 16-byte cache line, that means any range including bytes in:

0xFFFFFFF0..0xFFFFFFFF
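
The wraparound can be reproduced on a host machine without any MicroBlaze hardware. The sketch below is only a model of the loop shape described above, not the actual assembly. The names model_upward_walk and max_lines are hypothetical and introduced purely for illustration; the max_lines cap stands in for "never returns".

```c
#include <assert.h>
#include <stdint.h>

/* Host-side model of the upward address walk. Nothing here touches a real
 * cache. Returns the number of lines visited, or -1 if the walk exceeds
 * max_lines (used as a stand-in for "does not terminate"). */
static long model_upward_walk(uint32_t addr, uint32_t len,
                              uint32_t line_size, long max_lines)
{
    /* The alignment masks below assume a power-of-two line size. */
    assert(line_size != 0u && (line_size & (line_size - 1u)) == 0u);

    uint32_t current = addr & ~(line_size - 1u);
    uint32_t end = (addr + len - 1u) & ~(line_size - 1u);
    long visited = 0;

    while (current <= end) {
        if (visited == max_lines) {
            return -1;           /* cap reached: treat as an endless loop */
        }
        visited++;               /* stands in for invalidating one line */
        current += line_size;    /* wraps to 0 after the final line */
    }
    return visited;
}
```

Under these assumptions, model_upward_walk(0xFFFFFFF0u, 1u, 16u, 1000) hits the cap and returns -1, while a range away from the top of the address space, such as model_upward_walk(0x1000u, 32u, 16u, 1000), returns 2.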

Impact In Xilinx lwIP / AXI Ethernet TCP Transmit Path

This can be hit without user code ever calling Xil_DCacheInvalidateRange. I ran into it in the lwIP TCP transmit path: normal TCP transmission using Xilinx lwIP with AXI Ethernet and no-copy buffers can trigger it.

In:

ThirdParty/sw_services/lwip220/src/lwip-2.2.0/contrib/ports/xilinx/netif/xaxiemacif_dma.c

the AXI Ethernet DMA transmit path flushes each pbuf payload before handing it to the DMA engine:

XCACHE_FLUSH_DCACHE_RANGE(q->payload, q->len);

For MicroBlaze write-through D-cache configurations, the cache macro in:

lib/bsp/standalone/src/common/xenv_standalone.h

maps XCACHE_FLUSH_DCACHE_RANGE(Addr, Len) to microblaze_invalidate_dcache_range(...).

Therefore, if a no-copy TCP transmit buffer has a payload range that touches the final cache line near 0xFFFFFFFF, the Ethernet transmit path can stall indefinitely inside the cache maintenance routine.

This is particularly easy to hit with large external-memory buffers placed near the top of the 32-bit address space. For example, a no-copy transmit buffer at 0xC0000000 with length 0x40000000 ends exactly at 0xFFFFFFFF.
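
Until the routine is fixed, a caller can at least detect the affected ranges before performing cache maintenance. The helper below is a hypothetical sketch: range_touches_final_line is not part of the BSP or lwIP. It applies the same alignment arithmetic described in this report, and takes the line size as a parameter instead of reading XPAR_MICROBLAZE_DCACHE_LINE_LEN so that the snippet stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when the byte range [addr, addr + len)
 * has its inclusive end inside the final cache line of the 32-bit address
 * space, i.e. the case that triggers the endless loop. */
static bool range_touches_final_line(uint32_t addr, uint32_t len,
                                     uint32_t line_size)
{
    /* The alignment mask below assumes a power-of-two line size. */
    assert(line_size != 0u && (line_size & (line_size - 1u)) == 0u);

    if (len == 0u) {
        return false;                           /* empty range touches nothing */
    }

    uint32_t final_line = ~(line_size - 1u);    /* 0xFFFFFFF0 for 16-byte lines */
    uint32_t aligned_end = (addr + len - 1u) & final_line;

    return aligned_end == final_line;
}
```

With a 16-byte line, the buffer at 0xC0000000 with length 0x40000000 is reported as affected, while the same buffer shortened by one line (length 0x3FFFFFF0) is not.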

Why This Appears To Be A Bug

The cache API describes the arguments as a start address and a byte length. It does not document a requirement that the range must avoid the final cache line, or that the one-past-end address addr + len must be representable.

The range:

addr = 0xFFFFFFF0
len  = 1

is valid and fully contained in the 32-bit address space, but the function does not return.

Proposed Minimal Fix

Avoid using an address-walk loop that requires representing the address after the final cache line.

One minimal approach is to use the same offset-countdown style already used by the write-back branch of microblaze_invalidate_dcache_range.S.

Conceptually:

if (len == 0) {
    return;
}

aligned_start = addr & ~(line_size - 1);
aligned_end = (addr + len - 1) & ~(line_size - 1);
offset = aligned_end - aligned_start;

for (;;) {
    invalidate_cache_line(aligned_start + offset);

    if (offset == 0) {
        break;
    }

    offset -= line_size;
}

In assembly terms, after aligning r5 to the start cache line and r6 to the inclusive end cache line, the write-through branch could count down an offset rather than incrementing the address:

RSUBK   r6, r5, r6        /* r6 = aligned_end - aligned_start */

L_start:
    wdc     r5, r6        /* invalidate aligned_start + offset */

#if defined (__arch64__ )
    addlik  r6, r6, -(XPAR_MICROBLAZE_DCACHE_LINE_LEN * 4)
    beagei  r6, L_start
#else
    bneid   r6, L_start
    addik   r6, r6, -(XPAR_MICROBLAZE_DCACHE_LINE_LEN * 4)
#endif
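
The 32-bit branch relies on the MicroBlaze delay slot: bneid tests r6 before the following addik executes, and the addik runs whether or not the branch is taken. A host-side C model of that control flow, assuming this reading of the delay-slot semantics matches the hardware, might look like the following. The names model_bneid_countdown and last_wdc_addr are hypothetical; nothing here is taken from the BSP.

```c
#include <assert.h>
#include <stdint.h>

/* Records the address of the most recent modelled wdc operation. */
static uint32_t last_wdc_addr;

/* Models the 32-bit loop: r5 holds aligned_start, r6 holds the byte offset
 * (aligned_end - aligned_start). Returns the number of modelled wdc ops. */
static uint32_t model_bneid_countdown(uint32_t r5, uint32_t r6,
                                      uint32_t line_size)
{
    uint32_t ops = 0u;
    int taken;

    /* The offset must be a multiple of a power-of-two line size. */
    assert(line_size != 0u && (line_size & (line_size - 1u)) == 0u);
    assert((r6 & (line_size - 1u)) == 0u);

    do {
        last_wdc_addr = r5 + r6;   /* wdc r5, r6 */
        ops++;
        taken = (r6 != 0u);        /* bneid samples r6 before the delay slot */
        r6 -= line_size;           /* delay slot: addik executes either way */
    } while (taken);

    return ops;
}
```

In this model the final decrement leaves r6 holding a wrapped value after the loop exits, which is harmless because r6 is not used again. The last line touched is always aligned_start itself.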

This avoids ever needing to compute or compare against the address after the final cache line. For the failing large-range case:

aligned_start = 0xC0000000
aligned_end   = 0xFFFFFFF0
offset        = 0x3FFFFFF0

The first operation invalidates aligned_start + offset == 0xFFFFFFF0, then the offset counts down to zero and the loop terminates normally.
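
The countdown can also be checked on a host. The sketch below is a model of the conceptual loop only: struct walk_result and model_countdown_walk are hypothetical names, and recording the visited address stands in for the actual cache operation.

```c
#include <assert.h>
#include <stdint.h>

struct walk_result {
    uint32_t first_line;   /* first line address visited (the highest) */
    uint32_t last_line;    /* last line address visited (the lowest) */
    uint32_t lines;        /* number of lines visited */
};

/* Host-side model of the offset-countdown walk. */
static struct walk_result model_countdown_walk(uint32_t addr, uint32_t len,
                                               uint32_t line_size)
{
    struct walk_result r = {0u, 0u, 0u};

    /* The alignment masks below assume a power-of-two line size. */
    assert(line_size != 0u && (line_size & (line_size - 1u)) == 0u);

    if (len == 0u) {
        return r;                               /* nothing to invalidate */
    }

    uint32_t aligned_start = addr & ~(line_size - 1u);
    uint32_t aligned_end = (addr + len - 1u) & ~(line_size - 1u);
    uint32_t offset = aligned_end - aligned_start;

    r.first_line = aligned_start + offset;
    for (;;) {
        r.last_line = aligned_start + offset;   /* stands in for one wdc */
        r.lines++;
        if (offset == 0u) {
            break;                              /* start line done: finished */
        }
        offset -= line_size;
    }
    return r;
}
```

For the range starting at 0xC0000000 with length 0x40000000 and a 16-byte line, this model visits 0x04000000 lines, starting at 0xFFFFFFF0 and ending at 0xC0000000, without ever forming an address past the final line.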

The same issue appears to exist in the write-through path of:

lib/bsp/standalone/src/microblaze/microblaze_flush_dcache_range.S

That file uses the same upward address-walk pattern, so it may need the same treatment.
