Skip to content

Feat/preserve duplicate file basenames#278

Open
Nitram1123 wants to merge 3 commits into
sensiolabs:1.xfrom
Nitram1123:feat/preserve-duplicate-file-basenames
Open

Feat/preserve duplicate file basenames#278
Nitram1123 wants to merge 3 commits into
sensiolabs:1.xfrom
Nitram1123:feat/preserve-duplicate-file-basenames

Conversation

@Nitram1123

Copy link
Copy Markdown
Q A
Gotenberg API version ? 8.x
Bug fix ? yes
New feature ? no
BC break ? no
Issues Fix #264

Description

By default, nothing changes. MergePdfBuilder::files() still deduplicates by resolved path and lets Gotenberg sort uploads alphanumerically by basename.
Files that share the same basename keep overwriting each other silently.

$pdf->merge()
    ->files('/a/report.pdf', '/b/report.pdf', '/a/report.pdf')
    ->generate();
// one file named 'report.pdf'

New opt-in: keep every file
allowDuplicates() opts in to disambiguating colliding basenames with a zero-padded suffix inserted before the extension, so every file actually reaches Gotenberg:

$pdf->merge()
    ->allowDuplicates()
    ->files('/a/report.pdf', '/b/report.pdf')
    ->generate();
// report.pdf, report000001.pdf

Gotenberg still sorts alphanumerically by basename. When the original basename already ends with digits (e.g. report_1.pdf), the suffixed sibling (report_1000001.pdf) is read as a larger integer and drifts to the end of the merge — so the order is preserved but not always the one the user expects.
Guaranteed call order
For a deterministic order, combine allowDuplicates() with sortFilesByCall(). Each filename is then prefixed with its call index (%06d-), which is alphanumerically stable regardless of the original basename:

$pdf->merge()
    ->allowDuplicates()
    ->sortFilesByCall()
    ->files('/reports/q4.pdf', '/cover.pdf', '/reports/q4.pdf')
    ->generate();
// 000001-q4.pdf, 000002-cover.pdf, 000003-q4.pdf

* Lets Gotenberg sort the files alphanumerically by their multipart filename.
* This is the default behavior.
*/
public function sortFilesByName(): self

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already proposed in #277, right?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes exactly this branch is based on #277.

* Gotenberg's flat upload namespace, where files sharing a multipart
* filename overwrite each other silently.
*/
public function overrideDuplicates(): self

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about a single method dedupeFiles(bool $dedupe): self ?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that’s probably a good idea. I was thinking of doing it like the sort methods, but it’s more of a boolean-based approach.

}

/**
* @var 'name'|'call'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not enum then ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth it since it's purely internal to this trait?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then boolean seems safer if there are only 2 possible usecases

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK for bool, it could be changed later. Also, we could mark these properties as internal since builder can see them.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me, sorting is not really a boolean concern, because we never know if we might need a third sorting strategy later.
The value never leaves this trait and isn't part of the public API, so an enum felt like overkill. A private constant would have been my preferred option, but trait constants were only introduced in PHP 8.2 and we still support PHP 8.1.


trait FilesOrderTrait
{
use FilesTrait {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it means it replace the use FilesTrait in the final builder class. Let's see if there is a way to properly document this or maybe prevent errors using phpstan. Because they are mutually exclusive

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PHP already prevents the error at load time. As soon as a class does use FilesTrait and use FilesOrderTrait composition fails with a fatal ...

Comment thread src/Builder/Util/NormalizerFactory.php Outdated
foreach ($assets as $asset) {
yield [$type => new DataPart(new File($asset))];
foreach ($assets as $entryKey => $asset) {
yield [$type => new DataPart(new File($asset), basename($entryKey))];

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a BC. Am I wrong ?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure since DataPart will retrieve basename from the File instance given

    // DataPart
    public function __construct($body, ?string $filename = null, ?string $contentType = null, ?string $encoding = null)
    {
        if ($body instanceof File && !$filename) {
            $filename = $body->getFilename();
        }

    // File
    public function __construct(
        private string $path,
        private ?string $filename = null,
    ) {
    }

    public function getFilename(): string
    {
        return $this->filename ??= basename($this->getPath());
    }

So, I'm wondering if this change is needed 🤔

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finally refactored it to put the change directly in the FilesOrderTrait. I feel like that’s safer.
Let me know if that works for you.

@Nitram1123 Nitram1123 force-pushed the feat/preserve-duplicate-file-basenames branch 2 times, most recently from 2fb8ad8 to 106a019 Compare June 5, 2026 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: allow sorting sent files

3 participants