Feat/preserve duplicate file basenames#278
Conversation
| * Lets Gotenberg sort the files alphanumerically by their multipart filename. | ||
| * This is the default behavior. | ||
| */ | ||
| public function sortFilesByName(): self |
| * Gotenberg's flat upload namespace, where files sharing a multipart | ||
| * filename overwrite each other silently. | ||
| */ | ||
| public function overrideDuplicates(): self |
There was a problem hiding this comment.
What about a single method dedupeFiles(bool $dedupe): self ?
There was a problem hiding this comment.
Yes, that’s probably a good idea. I was thinking of doing it like the sort methods, but it’s more of a boolean-based approach.
| } | ||
|
|
||
| /** | ||
| * @var 'name'|'call' |
There was a problem hiding this comment.
Is it worth it since it's purely internal to this trait?
There was a problem hiding this comment.
then boolean seems safer if there are only 2 possible usecases
There was a problem hiding this comment.
OK for bool, it could be changed later. Also, we could mark these properties as internal since builder can see them.
There was a problem hiding this comment.
For me, sorting is not really a boolean concern, because we never know if we might need a third sorting strategy later.
The value never leaves this trait and isn't part of the public API, so an enum felt like overkill. A private constant would have been my preferred option, but trait constants were only introduced in PHP 8.2 and we still support PHP 8.1.
|
|
||
| trait FilesOrderTrait | ||
| { | ||
| use FilesTrait { |
There was a problem hiding this comment.
so it means it replace the use FilesTrait in the final builder class. Let's see if there is a way to properly document this or maybe prevent errors using phpstan. Because they are mutually exclusive
There was a problem hiding this comment.
PHP already prevents the error at load time. As soon as a class does use FilesTrait and use FilesOrderTrait composition fails with a fatal ...
| foreach ($assets as $asset) { | ||
| yield [$type => new DataPart(new File($asset))]; | ||
| foreach ($assets as $entryKey => $asset) { | ||
| yield [$type => new DataPart(new File($asset), basename($entryKey))]; |
There was a problem hiding this comment.
this looks like a BC. Am I wrong ?
There was a problem hiding this comment.
Not sure since DataPart will retrieve basename from the File instance given
// DataPart
public function __construct($body, ?string $filename = null, ?string $contentType = null, ?string $encoding = null)
{
if ($body instanceof File && !$filename) {
$filename = $body->getFilename();
}
// File
public function __construct(
private string $path,
private ?string $filename = null,
) {
}
public function getFilename(): string
{
return $this->filename ??= basename($this->getPath());
}So, I'm wondering if this change is needed 🤔
There was a problem hiding this comment.
I finally refactored it to put the change directly in the FilesOrderTrait. I feel like that’s safer.
Let me know if that works for you.
2fb8ad8 to
106a019
Compare
Description
By default, nothing changes. MergePdfBuilder::files() still deduplicates by resolved path and lets Gotenberg sort uploads alphanumerically by basename.
Files that share the same basename keep overwriting each other silently.
New opt-in: keep every file
allowDuplicates() opts in to disambiguating colliding basenames with a zero-padded suffix inserted before the extension, so every file actually reaches Gotenberg:
Gotenberg still sorts alphanumerically by basename. When the original basename already ends with digits (e.g. report_1.pdf), the suffixed sibling (report_1000001.pdf) is read as a larger integer and drifts to the end of the merge — so the order is preserved but not always the one the user expects.
Guaranteed call order
For a deterministic order, combine allowDuplicates() with sortFilesByCall(). Each filename is then prefixed with its call index (%06d-), which is alphanumerically stable regardless of the original basename: