Add https endpoint support #267

Open
GaganNarula wants to merge 3 commits into main from gagan/fsspec-read-public-r2-bucket

@GaganNarula
Collaborator

Public buckets on Cloudflare have https://... prefixes, and fsspec can handle them easily. We can then read from Cloudflare R2 public buckets without needing the R2 secrets, which are currently stored on GCP and which OSS users won't have access to.

@GaganNarula GaganNarula marked this pull request as ready for review April 16, 2026 13:59
@GaganNarula GaganNarula requested a review from a team as a code owner April 16, 2026 13:59
Collaborator

@mil-ad mil-ad left a comment

These look even less like a file system. At least buckets have the file system hierarchy, even though it's fake, but not sure about HTTP URLs.

Isn't it enough to instead change filesystem_from_path() to have some sort of HTTP filesystem?
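For reference, the kind of dispatch being suggested here might look like the sketch below. It is not the project's actual `filesystem_from_path()`; the helper names `protocol_of` and `filesystem_for` are hypothetical, and the example assumes fsspec is installed (its https backend additionally needs aiohttp).

```python
from urllib.parse import urlsplit


def protocol_of(path: str) -> str:
    """Return the URL scheme of `path`, or "file" for plain local paths."""
    scheme = urlsplit(path).scheme
    # A one-letter scheme is treated as a Windows drive letter, not a protocol.
    return scheme if len(scheme) > 1 else "file"


def filesystem_for(path: str):
    """Hypothetical stand-in for filesystem_from_path(); an assumption, not repo code."""
    # Imported lazily so that protocol_of() stays usable without fsspec installed.
    import fsspec

    # fsspec registers an HTTP filesystem under both "http" and "https".
    return fsspec.filesystem(protocol_of(path))
```

With this shape, an `https://` path would resolve to fsspec's HTTP filesystem, while bucket and local paths keep going through their existing backends.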

@mil-ad
Collaborator

mil-ad commented Apr 17, 2026

But this is fine if you anticipate a lot of URL path manipulation.

@GaganNarula
Collaborator Author

GaganNarula commented Apr 17, 2026

> These look even less like a file system. At least buckets have the file system hierarchy, even though it's fake, but not sure about HTTP URLs.
>
> Isn't it enough to instead change filesystem_from_path() to have some sort of HTTP filesystem?

It's just that this would make life easier, for two reasons:

  1. We use filesystem_from_path and anypath a lot in esp-data, especially to join paths in the datasets when loading audio, so we need to be able to do that with "https://" prefixes; otherwise anypath() can't create the path object.

  2. Having the public r2.dev bucket lets us experiment for now without relying on the secrets and the S3 API, and there are a few open datasets that Moritz identified which also have https URLs.

Regarding "filesystem_from_path() to have some sort of HTTP filesystem?": I think the code in the diff already achieves that?
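The path-joining need in point 1 can be illustrated with a small standard-library sketch. This is not how anypath is implemented; `join_url` is a hypothetical helper and the `pub-example.r2.dev` host is a made-up placeholder.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def join_url(base: str, *parts: str) -> str:
    """Join path segments onto a URL while keeping its scheme and host intact."""
    split = urlsplit(base)
    # Strip slashes from each segment so a leading "/" cannot reset the path.
    cleaned = [p.strip("/") for p in parts]
    path = posixpath.join(split.path or "/", *cleaned)
    return urlunsplit(split._replace(path=path))


# Hypothetical public bucket URL, used only for illustration.
print(join_url("https://pub-example.r2.dev/datasets", "audio", "clip.wav"))
```

A plain `os.path.join` on the full string would also work on POSIX systems, but splitting the URL first keeps the join confined to the path component.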

@ReadyPlayerEmma
Contributor

I thought of another potential approach and wrote a working initial example in PR #275. It's very possible I am missing important context, but I figured it was worth exploring to get to know the code base a bit more, and I'd also learn from the negative case where I find out I am missing something.

@GaganNarula
Collaborator Author

> I thought of another potential approach and wrote a working initial example in PR #275. I am sure it's very possible I am missing important context, but I figured it was worth exploring to get to know the code base a bit more, and I'd learn from the negative case where I find out I am missing something too.

@ReadyPlayerEmma, is there a different solution for this https URL issue that you are working on? Should I close this PR?

3 participants