Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -16,3 +16,4 @@ venv*
*.cdx.json
release_notes.txt
.hypothesis
.claude/worktrees/
7 changes: 3 additions & 4 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -66,10 +66,9 @@ repos:
name: vulture
language: python
entry: vulture
args: ["--min-confidence=80"]
types: [python]
files: ^dfetch/
require_serial: true
args: ["--min-confidence=80", "dfetch/"]
pass_filenames: false
always_run: true
- id: pyroma (package friendliness)
name: Pyroma
entry: pyroma
Expand Down
3 changes: 2 additions & 1 deletion CHANGELOG.rst
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
Release 0.14.0 (unreleased)
===========================

* Strip ``user:password@`` userinfo from ``remote_url`` before writing it to ``.dfetch_data.yaml``, so a manifest URL with an embedded credential no longer leaks the credential into the committed metadata file (DFT-13)
* Respect the superproject's line-ending preference (#1206)
* Strip ``user:password@`` userinfo before storing metadata (#1206)
* Warn when a project URL uses a plaintext transport scheme (#1229)
* Documentation and threat-model clarifications for existing release attestation support (#1208)
* Report SVN externals fetched during update (#1220)
Expand Down
1 change: 1 addition & 0 deletions dfetch/commands/update.py
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,7 @@ def _ignored(dst: str = destination) -> list[str]:
dfetch.project.create_sub_project(project).update(
force=args.force,
ignored_files_callback=_ignored,
eol_preferences_callback=superproject.eol_preferences,
)

if not args.no_recommendations and os.path.isdir(
Expand Down
6 changes: 5 additions & 1 deletion dfetch/commands/update_patch.py
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,7 @@ def _ignored(dst: str = destination) -> list[str]:
force=True,
ignored_files_callback=_ignored,
patch_count=len(subproject.patch) - 1,
eol_preferences_callback=superproject.eol_preferences,
)

# generate reverse patch
Expand All @@ -147,7 +148,10 @@ def _ignored(dst: str = destination) -> list[str]:

# force update again to fetched version from metadata but with applying patch
subproject.update(
force=True, ignored_files_callback=_ignored, patch_count=-1
force=True,
ignored_files_callback=_ignored,
patch_count=-1,
eol_preferences_callback=superproject.eol_preferences,
)

if exceptions:
Expand Down
5 changes: 4 additions & 1 deletion dfetch/project/archivesubproject.py
Original file line number Diff line number Diff line change
Expand Up @@ -157,7 +157,9 @@ def wanted_version(self) -> Version:
return Version(revision=self._project_entry.hash)
return Version(revision=self.remote)

def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:
def _fetch_impl(
self, version: Version, eol_hint: str | None = None
) -> tuple[Version, list[Dependency]]:
"""Download and extract the archive to the local destination.

1. Download the archive to a temporary file.
Expand All @@ -170,6 +172,7 @@ def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:
Returns:
The version that was actually fetched (hash string or URL).
"""
del eol_hint # archives are extracted byte-for-byte
revision = version.revision

pathlib.Path(self.local_path).mkdir(parents=True, exist_ok=True)
Expand Down
11 changes: 7 additions & 4 deletions dfetch/project/gitsubproject.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
"""Git specific implementation."""

import pathlib
from functools import lru_cache
from pathlib import Path
Comment thread
coderabbitai[bot] marked this conversation as resolved.

from dfetch.log import get_logger
from dfetch.manifest.project import ProjectEntry
Expand Down Expand Up @@ -67,12 +67,14 @@ def list_tool_info() -> None:
)
SubProject._log_tool("git", "<not found in PATH>")

def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:
def _fetch_impl(
self, version: Version, eol_hint: str | None = None
) -> tuple[Version, list[Dependency]]:
"""Get the revision of the remote and place it at the local path."""
rev_or_branch_or_tag = self._determine_what_to_fetch(version)

# When exporting a file, the destination directory must already exist
pathlib.Path(self.local_path).mkdir(parents=True, exist_ok=True)
Path(self.local_path).mkdir(parents=True, exist_ok=True)

Comment on lines 76 to 78

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Create the parent directory, not the destination path, before file exports.

At Line 77, Path(self.local_path).mkdir(...) creates the destination itself as a directory. For file destinations, that breaks fetch/output placement.

Suggested fix
-        Path(self.local_path).mkdir(parents=True, exist_ok=True)
+        Path(self.local_path).parent.mkdir(parents=True, exist_ok=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/project/gitsubproject.py` around lines 76 - 78, The issue is that
`Path(self.local_path).mkdir(parents=True, exist_ok=True)` creates the
destination path itself as a directory instead of creating just the parent
directory, which breaks file export placement. Change the code to create only
the parent directory by using `Path(self.local_path).parent.mkdir(parents=True,
exist_ok=True)` so that the actual file can be written to the correct
destination path.

license_globs = [f"/{name.lower()}" for name in LICENSE_GLOBS] + [
f"/{name.upper()}" for name in LICENSE_GLOBS
Expand All @@ -86,6 +88,7 @@ def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:
src=self.source,
must_keeps=license_globs + [".gitmodules"],
ignore=self.ignore,
eol=eol_hint,
)
)

Expand All @@ -108,7 +111,7 @@ def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:

targets = {local_repo.METADATA_DIR, local_repo.GIT_MODULES_FILE}

for path in pathlib.Path(self.local_path).rglob("*"):
for path in Path(self.local_path).rglob("*"):
if path.name in targets:
safe_rm(path)

Expand Down
4 changes: 4 additions & 0 deletions dfetch/project/gitsuperproject.py
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,10 @@ def get_file_revision(self, path: str | pathlib.Path) -> str:
"""Get the revision of the given file."""
return str(self._repo.get_last_file_hash(str(path)))

def eol_preferences(self, paths: Sequence[str]) -> dict[str, str]:
"""Get the line ending requested per path by this repo's gitattributes."""
return self._repo.eol_attributes(paths)

@staticmethod
def import_projects() -> Sequence[ProjectEntry]:
"""Import projects from underlying superproject."""
Expand Down
41 changes: 38 additions & 3 deletions dfetch/project/subproject.py
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,9 @@ def update(
force: bool = False,
ignored_files_callback: Callable[[], Sequence[str]] | None = None,
patch_count: int = -1,
eol_preferences_callback: (
Callable[[Sequence[str]], dict[str, str]] | None
) = None,
) -> None:
"""Update this subproject if required.

Expand All @@ -104,6 +107,10 @@ def update(
hash). Calling it at both points ensures the stored hash and the check-time
hash use the same skiplist, preventing false "local changes" reports.
patch_count (int, optional): Number of patches to apply (-1 means all).
eol_preferences_callback (Callable, optional): Given a list of paths, returns
the line ending ("lf" or "crlf") the superproject requests per path (e.g.
from its gitattributes). Used to resolve the destination's preference,
which the VCS backend applies natively while fetching.
"""
to_fetch = self.update_is_required(force)

Expand Down Expand Up @@ -131,7 +138,9 @@ def update(
):
if warning := plaintext_warning(self.__project.remote_url):
logger.print_warning_line(self.__project.name, warning)
actually_fetched, dependency = self._fetch_impl(to_fetch)
actually_fetched, dependency = self._fetch_impl(
to_fetch, self._destination_eol_hint(eol_preferences_callback)
)
self._log_project(f"Fetched {actually_fetched}")

applied_patches = self._apply_patches(patch_count)
Expand All @@ -153,6 +162,22 @@ def update(
logger.debug(f"Writing repo metadata to: {self.__metadata.path}")
self.__metadata.dump()

def _destination_eol_hint(
self, callback: Callable[[Sequence[str]], dict[str, str]] | None
) -> str | None:
"""Get the directory-wide line-ending preference for the destination, if any.

Resolved with a hypothetical path inside the destination, so global and
directory-level rules can match. The VCS backend applies it natively
while fetching (git renormalisation, ``svn export --native-eol``).
"""
if not callback:
return None
exact = pathlib.PurePath(self.local_path).as_posix()
probe = f"{exact}/_"
result = callback([exact, probe])
return result.get(probe) or result.get(exact)
Comment on lines +178 to +179

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prefer exact-path EOL over probe fallback.

At Line 179, precedence is reversed. If the callback returns both keys, probe can override a more specific exact file rule, producing the wrong eol_hint.

Suggested fix
-        return result.get(probe) or result.get(exact)
+        return result.get(exact) or result.get(probe)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/project/subproject.py` around lines 178 - 179, The return statement in
the callback result handling has reversed precedence. Currently it checks the
probe result first and falls back to exact, but it should prioritize the
exact-path result over the probe fallback. Swap the order of the two
result.get() calls so that result.get(exact) is evaluated first, with
result.get(probe) as the fallback, ensuring that more specific exact file rules
take precedence over generic probe matches when both exist.


def _apply_patches(self, count: int = -1) -> list[str]:
"""Apply the patches."""
cwd = pathlib.Path(".").resolve()
Expand Down Expand Up @@ -429,8 +454,18 @@ def _are_there_local_changes(self, files_to_ignore: Sequence[str]) -> bool:
)

@abstractmethod
def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:
"""Fetch the given version of the subproject, should be implemented by the child class."""
def _fetch_impl(
self, version: Version, eol_hint: str | None = None
) -> tuple[Version, list[Dependency]]:
"""Fetch the given version of the subproject, should be implemented by the child class.

Args:
version: The version to fetch.
eol_hint: Directory-wide line ending ("lf" or "crlf") the superproject
requests for the destination. Backends apply it natively while
fetching where the VCS supports it (git renormalisation,
``svn export --native-eol``).
"""

@abstractmethod
def get_default_branch(self) -> str:
Expand Down
9 changes: 9 additions & 0 deletions dfetch/project/superproject.py
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,15 @@ def _get_useremail_fallback(self) -> str:
def get_file_revision(self, path: str | pathlib.Path) -> str:
"""Get the revision of the given file."""

def eol_preferences(self, paths: Sequence[str]) -> dict[str, str]:
"""Get the line ending ("lf" or "crlf") this project requests per path.

Only paths with an explicit preference are returned. By default a
superproject has none; VCS-specific superprojects may override this.
"""
del paths # unused arg
return {}

@staticmethod
@abstractmethod
def import_projects() -> Sequence[ProjectEntry]:
Expand Down
14 changes: 11 additions & 3 deletions dfetch/project/svnsubproject.py
Original file line number Diff line number Diff line change
Expand Up @@ -113,10 +113,16 @@ def _remove_ignored_files(self) -> None:
if not (file_or_dir.is_file() and is_license_file(file_or_dir.name)):
safe_rm(file_or_dir)

def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:
def _fetch_impl(
self, version: Version, eol_hint: str | None = None
) -> tuple[Version, list[Dependency]]:
"""Get the revision of the remote and place it at the local path."""
branch, branch_path, revision = self._determine_what_to_fetch(version)

# Let svn itself produce the requested line endings; this applies to
# files marked 'svn:eol-style=native', other files keep their bytes.
native_eol = {"lf": "LF", "crlf": "CRLF"}.get(eol_hint or "", "")

complete_path = "/".join(
filter(None, [self.remote, branch_path.strip(), self.source])
).strip("/")
Expand All @@ -128,7 +134,7 @@ def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:

complete_path, file_pattern = self._parse_file_pattern(complete_path)

SvnRepo.export(complete_path, revision, self.local_path)
SvnRepo.export(complete_path, revision, self.local_path, native_eol)

if file_pattern:
for file in find_non_matching_files(self.local_path, (file_pattern,)):
Expand All @@ -147,7 +153,9 @@ def _fetch_impl(self, version: Version) -> tuple[Version, list[Dependency]]:
if os.path.isdir(self.local_path)
else os.path.dirname(self.local_path)
)
SvnRepo.export(f"{root_branch_path}/{license_files[0]}", revision, dest)
SvnRepo.export(
f"{root_branch_path}/{license_files[0]}", revision, dest, native_eol
)

if self.ignore:
self._remove_ignored_files()
Expand Down
4 changes: 4 additions & 0 deletions dfetch/project/svnsuperproject.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,10 @@ def get_sub_project(self, project: ProjectEntry) -> SubProject | None:
"""Get the subproject in the same vcs type as the superproject."""
return SvnSubProject(project)

def eol_preferences(self, paths: Sequence[str]) -> dict[str, str]:
"""Get the line ending requested per path by this repo's svn:auto-props."""
return {path: eol for path in paths if (eol := self._repo.eol_style_for(path))}

def ignored_files(self, path: str) -> Sequence[str]:
"""Return a list of files that can be ignored in a given path."""
resolved_path = resolve_absolute_path(path)
Expand Down
5 changes: 4 additions & 1 deletion dfetch/util/cmdline.py
Original file line number Diff line number Diff line change
Expand Up @@ -40,12 +40,15 @@ def run_on_cmdline(
logger: logging.Logger,
cmd: list[str],
env: Mapping[str, str] | None = None,
input_data: bytes | None = None,
) -> "subprocess.CompletedProcess[Any]":
"""Run a command and log the output, and raise if something goes wrong."""
logger.debug(f"Running {cmd}")

try:
proc = subprocess.run(cmd, env=env, capture_output=True, check=True) # nosec
proc = subprocess.run( # nosec
cmd, env=env, input=input_data, capture_output=True, check=True
Comment on lines +49 to +50

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Remove the inline Bandit suppression from the shared process boundary.

This helper is the single subprocess runner for Git/SVN operations, so keeping # nosec here suppresses security review for every current and future caller. shell=False avoids shell injection, but it does not prove that the executable and arguments are always constrained; that proof should live in code, not in an inline waiver.

As per coding guidelines, dfetch/**/*.py: "Avoid inline lint suppressions (# noqa, # type: ignore, # pylint: disable, # pyright: ignore) without fixing the root cause".

🧰 Tools
🪛 ast-grep (0.43.0)

[error] 48-50: Command coming from incoming request
Context: subprocess.run( # nosec
cmd, env=env, input=input_data, capture_output=True, check=True
)
Note: [CWE-20].

(subprocess-from-request)


[error] 48-50: Use of unsanitized data to create processes
Context: subprocess.run( # nosec
cmd, env=env, input=input_data, capture_output=True, check=True
)
Note: [CWE-78].

(os-system-unsanitized-data)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@dfetch/util/cmdline.py` around lines 49 - 50, Remove the `# nosec` inline
Bandit suppression comment from the subprocess.run call that creates the proc
variable. The inline waiver suppresses security review for all callers of this
helper function, which violates the coding guidelines. Instead of relying on an
inline suppression, ensure that the executable and arguments passed to
subprocess.run are properly validated and constrained in the code itself, so the
security properties are evident from the implementation rather than from a
waiver.

Sources: Coding guidelines, Linters/SAST tools

)
except subprocess.CalledProcessError as exc:
raise SubprocessCommandError(
exc.cmd,
Expand Down
Loading
Loading