📝 Add docstrings to feat/default-models#2
Conversation
Docstrings generation was requested by @Reallyeasy1. * #1 (comment) The following files were modified: * `build_combined.py`
|
Important Review skippedThis PR was authored by the user configured for CodeRabbit reviews. By default, CodeRabbit skips reviewing PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback. To trigger a single review, invoke the You can disable this status message by setting the Use the checkbox below for a quick retry:
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
build_combined.py (2)
8-10: Anchor project paths to the script file, not process CWD.Using
Path(".")makes notebook discovery/output location depend on where the script is launched from. That can break execution outside repo root.💡 Suggested change
-ROOT = Path(".") +ROOT = Path(__file__).resolve().parent NB_DIR = ROOT / "notebooks" OUT_DIR = ROOT / "colab_package"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@build_combined.py` around lines 8 - 10, The project paths currently use ROOT = Path(".") which ties discovery to the current working directory; change ROOT to be anchored to the script file location by setting ROOT to Path(__file__).resolve().parent (or equivalent) so NB_DIR and OUT_DIR (the NB_DIR and OUT_DIR variables) are computed relative to the script's directory rather than the process CWD.
251-262: Remove the unusedout_dir_valparameter fromcontent_cells_nb.
out_dir_valis never used (Line 258), and all callsites pass a redundant third argument. Keeping it increases API noise.💡 Suggested change
-def content_cells_nb(cells, out_dir_var, out_dir_val): +def content_cells_nb(cells, out_dir_var): @@ - out_dir_val (str): Unused parameter retained for compatibility; it does not affect processing. @@ -for c in content_cells_nb(nb02, "OUT_TFIDF", "OUT_TFIDF"): +for c in content_cells_nb(nb02, "OUT_TFIDF"): @@ -for c in content_cells_nb(nb03, "OUT_NB", "OUT_NB"): +for c in content_cells_nb(nb03, "OUT_NB"): @@ -for c in content_cells_nb(nb04, "BERT_OUT", "BERT_OUT"): +for c in content_cells_nb(nb04, "BERT_OUT"): @@ -for c in content_cells_nb(nb05, "REPORTS_DIR", "REPORTS_DIR"): +for c in content_cells_nb(nb05, "REPORTS_DIR"):Also applies to: 302-326
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@build_combined.py` around lines 251 - 262, The function content_cells_nb currently accepts an unused parameter out_dir_val; remove out_dir_val from the function signature and docstring and update any callers that pass three arguments to call content_cells_nb with only (cells, out_dir_var); ensure the substitution logic still uses out_dir_var only and adjust any related function definitions or tests that expect the old 3-arg signature (also apply the same removal for the other similar function mentioned around the later block).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@build_combined.py`:
- Around line 343-349: The current syntax-check printing block (the branch that
iterates over errors using the errors variable and prints "syntax error(s)" and
the corresponding lines) must abort execution after reporting instead of
continuing to write the combined notebook; update that block to raise an
exception (e.g. raise RuntimeError or SystemExit with a clear message)
immediately after printing the errors so the process fails fast and no combined
artifact is written; apply the same change to the other identical
syntax-reporting block (the second print branch that currently prints "All ...
cells: syntax OK" or the errors) so both places use the same fail-fast behavior.
- Around line 371-375: The three print statements that use f-strings but have no
interpolations should be changed to plain strings to satisfy Ruff F541: update
the prints that output "\ncolab_package/ contents:", "\nAlso copy into
colab_package/:", and " sarcasm_pairs_step35_clean.jsonl (from
data/processed/)" by removing the leading f prefix so the calls in
build_combined.py simply call print("<string>") instead of print(f"<string>");
keep the existing string contents and formatting unchanged.
---
Nitpick comments:
In `@build_combined.py`:
- Around line 8-10: The project paths currently use ROOT = Path(".") which ties
discovery to the current working directory; change ROOT to be anchored to the
script file location by setting ROOT to Path(__file__).resolve().parent (or
equivalent) so NB_DIR and OUT_DIR (the NB_DIR and OUT_DIR variables) are
computed relative to the script's directory rather than the process CWD.
- Around line 251-262: The function content_cells_nb currently accepts an unused
parameter out_dir_val; remove out_dir_val from the function signature and
docstring and update any callers that pass three arguments to call
content_cells_nb with only (cells, out_dir_var); ensure the substitution logic
still uses out_dir_var only and adjust any related function definitions or tests
that expect the old 3-arg signature (also apply the same removal for the other
similar function mentioned around the later block).
| if errors: | ||
| print(f"\n{len(errors)} syntax error(s):") | ||
| for i, ln, msg, line in errors: | ||
| print(f" Cell {i:3d} line {ln}: {line!r}") | ||
| print(f" {msg}") | ||
| else: | ||
| print(f"All {len(all_cells)} cells: syntax OK") |
There was a problem hiding this comment.
Fail fast when syntax validation finds errors.
Right now, syntax errors are printed but execution still writes the combined notebook. That can ship a broken artifact. After reporting errors, raise and stop.
🐛 Suggested fix
if errors:
print(f"\n{len(errors)} syntax error(s):")
for i, ln, msg, line in errors:
print(f" Cell {i:3d} line {ln}: {line!r}")
print(f" {msg}")
+ raise SyntaxError(f"Aborting write: {len(errors)} syntax error(s) detected in combined notebook cells.")
else:
print(f"All {len(all_cells)} cells: syntax OK")Also applies to: 352-369
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@build_combined.py` around lines 343 - 349, The current syntax-check printing
block (the branch that iterates over errors using the errors variable and prints
"syntax error(s)" and the corresponding lines) must abort execution after
reporting instead of continuing to write the combined notebook; update that
block to raise an exception (e.g. raise RuntimeError or SystemExit with a clear
message) immediately after printing the errors so the process fails fast and no
combined artifact is written; apply the same change to the other identical
syntax-reporting block (the second print branch that currently prints "All ...
cells: syntax OK" or the errors) so both places use the same fail-fast behavior.
| print(f"\ncolab_package/ contents:") | ||
| for p in sorted(OUT_DIR.iterdir()): | ||
| print(f" {p.name}") | ||
| print(f"\nAlso copy into colab_package/:") | ||
| print(f" sarcasm_pairs_step35_clean.jsonl (from data/processed/)") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, check if build_combined.py exists and verify the specific lines
if [ -f "build_combined.py" ]; then
echo "=== Lines 371-375 from build_combined.py ==="
sed -n '371,375p' build_combined.py
else
echo "File not found. Searching for it..."
find . -name "build_combined.py" -type f
fiRepository: SeeYangZhi/Project-LLMao
Length of output: 322
🏁 Script executed:
# Run the verification script from the review comment
python - <<'PY'
from pathlib import Path
p = Path("build_combined.py")
if p.exists():
for i, line in enumerate(p.read_text(encoding="utf-8").splitlines(), 1):
s = line.strip()
if s.startswith("print(f\"") and "{" not in s:
print(f"Line {i}: {s}")
else:
print(f"File {p} not found")
PYRepository: SeeYangZhi/Project-LLMao
Length of output: 246
Fix Ruff F541: remove f prefix from non-interpolated strings.
Lines 371, 374, and 375 are plain strings and should not be f-strings.
💡 Suggested change
-print(f"\ncolab_package/ contents:")
+print("\ncolab_package/ contents:")
@@
-print(f"\nAlso copy into colab_package/:")
-print(f" sarcasm_pairs_step35_clean.jsonl (from data/processed/)")
+print("\nAlso copy into colab_package/:")
+print(" sarcasm_pairs_step35_clean.jsonl (from data/processed/)")🧰 Tools
🪛 Ruff (0.15.2)
[error] 371-371: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 374-374: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 375-375: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@build_combined.py` around lines 371 - 375, The three print statements that
use f-strings but have no interpolations should be changed to plain strings to
satisfy Ruff F541: update the prints that output "\ncolab_package/ contents:",
"\nAlso copy into colab_package/:", and " sarcasm_pairs_step35_clean.jsonl
(from data/processed/)" by removing the leading f prefix so the calls in
build_combined.py simply call print("<string>") instead of print(f"<string>");
keep the existing string contents and formatting unchanged.
a51c032 to
1ac8cca
Compare
Docstrings generation was requested by @Reallyeasy1.
The following files were modified:
build_combined.pyThese file types are not supported
.gitignoredocs/dataset_audit.mddocs/experiment_plan.mddocs/research_summary.mdnotebooks/01_data_preparation.ipynbnotebooks/02_tfidf_lr_baseline.ipynbnotebooks/03_naive_bayes_baseline.ipynbnotebooks/04_bert_classification.ipynbnotebooks/05_error_analysis.ipynbrequirements.txtℹ️ Note
Summary by CodeRabbit