CSharp extension: ZIP archive support for InstallExternalLibrary/UninstallExternalLibrary #85

Open: stuartpa wants to merge 20 commits into microsoft:main from stuartpa:dev/stuartpa/dotnet-install-external-library
stuartpa commented Apr 17, 2026

Summary

Extends InstallExternalLibrary / UninstallExternalLibrary in the .NET Core CSharp Language Extension to support ZIP archives with arbitrary file trees (e.g. packages with nested folders), not just flat DLL files. Also makes InstallExternalLibrary idempotent so ALTER EXTERNAL LIBRARY works correctly.

What changed vs. sicongliu's base branch

1. Manifest-based uninstall

UninstallExternalLibrary only receives LibraryName + LibraryInstallDirectory, with no LibraryFile, so we cannot re-read ZIP entries from sys.external_libraries at drop time. To work around this, InstallExternalLibrary now writes a <libName>.manifest file listing the relative paths of every extracted file (or, for raw-DLL installs, the single <libName>.dll entry). UninstallExternalLibrary reads that manifest and deletes exactly those files: no more, no less.

The previous uninstall implementation deleted everything in the shared install directory, which would wipe out unrelated libraries' files.
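
The manifest idea can be illustrated with a minimal C++17 sketch. This is not the PR's managed C# code: the helper names `WriteManifest` and `UninstallFromManifest` are hypothetical, and the one-relative-path-per-line manifest format is an assumption made for illustration.

```cpp
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: record every installed file (relative to installDir),
// one path per line, in <libName>.manifest. The on-disk format is an assumption.
void WriteManifest(const fs::path &installDir, const std::string &libName,
                   const std::vector<std::string> &relativePaths)
{
    std::ofstream out(installDir / (libName + ".manifest"));
    for (const std::string &rel : relativePaths)
    {
        out << rel << '\n';
    }
}

// Hypothetical helper: delete exactly the files listed in the manifest, then
// the manifest itself. Files not listed are never touched.
// Returns the number of listed files that were actually removed.
int UninstallFromManifest(const fs::path &installDir, const std::string &libName)
{
    const fs::path manifestPath = installDir / (libName + ".manifest");
    int removed = 0;
    {
        std::ifstream in(manifestPath);
        std::string rel;
        while (std::getline(in, rel))
        {
            std::error_code ec;
            if (!rel.empty() && fs::remove(installDir / rel, ec))
            {
                ++removed;
            }
        }
    } // Close the stream before deleting the manifest (matters on Windows).

    std::error_code ec;
    fs::remove(manifestPath, ec);
    return removed;
}
```

Because only listed paths are deleted, a file that another library placed in the same install directory survives the uninstall.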

2. File-level conflict detection on install

Before extracting a ZIP, every entry is checked against the install directory. If a file of the same name already exists (from another library), install fails with a clear error:

Cannot install library '<X>': file '<Y>' already exists in the install directory.

Directory (folder) overlaps are allowed: multiple libraries can share a parent folder; they just can't overwrite each other's files.

3. Empty-directory cleanup on uninstall

After a manifest-driven delete, parent directories of removed files are walked deepest-first (sorted by separator count) and removed only if empty. Shared parent folders survive as long as any other library still has content in them.

4. ALTER EXTERNAL LIBRARY support (transactional re-install)

SQL Server may call InstallExternalLibrary again for the same library name during ALTER EXTERNAL LIBRARY without first calling UninstallExternalLibrary. The install now:

  1. Extracts and conflict-checks the new content into a temporary staging folder first
  2. Only cleans up the previous version's manifest after the new content has validated successfully
  3. Suppresses false-positive conflicts between the new version and the old version's own files

If the new ZIP is corrupt or conflicts with another library, the old version is left intact: the install is atomic from the caller's perspective.

5. Defense in depth

  • Zip-slip: ZipFile.ExtractToDirectory rejects path-traversal entries at extraction time. The single-path collapse in review pass 3 (see Update below) means the install code never sees an entry that escaped the staged tree, so there is no second-level vector to defend.
  • Symlink defense: CopyDirectory skips entries with the ReparsePoint attribute at every recursion level, covering both file and directory symlinks. Today's .NET writes symlink-mode ZIP entries as regular files on every platform; this guard is future-proofing for a runtime that materializes them as real symlinks. Pinned by InnerZipFutureSymlinkRejectedTest.
  • Library-name validation: ValidateLibraryName rejects null/empty/whitespace-only names, names containing .. / path separators / null characters, absolute paths, extension-only names like .dll, and Windows reserved DOS device names (CON, NUL, AUX, PRN, COM1–COM9, LPT1–LPT9, bare or suffixed). Reserved-name rejection is enforced on every OS so libraries moved between hosts behave consistently.
  • Native error-buffer ownership: SetLibraryError in the native code allocates the error buffer with malloc (matching the managed Marshal.AllocHGlobal contract that ExtHost expects on the other side), so ownership transfers cleanly to the host. Pre-fix returned a c_str() pointer into a freed std::string, which is undefined behavior on the host side.
  • Linux case-sensitivity: path comparisons use Ordinal on Linux and OrdinalIgnoreCase on Windows, so /install/Lib and /install/lib are correctly treated as distinct paths on Linux.
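
The library-name rules in the list above can be approximated by the following C++17 sketch. `IsValidLibraryName` is a hypothetical stand-in for the managed ValidateLibraryName; taking the stem as the text before the last dot (mirroring Path.GetFileNameWithoutExtension) and treating any trailing digit after COM/LPT as reserved are assumptions of this sketch.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Hypothetical sketch of the checks listed above; returns true if acceptable.
bool IsValidLibraryName(const std::string &name)
{
    // Empty or whitespace-only (all_of is true for an empty range).
    if (std::all_of(name.begin(), name.end(),
                    [](unsigned char c) { return std::isspace(c) != 0; }))
    {
        return false;
    }

    // Path separators, embedded nulls, and ".." are rejected. Absolute paths
    // always contain a separator, so they are covered by the same check.
    if (name.find_first_of(std::string("/\\\0", 3)) != std::string::npos ||
        name.find("..") != std::string::npos)
    {
        return false;
    }

    // Stem = text before the last '.'; an empty stem means extension-only.
    std::string stem = name.substr(0, name.rfind('.'));
    if (stem.empty())
    {
        return false;
    }

    // Reserved DOS device names are compared case-insensitively on every OS.
    std::transform(stem.begin(), stem.end(), stem.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    static const std::array<const char *, 4> reserved = {"CON", "NUL", "AUX", "PRN"};
    if (std::find(reserved.begin(), reserved.end(), stem) != reserved.end())
    {
        return false;
    }

    // COMn / LPTn: any single trailing digit is treated as reserved here
    // (a conservative assumption of this sketch).
    if (stem.size() == 4 &&
        (stem.compare(0, 3, "COM") == 0 || stem.compare(0, 3, "LPT") == 0) &&
        std::isdigit(static_cast<unsigned char>(stem[3])))
    {
        return false;
    }

    return true;
}
```

For example, `IsValidLibraryName("nul.manifest")` is rejected on Linux as well as Windows, which matches the stated goal of consistent behavior when libraries move between hosts.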

Error-handling matrix

| Scenario | Behavior |
| --- | --- |
| ZIP contains a file that already exists on disk | Install fails with explicit error; no partial state |
| Two libraries extract to the same folder | Allowed (folder overlap is fine, file overlap is rejected) |
| Non-ZIP file (raw DLL) | Copied directly as <libName>.dll; manifest written so uninstall and ALTER work uniformly |
| Raw DLL would overwrite a foreign <libName>.dll planted by another library or external tooling | Install fails with explicit error (preserves the pre-PR bFailIfExists=TRUE contract) |
| File with .zip extension whose bytes are not a valid ZIP | Install fails with explicit error; user's file is not silently renamed to <libName>.dll |
| Drop library with no manifest | <libName>.dll is deleted directly (legacy-compat path for libraries installed by pre-PR builds) |
| ALTER EXTERNAL LIBRARY with valid new content | Old content replaced atomically via manifest |
| ALTER EXTERNAL LIBRARY with corrupt new content | Old content preserved; error returned |
| ZIP entry path escapes install dir (zip-slip) | Install fails with explicit error |
| Library name is empty, whitespace-only, contains ../path separator/null character, is an absolute path, is extension-only (.dll), or is a Windows reserved DOS device name (CON/NUL/AUX/PRN/COMn/LPTn, bare or suffixed) | Install fails with explicit error |
| Inner ZIP entry has a Unix symlink mode (theoretical, future runtime) | IsReparsePoint guard skips the entry; legitimate files still install |

Tests

28 new TEST_F cases in CSharpLibraryTests.cpp covering:

  • Manifest creation and content (flat + nested paths)
  • <libName>.dll alias naming and removal on uninstall
  • Flat-file coexistence allowed; file conflict rejected
  • Inner-ZIP file conflict rejected
  • Uninstall preserves unrelated libraries and shared nested directories
  • Empty nested directory cleanup
  • ALTER-style re-install (ZIP↔ZIP, ZIP↔non-ZIP, non-ZIP↔non-ZIP, ALTER to empty-dirs ZIP preserving v1)
  • Error message surfacing through libraryError
  • Conflict in inner-ZIP code path; temp-folder cleanup on failure
  • Raw-DLL install fails when a foreign <libName>.dll exists
  • Alias-conflict detected before extraction
  • Library-name validation rejects empty / whitespace-only / path-traversal / null character / absolute path / extension-only / reserved DOS device name
  • Uninstall rejects path-traversal library names
  • ZIP install with .dll-suffixed library name
  • Install of a .zip-extension file with non-ZIP bytes fails loudly (does not silently rewrite the user's file)
  • Inner-ZIP future-symlink rejection (regression test for the single-path collapse; see Update below)
  • Uninstall handles missing install directory gracefully

Update (review pass 2)

Atomicity contract clarified: Install is NOT atomic at the per-file level. A crash between CleanupManifest and the CopyDirectory loop can leave the install directory inconsistent. End-to-end recovery is provided by SQL Server's library management architecture: the catalog is the source of truth, and the next session re-installs from the catalog. The in-extension code is staging-validated (corrupt ZIPs cannot start a destructive cleanup) but is not crash-safe.

Raw-DLL installs now write a manifest. A one-entry {libName}.manifest listing {libName}.dll is written for raw-DLL installs as well as ZIPs. This:

  • Restores the pre-PR CopyFileW(..., bFailIfExists=TRUE) contract: a foreign {libName}.dll planted by another library or external tooling is no longer silently overwritten; install fails.
  • Makes ALTER from raw→ZIP and ZIP→raw work via the existing manifest cleanup path, with no special-case logic needed.

Other v3.8.0 hardening: native SetLibraryError null-check, ZIP file opens with FileShare.Read, CopyDirectory skips ReparsePoint entries (Linux symlink defense), alias suppression now correctly counts only root-level matches (DllUtils.CreateDllList is non-recursive), various comments / doc improvements per inline review.

Update (review pass 3)

Addresses 4 inline comments from JustinMDotNet and 13 from yaelh. Tip: 8ef1573.

Native / managed correctness fixes:

  • nativecsharpextension.cpp SetLibraryError: replaced new std::string(errorString) + c_str() with malloc(len + 1) + memcpy. Buffer ownership transfers cleanly to ExtHost; OOM path returns the no-error state instead of crashing.
  • CSharpExtension.cs AcquireInstallLock: narrowed the catch via when (IsSharingViolation(ex)) filter (HResult ERROR_SHARING_VIOLATION (32) / ERROR_LOCK_VIOLATION (33)). DirectoryNotFoundException, PathTooLongException, UnauthorizedAccessException, etc. now propagate fast instead of being swallowed.
  • DllUtils.cs: replaced Directory.GetFiles(searchPath, "{name}.*") + .Where(...EndsWith(".dll")) with EnumerateFiles + explicit Equals(".dll", OrdinalIgnoreCase) on extension and Equals(userLibName, OrdinalIgnoreCase) on stem. Eliminates the *.dll matching foo.dllx and Foo.* matching short-name 8.3 alias quirks.
  • CSharpExtension.cs install path collapsed to a single code path: both inner-zip and outer-zip cases now extract into tempFolder/inner-content/ first and walk the result on disk via IsReparsePoint-guarded CopyDirectory. Removed the direct-extract-to-installDir shortcut so a future runtime that materializes Unix symlink-mode entries can't bypass the reparse-point guard. ValidateRelativePath removed (zero callers after the collapse; zip-slip defense is now provided by ZipFile.ExtractToDirectory plus the on-disk walk being unable to see entries that escaped the staged tree).
  • New regression test InnerZipFutureSymlinkRejectedTest + fixture testpackageK-SYMLINK.zip (262 bytes, generated by build-symlink-fixture.ps1). Inner zip contains legitfile.dll + evil-symlink.dll (Unix mode 0o120755, content /etc/passwd). Asserts: install succeeds, legitfile.dll lands in installDir, and installDir contains zero reparse points.
  • UninstallExternalLibrary: wrapped File.Delete(libraryFile) in an else branch so the manifest path exclusively owns cleanup for current-version installs. The direct delete only runs as a legacy-compat path for libraries installed by pre-PR builds with no manifest.
  • DetermineAliasSource: lexicographically-first .dll candidate via string.CompareOrdinal (stable across NTFS / ext4 / XFS and re-installs).
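
The SetLibraryError change in the list above (malloc(len + 1) + memcpy instead of returning a pointer tied to a std::string's lifetime) can be sketched as follows. `SetLibraryErrorSketch` is a hypothetical name, and `unsigned char` / `int` stand in for the ODBC SQLCHAR / SQLINTEGER types so the example compiles without ODBC headers.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical sketch: copy errorString into a malloc'd buffer that the caller
// owns. On allocation failure (or an empty message) the outputs are left in
// the "no error" state instead of crashing.
void SetLibraryErrorSketch(const std::string &errorString,
                           unsigned char **libraryError,
                           int *libraryErrorLength)
{
    if (libraryError == nullptr || libraryErrorLength == nullptr)
    {
        return;
    }

    *libraryError = nullptr;
    *libraryErrorLength = 0;
    if (errorString.empty())
    {
        return;
    }

    const std::size_t len = errorString.size();
    unsigned char *buffer = static_cast<unsigned char *>(std::malloc(len + 1));
    if (buffer == nullptr)
    {
        return; // Out of memory: report no error rather than crash.
    }

    // Copy the bytes plus the terminating null; the buffer no longer depends
    // on the lifetime of errorString.
    std::memcpy(buffer, errorString.c_str(), len + 1);
    *libraryError = buffer;
    *libraryErrorLength = static_cast<int>(len);
}
```

In this sketch the caller releases the buffer with `std::free`; which deallocator the real host uses is outside the scope of the example.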

Validation / behavior tightening:

  • ValidateLibraryName: now rejects whitespace-only names (IsNullOrWhiteSpace) and Windows reserved DOS device names (full set: CON, NUL, AUX, PRN, COM0–COM9, LPT0–LPT9). Stem is checked via Path.GetFileNameWithoutExtension, so CON, CON.dll, and nul.manifest are all rejected. Enforced on every OS for consistency. New rows in InstallRejectsInvalidLibNameTest.
  • IsZipFile content-sniff replaced by extension-based HasZipExtension. A .zip file whose bytes are not a valid archive now returns SQL_ERROR instead of being silently rewritten as <libName>.dll; the user's registered filename is opaque to the install path. The corresponding test was renamed to InstallZipExtensionWithBadContentFailsLoudlyTest and inverted to assert the loud-failure behavior.

Test hardenings:

  • InstallZipWithManyFilesTest: per-module existence loop (Module1.dll … Module50.dll) plus a comment block with the full historical context (the original EXPECT_EQ(dllCount, 50) agreed with the buggy install code that created the alias as {libName} with no extension; the test asserted what the code did, not what it should do; fixed in commit 38c553d).
  • DirectoryOverlapAllowedTest renamed to NonConflictingFlatFilesCoexistTest with a tighter comment block clarifying that nested-directory overlap is covered separately.

Docs / style:

  • IsReparsePoint <remarks> now includes the concrete sneaky.dll → /etc/shadow worked example covering both file and directory reparse-point cases.
  • Trimmed the CleanupManifest catch comment to stop at the diagnostic-trail justification.
  • Project-wide blank-line-after-} sweep applied where the pattern was genuinely two adjacent independent constructs (file/dir loops, consecutive flag-setter ifs); guard-then-action } followed by continue;/return X; left alone.
  • One-line rationale comments at the explicit .close() sites in test code so the pattern is self-documenting.

Test results: 112/112 unit tests pass on Windows release config.

Update (review pass 4)

Addresses 4 inline comments from JustinMDotNet (CMake std flag, header doc contract, alias suppression for sidecar-only roots, native error-buffer ownership in tests) plus 4 production-found bugs from end-to-end testing against SQL Server 2025 RTM-GDR.

Production fixes (end-to-end testing):

  • CSharpExtension.cs AcquireInstallLock: lock file is now created at Path.Combine(installDir, "install.lock") instead of one level up. The previous path put the lock outside the per-<dbid>/<langid> install directory, so concurrent installs into different DB/language slots serialized against each other unnecessarily and (worse) a single install could race against itself when the parent directory was on a separately-permissioned mount.
  • CSharpExtension.cs new DispatchAsZip(libName, libFilePath) helper: install dispatch is now driven by the registered library name's extension, not the staged temp file's extension. SQL Server's ExtHost passes a generated temp file with no semantic suffix, so the previous content-/extension-sniff on libFilePath was meaningless in production. Order of resolution is libName ends in .zip → ZIP, libName ends in .dll → raw DLL, otherwise fall back to libFilePath's extension (preserves the legacy contract for test fixtures that register libraries by bare name and point libraryFile at a *.zip / *.dll fixture).
  • CSharpOutputDataSet.cs + Sql.cs DotNetNVarChar plumbing: the SqlDataType.DotNetNVarChar enum value (the string row in Sql.DataTypeMap) had no entry in Sql.DataTypeSize and no case in CSharpOutputDataSet.ExtractColumn / GetStrLenNullMap. Calls fell through to default and threw KeyNotFoundException in DataTypeSize before the column ever reached the dispatch switch. Fixed by adding the MinUtf16CharSize row to DataTypeSize and a DotNetNVarChar case alongside DotNetWChar in both switches (they're SQL_C_WCHAR-shaped at the ODBC layer and share an implementation).
  • CSharpOutputDataSet.cs DotNetWChar / DotNetNVarChar Size unit: Size is now reported in bytes, matching the unit emitted by GetStrLenNullMap (Encoding.Unicode.GetByteCount). The previous code divided by a UTF-16 code-unit width and reported a character count, so SPEES logged "Reading one row failed for column N row M. The length information is incorrect." and rejected the rowset whenever a string column contained non-ASCII data.
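
The resolution order given in the DispatchAsZip item above can be rendered as a short C++17 sketch. The real helper is managed C#; the ASCII case-insensitive suffix comparison here is an assumption of the sketch, and `EndsWithIgnoreCase` is a hypothetical utility.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical utility: ASCII case-insensitive suffix test.
static bool EndsWithIgnoreCase(const std::string &text, const std::string &suffix)
{
    if (text.size() < suffix.size())
    {
        return false;
    }

    return std::equal(suffix.rbegin(), suffix.rend(), text.rbegin(),
                      [](unsigned char a, unsigned char b)
                      { return std::tolower(a) == std::tolower(b); });
}

// Sketch of the described order: the registered library name decides first;
// the staged file's extension is consulted only as a fallback.
bool DispatchAsZip(const std::string &libName, const std::string &libFilePath)
{
    if (EndsWithIgnoreCase(libName, ".zip"))
    {
        return true; // libName ends in .zip: treat as ZIP
    }

    if (EndsWithIgnoreCase(libName, ".dll"))
    {
        return false; // libName ends in .dll: treat as raw DLL
    }

    // Bare library name: fall back to the file path's extension.
    return EndsWithIgnoreCase(libFilePath, ".zip");
}
```

With a generated temp file that carries no meaningful suffix, `DispatchAsZip("pkg.zip", "/tmp/abc123")` still selects the ZIP path because the library name is consulted first.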

Reviewer fixes:

  • test/src/native/CMakeLists.txt: non-MSVC target_compile_options flag changed from --std=c++17 to -std=c++17 (one dash). GCC and Clang accept both spellings, but the single-dash form is the documented one and matches the rest of the build tree.
  • include/nativecsharpextension.h: InstallExternalLibrary doc comment rewritten to spell out the new dispatch contract (libName-based, not libFile-content-based) and to document that a {libName}.manifest is written for every install (ZIP or raw DLL). The previous comment described only the legacy "raw DLL or ZIP detected from the file" behavior.
  • CSharpExtension.cs DetermineAliasSource: alias-suppression now requires an exact match against aliasFileName ("{libName}.dll") at the install root. The previous check accepted any root-level entry whose name started with "{libName}.", so a ZIP that planted only sidecars at the root (e.g. foo.deps.json, foo.runtimeconfig.json) and kept the real binary nested under lib/net8.0/foo.dll was treated as if foo.dll were already loadable — alias creation was suppressed and the install was un-loadable. The unused libName parameter was dropped from the signature in the same change. New regression test AliasCreatedWhenOnlySidecarsAtRootTest + fixture testpackageL-SIDECAR.zip (491 bytes, generated by build-sidecar-fixture.ps1) pins the new behavior.
  • test/src/native/CSharpLibraryTests.cpp new FreeLibError(SQLCHAR *) helper using LocalFree (matches the production Marshal.AllocHGlobal / LocalAlloc allocator that ExtHost expects on the consumer side). Wired into CallInstall, CallUninstall, and CallInstallCaptureError so the test harness no longer leaks the libError buffer on every failing-install assertion. Pre-fix the harness allocated, ignored, and never freed — a 113-test run accumulated kilobytes of orphan libError strings on the heap.

Test results: 113/113 unit tests pass on Windows release config (one new TEST_F: AliasCreatedWhenOnlySidecarsAtRootTest).

…xtension

Implement the optional InstallExternalLibrary and UninstallExternalLibrary APIs
for the .NET C# language extension, enabling CREATE/DROP EXTERNAL LIBRARY support
for both ZIP-packaged and raw DLL C# libraries in SQL Server.

Implementation:
- Native C++ entry points in nativecsharpextension.cpp with null-runtime guards
- Managed C# implementation in CSharpExtension.cs using System.IO.Compression.ZipFile
- ZIP magic byte detection (PK header) to distinguish ZIP vs raw DLL content
- ZIP path: extract to temp dir, scan for inner .zip and extract to install dir;
  if no inner ZIP, copy files directly. Creates {libName}.dll alias when extracted
  file names differ from the SQL library name, so DllUtils can discover them.
- Raw DLL path: copy file to install dir as {libName}.dll
- UninstallExternalLibrary clears all files/dirs from the install directory
- Unique temp folder per call (Guid-based) to prevent race conditions
- Fix DllUtils.CreateDllList to search userLibName + ".*" pattern instead of
  exact name (pre-existing bug: exact name never matched files with extensions)

Testing:
- 14 new test cases in CSharpLibraryTests.cpp
- Updated ExecuteInvalidLibraryNameScriptTest for DllUtils pattern change
- 9 test packages in test/test_packages/
- All 73 tests pass (59 existing + 14 new)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
stuartpa force-pushed the dev/stuartpa/dotnet-install-external-library branch from e81fdfb to 5377b74 April 17, 2026 11:46
stuartpa changed the base branch from main to dev/sicongliu/dotnet-install-external-library April 17, 2026 11:47
stuartpa force-pushed the dev/stuartpa/dotnet-install-external-library branch from 5377b74 to 82f9cbc April 17, 2026 11:53
…fest-based cleanup, conflict detection, ALTER support

- Fix build script to use VS 2022 instead of VS 2019
- Add global.json to pin .NET SDK 8.0
- Fix: use library name as-is (no unconditional .dll append)
- InstallExternalLibrary: ZIP extraction with file conflict detection, manifest tracking
- UninstallExternalLibrary: manifest-based targeted file deletion, bottom-up empty dir cleanup
- ALTER EXTERNAL LIBRARY: clean up previous install before re-installing
- CHANGES.md documenting all changes
stuartpa force-pushed the dev/stuartpa/dotnet-install-external-library branch from 82f9cbc to 10ce19a April 17, 2026 12:09
stuartpa changed the title from "Add InstallExternalLibrary/UninstallExternalLibrary for .NET extension + build toolchain updates" to "CSharp extension: ZIP archive support for InstallExternalLibrary/UninstallExternalLibrary" Apr 17, 2026
Stuart Padley added 4 commits April 17, 2026 05:18
…LTER

New tests:
- ManifestWrittenTest, ManifestListsNestedFilesTest
- InstallLibNameAliasNoExtensionTest
- DirectoryOverlapAllowedTest, FileConflictFailsTest
- UninstallPreservesOtherLibrariesTest
- UninstallRemovesEmptyNestedDirsTest
- AlterExternalLibraryTest

Update 3 existing tests to reflect new libName-without-extension naming:
- InstallInvalidZipTest, InstallRawDllNotZipTest, InstallZipWithManyFilesTest
- ErrorMessagePopulatedOnFailureTest: verify libError is surfaced to SQL Server
  for non-existent file, zip-slip, and file-conflict failure modes
- UninstallNonZipLibraryTest: raw-DLL install/uninstall (no-manifest path)
- InnerZipFileConflictFailsTest: conflict detection in inner-zip code path
- TempFolderCleanedUpAfterConflictTest: no GUID temp-dir leaks after failures
- AlterFromNonZipToZipTest: ALTER from raw DLL to ZIP (missing-manifest case)
- AliasFileRemovedOnUninstallTest: libName-alias file recorded in manifest
  and removed on uninstall
…validation, Linux case-sensitivity, cleanup

- Alias file now written as {libName}.dll so DllUtils.CreateDllList ({libName}.*) finds it (fixes feature regression on Linux)

- Raw-DLL install writes {libName}.dll; uninstall fallback updated to match

- Defense-in-depth zip-slip check on inner-ZIP entries before adding to manifest and again in CleanupManifest before deleting

- ALTER is now transactional: stage + validate new ZIP before removing old version, with manifest-aware conflict suppression

- Reject library names containing path separators, '..', null, or absolute paths

- Use OS-appropriate path comparer (Ordinal on Linux, OrdinalIgnoreCase on Windows)

- Sort dirs for bottom-up cleanup by separator count, not string length

- Extract conflict detection into single helper (removes duplicate check)

- Trim error messages and comments; update 3 tests to match new alias naming
stuartpa marked this pull request as ready for review April 17, 2026 14:49
stuartpa added 2 commits April 17, 2026 09:07
This PR branch is based on sicongliu's branch which predates PR microsoft#83 (DECIMAL Type Support on main). PR microsoft#83 added Microsoft.Data.SqlClient 5.2.2 as a required runtime dependency for SqlDecimal handling. Without it, sp_execute_external_script fails with HRESULT 0x80004004 when any script touches DECIMAL types.
Stuart Padley added 2 commits April 21, 2026 07:23
When CREATE EXTERNAL LIBRARY is called with a name ending in .dll (e.g. [Scriptoria.dll] WITH (LANGUAGE='dotnet')), the raw-DLL install path and missing-alias fallback both unconditionally appended '.dll', producing files like 'Scriptoria.dll.dll' on disk. The CLR assembly resolver could not locate the assembly by simple name and ExtHost died with 'Could not load file or assembly Scriptoria'.

Added DllFileNameFor(libName) helper that returns libName as-is if it already ends in .dll (case-insensitive on Windows via existing s_pathComparison), otherwise appends .dll. Applied at the three sites in CSharpExtension.cs: raw-DLL install path, missing-alias fallback inside the ZIP install path, and UninstallExternalLibrary raw-DLL cleanup. ZIP-based installs whose library names do not end in .dll are unaffected.
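
A minimal C++17 sketch of the naming rule this helper implements. The real DllFileNameFor is managed C# and uses the platform's path comparison; this sketch always compares case-insensitively, which is a simplifying assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical C++ rendering of DllFileNameFor: append ".dll" only when the
// library name does not already end in it (ASCII case-insensitive here).
std::string DllFileNameFor(const std::string &libName)
{
    const std::string ext = ".dll";
    const bool hasExt =
        libName.size() >= ext.size() &&
        std::equal(ext.begin(), ext.end(),
                   libName.end() - static_cast<std::ptrdiff_t>(ext.size()),
                   [](unsigned char a, unsigned char b)
                   { return std::tolower(a) == std::tolower(b); });

    return hasExt ? libName : libName + ext;
}
```

Under this rule `DllFileNameFor("Scriptoria.dll")` stays `Scriptoria.dll` rather than becoming `Scriptoria.dll.dll`, while `DllFileNameFor("Scriptoria")` still gains the extension.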

Verified end-to-end on consumer side: Test-SpAiTaskSkills.ps1 reports 4 passed, 0 failed.
stuartpa changed the base branch from dev/sicongliu/dotnet-install-external-library to main April 21, 2026 14:41
InstallRawDllWithDllSuffixedLibNameTest covers the case where the library is created via CREATE EXTERNAL LIBRARY [foo.dll] (libName already ending in .dll). Asserts that the raw DLL is written as 'foo.dll' (not 'foo.dll.dll') and that uninstall removes it from the same single-.dll path. Pairs with the DllFileNameFor helper added in this PR.
Copilot AI review requested due to automatic review settings April 21, 2026 14:46
Copilot AI left a comment

Pull request overview

Adds ZIP-archive support (including nested file trees) for the .NET Core C# language extension’s InstallExternalLibrary / UninstallExternalLibrary, aiming to make installs idempotent and uninstalls precise via a per-library manifest.

Changes:

  • Implemented managed InstallExternalLibrary / UninstallExternalLibrary with manifest-driven uninstall, conflict detection, and temp-folder staging.
  • Updated DLL discovery logic in DllUtils and added native exports/wiring for the new APIs.
  • Added extensive native tests plus new ZIP/DLL test packages for edge cases (zip-slip, empty zips, nested trees, many files, etc.).

Reviewed changes

Copilot reviewed 9 out of 17 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| language-extensions/dotnet-core-CSharp/src/managed/CSharpExtension.cs | Implements managed install/uninstall logic (ZIP extraction, manifests, conflict checks, alias handling). |
| language-extensions/dotnet-core-CSharp/src/managed/utils/DllUtils.cs | Adjusts library DLL discovery pattern matching. |
| language-extensions/dotnet-core-CSharp/src/native/nativecsharpextension.cpp | Exposes native InstallExternalLibrary / UninstallExternalLibrary forwarding into managed code. |
| language-extensions/dotnet-core-CSharp/include/nativecsharpextension.h | Declares the new native library-management API exports. |
| language-extensions/dotnet-core-CSharp/test/include/CSharpExtensionApiTests.h | Adds function pointer typedefs and fixture members for install/uninstall APIs. |
| language-extensions/dotnet-core-CSharp/test/src/native/CSharpExtensionApiTests.cpp | Loads the install/uninstall exports for tests. |
| language-extensions/dotnet-core-CSharp/test/src/native/CSharpLibraryTests.cpp | Adds comprehensive unit tests for the new behaviors (manifest uninstall, conflicts, aliasing, ALTER-like reinstall, zip-slip, etc.). |
| language-extensions/dotnet-core-CSharp/test/src/native/CSharpExecuteTests.cpp | Tweaks invalid-library-name test input. |
| language-extensions/dotnet-core-CSharp/test/test_packages/*.zip / *.dll | Adds new fixture packages (nested layout, many files, zip-slip, bad zip, raw dll, etc.). |

…e validation, recursive/conflict-checked alias, CopyDirectory no-overwrite

DllUtils.CreateDllList: try exact userLibName first; fall back to userLibName + '.*' only if no exact match. Extracted into AddMatches helper. Fixes callers that pass an explicit filename like 'Foo.dll' which previously became 'Foo.dll.*' and matched nothing.

CSharpExtension.UninstallExternalLibrary: call ValidateLibraryName before building manifestPath / libraryFile. Prevents malicious or legacy names with path separators from resolving outside installDir.

CSharpExtension.InstallExternalLibrary alias creation: (a) search the full extracted tree (not just top-level) for an existing '{libName}.*' before deciding to create an alias; (b) include the alias path in the conflict-check input so a collision fails BEFORE any content is written to installDir. Prevents partial-state failures when another library already owns '{libName}.dll' at the root.

CSharpExtension.CopyDirectory: use File.Copy overwrite:false (was overwrite:true) so TOCTOU changes between conflict-check and write fail loud rather than silently clobbering another library's files.

Tests: added UninstallRejectsPathTraversalLibNameTest and AliasConflictDetectedBeforeExtractionTest to cover the new behaviors.
Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 17 changed files in this pull request and generated 3 comments.


yaelh commented Apr 24, 2026

InstallZipWithManyFilesTest — DLL count assertion needs clarification

The test asserts exactly 50 .dll files, then separately asserts that the alias manyfilespackage.dll exists. But the dllCount loop counts ALL .dll files in installDir via fs::directory_iterator — which would include the alias.

Since the test passes, one of these must be true:

  • The ZIP contains 49 DLLs + 1 non-DLL file, and the alias makes 50
  • The ZIP contains a file matching manyfilespackage.*, suppressing alias creation, and there are exactly 50 DLLs
  • Some other combination

Whichever it is, the test description ("Tests that installing a zip containing many files (50 DLLs) extracts all of them correctly") doesn't explain the relationship between the package contents, the alias, and the expected count. A future maintainer changing the test package or the alias logic could break this test without understanding why.

Please add a comment clarifying what the ZIP actually contains and how the count of 50 accounts for (or excludes) the alias. For example:

// The ZIP contains 49 DLL files + 1 .txt file. Since none matches
// "manyfilespackage.*", an alias "manyfilespackage.dll" is created,
// bringing the total DLL count to 50.

(or whatever the actual breakdown is)

yaelh commented Apr 24, 2026

InstallInvalidZipTest — misleading test name

The name suggests this tests handling of a corrupt/invalid ZIP file. But bad-package-ZIP.zip starts with "Th" (from "This is not a valid zip file"), not the PK magic bytes — so IsZipFile returns false and the raw-DLL install path runs. No ZIP handling is exercised at all.

A truly invalid ZIP test would need a file that starts with PK but is corrupted inside, causing ZipFile.ExtractToDirectory to throw.

This test is really verifying: "a file without ZIP magic bytes falls through to the raw-DLL install path regardless of its .zip extension." Consider renaming to something like InstallNonZipFileAsRawDllTest or InstallFileWithoutZipMagicBytesTest.

…dirs ZIP guard, reparse-point sweep, libName extension-only rejection, refactor InstallExternalLibrary into helpers, doc + style sweep, 7 new tests

Block 1 substantive fixes:

- C1: AcquireInstallLock serializes Install + Uninstall on the same installDir via {installDir}/.install.lock (FileShare.None, DeleteOnClose, 100ms retry, blocks forever)

- C3: ValidateLibraryName rejects extension-only names (.dll, .txt, etc.) via Path.GetFileNameWithoutExtension

- E1: extractedFiles.Count==0 guard after collection prevents silent ALTER data loss when ZIP contains only directory entries

- E2: SAFETY comment on inner-zip ExtractToDirectory documenting why direct-to-installDir is safe today + tripwire for future .NET versions

- E3: New IsReparsePoint helper; root-level loops in Install + CollectRelativeFiles + CopyDirectory all guarded

Block 2 test fixes:

- E4: 5 new tests (AlterFromNonZipToNonZipTest, InstallRejectsInvalidLibNameTest, InstallZipWithDllSuffixedLibNameTest, UninstallWithMissingInstallDirTest, UninstallPreservesSharedNestedDirsTest) + new fixture testpackageJ-NESTED2.zip

- E5/E6/E7/E8/E10/E11: stronger assertions, removed redundant loops, exact equality, byte-for-byte content checks

- E12: surfaced + fixed latent test bug (dllCount expected 50 but real value with alias is 51)

- E13: InstallInvalidZipTest renamed to InstallNonZipFileAsRawDllTest

Block 3 refactor + style + docs:

- C6: InstallExternalLibrary split into orchestrator + 7 helpers (InstallRawDll, InstallZipPackage, FindInnerZip, CollectStagedFiles, DetermineAliasSource, ExtractContentToInstallDir, CreateAlias)

- A1-A8: empty lines before comments, single return, no var, brace style, scoped blocks unwrapped

- A5: s_pathComparer/s_pathComparison/s_lockRetryDelayMs moved to top class members

- B1-B5: full XML doc comments with <param> blocks; expanded IsZipFile + ValidateRelativePath comments

- C4: Install summary clarifies ZIP is allowed (not expected); raw DLL still supported

- C5: Logging.Trace -> Logging.Error in CleanupManifest skip path

- D1: <experimental/filesystem> -> <filesystem>

- D2: DirectoryHasFiles -> DoesDirectoryHaveFiles

- D4: Logging.Trace after manifest write (both raw-DLL and ZIP paths)
@stuartpa
Author

Responses to PR conversation comments (commit 38c553d)

Combined response to the 13 PR-conversation comments from the 2026-04-24 review. (Inline review comments have been replied to individually as threaded replies.)


#issuecomment-4315981621 — Empty-directories ZIP → silent data loss on ALTER

Excellent catch — fixed. The new guard fires after CollectStagedFiles and before the alias-decision step:

if (extractedFiles.Count == 0)
{
    throw new InvalidOperationException(
        "The library archive contains no files.");
}

New regression test AlterToEmptyDirsZipPreservesV1Test plants v1 (testpackageB-DLL.zip as myLib), then attempts ALTER to a new fixture testpackageI-EMPTYDIRS.zip (4 directory entries, 0 file entries). Asserts the ALTER returns SQL_ERROR AND v1's three files (testpackageB.dll, testpackageB.deps.json, myLib.manifest) all survive intact. The silent-data-loss scenario is now positively tested rather than just defensively guarded.


#issuecomment-4316015154 — Inner-zip path bypasses CopyDirectory (reparse-point defense)

Took option β (the SAFETY comment). Routing through tempFolder + CopyDirectory would require either a separate inner-extract dir (so the existing root-level loop doesn't accidentally pick up the inner ZIP itself) or filtering the outer's contents — both add code paths and change failure modes. Since the threat isn't currently reachable on either Windows or Linux (.NET 8's ZipFile.ExtractToDirectory writes symlink entries as regular files), the SAFETY comment turns an implicit assumption into a documented invariant + tripwire:

// SAFETY: Extracting directly to the live installDir is currently safe
// because ZipFile.ExtractToDirectory does not restore symlink entries as
// actual symlinks on any platform -- they are written as regular files
// containing the link target text...
//
// If a future .NET version changes this (e.g. honors symlink entries on
// Linux), or if we switch to a different extraction library, this path
// MUST be re-routed through a separate temp folder followed by an
// IsReparsePoint-guarded copy -- mirror the non-inner-zip branch below
// for the pattern.

Happy to do option α (route through tempFolder) if you'd prefer the more-defensive form.


#issuecomment-4316028053 — Root-level files skip the reparse-point check

Excellent catch — fixed. Extracted the per-entry check into a single IsReparsePoint(path) helper, then guarded all sites that walk staged content into installDir:

  1. CopyDirectory files loop (already had inline check; refactored to use helper)
  2. CopyDirectory dirs loop (refactored)
  3. ExtractContentToInstallDir root-level files loop (NEW guard — the gap you found)
  4. ExtractContentToInstallDir root-level dirs loop (NEW guard — the same gap for directories)
  5. CollectRelativeFiles files loop (NEW guard — swept while I was here, so symlinks can't leak into the manifest as phantom paths)
  6. CollectRelativeFiles dirs loop (NEW guard)

The helper's <remarks> documents the threat model. No regression test added: the current test infrastructure runs only on Windows where regular users can't create symlinks without admin/dev-mode, and ZipFile.ExtractToDirectory doesn't restore symlinks anyway. Happy to add one if/when Linux CI lights up.


#issuecomment-4316048256 — Test coverage gaps (6 sub-items)

Mostly addressed:

  • (1) Raw → Raw ALTER — new AlterFromNonZipToNonZipTest: v1 = raw DLL myLib, v2 = raw DLL myLib. Asserts manifest stays at exactly 1 entry. Completes the ALTER coverage matrix (ZIP→ZIP, raw→ZIP, ZIP→raw, raw→raw).
  • (2) ValidateLibraryName install-side cases — new parameterized InstallRejectsInvalidLibNameTest covers empty / ../escape / foo\0bar (constructed via string(ptr, 7) so the embedded NUL survives) / absolute path (Windows / Linux) / extension-only .dll.
  • (3) DllUtils.AddMatches direct test: not added. The helper is private static and the test runner is C++/gtest with no managed-test-assembly seam to hit private statics directly. The exact-name vs wildcard-fallback vs .dll-only-filter behavior is exercised indirectly through the execute test suite that loads real assemblies through CSharpUserDll.InstantiateUserExecutor. Happy to make AddMatches internal + add [InternalsVisibleTo] if you'd like, but that has its own review surface area.
  • (4) .dll-suffixed libName in ZIP install — new InstallZipWithDllSuffixedLibNameTest: install testpackageB-DLL.zip as "foo.dll", asserts alias is foo.dll (NOT foo.dll.dll) and manifest tracks it under the single-.dll name.
  • (5) Uninstall when installDir doesn't exist — new UninstallWithMissingInstallDirTest: uses a sibling path that doesn't exist, asserts SQL_SUCCESS and that uninstall does NOT create the directory.
  • (6) Two libraries sharing parent dir + uninstall-one — new fixture testpackageJ-NESTED2.zip (puts files under lib/net8.0/) + new UninstallPreservesSharedNestedDirsTest: install both into the same installDir, uninstall lib1, assert lib2's files AND the shared lib/net8.0/ AND lib/ directories all survive.

5 new TEST_F + 1 new fixture; sub-item (3) deferred per the test-seam reasoning above.


#issuecomment-4316334407 — InstallZipContainingDllsTest weak assertion

Fixed. Replaced the "any .dll exists" loop with explicit EXPECT_TRUE(fs::exists(... "testpackageB.dll")) and EXPECT_TRUE(fs::exists(... "testpackageB.deps.json")) — consistent with the stronger assertions added later in the file.


#issuecomment-4316335427 — ReinstallLibraryTest redundant hasDll loop

Fixed. Removed the redundant hasDll loop. The explicit EXPECT_TRUE(fs::exists(... "testpackageB.dll")) plus the v1-cleanup EXPECT_FALSE checks cover everything the loop was checking.


#issuecomment-4316336983 — ManifestWrittenTest substring match

Fixed. Replaced e.find("testpackageB.dll") != string::npos with e == "testpackageB.dll" (and same for testpackageB.deps.json). Nested-package tests (ManifestListsNestedFilesTest) still use substring matching because path separators differ across platforms; flat-package tests now use exact equality.
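A small sketch of why the change matters, using two hypothetical predicates that mirror the old and new assertion styles (the entry name in the usage note is made up for illustration):

```cpp
#include <string>

// Mirrors the old assertion style: passes for any entry that merely
// contains the expected name.
bool matches_by_substring(const std::string& entry, const std::string& expected)
{
    return entry.find(expected) != std::string::npos;
}

// Mirrors the new assertion style: passes only for the exact entry.
bool matches_exactly(const std::string& entry, const std::string& expected)
{
    return entry == expected;
}
```

A manifest entry such as `old-testpackageB.dll.bak` would satisfy the substring form but not the exact form, so the exact form is the stricter check for flat packages.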


#issuecomment-4316338760 — InstallLibNameAliasTest no content check

Fixed. The test now compares the alias file size against the source DLL AND reads both into strings and asserts byte-for-byte equality:

EXPECT_EQ(fs::file_size(aliasFile), fs::file_size(sourceDll))
    << "Alias file size differs from source DLL";

std::ifstream aliasStream(aliasFile.string(), std::ios::binary);
std::ifstream sourceStream(sourceDll.string(), std::ios::binary);
// read into strings, EXPECT_EQ on contents

Catches zero-length, partial-write, and wrong-source bugs that the existence check would miss.


#issuecomment-4316340092 — DirectoryOverlapAllowedTest doesn't actually test overlap

You're right. The new UninstallPreservesSharedNestedDirsTest (added as part of E4 sub-item 6) is the genuine directory-overlap test: two libraries (testpackageD-NESTED.zip + new testpackageJ-NESTED2.zip) both contribute files under lib/net8.0/, and uninstalling one is asserted to leave the other's files AND the shared parent directory intact.

For the existing DirectoryOverlapAllowedTest: should I rename it to something more accurate like NonConflictingFlatFilesCoexistTest? It's still a valid test (cross-library coexistence in one installDir is real behavior worth pinning), just not what its name suggests.


#issuecomment-4316341639 — ErrorMessagePopulatedOnFailureTest modes 1+2 weak

Fixed:

  • Mode 1 (missing file): asserts msg.find("exist.zip") != string::npos so the message must reference the path.
  • Mode 2 (zip-slip): asserts msg.find("invalid path") != string::npos matching ValidateRelativePath's actual exception text ("contains entry with invalid path").
  • Mode 3 was already content-checked.

A " " or single-byte garbage message would no longer pass.


#issuecomment-4316343496 — AlterFromNonZipToZipTest no v1 cleanup check

Fixed. The test now:

  1. Asserts v2's content present (testpackageB.dll, testpackageB.deps.json, myLib.manifest)
  2. Reads myLib.dll (the alias v2 created from testpackageB.dll)
  3. Reads testpackageB.dll (v2's source file)
  4. Compares the two byte-for-byte with EXPECT_EQ(myLibBytes, sourceBytes)

If v1's raw DLL bytes were left in place by a broken cleanup, the alias would either differ from v2's source or wouldn't have been created at all — both caught by the new assertion.


#issuecomment-4316380788 — InstallZipWithManyFilesTest count clarification

This one surfaced a real bug. While addressing it I inspected the fixture: testpackageG-MANYFILES.zip contains exactly 50 DLLs (Module1.dll .. Module50.dll), none matching manyfilespackage.*. So install creates an alias manyfilespackage.dll bringing the on-disk count to 51, not 50. The test was either passing by luck or quietly failing.

Fixed:

  • Comment now spells out the contents and the math:

    Package contents: testpackageG-MANYFILES.zip contains exactly 50 DLLs (Module1.dll .. Module50.dll) and no other files. None of them matches "manyfilespackage.*", so install must clone the first DLL as an alias named "manyfilespackage.dll". The DLL count in installDir is therefore 50 (extracted) + 1 (alias) = 51.

  • Assertion changed: EXPECT_EQ(dllCount, 51).

Thanks for flagging this — without your comment we would have shipped with a test that didn't actually verify what its name claimed.


#issuecomment-4316390427 — InstallInvalidZipTest misleading name

Fixed. Renamed to InstallNonZipFileAsRawDllTest and updated the description:

Tests that installing a file with .zip extension that is NOT actually a valid ZIP (magic bytes are not 'PK') falls through to the raw-DLL install path: the file is copied to the install directory as {libName}.dll and SQL_SUCCESS is returned. This pins the contract that IsZipFile detects content (not file extension).

NOTE: this does NOT exercise corrupt-ZIP handling. A genuinely invalid ZIP would start with 'PK' but be malformed inside, causing ZipFile.ExtractToDirectory to throw. We have no fixture for that case today.

The NOTE is honest about the gap; happy to add a corrupt-PK fixture in a follow-up.


Summary of the 38c553d changeset

  • 5 must-fix items (C1, C3, E1, E2, E3) all addressed in source + tests.
  • 8 test-improvement items (E4, E5, E6, E7, E8, E10, E11, E13) all addressed.
  • Refactor + style + docs (C6, A1–A8, B1–B5, C4, C5, D1, D2, D4) all addressed.
  • 1 latent test bug surfaced and fixed (E12).
  • 7 new TEST_F cases added; 6 existing tests strengthened; 1 test renamed.
  • 2 new test fixtures: testpackageI-EMPTYDIRS.zip, testpackageJ-NESTED2.zip.
  • InstallExternalLibrary refactored from ~285 lines into orchestrator + 7 single-purpose helpers.

Build clean: 0 warnings, 0 errors on managed Microsoft.SqlServer.CSharpExtension.dll.

Stuart Padley and others added 3 commits April 28, 2026 09:47
…111/111)

Found and fixed 4 issues introduced by the v3.9.0 changeset:

1. Lock file in installDir interfered with tests enumerating installDir contents and racing fs::remove_all on DeleteOnClose. Moved lock to a sibling path '{installDir}.install.lock' (concatenated, NOT a child of installDir).

2. C6 refactor lost a temp-folder cleanup invariant: 'tempFolder = InstallZipPackage(...)' only published the path on successful return, so a throw inside InstallZipPackage left an unreachable tempFolder leaking inside installDir. Converted to 'out string tempFolder' assigned BEFORE any work that can throw.

3. Three test bodies read files via std::ifstream without explicit close(); on Windows this raced CleanupInstallDir's fs::remove_all because the destructor runs lexically late. Added explicit .close() calls (RawDllInstallFailsIfForeignFileExists, InstallLibNameAlias, AlterFromNonZipToZip).

4. ErrorMessagePopulatedOnFailureTest mode 2 was checking for our ValidateRelativePath exception text, but .NET's ZipFile.ExtractToDirectory built-in zip-slip guard fires first with its own message ('outside the specified destination directory'). Updated assertion to match what actually surfaces.

Also: CMakeLists.txt was passing '--std=c++17' (a GCC/Clang flag, silently ignored by MSVC), so the test suite was actually building at MSVC's default C++14 the whole time. Switched to a generator expression that emits '/std:c++17' for MSVC and '--std=c++17' for non-MSVC compilers. Required to make D1 (std::filesystem) work.

Local test run: 111 passed, 0 failed (was 75 passed / 36 failed pre-fix).

Pin System.Text.Json to 10.0.4 explicitly in both csprojs so the
extension's published output ships the 10.0.4 assembly (overriding
the in-box net8 STJ 8.x in the extension's own AssemblyLoadContext).

* Microsoft.SqlServer.CSharpExtension.csproj
* Microsoft.SqlServer.CSharpExtensionTest.csproj

NuGet automatically bumped two companion packages to match:
* System.IO.Pipelines -> 10.0.4
* System.Text.Encodings.Web -> 10.0.4

No other transitive bumps were needed. TFM stays net8.0.

Also fix stale comment in nativecsharpextension.cpp: "expected to be
a zip" -> "may be a ZIP archive or a raw DLL" to match the header
and managed code.

Verified locally:
* dotnet build (production csproj): 0 warnings, 0 errors.
* dotnet build (test csproj):       0 warnings, 0 errors.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The STJ 10.0.4 pin causes a runtime version mismatch: the extension
loads into the default AssemblyLoadContext via hostfxr, where the
shared framework (Microsoft.NETCore.App 8.0.x) already provides
STJ 8.0.0.0. A local STJ 10.0.0.0 cannot coexist in the same ALC.

Revert the PackageReference additions from both csprojs. Retain the
stale-comment fix in nativecsharpextension.cpp ("expected to be a
zip" -> "may be a ZIP archive or a raw DLL").

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@JustinMDotNet JustinMDotNet force-pushed the dev/stuartpa/dotnet-install-external-library branch from 98caa4b to a0d7aae Compare April 28, 2026 20:17

@JustinMDotNet JustinMDotNet left a comment

Code Review Summary

Overall: Solid implementation with excellent test coverage (37+ tests). The Install/Uninstall APIs handle ZIP packages, raw DLLs, ALTER permutations, conflict detection, alias lifecycle, and concurrent locking well. A few items worth addressing before merge.

Findings: 1 bug (pre-existing), 4 warnings, 3 suggestions. See inline comments.

Comment thread language-extensions/dotnet-core-CSharp/src/native/nativecsharpextension.cpp Outdated
Comment thread language-extensions/dotnet-core-CSharp/src/managed/CSharpExtension.cs Outdated
Comment thread language-extensions/dotnet-core-CSharp/src/managed/CSharpExtension.cs Outdated
Comment thread language-extensions/dotnet-core-CSharp/src/managed/CSharpExtension.cs Outdated
Comment thread language-extensions/dotnet-core-CSharp/test/src/native/CSharpLibraryTests.cpp Outdated
Comment thread language-extensions/dotnet-core-CSharp/src/managed/CSharpExtension.cs Outdated
@yaelh yaelh self-requested a review April 29, 2026 23:03

@yaelh yaelh left a comment

Most comments have been addressed — nice work on the refactor and test additions. Requesting changes for the remaining items:

  1. Inner-zip extraction path: please take option (α) — route through tempFolder + CopyDirectory rather than relying on the SAFETY comment alone.
  2. Reparse-point regression test: add a test that validates the symlink guard fires, even if it's a no-op on Windows today.
  3. IsReparsePoint remarks: make the threat model concrete with an attack scenario example (sneaky.dll → /etc/shadow).
  4. DirectoryOverlapAllowedTest: rename to reflect what it actually tests.
  5. InstallZipWithManyFilesTest: clarify whether this test was actually being run before the fix — EXPECT_EQ(dllCount, 50) with 51 DLLs on disk should have failed.

Comment thread language-extensions/dotnet-core-CSharp/src/managed/CSharpExtension.cs Outdated
Comment thread language-extensions/dotnet-core-CSharp/test/src/native/CSharpLibraryTests.cpp Outdated
// extension and the resulting "{libName}.manifest" / "{libName}.dll"
// paths would be hidden dotfiles on Linux and opaque on both
// platforms.
if (string.IsNullOrEmpty(Path.GetFileNameWithoutExtension(libName)))

does adding IsNullOrWhitespace make sense here? I tried creating a file on windows like .txt, but it was automatically modified .txt, so that's fine. but I don't know what linux does

Author

Done in 55d15e3. Added string.IsNullOrWhiteSpace(libName) as the first check in ValidateLibraryName. Cheap and defensive — Linux filesystems will accept " " as a directory name, and even on Windows a NUL-delimited string with leading/trailing whitespace can sneak past Win32's filename parser depending on the API. New " " row added to InstallRejectsInvalidLibNameTest.

Verified — ValidateLibraryName first check at CSharpExtension.cs:1574-1577 is string.IsNullOrWhiteSpace(libName). New " " row added to InstallRejectsInvalidLibNameTest per the test on line 1779. Cheap and defensive — agree.

I meant adding IsNullOrWhiteSpace in addition to IsNullOrEmpty, not instead of it. aren't both empty and whitespace invalid?

Author

Good follow-up. string.IsNullOrWhiteSpace is a strict superset of string.IsNullOrEmpty -- per the .NET docs:

Returns true if the value parameter is null or String.Empty, or if value consists exclusively of white-space characters.

So the single IsNullOrWhiteSpace check at the top of ValidateLibraryName already rejects:

  1. null
  2. "" (empty)
  3. " " / "\t" / "\n" (whitespace-only)

There is no input that IsNullOrEmpty would reject that IsNullOrWhiteSpace accepts -- the latter implements the former plus the whitespace case. Adding both would be redundant: IsNullOrEmpty(s) || IsNullOrWhiteSpace(s) is exactly equivalent to IsNullOrWhiteSpace(s) for every possible s.

Coverage is in InstallRejectsInvalidLibNameTest: rows for "" and " " both reach the same ArgumentException ("Library name must not be empty or whitespace.") via the single IsNullOrWhiteSpace check.

Happy to add the explicit IsNullOrEmpty call too if you'd prefer it for readability -- just say the word and I'll push a one-liner. But on behavior alone, the current code already covers everything you described.
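The redundancy argument can be checked with a small C++ analogue. These are hypothetical stand-ins, not the .NET methods: they have no null case, and `std::isspace` only approximates .NET's Unicode definition of white space.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Stand-in for IsNullOrEmpty (without the null case).
bool is_empty(const std::string& s)
{
    return s.empty();
}

// Stand-in for IsNullOrWhiteSpace (without the null case). std::all_of
// returns true for an empty range, so the empty string is already covered.
bool is_empty_or_whitespace(const std::string& s)
{
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; });
}
```

For every input, `is_empty(s) || is_empty_or_whitespace(s)` gives the same answer as `is_empty_or_whitespace(s)` alone, which is the equivalence claimed above.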

Comment thread language-extensions/dotnet-core-CSharp/src/managed/CSharpExtension.cs Outdated
Stuart Padley added 2 commits April 30, 2026 07:33
…, A2-A7 correctness/robustness, B1 symlink regression test, doc + test cleanups

Bug / correctness fixes (Justin + yaelh review):

  A1 nativecsharpextension.cpp -- replace `new std::string(...)` +
     `c_str()` with `malloc(len+1)` + `memcpy` so the returned pointer
     owns its own buffer. Matches the managed `Marshal.AllocHGlobal`
     contract that ExtHost expects on the other side. Fixes UB where
     ExtHost would dereference a pointer into a freed `std::string`
     internal buffer.

  A2 CSharpExtension.cs -- narrow the `IOException` catch in
     `EnsureDir` via `when (IsSharingViolation(ex))` filter that
     inspects HResult for ERROR_SHARING_VIOLATION (32) /
     ERROR_LOCK_VIOLATION (33). Other IOException variants
     (DirectoryNotFoundException, PathTooLongException, mid-creation
     failures) and non-IOException exceptions (UnauthorizedAccess,
     Argument, Security) now propagate fast instead of being swallowed.

  A3 DllUtils.cs -- replace
     `Directory.GetFiles(searchPath, "{name}.*")` +
     `.Where(...EndsWith(".dll"))` with
     `Directory.EnumerateFiles(searchPath)` + explicit
     `Equals(".dll", OrdinalIgnoreCase)` on extension and
     `Equals(userLibName, OrdinalIgnoreCase)` on stem. No more reliance
     on Win32 `FindFirstFile` wildcard semantics; eliminates the
     "*.dll matches foo.dllx" and "Foo.* matches short-name 8.3 alias"
     over-match quirks. Remarks block documents both quirks for the
     next reader.

  A4 + B1 CSharpExtension.cs + CSharpLibraryTests.cpp +
     test_packages/testpackageK-SYMLINK.zip +
     test_packages/build-symlink-fixture.ps1 -- collapse the
     inner-zip and outer-zip install paths through a single
     `contentRoot` + `IsReparsePoint`-guarded
     `ExtractContentToInstallDir` helper. Both paths now extract the
     payload into `tempFolder/inner-content/` first and walk it on
     disk, instead of one path calling
     `ZipFile.ExtractToDirectory(innerZip, installDir)` directly.
     `CollectStagedFiles` simplified to a single on-disk walk;
     `ExtractContentToInstallDir` simplified to a single code path.

     B1 regression test `InnerZipFutureSymlinkRejectedTest` uses the
     new fixture (262 bytes, generated by `build-symlink-fixture.ps1`).
     Inner zip contains `legitfile.dll` (regular file) +
     `evil-symlink.dll` (Unix mode `0o120755`, content
     `/etc/passwd`). Today's .NET ZipFile ignores Unix mode bits and
     materializes the entry as a regular file, but a future runtime
     might honor them. Test asserts the future-proofing invariant:
     install succeeds, `legitfile.dll` lands in installDir, and
     installDir contains zero reparse points.

  A5 CSharpExtension.cs -- wrap `File.Delete(libraryFile)` in an
     `else` branch so the manifest path exclusively owns cleanup for
     current-version installs. Direct `File.Delete` only runs as a
     legacy-compat path for libraries installed by pre-PR builds that
     have no manifest. Comments explain the split.

  A6 CSharpExtension.cs -- add `s_reservedDeviceNames` HashSet
     (OrdinalIgnoreCase) at the top of the class with all 22 Windows
     reserved DOS device names (CON, NUL, AUX, PRN, COM1-COM9,
     LPT1-LPT9, plus the COM0/LPT0 historical aliases).
     `ValidateLibraryName` now rejects any libName whose stem
     (`Path.GetFileNameWithoutExtension`) matches. New rows in
     `InstallRejectsInvalidLibNameTest`: CON, nul, Aux, PRN, COM1,
     LPT9, CON.dll, nul.manifest. Rejection is enforced on every OS
     so behavior stays consistent for libraries moved between hosts.

  A7 CSharpExtension.cs -- `DetermineAliasSource` now picks the
     lexicographically-first `.dll` candidate
     (`string.CompareOrdinal`) instead of the first one returned by
     `Directory.GetFiles`. Stable across NTFS / ext4 / XFS and across
     re-installs.

  B6 CSharpExtension.cs + CSharpLibraryTests.cpp -- replace
     content-sniff `IsZipFile` with extension-based `HasZipExtension`.
     `InstallNonZipFileAsRawDllTest` (which asserted that a `.zip`
     file with non-PK bytes was silently rewritten as `{libName}.dll`)
     becomes `InstallZipExtensionWithBadContentFailsLoudlyTest` which
     asserts `SQL_ERROR` + that the user's file is NOT silently
     installed under a `.dll` rename. Pre-fix behavior would copy
     `bad-package-ZIP.zip` to `bad-package.dll`; post-fix the user
     gets a clear error.

  B12 CSharpExtension.cs -- first check in `ValidateLibraryName` is
     now `string.IsNullOrWhiteSpace(libName)`. New "   " row in
     `InstallRejectsInvalidLibNameTest`.

  B13 CSharpExtension.cs -- remove `ValidateRelativePath` entirely.
     After A4 collapsed both code paths through
     `ZipFile.ExtractToDirectory` + on-disk walk it had zero callers;
     the zip-slip defense it provided is now covered by (a)
     `ZipFile.ExtractToDirectory`'s built-in zip-slip check and (b)
     the on-disk walk being unable to see entries that escaped the
     staged tree.

Test additions / hardenings:

  B3 CSharpLibraryTests.cpp -- rename `DirectoryOverlapAllowedTest`
     -> `NonConflictingFlatFilesCoexistTest`. Comment block tightened
     to state "Both packages used here are flat (no nested
     directories), so the test exercises the flat-file coexistence
     case only -- nested-directory overlap is covered separately by
     `ManifestListsNestedFilesTest` + `InnerZipFileConflictFailsTest`."

  B4 CSharpLibraryTests.cpp -- harden `InstallZipWithManyFilesTest`
     with a per-module existence loop (`Module1.dll` ... `Module50.dll`)
     plus a comment block with the historical context (the original
     `EXPECT_EQ(dllCount, 50)` passed legitimately because the old
     install code created the alias as `{libName}` with no extension
     -- test asserted what the code did, not what it should do).

Documentation:

  B2 CSharpExtension.cs -- expand `IsReparsePoint` `<remarks>` block
     with the concrete `sneaky.dll` -> `/etc/shadow` scenario and the
     directory-level reparse-point variant. Notes the guard is
     theoretical against today's .NET (which writes symlink-mode
     entries as regular files) and links to
     `InnerZipFutureSymlinkRejectedTest`.

  B7 CSharpExtension.cs -- trim the `CleanupManifest` catch comment
     to stop at the diagnostic-trail justification.

Style:

  B5 / B9 / B10 -- blank line between adjacent independent constructs:
     (1) file-loop / dir-loop split in `ExtractContentToInstallDir`,
     (2) consecutive `if (e.find("MyLib.dll"))` /
         `if (e.find("native.dll"))` flag-setters in
         `ManifestListsNestedFilesTest`,
     (3) consecutive `if (e.find("testpackageA"))` /
         `if (e.find("testpackageB"))` flag-setters in
         `AlterFromZipToZipTest`.
     `}` followed by `continue;` guards or `return X;` function-tails
     left alone -- those are idiomatic and adding blank lines there
     would only add noise.

  B11 CSharpLibraryTests.cpp -- one-line rationale comments at the
     first write-site (`sentinelStream`) and the first read-site
     cluster (`aliasStream` / `sourceStream`) explaining the pattern
     of explicit `.close()` before `EXPECT_EQ` / `fs::exists` (gtest
     macro failure paths can run arbitrary code; known-closed streams
     are easier to debug). Same pattern is used at every other site
     for consistency.

Test results: 112/112 unit tests pass on Windows release config.
…ptTest cannot revert to the pre-PR literal

yaelh asked to revert sicongliu's `"NonExistentLibrary"` rename for
git-blame continuity. Tried it locally; the test fails:

  [ RUN      ] CSharpExtensionApiTests.ExecuteInvalidLibraryNameScriptTest
  Hello .NET Core CSharpExtension!
  CSharpExecuteTests.cpp(125): error: Expected equality of these values:
    result      Which is: 0
    (-1)        Which is: -1
  CSharpExecuteTests.cpp(127): error: Value of:
    error.find("Unable to find user dll under") != string::npos
    Actual: false  Expected: true

The pre-PR literal `"Microsoft.SqlServer.CSharpExtensionTest"` is the
basename of `m_UserLibName`
(`"Microsoft.SqlServer.CSharpExtensionTest.dll"`), so the loader now
resolves it successfully to the real test DLL and the test no longer
observes the expected error. Sicongliu's rename was load-bearing, not
cosmetic.

Keep `"NonExistentLibrary"` and add a 4-line code comment at the call
site explaining why, so a future reader (or another reviewer) does not
attempt the same revert.

Test results: 112/112 unit tests pass on Windows release config.
@stuartpa
Author

Pushed two new commits addressing the third review pass:

  • 55d15e3 — bug-fix omnibus: A1 native buffer ownership (malloc+memcpy, fixes c_str() UB), A2 IOException filter via IsSharingViolation, A3 DllUtils explicit enumeration (no more Foo.* 8.3 quirks), A4+B1 inner-zip routed through tempFolder/inner-content/ + IsReparsePoint-guarded copy + new symlink regression fixture (testpackageK-SYMLINK.zip, generated by build-symlink-fixture.ps1), A5 File.Delete gated to legacy path only, A6 s_reservedDeviceNames HashSet (CON/PRN/NUL/AUX/COM1-9/LPT1-9 + 8 new test rows), A7 lexicographic alias selection, B6 extension-based HasZipExtension (no more silent .zip.dll rewrite), B12 IsNullOrWhiteSpace, B13 ValidateRelativePath removed (zero callers after A4), plus B2/B3/B4/B5/B7/B9/B10/B11 docs/tests/style.
  • 8ef1573 — B8: kept "NonExistentLibrary" + added a 4-line code comment explaining why the pre-PR literal can't be restored (the loader resolves it to the real test DLL and the test would fail to observe the expected error). Test-failure transcript posted in the B8 thread.

Test results: 112/112 unit tests pass on Windows release config.

Replies posted on every thread above. @yaelh @JustinMDotNet — ready for another look.
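
For reviewers who want to sanity-check the A6 rows without reading the C# code, the rule can be approximated in C++ as below. This is a sketch under assumptions: `is_reserved_device_name` is a hypothetical name, the set is limited to CON/PRN/AUX/NUL plus COM1-9 and LPT1-9, and `std::filesystem::path::stem` stands in for `Path.GetFileNameWithoutExtension` (the two differ for dot-leading names, which does not matter for these rows).

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <set>
#include <string>

// Builds the 22-name set once: CON, PRN, AUX, NUL, COM1-COM9, LPT1-LPT9.
const std::set<std::string>& reserved_device_names()
{
    static const std::set<std::string> names = [] {
        std::set<std::string> result = {"CON", "PRN", "AUX", "NUL"};
        for (char digit = '1'; digit <= '9'; ++digit)
        {
            result.insert(std::string("COM") + digit);
            result.insert(std::string("LPT") + digit);
        }
        return result;
    }();
    return names;
}

// Hypothetical check: true when the stem of libName, compared without
// regard to ASCII case, is one of the reserved names.
bool is_reserved_device_name(const std::string& libName)
{
    std::string stem = std::filesystem::path(libName).stem().string();
    std::transform(stem.begin(), stem.end(), stem.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return reserved_device_names().count(stem) != 0;
}
```

The stem comparison is what makes `CON.dll` and `nul.manifest` rejected alongside the bare names, while a name like `console` is unaffected.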


Copilot AI left a comment

Pull request overview

Copilot reviewed 11 out of 22 changed files in this pull request and generated 4 comments.

target_compile_options(dotnet-core-CSharp-extension-test PRIVATE --std=c++17)
target_compile_options(dotnet-core-CSharp-extension-test PRIVATE
"$<$<CXX_COMPILER_ID:MSVC>:/std:c++17>"
"$<$<NOT:$<CXX_COMPILER_ID:MSVC>>:--std=c++17>"
Author

Fixed in working tree -- changed --std=c++17 to -std=c++17 on line 28. Will be in the next commit. Thanks for the catch; agreed Clang would reject the long form even though GCC tolerates it as undocumented.

Comment on lines +146 to +148
// The library file is expected to be a zip. If it contains an inner zip,
// that zip is extracted to the install directory. Otherwise, all files
// are copied directly.
Author

Fixed in working tree (will be in the next commit). The header comment now documents the dispatch-by-libraryName contract:

  • libraryName ending in .zip -> ZIP install (with single-inner-zip extraction support)
  • libraryName ending in .dll -> raw DLL install (copy + one-entry manifest)
  • libraryName with neither extension -> falls back to libraryFile extension (legacy compat)

Plus a note that a {libName}.manifest is always written so UninstallExternalLibrary can clean up exactly what was installed. Thanks for catching the staleness.
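The dispatch contract above can be sketched in C++ as follows. This is not the shipped implementation (the real logic is the managed `DispatchAsZip`); `InstallKind`, `EndsWithNoCase`, and `ChooseInstallKind` are hypothetical names. Two details are assumptions rather than facts from this thread: that extensions are matched case-insensitively, and that a fallback `libraryFile` without a `.zip` extension is treated as a raw DLL.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical names; the real dispatch lives in the managed DispatchAsZip.
enum class InstallKind { Zip, RawDll };

// Case-insensitive "ends with" for ASCII suffixes such as ".zip".
// Assumption: the real comparison may be case-sensitive on some platforms.
bool EndsWithNoCase(const std::string &text, const std::string &suffix)
{
    if (suffix.size() > text.size())
    {
        return false;
    }
    return std::equal(suffix.rbegin(), suffix.rend(), text.rbegin(),
                      [](char a, char b)
                      {
                          return std::tolower(static_cast<unsigned char>(a)) ==
                                 std::tolower(static_cast<unsigned char>(b));
                      });
}

// Chooses the install path from libraryName first, then from libraryFile.
InstallKind ChooseInstallKind(const std::string &libraryName,
                              const std::string &libraryFile)
{
    if (EndsWithNoCase(libraryName, ".zip"))
    {
        return InstallKind::Zip;
    }
    if (EndsWithNoCase(libraryName, ".dll"))
    {
        return InstallKind::RawDll;
    }
    // Neither extension on the name: fall back to the file path (legacy).
    return EndsWithNoCase(libraryFile, ".zip") ? InstallKind::Zip
                                               : InstallKind::RawDll;
}
```

Checking the library name before the file path matters because, per the commit notes, ExtHost passes a generated temp filename with no meaningful suffix in production.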

Comment on lines +1070 to +1073
if (name.StartsWith(libName + ".", s_pathComparison) ||
name.Equals(aliasFileName, s_pathComparison))
{
// A root-level file matches "{libName}.*" -- already discoverable.
Author

Fixed in the working tree (next commit). Good catch; the underlying issue was actually broader than the libName-ends-in-".dll" case.

What the original code did: name.StartsWith(libName + ".", ...) || name.Equals(aliasFileName, ...) -- any root-level file beginning with "{libName}." suppressed alias creation.

Why that was wrong in general: the loader (DllUtils.CreateDllList) resolves a library by trying to map "{libName}.dll" as a PE binary. A sidecar like "foo.deps.json", "foo.runtimeconfig.json", or (in the case you flagged) "foo.dll.config" matches the StartsWith prefix but is not a loadable DLL -- so suppressing on it left the install with no root-level loader target.

Note on your literal scenario (libName="foo.dll"): the recently-added DispatchAsZip routes any libName ending in ".dll" straight to the raw-DLL install path, so DetermineAliasSource is never reached for that exact case in production. But the underlying logic was still latently wrong for the much more common libName="foo" case with sidecars at the root.

Fix: tightened suppression to require an exact match against aliasFileName (i.e. "{libName}.dll" -- the file the loader will actually map). Sidecars matching the "{libName}." prefix no longer count as "already discoverable". Also dropped the now-unused libName parameter from the method signature. The <remarks> block on DetermineAliasSource documents the rationale explicitly.

Regression test: AliasCreatedWhenOnlySidecarsAtRootTest in CSharpLibraryTests.cpp, using new fixture testpackageL-SIDECAR.zip (build script: build-sidecar-fixture.ps1). The fixture has foo.deps.json and foo.runtimeconfig.json at the root, and the actual DLL nested at lib/net8.0/foo.dll. Install with libName="foo" must produce a root-level "foo.dll" alias (cloned from the nested DLL). Pre-fix: the test fails because the sidecars suppressed alias creation. Post-fix: passes. All 113 tests now pass (was 112).
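To make the before/after behavior concrete, here is a small C++ sketch of the exact-match rule. It is only an illustration: `NeedsRootAlias` is a hypothetical helper, it takes `libName` for self-containment even though the real `DetermineAliasSource` no longer does, and it compares case-sensitively where the managed code uses `s_pathComparison`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns true when a root-level "{libName}.dll" alias
// still has to be created. Only an exact match on the alias file name counts
// as "already discoverable"; sidecars sharing the "{libName}." prefix do not.
bool NeedsRootAlias(const std::vector<std::string> &rootFileNames,
                    const std::string &libName)
{
    const std::string aliasFileName = libName + ".dll";
    for (const std::string &name : rootFileNames)
    {
        if (name == aliasFileName)
        {
            return false; // the loader target is already at the root
        }
    }
    return true; // only sidecars (or nothing) at the root
}
```

With the root contents of the testpackageL-SIDECAR.zip fixture (`foo.deps.json` and `foo.runtimeconfig.json`), this returns true for `libName="foo"`, which is the case the old prefix check got wrong.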

Comment on lines +148 to +167
SQLCHAR *libError = nullptr;
SQLINTEGER libErrorLength = 0;

SQLRETURN result = (*installFunc)(
SQLGUID(),
reinterpret_cast<SQLCHAR *>(const_cast<char *>(libName.c_str())),
static_cast<SQLINTEGER>(libName.length()),
reinterpret_cast<SQLCHAR *>(const_cast<char *>(libFilePath.c_str())),
static_cast<SQLINTEGER>(libFilePath.length()),
reinterpret_cast<SQLCHAR *>(const_cast<char *>(installDir.c_str())),
static_cast<SQLINTEGER>(installDir.length()),
&libError,
&libErrorLength);

errorMessage.clear();
if (libError != nullptr && libErrorLength > 0)
{
errorMessage.assign(reinterpret_cast<char *>(libError),
static_cast<size_t>(libErrorLength));
}
Author

Fixed in the working tree (next commit). Added a FreeLibError(SQLCHAR *) helper in CSharpLibraryTests.cpp and call it on every libError pointer returned from the three test helpers (CallInstall, CallUninstall, CallInstallCaptureError) before they return -- so SQL_ERROR paths no longer accumulate unmanaged allocations across the gtest run.

Allocator pairing. All tests in this file call the managed Install/UninstallExternalLibrary exports in Microsoft.SqlServer.CSharpExtension.dll. The managed SetLibraryError (CSharpExtension.cs) allocates via Marshal.AllocHGlobal, which on Windows is backed by LocalAlloc -- so the matching deallocator is LocalFree, which is what FreeLibError calls. Production ExtHost releases the buffer the same way.

(Note: the native pre-flight SetLibraryError in nativecsharpextension.cpp uses malloc(), not LocalAlloc. On Windows, calling LocalFree on a malloc'd pointer is undefined behavior. However, the tests in this file never exercise that pre-flight path: every libError they see is AllocHGlobal-backed. The new FreeLibError comment documents this explicitly.)

All 113 tests still pass.

@yaelh yaelh left a comment

Signing off since the remaining comments are small and trivial to implement. Also, since I'll be OOF, I don't want to block the PR.

@monamaki
Contributor

Please run PVS once before merging this PR.

I have shared the instructions with Justin.

@stuartpa
Author

Acknowledged @monamaki — coordinating with @JustinMDotNet to run PVS against the dotnet-core-CSharp extension. We'll post the results (and address any findings) before requesting re-review. Thanks for getting the instructions over to him.

Production fixes (end-to-end testing against SQL Server 2025 RTM-GDR):

* CSharpExtension.cs AcquireInstallLock: place install.lock under
  Path.Combine(installDir, "install.lock") instead of one level up so
  concurrent installs into different <dbid>/<langid> slots don't
  serialize against one another.
* CSharpExtension.cs new DispatchAsZip(libName, libFilePath): install
  dispatch is now driven by the registered library name's extension
  (.zip -> ZIP, .dll -> raw DLL, otherwise fall back to libFilePath's
  extension). ExtHost passes a generated temp filename with no semantic
  suffix in production, so the previous extension-sniff on libFilePath
  was meaningless there; the libFilePath fallback preserves the legacy
  contract for test fixtures registered under a bare library name.
* CSharpOutputDataSet.cs + utils/Sql.cs DotNetNVarChar plumbing: add
  the DotNetNVarChar row to Sql.DataTypeSize and a DotNetNVarChar case
  alongside DotNetWChar in ExtractColumn / GetStrLenNullMap. Previously
  fell through to default and threw KeyNotFoundException in
  DataTypeSize before the column reached the dispatch switch.
* CSharpOutputDataSet.cs DotNetWChar / DotNetNVarChar Size unit:
  report Size in BYTES, matching the unit emitted by GetStrLenNullMap
  (Encoding.Unicode.GetByteCount). The previous code reported a
  character count, which combined with a byte-count length map caused
  SPEES to log "Reading one row failed for column N row M. The length
  information is incorrect." and reject the rowset whenever a string
  column contained non-ASCII data.
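
The unit mismatch in the last item can be illustrated with a short C++ sketch. It is not code from this PR: `Utf16CodeUnits` and `Utf16ByteCount` are hypothetical helpers, and the sketch assumes `char16_t` is two bytes wide, as on common platforms. It shows only that a UTF-16 length in code units and the same length in bytes (the unit `Encoding.Unicode.GetByteCount` reports) are different numbers.

```cpp
#include <cstddef>
#include <string>

// Length of a UTF-16 string in code units (what a "character count" yields).
std::size_t Utf16CodeUnits(const std::u16string &s)
{
    return s.size();
}

// Length of the same string in bytes: sizeof(char16_t) bytes per code unit.
// Assumption: sizeof(char16_t) == 2, which holds on common platforms.
std::size_t Utf16ByteCount(const std::u16string &s)
{
    return s.size() * sizeof(char16_t);
}
```

Reporting both the column Size and the per-row lengths in bytes keeps the two values in the same unit, so a consumer comparing them sees consistent numbers.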

Reviewer items (PR microsoft#85 May 13 review):

* test/src/native/CMakeLists.txt: non-MSVC -std=c++17 (one dash) instead
  of --std=c++17 to match the documented spelling and the rest of the
  build tree.
* include/nativecsharpextension.h: rewrite InstallExternalLibrary doc
  comment to spell out the new libName-based dispatch contract and the
  {libName}.manifest file written for every install (ZIP or raw DLL).
* CSharpExtension.cs DetermineAliasSource: alias-suppression now
  requires an EXACT match against "{libName}.dll" at the install root.
  The previous prefix check ("{libName}.") suppressed alias creation
  for ZIPs that planted only sidecars at the root (foo.deps.json etc.)
  with the real binary nested under lib/net8.0/, leaving the install
  un-loadable. Drops the now-unused libName parameter from the
  signature. Pinned by AliasCreatedWhenOnlySidecarsAtRootTest +
  testpackageL-SIDECAR.zip fixture (build-sidecar-fixture.ps1).
* test/src/native/CSharpLibraryTests.cpp new FreeLibError(SQLCHAR *)
  helper using LocalFree (matches the production Marshal.AllocHGlobal /
  LocalAlloc allocator that ExtHost uses on the consumer side). Wired
  into CallInstall, CallUninstall, and CallInstallCaptureError so the
  test harness no longer leaks the libError buffer on every
  failing-install assertion.

Tests: 113/113 unit tests pass on Windows release config (one new
TEST_F: AliasCreatedWhenOnlySidecarsAtRootTest).