
Link to the issue: https://github.com/langchain-ai/langchain/issues/36854
🧾 Problem Description
LangChain’s prompt loading utilities validate file paths using lexical checks, rejecting absolute paths and traversal sequences. However, relative paths that pass validation can still resolve via symbolic links to unintended locations on the filesystem.
This creates a gap between what the application assumes it is loading and what is actually read, allowing unintended file access when symlinks are present.
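The gap can be reproduced in a few lines. The sketch below (hypothetical file names, not LangChain code) shows a relative path that passes typical lexical checks yet resolves through a symlink to a file outside the intended directory:

```python
import os
import tempfile
from pathlib import Path

# A "trusted" directory and a sensitive file outside of it.
base = Path(tempfile.mkdtemp())
outside = Path(tempfile.mkdtemp()) / "secret.txt"
outside.write_text("sensitive data")

# An attacker-controlled symlink inside the trusted directory.
(base / "prompt.txt").symlink_to(outside)

user_path = "prompt.txt"  # relative, contains no "..": passes lexical checks
assert not os.path.isabs(user_path)
assert ".." not in Path(user_path).parts

# ...but the resolved path escapes the trusted directory.
resolved = (base / user_path).resolve()
print(resolved.is_relative_to(base.resolve()))  # False: the symlink escaped
```

String-level validation never sees the symlink; only resolution does.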
🛠 My Solution
Instead of validating only the user-provided path string, the solution is to validate the resolved filesystem path.
The approach is:

1. **Perform initial validation on the input path:**
   - Reject absolute paths
   - Reject paths containing traversal patterns (e.g., `..`)
2. **Resolve the path to its canonical form:**
   - Follow symbolic links
   - Normalize the path to its final filesystem location
3. **Define the trusted boundary:**
   - Typically the current working directory (or a configured base directory)
4. **Enforce containment:**
   - Verify that the resolved path remains within the trusted boundary
   - Reject any path that escapes the boundary; this prevents symlink-based redirection to unintended files
This ensures that even if a path appears safe syntactically, it cannot resolve to an unsafe location at runtime.
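A minimal sketch of the approach, using `pathlib` (the function name and signature are illustrative, not LangChain's actual API):

```python
from pathlib import Path


def load_prompt_safely(user_path: str, base_dir: str = ".") -> str:
    """Validate a prompt path lexically *and* after resolution.

    Hypothetical helper illustrating the fix; not LangChain's real loader.
    """
    base = Path(base_dir).resolve()
    candidate = Path(user_path)

    # Step 1: lexical checks on the user-provided string.
    if candidate.is_absolute():
        raise ValueError("absolute paths are not allowed")
    if ".." in candidate.parts:
        raise ValueError("path traversal is not allowed")

    # Step 2: canonicalize, following symlinks to the final location.
    resolved = (base / candidate).resolve()

    # Step 3: enforce containment within the trusted base directory.
    if not resolved.is_relative_to(base):
        raise ValueError(f"{user_path!r} resolves outside the trusted directory")

    return resolved.read_text()
```

Note that `Path.is_relative_to` requires Python 3.9+; on older versions the same check can be written with `resolved.relative_to(base)` inside a `try`/`except ValueError`.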
🔥 Key Insight: Path validation must be performed on the canonical (resolved) path, not just the user-provided string.

