Fix: Add stevebauman/purify for HTML sanitization#51
Conversation
Adds HTMLPurifier via the stevebauman/purify Laravel wrapper to sanitize user-supplied HTML content before storage, preventing stored XSS attacks through rich-text fields.
Group and event descriptions (free_text) were stored as raw user HTML and rendered unescaped, allowing stored XSS. Now sanitized with HTMLPurifier before storage, stripping dangerous tags like script and event handlers while preserving safe HTML.
User-controlled values (group names, event venues, user names,
network names) were interpolated into HTML translation strings
and rendered unescaped via {!! !!} or v-html. Now all user data
is escaped with e() (PHP) or DOM textContent (JS) before
interpolation into HTML contexts.
Publishes the default HTMLPurifier config from stevebauman/purify, which controls which HTML tags and attributes are allowed through sanitization.
|
dev_block ⚡ it seems this doesn't fully cover all edge cases. |
|
Sorry, I lost a comment: that review comprises lightly-edited Claude review notes. I also think it's worth considering whether we can add some regression tests. Claude also mentions that |
Adds sanitization mutators at the model layer so all write paths (controllers, CSV imports, tinker, seeders) are covered. Plain-text fields like names and venues use strip_tags() to remove HTML entirely. Rich-text fields like descriptions and biographies use Purify::clean() to strip dangerous tags while preserving safe formatting HTML.
Model mutators now handle sanitization of free_text fields, so the explicit Purify::clean() calls in the API controllers are redundant and can be removed.
Escapes group.name before passing to ConfirmModal's v-html rendered delete/archive confirmation messages in GroupActions.vue. Also escapes user name in the soft-delete flash message in UserController to prevent stored XSS when an admin deletes a user with a malicious name.
Adds client-side HTML sanitization via DOMPurify to the ReadMore component, which renders group and event descriptions via v-html. Server-side Purify::clean() model mutators are the primary defense, but this provides a second layer in case any write path bypasses the model (raw DB queries, migrations, imports).
Updates HTMLPurifier config to use HTML 5 doctype for consistency with Html5Definition, adds URI.AllowedSchemes to block javascript: and data: URLs in sanitized HTML, and extracts the inline DOM escaping pattern into a named escapeHtml() method for readability.
HTMLPurifier doesn't support HTML 5 as a doctype value — it throws at runtime. The Html5Definition class from stevebauman/purify adds HTML5 element support on top of the HTML 4.01 base, so HTML 4.01 Transitional + Html5Definition is the intended pairing.
Replaces raw <?php echo ?> with Blade {{ }} for user name and
location in the user list template. Adds setLocationAttribute
mutator to User model to strip HTML tags from the location field
on storage, preventing stored XSS via the location/town field.
|
un_dev_block ⚡ QA 👍 there were a lot more locations, but things look good now |
|
|
||
| public function setFreeTextAttribute($value) | ||
| { | ||
| $this->attributes['free_text'] = $value === null ? null : Purify::clean((string) $value); |
There was a problem hiding this comment.
Trying to sanitize input text is a mistake I think. This concept has been around for decades, and the solution is almost always: Treat all input as dangerous, and thus escape appropriately it when outputting.
Why are we trying to clean the human input here?
There was a problem hiding this comment.
Its a safety measure, we've audited the code and until we address every location, we are taking the extra step to sanitize the input as well.
There was a problem hiding this comment.
I think the main reason to clean human input going forward (where we want to end up) would be if it's allowed to contain HTML. I think that may be applicable here (edited via rich text editor).
There was a problem hiding this comment.
I think the main reason to clean human input going forward (where we want to end up) would be if it's allowed to contain HTML
That evidence sounds like an argument for not trying to sanitize inputs.
Even if that's the case (we accept html-containing input), I still think it's better to keep the original value and strip / sanitize on output.
Rules about what to strip / sanitize change, which makes it much harder to retroactively apply those changes if we are depending on DB content already being safe.
There was a problem hiding this comment.
You've convinced me it's better to sanitize on output.
|
|
||
| @if(App\Helpers\Fixometer::hasRole($user, 'Administrator')) | ||
| <a href="/user/edit/<?php echo $u->id; ?>"><?php echo $u->name; ?></a> | ||
| <a href="/user/edit/{{ $u->id }}">{{ $u->name }}</a> |
There was a problem hiding this comment.
I don't know about this templating system, so I don't even know what to search for, but this {{ pattern seems to be the canonical way to safely echo variables.
I see lots of other places in the codebase that use echo and it's not clear why the dangerous way was chosen. Seems like we should fix all of those.
There was a problem hiding this comment.
Yes, but the scope has gotten too big right now. We also discovered other issues but will be addressing them in follow-up pulls.
| {!! __('networks.general.count', [ | ||
| 'count' => $network->groups->count(), | ||
| 'name' => $network->name, | ||
| 'name' => e($network->name), |
There was a problem hiding this comment.
The fact that we need to escape html here indicates to me that the __() function is designed to produce or allow raw html and doesn't escape its output.
This leaves it dangerous by default and thus this likely will happily output html-like content in translated strings.
Do we control the __() function? If so, we might want to consider making it safe-by-default like our php variant. i.e. you have to explicitly mark a string as html(...).
There was a problem hiding this comment.
We don't control __() its Laravel's built-in translation helper.
|
dev_block 👍 on some more conversations. |
sctice-ifixit
left a comment
There was a problem hiding this comment.
Notes from Claude:
The overall shape of the fix is right: model mutators cover every write path (controllers, API, CSV imports, seeders, tinker), e() at the flash-message interpolation sites is the correct Blade pattern, and DOMPurify in ReadMore.vue gives a real second layer for the rich-text sink.
Main gap: two flash-message sites still interpolate user data into translated HTML rendered via {!! !!}.
app/Http/Controllers/GroupController.php:378—'emails' => rtrim(implode(', ', $not_sent))passed into__('groups.invite_success_apart_from', [...]), rendered via{!! \Session::get('warning') !!}app/Http/Controllers/PartyController.php:706— same shape with__('events.invite_invalid_emails', ['emails' => implode(', ', $not_sent)])
Both hold strings that failed email validation, so exploitation is narrow, but this PR is exactly about applying the e() pattern everywhere user data meets {!! !!}. Worth the consistency.
Smaller notes left inline below.
| @@ -4,10 +4,10 @@ | |||
| <p v-html="formattedString"></p> | |||
There was a problem hiding this comment.
This v-html renders formattedString (derived from this.text) with no sanitization. Every current call site uses :html="..." so the branch is unreachable today, but leaving an unsanitized v-html sink in the component invites a future caller to wire up the text prop and ship a vuln. Either delete the <span v-if="text"> block and the text prop, or route formattedString through DOMPurify for symmetry with the html branch.
Line 87 reinforces that this path isn't exercised — text.length > maxChars is missing this. and would throw if the branch ever ran.
| 'URI.AllowedSchemes' => [ | ||
| 'http' => true, | ||
| 'https' => true, | ||
| 'mailto' => true, |
There was a problem hiding this comment.
Suggestion: add 'tel' => true here. Mobile click-to-call links are a reasonable thing to allow through sanitized descriptions, and tel: URIs don't carry XSS risk the way javascript: and data: do.
| } | ||
| }, | ||
| methods: { | ||
| escapeHtml(str) { |
There was a problem hiding this comment.
This escapeHtml method is duplicated verbatim in resources/js/components/GroupPage.vue:163. Worth pulling into a small shared helper (resources/js/helpers.js or a mixin) — the pattern is going to spread as more v-html sites adopt this defense, and having one canonical implementation means the next dev doesn't have to decide whether to copy or refactor.
| } | ||
| }, | ||
| methods: { | ||
| escapeHtml(str) { |
There was a problem hiding this comment.
Duplicates the escapeHtml in GroupActions.vue:103. See the note there on extracting to a shared helper.
|
I think it would be great to convert uses of
|
Correct me if I'm wrong... but isn't that the point of this pull? Isn't this the vulnerability? Maybe you meant "Fixing more instances of this vulnerability than the one that was noticed should happen in a follow up pull in the interest of expediency"?
I still see this as ignoring decades of learning all across the industry. If cleaning is expensive, cache it. Don't trust stored DB data to be safe unless the DB types define it as such (i.e. int columns can't contain html) |
Yes. We're already fixing up everything we've found by doing an initial audit. We can do successive, more exhaustive audits as follow-ups.
Fair. This is also something that I think can be improved in follow-up work. |
|
un_dev_block ⚡ I am going to address all other things in a follow-up pull |
|
CR 📱 |
Description
Addresses multiple stored XSS vulnerabilities where user-controlled content was rendered as raw HTML. A security report flagged group names as a vector, but our investigation found the more critical issue is in rich-text description fields (
free_text) and flash messages that interpolate user data into HTML without escaping.Changes
Fix 1: Sanitize rich-text on storage
stevebauman/purify(HTMLPurifier wrapper) to sanitize group and event descriptions before saving to the database<script>, event handlers like onerror) while preserving safe HTML (<b>,<a>,<p>, etc.)API/GroupController.php,API/EventController.phpFix 2: Escape user data in flash messages and v-html
$group->name,$event->venue,$user->name,$network->name) ine()before interpolation into translation strings rendered via{!! !!}GroupController.php,AcceptUserInvites.php,networks/show.blade.php,GroupPage.vueQA Notes
Deploy to the staging instance and validate script tags are stripped.