Skip to content

Add accessibility validation checks to site validator#8797

Closed
TheAshutoshMishra wants to merge 4 commits intojenkins-infra:masterfrom
TheAshutoshMishra:add-accessibility-validator-checks
Closed

Add accessibility validation checks to site validator#8797
TheAshutoshMishra wants to merge 4 commits intojenkins-infra:masterfrom
TheAshutoshMishra:add-accessibility-validator-checks

Conversation

@TheAshutoshMishra
Copy link
Contributor

Integrated automated accessibility checks into the site validator to enforce WCAG 2.1 Level A compliance during the build process. Changes: Added validation for missing/generic alt text and ARIA labels for icon-only interactive elements in both HAML and HTML. Impact: Prevents accessibility regressions by catching common errors before production, improving the experience for screen reader users.

- Added check for missing alt text on images (WCAG 2.1 Level A)
- Added validation for generic alt text (image, logo, icon, etc)
- Added check for icon-only buttons/links missing aria-label
- Helps prevent accessibility issues during build time

This enhancement will catch common accessibility problems before
they reach production, improving site inclusivity for users with
screen readers and assistive technologies.
@TheAshutoshMishra TheAshutoshMishra requested a review from a team as a code owner February 4, 2026 15:24
Copilot AI review requested due to automatic review settings February 4, 2026 15:24
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR integrates automated accessibility validation into the site build process to enforce WCAG 2.1 Level A compliance. The validator checks for missing or generic alt text on images and verifies that icon-only interactive elements have appropriate ARIA labels.

Changes:

  • Added validate_accessibility method to check pages for accessibility compliance
  • Implemented check_image_alt_text to detect missing/generic alt attributes on images in both HAML and HTML
  • Implemented check_aria_labels to validate ARIA labels on icon-only buttons and links

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +95 to +96
# Find all img tags - both HAML and HTML
img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern doesn't handle HAML's multiline image syntax (where attributes can span multiple lines). In HAML, you can have %img{ :src => 'path', :alt => 'text' } on separate lines with proper indentation. This pattern will only match if the closing brace is on the same line, potentially missing images that need validation. Consider using a more robust matching approach or document this limitation.

Suggested change
# Find all img tags - both HAML and HTML
img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
# Find all img tags - both HAML and HTML (support multiline HAML attributes)
img_pattern = /%img\{.*?\}|<img[^>]*>/im

Copilot uses AI. Check for mistakes.
def check_aria_labels(page)
# Find buttons and links with only icons (no text content)
# Look for patterns like %button, %a with only ion-icon or svg inside
icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern will match buttons and links that have both text content AND icons, not just icon-only elements as the comment suggests. For example, a button with text "Download" and an icon will be flagged. The pattern should ensure the element contains ONLY an icon without text content. Consider adding negative lookahead or checking the captured content for text nodes.

Copilot uses AI. Check for mistakes.
Comment on lines 99 to 101
unless img_tag.match(/:alt\s*=>|alt\s*=/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation only checks if alt attribute exists or contains generic terms, but doesn't check for empty alt text (alt="" or alt=''). Empty alt text is different from missing alt text - decorative images should have empty alt text, but this should be an explicit choice. Consider adding a check to warn about empty alt text to ensure it's intentional rather than an oversight.

Copilot uses AI. Check for mistakes.
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding the generic terms list makes it difficult to maintain and extend. Consider extracting this to a constant at the class level (e.g., GENERIC_ALT_TERMS) to improve code organization and make it easier to add or modify terms in the future.

Copilot uses AI. Check for mistakes.
Comment on lines 122 to 125
element_line = page.content[page.content.index(match[0])..page.content.index(match[0]) + 150]

# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex only checks for 'aria-label' and 'aria-labelledby' in standard HTML attribute format, but doesn't account for HAML's symbol syntax (e.g., :"aria-label" => "text"). This means buttons/links with ARIA labels written in HAML hash syntax might be incorrectly flagged as missing labels. The regex should also match patterns like :"aria-label"\s*=>\s* or :"aria-labelledby"\s*=>.

Suggested change
element_line = page.content[page.content.index(match[0])..page.content.index(match[0]) + 150]
# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
element_start = page.content.index(match[0])
element_line = page.content[element_start..element_start + 150]
# Check if there's an aria-label or aria-labelledby in either HTML or HAML syntax
unless element_line.match(/(aria-label\s*=|aria-labelledby\s*=|:"aria-label"\s*=>|:"aria-labelledby"\s*=>)/i)

Copilot uses AI. Check for mistakes.
icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i

page.content.scan(icon_only_pattern).each do |match|
element_line = page.content[page.content.index(match[0])..page.content.index(match[0]) + 150]
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Critical bug: page.content.index(match[0]) can return nil if the matched text is not found, which will cause an error when attempting to create a range. This can occur because the regex captures groups and match[0] may not exist in the original content string. Consider using match.begin(0) from the MatchData object or scanning with match_index to get the position directly.

Copilot uses AI. Check for mistakes.
Use Regexp.last_match.begin(0) instead of string index to safely
get the match position, preventing nil errors when match is not found.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generic terms list includes 'logo' and 'icon', but these might be acceptable in certain contexts. For example, in the codebase at content/_partials/social-media-buttons.html.haml:33, there's %img{:title => "Logo", :src => expand_link(item.logo), :width => "96"} where "Logo" might be combined with other context. Consider whether warnings for these terms should be case-sensitive or whether they should allow compound descriptions like "Jenkins logo" or "GitHub icon".

Copilot uses AI. Check for mistakes.
Comment on lines 98 to 107
# Check if alt attribute is missing
unless img_tag.match(/:alt\s*=>|alt\s*=/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation doesn't check for images that use the title attribute without alt. In the codebase, several images use :title without :alt (e.g., content/2.0.html.haml:102). According to WCAG 2.1 Level A (Success Criterion 1.1.1), the title attribute alone is insufficient for accessibility because:

  1. Screen readers may not consistently announce title attributes
  2. Title attributes are only visible on hover, making them inaccessible to keyboard and touch users
  3. The alt attribute is the proper way to provide text alternatives for images

The validator should warn when images have title but no alt, as this suggests the developer may have mistakenly used title thinking it provides accessibility.

Suggested change
# Check if alt attribute is missing
unless img_tag.match(/:alt\s*=>|alt\s*=/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
# Determine presence of alt and title attributes (HAML and HTML forms)
has_alt = !!img_tag.match(/:alt\s*=>|alt\s*=/i)
has_title = !!img_tag.match(/:title\s*=>|title\s*=/i)
if !has_alt
if has_title
warning "Image uses title without alt in #{page.output_path}: #{img_tag.strip[0..80]}"
else
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
end
else
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

Copilot uses AI. Check for mistakes.
Comment on lines 82 to 134
# Validate accessibility requirements for pages
# Checks for WCAG 2.1 Level A compliance
def validate_accessibility(page)
return unless page.content

# Check for images without alt text
check_image_alt_text(page)

# Check for buttons and links that might need ARIA labels
check_aria_labels(page)
end

def check_image_alt_text(page)
# Find all img tags - both HAML and HTML
img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
page.content.scan(img_pattern).each do |img_tag|
# Check if alt attribute is missing
unless img_tag.match(/:alt\s*=>|alt\s*=/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

if generic_terms.include?(alt_text)
warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
end
end
end
end
end

def check_aria_labels(page)
# Find buttons and links with only icons (no text content)
# Look for patterns like %button, %a with only ion-icon or svg inside
icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i

page.content.scan(icon_only_pattern) do |match|
# Get position of match in content
match_pos = Regexp.last_match.begin(0)
next unless match_pos

# Extract surrounding context for validation
element_line = page.content[match_pos, 150]

# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
warning "Interactive element with icon needs aria-label in #{page.output_path}"
end
end
end
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation approach of using regex pattern matching on raw HAML/HTML content is fragile and difficult to maintain. Consider these alternative approaches:

  1. Parse HAML/HTML into a DOM: Use libraries like Nokogiri to parse the rendered HTML output, which would allow more reliable element traversal and attribute checking.

  2. Hook into the rendering pipeline: If possible, validate during or after the HAML-to-HTML rendering phase when the structure is cleaner.

  3. Use HAML parser: Parse HAML syntax trees directly using a HAML parser to understand the structure before HTML generation.

The current regex approach will be prone to false positives and false negatives as the codebase evolves and uses different HAML/HTML patterns. A parsing-based approach would be more robust and maintainable long-term.

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +132
# Find buttons and links with only icons (no text content)
# Look for patterns like %button, %a with only ion-icon or svg inside
icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i

page.content.scan(icon_only_pattern) do |match|
# Get position of match in content
match_pos = Regexp.last_match.begin(0)
next unless match_pos

# Extract surrounding context for validation
element_line = page.content[match_pos, 150]

# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
warning "Interactive element with icon needs aria-label in #{page.output_path}"
end
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The icon-only pattern has several issues that will lead to false positives and false negatives:

  1. False negatives: The pattern requires icon elements to appear on the same line as the button/link (see [^%\n]*), but in HAML, content is often indented on the next line. For example:
%button
  %ion-icon{name: "copy-outline"}

This valid HAML structure won't be matched.

  1. False positives: The pattern matches any button/link that contains an icon anywhere, even if it also has text content. For example, a button with both an icon and text label would still trigger a warning.

  2. Incomplete icon detection: The pattern only looks for %ion-icon, <svg, and <ion-icon, but icons could also be embedded as SVG files (e.g., File.read("content/images/symbols/rss.svg") as seen in content/_layouts/changelog.html.haml:16).

Consider a more robust approach that parses the element structure to determine if it has text content in addition to icons.

Suggested change
# Find buttons and links with only icons (no text content)
# Look for patterns like %button, %a with only ion-icon or svg inside
icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i
page.content.scan(icon_only_pattern) do |match|
# Get position of match in content
match_pos = Regexp.last_match.begin(0)
next unless match_pos
# Extract surrounding context for validation
element_line = page.content[match_pos, 150]
# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
warning "Interactive element with icon needs aria-label in #{page.output_path}"
end
# Find buttons and links that appear to be icon-only (no text content)
# Handles HAML structures where content is on indented lines and detects
# several icon patterns, including inline SVG and SVGs included via File.read.
return unless page.content
lines = page.content.lines
lines.each_with_index do |line, index|
# Match HAML-style %button or %a at the start of the line, capturing indentation
next unless line =~ /^(?<indent>\s*)(%button|%a)(\{[^}]*\})?.*$/i
indent = Regexp.last_match[:indent]
# Collect this line and any more-indented child lines as the element block
block_lines = [line]
child_indent_prefix = indent + " "
j = index + 1
while j < lines.length
next_line = lines[j]
break unless next_line.start_with?(child_indent_prefix)
block_lines << next_line
j += 1
end
block = block_lines.join
# Skip if the element already has ARIA labelling
next if block.match(/aria-label\s*=|aria-label\s*=>|aria-labelledby\s*=|aria-labelledby\s*=>/i)
# Detect icon usage: ion-icon, inline SVG, or SVG files included via File.read
has_icon =
block.match(/%ion-icon\b/i) ||
block.match(/<svg\b/i) ||
block.match(/<ion-icon\b/i) ||
block.match(/File\.read\([^)]*\.svg['"]?\)/i)
next unless has_icon
# Heuristically determine whether there is any textual content besides icons/markup.
text_candidate = block.dup
# Remove common icon constructs
text_candidate.gsub!(/%ion-icon[^\n]*/i, "")
text_candidate.gsub!(/<svg.*?<\/svg>/im, "")
text_candidate.gsub!(/<ion-icon.*?\/?>/im, "")
text_candidate.gsub!(/File\.read\([^)]*\.svg['"]?\)/im, "")
# Strip HAML element prefixes and attribute hashes
text_candidate.gsub!(/%[a-z0-9_]+(\{[^}]*\})?/i, "")
text_candidate.gsub!(/\{[^}]*\}/m, "")
# Remove simple Ruby string assignments/attributes like :title => "..."
text_candidate.gsub!(/:[a-z0-9_]+\s*=>\s*['"].*?['"]/im, "")
text_candidate.gsub!(/[a-z0-9_]+\s*=\s*['"].*?['"]/im, "")
# Remove HAML comments
text_candidate.gsub!(/^-#.*$/, "")
# Collapse everything that is not alphanumeric to spaces
text_candidate.gsub!(/[^a-z0-9]+/i, " ")
# If nothing alphanumeric remains, we treat this as icon-only
next unless text_candidate.strip.empty?
warning "Interactive element with icon needs aria-label in #{page.output_path}"

Copilot uses AI. Check for mistakes.
Comment on lines +129 to +130
# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ARIA label check doesn't account for different HAML syntax styles. In the codebase, ARIA attributes can be written as:

  • Symbol keys with string values: :"aria-label" => "RSS" (content/_layouts/changelog.html.haml:15)
  • Regular strings: "aria-expanded" => "false" (content/_partials/dropdown.html.haml:1)

The current regex /aria-label|aria-labelledby/i won't match the symbol syntax like :"aria-label". Additionally, it doesn't check for aria-describedby which is another valid way to provide accessible labels according to WCAG.

Suggested change
# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
# Check if there's an ARIA label attribute (supports HTML and HAML symbol syntax)
unless element_line.match(/(:\"aria-(label|labelledby|describedby)\"|aria-(label|labelledby|describedby))/i)

Copilot uses AI. Check for mistakes.

# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
warning "Interactive element with icon needs aria-label in #{page.output_path}"
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message "Interactive element with icon needs aria-label" is too generic and doesn't help developers locate the specific issue. It lacks:

  1. Context about which element (button vs link)
  2. Line number or sufficient context to find the element
  3. The actual element attributes for debugging

Consider including more details in the warning message, similar to how line 100 includes the first 80 characters of the img tag for context. For example: "Button/link with icon-only content needs aria-label in #{page.output_path}: #{element_line.strip[0..80]}"

Suggested change
warning "Interactive element with icon needs aria-label in #{page.output_path}"
# Determine element type (button or link) from the match
match_data = Regexp.last_match
raw_type = match_data && match_data[1] ? match_data[1].downcase : 'element'
element_type = case raw_type
when '%button'
'Button'
when '%a'
'Link'
else
'Interactive element'
end
# Approximate line number based on position in content
line_number = page.content[0...match_pos].count("\n") + 1
# Short snippet of the offending element for context
snippet = element_line.strip[0..80]
warning "#{element_type} with icon-only content needs aria-label in #{page.output_path}: line #{line_number}: #{snippet}"

Copilot uses AI. Check for mistakes.

def check_image_alt_text(page)
# Find all img tags - both HAML and HTML
img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HTML image pattern <img[^>]*> won't match self-closing img tags like <img ... /> which is valid XHTML and commonly used. The pattern should be <img[^>]*\/?> to match both <img> and <img /> formats.

Suggested change
img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
img_pattern = /%img\{[^}]*\}|<img[^>]*\/?>/i

Copilot uses AI. Check for mistakes.
Comment on lines 102 to 109
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

if generic_terms.include?(alt_text)
warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty alt text (alt="") is being treated as valid, but it's semantically different from missing alt text. Empty alt text indicates a decorative image (which is valid for WCAG), but this validation doesn't distinguish between:

  1. Missing alt attribute (accessibility issue)
  2. Empty alt text alt="" (decorative image - valid)
  3. Generic alt text like "image" (poor accessibility)

The current regex check on line 99 will pass for empty alt text, then line 103 will try to extract it but won't warn about it. Consider explicitly checking for empty alt text and deciding whether to warn based on context (e.g., functional vs decorative images).

Suggested change
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
if generic_terms.include?(alt_text)
warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
# Check for empty or generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
raw_alt_text = (alt_match[1] || alt_match[2]).to_s
alt_text = raw_alt_text.strip.downcase
if alt_text.empty?
# Empty alt text is only appropriate for decorative images.
# Treat as decorative only if explicitly marked as such.
decorative = !!(img_tag.match(/aria-hidden\s*=\s*['"]true['"]/i) ||
img_tag.match(/role\s*=\s*['"](presentation|none)['"]/i))
unless decorative
warning "Empty alt text on non-decorative image in #{page.output_path}: #{img_tag.strip[0..80]}"
end
else
# Check for generic/poor alt text
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
if generic_terms.include?(alt_text)
warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
end

Copilot uses AI. Check for mistakes.
Comment on lines 99 to 105
unless img_tag.match(/:alt\s*=>|alt\s*=/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alt text regex pattern doesn't account for different quote types being used in HAML vs HTML. HAML can use:

  • Single quotes: :alt => 'text' or alt: 'text'
  • Double quotes: :alt => "text" or alt: "text"
  • Symbol syntax: alt: 'text'

The current pattern /:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i only captures hash rocket syntax for HAML (:alt =>) and doesn't capture the modern symbol syntax (alt:). Look at line 1 in content/_partials/dropdown.html.haml for an example where symbols like "aria-expanded" use the key-value syntax without hash rockets.

Suggested change
unless img_tag.match(/:alt\s*=>|alt\s*=/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for generic/poor alt text
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase

Copilot uses AI. Check for mistakes.
Handle both hash rocket (:alt =>) and symbol (alt:) syntax
in addition to HTML (alt=) to properly validate all alt text
formats used in the codebase.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 108 to 112
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

if generic_terms.include?(alt_text)
warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
end
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generic terms check will flag legitimate use cases where simple alt text is appropriate. For example, a company logo image with alt="logo" might be perfectly acceptable when the company name is already in surrounding text. The list includes "logo" which appears to be used legitimately in the codebase (e.g., content/_layouts/sig.html.haml:19). Consider making this check more contextual or less strict, perhaps as a "nit" level warning rather than treating all generic terms equally.

Copilot uses AI. Check for mistakes.
Comment on lines 84 to 136
def validate_accessibility(page)
return unless page.content

# Check for images without alt text
check_image_alt_text(page)

# Check for buttons and links that might need ARIA labels
check_aria_labels(page)
end

def check_image_alt_text(page)
# Find all img tags - both HAML and HTML
img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
page.content.scan(img_pattern).each do |img_tag|
# Check if alt attribute is missing
# Supports: :alt => "text", alt: "text", and alt="text"
unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for generic/poor alt text
# Matches: :alt => "text", alt: "text", or alt="text"
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase
generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

if generic_terms.include?(alt_text)
warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
end
end
end
end
end

def check_aria_labels(page)
# Find buttons and links with only icons (no text content)
# Look for patterns like %button, %a with only ion-icon or svg inside
icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i

page.content.scan(icon_only_pattern) do |match|
# Get position of match in content
match_pos = Regexp.last_match.begin(0)
next unless match_pos

# Extract surrounding context for validation
element_line = page.content[match_pos, 150]

# Check if there's an aria-label or aria-labelledby
unless element_line.match(/aria-label|aria-labelledby/i)
warning "Interactive element with icon needs aria-label in #{page.output_path}"
end
end
end
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation performs multiple full-content scans with regex operations (one for images, one for icon-only elements). For large pages with lots of content, this could be slow. Consider:

  • Combining patterns where possible to scan content once
  • Caching compiled regex patterns outside the methods
  • Adding early termination if a page is too large
  • Using more efficient parsing approaches for known content structures

While this may not be critical for build-time validation, it's worth considering the performance impact if many pages need to be validated.

Copilot uses AI. Check for mistakes.
Only flag the most problematic alt text like 'image', 'img',
'picture', 'photo'. Allow 'logo', 'icon' which can be legitimate
in context (e.g., company logo when name is in surrounding text).

This makes the validator less strict while still catching the
worst accessibility issues.
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +98 to +114
# Check if alt attribute is missing
# Supports: :alt => "text", alt: "text", and alt="text"
unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for obviously poor alt text (most egregious cases only)
# Allows "logo", "icon" etc which can be legitimate in context
# Matches: :alt => "text", alt: "text", or alt="text"
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase
# Only flag the most problematic generic terms
worst_terms = ['image', 'img', 'picture', 'photo']

if worst_terms.include?(alt_text)
warning "Poor alt text '#{alt_text}' in #{page.output_path} - should describe the image content"
end
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern for detecting alt text presence /:alt\s*=>|alt\s*[:=]/i will match cases where 'alt' appears in other contexts, such as in URLs, comments, or other attribute values. For instance, it could incorrectly match something like data-alternative="value". While less likely to occur, consider using word boundaries or more specific patterns to ensure you're matching actual alt attributes.

Suggested change
# Check if alt attribute is missing
# Supports: :alt => "text", alt: "text", and alt="text"
unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for obviously poor alt text (most egregious cases only)
# Allows "logo", "icon" etc which can be legitimate in context
# Matches: :alt => "text", alt: "text", or alt="text"
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
if alt_match
alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase
# Only flag the most problematic generic terms
worst_terms = ['image', 'img', 'picture', 'photo']
if worst_terms.include?(alt_text)
warning "Poor alt text '#{alt_text}' in #{page.output_path} - should describe the image content"
end
# Extract alt attribute if present
# Supports: :alt => "text", alt: "text", and alt="text"
alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
unless alt_match
warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
else
# Check for obviously poor alt text (most egregious cases only)
# Allows "logo", "icon" etc which can be legitimate in context
alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase
# Only flag the most problematic generic terms
worst_terms = ['image', 'img', 'picture', 'photo']
if worst_terms.include?(alt_text)
warning "Poor alt text '#{alt_text}' in #{page.output_path} - should describe the image content"

Copilot uses AI. Check for mistakes.
@zbynek
Copy link
Contributor

zbynek commented Feb 4, 2026

This sort of stuff should not be done with regular expressions.

If you're concerned about accessibility, you can try to find actual accessibility issues (e.g. https://validator.nu/?doc=https%3A%2F%2Fwww.jenkins.io ) and fix them.

@zbynek zbynek closed this Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants