Add accessibility validation checks to site validator by TheAshutoshMishra · Pull Request #8797 · jenkins-infra/jenkins.io

TheAshutoshMishra · 2026-02-04T15:24:00Z

Integrated automated accessibility checks into the site validator to enforce WCAG 2.1 Level A compliance during the build process. Changes: Added validation for missing/generic alt text and ARIA labels for icon-only interactive elements in both HAML and HTML. Impact: Prevents accessibility regressions by catching common errors before production, improving the experience for screen reader users.

- Added check for missing alt text on images (WCAG 2.1 Level A) - Added validation for generic alt text (image, logo, icon, etc) - Added check for icon-only buttons/links missing aria-label - Helps prevent accessibility issues during build time This enhancement will catch common accessibility problems before they reach production, improving site inclusivity for users with screen readers and assistive technologies.

Copilot

Pull request overview

This PR integrates automated accessibility validation into the site build process to enforce WCAG 2.1 Level A compliance. The validator checks for missing or generic alt text on images and verifies that icon-only interactive elements have appropriate ARIA labels.

Changes:

Added validate_accessibility method to check pages for accessibility compliance
Implemented check_image_alt_text to detect missing/generic alt attributes on images in both HAML and HTML
Implemented check_aria_labels to validate ARIA labels on icon-only buttons and links

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-04T15:28:13Z

content/_ext/validator.rb

+    # Find all img tags - both HAML and HTML
+    img_pattern = /%img\{[^}]*\}|<img[^>]*>/i


The regex pattern doesn't handle HAML's multiline image syntax (where attributes can span multiple lines). In HAML, you can have %img{ :src => 'path', :alt => 'text' } on separate lines with proper indentation. This pattern will only match if the closing brace is on the same line, potentially missing images that need validation. Consider using a more robust matching approach or document this limitation.

Suggested change

# Find all img tags - both HAML and HTML

img_pattern = /%img\{[^}]*\}|<img[^>]*>/i

# Find all img tags - both HAML and HTML (support multiline HAML attributes)

img_pattern = /%img\{.*?\}|<img[^>]*>/im

Copilot · 2026-02-04T15:28:14Z

content/_ext/validator.rb

+  def check_aria_labels(page)
+    # Find buttons and links with only icons (no text content)
+    # Look for patterns like %button, %a with only ion-icon or svg inside
+    icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i


The regex pattern will match buttons and links that have both text content AND icons, not just icon-only elements as the comment suggests. For example, a button with text "Download" and an icon will be flagged. The pattern should ensure the element contains ONLY an icon without text content. Consider adding negative lookahead or checking the captured content for text nodes.

Copilot · 2026-02-04T15:28:14Z

content/_ext/validator.rb

+      unless img_tag.match(/:alt\s*=>|alt\s*=/i)
+        warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
+      else


The validation only checks if alt attribute exists or contains generic terms, but doesn't check for empty alt text (alt="" or alt=''). Empty alt text is different from missing alt text - decorative images should have empty alt text, but this should be an explicit choice. Consider adding a check to warn about empty alt text to ensure it's intentional rather than an oversight.

Copilot · 2026-02-04T15:28:14Z

content/_ext/validator.rb

+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
+          generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']


Hardcoding the generic terms list makes it difficult to maintain and extend. Consider extracting this to a constant at the class level (e.g., GENERIC_ALT_TERMS) to improve code organization and make it easier to add or modify terms in the future.

Copilot · 2026-02-04T15:28:14Z

content/_ext/validator.rb

+      element_line = page.content[page.content.index(match[0])..page.content.index(match[0]) + 150]
+
+      # Check if there's an aria-label or aria-labelledby
+      unless element_line.match(/aria-label|aria-labelledby/i)


The regex only checks for 'aria-label' and 'aria-labelledby' in standard HTML attribute format, but doesn't account for HAML's symbol syntax (e.g., :"aria-label" => "text"). This means buttons/links with ARIA labels written in HAML hash syntax might be incorrectly flagged as missing labels. The regex should also match patterns like :"aria-label"\s*=>\s* or :"aria-labelledby"\s*=>.

Suggested change

element_line = page.content[page.content.index(match[0])..page.content.index(match[0]) + 150]

# Check if there's an aria-label or aria-labelledby

unless element_line.match(/aria-label|aria-labelledby/i)

element_start = page.content.index(match[0])

element_line = page.content[element_start..element_start + 150]

# Check if there's an aria-label or aria-labelledby in either HTML or HAML syntax

unless element_line.match(/(aria-label\s*=|aria-labelledby\s*=|:"aria-label"\s*=>|:"aria-labelledby"\s*=>)/i)

Copilot · 2026-02-04T15:28:15Z

content/_ext/validator.rb

+    icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i
+
+    page.content.scan(icon_only_pattern).each do |match|
+      element_line = page.content[page.content.index(match[0])..page.content.index(match[0]) + 150]


Critical bug: page.content.index(match[0]) can return nil if the matched text is not found, which will cause an error when attempting to create a range. This can occur because the regex captures groups and match[0] may not exist in the original content string. Consider using match.begin(0) from the MatchData object or scanning with match_index to get the position directly.

Use Regexp.last_match.begin(0) instead of string index to safely get the match position, preventing nil errors when match is not found.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 9 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-04T16:06:00Z

content/_ext/validator.rb

+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
+          generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']


The generic terms list includes 'logo' and 'icon', but these might be acceptable in certain contexts. For example, in the codebase at content/_partials/social-media-buttons.html.haml:33, there's %img{:title => "Logo", :src => expand_link(item.logo), :width => "96"} where "Logo" might be combined with other context. Consider whether warnings for these terms should be case-sensitive or whether they should allow compound descriptions like "Jenkins logo" or "GitHub icon".

Copilot · 2026-02-04T16:06:01Z

content/_ext/validator.rb

+      # Check if alt attribute is missing
+      unless img_tag.match(/:alt\s*=>|alt\s*=/i)
+        warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
+      else
+        # Check for generic/poor alt text
+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
+          generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
+


The validation doesn't check for images that use the title attribute without alt. In the codebase, several images use :title without :alt (e.g., content/2.0.html.haml:102). According to WCAG 2.1 Level A (Success Criterion 1.1.1), the title attribute alone is insufficient for accessibility because:

Screen readers may not consistently announce title attributes

Title attributes are only visible on hover, making them inaccessible to keyboard and touch users

The alt attribute is the proper way to provide text alternatives for images

The validator should warn when images have title but no alt, as this suggests the developer may have mistakenly used title thinking it provides accessibility.

Suggested change

# Check if alt attribute is missing

unless img_tag.match(/:alt\s*=>|alt\s*=/i)

warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"

else

# Check for generic/poor alt text

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

if alt_match

alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase

generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

# Determine presence of alt and title attributes (HAML and HTML forms)

has_alt = !!img_tag.match(/:alt\s*=>|alt\s*=/i)

has_title = !!img_tag.match(/:title\s*=>|title\s*=/i)

if !has_alt

if has_title

warning "Image uses title without alt in #{page.output_path}: #{img_tag.strip[0..80]}"

else

warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"

end

else

# Check for generic/poor alt text

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

if alt_match

alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase

generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

Copilot · 2026-02-04T16:06:02Z

content/_ext/validator.rb

+  # Validate accessibility requirements for pages
+  # Checks for WCAG 2.1 Level A compliance
+  def validate_accessibility(page)
+    return unless page.content
+
+    # Check for images without alt text
+    check_image_alt_text(page)
+
+    # Check for buttons and links that might need ARIA labels
+    check_aria_labels(page)
+  end
+
+  def check_image_alt_text(page)
+    # Find all img tags - both HAML and HTML
+    img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
+    page.content.scan(img_pattern).each do |img_tag|
+      # Check if alt attribute is missing
+      unless img_tag.match(/:alt\s*=>|alt\s*=/i)
+        warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
+      else
+        # Check for generic/poor alt text
+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
+          generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
+
+          if generic_terms.include?(alt_text)
+            warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
+          end
+        end
+      end
+    end
+  end
+
+  def check_aria_labels(page)
+    # Find buttons and links with only icons (no text content)
+    # Look for patterns like %button, %a with only ion-icon or svg inside
+    icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i
+
+    page.content.scan(icon_only_pattern) do |match|
+      # Get position of match in content
+      match_pos = Regexp.last_match.begin(0)
+      next unless match_pos
+
+      # Extract surrounding context for validation
+      element_line = page.content[match_pos, 150]
+
+      # Check if there's an aria-label or aria-labelledby
+      unless element_line.match(/aria-label|aria-labelledby/i)
+        warning "Interactive element with icon needs aria-label in #{page.output_path}"
+      end
+    end
+  end


The validation approach of using regex pattern matching on raw HAML/HTML content is fragile and difficult to maintain. Consider these alternative approaches:

Parse HAML/HTML into a DOM: Use libraries like Nokogiri to parse the rendered HTML output, which would allow more reliable element traversal and attribute checking.

Hook into the rendering pipeline: If possible, validate during or after the HAML-to-HTML rendering phase when the structure is cleaner.

Use HAML parser: Parse HAML syntax trees directly using a HAML parser to understand the structure before HTML generation.

The current regex approach will be prone to false positives and false negatives as the codebase evolves and uses different HAML/HTML patterns. A parsing-based approach would be more robust and maintainable long-term.

Copilot · 2026-02-04T16:06:02Z

content/_ext/validator.rb

+    # Find buttons and links with only icons (no text content)
+    # Look for patterns like %button, %a with only ion-icon or svg inside
+    icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i
+
+    page.content.scan(icon_only_pattern) do |match|
+      # Get position of match in content
+      match_pos = Regexp.last_match.begin(0)
+      next unless match_pos
+
+      # Extract surrounding context for validation
+      element_line = page.content[match_pos, 150]
+
+      # Check if there's an aria-label or aria-labelledby
+      unless element_line.match(/aria-label|aria-labelledby/i)
+        warning "Interactive element with icon needs aria-label in #{page.output_path}"
+      end


The icon-only pattern has several issues that will lead to false positives and false negatives:

False negatives: The pattern requires icon elements to appear on the same line as the button/link (see [^%\n]*), but in HAML, content is often indented on the next line. For example:

%button %ion-icon{name: "copy-outline"}

This valid HAML structure won't be matched.

False positives: The pattern matches any button/link that contains an icon anywhere, even if it also has text content. For example, a button with both an icon and text label would still trigger a warning.

Incomplete icon detection: The pattern only looks for %ion-icon, <svg, and <ion-icon, but icons could also be embedded as SVG files (e.g., File.read("content/images/symbols/rss.svg") as seen in content/_layouts/changelog.html.haml:16).

Consider a more robust approach that parses the element structure to determine if it has text content in addition to icons.

Suggested change

# Find buttons and links with only icons (no text content)

# Look for patterns like %button, %a with only ion-icon or svg inside

icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i

page.content.scan(icon_only_pattern) do |match|

# Get position of match in content

match_pos = Regexp.last_match.begin(0)

next unless match_pos

# Extract surrounding context for validation

element_line = page.content[match_pos, 150]

# Check if there's an aria-label or aria-labelledby

unless element_line.match(/aria-label|aria-labelledby/i)

warning "Interactive element with icon needs aria-label in #{page.output_path}"

end

# Find buttons and links that appear to be icon-only (no text content)

# Handles HAML structures where content is on indented lines and detects

# several icon patterns, including inline SVG and SVGs included via File.read.

return unless page.content

lines = page.content.lines

lines.each_with_index do |line, index|

# Match HAML-style %button or %a at the start of the line, capturing indentation

next unless line =~ /^(?<indent>\s*)(%button|%a)(\{[^}]*\})?.*$/i

indent = Regexp.last_match[:indent]

# Collect this line and any more-indented child lines as the element block

block_lines = [line]

child_indent_prefix = indent + " "

j = index + 1

while j < lines.length

next_line = lines[j]

break unless next_line.start_with?(child_indent_prefix)

block_lines << next_line

j += 1

end

block = block_lines.join

# Skip if the element already has ARIA labelling

next if block.match(/aria-label\s*=|aria-label\s*=>|aria-labelledby\s*=|aria-labelledby\s*=>/i)

# Detect icon usage: ion-icon, inline SVG, or SVG files included via File.read

has_icon =

block.match(/%ion-icon\b/i) ||

block.match(/<svg\b/i) ||

block.match(/<ion-icon\b/i) ||

block.match(/File\.read$[^)]*\.svg['"]?$/i)

next unless has_icon

# Heuristically determine whether there is any textual content besides icons/markup.

text_candidate = block.dup

# Remove common icon constructs

text_candidate.gsub!(/%ion-icon[^\n]*/i, "")

text_candidate.gsub!(/<svg.*?<\/svg>/im, "")

text_candidate.gsub!(/<ion-icon.*?\/?>/im, "")

text_candidate.gsub!(/File\.read$[^)]*\.svg['"]?$/im, "")

# Strip HAML element prefixes and attribute hashes

text_candidate.gsub!(/%[a-z0-9_]+(\{[^}]*\})?/i, "")

text_candidate.gsub!(/\{[^}]*\}/m, "")

# Remove simple Ruby string assignments/attributes like :title => "..."

text_candidate.gsub!(/:[a-z0-9_]+\s*=>\s*['"].*?['"]/im, "")

text_candidate.gsub!(/[a-z0-9_]+\s*=\s*['"].*?['"]/im, "")

# Remove HAML comments

text_candidate.gsub!(/^-#.*$/, "")

# Collapse everything that is not alphanumeric to spaces

text_candidate.gsub!(/[^a-z0-9]+/i, " ")

# If nothing alphanumeric remains, we treat this as icon-only

next unless text_candidate.strip.empty?

warning "Interactive element with icon needs aria-label in #{page.output_path}"

Copilot · 2026-02-04T16:06:02Z

content/_ext/validator.rb

+      # Check if there's an aria-label or aria-labelledby
+      unless element_line.match(/aria-label|aria-labelledby/i)


The ARIA label check doesn't account for different HAML syntax styles. In the codebase, ARIA attributes can be written as:

Symbol keys with string values: :"aria-label" => "RSS" (content/_layouts/changelog.html.haml:15)

Regular strings: "aria-expanded" => "false" (content/_partials/dropdown.html.haml:1)

The current regex /aria-label|aria-labelledby/i won't match the symbol syntax like :"aria-label". Additionally, it doesn't check for aria-describedby which is another valid way to provide accessible labels according to WCAG.

Suggested change

# Check if there's an aria-label or aria-labelledby

unless element_line.match(/aria-label|aria-labelledby/i)

# Check if there's an ARIA label attribute (supports HTML and HAML symbol syntax)

unless element_line.match(/(:\"aria-(label|labelledby|describedby)\"|aria-(label|labelledby|describedby))/i)

Copilot · 2026-02-04T16:06:02Z

content/_ext/validator.rb

+
+      # Check if there's an aria-label or aria-labelledby
+      unless element_line.match(/aria-label|aria-labelledby/i)
+        warning "Interactive element with icon needs aria-label in #{page.output_path}"


The error message "Interactive element with icon needs aria-label" is too generic and doesn't help developers locate the specific issue. It lacks:

Context about which element (button vs link)

Line number or sufficient context to find the element

The actual element attributes for debugging

Consider including more details in the warning message, similar to how line 100 includes the first 80 characters of the img tag for context. For example: "Button/link with icon-only content needs aria-label in #{page.output_path}: #{element_line.strip[0..80]}"

Suggested change

warning "Interactive element with icon needs aria-label in #{page.output_path}"

# Determine element type (button or link) from the match

match_data = Regexp.last_match

raw_type = match_data && match_data[1] ? match_data[1].downcase : 'element'

element_type = case raw_type

when '%button'

'Button'

when '%a'

'Link'

else

'Interactive element'

end

# Approximate line number based on position in content

line_number = page.content[0...match_pos].count("\n") + 1

# Short snippet of the offending element for context

snippet = element_line.strip[0..80]

warning "#{element_type} with icon-only content needs aria-label in #{page.output_path}: line #{line_number}: #{snippet}"

Copilot · 2026-02-04T16:06:03Z

content/_ext/validator.rb

+
+  def check_image_alt_text(page)
+    # Find all img tags - both HAML and HTML
+    img_pattern = /%img\{[^}]*\}|<img[^>]*>/i


The HTML image pattern <img[^>]*> won't match self-closing img tags like <img ... /> which is valid XHTML and commonly used. The pattern should be <img[^>]*\/?> to match both <img> and <img /> formats.

Suggested change

img_pattern = /%img\{[^}]*\}|<img[^>]*>/i

img_pattern = /%img\{[^}]*\}|<img[^>]*\/?>/i

Copilot · 2026-02-04T16:06:03Z

content/_ext/validator.rb

+        # Check for generic/poor alt text
+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase
+          generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
+
+          if generic_terms.include?(alt_text)
+            warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"


Empty alt text (alt="") is being treated as valid, but it's semantically different from missing alt text. Empty alt text indicates a decorative image (which is valid for WCAG), but this validation doesn't distinguish between:

Missing alt attribute (accessibility issue)

Empty alt text alt="" (decorative image - valid)

Generic alt text like "image" (poor accessibility)

The current regex check on line 99 will pass for empty alt text, then line 103 will try to extract it but won't warn about it. Consider explicitly checking for empty alt text and deciding whether to warn based on context (e.g., functional vs decorative images).

Suggested change

# Check for generic/poor alt text

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

if alt_match

alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase

generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

if generic_terms.include?(alt_text)

warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"

# Check for empty or generic/poor alt text

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

if alt_match

raw_alt_text = (alt_match[1] || alt_match[2]).to_s

alt_text = raw_alt_text.strip.downcase

if alt_text.empty?

# Empty alt text is only appropriate for decorative images.

# Treat as decorative only if explicitly marked as such.

decorative = !!(img_tag.match(/aria-hidden\s*=\s*['"]true['"]/i) ||

img_tag.match(/role\s*=\s*['"](presentation|none)['"]/i))

unless decorative

warning "Empty alt text on non-decorative image in #{page.output_path}: #{img_tag.strip[0..80]}"

end

else

# Check for generic/poor alt text

generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']

if generic_terms.include?(alt_text)

warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"

end

Copilot · 2026-02-04T16:06:03Z

content/_ext/validator.rb

+      unless img_tag.match(/:alt\s*=>|alt\s*=/i)
+        warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
+      else
+        # Check for generic/poor alt text
+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase


The alt text regex pattern doesn't account for different quote types being used in HAML vs HTML. HAML can use:

Single quotes: :alt => 'text' or alt: 'text'

Double quotes: :alt => "text" or alt: "text"

Symbol syntax: alt: 'text'

The current pattern /:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i only captures hash rocket syntax for HAML (:alt =>) and doesn't capture the modern symbol syntax (alt:). Look at line 1 in content/_partials/dropdown.html.haml for an example where symbols like "aria-expanded" use the key-value syntax without hash rockets.

Suggested change

unless img_tag.match(/:alt\s*=>|alt\s*=/i)

warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"

else

# Check for generic/poor alt text

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

if alt_match

alt_text = (alt_match[1] || alt_match[2]).to_s.strip.downcase

unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)

warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"

else

# Check for generic/poor alt text

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

if alt_match

alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase

Handle both hash rocket (:alt =>) and symbol (alt:) syntax in addition to HTML (alt=) to properly validate all alt text formats used in the codebase.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-04T16:21:03Z

content/_ext/validator.rb

+          generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
+
+          if generic_terms.include?(alt_text)
+            warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
+          end


The generic terms check will flag legitimate use cases where simple alt text is appropriate. For example, a company logo image with alt="logo" might be perfectly acceptable when the company name is already in surrounding text. The list includes "logo" which appears to be used legitimately in the codebase (e.g., content/_layouts/sig.html.haml:19). Consider making this check more contextual or less strict, perhaps as a "nit" level warning rather than treating all generic terms equally.

Copilot · 2026-02-04T16:21:03Z

content/_ext/validator.rb

+  def validate_accessibility(page)
+    return unless page.content
+
+    # Check for images without alt text
+    check_image_alt_text(page)
+
+    # Check for buttons and links that might need ARIA labels
+    check_aria_labels(page)
+  end
+
+  def check_image_alt_text(page)
+    # Find all img tags - both HAML and HTML
+    img_pattern = /%img\{[^}]*\}|<img[^>]*>/i
+    page.content.scan(img_pattern).each do |img_tag|
+      # Check if alt attribute is missing
+      # Supports: :alt => "text", alt: "text", and alt="text"
+      unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)
+        warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
+      else
+        # Check for generic/poor alt text
+        # Matches: :alt => "text", alt: "text", or alt="text"
+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase
+          generic_terms = ['image', 'img', 'logo', 'icon', 'picture', 'photo']
+
+          if generic_terms.include?(alt_text)
+            warning "Generic alt text '#{alt_text}' in #{page.output_path} - should be more descriptive"
+          end
+        end
+      end
+    end
+  end
+
+  def check_aria_labels(page)
+    # Find buttons and links with only icons (no text content)
+    # Look for patterns like %button, %a with only ion-icon or svg inside
+    icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i
+
+    page.content.scan(icon_only_pattern) do |match|
+      # Get position of match in content
+      match_pos = Regexp.last_match.begin(0)
+      next unless match_pos
+
+      # Extract surrounding context for validation
+      element_line = page.content[match_pos, 150]
+
+      # Check if there's an aria-label or aria-labelledby
+      unless element_line.match(/aria-label|aria-labelledby/i)
+        warning "Interactive element with icon needs aria-label in #{page.output_path}"
+      end
+    end
+  end


The validation performs multiple full-content scans with regex operations (one for images, one for icon-only elements). For large pages with lots of content, this could be slow. Consider:

Combining patterns where possible to scan content once

Caching compiled regex patterns outside the methods

Adding early termination if a page is too large

Using more efficient parsing approaches for known content structures

While this may not be critical for build-time validation, it's worth considering the performance impact if many pages need to be validated.

Only flag the most problematic alt text like 'image', 'img', 'picture', 'photo'. Allow 'logo', 'icon' which can be legitimate in context (e.g., company logo when name is in surrounding text). This makes the validator less strict while still catching the worst accessibility issues.

Copilot

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-04T16:47:00Z

content/_ext/validator.rb

+      # Check if alt attribute is missing
+      # Supports: :alt => "text", alt: "text", and alt="text"
+      unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)
+        warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"
+      else
+        # Check for obviously poor alt text (most egregious cases only)
+        # Allows "logo", "icon" etc which can be legitimate in context
+        # Matches: :alt => "text", alt: "text", or alt="text"
+        alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)
+        if alt_match
+          alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase
+          # Only flag the most problematic generic terms
+          worst_terms = ['image', 'img', 'picture', 'photo']
+
+          if worst_terms.include?(alt_text)
+            warning "Poor alt text '#{alt_text}' in #{page.output_path} - should describe the image content"
+          end


The regex pattern for detecting alt text presence /:alt\s*=>|alt\s*[:=]/i will match cases where 'alt' appears in other contexts, such as in URLs, comments, or other attribute values. For instance, it could incorrectly match something like data-alternative="value". While less likely to occur, consider using word boundaries or more specific patterns to ensure you're matching actual alt attributes.

Suggested change

# Check if alt attribute is missing

# Supports: :alt => "text", alt: "text", and alt="text"

unless img_tag.match(/:alt\s*=>|alt\s*[:=]/i)

warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"

else

# Check for obviously poor alt text (most egregious cases only)

# Allows "logo", "icon" etc which can be legitimate in context

# Matches: :alt => "text", alt: "text", or alt="text"

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

if alt_match

alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase

# Only flag the most problematic generic terms

worst_terms = ['image', 'img', 'picture', 'photo']

if worst_terms.include?(alt_text)

warning "Poor alt text '#{alt_text}' in #{page.output_path} - should describe the image content"

end

# Extract alt attribute if present

# Supports: :alt => "text", alt: "text", and alt="text"

alt_match = img_tag.match(/:alt\s*=>\s*['"](.*?)['"]|alt\s*:\s*['"](.*?)['"]|alt\s*=\s*['"](.*?)['"]/i)

unless alt_match

warning "Missing alt text on image in #{page.output_path}: #{img_tag.strip[0..80]}"

else

# Check for obviously poor alt text (most egregious cases only)

# Allows "logo", "icon" etc which can be legitimate in context

alt_text = (alt_match[1] || alt_match[2] || alt_match[3]).to_s.strip.downcase

# Only flag the most problematic generic terms

worst_terms = ['image', 'img', 'picture', 'photo']

if worst_terms.include?(alt_text)

warning "Poor alt text '#{alt_text}' in #{page.output_path} - should describe the image content"

zbynek · 2026-02-04T17:41:47Z

This sort of stuff should not be done with regular expressions.

If you're concerned about accessibility, you can try to find actual accessibility issues (e.g. https://validator.nu/?doc=https%3A%2F%2Fwww.jenkins.io ) and fix them.

TheAshutoshMishra requested a review from a team as a code owner February 4, 2026 15:24

Copilot AI review requested due to automatic review settings February 4, 2026 15:24

Copilot started reviewing on behalf of TheAshutoshMishra February 4, 2026 15:24 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

Fix potential nil index bug in ARIA label checker

41db0e2

Use Regexp.last_match.begin(0) instead of string index to safely get the match position, preventing nil errors when match is not found.

TheAshutoshMishra requested a review from Copilot February 4, 2026 15:59

Copilot started reviewing on behalf of TheAshutoshMishra February 4, 2026 16:00 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

Support both HAML syntax styles for alt text validation

8cd035e

Handle both hash rocket (:alt =>) and symbol (alt:) syntax in addition to HTML (alt=) to properly validate all alt text formats used in the codebase.

TheAshutoshMishra requested a review from Copilot February 4, 2026 16:15

Copilot started reviewing on behalf of TheAshutoshMishra February 4, 2026 16:15 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

TheAshutoshMishra requested a review from Copilot February 4, 2026 16:30

Copilot started reviewing on behalf of TheAshutoshMishra February 4, 2026 16:30 View session

Copilot AI reviewed Feb 4, 2026

View reviewed changes

zbynek closed this Feb 4, 2026

		# Find all img tags - both HAML and HTML
		img_pattern = /%img\{[^}]\}\|<img[^>]>/i

-      element_line = page.content[page.content.index(match[0])..page.content.index(match[0]) + 150]
-      # Check if there's an aria-label or aria-labelledby
-      unless element_line.match(/aria-label|aria-labelledby/i)
+      element_start = page.content.index(match[0])
+      element_line = page.content[element_start..element_start + 150]
+      # Check if there's an aria-label or aria-labelledby in either HTML or HAML syntax
+      unless element_line.match(/(aria-label\s*=|aria-labelledby\s*=|:"aria-label"\s*=>|:"aria-labelledby"\s*=>)/i)

-    # Find buttons and links with only icons (no text content)
-    # Look for patterns like %button, %a with only ion-icon or svg inside
-    icon_only_pattern = /(%button|%a)(\{[^}]*\})?[^%\n]*(%ion-icon|<svg|<ion-icon)/i
-    page.content.scan(icon_only_pattern) do |match|
-      # Get position of match in content
-      match_pos = Regexp.last_match.begin(0)
-      next unless match_pos
-      # Extract surrounding context for validation
-      element_line = page.content[match_pos, 150]
-      # Check if there's an aria-label or aria-labelledby
-      unless element_line.match(/aria-label|aria-labelledby/i)
-        warning "Interactive element with icon needs aria-label in #{page.output_path}"
-      end
+    # Find buttons and links that appear to be icon-only (no text content)
+    # Handles HAML structures where content is on indented lines and detects
+    # several icon patterns, including inline SVG and SVGs included via File.read.
+    return unless page.content
+    lines = page.content.lines
+    lines.each_with_index do |line, index|
+      # Match HAML-style %button or %a at the start of the line, capturing indentation
+      next unless line =~ /^(?<indent>\s*)(%button|%a)(\{[^}]*\})?.*$/i
+      indent = Regexp.last_match[:indent]
+      # Collect this line and any more-indented child lines as the element block
+      block_lines = [line]
+      child_indent_prefix = indent + "  "
+      j = index + 1
+      while j < lines.length
+        next_line = lines[j]
+        break unless next_line.start_with?(child_indent_prefix)
+        block_lines << next_line
+        j += 1
+      end
+      block = block_lines.join
+      # Skip if the element already has ARIA labelling
+      next if block.match(/aria-label\s*=|aria-label\s*=>|aria-labelledby\s*=|aria-labelledby\s*=>/i)
+      # Detect icon usage: ion-icon, inline SVG, or SVG files included via File.read
+      has_icon =
+        block.match(/%ion-icon\b/i) ||
+        block.match(/<svg\b/i) ||
+        block.match(/<ion-icon\b/i) ||
+        block.match(/File\.read\([^)]*\.svg['"]?\)/i)
+      next unless has_icon
+      # Heuristically determine whether there is any textual content besides icons/markup.
+      text_candidate = block.dup
+      # Remove common icon constructs
+      text_candidate.gsub!(/%ion-icon[^\n]*/i, "")
+      text_candidate.gsub!(/<svg.*?<\/svg>/im, "")
+      text_candidate.gsub!(/<ion-icon.*?\/?>/im, "")
+      text_candidate.gsub!(/File\.read\([^)]*\.svg['"]?\)/im, "")
+      # Strip HAML element prefixes and attribute hashes
+      text_candidate.gsub!(/%[a-z0-9_]+(\{[^}]*\})?/i, "")
+      text_candidate.gsub!(/\{[^}]*\}/m, "")
+      # Remove simple Ruby string assignments/attributes like :title => "..."
+      text_candidate.gsub!(/:[a-z0-9_]+\s*=>\s*['"].*?['"]/im, "")
+      text_candidate.gsub!(/[a-z0-9_]+\s*=\s*['"].*?['"]/im, "")
+      # Remove HAML comments
+      text_candidate.gsub!(/^-#.*$/, "")
+      # Collapse everything that is not alphanumeric to spaces
+      text_candidate.gsub!(/[^a-z0-9]+/i, " ")
+      # If nothing alphanumeric remains, we treat this as icon-only
+      next unless text_candidate.strip.empty?
+      warning "Interactive element with icon needs aria-label in #{page.output_path}"

		# Check if there's an aria-label or aria-labelledby
		unless element_line.match(/aria-label\|aria-labelledby/i)

-        warning "Interactive element with icon needs aria-label in #{page.output_path}"
+        # Determine element type (button or link) from the match
+        match_data = Regexp.last_match
+        raw_type = match_data && match_data[1] ? match_data[1].downcase : 'element'
+        element_type = case raw_type
+                       when '%button'
+                         'Button'
+                       when '%a'
+                         'Link'
+                       else
+                         'Interactive element'
+                       end
+        # Approximate line number based on position in content
+        line_number = page.content[0...match_pos].count("\n") + 1
+        # Short snippet of the offending element for context
+        snippet = element_line.strip[0..80]
+        warning "#{element_type} with icon-only content needs aria-label in #{page.output_path}: line #{line_number}: #{snippet}"

	img_pattern = /%img\{[^}]\}\|<img[^>]>/i
	img_pattern = /%img\{[^}]\}\|<img[^>]\/?>/i

Uh oh!

Conversation

TheAshutoshMishra commented Feb 4, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

Uh oh!

zbynek commented Feb 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects