
Feature: Block explicit images from reaching the messaging inbox #1179

@annachayn

Overview

Now that we run our own chat widget, every image a user sends flows through our backend before it reaches our messaging inbox (Front). That gives us one place to scan image attachments and stop explicit ones from ever reaching the team — so our messaging volunteers and staff are never exposed to them.

This is a backend change in bloom-backend, with a small optional frontend follow-up in bloom-frontend for the user-facing message. You do not need a Front account or any Front credentials — the whole feature is testable with mocks and a local smoke-test script (see Testing).

When a user uploads an image, we scan it locally before forwarding. If it's flagged as explicit, we block it: it's never forwarded, an alert is raised in Rollbar, the agent gets a note in the thread, and the user sees a clear message.

  • Flags: Porn, Hentai (anime porn), and overtly sexual / lingerie / swimwear (the model's Sexy class).
  • Does NOT flag (and must not block): violence/gore, hate speech, drug use, self-harm or medical imagery — the model doesn't detect these and we don't want it to.

We scan on the backend, not the frontend: it's the single point every attachment passes through before Front (frontend checks can be bypassed by calling the API directly), and it's the only way to meet the privacy bar below.

Action Items

  • Add nsfwjs + @tensorflow/tfjs-node and bundle a local NSFW model in the repo.
  • Add an ImageScanningService that classifies an image buffer in-process.
  • Call it near the top of sendChannelAttachment (src/front-chat/front-chat.service.ts), only for image/* files.
  • On a flag: don't forward, raise a Rollbar alert via logger.error, post an agent-facing note in the thread, and surface a 422 to the client.
  • Map the blocked error to HTTP 422 in front-chat.controller.ts.
  • Unit tests with nsfwjs mocked: block / allow / below-threshold / non-image / agent-notified / alert-raised / fail-open.
  • (Optional, bloom-frontend) Show the friendly blocked message + add the imageBlocked translation to all 6 locales (strings provided below).
  • Make a PR and tag @eleanorreem or @annarhughes.

Guidance and Resources

Privacy bar (must hold): scanning runs entirely in-process, the image is never sent to a third party, and nothing is written to disk. We achieve this with nsfwjs + @tensorflow/tfjs-node, loading a model committed to the repo via file:// (no runtime network calls). MIT-licensed and free.

⚠️ @tensorflow/tfjs-node builds/downloads a native binding at install time (one-off). If yarn install struggles, check the tfjs-node install notes and make sure you're on the Node version in our package.json engines field (>=22.13.0). Flag it on the issue if you get stuck — we'd rather know.

1. Install + bundle the model

cd bloom-backend
yarn add nsfwjs @tensorflow/tfjs-node

Commit the MobileNetV2 model files (model.json + the groupN-shardNofN.bin weight shards) from the nsfwjs models repo to src/front-chat/assets/model/. The service loads them relative to __dirname, so they must be copied into dist/ on build. Add them to nest-cli.json assets (match the existing assets shape):

"assets": [{ "include": "front-chat/assets/**/*", "outDir": "dist/src" }],
"watchAssets": true

2. src/front-chat/image-scanning.service.ts

import { Injectable, OnModuleInit } from '@nestjs/common';
import * as tf from '@tensorflow/tfjs-node';
import * as nsfwjs from 'nsfwjs';
import * as path from 'path';
import { Logger } from 'src/logger/logger';

const logger = new Logger('ImageScanningService');
const EXPLICIT_CLASSES = new Set(['Porn', 'Hentai', 'Sexy']); // model also returns Neutral / Drawing
const THRESHOLD = 0.6; // 60% confidence; tune later if we see false positives/negatives

@Injectable()
export class ImageScanningService implements OnModuleInit {
  private model: nsfwjs.NSFWJS;

  async onModuleInit() {
    const modelPath = path.resolve(__dirname, 'assets', 'model');
    this.model = await nsfwjs.load(`file://${modelPath}/`, { size: 224 }); // local, no network. Trailing slash required.
    logger.log('Local NSFW model loaded');
  }

  async scanImage(buffer: Buffer): Promise<{ isSafe: boolean; reason?: string }> {
    let tensor: tf.Tensor3D | undefined;
    try {
      tensor = tf.node.decodeImage(buffer, 3) as tf.Tensor3D;
      const predictions = await this.model.classify(tensor);
      for (const p of predictions) {
        if (EXPLICIT_CLASSES.has(p.className) && p.probability >= THRESHOLD) { // >= matches the "≥60%" acceptance criterion
          return { isSafe: false, reason: `${p.className} ${(p.probability * 100).toFixed(1)}%` };
        }
      }
      return { isSafe: true };
    } catch (error) {
      // Fail-open: if the scanner breaks, let the image through so chat doesn't halt for everyone.
      logger.error(`Image scanning failed: ${(error as Error)?.message}`);
      return { isSafe: true, reason: 'Scanner error' };
    } finally {
      tensor?.dispose(); // Critical: tensors aren't garbage-collected — leaks memory otherwise.
    }
  }
}

Register ImageScanningService in front-chat.module.ts providers.
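
The flag/no-flag decision in scanImage can also be pulled out as a pure function, which makes the threshold behaviour testable without loading TensorFlow. A minimal sketch, assuming hypothetical names (Prediction, firstExplicitHit, formatReason are not part of nsfwjs or the repo) and using >= so that it matches the "≥60%" wording in the acceptance criteria:

```typescript
// Hypothetical helper names; only the class list and threshold come from the issue.
type Prediction = { className: string; probability: number };

const EXPLICIT_CLASSES = new Set(['Porn', 'Hentai', 'Sexy']);
const THRESHOLD = 0.6;

// Returns the first explicit-class prediction at or above the threshold, or undefined.
function firstExplicitHit(
  predictions: Prediction[],
  threshold: number = THRESHOLD,
): Prediction | undefined {
  return predictions.find(
    (p) => EXPLICIT_CLASSES.has(p.className) && p.probability >= threshold,
  );
}

// Formats the reason the same way the service does: class name plus one-decimal percentage.
function formatReason(hit: Prediction): string {
  return `${hit.className} ${(hit.probability * 100).toFixed(1)}%`;
}
```

Neutral and Drawing never match regardless of confidence, so a 0.95 Neutral image passes while a 0.6 Sexy image is flagged.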

3. Scan + alert in sendChannelAttachment (front-chat.service.ts, right after the Cypress check, before building the FormData). Inject ImageScanningService, and add a typed error so the controller can return a distinct status:

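// Module level (e.g. near the top of front-chat.service.ts), exported so the controller can import it:
export class ImageBlockedError extends Error {
  constructor(reason: string) {
    super(reason);
    this.name = 'ImageBlockedError';
  }
}

// Inside sendChannelAttachment, right after the Cypress check:
if (file.mimetype.startsWith('image/')) {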
  const { isSafe, reason } = await this.imageScanningService.scanImage(file.buffer);
  if (!isSafe) {
    // logger.error forwards to Rollbar (logger.warn does not) — this is our alert.
    // PII is auto-redacted; never log the image itself, only id + classification.
    logger.error(`Blocked explicit image attachment from user ${user.id} (${reason})`);

    // Tell the agent something was blocked, without forwarding the image. Best-effort.
    await this.sendChannelTextMessage(
      user,
      'This image has been blocked because it may contain something explicit or malicious. The user has been notified.',
    ).catch(() => {});

    throw new ImageBlockedError(reason ?? 'Image blocked');
  }
}

Decision for the reviewer: the original ticket asks whether we notify the user and/or agent. This notifies both — agent via a thread note, user via the 422 → frontend message — and raises a Rollbar alert so we can monitor false positives. Comment if we'd prefer to notify only one side.
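
To make the ordering in step 3 easier to reason about (scan, alert, best-effort agent note, then throw, with forwarding only on the safe path), here is a minimal sketch with every dependency injected as a plain function. All names (gateAttachment, GateDeps, UploadedFile) are hypothetical stand-ins so the flow runs without Nest or Front; it is not the service's real signature.

```typescript
// Hypothetical stand-ins so the flow runs without Nest, Front, or TensorFlow.
class ImageBlockedError extends Error {
  constructor(reason: string) {
    super(reason);
    this.name = 'ImageBlockedError';
    // Defensive: keeps instanceof working if the code is compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type UploadedFile = { mimetype: string; buffer: Uint8Array };
type ScanResult = { isSafe: boolean; reason?: string };

interface GateDeps {
  scanImage(buffer: Uint8Array): Promise<ScanResult>;
  alert(message: string): void; // stands in for logger.error (Rollbar)
  notifyAgent(text: string): Promise<void>; // stands in for sendChannelTextMessage
  forward(file: UploadedFile): Promise<void>; // stands in for the upload to Front
}

const AGENT_NOTE =
  'This image has been blocked because it may contain something explicit or malicious. ' +
  'The user has been notified.';

async function gateAttachment(userId: string, file: UploadedFile, deps: GateDeps): Promise<void> {
  // Only image/* files are scanned; everything else goes straight through.
  if (file.mimetype.startsWith('image/')) {
    const { isSafe, reason } = await deps.scanImage(file.buffer);
    if (!isSafe) {
      // Log only the user id and classification, never the image itself.
      deps.alert(`Blocked explicit image attachment from user ${userId} (${reason})`);
      // Best-effort: a failed agent note must not mask the block.
      await deps.notifyAgent(AGENT_NOTE).catch(() => undefined);
      throw new ImageBlockedError(reason ?? 'Image blocked');
    }
  }
  await deps.forward(file);
}
```

Because the throw happens before forward is reached, a flagged image can never be uploaded, even if the agent note fails.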

4. Map to 422 in front-chat.controller.ts (wrap the sendChannelAttachment call; keeps it distinct from the existing 400/413 paths):

try {
  const chatUser = await this.frontChatService.sendChannelAttachment(req.userEntity, file);
  this.syncChatActivity(chatUser, req.userEntity.email);
} catch (err) {
  if (err instanceof ImageBlockedError) {
    throw new UnprocessableEntityException(
      "This image has been blocked because it didn't pass our safety standards. " +
        'If you think this is a mistake, please write a message to explain why.',
    );
  }
  throw err;
}
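
The mapping in step 4 boils down to "blocked error becomes 422 with fixed user copy, everything else is rethrown". A minimal sketch of that rule as a standalone function, assuming a hypothetical mapAttachmentError helper and HttpError shape (the real controller throws Nest's UnprocessableEntityException instead):

```typescript
class ImageBlockedError extends Error {
  constructor(reason: string) {
    super(reason);
    this.name = 'ImageBlockedError';
    // Defensive: keeps instanceof working if the code is compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical response shape for illustration only.
type HttpError = { status: number; message: string };

const BLOCKED_MESSAGE =
  "This image has been blocked because it didn't pass our safety standards. " +
  'If you think this is a mistake, please write a message to explain why.';

// Maps a blocked image to 422; rethrows anything else so the 400/413 paths are unchanged.
function mapAttachmentError(err: unknown): HttpError {
  if (err instanceof ImageBlockedError) {
    // The classification reason stays in the logs; the client only sees the fixed copy.
    return { status: 422, message: BLOCKED_MESSAGE };
  }
  throw err;
}
```

Note that the reason string (for example "Porn 90.0%") is deliberately not echoed back to the user.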

Testing

Unit tests (mock the model — runs in CI). Add to the existing describe('sendChannelAttachment') in front-chat.service.spec.ts (~line 759; reuse its user/file fixtures). Mock so you control predictions without loading TensorFlow:

jest.mock('nsfwjs', () => ({ load: jest.fn().mockResolvedValue({ classify: jest.fn() }) }));
jest.mock('@tensorflow/tfjs-node', () => ({
  node: { decodeImage: jest.fn(() => ({ dispose: jest.fn() })) },
}));

Cover, at minimum:

  1. Explicit blocked: classify returns [{ className: 'Porn', probability: 0.9 }]; rejects with ImageBlockedError and the multipart upload fetch is never called.
  2. Alert raised: same case asserts logger.error (Rollbar) was called.
  3. Agent notified: spy on sendChannelTextMessage.
  4. Safe forwarded: [{ className: 'Neutral', probability: 0.95 }]; upload happens as today.
  5. Below threshold: [{ className: 'Sexy', probability: 0.4 }]; forwarded.
  6. Non-image skipped: an application/pdf / audio/webm file never hits scanImage.
  7. Fail-open: classify (or decodeImage) throws → scanImage catches it, returns isSafe: true, and the image is still forwarded.

yarn test src/front-chat/front-chat.service.spec.ts
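
The fail-open case and the tensor cleanup can also be illustrated without jest. The sketch below is a simplified stand-in for scanImage, assuming hypothetical names (scanFailOpen, HasDispose) and a fake tensor object; it shows that a classifier error yields isSafe: true and that dispose still runs via finally.

```typescript
type Prediction = { className: string; probability: number };
type HasDispose = { dispose(): void };

// Hypothetical stand-in for scanImage: decode and classify are injected so fakes can be used.
async function scanFailOpen<T extends HasDispose>(
  decode: () => T,
  classify: (input: T) => Promise<Prediction[]>,
  isExplicit: (predictions: Prediction[]) => boolean,
): Promise<{ isSafe: boolean; reason?: string }> {
  let input: T | undefined;
  try {
    input = decode();
    const predictions = await classify(input);
    return { isSafe: !isExplicit(predictions) };
  } catch {
    // Fail-open: a broken scanner must not halt chat for everyone.
    return { isSafe: true, reason: 'Scanner error' };
  } finally {
    // Runs on success and on error; skipped only if decode itself never produced a value.
    input?.dispose();
  }
}
```

In the jest suite the same idea applies: make the mocked classify reject and assert both that the upload happened and that the fake tensor's dispose was called.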

Local smoke test (prove the real model loads — runs on your machine). CI uses mocks, so confirm the bundled model wiring works. Put one or two ordinary safe images (a landscape, a pet) in src/front-chat/__fixtures__/ and:

// scripts/scan-smoke-test.ts  — npx ts-node scripts/scan-smoke-test.ts
import * as fs from 'fs';
import { ImageScanningService } from '../src/front-chat/image-scanning.service';
(async () => {
  const svc = new ImageScanningService();
  await svc.onModuleInit();
  console.log(await svc.scanImage(fs.readFileSync('src/front-chat/__fixtures__/landscape.jpg')));
  // expect { isSafe: true } — proves the model loads from disk with no network access
})();

Please don't test with or commit explicit images — the mocked unit tests already prove the blocking path; the smoke test only proves the model wiring.

Optional frontend follow-up (bloom-frontend)

So blocked users see the safety message instead of the generic "Upload failed":

  • lib/hooks/useMessaging.ts, in sendAttachment, special-case 422:

    if (!response.ok) {
      throw new Error(response.status === 422 ? 'IMAGE_BLOCKED' : `UPLOAD_FAILED_${response.status}`);
    }
  • components/messaging/MessageComposer.tsx, in the image catch:

    onError(t((err as Error).message === 'IMAGE_BLOCKED' ? 'errors.imageBlocked' : 'errors.uploadFailed'));
  • Add the imageBlocked key under messaging.…errors in all six locale files in i18n/messages/messaging/ (register matched to each existing file):

    // en.json
    "imageBlocked": "This image has been blocked because it didn't pass our safety standards. If you think this is a mistake, please write a message to explain why."
    // de.json
    "imageBlocked": "Dieses Bild wurde blockiert, weil es unseren Sicherheitsstandards nicht entsprach. Wenn du glaubst, dass dies ein Fehler ist, schreibe uns bitte eine Nachricht und erkläre uns, warum."
    // fr.json
    "imageBlocked": "Cette image a été bloquée car elle ne respecte pas nos normes de sécurité. Si vous pensez qu'il s'agit d'une erreur, veuillez nous écrire un message pour nous en expliquer la raison."
    // es.json
    "imageBlocked": "Esta imagen ha sido bloqueada porque no cumple con nuestros estándares de seguridad. Si crees que es un error, por favor, escríbenos un mensaje para explicarnos por qué."
    // pt.json
    "imageBlocked": "Esta imagem foi bloqueada porque não atende aos nossos padrões de segurança. Se você acha que isso é um engano, por favor, escreva uma mensagem para nos explicar o motivo."
    // hi.json
    "imageBlocked": "Is chhavi ko block kar diya gaya hai kyunki yeh hamare suraksha maandandon par khari nahi utri. Agar aapko lagta hai ki yeh ek galti hai, to kripya humein ek message likhkar iska kaaran batayein."
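
The two frontend snippets above amount to a status-to-code mapping followed by a code-to-translation-key mapping. A minimal sketch as pure functions, assuming hypothetical helper names (uploadErrorCode, uploadErrorKey) that do not exist in bloom-frontend:

```typescript
// Hypothetical helper: turns the upload response status into the error code thrown by sendAttachment.
function uploadErrorCode(status: number): string {
  return status === 422 ? 'IMAGE_BLOCKED' : `UPLOAD_FAILED_${status}`;
}

// Hypothetical helper: picks the translation key the composer should pass to t().
function uploadErrorKey(err: unknown): 'errors.imageBlocked' | 'errors.uploadFailed' {
  return err instanceof Error && err.message === 'IMAGE_BLOCKED'
    ? 'errors.imageBlocked'
    : 'errors.uploadFailed';
}
```

Any status other than 422 keeps the existing generic "Upload failed" message, so the 400/413 behaviour is unchanged.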

Acceptance criteria

  • Images are scanned in sendChannelAttachment before forwarding; non-images untouched.
  • An image flagged ≥60% as Porn/Hentai/Sexy is not forwarded; agent gets a note; client gets 422 with the user copy.
  • A flagged image raises a Rollbar alert (logger.error), logging only id + classification, never the image.
  • Scanning is fully in-process — no external network calls, nothing written to disk (model via file://).
  • Scanner errors fail open and are logged.
  • Unit tests cover block / allow / below-threshold / non-image / agent-notify / alert / fail-open with nsfwjs mocked.
  • Model files committed and copied to dist via nest-cli.json.
