Syntax highlighting in docs

simonw · simonw · commit 7e08f4803d86 · 2025-02-13T16:30:13.000-08:00
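
The bulk of this diff is one mechanical change: each 4-space indented code block, together with the blank lines around it, becomes a fenced block with a language tag. The commit does not say how the conversion was done, so the following is only a minimal sketch of that transformation in Python. The function name `fence_indented_blocks` and its `lang` parameter are hypothetical, and the sketch assumes that any indented run following a blank line is a code block (so it would misfire on indented list continuations). It does not add the `(label)=` MyST targets that the diff also introduces.

```python
FENCE = "`" * 3  # the three-backtick fence marker


def fence_indented_blocks(markdown: str, lang: str = "bash") -> str:
    """Turn 4-space indented code blocks into fenced blocks tagged with `lang`."""
    lines = markdown.split("\n")
    out: list[str] = []
    in_fence = False
    i = 0
    while i < len(lines):
        line = lines[i]
        # Track existing fences so their contents are never rewritten.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            out.append(line)
            i += 1
            continue
        after_blank = bool(out) and out[-1].strip() == ""
        is_code = line.startswith("    ") and line.strip() != ""
        if in_fence or not after_blank or not is_code:
            out.append(line)
            i += 1
            continue
        # Gather the indented run, allowing blank lines inside it.
        block = []
        while i < len(lines) and (lines[i].startswith("    ") or not lines[i].strip()):
            block.append(lines[i])
            i += 1
        # Trailing blank lines belong to the document, not the block.
        while block and not block[-1].strip():
            block.pop()
            i -= 1
        out.pop()  # drop the blank line before the block, as the diff does
        out.append(FENCE + lang)
        out.extend(b[4:] for b in block)  # strip the 4-space indent
        out.append(FENCE)
        # Drop one blank line after the block, but keep a final newline intact.
        if i < len(lines) - 1 and not lines[i].strip():
            i += 1
    return "\n".join(out)
```

Calling `fence_indented_blocks(text)` on a page's Markdown source returns the converted text. Blocks that hold something other than shell commands (the `html` output block and the untagged raw-output block in this diff) would need a different `lang` or a manual fix afterwards.
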
diff --git a/docs/accessibility.md b/docs/accessibility.md
@@ -1,9 +1,11 @@
+(accessibility)=
+
 # Dumping out an accessibility tree
 
 The `shot-scraper accessibility` command dumps out the Chromium accessibility tree for the provided URL, as JSON:
-
-    shot-scraper accessibility https://datasette.io/
-
+```bash
+shot-scraper accessibility https://datasette.io/
+```
 Use `-o filename.json` to write the output to a file instead of displaying it.
 
 Add `--javascript SCRIPT` to execute custom JavaScript before taking the snapshot.
diff --git a/docs/authentication.md b/docs/authentication.md
@@ -1,13 +1,15 @@
+(authentication)=
+
 # Websites that need authentication
 
 If you want to take screenshots of a site that has some form of authentication, you will first need to authenticate with that website manually.
 
 You can do that using the `shot-scraper auth` command:
-
-    shot-scraper auth \
-      https://datasette-auth-passwords-demo.datasette.io/-/login \
-      auth.json
-
+```bash
+shot-scraper auth \
+  https://datasette-auth-passwords-demo.datasette.io/-/login \
+  auth.json
+```
 (For this demo, use username = `root` and password = `password!`)
 
 This will open a browser window on your computer showing the page you specified.
@@ -17,10 +19,10 @@ You can then sign in using that browser window - including 2FA or CAPTCHAs or ot
 When you are finished, hit `<enter>` at the `shot-scraper` command-line prompt. The browser will close and the authentication credentials (usually cookies) for that browser session will be written out to the `auth.json` file.
 
 To take authenticated screenshots you can then use the `-a` or `--auth` options to point to the JSON file that you created:
-
-    shot-scraper https://datasette-auth-passwords-demo.datasette.io/ \
-      -a auth.json -o authed.png
-
+```bash
+shot-scraper https://datasette-auth-passwords-demo.datasette.io/ \
+  -a auth.json -o authed.png
+```
 ## `shot-scraper auth --help`
 
 Full `--help` for `shot-scraper auth`:
diff --git a/docs/contributing.md b/docs/contributing.md
@@ -1,53 +1,55 @@
+(contributing)=
+
 # Contributing
 
 To contribute to this tool, first check out the code. Then create a new virtual environment:
-
-    cd shot-scraper
-    python -m venv venv
-    source venv/bin/activate
-
+```bash
+cd shot-scraper
+python -m venv venv
+source venv/bin/activate
+```
 Or if you are using `pipenv`:
-
-    pipenv shell
-
+```bash
+pipenv shell
+```
 Now install the dependencies and test dependencies:
-
-    pip install -e '.[test]'
-
+```bash
+pip install -e '.[test]'
+```
 Then you'll need to install the Playwright browsers too:
-
-    shot-scraper install
-
+```bash
+shot-scraper install
+```
 To run the tests:
-
-    pytest
-
+```bash
+pytest
+```
 Some of the tests exercise the CLI utility directly. Run those like so:
-
-    tests/run_examples.sh
-
+```bash
+tests/run_examples.sh
+```
 ## Documentation
 
 Documentation for this project uses [MyST](https://myst-parser.readthedocs.io/) - it is written in Markdown and rendered using Sphinx.
 
 To build the documentation locally, run the following:
-
-    cd docs
-    pip install -r requirements.txt
-    make livehtml
-
+```bash
+cd docs
+pip install -r requirements.txt
+make livehtml
+```
 This will start a live preview server, using [sphinx-autobuild](https://pypi.org/project/sphinx-autobuild/).
 
 The CLI `--help` examples in the documentation are managed using [Cog](https://github.com/nedbat/cog). Update those files like this:
-
-    cog -r docs/*.md
-
-## Tweeting the release notes
-
-After pushing a release, I use the following to create a screenshot of the release notes to use in a tweet:
-
-    shot-scraper https://github.com/simonw/shot-scraper/releases/tag/0.15 \
-      --selector '.Box-body' --width 700 \
-      --retina
-
-[Example tweet](https://twitter.com/simonw/status/1569431710345089024).
+```bash
+cog -r docs/*.md
+```
+## Publishing the release notes
+
+After pushing a release, I use the following to create a screenshot of the release notes to use in social media posts:
+```bash
+shot-scraper https://github.com/simonw/shot-scraper/releases/tag/0.15 \
+  --selector '.Box-body' --width 700 \
+  --retina
+```
+[Example post](https://twitter.com/simonw/status/1569431710345089024).
diff --git a/docs/har.md b/docs/har.md
@@ -1,4 +1,5 @@
 (har)=
+
 # Saving a web page to an HTTP Archive
 
 An HTTP Archive file captures the full details of a series of HTTP requests and responses as JSON.
diff --git a/docs/html.md b/docs/html.md
@@ -1,30 +1,32 @@
+(html)=
+
 # Dumping the HTML of a page
 
 The `shot-scraper html` command dumps out the final HTML of a page after all JavaScript has run.
-
-    shot-scraper html https://datasette.io/
-
+```bash
+shot-scraper html https://datasette.io/
+```
 Use `-o filename.html` to write the output to a file instead of displaying it.
-
-    shot-scraper html https://datasette.io/ -o index.html
-
+```bash
+shot-scraper html https://datasette.io/ -o index.html
+```
 Add `--javascript SCRIPT` to execute custom JavaScript before taking the HTML snapshot.
-
-    shot-scraper html https://datasette.io/ \
-      --javascript "document.querySelector('h1').innerText = 'Hello, world!'"
-
+```bash
+shot-scraper html https://datasette.io/ \
+  --javascript "document.querySelector('h1').innerText = 'Hello, world!'"
+```
 ## Retrieving the HTML for a specific element
 
 You can use the `-s SELECTOR` option to capture just the HTML for one specific element on the page, identified using a CSS selector:
-
-    shot-scraper html https://datasette.io/ -s h1
-
+```bash
+shot-scraper html https://datasette.io/ -s h1
+```
 This outputs:
-
-    <h1>
-      <img class="datasette-logo" src="/static/datasette-logo.svg" alt="Datasette">
-    </h1>
-
+```html
+<h1>
+  <img class="datasette-logo" src="/static/datasette-logo.svg" alt="Datasette">
+</h1>
+```
 ## `shot-scraper html --help`
 
 Full `--help` for this command:
diff --git a/docs/javascript.md b/docs/javascript.md
@@ -1,32 +1,34 @@
+(javascript)=
+
 # Scraping pages using JavaScript
 
 The `shot-scraper javascript` command can be used to execute JavaScript directly against a page and return the result as JSON.
 
 This command doesn't produce a screenshot, but has interesting applications for scraping.
 
 To retrieve a string title of a document:
-
-    shot-scraper javascript https://datasette.io/ "document.title"
-
+```bash
+shot-scraper javascript https://datasette.io/ "document.title"
+```
 This returns a JSON string:
 ```json
 "Datasette: An open source multi-tool for exploring and publishing data"
 ```
 To return a raw string instead, use the `-r` or `--raw` options:
-
-    shot-scraper javascript https://datasette.io/ "document.title" -r
-
+```bash
+shot-scraper javascript https://datasette.io/ "document.title" -r
+```
 Output:
-
-    Datasette: An open source multi-tool for exploring and publishing data
-
+```
+Datasette: An open source multi-tool for exploring and publishing data
+```
 To return a JSON object, wrap an object literal in parentheses:
-
-    shot-scraper javascript https://datasette.io/ "({
-      title: document.title,
-      tagline: document.querySelector('.tagline').innerText
-    })"
-
+```bash
+shot-scraper javascript https://datasette.io/ "({
+  title: document.title,
+  tagline: document.querySelector('.tagline').innerText
+})"
+```
 This returns:
 ```json
 {
@@ -55,7 +57,7 @@ shot-scraper javascript https://www.example.com/ "
 
 You can pass an `async` function if you want to use `await`, including to import modules from external URLs. This example loads the [Readability.js](https://github.com/mozilla/readability) library from [Skypack](https://www.skypack.dev/) and uses it to extract the core content of a page:
 
-```
+```bash
 shot-scraper javascript \
   https://simonwillison.net/2022/Mar/14/scraping-web-pages-shot-scraper/ "
 async () => {
@@ -65,17 +67,17 @@ async () => {
 ```
 
 To use functions such as `setInterval()`, for example if you need to delay the shot for a second to allow an animation to finish running, return a promise:
-
-    shot-scraper javascript datasette.io "
-    new Promise(done => setInterval(
-      () => {
-        done({
-          title: document.title,
-          tagline: document.querySelector('.tagline').innerText
-        });
-      }, 1000
-    ));"
-
+```bash
+shot-scraper javascript datasette.io "
+new Promise(done => setInterval(
+  () => {
+    done({
+      title: document.title,
+      tagline: document.querySelector('.tagline').innerText
+    });
+  }, 1000
+));"
+```
 (bypass-csp)=
 ## Bypassing Content Security Policy headers
 
@@ -110,13 +112,13 @@ Output:
 ## Running JavaScript from a file
 
 You can also save JavaScript to a file and execute it like this:
-
-    shot-scraper javascript datasette.io -i script.js
-
+```bash
+shot-scraper javascript datasette.io -i script.js
+```
 Or read it from standard input like this:
-
-    echo "document.title" | shot-scraper javascript datasette.io
-
+```bash
+echo "document.title" | shot-scraper javascript datasette.io
+```
 ## Using this for automated tests
 
 If a JavaScript error occurs, a stack trace will be written to standard error and the tool will terminate with an exit code of 1.
diff --git a/docs/multi.md b/docs/multi.md
@@ -1,3 +1,5 @@
+(multi)=
+
 # Taking multiple screenshots
 
 You can configure multiple screenshots using a YAML file. Create a file called `shots.yml` that looks like this:
@@ -9,21 +11,21 @@ You can configure multiple screenshots using a YAML file. Create a file called `
   url: https://www.w3.org/
 ```
 Then run the tool like so:
-
-    shot-scraper multi shots.yml
-
+```bash
+shot-scraper multi shots.yml
+```
 This will create two image files, `www-example-com.png` and `w3c.org.png`, containing screenshots of those two URLs.
 
 Use `-` to pass in YAML from standard input:
-
-    echo "- url: http://www.example.com" | shot-scraper multi -
-
+```bash
+echo "- url: http://www.example.com" | shot-scraper multi -
+```
 If you run the tool with the `-n` or `--no-clobber` option, any shots where the output file already exists will be skipped.
 
 You can specify a subset of screenshots to take by specifying output files that you would like to create. For example, to take just the shots of `one.png` and `three.png` that are defined in `shots.yml` run this:
-
-    shot-scraper multi shots.yml -o one.png -o three.png
-
+```bash
+shot-scraper multi shots.yml -o one.png -o three.png
+```
 The `url:` can be set to a path to a file on disk as well:
 
 ```yaml
@@ -36,15 +38,15 @@ Use the `--scale-factor` option to capture all screenshots at a specific scale f
 For example, setting `--scale-factor 3` results in screenshots with a CSS pixel ratio of 3, which is ideal for emulating a high-resolution display, such as Apple's iPhone 12 screens.
 
 To take screenshots with a scale factor of 3 (tripled resolution), run the following command:
-
-    shot-scraper multi shots.yml --scale-factor 3
-
+```bash
+shot-scraper multi shots.yml --scale-factor 3
+```
 This will multiply both the width and height of all screenshots by 3, resulting in images with a higher level of detail, suitable for scenarios where you need to capture the screen as it would appear on a high-DPI display.
 
 Use `--retina` to take all screenshots at retina resolution instead, doubling the dimensions of the files:
-
-    shot-scraper multi shots.yml --retina
-
+```bash
+shot-scraper multi shots.yml --retina
+```
 Note: The `--retina` option should not be used in conjunction with the `--scale-factor` flag as they are mutually exclusive. If both are provided, the command will raise an error to prevent conflicts.
 
 To take a screenshot of just the area of a page defined by a CSS selector, add `selector` to the YAML block:
diff --git a/docs/screenshots.md b/docs/screenshots.md
