Commit 7e08f48

Syntax highlighting in docs
1 parent e43f689 commit 7e08f48

8 files changed, with 231 additions and 214 deletions

docs/accessibility.md

Lines changed: 5 additions & 3 deletions
@@ -1,9 +1,11 @@
1+
(accessibility)=
2+
13
# Dumping out an accessibility tree
24

35
The `shot-scraper accessibility` command dumps out the Chromium accessibility tree for the provided URL, as JSON:
4-
5-
shot-scraper accessibility https://datasette.io/
6-
6+
```bash
7+
shot-scraper accessibility https://datasette.io/
8+
```
79
Use `-o filename.json` to write the output to a file instead of displaying it.
810

911
Add `--javascript SCRIPT` to execute custom JavaScript before taking the snapshot.

docs/authentication.md

Lines changed: 11 additions & 9 deletions
@@ -1,13 +1,15 @@
1+
(authentication)=
2+
13
# Websites that need authentication
24

35
If you want to take screenshots of a site that has some form of authentication, you will first need to authenticate with that website manually.
46

57
You can do that using the `shot-scraper auth` command:
6-
7-
shot-scraper auth \
8-
https://datasette-auth-passwords-demo.datasette.io/-/login \
9-
auth.json
10-
8+
```bash
9+
shot-scraper auth \
10+
https://datasette-auth-passwords-demo.datasette.io/-/login \
11+
auth.json
12+
```
1113
(For this demo, use username = `root` and password = `password!`)
1214

1315
This will open a browser window on your computer showing the page you specified.
@@ -17,10 +19,10 @@ You can then sign in using that browser window - including 2FA or CAPTCHAs or ot
1719
When you are finished, hit `<enter>` at the `shot-scraper` command-line prompt. The browser will close and the authentication credentials (usually cookies) for that browser session will be written out to the `auth.json` file.
1820

1921
To take authenticated screenshots you can then use the `-a` or `--auth` options to point to the JSON file that you created:
20-
21-
shot-scraper https://datasette-auth-passwords-demo.datasette.io/ \
22-
-a auth.json -o authed.png
23-
22+
```bash
23+
shot-scraper https://datasette-auth-passwords-demo.datasette.io/ \
24+
-a auth.json -o authed.png
25+
```
2426
## `shot-scraper auth --help`
2527

2628
Full `--help` for `shot-scraper auth`:

docs/contributing.md

Lines changed: 39 additions & 37 deletions
@@ -1,53 +1,55 @@
1+
(contributing)=
2+
13
# Contributing
24

35
To contribute to this tool, first check out the code. Then create a new virtual environment:
4-
5-
cd shot-scraper
6-
python -m venv venv
7-
source venv/bin/activate
8-
6+
```bash
7+
cd shot-scraper
8+
python -m venv venv
9+
source venv/bin/activate
10+
```
911
Or if you are using `pipenv`:
10-
11-
pipenv shell
12-
12+
```bash
13+
pipenv shell
14+
```
1315
Now install the dependencies and test dependencies:
14-
15-
pip install -e '.[test]'
16-
16+
```bash
17+
pip install -e '.[test]'
18+
```
1719
Then you'll need to install the Playwright browsers too:
18-
19-
shot-scraper install
20-
20+
```bash
21+
shot-scraper install
22+
```
2123
To run the tests:
22-
23-
pytest
24-
24+
```bash
25+
pytest
26+
```
2527
Some of the tests exercise the CLI utility directly. Run those like so:
26-
27-
tests/run_examples.sh
28-
28+
```bash
29+
tests/run_examples.sh
30+
```
2931
## Documentation
3032

3133
Documentation for this project uses [MyST](https://myst-parser.readthedocs.io/) - it is written in Markdown and rendered using Sphinx.
3234

3335
To build the documentation locally, run the following:
34-
35-
cd docs
36-
pip install -r requirements.txt
37-
make livehtml
38-
36+
```bash
37+
cd docs
38+
pip install -r requirements.txt
39+
make livehtml
40+
```
3941
This will start a live preview server, using [sphinx-autobuild](https://pypi.org/project/sphinx-autobuild/).
4042

4143
The CLI `--help` examples in the documentation are managed using [Cog](https://github.com/nedbat/cog). Update those files like this:
42-
43-
cog -r docs/*.md
44-
45-
## Tweeting the release notes
46-
47-
After pushing a release, I use the following to create a screenshot of the release notes to use in a tweet:
48-
49-
shot-scraper https://github.com/simonw/shot-scraper/releases/tag/0.15 \
50-
--selector '.Box-body' --width 700 \
51-
--retina
52-
53-
[Example tweet](https://twitter.com/simonw/status/1569431710345089024).
44+
```bash
45+
cog -r docs/*.md
46+
```
47+
## Publishing the release notes
48+
49+
After pushing a release, I use the following to create a screenshot of the release notes to use in social media posts:
50+
```bash
51+
shot-scraper https://github.com/simonw/shot-scraper/releases/tag/0.15 \
52+
--selector '.Box-body' --width 700 \
53+
--retina
54+
```
55+
[Example post](https://twitter.com/simonw/status/1569431710345089024).

docs/har.md

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
11
(har)=
2+
23
# Saving a web page to an HTTP Archive
34

45
An HTTP Archive file captures the full details of a series of HTTP requests and responses as JSON.

docs/html.md

Lines changed: 20 additions & 18 deletions
@@ -1,30 +1,32 @@
1+
(html)=
2+
13
# Dumping the HTML of a page
24

35
The `shot-scraper html` command dumps out the final HTML of a page after all JavaScript has run.
4-
5-
shot-scraper html https://datasette.io/
6-
6+
```bash
7+
shot-scraper html https://datasette.io/
8+
```
79
Use `-o filename.html` to write the output to a file instead of displaying it.
8-
9-
shot-scraper html https://datasette.io/ -o index.html
10-
10+
```bash
11+
shot-scraper html https://datasette.io/ -o index.html
12+
```
1113
Add `--javascript SCRIPT` to execute custom JavaScript before taking the HTML snapshot.
12-
13-
shot-scraper html https://datasette.io/ \
14-
--javascript "document.querySelector('h1').innerText = 'Hello, world!'"
15-
14+
```bash
15+
shot-scraper html https://datasette.io/ \
16+
--javascript "document.querySelector('h1').innerText = 'Hello, world!'"
17+
```
1618
## Retrieving the HTML for a specific element
1719

1820
You can use the `-s SELECTOR` option to capture just the HTML for one specific element on the page, identified using a CSS selector:
19-
20-
shot-scraper html https://datasette.io/ -s h1
21-
21+
```bash
22+
shot-scraper html https://datasette.io/ -s h1
23+
```
2224
This outputs:
23-
24-
<h1>
25-
<img class="datasette-logo" src="/static/datasette-logo.svg" alt="Datasette">
26-
</h1>
27-
25+
```html
26+
<h1>
27+
<img class="datasette-logo" src="/static/datasette-logo.svg" alt="Datasette">
28+
</h1>
29+
```
2830
## `shot-scraper html --help`
2931

3032
Full `--help` for this command:

docs/javascript.md

Lines changed: 35 additions & 33 deletions
@@ -1,32 +1,34 @@
1+
(javascript)=
2+
13
# Scraping pages using JavaScript
24

35
The `shot-scraper javascript` command can be used to execute JavaScript directly against a page and return the result as JSON.
46

57
This command doesn't produce a screenshot, but has interesting applications for scraping.
68

79
To retrieve a string title of a document:
8-
9-
shot-scraper javascript https://datasette.io/ "document.title"
10-
10+
```bash
11+
shot-scraper javascript https://datasette.io/ "document.title"
12+
```
1113
This returns a JSON string:
1214
```json
1315
"Datasette: An open source multi-tool for exploring and publishing data"
1416
```
1517
To return a raw string instead, use the `-r` or `--raw` options:
16-
17-
shot-scraper javascript https://datasette.io/ "document.title" -r
18-
18+
```bash
19+
shot-scraper javascript https://datasette.io/ "document.title" -r
20+
```
1921
Output:
20-
21-
Datasette: An open source multi-tool for exploring and publishing data
22-
22+
```
23+
Datasette: An open source multi-tool for exploring and publishing data
24+
```
2325
To return a JSON object, wrap an object literal in parentheses:
24-
25-
shot-scraper javascript https://datasette.io/ "({
26-
title: document.title,
27-
tagline: document.querySelector('.tagline').innerText
28-
})"
29-
26+
```bash
27+
shot-scraper javascript https://datasette.io/ "({
28+
title: document.title,
29+
tagline: document.querySelector('.tagline').innerText
30+
})"
31+
```
3032
This returns:
3133
```json
3234
{
@@ -55,7 +57,7 @@ shot-scraper javascript https://www.example.com/ "
5557

5658
You can pass an `async` function if you want to use `await`, including to import modules from external URLs. This example loads the [Readability.js](https://github.com/mozilla/readability) library from [Skypack](https://www.skypack.dev/) and uses it to extract the core content of a page:
5759

58-
```
60+
```bash
5961
shot-scraper javascript \
6062
https://simonwillison.net/2022/Mar/14/scraping-web-pages-shot-scraper/ "
6163
async () => {
@@ -65,17 +67,17 @@ async () => {
6567
```
6668

6769
To use functions such as `setInterval()`, for example if you need to delay the shot for a second to allow an animation to finish running, return a promise:
68-
69-
shot-scraper javascript datasette.io "
70-
new Promise(done => setInterval(
71-
() => {
72-
done({
73-
title: document.title,
74-
tagline: document.querySelector('.tagline').innerText
75-
});
76-
}, 1000
77-
));"
78-
70+
```bash
71+
shot-scraper javascript datasette.io "
72+
new Promise(done => setInterval(
73+
() => {
74+
done({
75+
title: document.title,
76+
tagline: document.querySelector('.tagline').innerText
77+
});
78+
}, 1000
79+
));"
80+
```
7981
(bypass-csp)=
8082
## Bypassing Content Security Policy headers
8183

@@ -110,13 +112,13 @@ Output:
110112
## Running JavaScript from a file
111113

112114
You can also save JavaScript to a file and execute it like this:
113-
114-
shot-scraper javascript datasette.io -i script.js
115-
115+
```bash
116+
shot-scraper javascript datasette.io -i script.js
117+
```
116118
Or read it from standard input like this:
117-
118-
echo "document.title" | shot-scraper javascript datasette.io
119-
119+
```bash
120+
echo "document.title" | shot-scraper javascript datasette.io
121+
```
120122
## Using this for automated tests
121123

122124
If a JavaScript error occurs, a stack trace will be written to standard error and the tool will terminate with an exit code of 1.
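
The exit-code behaviour described here lends itself to a small wrapper for CI scripts. The following is a minimal sketch, not part of `shot-scraper`: `run_check` is a hypothetical helper name, and it relies only on the statement above that the tool exits with a non-zero code when a JavaScript error occurs.

```bash
# Hypothetical helper (not part of shot-scraper): runs any command,
# discards its standard output, and reports PASS or FAIL from the exit code.
# Standard error is left untouched so a stack trace stays visible.
run_check() {
  "$@" > /dev/null
  status=$?
  if [ "$status" -eq 0 ]; then
    echo "PASS"
  else
    echo "FAIL (exit code $status)"
  fi
  # Propagate the original exit code so the calling script can stop on failure.
  return "$status"
}
```

It could then be called as `run_check shot-scraper javascript datasette.io -i script.js`, where `script.js` is assumed to throw an error when the page is not in the expected state.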

docs/multi.md

Lines changed: 17 additions & 15 deletions
@@ -1,3 +1,5 @@
1+
(multi)=
2+
13
# Taking multiple screenshots
24

35
You can configure multiple screenshots using a YAML file. Create a file called `shots.yml` that looks like this:
@@ -9,21 +11,21 @@ You can configure multiple screenshots using a YAML file. Create a file called `
911
url: https://www.w3.org/
1012
```
1113
Then run the tool like so:
12-
13-
shot-scraper multi shots.yml
14-
14+
```bash
15+
shot-scraper multi shots.yml
16+
```
1517
This will create two image files, `www-example-com.png` and `w3c.org.png`, containing screenshots of those two URLs.
1618

1719
Use `-` to pass in YAML from standard input:
18-
19-
echo "- url: http://www.example.com" | shot-scraper multi -
20-
20+
```bash
21+
echo "- url: http://www.example.com" | shot-scraper multi -
22+
```
2123
If you run the tool with the `-n` or `--no-clobber` option, any shots where the output file already exists will be skipped.
2224

2325
You can specify a subset of screenshots to take by specifying output files that you would like to create. For example, to take just the shots of `one.png` and `three.png` that are defined in `shots.yml` run this:
24-
25-
shot-scraper multi shots.yml -o one.png -o three.png
26-
26+
```bash
27+
shot-scraper multi shots.yml -o one.png -o three.png
28+
```
2729
The `url:` can be set to a path to a file on disk as well:
2830

2931
```yaml
@@ -36,15 +38,15 @@ Use the `--scale-factor` option to capture all screenshots at a specific scale f
3638
For example, setting `--scale-factor 3` results in screenshots with a CSS pixel ratio of 3, which is ideal for emulating a high-resolution display, such as Apple's iPhone 12 screens.
3739

3840
To take screenshots with a scale factor of 3 (tripled resolution), run the following command:
39-
40-
shot-scraper multi shots.yml --scale-factor 3
41-
41+
```bash
42+
shot-scraper multi shots.yml --scale-factor 3
43+
```
4244
This will multiply both the width and height of all screenshots by 3, resulting in images with a higher level of detail, suitable for scenarios where you need to capture the screen as it would appear on a high-DPI display.
4345

4446
Use `--retina` to take all screenshots at retina resolution instead, doubling the dimensions of the files:
45-
46-
shot-scraper multi shots.yml --retina
47-
47+
```bash
48+
shot-scraper multi shots.yml --retina
49+
```
4850
Note: The `--retina` option should not be used in conjunction with the `--scale-factor` flag as they are mutually exclusive. If both are provided, the command will raise an error to prevent conflicts.
4951

5052
To take a screenshot of just the area of a page defined by a CSS selector, add `selector` to the YAML block:
