Commit b3563b2

Merge pull request #6437 from JohnSnowLabs/docs/libs-install
updated installation instructions
2 parents bc1c915 + 39a9736 commit b3563b2

1 file changed

+55
-32
lines changed

docs/en/licensed_install.md

Lines changed: 55 additions & 32 deletions
@@ -13,61 +13,68 @@ sidebar:
1313

1414
## Install NLP libraries on Ubuntu
1515

16-
For installing John Snow Labs NLP on an Ubuntu machine/VM please run the following command:
16+
To install the John Snow Labs NLP libraries on an Ubuntu machine or VM, run the following command:
1717

1818
```bash
1919
wget https://setup.johnsnowlabs.com/nlp/install.sh -O - | sudo bash -s -- -a PATH_TO_LICENSE_JSON_FILE -i -r
2020
```
2121

22+
This script installs `Spark NLP`, `Spark NLP for Healthcare`, `Spark OCR`, `NLU` and `Spark NLP Display` in the specified virtual environment. It also creates a dedicated folder, `./JohnSnowLabs`, that holds all resources needed to use the libraries. Under `./JohnSnowLabs/example_notebooks` you will find ready-to-use example notebooks for testing the libraries on your data.
23+
2224
The install script offers several options:
23-
- *-h* show brief help
24-
- *-i* install mode: create a virtual environment and install the library
25-
- *-r* run mode: start jupyter after installation of the library
26-
- *-v* path to virtual environment (default: ./sparknlp_env)
27-
- *-j* path to license json file for Spark NLP for Healthcare
28-
- *-o* path to license json file for Spark OCR
29-
- *-a* path to a single license json file for both Spark OCR and Spark NLP
30-
- *-s* specify pyspark version
31-
- *-p* specify port of jupyter notebook
25+
- `-h` show brief help
26+
- `-i` install mode: create a virtual environment and install the library
27+
- `-r` run mode: start jupyter after installation of the library
28+
- `-v` path to virtual environment (default: ./sparknlp_env)
29+
- `-j` path to license json file for Spark NLP for Healthcare
30+
- `-o` path to license json file for Spark OCR
31+
- `-a` path to a single license json file for both Spark OCR and Spark NLP
32+
- `-s` specify pyspark version
33+
- `-p` specify port of jupyter notebook
34+
35+
Use the `-i` flag for installing the libraries in a new virtual environment.
3236

33-
Use the -i flag for installing the libraries in a new virtual environment.
37+
You can provide the desired path for the virtual environment using the `-v` flag; otherwise the default location `./sparknlp_env` is used.
3438

35-
You can provide the desired path for virtual env using -v flag, otherwise a default location of ./sparknlp_env will be selected.
39+
The `PATH_TO_LICENSE_JSON_FILE` parameter must be replaced with the path where the license file is available on the local machine. Depending on the libraries you want to use, different flags are available: `-j`, `-o` or `-a`. The license files can be downloaded from the *My Subscription* section in your [my.JohnSnowLabs.com](https://my.johnsnowlabs.com/) account.
3640

37-
The PATH_TO_LICENSE_JSON_FILE must be replaced to the path where the license file is available on the local machine. According to the libraries you want to use you have different flags: -j, -o, -a. The license files can be easily downloaded from *My Subscription* section in your my.JohnSnowLabs.com account.
41+
To start Jupyter Notebook right after the libraries are installed, use the `-r` flag.
3842

39-
To directly start using Jupyter Notebook after the installation of the libraries user the -r flag. The install script downloads a couple of ready to use example notebooks that you can use to start experimenting with the libraries.
43+
The install script downloads a couple of example notebooks that you can use to start experimenting with the libraries. They will be available in the `./JohnSnowLabs/example_notebooks` folder.
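As a rough illustration of how the flags described above combine, the following Python sketch assembles the install command for a given license file, virtual environment path and Jupyter port. The helper name `build_install_command` and the example paths are hypothetical; only the flags and the script URL are taken from the instructions above.

```python
import shlex

INSTALL_URL = "https://setup.johnsnowlabs.com/nlp/install.sh"


def build_install_command(license_path, venv_path=None, port=None, run_jupyter=True):
    """Assemble the install one-liner from the documented flags (hypothetical helper)."""
    args = ["-a", license_path, "-i"]   # -a: single license file, -i: install mode
    if run_jupyter:
        args.append("-r")               # -r: start jupyter after installation
    if venv_path is not None:
        args += ["-v", venv_path]       # -v: path to virtual environment
    if port is not None:
        args += ["-p", str(port)]       # -p: port of jupyter notebook
    # Quote each argument so paths containing spaces survive the shell.
    quoted = " ".join(shlex.quote(arg) for arg in args)
    return f"wget {INSTALL_URL} -O - | sudo bash -s -- {quoted}"


# Example paths are placeholders; replace them with your own.
print(build_install_command("./license.json", venv_path="./my_env", port=8889))
```

The printed line can be pasted into a terminal. Omitting `venv_path` leaves the script's default of `./sparknlp_env` in effect.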
4044

4145

42-
## Install NLP Libraries via Docker
46+
## Install via Docker
4347

44-
We have prepared a docker image that contains all the required libraries for installing and running Spark NLP for Healthcare. However, it does not contain the library itself, as it is licensed, and requires installation credentials.
48+
A docker image that contains all the required libraries for installing and running Spark NLP for Healthcare is also available. However, it does not contain the library itself, as it is licensed, and requires installation credentials.
4549

46-
Make sure you have valid license for Spark NLP for Healthcare, and follow the instructions below:
50+
Make sure you have a valid license for Spark NLP for Healthcare (if you do not have one, you can ask for a trial [here](https://www.johnsnowlabs.com/install/)), and follow the instructions below:
4751

4852

4953
### Instructions
5054

51-
- Run the following commands to download the docker-compose.yml and the sparknlp_keys.txt files on your local machine:
55+
- Run the following commands to download the `docker-compose.yml` and the `sparknlp_keys.txt` files on your local machine:
5256
```bash
5357
curl -o docker-compose.yaml https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/513a4d682f11abc33b2e26ef8a9d72ad52a7b4f0/jupyter/docker_image_nlp_hc/docker-compose.yaml
5458
curl -o sparknlp_keys.txt https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/jupyter/docker_image_nlp_hc/sparknlp_keys.txt
5559
```
56-
- Download your license key in json format from my.johnsnowlabs.com
57-
- Populate License keys in sparknlp_keys.txt.
60+
61+
- Download your license key in json format from [my.JohnSnowLabs.com](https://my.johnsnowlabs.com/)
62+
- Populate the license keys in the `sparknlp_keys.txt` file.
5863
- Run the following command to run the container in detached mode:
64+
5965
```bash
6066
docker-compose up -d
6167
```
62-
- By default, the jupyter notebook would run at port 8888 - you can access the notebook by typing localhost:8888 in your browser.
68+
- By default, the jupyter notebook runs on port `8888` - you can access it by typing `localhost:8888` in your browser.
6369

6470

6571
### Troubleshooting
6672

6773
- Make sure docker is installed on your system.
6874
- If you face any error while importing the lib inside jupyter, make sure all the credentials are correct in the key files and restart the service again.
69-
- If the default port 8888 is already occupied by another process, please change the mapping.
70-
- You can change/adjust volume and port mapping in the docker-compose.yml file.
75+
- If the default port `8888` is already occupied by another process, please change the mapping.
76+
- You can change/adjust volume and port mapping in the `docker-compose.yml` file.
77+
- You don't have a license key? Ask for a trial license [here](https://www.johnsnowlabs.com/install/).
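Before changing the port mapping, it can help to confirm whether port `8888` is actually taken. The sketch below is one way to check this from Python using only the standard library; the function name `port_is_free` is hypothetical and the check only covers the local loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    print("port 8888 free:", port_is_free(8888))
```

If the port is reported as busy, pick another host port in the compose file's port mapping and open that port in the browser instead.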
7178

7279
## Install locally on Python
7380

@@ -79,7 +86,9 @@ pip install -q spark-nlp-jsl==${version} --extra-index-url https://pypi.johnsnow
7986

8087
`{version}` is the version part of the `{secret.code}` (`{secret.code}.split('-')[0]`) (i.e. `2.6.0`)
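The rule above can be written out as a small Python sketch. The function name `version_from_secret` and the sample secret are placeholders for illustration; a real secret code comes with your license.

```python
def version_from_secret(secret_code):
    """Return the version prefix of a secret code: the text before the first '-'."""
    return secret_code.split("-")[0]


# Placeholder secret for illustration only.
example_secret = "2.6.0-xxxxxxxx"
print(version_from_secret(example_secret))  # prints 2.6.0
```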
8188

82-
The `{secret.code}` is a secret code that is only available to users with valid/trial license. If you did not receive it yet, please contact us at <a href="mailto:[email protected]">[email protected]</a>.
89+
The `{secret.code}` is a secret code that is only available to users with a valid or trial license.
90+
91+
You can ask for a free trial for Spark NLP for Healthcare [here](https://www.johnsnowlabs.com/install/). Then, you can obtain the secret code by visiting your account on [my.JohnSnowLabs.com](https://my.johnsnowlabs.com/). Read more on how to get a license [here](licensed_install#get-a-spark-nlp-for-healthcare-license).
8392

8493

8594
### Setup AWS-CLI Credentials for licensed pretrained models
@@ -91,22 +100,23 @@ Instructions about how to install AWSCLI are available at:
91100

92101
<a href="https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html">Installing the AWS CLI</a>
93102

94-
Make sure you configure your credentials with aws configure following the instructions at:
103+
Make sure you configure your credentials with `aws configure`, following the instructions at:
95104

96105
<a href="https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html">Configuring the AWS CLI</a>
97106

98-
Please substitute the `ACCESS_KEY` and `SECRET_KEY` with the credentials you have received from your Customer Owner (CO). If you need your credentials contact us at <a href="mailto:info@johnsnowlabs.com">info@johnsnowlabs.com</a>.
107+
Please substitute the `ACCESS_KEY` and `SECRET_KEY` with the credentials in your license json file, which is available in your account on [my.JohnSnowLabs.com](https://my.johnsnowlabs.com/). [Read this](licensed_install#get-a-spark-nlp-for-healthcare-license) for more information.
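As an alternative to typing the values into `aws configure` interactively, the credentials can be written to an AWS CLI style credentials file. The following is a minimal sketch under stated assumptions: the helper `write_aws_profile` is hypothetical, and the target path is passed in explicitly so the sketch never touches an existing file unless you point it there.

```python
import configparser
from pathlib import Path


def write_aws_profile(access_key, secret_key, path, profile="default"):
    """Write or update one profile in an AWS CLI style credentials file."""
    path = Path(path)
    # Interpolation is disabled so that '%' characters in a key are stored literally.
    config = configparser.ConfigParser(interpolation=None)
    if path.exists():
        config.read(path)  # keep any profiles that are already present
    config[profile] = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        config.write(handle)


# Example call (not executed here); ACCESS_KEY and SECRET_KEY are placeholders:
# write_aws_profile("ACCESS_KEY", "SECRET_KEY", Path.home() / ".aws" / "credentials")
```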
99108

100109

101110
### Start Spark NLP for Healthcare Session from Python
102111

103-
The following will initialize the spark session in case you have run the jupyter notebook directly. If you have started the notebook using
112+
The following will initialize the spark session in case you have run the Jupyter Notebook directly. If you have started the notebook using
104113
pyspark this cell is just ignored.
105114

106115
Initializing the spark session takes some seconds (usually less than 1 minute) as the jar from the server needs to be loaded.
107116

108-
The `{secret-code}` is a secret string you should have received from your Customer Owner (CO). If you have
109-
not received them, please contact us at <a href="mailto:[email protected]">[email protected]</a>.
117+
The `{secret.code}` is a secret code that is only available to users with a valid or trial license.
118+
119+
You can ask for a free trial for Spark NLP for Healthcare [here](https://www.johnsnowlabs.com/install/). Then, you can obtain the secret code by visiting your account on [my.JohnSnowLabs.com](https://my.johnsnowlabs.com/). Read more on how to get a license [here](licensed_install#get-a-spark-nlp-for-healthcare-license).
110120

111121
You can either use our convenience function to start your Spark Session that will use standard configuration arguments:
112122

@@ -116,6 +126,7 @@ spark = sparknlp_jsl.start("{secret.code}")
116126
```
117127

118128
Or use the SparkSession module for more flexibility:
129+
119130
```python
120131
from pyspark.sql import SparkSession
121132

@@ -157,7 +168,7 @@ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:3.2.3 --jars spark-n
157168

158169
### Use Spark NLP for Healthcare in Spark shell
159170

160-
1.Download the fat jar for spark-nlp-healthcare.
171+
1. Download the fat jar for Spark NLP for Healthcare.
161172

162173
```bash
163174
aws s3 cp --region us-east-2 s3://pypi.johnsnowlabs.com/$jsl_secret/spark-nlp-jsl-$jsl_version.jar spark-nlp-jsl-$jsl_version.jar
@@ -179,7 +190,7 @@ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:${public-version} --j
179190

180191
### Use Spark NLP for Healthcare in Sbt project
181192

182-
1.Download the fat jar for spark-nlp-healthcare.
193+
1. Download the fat jar for Spark NLP for Healthcare.
183194
```bash
184195
aws s3 cp --region us-east-2 s3://pypi.johnsnowlabs.com/$jsl_secret/spark-nlp-jsl-$jsl_version.jar spark-nlp-jsl-$jsl_version.jar
185196
```
@@ -399,4 +410,16 @@ spark = start(SECRET)
399410
```
400411
401412
As you see, we did not set `.master('local[*]')` explicitly to let YARN manage the cluster.
402-
Or you can set `.master('yarn')`.
413+
Or you can set `.master('yarn')`.
414+
415+
416+
## Get a Spark NLP for Healthcare license
417+
418+
You can ask for a free trial for Spark NLP for Healthcare [here](https://www.johnsnowlabs.com/install/). This will automatically create a new account for you on [my.JohnSnowLabs.com](https://my.johnsnowlabs.com/). Log in to your new account and, from the `My Subscriptions` section, download your license key as a json file.
419+
420+
The license json file contains:
421+
- the secrets for installing the Spark NLP for Healthcare and Spark OCR libraries,
422+
- the license key as well as
423+
- AWS credentials that you need to access the s3 bucket where the healthcare models and pipelines are published.
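The items listed above can be read from the license file with a few lines of Python. This is a sketch that assumes the file is a flat JSON object of string values; the key names inside it are not specified here, so the code copies whatever string entries it finds rather than assuming particular names. Both function names are hypothetical.

```python
import json
import os


def load_license(path):
    """Read the license json file and return its entries as a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def export_to_environment(entries):
    """Copy every string entry into os.environ and return the exported key names."""
    exported = []
    for key, value in entries.items():
        if isinstance(value, str):  # skip anything that is not a plain string
            os.environ[key] = value
            exported.append(key)
    return sorted(exported)


# Example usage (the path is a placeholder):
# entries = load_license("./license.json")
# print(export_to_environment(entries))
```

Exporting the entries as environment variables keeps the secrets out of notebooks and scripts. Avoid printing the values themselves.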
424+
425+
If you have asked for a trial license but you cannot access your account on [my.JohnSnowLabs.com](https://my.johnsnowlabs.com/) and you did not receive the license information via email, please contact us at <a href="mailto:[email protected]">[email protected]</a>.
