Commit 7d34183

Merge pull request #8 from ajdapretnar/docs-unsup
Unsupervised: Updated documentation
2 parents 63c04a3 + 218ecce commit 7d34183

105 files changed

+2198
-195
lines changed

source/index.rst

Lines changed: 2 additions & 2 deletions
@@ -57,11 +57,9 @@ Data
 widgets/data/applydomain
 widgets/data/purgedomain
 widgets/data/rank
-widgets/data/correlations
 widgets/data/color
 widgets/data/featurestatistics
 widgets/data/melt
-widgets/data/neighbors
 widgets/data/unique
 widgets/data/groupby

@@ -150,6 +148,8 @@ Unsupervised
 :maxdepth: 1

 widgets/unsupervised/PCA
+widgets/unsupervised/neighbors
+widgets/unsupervised/correlations
 widgets/unsupervised/correspondenceanalysis
 widgets/unsupervised/distancemap
 widgets/unsupervised/distances
7 binary files not shown (reported size changes: -24.7 KB and -15.1 KB).

source/widgets/data/neighbors.md

Lines changed: 0 additions & 42 deletions
This file was deleted.

source/widgets/unsupervised/PCA.md

Lines changed: 10 additions & 11 deletions
@@ -10,20 +10,19 @@ PCA linear transformation of input data.
 **Outputs**

 - Transformed Data: PCA transformed data
-- Components: [Eigenvectors](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors).
+- Data: original data with PCA components as meta variables
+- Components: [Eigenvectors](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors)
+- PCA: PCA to use as Scorer in [Rank](../data/rank.md)

 [Principal Component Analysis](https://en.wikipedia.org/wiki/Principal_component_analysis) (PCA) computes the PCA linear transformation of the input data. It outputs either a transformed dataset with weights of individual instances or weights of principal components.

-![](images/PCA-stamped.png)
+![](images/PCA-stamped.png){width=500px}

-1. Select how many principal components you wish in your output. It is best to choose as few as possible with variance covered as high as possible. You can also set how much variance you wish to cover with your principal components.
-2. You can normalize data to adjust the values to common scale. If checked, columns are divided by their standard deviations.
+1. Select how many principal components you wish in your output. It is best to choose as few as possible with variance (parameter *Explained variance*) covered as high as possible. You can also set how much variance you wish to cover with your principal components.
+2. You can normalize data to adjust the values to common scale. If checked, columns are divided by their standard deviations. One can also set how many components to display in the graph.
 3. When *Apply Automatically* is ticked, the widget will automatically communicate all changes. Alternatively, click *Apply*.
-4. Press *Save Image* if you want to save the created image to your computer.
-5. Produce a report.
-6. Principal components graph, where the red (lower) line is the variance covered per component and the green (upper) line is cumulative variance covered by components.

-The number of components of the transformation can be selected either in the *Components Selection* input box or by dragging the vertical cutoff line in the graph.
+The principal components graph, called a scree plot, show the red (lower) line, representing the variance covered per component, and the green (upper) line, representing the cumulative variance covered by components. The number of components of the transformation can be selected either in the *Components* input box or by dragging the vertical cutoff line in the graph.

 Preprocessing
 -------------
@@ -39,8 +38,8 @@ Examples

 **PCA** can be used to simplify visualizations of large datasets. Below, we used the *Iris* dataset to show how we can improve the visualization of the dataset with PCA. The transformed data in the [Scatter Plot](../visualize/scatterplot.md) show a much clearer distinction between classes than the default settings.

-![](images/PCAExample.png)
+![](images/PCA-Example1.png)

-The widget provides two outputs: transformed data and principal components. Transformed data are weights for individual instances in the new coordinate system, while components are the system descriptors (weights for principal components). When fed into the [Data Table](../data/datatable.md), we can see both outputs in numerical form. We used two data tables in order to provide a more clean visualization of the workflow, but you can also choose to edit the links in such a way that you display the data in just one data table. You only need to create two links and connect the *Transformed data* and *Components* inputs to the *Data* output.
+PCA can also be used as a scorer for the [Rank](../data/rank.md) widget. We used the *iris* data for this example. The data is passed both to Rank and to PCA. PCA passes the Scorer output to the Rank widget. Rank now shows feature scores for the first two principal components.

-![](images/PCAExample2.png)
+![](images/PCA-Example2.png)
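
Note on the scree plot described in the updated PCA.md: the two curves (variance covered per component and cumulative variance) and the "cover this much variance" setting can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the widget's implementation: the function names and the example eigenvalues are hypothetical, and the eigenvalues are assumed to come from an already computed PCA.

```python
from itertools import accumulate


def explained_variance(eigenvalues):
    """Return (per-component, cumulative) variance ratios.

    `eigenvalues` are assumed to be the PCA eigenvalues (component
    variances); the name and input format are hypothetical.
    """
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    # Sort descending so the first component covers the most variance.
    ratios = [value / total for value in sorted(eigenvalues, reverse=True)]
    # Running sum of the ratios gives the cumulative (upper) curve.
    cumulative = list(accumulate(ratios))
    return ratios, cumulative


def components_for_variance(eigenvalues, target):
    """Smallest number of components whose cumulative ratio reaches `target`."""
    _, cumulative = explained_variance(eigenvalues)
    for count, covered in enumerate(cumulative, start=1):
        # Small tolerance guards against floating-point rounding.
        if covered >= target - 1e-12:
            return count
    return len(cumulative)


# Hypothetical eigenvalues, chosen only to keep the arithmetic simple.
example_eigenvalues = [4.0, 2.0, 1.0, 1.0]
ratios, cumulative = explained_variance(example_eigenvalues)
print("per component:", ratios)
print("cumulative:   ", cumulative)
print("components for 80% variance:",
      components_for_variance(example_eigenvalues, 0.8))
```

The per-component ratios correspond to the lower line of the scree plot and the running sum to the upper line; picking the first point where the running sum reaches the target mirrors choosing as few components as possible for a given explained variance.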
