[Kibana][9.1 & Serverless] Update ML docs to reflect new nav #2086

Merged
merged 8 commits into from
Jul 10, 2025
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ There are a few limitations to consider before you create this type of job:

If those limitations are acceptable, try creating an {{anomaly-job}} that uses the [`lat_long` function](/reference/data-analysis/machine-learning/ml-geo-functions.md#ml-lat-long) to analyze your own data or the sample data sets.

To create an {{anomaly-job}} that uses the `lat_long` function, in {{kib}} you must click **Create job** on the **{{ml-cap}} > {{anomaly-detect-cap}} > Jobs** page and select the advanced job wizard. Alternatively, use the [create {{anomaly-jobs}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job).
To create an {{anomaly-job}} that uses the `lat_long` function, navigate to the **Anomaly Detection Jobs** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md). Then click **Create job** and select the appropriate job wizard. Alternatively, use the [create {{anomaly-jobs}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job).

For example, create a job that analyzes the sample eCommerce orders data set to find orders with unusual coordinates (`geoip.location` values) relative to the past behavior of each customer (`user` ID):
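A minimal API sketch of such a job, assuming the eCommerce sample data uses `order_date` as its time field. The job ID, the 15-minute bucket span, and the influencer list are illustrative assumptions, not values taken from this page, and a datafeed would still be needed to feed the job:

```console
# Hypothetical job ID; bucket span and influencers are assumptions
PUT _ml/anomaly_detectors/ecommerce-geo-sketch
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "Unusual coordinates by user",
        "function": "lat_long",
        "field_name": "geoip.location",
        "by_field_name": "user"
      }
    ],
    "influencers": ["user"]
  },
  "data_description": {
    "time_field": "order_date"
  }
}
```

Using `by_field_name` makes the detector compare each order's location against the history of the same `user` value.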

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ If you have fields that contain valid vector layers, you can use the **{{data-vi

## Create an {{anomaly-job}} [mapping-anomalies-jobs]

To create an {{anomaly-job}} in {{kib}}, click **Create job** on the **{{ml-cap}} > {{anomaly-detect-cap}}** page and select an appropriate job wizard. Alternatively, use the [create {{anomaly-jobs}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job).
To create an {{anomaly-job}}, navigate to the **Anomaly Detection Jobs** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md). Then click **Create job** and select the appropriate job wizard. Alternatively, use the [create {{anomaly-jobs}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job).

For example, use the multi-metric job wizard to create a job that analyzes the sample web logs data set to detect anomalous behavior in the sum of the data transferred (`bytes` values) for each destination country (`geo.dest` values):
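The equivalent configuration could be sketched through the API roughly as follows. The job ID, the bucket span, and the `timestamp` time field are assumptions made for illustration:

```console
# Hypothetical job ID; bucket span and time field are assumptions
PUT _ml/anomaly_detectors/weblogs-bytes-by-dest-sketch
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "function": "sum",
        "field_name": "bytes",
        "partition_field_name": "geo.dest"
      }
    ],
    "influencers": ["geo.dest"]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
```

The `partition_field_name` setting splits the analysis so that each destination country gets its own baseline.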

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ If your data is located outside of {{es}}, you cannot use {{kib}} to create your

## Create an {{anomaly-job}} [ml-ad-create-job]

You can create {{anomaly-jobs}} by using the [create {{anomaly-jobs}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job). {{kib}} also provides wizards to simplify the process, which vary depending on whether you are using the {{ml-app}} app, {{security-app}} or {{observability}} apps. To open **Anomaly Detection**, find **{{ml-app}}** in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md).
You can create {{anomaly-jobs}} by using the [create {{anomaly-jobs}} API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-put-job). {{kib}} also provides wizards to simplify the process, which vary depending on whether you are using the {{ml-app}} app, {{security-app}} or {{observability}} apps. To start creating an {{anomaly-job}}, navigate to the **Anomaly Detection Jobs** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md).

:::{image} /explore-analyze/images/machine-learning-ml-create-job.png
:alt: Create New Job
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,8 +25,8 @@ Categorization is a {{ml}} process that tokenizes a text field, clusters similar

## Creating categorization jobs [creating-categorization-jobs]

1. In {{kib}}, navigate to **Jobs**. To open **Jobs**, find **{{ml-app}} > Anomaly Detection** in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md).
2. Click **Create job**, select the data view you want to analyze.
1. To create an {{anomaly-job}}, navigate to the **Anomaly Detection Jobs** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md).
2. Click **Create anomaly detection job**, then select the data view you want to analyze.
3. Select the **Categorization** wizard from the list.
4. Choose a categorization detector - it’s the `count` function in this example - and the field you want to categorize - the `message` field in this example.
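The wizard steps above correspond roughly to the following API request. This is a sketch under stated assumptions: the job ID, the bucket span, and the `@timestamp` time field are hypothetical and depend on your data:

```console
# Hypothetical job ID; bucket span and time field are assumptions
PUT _ml/anomaly_detectors/categorization-sketch
{
  "analysis_config": {
    "bucket_span": "30m",
    "categorization_field_name": "message",
    "detectors": [
      {
        "function": "count",
        "by_field_name": "mlcategory"
      }
    ]
  },
  "data_description": {
    "time_field": "@timestamp"
  }
}
```

Setting `by_field_name` to the reserved keyword `mlcategory` tells the detector to count events per category derived from the `message` field.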

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ Population analysis is resource-efficient and scales well, enabling the analysis

## Creating population jobs [creating-population-jobs]

1. In {{kib}}, navigate to **Jobs**. To open **Jobs**, find **{{ml-app}} > Anomaly Detection** in the main menu, or use the [global search field](/explore-analyze/query-filter/filtering.md#_finding_your_apps_and_objects).
1. To create an {{anomaly-job}}, navigate to the **Anomaly Detection Jobs** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md).
2. Click **Create job**, then select the {{data-source}} you want to analyze.
3. Select the **Population** wizard from the list.
4. Choose a population field - it’s the `clientip` field in this example - and the metric you want to use for the analysis - `Mean(bytes)` in this example.
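Expressed through the API, the population configuration above might look like this sketch. The job ID, bucket span, influencer list, and `timestamp` time field are assumptions for illustration:

```console
# Hypothetical job ID; bucket span, influencers, and time field are assumptions
PUT _ml/anomaly_detectors/population-sketch
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "function": "mean",
        "field_name": "bytes",
        "over_field_name": "clientip"
      }
    ],
    "influencers": ["clientip"]
  },
  "data_description": {
    "time_field": "timestamp"
  }
}
```

The `over_field_name` setting is what makes this a population analysis: each `clientip` is compared against the behavior of all other client IPs rather than against its own history.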
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ products:

[Snapshots of the {{ml}} model](ml-ad-run-jobs.md#ml-ad-model-snapshots) for each {{anomaly-job}} are saved frequently to an internal {{es}} index to ensure resilience. These snapshots make it possible to revert the model to a previous state after a system failure or if the model changed significantly due to a one-off event.
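The same revert can be sketched with the model snapshot APIs. `<job_id>` and `<snapshot_id>` are placeholders, and the request body is optional. As far as we are aware, the job needs to be closed before a revert:

```console
# List the snapshots available for the job
GET _ml/anomaly_detectors/<job_id>/model_snapshots

# Revert to a chosen snapshot and discard results created after it
POST _ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>/_revert
{
  "delete_intervening_results": true
}
```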

1. In {{kib}}, navigate to **Jobs**. To open **Jobs**, find **{{ml-app}} > Anomaly Detection** in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md).
1. Navigate to the **Anomaly Detection Jobs** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md).
2. Locate the {{anomaly-job}} whose model you want to revert in the job table.
3. Open the job details and navigate to the **Model Snapshots** tab.
:::{image} /explore-analyze/images/machine-learning-anomaly-job-model-snapshots.jpg
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -160,7 +160,7 @@ To learn more about choosing the class assignment objective that fits your goal,

The model that you created is stored as {{es}} documents in internal indices. In other words, the characteristics of your trained model are saved and ready to be deployed and used as functions.

1. To deploy {{dfanalytics}} model in a pipeline, navigate to **Machine Learning** > **Model Management** > **Trained models** in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}.
1. To deploy a {{dfanalytics}} model in a pipeline, navigate to the **Trained models** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}.
2. Find the model you want to deploy in the list and click **Deploy model** in the **Actions** menu.
:::{image} /explore-analyze/images/machine-learning-ml-dfa-trained-models-ui.png
:alt: The trained models UI in {{kib}}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -121,7 +121,7 @@ The goal of {{oldetection}} is to find the most unusual documents in an index. L
2. Create a {{transform}} that generates an entity-centric index with numeric or boolean data to analyze.
In this example, we’ll use the web logs sample data and pivot the data such that we get a new index that contains a network usage summary for each client IP.
In particular, create a {{transform}} that calculates the number of occasions when a specific client IP communicated with the network (`@timestamp.value_count`), the sum of the bytes that are exchanged between the network and the client’s machine (`bytes.sum`), the maximum exchanged bytes during a single occasion (`bytes.max`), and the total number of requests (`request.value_count`) initiated by a specific client IP.
You can preview the {{transform}} before you create it in **{{stack-manage-app}}** > **Transforms**:
You can preview the {{transform}} before you create it. Go to the **Transforms** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}:
:::{image} /explore-analyze/images/machine-learning-logs-transform-preview.jpg
:alt: Creating a {{transform}} in {{kib}}
:screenshot:
Expand Down
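The transform preview described in this step can also be sketched with the preview {{transform}} API. The aggregation names follow the text above; the source index name `kibana_sample_data_logs` and the `request.keyword` field are assumptions about the web logs sample data:

```console
# Source index and the request.keyword field are assumptions
POST _transform/_preview
{
  "source": {
    "index": "kibana_sample_data_logs"
  },
  "pivot": {
    "group_by": {
      "clientip": { "terms": { "field": "clientip" } }
    },
    "aggregations": {
      "@timestamp.value_count": { "value_count": { "field": "@timestamp" } },
      "bytes.sum": { "sum": { "field": "bytes" } },
      "bytes.max": { "max": { "field": "bytes" } },
      "request.value_count": { "value_count": { "field": "request.keyword" } }
    }
  }
}
```

Grouping by `clientip` produces one summary document per client IP, which is the entity-centric shape that {{oldetection}} expects.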
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ R-squared (R^2^) represents the goodness of fit and measures how much of the var

The model that you created is stored as {{es}} documents in internal indices. In other words, the characteristics of your trained model are saved and ready to be deployed and used as functions. The [{{infer}}](#ml-inference-reg) feature enables you to use your model in a preprocessor of an ingest pipeline or in a pipeline aggregation of a search query to make predictions about your data.
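As a rough illustration of the ingest pipeline route, an {{infer}} processor can reference the trained model by ID. The pipeline name is hypothetical and `<model_id>` is a placeholder for the ID of your trained model:

```console
# Hypothetical pipeline name; <model_id> is a placeholder
PUT _ingest/pipeline/flight-delay-prediction-sketch
{
  "processors": [
    {
      "inference": {
        "model_id": "<model_id>",
        "inference_config": {
          "regression": {}
        }
      }
    }
  ]
}
```

Documents indexed through this pipeline would receive a predicted value alongside their original fields.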

1. To deploy {{dfanalytics}} model in a pipeline, navigate to **Machine Learning** > **Model Management** > **Trained models** in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}.
1. To deploy a {{dfanalytics}} model in a pipeline, navigate to the **Trained models** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}.
2. Find the model you want to deploy in the list and click **Deploy model** in the **Actions** menu.
:::{image} /explore-analyze/images/machine-learning-ml-dfa-trained-models-ui.png
:alt: The trained models UI in {{kib}}
Expand Down Expand Up @@ -234,74 +234,78 @@ To predict the number of minutes delayed for each flight:
3. Optionally improve the quality of the analysis by adding a query that removes erroneous data. In this case, we omit flights with a distance of 0 kilometers or less.
4. Choose `FlightDelayMin` as the {{depvar}}, which is the field that we want to predict.
5. Add `Cancelled`, `FlightDelay`, and `FlightDelayType` to the list of excluded fields. These fields will be excluded from the analysis. It is recommended to exclude fields that either contain erroneous data or describe the `dependent_variable`.

The wizard includes a scatterplot matrix, which enables you to explore the relationships between the numeric fields. The color of each point is affected by the value of the {{depvar}} for that document, as shown in the legend. You can highlight an area in one of the charts and the corresponding area is also highlighted in the rest of the charts. You can use this matrix to help you decide which fields to include or exclude from the analysis.

:::{image} /explore-analyze/images/machine-learning-flightdata-regression-scatterplot.png
:alt: A scatterplot matrix for three fields in {{kib}}
:screenshot:
:::

If you want these charts to represent data from a larger sample size or from a randomized selection of documents, you can change the default behavior. However, a larger sample size might slow down the performance of the matrix and a randomized selection might put more load on the cluster due to the more intensive query.

6. Choose a training percent of `90`, which means the job randomly selects 90% of the source data for training.
7. If you want to experiment with [{{feat-imp}}](ml-feature-importance.md), specify a value in the advanced configuration options. In this example, we choose to return a maximum of 5 {{feat-imp}} values per document. This option affects the speed of the analysis, so by default it is disabled.
8. Use a model memory limit of at least 50 MB. If the job requires more than this amount of memory, it fails to start. If the available memory on the node is limited, this setting makes it possible to prevent job execution.
9. Add a job ID (such as `model-flight-delays-regression`) and optionally a job description.
10. Add the name of the destination index that will contain the results of the analysis. In {{kib}}, the index name matches the job ID by default. It will contain a copy of the source index data where each document is annotated with the results. If the index does not exist, it will be created automatically.

::::{dropdown} API example

```console
PUT _ml/data_frame/analytics/model-flight-delays-regression
{
  "source": {
    "index": [
      "kibana_sample_data_flights"
    ],
    "query": {
      "range": {
        "DistanceKilometers": {
          "gt": 0
        }
      }
    }
  },
  "dest": {
    "index": "model-flight-delays-regression"
  },
  "analysis": {
    "regression": {
      "dependent_variable": "FlightDelayMin",
      "training_percent": 90,
      "num_top_feature_importance_values": 5,
      "randomize_seed": 1000
    }
  },
  "model_memory_limit": "50mb",
  "analyzed_fields": {
    "includes": [],
    "excludes": [
      "Cancelled",
      "FlightDelay",
      "FlightDelayType"
    ]
  }
}
```

::::


After you have configured your job, the configuration details are automatically validated. If the checks are successful, you can start the job. If the configuration is invalid, a warning message is shown that suggests how to improve the configuration.

3. Start the job in {{kib}} or use the [start {{dfanalytics-jobs}}](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-start-data-frame-analytics) API.

The job takes a few minutes to run. Runtime depends on the local hardware and also on the number of documents and fields that are analyzed. The more fields and documents, the longer the job runs. It stops automatically when the analysis is complete.

::::{dropdown} API example

```console
POST _ml/data_frame/analytics/model-flight-delays-regression/_start
```

::::

4. Check the job stats to follow the progress in {{kib}} or use the [get {{dfanalytics-jobs}} statistics API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-ml-get-data-frame-analytics-stats).
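For reference, a statistics request for the job ID used in the API examples above could look like this:

```console
GET _ml/data_frame/analytics/model-flight-delays-regression/_stats
```

The response reports the job state and per-phase progress, which you can poll until the analysis completes.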

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ Alternatively, you can use APIs like [get trained models](https://www.elastic.co

### Models trained by {{dfanalytics}} [_models_trained_by_dfanalytics]

1. To deploy {{dfanalytics}} model in a pipeline, navigate to **Machine Learning** > **Model Management** > **Trained models** in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}.
1. To deploy a {{dfanalytics}} model in a pipeline, navigate to the **Trained models** page in the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}.

2. Find the model you want to deploy in the list and click **Deploy model** in the **Actions** menu.

Expand Down
4 changes: 2 additions & 2 deletions explore-analyze/machine-learning/nlp/ml-nlp-e5.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ After you created the E5 {{infer}} endpoint, it’s ready to be used for semanti

### Alternative methods to download and deploy E5 [alternative-download-deploy-e5]

You can also download and deploy the E5 model either from **{{ml-app}}** > **Trained Models**, from **Search** > **Indices**, or by using the trained models API in Dev Console.
You can also download and deploy the E5 model from the **Trained models** page, from **Search** > **Indices**, or by using the trained models API in Dev Console.
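A minimal sketch of the Dev Console route, assuming the portable model ID `.multilingual-e5-small` and an input field named `text_field`. The deployment ID `for_search` is a hypothetical name:

```console
# Download the model (model ID and field name are assumptions)
PUT _ml/trained_models/.multilingual-e5-small
{
  "input": {
    "field_names": ["text_field"]
  }
}

# Start a deployment once the download has finished
POST _ml/trained_models/.multilingual-e5-small/deployment/_start?deployment_id=for_search
```

If you prefer the Intel and Linux optimized version, substitute its model ID in both requests.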

::::{note}
For most cases, the preferred version is the **Intel and Linux optimized** model; it is recommended to download and deploy that version.
Expand All @@ -62,7 +62,7 @@ For most cases, the preferred version is the **Intel and Linux optimized** model

#### Using the Trained Models page [trained-model-e5]

1. In {{kib}}, navigate to **{{ml-app}}** > **Trained Models** from the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md). E5 can be found in the list of trained models. There are two versions available: one portable version which runs on any hardware and one version which is optimized for Intel® silicon. You can see which model is recommended to use based on your hardware configuration.
1. In {{kib}}, navigate to the **Trained Models** page from the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md). E5 can be found in the list of trained models. There are two versions available: one portable version which runs on any hardware and one version which is optimized for Intel® silicon. You can see which model is recommended to use based on your hardware configuration.
2. Click the **Add trained model** button. Select the E5 model version you want to use in the opening modal window. The model that is recommended for you based on your hardware configuration is highlighted. Click **Download**. You can check the download status on the **Notifications** page.

:::{image} /explore-analyze/images/machine-learning-ml-nlp-e5-download.png
Expand Down
2 changes: 1 addition & 1 deletion explore-analyze/machine-learning/nlp/ml-nlp-elser.md
Original file line number Diff line number Diff line change
Expand Up @@ -265,7 +265,7 @@ For a file-based access, follow these steps:

## Testing ELSER [_testing_elser]

You can test the deployed model in {{kib}}. Navigate to **Model Management** > **Trained Models** from the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}. Locate the deployed ELSER model in the list of trained models, then select **Test model** from the Actions menu.
You can test the deployed model in {{kib}}. Navigate to the **Trained Models** page from the main menu, or use the [global search field](../../find-and-organize/find-apps-and-objects.md) in {{kib}}. Locate the deployed ELSER model in the list of trained models, then select **Test model** from the Actions menu.

You can use data from an existing index to test the model. Select the index, then a field of the index you want to test ELSER on. Provide a search query and click **Test**. Evaluating model recall is simpler when using a query related to the documents.
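A comparable check can be sketched outside the UI with the infer trained model API. The model ID `.elser_model_2` and the sample text are assumptions; use the ID of your own deployed model or deployment:

```console
# Model ID and sample text are assumptions
POST _ml/trained_models/.elser_model_2/_infer
{
  "docs": [
    {
      "text_field": "How do I reset a forgotten password?"
    }
  ]
}
```

The response contains the weighted tokens that ELSER produces for the input text, which helps confirm that the deployment responds as expected.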

Expand Down