-
Notifications
You must be signed in to change notification settings - Fork 244
Feat/genapi add shared responsibility model #5125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from all commits
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
b134a6e
feat(genapi): add shared responsibility model
fpagny 6bfca36
fix(genapi): typos in shared responsibility model
fpagny 1f6c088
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 8f725ae
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny c840b47
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 7ccf703
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 78b906d
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 50ef4c7
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 955c915
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 39dc2f5
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 2f5cc25
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny 9e257c8
Update pages/generative-apis/reference-content/security-and-reliabili…
fpagny f486e3e
Apply suggestions from code review
RoRoJ 0d834a7
Apply suggestions from code review
RoRoJ 51eb952
Apply suggestions from code review
RoRoJ 7489d6f
Update pages/generative-apis/reference-content/security-and-reliabili…
RoRoJ 1351726
fix(ai): add to menu
RoRoJ a905fa7
fix(ai): fix posted date
RoRoJ File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
133 changes: 133 additions & 0 deletions
133
pages/generative-apis/reference-content/security-and-reliability.mdx
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,133 @@ | ||
--- | ||
meta: | ||
title: Security and Reliability in Generative APIs | ||
description: Learn more about shared responsibility in security and reliability practices for Generative APIs | ||
content: | ||
h1: Security and Reliability in Generative APIs | ||
paragraph: Learn more about shared responsibility in security and reliability practices for Generative APIs | ||
tags: generative-apis ai-data shared responsibility security reliability | ||
dates: | ||
posted: 2025-06-16 | ||
validation: 2025-06-16 | ||
--- | ||
|
||
This page outlines key principles and best practices to help you ensure your applications' security and reliability when using Generative APIs. | ||
|
||
## Resilience | ||
|
||
Resilience ensures the continuity and availability of your applications and data, even in the face of disruptions or failures. In Generative APIs, you can promote resilience through three pillars: **availability**, **durability** and **performance**. | ||
|
||
### Availability and durability | ||
|
||
Generative APIs Service Level Agreements (SLAs) target the following Service Level Objectives (SLOs): | ||
|
||
| Processing Type | Configuration Details | Availability | | ||
| ------------ | ------------------- | ------ | | ||
| Standard | Standard synchronous HTTP calls to Generative APIs providing the generated content directly in HTTP response. These calls include stream and non-stream requests. | 99.9% | | ||
| Batch | Asynchronous processing of files sent to Generative APIs providing the generated content through files. | 99.9% | | ||
|
||
The detailed SLA measurements and guarantees can be found on the [Service Level Agreement for Generative APIs](https://www.scaleway.com/en/database/sla/) page. | ||
|
||
As we do not store any data with standard processing, durability requirements do not apply. | ||
|
||
When processing data using batch processing, your input data is stored only during processing time (24 hours): | ||
- As input data storage is only temporary, no specific durability guarantee applies. | ||
- Output data (processing results) durability depends on the target storage system used. The storage system used by default is the [Object Storage Standard Class](/object-storage/concepts/#storage-class) | ||
|
||
|
||
## Performance | ||
|
||
Standard processing (synchronous HTTP calls): | ||
- Generative APIs run models on mutualized infrastructure, and therefore ensure good performance in average usage. We monitor and respond quickly to any drops in token generation throughput, but cannot strictly guarantee performance, especially during customer peak loads. As a consequence, [rate limits](/generative-apis/reference-content/rate-limits/) apply, to ensure fair use of synchronous HTTP calls. Bigger volumes of requests should be treated through batch processing. | ||
- Guaranteed performance can be provided using dedicated resources on the [Managed Inference](/managed-inference/) product. | ||
|
||
Batch processing (asynchronous file processing): | ||
- When using batch processing, we handle scheduling of batch jobs to optimize both processing resource allocation and processing time. Processing time is therefore only guaranteed to be lower than 24 hours and [rate limits](/generative-apis/reference-content/rate-limits/) (larger than Standard processing) still apply. | ||
|
||
## Monitoring | ||
|
||
Monitoring is an essential pillar to ensure the security and reliability of your services. It provides real-time insights into the performance, security, and resource consumption of your Generative APIs usage. | ||
|
||
### Metrics and logs | ||
|
||
Generative APIs metrics and logs are stored inside [Scaleway Cockpit](/cockpit/). | ||
|
||
This includes: | ||
- **Metrics**: Input and output tokens and API requests. Metrics are refreshed every minute (some dashboards may aggregate data by the hour for accuracy reasons, but metrics can be queried at a finer rate using Cockpit custom dashboards) | ||
- **Logs**: No logs are currently stored inside Cockpit. | ||
|
||
## Configuration and version management | ||
|
||
Configuration and version management are critical for maintaining reliability and performance across your services. | ||
|
||
### Configuration | ||
|
||
Currently, Generative APIs do not provide specific configuration properties stored within your account. All configuration parameters are the ones you send through each API HTTP call (such as `temperature`, `top_p` or `seed`) and you remain responsible for any change in outputs based on these parameters. | ||
|
||
Since Generative AI models are by definition non-deterministic, we cannot guarantee the same input will provide the same output over time (for example when using two different HTTP calls). If you want deterministic processing, we encourage you to use [Managed Inference](/managed-inference) with a specific model and set all randomness parameters to deterministic levels (for example using for instance `temperature`:`0` and a specific `seed` value). | ||
|
||
### Version management | ||
|
||
#### Supported models | ||
|
||
Any changes to supported models and associated guarantees are detailed in our [model lifecycle policy page](/generative-apis/reference-content/model-lifecycle/). | ||
|
||
#### API versions | ||
|
||
Two types of API version updates may be performed: | ||
|
||
| Upgrade Type | Description| | ||
| ------------ | ------------------- | | ||
| Minor | These updates do not change the API's current fields format and are backward compatible (no action is required on your side). New fields and features can however be added. | | ||
| Major | These updates change the API's current fields or paths. They may require action from your side. In this case, we will notify you with at least 3 months' warning before deprecating significant features that might break your application. | | ||
|
||
#### Compatibility with third party tools | ||
|
||
By following industry standards (such as targeting OpenAI API compatibility), we aim to provide compatibility with most AI ecosystems and tools by default. However, as ecosystems evolve quickly, we cannot always guarantee compatibility with third party tools. We do provide extensive documentation: | ||
- Current API supported features are available in our [API Documentation](/generative-apis/api-cli/) | ||
- Advanced errors and edge case workarounds are provided in our [Troubleshooting section](/generative-apis/troubleshooting/fixing-common-issues/). | ||
- Information about integration with third party tools is available in our [dedicated documentation page](/generative-apis/reference-content/integrating-generative-apis-with-popular-tools/#openai-client-libraries) | ||
|
||
## Data protection | ||
|
||
Our data protection measures are detailed on our [privacy policy page](/generative-apis/reference-content/data-privacy/). | ||
|
||
- Scaleway does not store sensitive data (such as the content of your prompt), unless we need it to provide the service (such as when using batch processing) | ||
- When data is stored, it is protected using a state of the art method (such as encryption at rest) | ||
- During transit, your data is encrypted via the HTTPS protocol | ||
|
||
### Scaleway access | ||
|
||
In order to perform maintenance operations and guarantee the reliability of Generative APIs, or comply with local regulations, we need to access servers hosting the Generative APIs service. | ||
|
||
Most of the time, any actions Scaleway carries out are automatic, e.g. updating configuration or upgrading software versions. | ||
|
||
Manual interventions might be required occasionally for troubleshooting reasons (such as specific customer requests generating errors or carrying out malicious activity). We may temporarily complete the content of HTTP requests to identify a root cause issue or any security vulnerability. All Scaleway access is authenticated and traced following industry security standards. | ||
|
||
## Identity and access management | ||
|
||
Identity and access management allows you to enable granular control over user permissions and to mitigate the risk of unauthorized access or data breaches. | ||
|
||
All access to Generative APIs is authenticated and authorized, relying on [Scaleway IAM permissions sets](/iam/reference-content/permission-sets/). | ||
|
||
You are responsible for attributing these permissions to the relevant users or applications, and for reviewing these accesses frequently. | ||
|
||
## Compliance | ||
|
||
Several regulations apply to us (Scaleway) directly, whereas others apply to your usage. Even in this case, we help you ease your compliance process by providing you with the information you need from your cloud provider. | ||
### AI Act | ||
|
||
We (Scaleway) ensure our compliance with the [AI Act](https://artificialintelligenceact.eu/) within our scope of responsibilities. We also ensure that you have the information needed to comply with the requirements that apply to you. This means concretely: | ||
- Gathering information from AI Model Providers about their models (such as whether their training capacity is above 10²⁵ FLOPs, and falls into a specific category), and providing you with a link to these documents when they are made available. | ||
- Providing you with links towards licensing required by the AI Model Providers. | ||
|
||
Scaleway has no access to, nor knowledge of any inputs and outputs generated by the models. By using our AI products, you agree and acknowledge that you are (a) responsible for this use, including any content integrated into the models, and (b) required to use the AI products in compliance with our General Terms of Services. | ||
|
||
As a client of our AI products, you are likely to be considered an AI System Provider or Deployer under the AI Act. As such, it is your responsibility to ensure you comply with requirements that apply to you. | ||
|
||
### Additional local regulation | ||
|
||
If you require additional information to comply with specific regulations, you can create a [support ticket](https://console.scaleway.com/support/tickets/create) or contact your account manager. | ||
|
||
|
||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.