feat: decoupled prometheus exporter's calculation and output #12383
Conversation
```diff
@@ -454,10 +458,11 @@ local function collect(ctx, stream_only)
     local config = core.config.new()

     -- config server status
-    local vars = ngx.var or {}
-    local hostname = vars.hostname or ""
+    local hostname = core.utils.gethostname() or ""
```
Because ngx.var is disabled in the context of ngx.timer (the call fails with "API disabled in the context of ngx.timer").
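For reference, a minimal sketch of the constraint (only the core.utils.gethostname() call comes from the diff above; the surrounding function is illustrative):

```lua
local core = require("apisix.core")

local function collect_in_timer()
    -- ngx.var is among the APIs disabled inside ngx.timer callbacks, so
    -- reading ngx.var.hostname here would raise
    -- "API disabled in the context of ngx.timer"
    local hostname = core.utils.gethostname() or ""
    return hostname
end
```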
```lua
    -- FIXME:
    -- Now the HTTP subsystem loads the stream plugin unintentionally, which shouldn't happen.
    -- It breaks the initialization logic of the plugin,
    -- here it is temporarily fixed using a workaround.
    if ngx.config.subsystem ~= "stream" then
        return
    end
```
Please create an issue for this. thx @SkyeYoung
I'll create it a little later
```lua
    local enabled = core.table.array_find(http_plugin_names, "prometheus") ~= nil
    local active = exporter.get_prometheus() ~= nil
    if not enabled then
        exporter.destroy()
    end
    if enabled and not active then
        exporter.http_init()
    end
```
Add some description under this comment explaining why we removed it and moved to init and destroy hooks.
The original code skipped the plugin.init() and old_plugin.destroy() hooks used in https://github.com/apache/apisix/blob/6fb9bf94281525c1fca397f681b4890b69440369/apisix/plugin.lua and implemented the reload of the prometheus plugin on its own, for some reason I have not yet understood (perhaps because prometheus.lua originally did not contain the two functions init and destroy).
The initial problem was that even after separating the init_prometheus part and placing it at the end of init_worker, directly calling exporter_timer() would still cause an error. After debugging, another initialization logic was found here, which is obviously redundant.
Currently, we provide init and destroy functions in prometheus.lua, allowing the initialization and reloading of the prometheus plugin to be handled within the plugin's own files, reducing coupling.
This also allows the prometheus plugin to revert to the mechanism provided by plugin.lua, reducing special cases, lowering the cost of understanding, and making the code easier to maintain.
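As a rough illustration of that mechanism, a sketch of the hook dispatch in apisix/plugin.lua (the hook names come from the comment above; the reload loop is heavily simplified and the function shape is an assumption):

```lua
-- simplified reload step for a single plugin module
local function reload_plugin(old_plugin, new_plugin)
    -- give the old module a chance to clean up its state,
    -- e.g. prometheus backing up its instance in exporter.destroy
    if old_plugin and old_plugin.destroy then
        old_plugin.destroy()
    end

    -- let the new module initialize itself,
    -- e.g. prometheus re-creating its instance and registering metrics
    if new_plugin.init then
        new_plugin.init()
    end
end
```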
require("apisix.plugins.prometheus.exporter").http_init(prometheus_enabled_in_stream) | ||
elseif not is_http and core.table.array_find(stream_plugin_names, "prometheus") then | ||
require("apisix.plugins.prometheus.exporter").stream_init() | ||
if is_http and (enabled_in_http or enabled_in_stream) then |
NOTE
We will always handle metrics generation only in the http subsystem.
- This ensures there is no duplicate execution across http and stream wasting compute resources.
- This simplifies the design.
- Whether or not the user has http enabled (i.e., whether or not APISIX is in stream-only mode), an http block for the Prometheus export API and its server block (:9091) will always be present, otherwise Prometheus would be pointless. This means we can always rely on an http subsystem context for the timer and periodic metrics generation, even in stream-only mode.
Please add some comments to the code to document the design intent. @SkyeYoung
done.
It might be better to add a link to this PR comment here: https://github.com/apache/apisix/pull/12383/files#r2221993953
@bzp2010 I think this part of the code can be found by checking the history, just like the old code.
```diff
@@ -35,6 +34,7 @@ local _M = {
     priority = 500,
     name = plugin_name,
     log = exporter.http_log,
+    destroy = exporter.destroy,
```
NOTE
This will always destroy the plugin (the prometheus instance inside it) when reloading it via the Admin API and loading it again based on the latest configuration.
If a reload is performed after the plugin has been removed from the configuration list, the plugin will not be restored until the next reload.
Technically, exporter.destroy just backs up that instance of the prometheus module by copying it to another variable.
This causes the export API to stop working; from then on it always returns {}, which is consistent with the current behavior.
Under the hood, the timer also stops working: it no longer generates metrics on the interval, so the metrics computation overhead introduced by APISIX is completely eliminated.
When the next plugin reload occurs, if prometheus is re-enabled, the timer resumes running.
As for the background timer introduced by the prometheus third-party library, unfortunately, it never stops running. It is registered with ngx.timer.every to synchronize the shdict at regular intervals, and this overhead cannot be paused or resumed by external intervention unless we fork and modify the library itself.
So this "destruction" does not mean that the prometheus instance is actually destroyed, the synchronization timer is stopped, and the shdict is cleared; none of this happens.
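A sketch of the backup-style destroy described above (the variable names are assumptions, not the merged code):

```lua
local _M = {}

local prometheus        -- live instance used by the export API and the timer
local prometheus_bkp    -- backup kept so a later reload can restore it

function _M.destroy()
    if prometheus then
        prometheus_bkp = prometheus
        -- from now on the export API answers "{}", and the timer loop,
        -- which checks this variable before rescheduling, will exit
        prometheus = nil
    end
end
```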
```diff
@@ -55,4 +55,11 @@ function _M.api()
 end


+function _M.init()
```
NOTE
We switched to using the built-in hooks of the plugin system, namely init, to initialize the prometheus instance and register the prometheus metrics.
Note, however, that data population only happens the first time the plugin starts (usually when the worker starts, i.e. the init_prometheus call in init.lua's http_init_worker) and on every timer run.
This initialization only registers the metrics; it doesn't actually populate the data.
```lua
function _M.init()
    local local_conf = core.config.local_conf()
    local enabled_in_stream = core.table.array_find(local_conf.stream_plugins, "prometheus")
    exporter.http_init(enabled_in_stream)
```
NOTE
The prometheus plugin, loaded by the http subsystem, registers the http metrics there, and decides whether to also register the stream (xrpc) metrics depending on whether the stream subsystem has been enabled.
This is mainly for metrics generation in the privileged process; stream data is not actually reported at any phase of the http subsystem.
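A sketch of that registration split (http_init's parameter appears in the diffs above; the helper functions are stubs I introduce for illustration):

```lua
-- assumed helpers, stubbed for the sketch
local function init_prometheus_instance() end
local function register_http_metrics() end
local function register_stream_metrics() end

local _M = {}

function _M.http_init(prometheus_enabled_in_stream)
    init_prometheus_instance()
    register_http_metrics()

    if prometheus_enabled_in_stream then
        -- stream (xrpc) metrics are registered here as well, because all
        -- generation happens in the http subsystem's privileged agent
        register_stream_metrics()
    end
end
```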
```lua
    local version, err = config:server_version()
    if version then
        metrics.etcd_reachable:set(1)
    if yieldable then
```
NOTE
The metrics include the etcd reachability report and the etcd latest modified index report, both of which rely on communication with etcd.
According to openresty's restrictions, yielding is prohibited in the init_worker phase, i.e. cosocket-based communication with etcd is not allowed there.
So we skip this capture here and leave it to the timer.
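A sketch of the skip (the yieldable flag appears in the diff; the function body is illustrative, and metrics is assumed to be the registry from the surrounding module):

```lua
local core = require("apisix.core")
local metrics = {}   -- assumed: metric objects from the surrounding module

-- yieldable is false on the synchronous init_worker run, true on timer runs
local function collect_etcd_metrics(yieldable)
    if not yieldable then
        -- init_worker must not yield, and the etcd checks below use
        -- cosockets, so leave these metrics unset until the first timer run
        return
    end

    local config = core.config.new()
    local version = config:server_version()
    metrics.etcd_reachable:set(version and 1 or 0)
end
```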
```lua
        return
    end

    if not prometheus then
```
NOTE
This is used for dynamically disabling the plugin (via the plugin reload API), i.e. what exporter.destroy does.
This is where we stop if the prometheus instance has been "destroyed", and as you can see, this happens before the next timer task is scheduled, which means the timer stops.
Technically, this is the advantage of ngx.timer.at over ngx.timer.every: every is not terminable, since the developer can't get an "instance" of a timer to pause or stop it.
But with ngx.timer.at, we can precisely control whether or not to schedule the next timed task, which allows us to stop the timer. If we need to resume it, we just re-execute ngx.timer.at(0).
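A sketch of the self-rescheduling pattern (the stop condition matches the diff; refresh_interval, the one-argument signature, and the body are assumptions):

```lua
local prometheus              -- assumed upvalue; nil after exporter.destroy
local refresh_interval = 15   -- assumed default, per the PR description

local exporter_timer
exporter_timer = function (premature)
    if premature then
        return
    end
    if not prometheus then
        -- destroyed via the reload API: returning before ngx.timer.at
        -- means no next run is scheduled, so the timer chain stops here
        return
    end

    -- ... refresh the cached metrics text in the shdict ...

    local ok, err = ngx.timer.at(refresh_interval, exporter_timer)
    if not ok then
        ngx.log(ngx.ERR, "failed to schedule the next metrics refresh: ", err)
    end
end
```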
```lua
        return
    end

    exporter_timer(false, false)
```
NOTE
The initialization of the timer performs an acquisition task synchronously, i.e. the first acquisition always occurs in the init_worker phase, which provides the initial metrics data.
If at any time the metrics data (the string in the prometheus-cache shdict) is not available, the API reports an error and logs it. By design this is very unlikely to happen.
```lua
    local cached_metrics_text = shdict_prometheus_cache:get(CACHED_METRICS_KEY)
    if not cached_metrics_text then
        core.log.error("Failed to retrieve cached metrics: data is nil")
        return 500, "Failed to retrieve metrics: no data available"
```
Would it be more standard to return JSON, as in https://github.com/apache/apisix/pull/12383/files#diff-390eaff60bfa1071dd1850bce9c7689452eaa13f07bf45a927921ecf05886d1bR580 ?
done
```diff
     if not prometheus then
-        core.response.exit(200, "{}")
+        return core.response.exit(200, "{}")
```
JFI, this behavior seems to be inconsistent with what is in get_cached_metrics, and we need to confirm which mode should be used. cc @membphis
BTW, prometheus becoming nil will happen when the plugin is dynamically disabled.
the current way is good to me
conf/config.yaml.example
```diff
@@ -170,6 +170,7 @@ nginx_config: # Config for render the template to generate n
     meta:
       lua_shared_dict: # Nginx Lua shared memory zone. Size units are m or k.
         prometheus-metrics: 15m
+        prometheus-cache: 10m
```
pls add some comments telling users when they need to modify it
done.
core.log.error("Failed to collect metrics: ", res) | ||
return | ||
end | ||
shdict_prometheus_cache:set(CACHED_METRICS_KEY, res) |
need to capture the return value, it may fail.
if err is set, tell the user what the reason is, and tell the user to change the default size if the shdict is too small
done
```lua
    local _, err, forcible = shdict_prometheus_cache:set(CACHED_METRICS_KEY, res)

    if err then
        core.log.error("Failed to save metrics to shdict: ", err)
```
need more information, e.g. the name of the shdict and the size of the value
if the value is too large, the set method call will fail
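A sketch of the kind of report being requested (the set call and its return values match the diff; the message wording is an assumption):

```lua
local _, err, forcible = shdict_prometheus_cache:set(CACHED_METRICS_KEY, res)
if err then
    -- include the shdict name and payload size so the user can tell whether
    -- the configured zone (prometheus-cache) is simply too small
    core.log.error("failed to save metrics to shdict prometheus-cache,",
                   " value size: ", #res, " bytes, err: ", err,
                   "; consider increasing the zone size in",
                   " nginx_config lua_shared_dict")
end
if forcible then
    core.log.warn("shdict prometheus-cache is low on memory;",
                  " other valid items were evicted to store the metrics")
end
```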
```diff
     core.response.set_header("content_type", "text/plain")
-    return 200, core.table.concat(prometheus:metric_data())
+    local cached_metrics_text = shdict_prometheus_cache:get(CACHED_METRICS_KEY)
```
just to confirm, is no "err" returned here?
I checked that part of the code and found that the second return value, marked as "flags" in the documentation, actually carries an error when one occurs.
Let me fix this part of the code.
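A sketch of that fix (per the lua-nginx-module docs, shdict:get returns value[, flags] on success and nil, err on failure; the JSON error body shape follows the earlier review suggestion and is an assumption):

```lua
local cached_metrics_text, err = shdict_prometheus_cache:get(CACHED_METRICS_KEY)
if err then
    core.log.error("failed to read cached metrics from shdict: ", err)
    return 500, {error_msg = "failed to retrieve metrics: " .. err}
end
```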
LGTM
```lua
    -- Clear the cached data after cache_exptime to prevent stale data in case of an error.
    local _, err, forcible = shdict_prometheus_cache:set(CACHED_METRICS_KEY, res, cache_exptime)
```
remove this blank line
done
require("apisix.plugins.prometheus.exporter").http_init(prometheus_enabled_in_stream) | ||
elseif not is_http and core.table.array_find(stream_plugin_names, "prometheus") then | ||
require("apisix.plugins.prometheus.exporter").stream_init() | ||
if is_http and (enabled_in_http or enabled_in_stream) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to add a link to this PR comment here.
https://github.com/apache/apisix/pull/12383/files#r2221993953
Description
This PR decouples the calculation and output processes of the Prometheus exporter. The "calculation" is performed in the privileged agent process at intervals defined by refresh_interval (default 15s) and written to a shared dict, while the "output" (i.e., the /apisix/prometheus/metrics API) is moved to the worker process, which only reads and returns the cached data from the shared dict.
The above are just the core changes. In fact, I encountered many other problems, which have been commented or annotated in the corresponding places and will not be repeated here.
For the testing part, since the Prometheus exporter now refreshes data every 15 seconds, I used a smaller interval in the relevant tests to keep the original tests passing.
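A sketch of how the split can be wired (ngx.process is a real lua-resty-core API; the function bodies, the process gating, and exporter_timer's signature are assumptions based on the description above):

```lua
local process = require("ngx.process")

local function exporter_timer(premature, yieldable)
    -- assumed: recompute metrics and write the rendered text into the
    -- prometheus-cache shdict, then reschedule itself with ngx.timer.at
end

local _M = {}

function _M.http_init(prometheus_enabled_in_stream)
    -- ... create the instance and register metrics ...

    if process.type() == "privileged agent" then
        -- calculation side: the first collection runs synchronously, then
        -- the timer keeps the shdict fresh every refresh_interval seconds
        exporter_timer(false, false)
    end

    -- output side: the /apisix/prometheus/metrics handler, running in the
    -- workers, only reads the cached text back from the shdict
end
```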
Which issue(s) this PR fixes:
Fixes #
Stress Testing
How to
1. Start etcd (./test.sh init-etcd) and nginx as upstream (./test.sh start-nginx)
2. Start APISIX (make run)
3. Create the route (./test.sh create)
4. Enable the prometheus plugin (./test.sh enable-prometheus)
5. Run the benchmark (./test.sh benchmark)
Results
Key Performance Indicators
Performance Impact Summary
Checklist