[BUG] long_callback short interval makes job run in loop #1769

@usr-ein

Description

I'm really excited to use long_callback because it helps a lot with my use cases, but while reading the docs (Example 5) I found an undocumented parameter, interval, which is set to 1000 in that example.

I tried setting it to 1000 too, but the progress updates were too slow, so I progressively lowered it. It behaved as I wanted at interval=100 (ms), but with that setting the job is immediately executed again as soon as it finishes, even though the launch button's event is not triggered.

Minimal replication code

#!/usr/bin/env python3
import time
import dash
import dash_html_components as html
import dash_core_components as dcc
from dash.long_callback import DiskcacheLongCallbackManager
from dash.dependencies import Input, Output
import plotly.graph_objects as go

## Diskcache
import diskcache
cache = diskcache.Cache("./cache")
long_callback_manager = DiskcacheLongCallbackManager(cache)

def make_progress_graph(progress, total):
    progress_graph = (
        go.Figure(data=[go.Bar(x=[progress])])
        .update_xaxes(range=[0, total])
        .update_yaxes(
            showticklabels=False,
        )
        .update_layout(height=100, margin=dict(t=20, b=40))
    )

    return progress_graph


app = dash.Dash(__name__, long_callback_manager=long_callback_manager)

app.layout = html.Div(
    [
        html.Div(
            [
                html.P(id="paragraph_id", children=["Button not clicked"]),
                dcc.Graph(id="progress_bar_graph", figure=make_progress_graph(0, 10)),
            ]
        ),
        html.Button(id="button_id", children="Run Job!"),
        html.Button(id="cancel_button_id", children="Cancel Running Job!"),
        html.Span(id="heavy-data-span", children="", style={"display": "none"}),
        html.Span(id="output-span", children="", style={"display": "none"}),
    ]
)

@app.long_callback(
    output=[
        Output("paragraph_id", "children"),
        Output("output-span", "children"),
    ],
    inputs=Input("button_id", "n_clicks"),
    running=[
        (Output("button_id", "disabled"), True, False),
        (Output("cancel_button_id", "disabled"), False, True),
        (
            Output("paragraph_id", "style"),
            {"visibility": "hidden"},
            {"visibility": "visible"},
        ),
        (
            Output("progress_bar_graph", "style"),
            {"visibility": "visible"},
            {"visibility": "hidden"},
        ),
    ],
    cancel=[Input("cancel_button_id", "n_clicks")],
    progress=[Output("progress_bar_graph", "figure"), Output("heavy-data-span", "children")],
    progress_default=(make_progress_graph(0, 20), "A"),
    interval=38,  # polling interval in ms; lowered until the job starts looping
    prevent_initial_call=True,
)
def callback(set_progress, n_clicks):
    total = 20

    for i in range(total):
        print(f"Running {i}/{total}...")
        time.sleep(0.3)
        set_progress((make_progress_graph(i, total), str(i%10)*4_000_000))

    return [f"Clicked {n_clicks} times", "A"*4_000_000]


if __name__ == "__main__":
    app.run_server(debug=True)

If you can't replicate this exact looping bug on your setup, try reducing the interval value until you see looping behaviour accompanied by diskcache I/O errors. Then raise it progressively until the errors disappear but the looping remains.
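For a sense of scale, here is a back-of-the-envelope sketch of the numbers in the script above (20 iterations, a 0.3 s sleep each, interval=38). It assumes interval is a polling period in milliseconds and ignores the time spent building figures, so the counts are rough lower bounds. The helper name polls_for_job is made up for illustration and is not part of Dash.

```python
def polls_for_job(iterations: int, sleep_ms: int, interval_ms: int):
    """Estimate how often the front end polls while one job runs.

    Hypothetical helper: it only does arithmetic on the values used
    in the replication script and does not call into Dash.
    """
    duration_ms = iterations * sleep_ms          # time spent in time.sleep()
    polls = duration_ms // interval_ms           # whole polls that fit in the job
    polls_per_update = sleep_ms / interval_ms    # polls between two set_progress calls
    return duration_ms, polls, polls_per_update


duration_ms, polls, polls_per_update = polls_for_job(
    iterations=20, sleep_ms=300, interval_ms=38
)
print(f"job sleeps for {duration_ms / 1000:.1f} s in total")
print(f"about {polls} polls happen during one job")
print(f"about {polls_per_update:.1f} polls per set_progress call")
```

With interval=1000, the same arithmetic gives 6 polls per job, which is why the progress bar felt slow at that setting.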

Here is what such diskcache I/O errors look like, though in my real application I never saw them. There simply were no warnings!

Traceback (most recent call last):
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/multiprocess/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/multiprocess/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/dash/long_callback/managers/diskcache_manager.py", line 143, in job_fn
    user_callback_output = fn(*maybe_progress, *user_callback_args)
  File "/Users/sam1902/Downloads/test_dash_bug.py", line 78, in callback
    set_progress(make_progress_graph(i, total))
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/dash/long_callback/managers/diskcache_manager.py", line 137, in _set_progress
    cache.set(progress_key, progress_value)
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/diskcache/core.py", line 799, in set
    with self._transact(retry, filename) as (sql, cleanup):
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/contextlib.py", line 117, in __enter__
    return next(self.gen)
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/diskcache/core.py", line 713, in _transact
    sql = self._sql
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/diskcache/core.py", line 651, in _sql
    return self._con.execute
  File "/Users/sam1902/.pyenv/versions/3.9.1/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/diskcache/core.py", line 626, in _con
    con = self._local.con = sqlite3.connect(
sqlite3.OperationalError: disk I/O error

These errors are not critical and do not crash the server. They simply end one of the processes, but the callback still loops back. They also happen anywhere within the loop (at any iteration, not just at the end).

Environment

pip list | grep dash:

dash                      2.0.0
dash-bootstrap-components 0.13.0
dash-core-components      2.0.0
dash-html-components      2.0.0
dash-table                5.0.0
diskcache                 5.2.1
psutil                    5.8.0
multiprocess              0.70.12.2
  • OS: macOS 11.5.2
  • Browser: replicated on
    • Firefox 92.0 (i.e. latest on 2021-09-21)
    • Chromium 93.0.4577.82 (i.e. latest on 2021-09-21)

Describe the bug

When interval is set too small, the callback runs, but once it finishes it runs again, and again, in a loop.
From my replication attempts, this seems to be caused by passing values that are too large to the long_callback's Output or progress=[Output(..)], either during progress updates or at the end of the callback.

I ran into this in my code because I deal with large images encoded as data URIs. Their size is approximated here by "A"*4_000_000, which should weigh around 3.8 MB, comparable to the size of my images.
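The size estimate can be checked with a few lines of Python. This is only a sketch of the arithmetic: it measures the raw string and its JSON encoding, on the assumption that the value is JSON-serialized on its way to the browser, and it says nothing about how Dash or diskcache store it internally.

```python
import json

# Same stand-in payload as in the replication script.
payload = "A" * 4_000_000

# ASCII characters take one byte each in UTF-8.
size_bytes = len(payload.encode("utf-8"))
size_mib = size_bytes / 2**20

# JSON encoding of a plain ASCII string only adds the two surrounding quotes.
json_bytes = len(json.dumps(payload).encode("utf-8"))

print(f"raw payload:  {size_bytes} bytes (~{size_mib:.2f} MiB)")
print(f"JSON payload: {json_bytes} bytes")
```

So the "3.8MB" figure is in units of 2**20 bytes; in decimal megabytes the same payload is 4.0 MB.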

Expected behavior

The callback should be called only once, no matter the progress callback frequency (i.e. interval time).

Screenshots
As you can see, it loops from Running 19/20 back to Running 0/20 without any warning or error:
Screenshot 2021-09-21 at 11 52 25
