
CI is broken in master #422

@curita

Issue

The CI checks fail in master. This is affecting new PRs (#421).

Reproduce

❯ python --version
Python 3.10.9
❯ python -m venv .venv
❯ source .venv/bin/activate
❯ pip install tox
❯ tox -e base

Traceback

❯ tox -e base
base: install_deps> python -I -m pip install Jinja2 pytest pytest-cov pytest-mock scrapy
.pkg: install_requires> python -I -m pip install 'setuptools>=40.8.0' wheel
.pkg: _optional_hooks> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: get_requires_for_build_sdist> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: get_requires_for_build_wheel> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: install_requires_for_build_wheel> python -I -m pip install wheel
.pkg: prepare_metadata_for_build_wheel> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
.pkg: build_sdist> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
base: install_package_deps> python -I -m pip install Jinja2 boto boto3 itemadapter 'jsonschema[format]>=3.2.0' premailer python-slugify requests scrapinghub scrapinghub-entrypoint-scrapy scrapy sentry-sdk slack-sdk
base: install_package> python -I -m pip install --force-reinstall --no-deps /Users/julia/src/spidermon/.tox/.tmp/package/1/spidermon-1.20.0.tar.gz
base: commands[0]> pytest -s --ignore=./tests/contrib --ignore=./tests/utils/test_zyte.py tests
================================================================================== test session starts ===================================================================================
platform darwin -- Python 3.10.9, pytest-7.4.2, pluggy-1.3.0
cachedir: .tox/base/.pytest_cache
Spidermon monitor filtering
rootdir: /Users/julia/src/spidermon
plugins: cov-4.1.0, mock-3.11.1
collected 384 items                                                                                                                                                                      

tests/test_actions.py ......
tests/test_add_field_coverage.py ..........
tests/test_data.py .........
tests/test_descriptions.py ...
tests/test_expressions.py ....
tests/test_extension.py FFFFFF
tests/test_item_scraped_signal.py ...............
tests/test_levels.py .
tests/test_loaders.py ...
tests/test_messagetranslator.py ...
tests/test_names.py ....
tests/test_ordering.py ..
tests/test_spidermon_signal_connect.py ......
tests/test_suites.py ........
tests/test_templateloader.py ...
tests/test_validators_jsonschema.py ................................................................................................................................................................................................................................................................
tests/utils/test_field_coverage.py ..
tests/utils/test_settings.py ......

======================================================================================== FAILURES ========================================================================================
__________________________________________________________________________ test_spider_opened_suites_should_run __________________________________________________________________________

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe320>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_opened_suites_should_run(get_crawler, suites):
        """The suites defined at spider_opened_suites should be loaded and run"""
        crawler = get_crawler()
        spidermon = Spidermon(crawler, spider_opened_suites=suites)
        spidermon.spider_opened_suites[0].run = mock.MagicMock()
>       spidermon.spider_opened(crawler.spider)

tests/test_extension.py:18: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
spidermon/contrib/scrapy/extensions.py:120: in spider_opened
    self._run_suites(spider, self.spider_opened_suites)
spidermon/contrib/scrapy/extensions.py:203: in _run_suites
    data = self._generate_data_for_spider(spider)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <spidermon.contrib.scrapy.extensions.Spidermon object at 0x127811270>, spider = <Spider 'dummy' at 0x1247c9030>

    def _generate_data_for_spider(self, spider):
        return {
>           "stats": self.crawler.stats.get_stats(spider),
            "stats_history": spider.stats_history
            if hasattr(spider, "stats_history")
            else [],
            "crawler": self.crawler,
            "spider": spider,
            "job": self.client.job if self.client.available else None,
        }
E       AttributeError: 'NoneType' object has no attribute 'get_stats'

spidermon/contrib/scrapy/extensions.py:210: AttributeError
__________________________________________________________________________ test_spider_closed_suites_should_run __________________________________________________________________________

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe440>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_closed_suites_should_run(get_crawler, suites):
        """The suites defined at spider_closed_suites should be loaded and run"""
        crawler = get_crawler()
        spidermon = Spidermon(
            crawler, spider_opened_suites=suites, spider_closed_suites=suites
        )
        spidermon.spider_closed_suites[0].run = mock.MagicMock()
>       spidermon.spider_opened(crawler.spider)

tests/test_extension.py:30: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
spidermon/contrib/scrapy/extensions.py:120: in spider_opened
    self._run_suites(spider, self.spider_opened_suites)
spidermon/contrib/scrapy/extensions.py:203: in _run_suites
    data = self._generate_data_for_spider(spider)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <spidermon.contrib.scrapy.extensions.Spidermon object at 0x12787f3d0>, spider = <Spider 'dummy' at 0x127835390>

    def _generate_data_for_spider(self, spider):
        return {
>           "stats": self.crawler.stats.get_stats(spider),
            "stats_history": spider.stats_history
            if hasattr(spider, "stats_history")
            else [],
            "crawler": self.crawler,
            "spider": spider,
            "job": self.client.job if self.client.available else None,
        }
E       AttributeError: 'NoneType' object has no attribute 'get_stats'

spidermon/contrib/scrapy/extensions.py:210: AttributeError
_________________________________________________________________________ test_engine_stopped_suites_should_run __________________________________________________________________________

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe950>, suites = ['tests.fixtures.suites.Suite01']

    def test_engine_stopped_suites_should_run(get_crawler, suites):
        """The suites defined at engine_stopped_suites should be loaded and run"""
        crawler = get_crawler()
        spidermon = Spidermon(crawler, engine_stopped_suites=suites)
        spidermon.engine_stopped_suites[0].run = mock.MagicMock()
>       spidermon.engine_stopped()

tests/test_extension.py:41: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
spidermon/contrib/scrapy/extensions.py:136: in engine_stopped
    self._run_suites(spider, self.engine_stopped_suites)
spidermon/contrib/scrapy/extensions.py:203: in _run_suites
    data = self._generate_data_for_spider(spider)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <spidermon.contrib.scrapy.extensions.Spidermon object at 0x12779a590>, spider = <Spider 'dummy' at 0x127798a60>

    def _generate_data_for_spider(self, spider):
        return {
>           "stats": self.crawler.stats.get_stats(spider),
            "stats_history": spider.stats_history
            if hasattr(spider, "stats_history")
            else [],
            "crawler": self.crawler,
            "spider": spider,
            "job": self.client.job if self.client.available else None,
        }
E       AttributeError: 'NoneType' object has no attribute 'get_stats'

spidermon/contrib/scrapy/extensions.py:210: AttributeError
____________________________________________________________________ test_spider_opened_suites_should_run_from_signal ____________________________________________________________________

self = <MagicMock id='4957297904'>, args = (<ANY>,), kwargs = {}, msg = "Expected 'mock' to be called once. Called 0 times."

    def assert_called_once_with(self, /, *args, **kwargs):
        """assert that the mock was called exactly once and that that call was
        with the specified arguments."""
        if not self.call_count == 1:
            msg = ("Expected '%s' to be called once. Called %s times.%s"
                   % (self._mock_name or 'mock',
                      self.call_count,
                      self._calls_repr()))
>           raise AssertionError(msg)
E           AssertionError: Expected 'mock' to be called once. Called 0 times.

../../.pyenv/versions/3.10.9/lib/python3.10/unittest/mock.py:940: AssertionError

During handling of the above exception, another exception occurred:

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe710>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_opened_suites_should_run_from_signal(get_crawler, suites):
        """The suites defined at SPIDERMON_SPIDER_OPEN_MONITORS setting should be loaded and run"""
        settings = {"SPIDERMON_SPIDER_OPEN_MONITORS": suites}
        crawler = get_crawler(settings)
        spidermon = Spidermon.from_crawler(crawler)
        spidermon.spider_opened_suites[0].run = mock.MagicMock()
        crawler.signals.send_catch_log(signal=signals.spider_opened, spider=crawler.spider)
>       spidermon.spider_opened_suites[0].run.assert_called_once_with(mock.ANY)
E       AssertionError: Expected 'mock' to be called once. Called 0 times.

tests/test_extension.py:53: AssertionError
----------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------
ERROR    scrapy.utils.signal:signal.py:59 Error caught on signal handler: <bound method Spidermon.spider_opened of <spidermon.contrib.scrapy.extensions.Spidermon object at 0x1277a4520>>
Traceback (most recent call last):
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/scrapy/utils/signal.py", line 46, in send_catch_log
    response = robustApply(
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 120, in spider_opened
    self._run_suites(spider, self.spider_opened_suites)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 203, in _run_suites
    data = self._generate_data_for_spider(spider)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 210, in _generate_data_for_spider
    "stats": self.crawler.stats.get_stats(spider),
AttributeError: 'NoneType' object has no attribute 'get_stats'
____________________________________________________________________ test_spider_closed_suites_should_run_from_signal ____________________________________________________________________

self = <MagicMock id='4958067808'>, args = (<ANY>,), kwargs = {}, msg = "Expected 'mock' to be called once. Called 0 times."

    def assert_called_once_with(self, /, *args, **kwargs):
        """assert that the mock was called exactly once and that that call was
        with the specified arguments."""
        if not self.call_count == 1:
            msg = ("Expected '%s' to be called once. Called %s times.%s"
                   % (self._mock_name or 'mock',
                      self.call_count,
                      self._calls_repr()))
>           raise AssertionError(msg)
E           AssertionError: Expected 'mock' to be called once. Called 0 times.

../../.pyenv/versions/3.10.9/lib/python3.10/unittest/mock.py:940: AssertionError

During handling of the above exception, another exception occurred:

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe8c0>, suites = ['tests.fixtures.suites.Suite01']

    def test_spider_closed_suites_should_run_from_signal(get_crawler, suites):
        """The suites defined at SPIDERMON_SPIDER_CLOSE_MONITORS setting should be loaded and run"""
        settings = {"SPIDERMON_SPIDER_CLOSE_MONITORS": suites}
        crawler = get_crawler(settings)
        spidermon = Spidermon.from_crawler(crawler)
        spidermon.spider_closed_suites[0].run = mock.MagicMock()
        crawler.signals.send_catch_log(signal=signals.spider_closed, spider=crawler.spider)
>       spidermon.spider_closed_suites[0].run.assert_called_once_with(mock.ANY)
E       AssertionError: Expected 'mock' to be called once. Called 0 times.

tests/test_extension.py:63: AssertionError
----------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------
ERROR    scrapy.utils.signal:signal.py:59 Error caught on signal handler: <bound method Spidermon.spider_closed of <spidermon.contrib.scrapy.extensions.Spidermon object at 0x127860220>>
Traceback (most recent call last):
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/scrapy/utils/signal.py", line 46, in send_catch_log
    response = robustApply(
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 128, in spider_closed
    self._add_field_coverage_to_stats()
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 181, in _add_field_coverage_to_stats
    stats = self.crawler.stats.get_stats()
AttributeError: 'NoneType' object has no attribute 'get_stats'
___________________________________________________________________ test_engine_stopped_suites_should_run_from_signal ____________________________________________________________________

self = <MagicMock id='4957251824'>, args = (<ANY>,), kwargs = {}, msg = "Expected 'mock' to be called once. Called 0 times."

    def assert_called_once_with(self, /, *args, **kwargs):
        """assert that the mock was called exactly once and that that call was
        with the specified arguments."""
        if not self.call_count == 1:
            msg = ("Expected '%s' to be called once. Called %s times.%s"
                   % (self._mock_name or 'mock',
                      self.call_count,
                      self._calls_repr()))
>           raise AssertionError(msg)
E           AssertionError: Expected 'mock' to be called once. Called 0 times.

../../.pyenv/versions/3.10.9/lib/python3.10/unittest/mock.py:940: AssertionError

During handling of the above exception, another exception occurred:

get_crawler = <function get_crawler.<locals>._crawler at 0x1277fe950>, suites = ['tests.fixtures.suites.Suite01']

    def test_engine_stopped_suites_should_run_from_signal(get_crawler, suites):
        """The suites defined at SPIDERMON_ENGINE_STOP_MONITORS setting should be loaded and run"""
        settings = {"SPIDERMON_ENGINE_STOP_MONITORS": suites}
        crawler = get_crawler(settings)
        spidermon = Spidermon.from_crawler(crawler)
        spidermon.engine_stopped_suites[0].run = mock.MagicMock()
        crawler.signals.send_catch_log(signal=signals.engine_stopped, spider=crawler.spider)
>       spidermon.engine_stopped_suites[0].run.assert_called_once_with(mock.ANY)
E       AssertionError: Expected 'mock' to be called once. Called 0 times.

tests/test_extension.py:73: AssertionError
----------------------------------------------------------------------------------- Captured log call ------------------------------------------------------------------------------------
ERROR    scrapy.utils.signal:signal.py:59 Error caught on signal handler: <bound method Spidermon.engine_stopped of <spidermon.contrib.scrapy.extensions.Spidermon object at 0x12779afe0>>
Traceback (most recent call last):
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/scrapy/utils/signal.py", line 46, in send_catch_log
    response = robustApply(
  File "/Users/julia/src/spidermon/.tox/base/lib/python3.10/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 136, in engine_stopped
    self._run_suites(spider, self.engine_stopped_suites)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 203, in _run_suites
    data = self._generate_data_for_spider(spider)
  File "/Users/julia/src/spidermon/spidermon/contrib/scrapy/extensions.py", line 210, in _generate_data_for_spider
    "stats": self.crawler.stats.get_stats(spider),
AttributeError: 'NoneType' object has no attribute 'get_stats'
==================================================================================== warnings summary ====================================================================================
spidermon/contrib/pytest/plugins/filter_monitors.py:10
  /Users/julia/src/spidermon/spidermon/contrib/pytest/plugins/filter_monitors.py:10: PytestDeprecationWarning: The hookimpl pytest_collection_modifyitems uses old-style configuration options (marks or attributes).
  Please use the pytest.hookimpl(trylast=True) decorator instead
   to configure the hooks.
   See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
    @pytest.mark.trylast

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================ short test summary info =================================================================================
FAILED tests/test_extension.py::test_spider_opened_suites_should_run - AttributeError: 'NoneType' object has no attribute 'get_stats'
FAILED tests/test_extension.py::test_spider_closed_suites_should_run - AttributeError: 'NoneType' object has no attribute 'get_stats'
FAILED tests/test_extension.py::test_engine_stopped_suites_should_run - AttributeError: 'NoneType' object has no attribute 'get_stats'
FAILED tests/test_extension.py::test_spider_opened_suites_should_run_from_signal - AssertionError: Expected 'mock' to be called once. Called 0 times.
FAILED tests/test_extension.py::test_spider_closed_suites_should_run_from_signal - AssertionError: Expected 'mock' to be called once. Called 0 times.
FAILED tests/test_extension.py::test_engine_stopped_suites_should_run_from_signal - AssertionError: Expected 'mock' to be called once. Called 0 times.
======================================================================== 6 failed, 341 passed, 1 warning in 1.78s ========================================================================
base: exit 1 (3.90 seconds) /Users/julia/src/spidermon> pytest -s --ignore=./tests/contrib --ignore=./tests/utils/test_zyte.py tests pid=81920
.pkg: _exit> python /Users/julia/src/spidermon/.venv/lib/python3.10/site-packages/pyproject_api/_backend.py True setuptools.build_meta __legacy__
  base: FAIL code 1 (55.27=setup[51.37]+cmd[3.90] seconds)
  evaluation failed :( (55.38 seconds)

Initial Diagnosis

It looks like self.crawler.stats is None, which causes problems further down the line. The crawler created by the get_crawler() fixture (in conftest.py) apparently doesn't have a stats instance initialized yet.

This may be something that changed in recent Scrapy versions, and it's surfacing now because Scrapy isn't pinned in the test dependencies.
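The failure mode can be reproduced with a stdlib-only sketch (DummyCrawler and DummyStats below are hypothetical stand-ins for scrapy.crawler.Crawler and a stats collector, not the actual fixture code): if crawler.stats is left as None until the crawl starts, any unguarded call to stats.get_stats() raises the AttributeError seen above. One possible mitigation, assuming we want the extension to tolerate a missing stats collector, is a guard like this:

```python
class DummyStats:
    """Minimal stand-in for a Scrapy stats collector."""

    def __init__(self):
        self._stats = {}

    def get_stats(self, spider=None):
        return self._stats


class DummyCrawler:
    """Stand-in for scrapy.crawler.Crawler; in recent Scrapy,
    crawler.stats can be None until the crawl actually runs."""

    def __init__(self, stats=None):
        self.stats = stats


def generate_data_for_spider(crawler, spider):
    # Guarded variant of Spidermon._generate_data_for_spider: fall back
    # to an empty dict instead of raising AttributeError when
    # crawler.stats is None. (Sketch only; the real fix may instead be
    # to initialize stats in the get_crawler() test fixture.)
    stats = crawler.stats.get_stats(spider) if crawler.stats else {}
    return {
        "stats": stats,
        "crawler": crawler,
        "spider": spider,
    }


# Unguarded access on a stats-less crawler reproduces the crash:
# DummyCrawler().stats.get_stats()  ->  AttributeError
data = generate_data_for_spider(DummyCrawler(), spider=None)
print(data["stats"])  # empty dict instead of a crash
```

Alternatively, the fixture itself could attach a stats collector to the crawler before the tests run, which would keep the extension code unchanged and match the behavior of older Scrapy versions.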
