Skip to content

Add watchdog lower limit timeout test #11203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Oct 31, 2019

Conversation

Tharazi97
Copy link
Contributor

@Tharazi97 Tharazi97 commented Aug 12, 2019

Description

Currently there is no test that checks if watchdog on the device meets the assumption specified in docs: "The watchdog should trigger at or after the timeout value.".
This PR adds one to the base of watchdog reset tests.
It checks whether the watchdog did not restart the device before the window left to the end of timeout. If watchdog restarted the device, the test will time out.

Targets with not calibrated watchdog clock won't pass this test. This is the case for some STM MCU, default calibration is not accurate at all, so we should probably:

  1. Change the watchdog API of these targets so it meets assumptions specified in docs.
  2. Make this test optional (specify that these targets won't have their watchdog trigger at or after timeout, but in between half value of timeout to 2 times timeout value).
  3. Relax assumptions.

Pull request type

[ ] Fix
[ ] Refactor
[ ] Target update
[X] Functionality change
[ ] Docs update
[ ] Test update
[ ] Breaking change

Reviewers

@fkjagodzinski @0xc0170 @maciejbocianski @jamesbeyond

Release Notes

watchdog_features_t extended by two uint32_t members: clock_typical_frequency and clock_max_frequency. These values can be used to determine the accuracy of an uncalibrated watchdog clock.

@ciarmcom
Copy link
Member

@Tharazi97, thank you for your changes.
@maciejbocianski @jamesbeyond @fkjagodzinski @0xc0170 @ARMmbed/mbed-os-core @ARMmbed/mbed-os-hal @ARMmbed/mbed-os-test @ARMmbed/mbed-os-maintainers please review.

@Tharazi97 Tharazi97 force-pushed the Watchdog_lower_limit_timeout_test branch from 2611bba to bb32361 Compare August 20, 2019 11:52
@0xc0170
Copy link
Contributor

0xc0170 commented Aug 20, 2019

STM targets won't pass this test, so we should probably:
Change the STM watchdog API so it meets assumptions specified in docs.
Make this test optional (specify that STM targets won't have their watchdog trigger at or after timeout, but in between half value of timeout to 2 times timeout value).
Relax assumptions.

@Tharazi97 if we can fix it, would be great. Just need more details.
@ARMmbed/team-st-mcd Please review

@0xc0170 0xc0170 removed request for a team August 20, 2019 11:57
@0xc0170
Copy link
Contributor

0xc0170 commented Aug 20, 2019

I reduced reviewers. Please review

@Tharazi97
Copy link
Contributor Author

@0xc0170
According to documentation watchdog should trigger at or after timeout value. (https://os.mbed.com/docs/mbed-os/v5.13/porting/watchdog-port.html)
STM targets have their watchdog clocked by LSI oscillator and its frequency may vary very much. So if we program watchdog to trigger at, or after 100 ms timeout it may physically trigger i. e. even after 70ms. So assumption that "The watchdog should trigger at or after the timeout value." is incorrect.

Test that I added checks if watchdog is physically triggering after the timeout value, if not it chcecks how close we are to this value.

@Tharazi97
Copy link
Contributor Author

We can change the assumption to: "The watchdog should trigger at or after the 75% timeout value.", or: "The watchdog should trigger at or after the timeout value (not applicable STM devices)."

@jeromecoutant
Copy link
Collaborator

STM targets won't pass this test

I would prefer to change this line...
Targets with not calibrated watchdog clock won't pass this test.
This is the case for some STM MCU, default calibration is not accurate at all.

Thx

@jeromecoutant
Copy link
Collaborator

Maybe we could add some frequency accuracy parameter in watchdog_features_t that test can check ?

@Tharazi97
Copy link
Contributor Author

@jeromecoutant
I think that could work. Add this parameter, change the test and change the assumption to: The watchdog should trigger at or after the timeout value times frequency accuracy. Or something like that. What do you think @jamesbeyond @fkjagodzinski?

@@ -64,6 +64,16 @@ void test_restart_reset();
*/
void test_kick_reset();

/** Test Watchdog timeout
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to duplicate this test for the driver layer? As I see it, this test simply checks the actual hardware performance, right? Both HAL and driver APIs are verified by other test cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -264,6 +266,48 @@ void test_kick_reset()
TEST_ASSERT_MESSAGE(0, "Watchdog did not reset the device as expected.");
}

template<uint32_t window>
void test_timeout()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a timing test, I think it should be moved to TESTS/mbed_hal/watchdog_timing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -42,6 +42,8 @@
*/
#define KICK_ADVANCE_MS 35UL

#define TIMEOUT_LOWER_LIMIT 1000UL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the convention of adding a unit suffix.

Suggested change
#define TIMEOUT_LOWER_LIMIT 1000UL
#define TIMEOUT_LOWER_LIMIT_MS 1000UL

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -310,6 +354,12 @@ Case cases[] = {
#endif
Case("Watchdog started again", case_setup, test_restart_reset),
Case("Kicking the Watchdog prevents reset", case_setup, test_kick_reset),
Case("timeout accuracy 20%", case_setup, test_timeout<200UL>),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we know, for some targets these tests will fail for sure if the watchdog clock is left uncalibrated. I think we should add #if MBED_EXTENDED_TESTS here, to skip these tests on the CI.


// Kick watchdog before timeout.
// Watchdog should not trigger before timeout.
// If device restarts while waiting for the kick, test fails.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently this produces an ERROR test case result instead of FAIL. Could you update that?

@fkjagodzinski
Copy link
Member

Maybe we could add some frequency accuracy parameter in watchdog_features_t that test can check ?

That seems useful; would help the user to determine the watchdog timing capabilities. @donatieng, @0xc0170, what do you think?

@0xc0170
Copy link
Contributor

0xc0170 commented Aug 22, 2019

What would this number define and how it would be used? Illustration might help

@Tharazi97
Copy link
Contributor Author

LSIstm32f073RZ

So for example this is the documentation of STM32L073RZ. As we can see typical frequency is 38 kHz, but it may vary from 26 to 56 kHz. In its implementation of the watchdog LSI value is defined as 38 kHz, so ratio of defined value to max value is equal to 67% (rounding down).

I would name this parameter clock_accuracy and it would mean how accurate is not calibrated watchdog clock. The bigger value, the watchdog clock is more precise.

Practical use of this value would be to determine when to kick watchdog when it is not calibrated, so the device would not restart when not expected.

With this value, the test would look something like this:

void test_timeout()
{
    watchdog_features_t features = hal_watchdog_get_platform_features();
    if (TIMEOUT_LOWER_LIMIT_MS > features.max_timeout) {
        TEST_IGNORE_MESSAGE("Requested timeout value not supported for this target -- ignoring test case.");
        return;
    }

    // Phase 2. -- verify the test results.
    if (current_case.received_data != CASE_DATA_INVALID) {
        TEST_ASSERT_EQUAL(CASE_DATA_PHASE2_OK, current_case.received_data);
        current_case.received_data = CASE_DATA_INVALID;
        return;
    }

    // Phase 1. -- run the test code.
    watchdog_config_t config = { TIMEOUT_LOWER_LIMIT_MS };
    TEST_ASSERT_EQUAL(WATCHDOG_STATUS_OK, hal_watchdog_init(&config));

    // Kick watchdog before timeout.
    // Watchdog should not trigger before timeout * clock accuracy.
    // If device restarts while waiting for the kick, test fails.
    wait_ms(TIMEOUT_LOWER_LIMIT_MS * features.clock_accuracy / 100);
    hal_watchdog_kick();

    if (send_reset_notification(&current_case, 2 * TIMEOUT_MS) == false) {
        TEST_ASSERT_MESSAGE(0, "Dev-host communication error.");
        return;
    }
    hal_watchdog_kick();

    // Watchdog should fire before twice the timeout value.
    wait_ms(2 * TIMEOUT_LOWER_LIMIT_MS);

    // Watchdog reset should have occurred during that wait() above;

    hal_watchdog_kick();  // Just to buy some time for testsuite failure handling.
    TEST_ASSERT_MESSAGE(0, "Watchdog did not reset the device as expected.");
}

@Tharazi97 Tharazi97 force-pushed the Watchdog_lower_limit_timeout_test branch from bb32361 to 944e8f2 Compare September 2, 2019 10:14
Copy link
Contributor

@0xc0170 0xc0170 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clock accuracy - is this in percentage (0-100) ? Not documented in hal header file

@@ -136,6 +136,9 @@
#endif /* LSI_VALUE */ /*!< Value of the Internal Low Speed oscillator in Hz
The real value may vary depending on the variations
in voltage and temperature.*/
#if !defined (LSI_ACCURACY)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should not be touching these device files as they get updated from upstream. This could be defined in our codebase (config for instance), and used in the watchdog HAL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ARMmbed/team-st-mcd please review

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, don't patch files in device directory (except bugs)
Thx

@0xc0170 0xc0170 requested a review from a team September 2, 2019 10:57
@0xc0170
Copy link
Contributor

0xc0170 commented Sep 2, 2019

@ARMmbed/team-st-mcd Please review

@fkjagodzinski
Copy link
Member

Silabs build errors are fixed now. As for the release notes -- I couldn't edit the original PR description. @adbridge, @0xc0170, could you paste this note there?:

Release Notes

watchdog_features_t extended by two uint32_t members: clock_typical_frequency and clock_max_frequency. These values can be used to determine the accuracy of an uncalibrated watchdog clock.

@fkjagodzinski
Copy link
Member

Minor Watchdog docs update in ARMmbed/mbed-os-5-docs#1155.

@Tharazi97
Copy link
Contributor Author

Hi, sorry for not responding. Thank you @fkjagodzinski for taking care of this functionality.

@0xc0170 0xc0170 added release-version: 6.0.0-alpha-1 First pre-release version of 6.0.0 needs: CI and removed needs: work labels Oct 18, 2019
@0xc0170
Copy link
Contributor

0xc0170 commented Oct 18, 2019

👍 for the update, will schedule CI soon

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 21, 2019

CI started

@mbed-ci
Copy link

mbed-ci commented Oct 21, 2019

Test run: FAILED

Summary: 1 of 11 test jobs failed
Build number : 2
Build artifacts

Failed test jobs:

  • jenkins-ci/mbed-os-ci_greentea-test

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 21, 2019

Failures related, tests-mbed_hal-watchdog_timing failing

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 28, 2019

@Tharazi97 Any update?

@fkjagodzinski
Copy link
Member

@0xc0170, I'm looking into this failure right now. Will post an update shortly.

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 30, 2019

CI restarted

@fkjagodzinski
Copy link
Member

fkjagodzinski commented Oct 30, 2019

I updated the new test case (added in this PR) so that sleep or deepsleep mode will not be active during the wait for the watchdog reset.

Testing a watchdog performance in both sleep modes is taken care of in a different test suite. As for the K64F, I've created an issue for the deepsleep watchdog case -- #11774.

@mbed-ci
Copy link

mbed-ci commented Oct 30, 2019

Test run: SUCCESS

Summary: 11 of 11 test jobs passed
Build number : 3
Build artifacts

@0xc0170
Copy link
Contributor

0xc0170 commented Oct 31, 2019

Travis has not reported back, I restarted it

@0xc0170 0xc0170 merged commit eea8300 into ARMmbed:master Oct 31, 2019
@fkjagodzinski fkjagodzinski deleted the Watchdog_lower_limit_timeout_test branch October 31, 2019 14:39
@0xc0170 0xc0170 added release-version: 5.15.0-rc1 and removed release-version: 6.0.0-alpha-1 First pre-release version of 6.0.0 labels Nov 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants