HeterogeneousCore/SonicTriton: add RetryActionDiffServer; expose connectToServer; update tests #21
…r method in TritonClient. Update BuildFile.xml and fix formatting in header files.
…tructor for TritonClient, and update BuildFile.xml to include Catch2 for testing.
…tests; remove old cfg
Preliminary comments:
alt_server_url_ = conf.getUntrackedParameter<std::string>("altServerUrl", "");
alt_server_token_ = conf.getUntrackedParameter<std::string>("altServerToken", "");
These parameters should be removed. The alternative server URLs should be obtained from the TritonService, which keeps a master list of all known servers (rather than each module/client keeping its own list inside of its RetryAction(s)).
N.B. the function TritonClient::updateServer() in #19 is provided for this purpose.
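To illustrate the ownership pattern suggested above, here is a minimal standalone sketch in which the retry action holds no URLs of its own and asks a central registry for a fallback. Everything in it is an assumption made for illustration: `ServerRegistry`, `ServerInfo`, `DiffServerRetry`, `alternativeTo`, `nextUrl`, and the URLs are hypothetical names, not the actual TritonService or TritonClient interfaces, and the signature of `TritonClient::updateServer()` from #19 is deliberately not reproduced here.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the service-level master list of servers.
// None of these names are taken from TritonService itself.
struct ServerInfo {
  std::string name;
  std::string url;
};

class ServerRegistry {
public:
  void addServer(std::string name, std::string url) {
    servers_.push_back({std::move(name), std::move(url)});
  }

  // Return the first known server whose URL differs from the failed one,
  // or std::nullopt if no alternative is registered.
  std::optional<ServerInfo> alternativeTo(const std::string& failedUrl) const {
    for (const auto& server : servers_) {
      if (server.url != failedUrl)
        return server;
    }
    return std::nullopt;
  }

private:
  std::vector<ServerInfo> servers_;
};

// Hypothetical retry action: it keeps no server list and queries the
// registry when the current server fails.
class DiffServerRetry {
public:
  explicit DiffServerRetry(const ServerRegistry& registry) : registry_(registry) {}

  // Returns the URL to reconnect to, or an empty string if there is none.
  std::string nextUrl(const std::string& failedUrl) const {
    auto alt = registry_.alternativeTo(failedUrl);
    return alt ? alt->url : std::string{};
  }

private:
  const ServerRegistry& registry_;
};

// Small usage check with made-up URLs.
inline void registryExample() {
  ServerRegistry registry;
  registry.addServer("primary", "host-a:8001");
  registry.addServer("backup", "host-b:8001");
  DiffServerRetry retry(registry);
  assert(retry.nextUrl("host-a:8001") == "host-b:8001");
}
```

With this layout, adding or removing a server touches only the registry, and per-module parameters such as altServerUrl become unnecessary.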
</export>

<test name="RetryActionDiffServer_test" command="RetryActionDiffServer.cc"/>
Tests should be defined in HeterogeneousCore/SonicTriton/test/BuildFile.xml, not the package-level BuildFile.xml.
and the correct syntax is:
<bin file="test_RetryActionDiffServer.cc" name="TestHeterogeneousCoreSonicTritonRetryActionDiffServer">
<use name="catch2"/>
<use name="FWCore/ParameterSet"/>
<use name="HeterogeneousCore/SonicTriton"/>
</bin>
@@ -1,5 +1,6 @@
<test name="TestHeterogeneousCoreSonicTritonProducerCPU" command="unittest.sh ${LOCALTOP} CPU"/>
<test name="TestHeterogeneousCoreSonicTritonProducerGPU" command="unittest.sh ${LOCALTOP} GPU"/>
<test name="TestHeterogeneousCoreSonicTritonRetryAction" command="unittest.sh ${LOCALTOP} CPU tritonRetryActionTest_cfg.py"/>
The file tritonRetryActionTest_cfg.py is not committed here. Preferably, the existing tritonTest_cfg.py should be adapted/extended to perform these tests as well, rather than duplicating functionality.
 * parameters.
 * @param is_testing A boolean flag to select this constructor.
 */
TritonClient(bool is_testing);
Could this just be a protected default constructor (with no arguments)? is_testing is never actually used.
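For reference, a minimal sketch of what a protected default constructor could look like, using a simplified stand-in class rather than the real TritonClient. The names `Client`, `TestClient`, and `setUrl`, as well as the members shown, are hypothetical and chosen only to illustrate the access pattern.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Simplified stand-in for the client class; not the real TritonClient.
class Client {
public:
  explicit Client(std::string url) : url_(std::move(url)), connected_(true) {}
  virtual ~Client() = default;

  const std::string& url() const { return url_; }
  bool connected() const { return connected_; }

protected:
  // Reachable only from derived classes (e.g. test doubles),
  // so no dummy argument is needed to select it.
  Client() = default;

  std::string url_;
  bool connected_ = false;
};

// Hypothetical test double that uses the protected constructor.
class TestClient : public Client {
public:
  TestClient() = default;
  void setUrl(std::string url) { url_ = std::move(url); }
};

// Outside code cannot default-construct the base class directly.
static_assert(!std::is_default_constructible_v<Client>,
              "Client() is protected and unavailable to ordinary callers");

inline void constructorExample() {
  TestClient client;  // starts unconnected, with an empty URL
  assert(!client.connected());
  client.setUrl("host-b:8001");  // made-up URL
  assert(client.url() == "host-b:8001");
}
```

The unit test can then derive from the client class, while production code is still forced through the configured constructor.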
};
}

TEST_CASE("Test RetryActionDiffServer Logic", "[RetryActionDiffServer]") {
This is a nice setup, so I think we should keep it (in addition to adding to tritonTest_cfg.py, which is somewhere between a unit test and a full integration test).
@@ -0,0 +1,109 @@
#define CATCH_CONFIG_MAIN
This file should be renamed test_RetryActionDiffServer.cc or similar.
@@ -7,9 +7,13 @@
<use name="HeterogeneousCore/CUDAUtilities"/>
<use name="triton-inference-client"/>
<use name="protobuf"/>
<use name="catch2"/>
This should be moved to test/BuildFile.xml as indicated in other comments (i.e. removed from here).
Title
HeterogeneousCore/SonicTriton: add RetryActionDiffServer; expose connectToServer; update tests
Body
PR description
- Add RetryActionDiffServer to switch to an alternative Triton server upon failure.
- Expose TritonClient::connectToServer(std::string url) and add a testing constructor to enable unit tests.
- Update BuildFile.xml; remove obsolete tritonRetryActionTest_cfg.py.
PR validation
- CMSSW_15_0_0_pre3 (scram b -j).
Backport
Reviewers: @jmduarte @kpedro88