Adding x509_cert input plugin#3768
Conversation
0691f34 to
44af23c
Compare
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| @@ -0,0 +1,155 @@ | |||
| package ssl_cert | |||
There was a problem hiding this comment.
I can see that some packages in the repository do not conform with the standards described in Effective Go - Package names , but I would suggest to drop the underscore.
-> ssl_certsslcert
There was a problem hiding this comment.
I will keep the underscore in the package name because the name appears in the config file and I believe that it improves readability of the config file.
| AoQUXWIiFBoFfpNVcvUgHyGGc1hv7TX8Eh8KGo2+VuPzxnFuRrbbYCs= | ||
| -----END RSA PRIVATE KEY-----` | ||
|
|
||
| func getTestPrefix(testN int) string { |
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| for _, file := range sc.Files { | ||
| cert, err := getLocalCert(file) | ||
| if err != nil { | ||
| return errors.New(fmt.Sprintf("Cannot get local SSL cert: %s", err)) |
There was a problem hiding this comment.
You could use fmt.Errorf.
Here is another alternative, which is not using fmt: errors.New("my error message: " + err.Error()). In this case you could remove the fmt package.
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| for _, server := range sc.Servers { | ||
| cert, err := getRemoteCert(server, sc.Timeout*time.Second, sc.CloseConn) | ||
| if err != nil { | ||
| return errors.New(fmt.Sprintf("Cannot get remote SSL cert: %s", err)) |
There was a problem hiding this comment.
Error messages should not be treated as sentences:
- should not start with uppercase unless the first word has special meaning, i.e. exported method/struct name etc.
- should end with
.,!,?etc.
Think about the recommended error message style: [caller 2 error message]:[caller 1 error message]:[original error message]
See here
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| @@ -0,0 +1,155 @@ | |||
| package ssl_cert | |||
There was a problem hiding this comment.
I think this is not that important as it's quite clear what the package is (it's an input plugin) and further description is provided by the Description function.
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| // For tests | ||
| CloseConn bool | ||
| } | ||
|
|
There was a problem hiding this comment.
I would recommend adding a check, which makes sure SSLCert implements telegraf.Input.
// make sure SSLCert implements telegraf.Input
var _ telegraf.Input = &SSLCert{}I would also consider it as a valid point, if someone prefers to have this check performed only in the _test.go file.
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| for _, server := range sc.Servers { | ||
| cert, err := getRemoteCert(server, sc.Timeout*time.Second, sc.CloseConn) | ||
| if err != nil { | ||
| return errors.New(fmt.Sprintf("Cannot get remote SSL cert: %s", err)) |
There was a problem hiding this comment.
The error message could provide a little more information: consider including the server details. Currently neither all getRemoteCert possible errors include the server, nor the Gather method.
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| Timeout time.Duration `toml:"timeout"` | ||
|
|
||
| // For tests | ||
| CloseConn bool |
There was a problem hiding this comment.
Do not define interfaces on the implementor side of an API "for mocking"; instead, design the API so that it can be tested using the public API of the real implementation.
See here
I see no reason to test if the connection has been closed on your end. Instead I would test the scenarios under which a connection error could occur.
func TestSSLCert_Gather_ConnectionErrors(t *testing.T) {
pair, err := tls.X509KeyPair([]byte(testCert), []byte(testKey))
if err != nil {
t.Fatal(err)
}
config := &tls.Config{
Certificates: []tls.Certificate{pair},
}
t.Run("server connection refused", func(t *testing.T) {
ln, err := tls.Listen("tcp", ":0", config)
if err != nil {
t.Fatal(err)
}
svr := ln.Addr().String()
ln.Close()
sc := SSLCert{
Servers: []string{svr},
Timeout: 5,
}
acc := testutil.Accumulator{}
err = sc.Gather(&acc)
if err == nil {
t.Error("got nil; want error")
}
})
t.Run("server closes the connection", func(t *testing.T) {
ln, err := tls.Listen("tcp", ":0", config)
if err != nil {
t.Fatal(err)
}
defer ln.Close()
go func() {
sconn, err := ln.Accept()
if err != nil {
return
}
sconn.Close()
}()
sc := SSLCert{
Servers: []string{ln.Addr().String()},
Timeout: 5,
}
acc := testutil.Accumulator{}
err = sc.Gather(&acc)
if err == nil {
t.Error("got nil; want error")
}
})
t.Run("server closes without handshake", func(t *testing.T) {
ln, err := tls.Listen("tcp", ":0", config)
if err != nil {
t.Fatal(err)
}
defer ln.Close()
go func() {
sconn, err := ln.Accept()
if err != nil {
return
}
serverConfig := config.Clone()
srv := tls.Server(sconn, serverConfig)
srv.Close()
}()
sc := SSLCert{
Servers: []string{ln.Addr().String()},
Timeout: 5,
}
acc := testutil.Accumulator{}
err = sc.Gather(&acc)
if err == nil {
t.Error("got nil; want error")
}
})
}|
|
||
| ln, err := tls.Listen("tcp", ":0", config) | ||
| if err != nil { | ||
| t.Error(err) |
There was a problem hiding this comment.
If there is an error, the test can not continue. You should call t.Fatal(err) instead
|
|
||
| pair, err := tls.X509KeyPair([]byte(testCert), []byte(testKey)) | ||
| if err != nil { | ||
| t.Error(err) |
There was a problem hiding this comment.
If there is an error, the test cannot continue. You should call t.Fatal(err) instead
|
Thanks for the pull request @jtyr and also the review @sepetrov (definitely appreciate non-maintainer reviews!). We have a pretty large backlog, but I'll try to take a look at this for 1.7, please nudge me if I don't get to it sometime in the next month. I'll also mention that there is a similar pull request #1762, it may be useful to read over it for ideas. |
2f38465 to
8aae40a
Compare
|
@sepetrov Thanks a lot for your review! I have implemented most of your feedback. Please see the latest commit. |
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| Timeout time.Duration `toml:"timeout"` | ||
|
|
||
| // For tests | ||
| CloseConn bool |
There was a problem hiding this comment.
As I said in #3768 (comment) I would prefer to see black box tests where your tests live in separate package ssl_cert_test, and therefore you test only the public API. But if you have to have access to parts of the package, which change the behaviour, so that more test cases can be covered, then I would recommend the following:
Move the test-specific code outside the configuration struct SSLCert
// SSLCert holds the configuration of the plugin.
type SSLCert struct {
Servers []string `toml:"servers"`
Files []string `toml:"files"`
Timeout time.Duration `toml:"timeout"`
}
// For tests
var (
closeConn bool
unsetCerts bool
)This way the test-related code does not become part of the public API.
The disadvantage of having global state modifiers for the package by using this approach is that you will not be able to run tests in parallel, which you do not do here anyway.
There was a problem hiding this comment.
Thanks for your review. It should be fixed in the last commit.
2f2c5ff to
9889fed
Compare
|
@danielnelson Please could you review it once you have time? |
|
Will do, sorry about the delay... once 1.6 is released please ping me again if I still haven't reviewed yet |
|
LGTM 👍 |
|
@danielnelson Ping. Please could you review? |
plugins/inputs/ssl_cert/README.md
Outdated
| # Reads metrics from a SSL certificate | ||
| [[inputs.ssl_cert]] | ||
| ## List of local SSL files | ||
| #files = [] |
There was a problem hiding this comment.
Let's add a bit more detail about what certificates can be loaded (PEM encoded). Perhaps we can use /etc/ssl/certs/ssl-cert-snakeoil.pem as the default path.
plugins/inputs/ssl_cert/README.md
Outdated
| ## List of local SSL files | ||
| #files = [] | ||
| ## List of servers | ||
| #servers = [] |
There was a problem hiding this comment.
We should have examples of the how this variable could look, use example.org as the hostname.
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| InsecureSkipVerify: true, | ||
| } | ||
|
|
||
| ipConn, err := net.DialTimeout("tcp", server, timeout) |
There was a problem hiding this comment.
I think we should allow the user to specify the network in the configuration, it would then look like servers = ["tcp://example.org:8080"]
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
|
|
||
| func getRemoteCert(server string, timeout time.Duration) (*x509.Certificate, error) { | ||
| tlsCfg := &tls.Config{ | ||
| InsecureSkipVerify: true, |
There was a problem hiding this comment.
Do you think the tls.Config could be configurable? We have a feature request for client authentication in #3854 so perhaps we could use the standard set of SSL options:
telegraf/plugins/inputs/apache/apache.go
Lines 43 to 48 in 851efc9
I think we would need to check what happens if we ignore the error from Handshake(), perhaps we can still get the PeerCertificates? We could also explore a custom tls.Config.VerifyPeerCertificate func.
plugins/inputs/all/all.go
Outdated
| _ "github.com/influxdata/telegraf/plugins/inputs/socket_listener" | ||
| _ "github.com/influxdata/telegraf/plugins/inputs/solr" | ||
| _ "github.com/influxdata/telegraf/plugins/inputs/sqlserver" | ||
| _ "github.com/influxdata/telegraf/plugins/inputs/ssl_cert" |
There was a problem hiding this comment.
Should we rename the plugin? We don't always do SSL/TLS (in the case of files), maybe it should be x509_cert or certificate?
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
|
|
||
| certs := conn.ConnectionState().PeerCertificates | ||
|
|
||
| if unsetCerts { |
There was a problem hiding this comment.
I'm not a big fan of this testing method but I guess it's acceptable. It would be better IMO to use an interface for tls.Client and a mock during tests or to implement the behavior in the tests.
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| age := int(now.Sub(cert.NotBefore).Seconds()) | ||
| expiry := int(cert.NotAfter.Sub(now).Seconds()) | ||
| startdate := int(cert.NotBefore.Unix()) | ||
| enddate := int(cert.NotAfter.Unix()) |
There was a problem hiding this comment.
For startdate, enddate no need to convert to int
There was a problem hiding this comment.
.Seconds() returns float64 so it's necessary to convert it to int to get only the integer part of the number.
There was a problem hiding this comment.
Just make the change for startdate, enddate. (All integer fields added are converted to int64 or uint64 in the call to AddFields)
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| type SSLCert struct { | ||
| Servers []string `toml:"servers"` | ||
| Files []string `toml:"files"` | ||
| Timeout time.Duration `toml:"timeout"` |
There was a problem hiding this comment.
Use internal.Duration so we can specify timeout using a string like "5s"
plugins/inputs/ssl_cert/ssl_cert.go
Outdated
| "expiry": expiry, | ||
| "startdate": startdate, | ||
| "enddate": enddate, | ||
| } |
There was a problem hiding this comment.
Ideally we could also have metrics for if the certificate is valid, including values for common errors such as the hostname not matching the certificate. This is implemented in the related pr #3829, though ideally we could leverage the standard library code for this.
| "github.com/influxdata/telegraf/testutil" | ||
| ) | ||
|
|
||
| const testCert = `-----BEGIN CERTIFICATE----- |
There was a problem hiding this comment.
Can you update to use the testing certs in testutil/pki, check plugins/inputs/socket_listener/socket_listener_test.go for examples.
|
@danielnelson I have implemented all your suggestions. Please review. |
|
@danielnelson Ping. Please could you review? |
|
I made some slight modifications based on my review (see here). I know the first commit is fine, I agree with sepetrov regarding the test specific variables. Unifying the two |
| "crypto/tls" | ||
| "crypto/x509" | ||
| "encoding/pem" | ||
| "errors" |
There was a problem hiding this comment.
Replace errors.New with fmt.Errorf
|
|
||
| // For tests | ||
| var ( | ||
| closeConn bool |
There was a problem hiding this comment.
There isn't a need for test only variables in here, the same can be accomplished without them. (see this commit)
|
Thanks for your review, @glinton. I have merged your changes and added few more. Any idea how to further improve test coverage (current coverage is only 92.7%)? |
|
😄 I guess one thing you could do is remove this check if it never gets called (i believe there would be an error on the handshake if there were no certs), but i think you've got plenty of coverage. |
|
Thanks for the tip, @glinton. I have removed the check but the coverage is still only 94.3%! 😞 |
| expiry := int(cert.NotAfter.Sub(now).Seconds()) | ||
| startdate := cert.NotBefore.Unix() | ||
| enddate := cert.NotAfter.Unix() | ||
| valid := expiry > 0 |
There was a problem hiding this comment.
I think we should drop this field for now. There are various aspects of the certificate which currently we do not take into account, such as hostname and chain of trust. We can add it back in if/when we include these features.
| return certs, nil | ||
| case "file": | ||
| fallthrough | ||
| default: |
There was a problem hiding this comment.
I think default should be an error, but "" could be file, this would avoid schemes like "foo://".
| fallthrough | ||
| case "udp": | ||
| fallthrough | ||
| case "tcp": |
There was a problem hiding this comment.
Let's add support for tcp4, tcp6, udp4, udp6?
| conn := tls.Client(ipConn, tlsCfg) | ||
| defer conn.Close() | ||
|
|
||
| hsErr := conn.Handshake() |
There was a problem hiding this comment.
I'm curious about what happens when a certificate has an invalid hostname or the cert chain is invalid? What I think would be ideal is if we still called acc.AddFields( but with valid=false and potentially with more detailed information. If this is possible, Gather would only return an error if the network connection fails or we can't get any certificates.
There was a problem hiding this comment.
I have removed the valid metric as per your request bellow (line 116).
| {mode: 0640, content: pki.ReadServerCert(), error: false}, | ||
| } | ||
|
|
||
| for i, test := range tests { |
There was a problem hiding this comment.
Not required, but would be a nice touch if we used subtests like here https://github.com/influxdata/telegraf/blob/master/plugins/inputs/docker/docker_test.go#L720
|
|
||
| const sampleConfig = ` | ||
| ## List certificate sources | ||
| # sources = ["/etc/ssl/certs/ssl-cert-snakeoil.pem", "tcp://example.org:443"] |
There was a problem hiding this comment.
Let's uncomment this one line since it will almost always need to be changed.
| # sources = ["/etc/ssl/certs/ssl-cert-snakeoil.pem", "tcp://example.org:443"] | ||
|
|
||
| ## Timeout for SSL connection | ||
| # timeout = 5 |
There was a problem hiding this comment.
Since this is an internal.Duration, it should be like timeout = "5s"
|
|
||
| func getFields(cert *x509.Certificate, now time.Time) map[string]interface{} { | ||
| age := int(now.Sub(cert.NotBefore).Seconds()) | ||
| expiry := int(cert.NotAfter.Sub(now).Seconds()) |
There was a problem hiding this comment.
Aren't these already ints?
There was a problem hiding this comment.
No, it's float64:
https://golang.org/pkg/time/#Duration.Seconds
jtyr
left a comment
There was a problem hiding this comment.
@danielnelson All should be fixed in the last commit. Please review it.
|
|
||
| func getFields(cert *x509.Certificate, now time.Time) map[string]interface{} { | ||
| age := int(now.Sub(cert.NotBefore).Seconds()) | ||
| expiry := int(cert.NotAfter.Sub(now).Seconds()) |
There was a problem hiding this comment.
No, it's float64:
https://golang.org/pkg/time/#Duration.Seconds
| # sources = ["/etc/ssl/certs/ssl-cert-snakeoil.pem", "tcp://example.org:443"] | ||
|
|
||
| ## Timeout for SSL connection | ||
| # timeout = 5 |
|
|
||
| const sampleConfig = ` | ||
| ## List certificate sources | ||
| # sources = ["/etc/ssl/certs/ssl-cert-snakeoil.pem", "tcp://example.org:443"] |
| expiry := int(cert.NotAfter.Sub(now).Seconds()) | ||
| startdate := cert.NotBefore.Unix() | ||
| enddate := cert.NotAfter.Unix() | ||
| valid := expiry > 0 |
| fallthrough | ||
| case "udp": | ||
| fallthrough | ||
| case "tcp": |
| return certs, nil | ||
| case "file": | ||
| fallthrough | ||
| default: |
| conn := tls.Client(ipConn, tlsCfg) | ||
| defer conn.Close() | ||
|
|
||
| hsErr := conn.Handshake() |
There was a problem hiding this comment.
I have removed the valid metric as per your request bellow (line 116).
| sources = ["/etc/ssl/certs/ssl-cert-snakeoil.pem", "https://example.org"] | ||
|
|
||
| ## Timeout for SSL connection | ||
| # timeout = 5s |
There was a problem hiding this comment.
Add quotes to be valid toml
| sources = ["/etc/ssl/certs/ssl-cert-snakeoil.pem", "tcp://example.org:443"] | ||
|
|
||
| ## Timeout for SSL connection | ||
| # timeout = 5s |
There was a problem hiding this comment.
Add quotes to be valid toml
Required for all PRs:
This PR is adding new input plugin which allows to monitor SSL certificates.