HTTP Probe plugin generates new service for every new run #1107

@je-bugshell

Hi Amass maintainers,

I’m seeing what looks like unintended duplication of Service entities across repeated amass enum runs in v5.0.1, and I think it may be caused by how the HTTP-Probes service discovery plugin generates Service identifiers.

Environment

  • Amass: v5.0.1
  • Tool: amass enum
  • Plugin: engine/plugins/service_discovery/http_probes

What I’m observing

If I run amass enum against the same target multiple times, I end up with new Service entities being created every run even when nothing about the target is changing. In my DB export, the same host repeatedly gets new service unique_id values.

Example for one host (same host, repeated runs produce new service IDs):

entity_id,unique_id,edge_protocol,edge_port
102,bugshell.com2461898493269828364,https,443
102,bugshell.com2461898493269828364,https,80
646,bugshell.com-4815389494055600025,https,443
646,bugshell.com-4815389494055600025,https,80
785,bugshell.com5648222110259363094,https,443
785,bugshell.com5648222110259363094,https,80
802,bugshell.com6485818286381474201,https,443
802,bugshell.com6485818286381474201,https,80
820,bugshell.com-6635030850544857504,https,443
820,bugshell.com-6635030850544857504,https,80

Also: I noticed the unique_id suffix is sometimes negative (leading -). That’s not my main concern, but it suggests the ID is being derived from a signed integer representation of a hash output.
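
To illustrate the negative suffix, here is a minimal sketch assuming a 64-bit platform. The helper names signedSuffix and unsignedSuffix are hypothetical and exist only for this example. The first mirrors the int(h.Sum64()) conversion quoted further down, the second formats the same value without the signed cast.

```go
package main

import (
	"fmt"
	"strconv"
)

// signedSuffix (hypothetical name) casts the uint64 to int before formatting,
// as in the snippet quoted from ServiceWithIdentifier.
func signedSuffix(sum uint64) string {
	return strconv.Itoa(int(sum))
}

// unsignedSuffix (hypothetical name) formats the same value without the cast,
// so the result never carries a leading minus sign.
func unsignedSuffix(sum uint64) string {
	return strconv.FormatUint(sum, 10)
}

func main() {
	// Any value with the top bit set wraps to a negative int on 64-bit platforms.
	sum := uint64(1) << 63
	fmt.Println("signed:  ", signedSuffix(sum))
	fmt.Println("unsigned:", unsignedSuffix(sum))
}
```

Roughly half of all 64-bit hash outputs have the top bit set, which would match the mix of positive and negative suffixes in the export above.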

Expected behavior (what I assumed)

I expected that rerunning amass enum against the same stable services would not create new Service entities each time, i.e., Services would be deduplicated/merged based on a stable identity.

Suspected cause in source

From engine/plugins/service_discovery/http_probes/plugin.go:

  • In Start() the plugin seeds a maphash.Hash with a fresh seed each start:
      hp.hash.SetSeed(maphash.MakeSeed())
  • In store() it calls:
      serv := support.ServiceWithIdentifier(&hp.hash, e.Session.ID().String(), addr)

And support.ServiceWithIdentifier hashes sessionid + address and uses the result in the Service ID:

_, _ = h.WriteString(sessionid + address)
serv := &platform.Service{
  ID: address + strconv.Itoa(int(h.Sum64())),
}

So the Service ID changes:

  1. because the maphash seed is regenerated randomly every time the plugin starts (and therefore on every amass enum run), and
  2. because the identifier input includes the session ID (e.Session.ID().String()), which changes each enum run.

This seems sufficient to explain why Services churn across runs.
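
The two causes can be reproduced outside Amass with a small standalone sketch. The functions serviceID and newRun are simplified stand-ins written for this example, not the Amass code, and the session strings are made up. The Reset call is an assumption added so that repeated calls within one simulated run are comparable.

```go
package main

import (
	"fmt"
	"hash/maphash"
	"strconv"
)

// serviceID is a simplified stand-in for the quoted identifier logic:
// hash sessionID+address and append the signed result to the address.
func serviceID(h *maphash.Hash, sessionID, address string) string {
	h.Reset() // assumption for this sketch: start from a clean state on each call
	_, _ = h.WriteString(sessionID + address)
	return address + strconv.Itoa(int(h.Sum64()))
}

// newRun simulates one plugin start: a hash with a fresh random seed.
func newRun() *maphash.Hash {
	var h maphash.Hash
	h.SetSeed(maphash.MakeSeed())
	return &h
}

func main() {
	run1, run2 := newRun(), newRun()

	// Cause 1: same session string and address, but a different seed per run.
	fmt.Println(serviceID(run1, "session-a", "bugshell.com"))
	fmt.Println(serviceID(run2, "session-a", "bugshell.com"))

	// Cause 2: same seed, but a different session string in the hash input.
	fmt.Println(serviceID(run1, "session-b", "bugshell.com"))
}
```

Each printed ID is expected to differ, since either a new seed or a new session string alone changes the hash input or state.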

Related: service_type always empty

I also noticed that Service assets created by http_probes appear to have service_type set to an empty string. In the plugin, it looks like only Output / OutputLen / Attributes are set and no service type is populated.

Question: is this intended?

Is it intended behavior that:

  • the HTTP-Probes plugin generates a new maphash seed each time amass enum is run, and
  • Service IDs are derived from the session ID (making them run-specific rather than stable)?

If the goal is stable Service identity across runs, would you be open to:

  • removing session ID from the Service identifier material, and/or
  • using a deterministic hash for Service IDs, potentially keyed by a configurable seed (from config options)?
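
As a rough sketch of what the second option could look like, the example below derives the ID from a fixed seed string and the address only, using FNV-1a from the standard library. The function name stableServiceID, the seed parameter, and the choice of FNV-1a are all assumptions for illustration; which fields should define a Service's identity is a decision for the maintainers.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
)

// stableServiceID (hypothetical) builds an ID from a configurable seed and the
// address. No session ID and no per-process random seed are involved, so the
// same inputs give the same ID on every run.
func stableServiceID(seed, address string) string {
	h := fnv.New64a()
	_, _ = h.Write([]byte(seed))
	_, _ = h.Write([]byte{0}) // separator so ("ab", "c") and ("a", "bc") hash differently
	_, _ = h.Write([]byte(address))
	// FormatUint avoids the negative suffix caused by a signed conversion.
	return address + strconv.FormatUint(h.Sum64(), 10)
}

func main() {
	// Two calls stand in for two separate enum runs with the same configuration.
	first := stableServiceID("configured-seed", "bugshell.com")
	second := stableServiceID("configured-seed", "bugshell.com")
	fmt.Println(first)
	fmt.Println(first == second)
}
```

FNV-1a is not a keyed or collision-resistant hash, so if the seed is meant to provide any secrecy or resistance to crafted inputs, something like HMAC-SHA256 would be a better fit.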

Happy to provide additional details (how I’m exporting from the asset DB, etc.) if helpful.

Thanks!
