
Add support for crawling subdomains #27

Open
alexspeller wants to merge 1 commit into chriskite:next from alexspeller:4419464056d3de337162

Conversation

@alexspeller

Merge changes to support subdomain crawling from runa@91559bd

@MaGonglei

This feature is very useful.
I think Anemone should also support printing out external links: just print them, but don't crawl them any deeper.
The link checker tool XENU (http://home.snafu.de/tilman/xenulink.html) has this feature.

@wokkaflokka

MaGonglei: It is very simple to gather external links using Anemone, and comparably simple to actually check these links to verify they are valid, etc. The 'on_every_page' block is very helpful in this regard.

If you'd like some code that does exactly what you are asking, I could send an example your way.
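
A minimal sketch of what such an example might look like, assuming an Anemone version of that era where "page.doc" exposes the parsed Nokogiri document and "page.url" the page's URI; "http://www.example.com" is just a placeholder, and with a plain host comparison like this, subdomains also count as external:

require 'anemone'
require 'set'
require 'uri'

external_links = Set.new

Anemone.crawl("http://www.example.com") do |anemone|
  anemone.on_every_page do |page|
    next unless page.doc                          # skip non-HTML responses
    page.doc.xpath('//a/@href').each do |href|
      # resolve relative hrefs against the page URL; skip anything unparsable
      abs = URI.join(page.url.to_s, href.to_s) rescue next
      # keep links whose host differs from the page being crawled
      external_links << abs.to_s if abs.host && abs.host != page.url.host
    end
  end
end

puts external_links.to_a.sort

The external links are only collected and printed here, not followed, since Anemone itself stays on the crawled site.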

@MaGonglei

Hi wokkaflokka, thanks for your reply.
I think I know what you mean, but I would prefer to have this feature when I initialize the Anemone crawl, like:
Anemone.crawl("http://www.example.com", :external_links => false) do |anemone|
....
end

Because if I use the "on_every_page" block to search for external links (e.g. page.doc.xpath '//a[@href]'), it seems to cost too much CPU and memory.

If I'm wrong, please send me an example.

Thanks.
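
For the checking step mentioned above (verifying that the collected external links actually respond), a rough follow-up sketch using only Ruby's standard library; "external_links" is the set gathered in the earlier sketch, and a plain HEAD request is only one possible notion of "valid" (redirects and retries are ignored here):

require 'net/http'
require 'uri'

external_links.each do |link|
  uri = URI(link)
  next unless uri.is_a?(URI::HTTP)   # only check http/https links (URI::HTTPS inherits from URI::HTTP)
  begin
    response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
      http.head(uri.request_uri)
    end
    puts "#{response.code} #{link}"
  rescue StandardError => e
    puts "FAILED #{link} (#{e.class})"
  end
end

Since Anemone already parses each fetched page into page.doc in order to find the links it follows, the extra XPath query itself should add relatively little overhead; most of the memory cost depends on how many links are kept.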
