wspider (v2.0)

Probably one of the fastest crawlers you've ever seen

This is a newer version of the former wmirror, which is now archived. I will spend a lot more time on wspider to make it more powerful. wmirror used a single thread with the old wget for downloads, while wspider lets you download with multiple threads. wspider is extremely small and fast. I will do my best to add authentication support as soon as possible, so you can crawl sites you are logged in to.

wspider is EXTREMELY powerful and fast. It uses wget2, which in many cases downloads much faster than wget 1.x thanks to HTTP/2, HTTP compression, parallel connections, and use of the If-Modified-Since HTTP header.

Be careful and use wspider at your own risk, as it can be extremely fast!

Happy crawling! ;)

wspider (250 threads)

Screenshot

wspider (10 threads)

Screenshot

wspider (1 thread)

Screenshot

GNU Parallel (old and slow - using a lot of resources for nothing)

The crawler used for this comparison is from the GNU Parallel manual: https://www.gnu.org/software/parallel/man.html

Get Started on Linux/macOS

git clone https://github.com/wuseman/wspider
cd wspider
chmod +x wspider.sh
./wspider.sh -u <url> -d <path> -t <threads>
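
To give a rough idea of how the three options above could map onto wget2, here is a minimal sketch. It is not taken from wspider.sh: the function name `build_crawl_cmd` is hypothetical, and the choice of wget2 options (`--recursive`, `--timestamping`, `--max-threads`, `--directory-prefix`) is an assumption about what a multi-threaded crawl might use. The function only prints the command, so nothing is downloaded.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of wspider.sh: shows how a URL, a target
# directory and a thread count might be turned into a wget2 command line.
# The option set below is an assumption, not wspider's actual invocation.
build_crawl_cmd() {
    url=$1      # site to crawl (wspider's -u)
    dest=$2     # download directory (wspider's -d)
    threads=$3  # number of parallel download threads (wspider's -t)

    # --recursive        follow links found in downloaded pages
    # --timestamping     skip files that have not changed on the server
    # --max-threads      upper limit for concurrent download threads
    # --directory-prefix where downloaded files are stored
    printf 'wget2 --recursive --timestamping --max-threads=%s --directory-prefix=%s %s\n' \
        "$threads" "$dest" "$url"
}

# Dry run: print the command instead of executing it.
build_crawl_cmd "https://example.com" "./mirror" 10
# prints: wget2 --recursive --timestamping --max-threads=10 --directory-prefix=./mirror https://example.com
```

Printing the command first makes it easy to check the thread count before starting a crawl, which matters because a high value can put heavy load on the target site.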

System Requirements

  • wget2 - Find more info about wget2 here

IMPORTANT

wuseman cannot be held responsible for users' actions, regardless of any damage a user may cause with the information or data that wspider collects. All users who gather information or data via wspider are 100% responsible for their own actions. wspider has been developed for legal purposes only.