Sirobot is a command-line Perl script that speeds up downloads. It can
traverse a WWW tree and download whole pages, including all images and
linked documents. Unlike wget, Sirobot is able to download more than
one file at the same time.
Sirobot runs on all platforms that have Perl and the necessary
libraries installed. As Sirobot is written in Perl, it needs no
compilation. If you have the required libraries installed (see
requirements), it can be used out of the box.
Sirobot not only downloads the specified URLs but can also fetch all
images and documents they link to, if you wish. You may limit the
recursion depth and restrict which files to get, to keep from
downloading the whole WWW.
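The technique behind this is a depth-limited crawl. Here is a minimal
sketch in Perl using LWP and HTML::LinkExtor (URL and depth are
placeholders; this is not Sirobot's actual code):

    # Depth-limited recursive fetch; a sketch, not Sirobot's implementation.
    use strict;
    use LWP::UserAgent;
    use HTML::LinkExtor;

    my $ua = LWP::UserAgent->new;
    my %seen;

    sub fetch_tree {
        my ($url, $depth) = @_;
        return if $depth < 0 or $seen{$url}++;
        my $res = $ua->get($url);
        return unless $res->is_success;
        # With a base URL and no callback, links() returns absolute URLs.
        my $extor = HTML::LinkExtor->new(undef, $url);
        $extor->parse($res->decoded_content);
        for my $link ($extor->links) {
            my ($tag, %attr) = @$link;
            fetch_tree($attr{href}, $depth - 1) if $tag eq 'a' and $attr{href};
        }
    }

    fetch_tree('http://www.example.com/', 2);    # follow links two levels deep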
Continue aborted downloads
Even if you started to download a file with another program, Sirobot
can finish it (if the server supports resuming).
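Resuming rests on the HTTP Range header: the client asks the server
for the missing bytes only. A sketch with LWP (URL and filename are
placeholders):

    # Resume a partial download via the HTTP Range header; a sketch only.
    use strict;
    use LWP::UserAgent;
    use HTTP::Request;

    my ($url, $file) = ('http://www.example.com/big.tar.gz', 'big.tar.gz');
    my $got = -s $file || 0;                  # bytes already on disk
    my $req = HTTP::Request->new(GET => $url);
    $req->header(Range => "bytes=$got-");     # request only the rest
    my $res = LWP::UserAgent->new->request($req);
    if ($res->code == 206) {                  # 206 = Partial Content
        open my $fh, '>>', $file or die "open $file: $!";
        binmode $fh;
        print $fh $res->content;
        close $fh;
    }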
Concurrent jobs
Sirobot is able to fetch several files (called jobs) at once, which is
quite useful if you have to get a lot of files.
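One simple way to run jobs concurrently is to fork a worker per
download, roughly as below (a sketch; Sirobot's real job handling is
more elaborate):

    # Concurrent downloads via fork(); a sketch, not Sirobot's scheduler.
    use strict;
    use LWP::Simple qw(getstore);

    my @jobs = (
        'http://www.example.com/a.html',
        'http://www.example.com/b.html',
    );
    for my $url (@jobs) {
        my $pid = fork;
        die "fork failed: $!" unless defined $pid;
        next if $pid;                      # parent: launch the next job
        (my $file = $url) =~ s{.*/}{};     # child: derive a local filename
        getstore($url, $file || 'index.html');
        exit;
    }
    1 while wait() != -1;                  # reap all children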
Convert links
Sirobot can convert absolute links inside badly designed pages so you
can easily browse through the files offline.
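As a crude illustration of the idea (a single regular expression; a
robust converter parses the HTML properly, and the site prefix is a
placeholder):

    # Rewrite absolute links to server-relative ones; a deliberately
    # simplistic sketch of the idea behind link conversion.
    use strict;

    my $site = 'http://www.example.com';   # placeholder site prefix
    local $/;                              # slurp whole input
    my $html = <STDIN>;
    # href="http://www.example.com/dir/page.html" -> href="/dir/page.html"
    $html =~ s{(href|src)="\Q$site\E(/[^"]*)"}{$1="$2"}gi;
    print $html;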
Pattern matching
You may specify regular expressions that control which files will be
fetched (e.g. to prevent downloading unwanted tar, mpg, ps, ... files).
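In Perl terms such a filter boils down to a match against each
candidate URL, e.g. (a sketch, not Sirobot's option syntax):

    # Skip URLs matching an exclude pattern before queueing them.
    my $exclude = qr/\.(tar|tar\.gz|mpg|ps)$/i;   # unwanted file types
    for my $url (@ARGV) {
        next if $url =~ $exclude;    # matched: do not download
        print "would fetch $url\n";
    }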
Documentation
Documentation is now written in POD, which can be converted to other
documentation formats. HTML and man versions are already included in
the archive.
Graphical Frontend
Roel has developed a Perl/GTK based
frontend for Sirobot called Sirofront. It is available
here. There's also a
screenshot available.
SSL support
Sirobot can handle https style schemes (e.g.
https://www.openssl.org/)
if LWP has SSL support. You'll need an additional library, e.g.
Crypt-SSLeay. Read the README.SSL file that comes with the
libwww-perl (LWP) package for details and where to get it from.
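You can test whether your LWP installation speaks HTTPS: without SSL
support, LWP answers such requests itself with an internal 501 error.

    # Check whether this LWP installation supports https URLs.
    use strict;
    use LWP::UserAgent;

    my $res = LWP::UserAgent->new->get('https://www.openssl.org/');
    if ($res->code == 501) {     # "Protocol scheme 'https' is not supported"
        print "No SSL support - install Crypt::SSLeay\n";
    } else {
        print "https works: ", $res->status_line, "\n";
    }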
Other features
Of course, Sirobot can use a proxy, access protected pages with a
user/password combination and respects areas that are off-limits to
robots. It also takes advantage of the Curses library, if available,
to improve its UI.
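In LWP terms these features look roughly like this (proxy host,
credentials and URLs are placeholders; Sirobot wires all of this up
for you):

    # Proxy, basic authentication and robots.txt handling with LWP; a sketch.
    use strict;
    use LWP::RobotUA;
    use HTTP::Request;

    # A robot user agent honours robots.txt automatically.
    my $ua = LWP::RobotUA->new('sirobot-example/0.1', 'you@example.com');
    $ua->delay(0.1);                                   # minutes between requests
    $ua->proxy(['http', 'ftp'], 'http://proxy.example.com:8080/');

    my $req = HTTP::Request->new(GET => 'http://www.example.com/private/');
    $req->authorization_basic('user', 'password');     # protected page
    my $res = $ua->request($req);
    print $res->status_line, "\n";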