0) Properly install Jakarta Ant and Java on your system.
1) Run "ant dist" or "ant build".
2) cd dist
3) Edit conf/crawlerConfig.xml to customize settings (this file has
inline comments, so the settings should be easy to figure out).
4) Start run.sh or run.bat (for Unix or Windows, respectively). On Unix you
may need to make the shell script executable first: "chmod u+x run.sh"
5) Watch the console output and tail the monitor.log file for monitor
messages. The program can be stopped at any time by pressing any key
in the console window.
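
As a rough sketch for Unix, the build and launch steps above could be collected into one shell function. The function name build_and_run is hypothetical, it assumes you start in the top-level JCrawler source directory, and it leaves the edit of conf/crawlerConfig.xml as a manual step.

```shell
#!/bin/sh
# Sketch only: build JCrawler and start it from the dist directory.
# Assumptions: Ant and Java are installed, and the current directory is
# the top of the JCrawler source tree. build_and_run is a made-up name.
build_and_run() {
    ant dist || return 1     # build the distribution
    cd dist || return 1      # the launcher scripts live in dist
    chmod u+x run.sh         # make sure the launcher is executable
    ./run.sh                 # start JCrawler; press any key to stop it
}
```

Call build_and_run after editing the configuration. In a second terminal, "tail -f monitor.log" follows the monitor messages, assuming the log is written to the dist directory.
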
If you are running JCrawler remotely on a Unix box, you will want two things:
1) To watch both monitor.log and standard output on a single connection,
rather than logging in twice, once for each of them.
2) The JCrawler process not to depend on your connection to the server,
so that it does not die if your SSH session is suddenly dropped.
Both of these can be achieved with GNU Screen, and we highly recommend this great tool (and not only for JCrawler :) ).
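
One possible way to set this up is sketched below. It starts JCrawler in a detached Screen session and opens a second window in the same session that follows the log. The function name start_in_screen and the session name "jcrawler" are hypothetical, and the sketch assumes it is called from the dist directory with monitor.log written there.

```shell
#!/bin/sh
# Sketch only: run JCrawler under GNU Screen so it survives a dropped
# SSH connection. start_in_screen and the default session name are
# illustrative; monitor.log is assumed to be in the current directory.
start_in_screen() {
    session=${1:-jcrawler}
    # Window 0: start JCrawler in a new, detached session.
    screen -dmS "$session" ./run.sh || return 1
    # Window 1: follow the monitor log inside the same session.
    screen -S "$session" -X screen tail -f monitor.log
}
```

After calling start_in_screen, attach with "screen -r jcrawler". With Screen's default key bindings, Ctrl-a n switches between the two windows and Ctrl-a d detaches again while JCrawler keeps running.
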