SYNOPSIS
estwaver init rootdir
estwaver crawl [-restart|-revisit|-revcont] rootdir
estwaver unittest rootdir
estwaver fetch [-proxy host port] [-tout num] [-il lang] url
DESCRIPTION
estwaver is an aggregation of sub commands. The name of a sub command
is specified by the first argument. Other arguments are parsed
according to each sub command. The argument rootdir specifies the
crawler root directory, which contains the configuration file and so on.
estwaver init rootdir
Create the crawler root directory.
estwaver crawl [-restart|-revisit|-revcont] rootdir
Start crawling.
If -restart is specified, crawling is restarted from the seed
documents.
If -revisit is specified, collected documents are revisited.
If -revcont is specified, collected documents are revisited and
then crawling is continued.
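The init and crawl sub commands above can be combined roughly as in the
following sketch. The function name crawl_site and its MODE argument are
hypothetical, and the check that skips init when the directory already
exists is an assumption, not documented behavior.

```shell
# crawl_site ROOTDIR [MODE]
# MODE is empty, or one of restart, revisit, revcont; it is mapped to the
# documented -restart, -revisit and -revcont options.
crawl_site() {
  rootdir="$1"
  mode="${2:-}"
  # Assumption: only run init when the root directory does not exist yet.
  if [ ! -d "$rootdir" ]; then
    estwaver init "$rootdir" || return 1
  fi
  case "$mode" in
    "")
      estwaver crawl "$rootdir"
      ;;
    restart|revisit|revcont)
      estwaver crawl "-$mode" "$rootdir"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 2
      ;;
  esac
}
```

For example, crawl_site /path/to/rootdir revcont would revisit collected
documents and then continue crawling.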
estwaver unittest rootdir
Perform unit tests.
estwaver fetch [-proxy host port] [-tout num] [-il lang] url
Fetch a document.
url specifies the URL of a document.
-proxy specifies the host name and the port number of the proxy
server.
-tout specifies timeout in seconds.
-il specifies the preferred language. By default, it is
English.
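As an illustration of how the fetch options fit together, the sketch
below builds the argument list from optional values. The function name
fetch_doc is hypothetical, and -proxy is left out for brevity.

```shell
# fetch_doc URL [TIMEOUT_SECONDS] [LANG]
# Passes -tout and -il only when a value was given, so the documented
# defaults apply otherwise.
fetch_doc() {
  url="$1"
  tout="${2:-}"
  lang="${3:-}"
  # Reuse the positional parameters as the argument list being built.
  set -- fetch
  if [ -n "$tout" ]; then
    set -- "$@" -tout "$tout"
  fi
  if [ -n "$lang" ]; then
    set -- "$@" -il "$lang"
  fi
  estwaver "$@" "$url"
}
```

For example, fetch_doc http://example.com/ 30 would run
estwaver fetch -tout 30 http://example.com/.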
All sub commands return 0 if the operation succeeds; otherwise they return 1.
A running crawler closes the database and exits when it catches
signal 1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), or 15 (SIGTERM).
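Because the crawler shuts down cleanly on these signals, a wrapper
script can stop it by sending SIGTERM and waiting for the process to
exit. This is a minimal sketch; the function name stop_crawler is
hypothetical, and the one-second polling interval is an arbitrary
choice.

```shell
# stop_crawler PID
# Sends SIGTERM and returns once the process has exited, which gives the
# crawler time to close its database.
stop_crawler() {
  pid="$1"
  kill -TERM "$pid" 2>/dev/null || return 1
  # wait only works for children of this shell; it returns non-zero both
  # for a child killed by the signal and for a process that is not a child.
  if ! wait "$pid" 2>/dev/null; then
    # Poll until the process is gone (a no-op if wait already reaped it).
    while kill -0 "$pid" 2>/dev/null; do
      sleep 1
    done
  fi
  return 0
}
```

A script might start the crawler with estwaver crawl rootdir &, record
$! as the process ID, and later call stop_crawler with that ID.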
When crawling finishes, the crawler root directory contains a directory
named _index. It is an index that can be used by estcmd and other tools.
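The resulting index could be queried along the following lines. This
sketch assumes that the search sub command of estcmd takes the index
directory followed by a search phrase; see estcmd(1) for the actual
options. The function name search_index is hypothetical.

```shell
# search_index ROOTDIR PHRASE
# Looks for the _index directory inside the crawler root directory and
# hands it to estcmd.
search_index() {
  rootdir="$1"
  phrase="$2"
  index="$rootdir/_index"
  if [ ! -d "$index" ]; then
    echo "no index at $index (has crawling finished?)" >&2
    return 1
  fi
  # Assumed invocation form: estcmd search db phrase
  estcmd search "$index" "$phrase"
}
```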
SEE ALSO
estconfig(1), estcmd(1), estmaster(1), estcall(1), estraier(3), estn-