Veronica FAQ (Part 1 of 2)

Common Questions and Answers about veronica, a title search and retrieval system for use with the Internet Gopher.

Last-modified: 1995/01/13

Mail comments and questions about the FAQ to: foster@scs.unr.edu

Copyright (C) Steven Foster 1993, 1994, 1995. This FAQ may be freely copied and redistributed provided it is copied entire and unmodified and this copyright statement remains intact.

The current version of this FAQ can be retrieved through gopher at gopher://veronica.scs.unr.edu/00/veronica/veronica-faq

Questions in the veronica FAQ:

Q1:
    What is veronica?
Q2:
    How many information servers are included in the index?
Q3:
    How many information items are in the index?
Q4:
    Which types of information resources are included?
Q5:
    Which gopher servers are not included in veronica?
Q6:
    How can I connect to veronica?
Q7:
    How do I know where a particular information item is located?
Q8:
    How can I get my server into the veronica database?
Q9:
    How can I keep my server out of the veronica database?
Q10:
    How can I tell veronica to index only PARTS of my gopher server?
Q11:
    Does veronica work with Jughead?
Q12:
    How often is the database updated?
Q13:
    Where can I get the software to run veronica?
Q14:
    Where can I get the veronica dataset?
Q15:
    What does "veronica" mean?
Q16:
    Why doesn't my server show up in veronica searches?
Q17:
    Where are the veronica server sites?
Q18:
    How do I contact the veronica operators?
Q19:
    How do I compose veronica search queries and use the veronica options?


Q1: What is veronica?

veronica: very easy rodent-oriented net-wide index to computerized archives.

veronica is a resource-discovery system providing access to information resources held on most (99% +) of the world's gopher servers. In addition to native gopher data, veronica includes references to many resources provided by other types of information servers, such as WWW servers, usenet archives, and telnet-accessible information services.

veronica queries are keyword-in-title searches. A simple query can be quite powerful because a large number of information servers are included in the index.

veronica is accessed through gopher client software (see Q6). A veronica user submits a query (via a gopher client) which may contain boolean keyword expressions as well as special veronica directives. The result of a veronica search is a gopher menu comprising information items whose titles contain the specified keywords. The results menu may be browsed like any other gopher menu.

Q2: How many information servers are included in the index?

In January 1995, 5057 gopher servers were indexed. The index also includes items from approximately 5000 other servers, in cases where those servers are referenced on gopher menus. These other servers include 3905 WWW servers and about 1000 telnet-type services.

The gophers are exhaustively indexed; almost every item offered by the gopher servers is included in the index (see Q4 for exceptions). The contents of WWW servers are not exhaustively indexed: veronica includes HTML items only when they are referenced on the menus of some gopher server.

Q3: How many information items are included in the index?

Approximately 15 million items are indexed (November 1994).

Q4: Which types of information resources are included in the index?

All resources directly served by gopher servers are included in the index. The following types were indexed in December 1994:

   0 -- Text File
   1 -- Directory
   2 -- CSO name server
   4 -- Mac HQX file.
   5 -- PC binary
   7 -- Full Text Index (Gopher menu)
   8 -- Telnet Session
   9 -- Binary File
   s -- Sound
   e -- Event  (not in 2.06)
   I -- Image (other than GIF)
   M -- MIME multipart/mixed message
   T -- TN3270 Session
   g -- GIF image
   h -- HTML, HyperText Markup Language
   :
   ;
   !

Certain types of data NOT served directly by gopher servers are also included in the index if the resources are referenced on menus of indexed gopher servers. These types are: telnet sessions, CSO sessions, HTML files served by WWW servers, and type-7 searches. These items are included in the index even though they reside on non-gopher servers.

Resources provided by gateways to other types of servers are given special handling, as follows:

    gopher-to-ftp gateway resources: ftp gateway resources are indexed if and only if they are defined as type-1 gopher items; in other words, if they are links to DIRECTORIES in ftp servers.
    Individual files (gopher types 0, 9, 4, 5, 6, I, etc.) offered by ftp gateways are not included in the index. File-type ftp items were discontinued in May, 1994; at that time the number of ftp-gatewayed files was greater than five million.
    go4gw gateway items: gopher directory-type items represented by gateway servers running at ports 4320-4324 are included at the DIRECTORY level only (type-1 resource). These gateways are not "followed" by the indexing software. The gateway's title appears in the index; information items SERVED BY THE GATEWAY are not included. Go4gw gateways running at other ports may be followed inadvertently by the veronica harvester, and thus some gatewayed data of other types may be included in the index.
    NNTP and FINGER: heuristics are used to avoid finger services and nntp (usenet news) articles provided via go4gw and other gateways. Usenet news articles held as files and offered directly by gopher servers are however included in the index. 
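The gateway rules above can be restated as a small decision function. This is only an illustrative sketch, not the harvester's actual code: the function name and the via_ftp_gateway flag are hypothetical (the FAQ does not say how the harvester recognizes a gopher-to-ftp gateway item), and the NNTP and FINGER heuristics are not modeled.

```python
# Ports at which go4gw gateways are treated as "directory level only".
GO4GW_PORTS = range(4320, 4325)  # 4320 through 4324 inclusive


def is_indexed(item_type, port, via_ftp_gateway=False):
    """Return True if an item would be indexed under the gateway rules.

    item_type is the one-character gopher type ("1" is a directory).
    via_ftp_gateway is a hypothetical flag marking gopher-to-ftp items.
    """
    if via_ftp_gateway:
        # Only links to ftp DIRECTORIES (type 1) are indexed.
        return item_type == "1"
    if port in GO4GW_PORTS:
        # go4gw gateways: the directory title is indexed, nothing below it.
        return item_type == "1"
    # Items served directly by a gopher server are indexed.
    return True
```

For example, is_indexed("9", 70, via_ftp_gateway=True) is False because the item is an individual ftp file, while is_indexed("1", 4320) is True because it is a directory-level gateway entry.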

Q5: Which gopher servers are not included in the veronica index?

A gopher may not be in the index for several reasons:

    The gopher administrator may have requested that the server be excluded (see Q9 below).
    The gopher may be relatively new. In this case the veronica index data may have been harvested before the gopher came on line. If the new gopher has been registered with Minnesota, it will be included in the next index update, provided it is accessible to the veronica harvester.
    A gopher may not be included because it was not accessible to the veronica data harvester. This may occur because of transient network outages, or because the server was not operational when the harvester tried to contact it. It may also be the case that a gopher server restricts access to outside domains and thus is not indexable.
    A gopher server will not be indexed if it is running at an IP port numbered below 1024, except of course for port 70. This exclusion is necessary because the harvester needs to avoid all sorts of peculiar information services which run at registered low ports. 
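The port rule in the last item is simple enough to write down directly. A minimal sketch, with a hypothetical function name; it covers only the port test and none of the other reasons listed above.

```python
GOPHER_PORT = 70  # the registered gopher port


def port_is_harvestable(port):
    """True if the harvester would consider a server at this port.

    Ports below 1024 are skipped, except for port 70.
    """
    return port == GOPHER_PORT or port >= 1024
```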

Q6: How can I connect to veronica?

veronica is accessed through any gopher client. The client may be one of the gopher-specific clients (TurboGopher, Unix curses gopher, WSGopher, etc.) or a multiprotocol browser such as Mosaic, Netscape, Chameleon, etc.

Use the client to find a veronica-access menu on a gopher server menu. Most gopher servers will have a menu named something like "Search GopherSpace using veronica".

The client may have a "starting points" list including veronica. If your local gopher server does not have a veronica access menu, point your gopher client to the veronica HOME MENU at:
gopher://veronica.scs.unr.edu:70/11/veronica

An alternative veronica access menu is at the Mother Gopher:
gopher://gopher.tc.umn.edu:70/11/Other Gopher and Information Servers/Veronica
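If your client wants a host, port, and selector rather than a URL, the pieces can be read off the gopher:// address. The following is a rough sketch using Python's standard urllib.parse; it assumes the usual gopher URL layout in which the first character of the path is the item type and the rest is the selector. The function name is hypothetical.

```python
from urllib.parse import urlsplit, unquote


def parse_gopher_url(url):
    """Split a gopher URL into (host, port, item_type, selector)."""
    parts = urlsplit(url)
    if parts.scheme != "gopher" or not parts.hostname:
        raise ValueError("not a gopher URL: " + url)
    # Drop the leading "/" and decode any %xx escapes.
    path = unquote(parts.path[1:])
    # An empty path means the server's top-level menu (a directory).
    item_type = path[0] if path else "1"
    selector = path[1:]
    # Port 70 is the default when the URL does not give one.
    return parts.hostname, parts.port or 70, item_type, selector


host, port, item_type, selector = parse_gopher_url(
    "gopher://veronica.scs.unr.edu:70/11/veronica")
print(host, port, item_type, selector)
```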

The veronica home menu contains several types of items. There is this FAQ, and a short document "How to Compose veronica Queries". There is a submenu containing advice for gopher server administrators, statistics about gopherspace, access to veronica software, and HTML versions of several documents.

More importantly, a number of veronica servers will be listed. This home menu is automatically reconfigured every ten minutes, so only the currently-active veronica servers will be displayed. You may choose to submit a search to any of these publicly-accessible servers, or you can submit your query to the "simplified veronica search" option which also appears on the menu. The "simplified" search is a gateway which contacts all the veronica servers for you, saving you the chore of trying servers until you find one which accepts your search. Sometimes all the servers are busy; in that case, resubmit your search in a minute or so.

Q7: How do I know where an information resource is located?

Most gopher clients offer a "get information" command or an "item descriptor" menu choice. For instance, TurboGopher uses "command i"; the Unix curses client uses an equal sign "="; WSGopher has the "File/info on item" option, and Mosaic has the "Options/Show Current URL" option.

Various degrees of information may be available. For items served by gopher-0 servers, you will be able to determine the domain name and hostname of the server, which may be of some use. Gopher+ servers may offer Institution Name attributes, contact person names, or abstracts with further meta-data about the resource.
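The hostname a client reports comes from the menu line itself. In the basic gopher protocol each menu line carries a type character and title, then a selector, a host, and a port, separated by TAB characters. A minimal sketch of reading those fields follows; the function name and the sample line are made up for illustration.

```python
def parse_menu_line(line):
    """Parse one gopher menu line into its fields.

    Expected layout: <type><title> TAB <selector> TAB <host> TAB <port>
    Gopher+ servers may append further TAB-separated fields; they are
    ignored here.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 4 or not fields[0]:
        raise ValueError("not a gopher menu line: %r" % line)
    return {
        "type": fields[0][0],
        "title": fields[0][1:],
        "selector": fields[1],
        "host": fields[2],
        "port": int(fields[3]),
    }


# Hypothetical menu line, as it might appear on a results menu.
item = parse_menu_line(
    "0veronica FAQ\t0/veronica/veronica-faq\tveronica.scs.unr.edu\t70\r\n")
print(item["host"], item["port"])
```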

Advanced clients may automatically retrieve the Gopher+ item descriptor meta-information, and display it for each item as the pointer is moved across the veronica results menu.

Q8: How can I get my server into the veronica database?

The veronica harvester software will find your gopher server IF it is registered with the Mother Gopher at Minnesota, OR IF it is referenced on the menu of another gopher server which is registered at Minnesota. Of course, the veronica harvester will not be able to access your server if you have restricted access to your local site.

veronica does not currently have the ability to add new gophers to the index immediately when they come online. New servers will be included only at the next general update.

If your server has been omitted, send mail to veronica@scs.unr.edu.

Q9: How can I keep my server OUT of the veronica database?

There are two ways:

    If you run the Unix gopher+ server from U. of Minnesota, you may include the line "veronicaindex: no" in the gopherd.conf file. The veronica harvester will completely exclude your server from the index.
    You can use the special veronica control file protocol. THIS IS THE RECOMMENDED WAY TO CONTROL VERONICA'S BEHAVIOR WITH RESPECT TO YOUR GOPHER SERVER. The control files will work with gopher-level-0 servers, and can be used with any kind of gopher server on any operating system platform. In brief, the control file lets you specify a number of options for the veronica harvester. You can completely exclude the server from the index, or specify that only certain menus are to be indexed. For more information, see the home veronica menu (Q6) and look in the "More veronica" submenu (gopher://veronica.scs.unr.edu/11/veronica/About) 

Q10: How can I tell veronica to index PART of my gopher server?

Use the veronica-control-file protocol. THIS IS THE RECOMMENDED WAY TO CONTROL VERONICA'S BEHAVIOR WITH RESPECT TO YOUR GOPHER SERVER. The control files will work with gopher-level-0 servers, and can be used with any kind of gopher server on any operating system platform. In brief, the control file lets you specify a number of options for the veronica harvester. You can completely exclude the server from the index, or specify that only certain menus are to be indexed. For more information, see the home veronica menu (Q6) and look in the "More veronica" submenu (gopher://veronica.scs.unr.edu/11/veronica/About).

Q11: Does veronica work with Jughead?

Jughead can supply a prepared index file to the veronica harvester. This involves setting up Jughead with certain options, and configuring a veronica control file (see Q10) to tell the veronica harvester how to obtain the data file from Jughead. For more information, see the Jughead distribution documents and the veronica control file documents.

Q12: How often is the database updated?

Approximately once per month.

Q13: Where can I get the software to run veronica?

The veronica server software can be obtained by anonymous ftp from futique.scs.unr.edu or veronica.scs.unr.edu. The veronica server code is in the directory "veronica-code". The current version (Dec 94) of the veronica server is 0.6.5f. It runs on most flavors of Unix, requires a perl interpreter and ndbm, and needs about 2 GB for the dataset (as of Jan 18, 1995).

Q14: Where can I get the veronica data set?

You can anonymous-ftp the full veronica dataset from futique.scs.unr.edu, in the "veronica-data" subdirectory. This data has been processed to eliminate redundant references, to avoid loops in the gopher network, and to remove most data that is known to be highly transient.

Q15: What does "veronica" mean?

very easy rodent-oriented net-wide index to computerized archives.

Q16: Why doesn't my server show up in veronica searches?

See Q5.

Q17: Where are the veronica server sites?

There are currently (January 13, 1995) ten publicly-accessible veronica servers. All of them can be accessed via the veronica Gopher menu at veronica.scs.unr.edu. If that menu is unavailable, consult the Mother Gopher for a veronica access menu. See Q6.

The ten public veronica servers are provided by:

    Nevada System Computing Services
    University of Pisa
    University of Koeln
    University of Bergen
    University of Texas, Dallas
    University of Manitoba
    NYSERNET
    PSI, Inc.
    SUNET
    Tachyon Communications 

Q18: How do I contact the veronica developers and providers?

Send mail to veronica@scs.unr.edu.

Q19: How do I compose veronica search queries, and use various options?

The simplest veronica search is just a single word, followed by a RETURN. The following (better) answer is from the document "How to Compose veronica Queries". 


Veronica FAQ (Part 2 of 2)
HOW TO COMPOSE VERONICA QUERIES
- June 23, 1994: Steven Foster

This document is an introduction to using veronica. 278 lines. gopher://veronica.scs.unr.edu:70/00/veronica/how-to-query-veronica

veronica: very easy rodent-oriented net-wide index to computerized archives.

Contents:

    Introduction.
    Types of Searches.
    Multiple Servers.
    Pre-defined Search Types.
    Entering a Query.
    Default Maximum Items.
    Query Logic, Boolean Searching, and Wildcards.
    Finding Resources of a Certain Gopher "Type".
    Summary of Options.
    Examples. 

INTRODUCTION

veronica is an index and retrieval system which can locate items on most of the gopher servers in the Internet. The veronica index contains about 10 million items from approximately 5500 gopher servers (June 1994).

veronica finds resources by searching for WORDS in TITLES. It does not do a full-text search of the contents of the resources; it finds resources whose titles contain your specified search word(s). The "title" is the title of the resource as it appears on the menu of its HOME gopher server.

veronica is used with a gopher client. You will choose "veronica" from the menu of some gopher server, and enter a set of query words or special directives. When the search is finished, the results will be presented as a normal gopher menu. You may browse the discovered resources in this menu, as you would use any other gopher menu.

TYPES OF SEARCHES

Most veronica-access menus offer several types of searches. In addition to these pre-defined types, you can compose veronica queries using a number of special options to focus your search more precisely. You should use these options when appropriate, as they will make it much easier to locate resources. (See sections below for PRE-DEFINED SEARCH TYPES and FINDING RESOURCES OF A CERTAIN GOPHER TYPE.)

MULTIPLE SERVERS

Many veronica-access menus offer a list of various veronica server sites; in this case you will have to choose a server site to use. Ideally, it does not matter which server you use, as all servers will give the same answers. In practice, the servers do not all update the index at the same time, so there will be some difference in the results. Some servers will return an answer faster than others, depending on load and network traffic.

Many other veronica-access menus offer a single entry rather than a list of servers. In this case, simply click on the search type desired, and submit your query in the dialog box.

PRE-DEFINED SEARCH TYPES

Most access menus offer two predefined search types:

Search GopherSpace by keywords in Titles

This search will find ALL TYPES of resources whose titles contain your specified search words. The resources may be of any Gopher data TYPE; e.g. ascii documents, gopher directories, image files, binary files, etc.

Search Gopher DIRECTORIES ONLY for keywords in Titles.

This search will find only Gopher DIRECTORIES whose titles contain the specified words. This search can be very useful to find only major holdings of information which relate to your query. After veronica finds the gopher directories, you can open any of them to see the contents in more detail. This is especially useful to avoid being overwhelmed by too many results if you are searching with a common word such as "women" or "internet"!

You can define your own query, specifying only certain TYPES of gopher resources, by using the -t option. For instance, you could search for ONLY image files by including the phrase "-tI" in your query. See below for more about the -t option.

ENTERING A QUERY

When you select a query type, your gopher client will present a dialog box. Enter your query words. The search is NOT case-sensitive.
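Behind the dialog box, a gopher client sends the query to the server as an ordinary gopher search request: the selector of the search item, a TAB, the query text, and a CR LF. The sketch below only builds that request; the function name is hypothetical, the selector of a real veronica server is not given in this document (an empty string stands in for it), and the Latin-1 encoding is an assumption.

```python
def build_search_request(selector, query):
    """Build the bytes a client sends for a gopher type-7 search.

    Layout: <selector> TAB <query> CR LF
    """
    return (selector + "\t" + query + "\r\n").encode("latin-1")


# The empty selector is a placeholder, not a real veronica selector.
request = build_search_request("", "league women voters")
print(request)
```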

You may get better results by entering a multi-word query rather than a single word. Multiple word queries will find only those items whose titles contain ALL of the specified words. For instance, "women" will find 5223 items; but "league women voters" will find 126 items. Be as specific as you can.

It also helps to be imaginative. Think about how gophers are organized; the information you want may not be found under "league of women voters", but under the more general heading of "politics".

A multiple-word query does not require that the words be adjacent in the title, nor that they appear in any particular order. So, "marx brothers" will locate the same items as "brothers marx".
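The matching rule for a plain multiple-word query can be pictured as follows. This is a sketch of the rule as described here, not of the server's implementation; in particular, splitting titles on whitespace is an assumption, and the function name is hypothetical.

```python
def title_matches(title, query):
    """True if every query word appears in the title.

    Matching ignores case, word order, and adjacency.
    """
    title_words = set(title.lower().split())
    return all(word in title_words for word in query.lower().split())


# Word order in the query does not matter.
print(title_matches("The Marx Brothers Filmography", "brothers marx"))
```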

There is more information on composing queries below.

DEFAULT MAXIMUM ITEMS and the "-m" option

By default, the veronica servers will deliver only the first 200 items which match your query. You can request any number of items by including the "-mX" command phrase in your query. X is the number of items you wish. If X is omitted ( "-m" ), there is no limit to the number of items delivered.

For instance:

    "women" will provide 200 items.
    "women -m1000" will provide 1000 items.
    "women -m" will provide all available matching items.

You may find a message at the end of your veronica results menu, like "*** There are 576 more items matching your query". If you are not satisfied with the 200 items you got, you can resubmit the query, requesting more items with "-m".
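The three cases of the "-m" option can be summarized in a few lines. A minimal sketch, assuming the default of 200 items described above (some servers use a different default); the function name is hypothetical, and None is used here to mean "no limit".

```python
import re

DEFAULT_MAX_ITEMS = 200  # default stated above; some servers differ


def max_items(query):
    """Return the item limit requested by a query, or None for no limit."""
    for token in query.split():
        match = re.fullmatch(r"-m(\d*)", token)
        if match:
            digits = match.group(1)
            # "-m" with no number means all matching items.
            return int(digits) if digits else None
    return DEFAULT_MAX_ITEMS


print(max_items("women"), max_items("women -m1000"), max_items("women -m"))
```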

Note that some veronica servers will provide more than 200 items by default.

QUERY LOGIC, BOOLEAN SEARCHING, and WILDCARDS

The search understands the logical operators AND, NOT, OR, (, and ).

If you use a simple multiple-word query, it is the same as using AND between the words. For instance "acid rain" is the same query as "rain and acid". "League women voters" is the same as "league and women and voters".

As noted above, we recommend using AND to create a tightly-focused query.

We recommend that the word "OR" be used VERY RARELY. Usually, OR will just produce thousands of hit-or-miss results. OR is best used in conjunction with other operators, as in "rice and (fried or curr*)".

An asterisk ("*") at the TRAILING END of a query word will match anything. Use it as a limited form of wildcard search. The asterisk character may be used ONLY at the end of words; the search will fail if a "*" is placed within a word or at the beginning of a word.

Search words must be at least two characters long. Shorter words will be ignored.

Interpretation of the query starts from the right-hand side, interpreting operators as encountered. If in doubt about order of interpretation, USE PARENTHESES! The veronica server at University of Koeln (as of June 1994) interprets the query logic from left to right.

FINDING RESOURCES OF A CERTAIN GOPHER "TYPE": the "-t" flag

You can use veronica to find resources of (only) a specified gopher type. You specify the type(s) of interest by adding the "-tX" option phrase to your query.

The -t flag may appear anywhere in the search specification. For example:

"women -t1"
"-t1 women"

Either of these search phrases will find resources with the word "women" in the title. All the resources will be gopher DIRECTORY items ( type 1 ).

There must NOT be any spaces between the -t and the type specifier.

You may specify MORE THAN ONE type in the query. DO NOT use separate -t options to do this; simply put all the types together (with no spaces) after the -t. For example:

"-tgs mac" returns a menu of GIF images or SOUNDS with the word "mac" in titles.

Official gopher types, from the Gopher Protocol Document, are:

          0  -- Text File
          1  -- Directory
          2  -- CSO name server
          4  -- Mac HQX file.
          5  -- PC binary
          7  -- Full Text Index (Gopher menu)
          8  -- Telnet Session
          9  -- Binary File
          s  -- Sound
          e  -- Event    (not in 2.06)
          I  -- Image (other than GIF)
          M  -- MIME multipart/mixed message
          T  -- TN3270 Session
          c  -- Calendar (not in 2.06)
          g  -- GIF image
          h  -- HTML, HyperText Markup Language
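The "-t" rules in this section (no space after -t, several type characters packed behind one flag) can be illustrated with a short sketch. The function names and the (type, title) tuples are hypothetical, and how a real server treats a bare "-t" with no type characters is not stated here, so the sketch simply leaves such a token among the query words.

```python
def split_type_option(query):
    """Separate a "-tX" option from the rest of the query tokens.

    Returns (types, words); types is a set of type characters, or None
    when the query has no -t option.
    """
    types = None
    words = []
    for token in query.split():
        if token.startswith("-t") and len(token) > 2:
            # Every character after "-t" is one gopher type.
            types = set(token[2:])
        else:
            words.append(token)
    return types, words


def filter_by_type(items, types):
    """Keep only (type, title) items whose type is in the requested set."""
    if types is None:
        return list(items)
    return [item for item in items if item[0] in types]


types, words = split_type_option("-tgs mac")
print(sorted(types), words)
```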

SUMMARY OF THE OPTIONS

    -t  limit the search to items of specified data type(s).
    -m  specify maximum number of items to find.
    -l  create a file of links for the discovered resources.  The file
        will be displayed as the first item on the veronica results menu.
        You can then retrieve that file and include the links in menus
        which you may be building.  Not all veronica servers support the
        "-l" option.

Just include the options in the search query. They will work with any gopher client. You can put options before the query words, after the query words, or even between query words.

DO NOT cluster more than one option behind a single hyphen; instead, use a separate hyphen for each separate option. For example:

gopher -t1s -m400

This example requests 400 items containing the word "gopher", and specifies that we want only items whose type is "directory" or "sound".

EXAMPLES (from Fred Barrie):

Simple examples:

Search on the word "internet". This will return a menu list of (at most) 200 records that have the word internet in the title field.

Just type-
internet

Search on the word "internet", but specify 1000 items instead of the default 200.

type-
internet -m1000

or
-m1000 internet

Search on the words "chicken" and "wine". This returns a menu list of (at most) 200 records that have _BOTH_ "chicken" and "wine".

Type-
chicken and wine

Search for the keywords "chicken" or "wine", specifying directories only. This returns a menu list of resources that have _EITHER_ chicken or wine, and which are GOPHER DIRECTORY entries.

Type-
chicken or wine -t1

or
-t1 chicken or wine

Examples for the operator "NOT":

To use the operator "NOT" in a query:

chicken not wine
    (will search for all titles with the word chicken _BUT NOT_ the word wine) 
chinese food not msg
    (for our health nuts, will search for all the titles with the words chinese _AND_ food _BUT NOT_ msg. Remember there is an implied _AND_ between two words)

Examples for parenthesis queries:

chicken (wine or curry) -m
    (will list ALL titles with the words chicken _AND_ either wine _OR_ curry. -m asks for ALL records.) 
(chicken or wine) not (msg or growing)
    (will search for titles with the words chicken _OR_ wine _BUT NOT_ msg _OR_ growing)

Examples for word stemming: The metacharacter "*" matches anything at the TRAILING END of a search word.

chicken*
    (will search for all titles with the word chicken, chickens, ...) 
chicken* or wine*
    (will search for all titles with the word chicken, chickens, ... _OR_ wine, wines, wineries, ...)
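To make the examples above concrete, here is a small sketch of a query evaluator that handles AND, OR, NOT, parentheses, implied AND, and a trailing asterisk. It is not the veronica server's code. Several points are assumptions: operators have no precedence and group toward the right (one reading of "interpretation starts from the right-hand side"), titles are split on whitespace, options such as -m and -t1 are simply dropped, and the two-character minimum for search words is not modeled. All function names are hypothetical.

```python
OPERATORS = {"and", "or", "not"}


def tokenize(query):
    """Lowercase, split out parentheses, drop options, add implied ANDs."""
    spaced = query.lower().replace("(", " ( ").replace(")", " ) ")
    raw = [t for t in spaced.split() if not t.startswith("-")]
    tokens = []
    for tok in raw:
        prev = tokens[-1] if tokens else None
        starts_operand = tok == "(" or (tok not in OPERATORS and tok != ")")
        prev_ends_operand = prev is not None and (
            prev == ")" or (prev not in OPERATORS and prev != "("))
        if starts_operand and prev_ends_operand:
            tokens.append("and")  # two adjacent operands mean AND
        tokens.append(tok)
    return tokens


def parse(tokens):
    """Grammar: expr := term [operator expr]; term := word | ( expr )."""
    def expr(pos):
        left, pos = term(pos)
        if pos < len(tokens) and tokens[pos] in OPERATORS:
            op = tokens[pos]
            right, pos = expr(pos + 1)  # everything to the right groups first
            return (op, left, right), pos
        return left, pos

    def term(pos):
        if pos >= len(tokens):
            raise ValueError("query ends unexpectedly")
        tok = tokens[pos]
        if tok == "(":
            node, pos = expr(pos + 1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            return node, pos + 1
        if tok == ")" or tok in OPERATORS:
            raise ValueError("unexpected token: " + tok)
        return tok, pos + 1

    tree, pos = expr(0)
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return tree


def word_matches(word, title_words):
    """Exact word match, or prefix match when the word ends in "*"."""
    if word.endswith("*"):
        return any(t.startswith(word[:-1]) for t in title_words)
    return word in title_words


def evaluate(tree, title_words):
    if isinstance(tree, str):
        return word_matches(tree, title_words)
    op, left, right = tree
    lhs = evaluate(left, title_words)
    rhs = evaluate(right, title_words)
    if op == "and":
        return lhs and rhs
    if op == "or":
        return lhs or rhs
    return lhs and not rhs  # "a not b" means a BUT NOT b


def matches(query, title):
    return evaluate(parse(tokenize(query)), set(title.lower().split()))


print(matches("chicken (wine or curry) -m", "Chicken curry"))
```

Under these assumptions, matches("chicken not wine", "Chicken in wine sauce") is false and matches("chicken* or wine*", "Local wineries") is true. A server that reads operators left to right, like the Koeln server mentioned earlier, could group an unparenthesized query differently, which is one more reason to use parentheses.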