Help on the simple search interface for the Computer Science Bibliography Collection
You can search the bibliography
collection for references or for bibliographies. The search
mechanism is based on glimpse.
The bibliographies are hierarchically structured and a search
at any point in the hierarchy will only search the contents of the
bibliographies contained in that subtree. Thus, if you are
searching at the very top you search
the complete collection, whereas if you descend to the bottom of the hierarchy
you only search a single bibliography.
You are searching all the information contained in the references,
that is, including author names, titles, journal and conference names,
publication dates, keywords, abstracts, ...
Unfortunately there is no way to restrict searches to only one field with this search interface. If you need this capability then try the advanced search interface.
Phrases
Enter the phrases you are looking for as plain text. The only
special characters are '(' and ')': parentheses cannot be used
inside search phrases.
A search phrase can consist of several words separated by spaces.
The phrase must appear verbatim in any matching reference; even a
simple line break in the reference will prevent a match. Therefore
phrases should be kept short (at most 2 or 3 words).
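The effect of a line break on phrase matching can be illustrated with a minimal Python sketch. It assumes that a phrase matches only as a literal substring of the stored reference text; the function name phrase_matches and the sample references are hypothetical and are not taken from the search engine itself.

```python
def phrase_matches(phrase, reference):
    """Return True if the phrase occurs verbatim in the reference text."""
    return phrase in reference

# Hypothetical reference text, once on a single line and once wrapped.
single_line = "title = {Object migration in distributed systems}"
wrapped = "title = {Object\n  migration in distributed systems}"

print(phrase_matches("Object migration", single_line))  # phrase is intact
print(phrase_matches("Object migration", wrapped))      # broken by the line break
print(phrase_matches("migration", wrapped))             # a single word still matches
```

Under this assumption the two-word phrase is found only in the unwrapped entry, which is why short phrases or single words combined with 'and' are the safer choice.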
Boolean Operators
You can combine search phrases arbitrarily with the logical AND
operator 'and' and the OR operator 'or'. By putting
matching parentheses '(' and ')' around expressions you
can define the precedence of the boolean operators.
- migration
- search for "migration"
- H{\"o}pfner or Hoepfner
- If you are looking for names containing umlauts, accents or
digraphs, you must use the corresponding LaTeX expressions and you
should also try common ASCII transcriptions.
- object migration and Lazowska
- search for references to publications by "Lazowska" containing
"object migration": the two words must appear exactly like that in a
matching reference; a line break in between the two words would prevent
a match.
- object and (migrating or migration) and Lazowska
- search for references to publications by "Lazowska" containing
"object" and either "migrating" or "migration" anywhere in the reference.
- (object or process) and migration and 1994
- search for references that contain "object" or "process", together with "migration" and "1994" (typically the publication year).
- embedding and ((mesh and hypercube) or (tree and butterfl))
- Search for embeddings of meshes in hypercubes or for embeddings of trees in butterflies. Use with partial word matching to catch plural forms.
- @techreport and (CM5 or CM-5 or {CM}-5)
- Search for technical report references to the Connection Machine 5. Note the use
of braces to account for the BibTeX idiosyncrasy of requiring
that proper names containing uppercase letters be surrounded by braces
in title fields.
- parallel and url
- Search for online publications on parallel processing.
- SmithJ
- You can search for authors using their initials: append the
initials in capital letters to the surname. The search should be made
case-sensitive and partial word matching must be set. The above query would match both John Smith and James Thomas Smith.
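How 'and', 'or' and parentheses combine phrases in the examples above can be sketched with a small Python evaluator. This is only an illustration, not the query parser used by the search interface. It assumes that 'and' binds more tightly than 'or' when no parentheses are given (the help text does not state the default precedence) and that a phrase matches as a case-insensitive substring. The names tokenize, parse and matches, as well as the sample reference, are hypothetical.

```python
import re

def tokenize(query):
    """Split a query into '(', ')', 'and', 'or' and multi-word phrases."""
    tokens, words = [], []
    for part in re.findall(r"[()]|[^\s()]+", query):
        if part in ("(", ")", "and", "or"):
            if words:  # close the phrase collected so far
                tokens.append(("phrase", " ".join(words)))
                words = []
            tokens.append((part, part))
        else:
            words.append(part)
    if words:
        tokens.append(("phrase", " ".join(words)))
    return tokens

def parse(tokens):
    """Build a nested tuple tree; 'and' binds tighter than 'or' (assumption)."""
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        node = parse_and()
        while peek() == "or":
            take()
            node = ("or", node, parse_and())
        return node

    def parse_and():
        node = parse_atom()
        while peek() == "and":
            take()
            node = ("and", node, parse_atom())
        return node

    def parse_atom():
        if peek() is None:
            raise ValueError("unexpected end of query")
        kind, text = take()
        if kind == "(":
            node = parse_or()
            if peek() != ")":
                raise ValueError("missing ')'")
            take()
            return node
        if kind == "phrase":
            return ("phrase", text)
        raise ValueError(f"unexpected {text!r}")

    tree = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return tree

def matches(tree, reference):
    """Evaluate a parsed query against one reference (substring matching)."""
    if tree[0] == "phrase":
        return tree[1].lower() in reference.lower()
    left = matches(tree[1], reference)
    right = matches(tree[2], reference)
    return (left and right) if tree[0] == "and" else (left or right)

# Hypothetical reference used only for illustration.
reference = ("@article{x, author = {E. D. Lazowska}, "
             "title = {Fine-grained object migration}, year = 1988}")
for query in ("object and (migrating or migration) and Lazowska",
              "(object or process) and migration and 1994"):
    print(query, "->", matches(parse(tokenize(query)), reference))
```

Because the default precedence is an assumption here, explicit parentheses remain the reliable way to state what a query means.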
When you search for bibliographies, you search the
descriptions of the individual bibliographies in the
collection and obtain pointers to bibliographies as results. This
search facility may be used instead of browsing the collection
hierarchically; it actually amounts to searching the HTML files in the
collection. Generally, queries should be kept very simple, without
boolean operators, and search terms should be kept general, describing
areas rather than specific topics.
The results page that will be returned to you after a
search indicates the search terms of your query in easily readable
form. This will show you how your query expression has been
interpreted by the query parser.
The BibTeX entries matching your query follow. Each
sequence of matching entries from the same bibliography is
introduced by the name of that bibliography, which serves as a direct
link to the title page for that bibliography.
The BibTeX entries are rendered in HTML with the matching words
highlighted and any URLs in the entry made into a clickable link. That
allows you to immediately access that paper if a reference contains a
URL to the online version of the paper.
If you save the page to a file in plain format (not
HTML) you will obtain perfectly valid BibTeX references that can
directly be used with LaTeX or merged with your own bibliographies.
Rendering of results:
- Citation
- The matching entries will be reduced to the bare bibliographic
necessities and rendered like an entry in a literature list. Note that you will not see abstracts or any other information beyond what is necessary to locate a paper version of the publication.
The advantage of this format is that the list of matching entries is shorter and therefore somewhat easier to browse.
- BibTeX
- This format displays the full BibTeX entry, including all the
available information that might be missing in the citation format
(abstracts, keywords, annotations). The BibTeX entry can easily be
transferred into your personal BibTeX bibliography using
cut and paste.
- Count Only
- Do not display any references, only count the number of matches in each bibliography.
Some references also contain forward or backward crossreferences to
papers citing the paper or cited by the paper. These crossreferences
are rendered as live searches that display the crossreferenced
publication.
At the bottom of the page you will find how many matches have been
found by the search engine and over how many bibliographies the
results are spread.
If you choose to obtain compressed result data, the data will be
compressed with gzip. The HTTP response header will specify the
Content-Encoding x-gzip.
You should configure your browser so that it can decompress the data
transparently on the fly. These valuable tips might help you with the
configuration.
If your browser cannot handle compressed data transparently, you can
simply save the results to a file with the ending .html.gz and use
gzip manually to decompress the file. Then you can browse the
resulting .html file with your browser.
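If the gzip command-line tool is not at hand, a saved result file can also be decompressed with Python's standard gzip module. The following is a minimal sketch; the function name gunzip_file and the file names in the usage comment are hypothetical.

```python
import gzip
import shutil

def gunzip_file(src, dst):
    """Decompress the gzip file at src and write the plain data to dst."""
    with gzip.open(src, "rb") as compressed, open(dst, "wb") as plain:
        shutil.copyfileobj(compressed, plain)

# Example call, assuming the results were saved as results.html.gz:
# gunzip_file("results.html.gz", "results.html")
```

The resulting .html file can then be opened in the browser as described above.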
Why compression?
Compression of results helps deliver the results to you quickly and
terminates the memory-consuming processes on the server earlier, thus
increasing throughput by reduced paging.
How much exactly are the savings? Well, the average BibTeX entry in
this collection consumes 550 bytes. The average compression ratio
obtained with gzip is 5:1. Therefore if you retrieve 200
matches, you transfer only 22,000 bytes with compression instead of
110,000 bytes uncompressed!
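The arithmetic behind these figures can be reproduced for other result sizes with a few lines of Python. The averages are the ones quoted above; the function name transfer_sizes is hypothetical, and real transfers will deviate from the averages.

```python
AVG_ENTRY_BYTES = 550    # average size of a BibTeX entry, as quoted above
COMPRESSION_RATIO = 5    # average gzip ratio of 5:1, as quoted above

def transfer_sizes(matches):
    """Return (uncompressed, compressed) size estimates in bytes."""
    uncompressed = matches * AVG_ENTRY_BYTES
    compressed = uncompressed // COMPRESSION_RATIO
    return uncompressed, compressed

print(transfer_sizes(200))  # (110000, 22000)
```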
The search database consists of all the bibliographies in the
bibliography collection in slightly modified form (see below). The
complete database including indices consumes 400 MBytes of disk space.
The search database differs from the bibliographies in the collection
in the following respects:
- Duplicates:
- The database contains only about 75% of the references in the
bibliography collection; the other 25% are automatically detected
duplicate entries. Therefore you will not get as many redundant
references as results of your search.
- Abbreviations:
- Many bibliographies in the collection define text macros
(@String abbreviations) and use these macros in the fields of
references. Those abbreviations have been expanded for the purpose of
searching, such that the words in those macros are also indexed and can
be searched for. Furthermore, the references returned as search
results contain the complete information: there is no need to find the
definitions of any macros occurring in a reference.
- Crossreferences:
- Some bibliographies in the collection use the BibTeX feature of
crossreferences and thereby omit some information in the
references. These crossreferences have been expanded for the purpose
of searching, which improves the search considerably.
- Length of query:
- If the query contains any non-alphanumeric characters, it must
not be longer than 32 characters.
- Server Load:
- The number of simultaneous searches is limited. If your search
request is not serviced then try again later. Complicated queries
might lead to long search times and eventually might cause a timeout
for the search. Try to keep your queries neat and simple, that is, use
only alphanumerical characters in your search words and search for
complete words.
- Case-sensitive/case-insensitive
- Here you can choose to make the search case-sensitive. In that case,
for example, ocr will not match OCR and Fisher will not match fisher.
This is only recommended for acronyms or proper names.
- partial word(s)/exact
- "exact" means that any search phrase must occur in the references as a
full phrase, that is, both its ends must fall on word
boundaries. By choosing "partial word(s)" you allow the search phrases
to match as partial phrases as well. "partial word(s)" should be chosen,
for example, to include plural forms of search terms in the search, but
should be avoided if the query contains short words that frequently
appear as syllables in other, unrelated words (for instance, "att"
might appear in "Seattle", "attention", "attenuation", ...).
Note that a multi-word phrase "introduction to neur" will
match "introduction to neural nets", but "intro to neur" will not.
This option increases the response time significantly if short
search words are used.
- Maximum number of returned references
- You can adjust the maximum number of matching references
returned by the search engine. This number is 40 by default.
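The interplay of the case-sensitivity and partial-word options described above can be sketched in Python. This illustrates the documented behaviour using regular expressions; it is not how the glimpse-based engine is implemented. The function name phrase_found and its parameters are hypothetical, and a word boundary is taken to mean "not adjacent to a letter, digit or underscore".

```python
import re

def phrase_found(phrase, text, case_sensitive=False, partial=False):
    """Check whether phrase occurs in text under the two search options."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.escape(phrase)
    if not partial:
        # "exact": both ends of the phrase must fall on word boundaries.
        pattern = r"(?<!\w)" + pattern + r"(?!\w)"
    return re.search(pattern, text, flags) is not None

print(phrase_found("ocr", "OCR systems", case_sensitive=True))   # case differs
print(phrase_found("att", "Seattle", partial=True))              # syllable match
print(phrase_found("att", "Seattle"))                            # exact: no match
print(phrase_found("introduction to neur",
                   "introduction to neural nets", partial=True))
print(phrase_found("intro to neur",
                   "introduction to neural nets", partial=True))
```

In this sketch only the ends of a phrase may be partial: "intro to neur" fails because "intro to" never occurs literally in the text, which mirrors the note above.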
First:
- Check your spelling; misspellings in queries are a common error. Try both American and British spelling if your search terms are in English.
- Check the syntax of your query. In particular, many clients
forget the boolean operators.
- Check if you should set the "partial word matching" option.
Make sure you did not only search for plural forms of nouns
where the singular form would be interesting, too.
- If you used hyphenated words you ought to try the same query
with a space instead of the hyphen or with a compound word
(e.g. real-time or real time or realtime). Hyphenation seems to be pretty
arbitrary for many technical terms.
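The spelling variants of a hyphenated term mentioned above can be listed mechanically. The following Python sketch uses a hypothetical helper, hyphen_variants, that returns the hyphenated, spaced and compound forms; how you combine them into one or several queries is up to you.

```python
def hyphen_variants(term):
    """Return the hyphenated, spaced and compound spellings of a term."""
    parts = term.split("-")
    variants = [term, " ".join(parts), "".join(parts)]
    # dict.fromkeys removes duplicates while keeping the order.
    return list(dict.fromkeys(variants))

print(hyphen_variants("real-time"))  # ['real-time', 'real time', 'realtime']
```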
If you still did not succeed, you should try to
- relax your query if you are in a subarea of the bibliography
collection. For instance, if you are in the Database section you might
not need to include the term database in your query.
- go to a higher place in the hierarchically organized
bibliography and try again. The organization of the bibliographies
into sections is not as clear-cut as it might seem; there is
considerable overlap between them. As you go higher in the hierarchy
you include more and more sections in your search.
If you tried all of the above and still could not find what you were
looking for then the bibliography collection probably does not contain
it. The best solution is to create a bibliography relating to the
topic you are interested in and to contribute that bibliography to the
collection.
Copyright © 1995,1996 Alf-Christian Achilles <achilles@ira.uka.de>
Last modified: Tue Jan 14 10:15:49 1997