  A Beginner's Guide to the Internet
                                                                  First Edition
                                                                  January 1992



by Brendan P. Kehoe
_______________________________________________________________________________


This is revision 1.0 of February 2, 1992.



Copyright (c) 1992 Brendan P. Kehoe

Permission is granted to make and distribute verbatim copies of this guide
provided the copyright notice and this permission notice are preserved on all
copies.

Permission is granted to copy and distribute modified versions of this booklet
under the conditions for verbatim copying, provided that the entire resulting
derived work is distributed under the terms of a permission notice identical
to this one.

Permission is granted to copy and distribute translations of this booklet into
another language, under the above conditions for modified versions, except
that this permission notice may be stated in a translation approved by the
author.
Preface                                                                       1





Preface


    The composition of this booklet was originally started because the
Computer Science department at Widener University was in desperate need of
documentation describing the capabilities of this "great new Internet link"
we obtained.

    It's since grown into an effort to acquaint the reader with much of what's
currently available over the Internet. Aimed at the novice user, it attempts
to remain operating system "neutral": little information herein is specific
to Unix, VMS, or any other environment.  This booklet will, hopefully, be
usable by nearly anyone.

    Some typographical conventions are maintained throughout this guide.
Abstract items like possible filenames, usernames, etc., are represented
in italics. Likewise, definite filenames and email addresses are represented
in a quoted `typewriter' font. A user's session is usually offset from the
rest of the paragraph, like this:

     prompt> command
          The results are usually displayed here.

    The purpose of this booklet is two-fold: first, it's intended to serve as a
reference piece, which someone can easily grab on the fly and look something
up.  Second, it forms a foundation from which people can explore the vast
expanse of the Internet.  Zen and the Art of the Internet doesn't spend a
significant amount of time on any one point; rather, it provides enough for
people to learn the specifics of what their local system offers.

    One warning is perhaps in order: this territory we are entering can
become a fantastic time-sink. Hours can slip by, people can come and go, and
you'll be locked into Cyberspace. Remember to do your work!

    With that, I welcome you, the new user, to The Net.


                                                       brendan@cs.widener.edu
                                                                   Chester, PA
2                                             Zen and the Art of the Internet


Acknowledgements                                                           3





Acknowledgements


    Certain sections in this booklet are not my original work; rather, they
are derived from documents that were available on the Internet and already
aptly stated their areas of concentration. The chapter on Usenet is, in large
part, made up of what's posted monthly to news.announce.newusers, with
some editing and rewriting.  Also, the main section on archie was derived
from `whatis.archie' by Peter Deutsch of the McGill University Computing
Centre. It's available via anonymous FTP from archie.mcgill.ca. Much of
what's in the telnet section came from an impressive introductory document
put together by SuraNet. Some definitions in this guide are from an excellent
glossary put together by Colorado State University.

    This guide would not be the same without the aid of many people on The
Net, and the providers of resources that are already out there.  I'd like to
thank the folks who gave this a read-through and returned some excellent
comments, suggestions, and criticisms, and those who provided much-needed
information on the fly. Glee Willis deserves particular mention for all of his
work; this guide would have been considerably less polished without his help.



  o  Andy Blankenbiller, Army at Aberdeen

  o  Alan Emtage, McGill University Computer Science Department

  o  Brian Fitzgerald, Rensselaer Polytechnic Institute

  o  John Goetsch, Rhodes University, South Africa

  o  Jeff Kellem, Boston University's Chemistry Department

  o  Bill Krauss, Moravian College

  o  Steve Lodin, Delco Electronics

  o  Mike Nesel, NASA

  o  Bob Neveln, Widener University Computer Science Department

  o  Wanda Pierce, McGill University Computing Centre

  o  Joshua Poulson, Widener University Computing Services

  o  Dave Sill, Oak Ridge National Laboratory

  o  Bob Smart, CitiCorp/TTI

  o  Ed Vielmetti, Vice President of MSEN

  o  Craig Ward, USC/Information Sciences Institute (ISI)

  o  Glee Willis, University of Nevada, Reno

  o  Chip Yamasaki, OSHA


Chapter 1 Network Basics                                                  5





1  Network Basics


    We are truly in an information society.  Now more than ever, moving
vast amounts of information quickly across great distances is one of our
most pressing needs.  From small one-person entrepreneurial efforts to the
largest of corporations, more and more professional people are discovering
that the only way to be successful in the '90s and beyond is to realize that
technology is advancing at a break-neck pace, and they must somehow keep
up. Likewise, researchers from all corners of the earth are finding that their
work thrives in a networked environment. Immediate access to the work of
colleagues and a "virtual" library of millions of volumes and thousands of
papers affords them the ability to incorporate a body of knowledge
heretofore unthinkable. Work groups can now conduct interactive conferences
with each other, paying no heed to physical location; the possibilities are
endless.

    You have at your fingertips the ability to talk in "real-time" with someone
in Japan, send a 2,000-word short story to a group of people who will critique
it for the sheer pleasure of doing so, see if a Macintosh sitting in a lab in
Canada is turned on, and find out if someone happens to be sitting in front
of their computer (logged on) in Australia, all inside of thirty minutes. No
airline (or tardis, for that matter) could ever match that travel itinerary.

    The largest problem people face when first using a network is grasping all
that's available.  Even seasoned users find themselves surprised when they
discover a new service or feature that they'd never known even existed. Once
you are acquainted with the terminology and sufficiently comfortable with
making occasional mistakes, the learning process will speed up drastically.



1.1  Domains


    Getting where you want to go can often be one of the more difficult
aspects of using networks.  The variety of ways that places are named will
probably leave a blank stare on your face at first.  Don't fret; there is a
method to this apparent madness.

    If someone were to ask for a home address, they would probably expect
a street, apartment, city, state, and zip code. That's all the information the
post office needs to deliver mail in a reasonably speedy fashion.  Likewise,
computer addresses have a structure to them. The general form is:

     a person's email address on a computer    user@somewhere.domain
     a computer's name                         somewhere.domain

    The user portion is usually the person's account name on the system,
though it doesn't have to be.  somewhere.domain tells you the name of a
system or location, and what kind of organization it is. The trailing domain
is often one of the following:

com          Usually a company or other commercial institution or organiza-
             tion, like Convex Computers (`convex.com').

edu          An educational institution, e.g.  New York University, named
             `nyu.edu'.

gov          A government site; for example, NASA is `nasa.gov'.

mil          A military site, like the Air Force (`af.mil').

net          Gateways and other administrative hosts for a network (it does
             not mean all of the hosts in a network).1  One such gateway is
             `near.net'.

org          This is a domain reserved for private organizations, who don't
             comfortably fit in the other classes of domains.  One example
             is the Electronic Frontier Foundation (see Section 8.3.3 [EFF],
             page 66), named `eff.org'.

    Each country also has its own top-level domain.  For example, the us
domain includes each of the fifty states.  Other countries represented with
domains include:

au           Australia

ca           Canada

fr           France

uk           The United Kingdom.  It also has sub-domains, such as `ac.uk'
             for academic sites and `co.uk' for commercial ones.

    The proper terminology for a site's domain name (somewhere.domain
above) is its Fully Qualified Domain Name (FQDN). It is usually selected
to give a clear indication of the site's organization or sponsoring agent. For
example, the Massachusetts Institute of Technology's FQDN is `mit.edu';
similarly, Apple Computer's domain name is `apple.com'.  While such
obvious names are usually the norm, there are the occasional exceptions that
are ambiguous enough to mislead, like `vt.edu', which on first impulse one
might surmise is an educational institution of some sort in Vermont; not so.
It's actually the domain name for Virginia Tech. In most cases it's relatively
easy to glean the meaning of a domain name; such confusion is far from the
norm.
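
    As a rough illustration of the address structure described in this
section, the sketch below splits an address into its user and
somewhere.domain parts and looks up the trailing domain. The function names
and the lookup table are hypothetical; the table covers only the domains
listed above and is in no way a complete registry.

```python
# Hypothetical table: only the trailing domains mentioned in this section.
TOP_LEVEL = {
    "com": "commercial",
    "edu": "educational",
    "gov": "government",
    "mil": "military",
    "net": "network gateway or administrative host",
    "org": "private organization",
    "au": "Australia",
    "ca": "Canada",
    "fr": "France",
    "uk": "United Kingdom",
    "us": "United States",
}

def split_address(address):
    """Split user@somewhere.domain into (user, host); user is None for a bare host."""
    user, sep, host = address.rpartition("@")
    return (user if sep else None), host.lower()

def describe(address):
    """Return (user, host, description of the trailing domain)."""
    user, host = split_address(address)
    trailing = host.rsplit(".", 1)[-1]
    return user, host, TOP_LEVEL.get(trailing, "unknown")

print(describe("brendan@cs.widener.edu"))
print(describe("eff.org"))
```

Note that a table like this can only say that `vt.edu' is educational; it
takes a human (or a directory) to know that it means Virginia Tech.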

_________________________________


 1  The Matrix, 111.





1.2  Internet Numbers


    Every single machine on the Internet has a unique address,2 called its
Internet number or IP Address.  It's actually a 32-bit number, but is most
commonly represented as four numbers joined by periods (`.'), like
147.31.254.130.  This is sometimes also called a dotted quad; a 32-bit
number allows more than four billion different possible dotted quads.  The
ARPAnet (the mother to today's Internet) originally only had the capacity to
have up to 256 systems on it because of the way each system was addressed.
In the early eighties, it became clear that things would fast outgrow such a
small limit; the 32-bit addressing method was born, freeing up billions of
host numbers.

    Each piece of an Internet address (like 192) is called an "octet,"
representing one of four sets of eight bits.  The first two or three pieces
(e.g. 192.55.239) represent the network that a system is on, called its
subnet.  For example, all of the computers for Wesleyan University are in the
subnet 129.133.  They can have numbers like 129.133.10.10, 129.133.230.19,
and so on, up to 65 thousand possible combinations (possible computers).
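
    The relationship between a dotted quad, its four octets, and the
underlying 32-bit number can be sketched in a few lines. The function names
here are made up for illustration; the arithmetic is just the packing of four
eight-bit values described above.

```python
def quad_to_int(quad):
    """Pack a dotted quad such as '147.31.254.130' into one 32-bit integer."""
    octets = [int(part) for part in quad.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted quad: {quad!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet   # each octet fills the next eight bits
    return value

def int_to_quad(value):
    """Unpack a 32-bit integer back into dotted-quad notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

def hosts_in(prefix):
    """Count the addresses that share a network prefix such as '129.133'."""
    free_octets = 4 - len(prefix.split("."))
    return 256 ** free_octets

print(quad_to_int("147.31.254.130"))
print(int_to_quad(quad_to_int("129.133.10.10")))
print(hosts_in("129.133"))   # 256 * 256 = 65536, the "65 thousand" above
```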

    IP addresses and domain names aren't assigned arbitrarily; that would
lead to unbelievable confusion.  An application must be filed with the
Network Information Center (NIC), either electronically (to
hostmaster@nic.ddn.mil) or via regular mail.



1.3  Resolving Names and Numbers


    Ok, computers can be referred to by either their FQDN or their Internet
address. How can one user be expected to remember them all?

    They aren't. The Internet is designed so that one can use either method.
Since humans find it much more natural to deal with words than numbers
in most cases, the FQDN for each host is mapped to its Internet number.
Each domain is served by a computer within that domain, which provides
all of the necessary information to go from a domain name to an IP address,
and vice-versa. For example, when someone refers to foosun.bar.com, the
resolver knows that it should ask the system foovax.bar.com about systems
in bar.com. It asks what Internet address foosun.bar.com has; if the name
foosun.bar.com really exists, foovax will send back its number. All of this
"magic" happens behind the scenes.
_________________________________


 2  At least one address, possibly two or even three, but we won't go into
    that.





    Rarely will a user have to remember the Internet number of a site
(although often you'll catch yourself remembering an apparently obscure
number, simply because you've accessed the system frequently).  However, you
will remember a substantial number of FQDNs.  You will eventually reach a
point where you are able to make a reasonably accurate guess at what
domain name a certain college, university, or company might have, given just
their name.
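
    The lookup described in this section, where the resolver asks
foovax.bar.com about names in bar.com, can be modelled with a toy table.
This is only a sketch of the idea, not how a real resolver is written: the
tables, the function names, and the Internet numbers are all made up for
illustration.

```python
# Hypothetical data: which system answers for a domain, and what it knows.
NAME_SERVERS = {"bar.com": "foovax.bar.com"}
HOST_TABLES = {
    "foovax.bar.com": {
        "foosun.bar.com": "192.55.239.10",   # invented address
        "foovax.bar.com": "192.55.239.1",    # invented address
    },
}

def resolve(fqdn):
    """Return the Internet number for fqdn, or None if the name doesn't exist."""
    _, _, domain = fqdn.partition(".")      # foosun.bar.com -> bar.com
    server = NAME_SERVERS.get(domain)       # who answers for that domain?
    if server is None:
        return None
    return HOST_TABLES[server].get(fqdn)    # ask that system for the number

def reverse(address):
    """Go the other way: from an Internet number back to a name."""
    for table in HOST_TABLES.values():
        for name, number in table.items():
            if number == address:
                return name
    return None

print(resolve("foosun.bar.com"))
print(resolve("nosuch.bar.com"))
print(reverse("192.55.239.1"))
```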



1.4  The Networks


Internet     The Internet is a large "network of networks."   There is no
             one network known as The Internet; rather, regional nets like
             SuraNet, PrepNet, NearNet, et al., are all inter-connected (nay,
             "inter-networked") together into one great living thing,
             communicating at amazing speeds with the TCP/IP protocol.  All
             activity takes place in "real-time."
UUCP         The UUCP network is a loose association of systems all
             communicating with the `UUCP' protocol. (UUCP stands for
             `Unix-to-Unix Copy Program'.) It's based on two systems
             connecting to each other at specified intervals, called polling,
             and executing any work scheduled for either of them. For
             example, the system oregano polls the system basil once every
             two hours. If there's any mail waiting for oregano, basil will
             send it at that time; likewise, oregano will at that time send
             any jobs waiting for basil. Historically most UUCP was done
             with Unix equipment, although the software's since been
             implemented on other platforms (e.g. VMS).
BITNET       BITNET (the "Because It's Time Network") is composed of
             systems connected by point-to-point links, all running the NJE
             protocol.  It's continued to grow, but has found itself suffering
             at the hands of the falling costs of Internet connections.  Also,
             a number of mail gateways are in place to reach users on other
             networks.
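
    The polling arrangement between oregano and basil in the UUCP entry
above can be sketched as two sites that each hold a queue of work for the
other and exchange it whenever a connection is made. This is a toy model
under that assumption; the class and function names are hypothetical and
nothing here speaks the real UUCP protocol.

```python
from collections import defaultdict, deque

class UucpSite:
    """A site that queues outgoing work until the next connection."""
    def __init__(self, name):
        self.name = name
        self.outgoing = defaultdict(deque)   # destination name -> queued jobs
        self.delivered = []                  # jobs received from other sites

    def queue(self, destination, job):
        self.outgoing[destination].append(job)

def poll(caller, callee):
    """One connection: each side hands over whatever it holds for the other."""
    for sender, receiver in ((caller, callee), (callee, caller)):
        waiting = sender.outgoing[receiver.name]
        while waiting:
            receiver.delivered.append(waiting.popleft())

oregano, basil = UucpSite("oregano"), UucpSite("basil")
basil.queue("oregano", "mail for a user on oregano")
oregano.queue("basil", "job for basil")
poll(oregano, basil)          # in the text this happens once every two hours
print(oregano.delivered)
print(basil.delivered)
```

Nothing moves between polls, which is why this style of link is called
store-and-forward.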



1.5  The Physical Connection


    The actual connections between the various networks take a variety of
forms. The most prevalent for Internet links are 56k leased lines (dedicated
telephone lines carrying 56kilobit-per-second connections) and T1 links
(special phone lines with 1.5Mbps connections). Also installed are T3 links,
acting as backbones between major locations to carry a massive 45Mbps load of
traffic.
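
    To get a feel for what these line speeds mean in practice, the sketch
below works out the ideal time to move a file across each kind of link. The
file size is an arbitrary assumption, the T1 figure is its nominal rate of
1.544 Mbps, and protocol overhead and sharing of the line are ignored, so
real transfers will be slower.

```python
# Nominal line rates in bits per second.
LINKS = {
    "56k leased line": 56_000,
    "T1": 1_544_000,
    "T3": 45_000_000,
}

def transfer_seconds(size_bytes, bits_per_second):
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / bits_per_second

size = 700 * 1024   # a hypothetical 700K file
for name, rate in LINKS.items():
    print(f"{name:16s} {transfer_seconds(size, rate):8.2f} seconds")
```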

    Each institution pays a local carrier for these links (for example,
Bell Atlantic owns PrepNet, the main provider in Pennsylvania).  Also
available are SLIP connections, which carry Internet traffic (packets) over
high-speed modems.

    UUCP links are made with modems (for the most part), that run from
1200 baud all the way up to as high as 38.4Kbps.  As was mentioned in
Section 1.4 [The Networks], page 8, the connections are of the store-and-
forward variety.  Also in use are Internet-based UUCP links (as if things
weren't already confusing enough!). The systems do their UUCP traffic over
TCP/IP connections, which give the UUCP-based network some blindingly
fast "hops,"  resulting in better connectivity for the network as a whole.
UUCP connections first became popular in the 1970's, and have remained
in wide-spread use ever since. Only with UUCP can Joe Smith correspond
with someone across the country or around the world, for the price of a local
telephone call.

    BITNET links mostly take the form of 9600bps modems connected from
site to site.   Often places have three or more links going;  the majority,
however, look to "upstream" sites for their sole link to the network.



                                     "The Glory and the Nothing of a Name"
                                                      Byron, Churchill's Grave


Chapter 2 Electronic Mail                                                 11





2  Electronic Mail


    The desire to communicate is the essence of networking.  People have
always wanted to correspond with each other in the fastest way possible,
short of normal conversation. Electronic mail (or email) is the most preva-
lent application of this in computer networking.  It allows people to write
back and forth without having to spend much time worrying about how the
message actually gets delivered.  As technology grows closer and closer to
being a common part of daily life, the need to understand the many ways it
can be utilized and how it works, at least to some level, is vital.



2.1  Email Addresses


    Electronic mail is hinged around the concept of an address; the chapter
on Network Basics made some reference to it while introducing domains.
Your email address provides all of the information required to get a message
to you from anywhere in the world. An address doesn't necessarily have to
go to a human being. It could be an archive server,1 a list of people, or even
someone's pocket pager.  These cases are the exception to the norm; mail
to most addresses is read by human beings.



2.1.1  %@!.: Symbolic Cacophony


    Email addresses usually appear in one of two forms: using the Internet
format which contains `@', an "at"-sign, or using the UUCP format which
contains `!', an exclamation point, also called a "bang."  The latter of the
two, UUCP "bang" paths, is more restrictive, yet more clearly dictates how
the mail will travel.

    To reach Jim Morrison on the system south.america.org, one would
address the mail as `jm@south.america.org'. But if Jim's account was on
a UUCP site named brazil, then his address would be `brazil!jm'. If it's
possible (and one exists), try to use the Internet form of an address; bang
paths can fail if an intermediate site in the path happens to be down. There
is a growing trend for UUCP sites to register Internet domain names, to help
alleviate the problem of path failures.

_________________________________

 1  See  [Archive Servers], page 77, for a description.

    Another symbol that enters the fray is `%'; it acts as an extra
"routing" method.  For example, if the UUCP site dream is connected to
south.america.org, but doesn't have an Internet domain name of its own,
a user debbie on dream can be reached by writing to the address

     debbie%dream@south.america.org

The form is significant. This address says that the local system should first
send the mail to south.america.org.  There the address debbie%dream
will turn into debbie@dream, which will hopefully be a valid address. Then
south.america.org will handle getting the mail to the host dream, where
it will be delivered locally to debbie.
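
    The way each symbol steers a message can be sketched as a function that
answers one question: which system gets the mail next, and what address
should that system use? This is a simplified illustration of the conventions
described above, not a real mail router. The function name is hypothetical,
and mixed addresses (both `!' and `@' at once) are deliberately resolved in
favor of `@', which real mailers do not all agree on.

```python
def next_hop(address):
    """Return (system to hand the mail to, address to use once it gets there)."""
    if "@" in address:
        local, _, host = address.rpartition("@")
        if "%" in local:
            # debbie%dream -> debbie@dream: the last `%' becomes the new `@'
            user, _, inner = local.rpartition("%")
            return host, f"{user}@{inner}"
        return host, local
    if "!" in address:
        # brazil!jm: the first system in the bang path gets the mail next
        host, _, rest = address.partition("!")
        return host, rest
    return None, address        # no routing symbols: a local user

print(next_hop("debbie%dream@south.america.org"))
print(next_hop("debbie@dream"))
print(next_hop("brazil!jm"))
```

Applying the function twice to the first address traces the whole trip: first
to south.america.org, then to dream, where debbie is a local user.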

    All of the intricacies of email addressing methods are fully covered in
the book !%@:: A Directory of Electronic Mail Addressing and Networks,
published by O'Reilly and Associates as part of their Nutshell Handbook
series.  It is a must for any active email user.  Write to nuts@ora.com for
ordering information.



2.1.2  Sending and Receiving Mail


    We'll make one quick diversion from being OS-neutral here, to show you
what it will look like to send and receive a mail message on a Unix system.
Check with your system administrator for specific instructions related to
mail at your site.

    A person sending the author mail would probably do something like this:

     % mail brendan@cs.widener.edu
     Subject: print job's stuck

     I typed `print babe.gif' and it didn't work! Why??

The next time the author checked his mail, he would see it listed in his
mailbox as:

     % mail
     "/usr/spool/mail/brendan": 1 messages 1 new 1 unread
      U  1 joeuser@foo.widene Tue May  5 20:36   29/956   print job's stuck
     ?

which gives information on the sender of the email, when it was sent, and
the subject of the message. He would probably use the `reply' command of
Unix mail to send this response:

     ? r
     To: joeuser@foo.widener.edu
     Subject: Re: print job's stuck

     You shouldn't print binary files like GIFs to a printer!

     Brendan

    Try sending yourself mail a few times, to get used to your system's mailer.
It'll save a lot of wasted aspirin for both you and your system administrator.



2.1.3  Anatomy of a Mail Header


    An electronic mail message has a specific structure to it that's common
across every type of computer system.2 A sample would be:

     From bush@hq.mil Sat May 25 17:06:01 1991
     Received: from hq.mil by house.gov with SMTP id AA21901
       (4.1/SMI for dan@house.gov); Sat, 25 May 91 17:05:56 -0400
     Date: Sat, 25 May 91 17:05:56 -0400
     From: The President <bush@hq.mil>
     Message-Id: <9105252105.AA06631@hq.mil>
     To: dan@senate.gov
     Subject: Meeting

     Hi Dan .. we have a meeting at 9:30 a.m. with the Joint Chiefs. Please
     don't oversleep this time.

The first line, with `From' and the two lines for `Received' are usually
not very interesting.  They give the "real" address that the mail is coming
from (as opposed to the address you should reply to, which may look much
different), and what places the mail went through to get to you.  Over the
Internet, there is always at least one `Received' header and usually no more
than four or five.  When a message is sent using UUCP, one `Received'
header is added for each system that the mail passes through.  This can
often result in more than a dozen `Received' headers.  While they help
with dissecting problems in mail delivery, odds are the average user will
never want to see them.  Most mail programs will filter out this kind of
"cruft" in a header.

    The `Date' header contains the date and time the message was sent.
Likewise, the "good" address (as opposed to "real" address) is laid out in
the `From' header. Sometimes it won't include the full name of the person
(in this case `The President'), and may look different, but it should always
contain an email address of some form.
_________________________________

 2  The standard is written down in RFC-822. See  [RFCs], page 73 for more
    info on how to get copies of the various RFCs.

    The `Message-ID' of a message is intended mainly for tracing mail
routing, and is rarely of interest to normal users. Every `Message-ID' is
guaranteed to be unique.

    `To' lists the email address (or addresses) of the recipients of the
message.  There may be a `Cc' header, listing additional addresses.  Finally,
a brief subject for the message goes in the `Subject' header.

    The exact order of a message's headers may vary from system to system,
but it will always include these fundamental headers that are vital to proper
delivery.
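
    Because the header layout is standardized, a program can pick a message
apart without knowing which system produced it. As one possible
illustration, the sketch below feeds a shortened copy of the sample message
(with each header name followed by a colon, as RFC-822 requires) to the
`email' package in Python's standard library. The variable names are
arbitrary.

```python
from email import message_from_string
from email.utils import parseaddr

RAW = """\
Date: Sat, 25 May 91 17:05:56 -0400
From: The President <bush@hq.mil>
Message-Id: <9105252105.AA06631@hq.mil>
To: dan@senate.gov
Subject: Meeting

Hi Dan .. we have a meeting at 9:30 a.m. with the Joint Chiefs.
"""

message = message_from_string(RAW)
full_name, address = parseaddr(message["From"])

print(full_name)               # the optional full name
print(address)                 # the email address itself
print(message["Subject"])
print(message.get("Cc"))       # None: this message has no Cc header
print(message.get_payload())   # the body, everything after the blank line
```

Header names are matched without regard to case, so `Message-Id' and
`Message-ID' refer to the same header.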



2.1.4  Bounced Mail


    When an email address is incorrect in some way (the system's name is
wrong, the domain doesn't exist, whatever), the mail system will bounce the
message back to the sender, much the same way that the Postal Service does
when you send a letter to a bad street address.  The message will include
the reason for the bounce; a common error is addressing mail to an account
name that doesn't exist. For example, writing to Lisa Simpson at Widener
University's Computer Science department will fail, because she doesn't have
an account.3

     From: Mail Delivery Subsystem <MAILER-DAEMON>
     Date: Sat, 25 May 91 16:45:14 -0400
     To: mg@gracie.com
     Cc: Postmaster@cs.widener.edu
     Subject: Returned mail: User unknown

        ----- Transcript of session follows -----
     While talking to cs.widener.edu:
     >>> RCPT To:<lsimpson@cs.widener.edu>
     <<< 550 <lsimpson@cs.widener.edu>... User unknown
     550 lsimpson... User unknown

As you can see, a carbon copy of the message (the `Cc' header entry) was
sent to the postmaster of Widener's CS department.  The Postmaster is
responsible for maintaining a reliable mail system at a site.  Usually
postmasters at sites will attempt to aid you in getting your mail where it's
supposed to go. If a typing error was made, then try re-sending the message.
If you're sure that the address is correct, contact the postmaster of the site
directly and ask how to properly address it.
_________________________________

 3  Though if she asked, we'd certainly give her one.

    The message also includes the text of the mail, so you don't have to
retype everything you wrote.

        ----- Unsent message follows -----
     Received: by cs.widener.edu id AA06528; Sat, 25 May 91 16:45:14 -0400
     Date: Sat, 25 May 91 16:45:14 -0400
     From: Matt Groening <mg@gracie.com>
     Message-Id: <9105252045.AA06528@gracie.com>
     To: lsimpson@cs.widener.edu
     Subject: Scripting your future episodes
     Reply-To: writing-group@gracie.com

      verbiage

The full text of the message is returned intact, including any headers that
were added. This can be cut out with an editor and fed right back into the
mail system with a proper address, making redelivery a relatively painless
process.
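
    If you ever need to sort through bounces by program rather than by eye,
the session transcript is the part to look at. The sketch below assumes the
transcript follows the layout shown in this section, with the remote
system's replies marked by `<<<' and beginning with a three-digit code; other
mailers format their bounces differently, so treat the pattern and the
function name as illustrative only.

```python
import re

# Replies from the remote system start with a three-digit code;
# codes beginning with 5 indicate a permanent failure.
REPLY = re.compile(r"^<<<\s+(\d{3})\s+<([^>]+)>")

def failed_recipients(transcript):
    """Return (code, address) pairs for permanently rejected recipients."""
    failures = []
    for line in transcript.splitlines():
        match = REPLY.match(line.strip())
        if match and match.group(1).startswith("5"):
            failures.append((int(match.group(1)), match.group(2)))
    return failures

TRANSCRIPT = """\
   ----- Transcript of session follows -----
While talking to cs.widener.edu:
>>> RCPT To:<lsimpson@cs.widener.edu>
<<< 550 <lsimpson@cs.widener.edu>... User unknown
550 lsimpson... User unknown
"""

print(failed_recipients(TRANSCRIPT))
```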



2.2  Mailing Lists


    People that share common interests are inclined to discuss their hobby or
interest at every available opportunity.  One modern way to aid in this
exchange of information is by using a mailing list: usually an email address
that redistributes all mail sent to it back out to a list of addresses.  For
example, the Sun Managers mailing list (of interest to people that administer
computers manufactured by Sun) has the address
`sun-managers@eecs.nwu.edu'. Any mail sent to that address will "explode" out
to each person named in a file maintained on a computer at Northwestern
University.

    Administrative tasks (sometimes referred to as administrivia) are often
handled through other addresses, typically with the suffix `-request'.  To
continue the above, a request to be added to or deleted from the Sun
Managers list should be sent to `sun-managers-request@eecs.nwu.edu'.
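
    Both ideas, the "exploding" list address and its `-request' companion,
are simple enough to sketch. The helper names and the member addresses below
are hypothetical, and a real list handles far more (bounces, digests,
moderation) than this toy does.

```python
def request_address(list_address):
    """sun-managers@host -> sun-managers-request@host (the usual convention)."""
    name, _, host = list_address.rpartition("@")
    return f"{name}-request@{host}"

def explode(message, members):
    """Return one (recipient, message) pair per address on the list."""
    return [(member, message) for member in members]

# Hypothetical contents of the file of list members.
members = ["alice@foo.edu", "bob@bar.com"]

print(request_address("sun-managers@eecs.nwu.edu"))
print(explode("Disk quota question", members))
```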

    When in doubt, try to write to the `-request' version of a mailing list
address first; the other people on the list aren't interested in your desire
to be added or deleted, and can certainly do nothing to expedite your request.
Often if the administrator of a list is busy (remember, this is all peripheral
to real jobs and real work), many users find it necessary to ask again and
again, often with harsher and harsher language, to be removed from a list.
This does nothing more than waste traffic and bother everyone else receiving
the messages. If, after a reasonable amount of time, you still haven't
succeeded in being removed from a mailing list, write to the postmaster at
that site and see if they can help.

    Exercise caution when replying to a message sent by a mailing list.  If
you wish to respond to the author only, make sure that the only address
you're replying to is that person, and not the entire list. Often messages of
the sort "Yes, I agree with you completely!" will appear on a list, boring the
daylights out of the other readers. Likewise, if you explicitly do want to send
the message to the whole list, you'll save yourself some time by checking to
make sure it's indeed headed to the whole list and not a single person.

    A list of the currently available mailing lists is available in at least
two places; the first is in a file on ftp.nisc.sri.com called
`interest-groups' under the `netinfo/' directory.  It's updated fairly
regularly, but is large (presently around 700K), so only get it every once in
a while. The other list is maintained by Gene Spafford (spaf@cs.purdue.edu),
and is posted in parts to the newsgroup news.lists semi-regularly. (See
Chapter 4 [Usenet News], page 29, for info on how to read that and other
newsgroups.)



2.2.1  Listservs


    On BITNET there's an automated system for maintaining discussion lists
called the listserv.  Rather than have an already harried and overworked
human take care of additions and removals from a list, a program performs
these and other tasks by responding to a set of user-driven commands.

    Areas of interest are wide and varied: ETHICS-L deals with ethics in
computing, while ADND-L has to do with a role-playing game.  A full list of
the available BITNET lists can be obtained by writing to
`LISTSERV@BITNIC.BITNET' with a body containing the command

     list  global

However, be sparing in your use of this; see if it's already on your system
somewhere. The reply is quite large.

    The most fundamental command is `subscribe'. It will tell the listserv
to add the sender to a specific list. The usage is

     subscribe  foo-l  Your Real Name

It will respond with a message either saying that you've been added to the
list, or that the request has been passed on to the system on which the list
is actually maintained.





    The mate to `subscribe' is, naturally, `unsubscribe'.  It will remove a
given address from a BITNET list.  It, along with all other listserv
commands, can be abbreviated: `subscribe' as `sub', `unsubscribe' as `unsub',
etc.  For a full list of the available listserv commands, write to
`LISTSERV@BITNIC.BITNET', giving it the command `help'.

    As an aside, there have been implementations of the listserv system for
non-BITNET hosts (more specifically, Unix systems). One of the most
complete is available on cs.bu.edu in the directory `pub/listserv'.



                                "I made this letter longer than usual because
                                           I lack the time to make it shorter."
                                                Pascal, Provincial Letters XVI



3  Anonymous FTP



    FTP (File Transfer Protocol) is the primary method of transferring files
over the Internet. On many systems, it's also the name of the program that
implements the protocol.  Given proper permission, it's possible to copy a
file from a computer in South Africa to one in Los Angeles at very fast
speeds (on the order of 5-10K per second).  This normally requires either
a user id on both systems or a special configuration set up by the system
administrator(s).

    There is a good way around this restriction: the anonymous FTP service.
It essentially will let anyone in the world have access to a certain area of
disk space in a non-threatening way.  With this, people can make files
publicly available with little hassle.  Some systems have dedicated entire
disks or even entire computers to maintaining extensive archives of
source code and information.  They include gatekeeper.dec.com (Digital),
wuarchive.wustl.edu (Washington University in Saint Louis), and
archive.cis.ohio-state.edu (The Ohio State University).

    The process involves the "foreign" user (someone not on the system
itself) creating an FTP connection and logging into the system as the user
`anonymous', with an arbitrary password:

     Name (foo.site.com:you): anonymous
     Password: jm@south.america.org

Custom and netiquette dictate that people respond to the Password query
with an email address so that the sites can track the level of FTP usage, if
they desire.  (See Section 2.1 [Addresses], page 11 for information on email
addresses).

    The speed of the transfer depends on the speed of the underlying link. A
site that has a 9600bps SLIP connection will not get the same throughput as a
system with a 56k leased line (see Section 1.5 [The Physical Connection], page
8, for more on what kinds of connections can exist in a network). Also, the
traffic of all other users on that link will affect performance. If there are
thirty people all FTPing from one site simultaneously, the load on the system
(in addition to the network connection) will degrade the overall throughput of
the transfer.
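
    To put rough numbers on this, the sketch below estimates the best-case
time to move a 700K file (the size quoted earlier for `interest-groups')
over the two kinds of link.  It assumes K means 1024 bytes, that "56k"
means 56,000 bits per second, and that the link is otherwise idle with no
protocol overhead; real transfers will be slower.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Best-case time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second


# A 700K file, taking K = 1024 bytes (an assumption).
size = 700 * 1024

slip_time = transfer_seconds(size, 9600)      # 9600bps SLIP connection
leased_time = transfer_seconds(size, 56000)   # 56k leased line

print(f"9600bps SLIP:    about {slip_time / 60:.1f} minutes")
print(f"56k leased line: about {leased_time / 60:.1f} minutes")
```

Even under these generous assumptions the slower link needs several times
as long, which is why other traffic on the same link matters so much.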



3.1  FTP Etiquette


    Lest we forget, the Internet is there for people to do work. People using
the network and the systems on it are doing so for a purpose, whether it be
research, development, whatever.  Any heavy activity takes away from the
overall performance of the network as a whole.

    The effects of an FTP connection on a site and its link can vary; the
general rule of thumb is that any extra traffic created detracts from the
ability of that site's users to perform their tasks. To help be considerate of
this, it's highly recommended that FTP sessions be held only after normal
business hours for that site, preferably late at night. The possible effects
of a large transfer will be less destructive at 2 a.m. than 2 p.m. Also,
remember that if it's past dinner time in Maine, it's still early afternoon in
California; think in terms of the current time at the site that's being
visited, not of local time.
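
    The time-zone arithmetic can be sketched as follows.  The helper names,
the fixed UTC offsets (winter time, no daylight saving) and the 8-to-18
"business day" are all assumptions made for illustration; a site's real
working hours are whatever its administrators say they are.

```python
from datetime import datetime, timedelta, timezone


def local_hour_at_site(utc_time, site_utc_offset_hours):
    """Return the hour of day (0-23) at a site with the given UTC offset."""
    site_zone = timezone(timedelta(hours=site_utc_offset_hours))
    return utc_time.astimezone(site_zone).hour


def is_after_hours(utc_time, site_utc_offset_hours, day_start=8, day_end=18):
    """True when the site's local time falls outside its assumed business day."""
    hour = local_hour_at_site(utc_time, site_utc_offset_hours)
    return hour < day_start or hour >= day_end


# 7 p.m. in Maine (UTC-5 in winter) is midnight UTC.
now_utc = datetime(1992, 1, 15, 0, 0, tzinfo=timezone.utc)
maine_ok = is_after_hours(now_utc, -5)        # 19:00 local: after hours
california_ok = is_after_hours(now_utc, -8)   # 16:00 local: still the work day
```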



3.2  Basic Commands


    While there have been many extensions to the various FTP clients out
there, there is a de facto "standard" set that everyone expects to work. For
more specific information, read the manual for your specific FTP program.
This section will only skim the bare minimum of commands needed to
operate an FTP session.



3.2.1  Creating the Connection


    The actual command to use FTP will vary among operating systems; for
the sake of clarity, we'll use `ftp' here, since it's the most general form.

    There are two ways to connect to a system: using its hostname or its
Internet number.  Using the hostname is usually preferred.  However, some
sites aren't able to resolve hostnames properly, and have no alternative.
We'll assume you're able to use hostnames for simplicity's sake. The form is

     ftp  somewhere.domain

See Section 1.1 [Domains], page 5 for help with reading and using domain
names (in the example below, somewhere.domain is ftp.uu.net).

    You must first know the name of the system you want to connect to.
We'll use `ftp.uu.net' as an example. On your system, type

     ftp ftp.uu.net

(the actual syntax will vary depending on the type of system the connection's
being made from). It will pause momentarily then respond with the message

     Connected to ftp.uu.net.

and an initial prompt will appear

     220 uunet FTP server (Version 5.100 Mon Feb 11 17:13:28 EST 1991) ready.
     Name (ftp.uu.net:jm):

to which you should respond with anonymous:

     220 uunet FTP server (Version 5.100 Mon Feb 11 17:13:28 EST 1991) ready.
     Name (ftp.uu.net:jm): anonymous

The system will then prompt you for a password; as noted previously, a good
response is your email address:

     331 Guest login ok, send ident as password.
     Password: jm@south.america.org
     230 Guest login ok, access restrictions apply.
     ftp>

The password itself will not echo. This is to protect a user's security when
he or she is using a real account to FTP files between machines. Once you
reach the ftp> prompt, you know you're logged in and ready to go.
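
    For readers who would rather script the login than type it, the same
steps can be sketched with Python's standard `ftplib' module.  The two
function names are hypothetical; `FTP(...)' and its `login' method are the
library's own.  The example call is commented out because it needs a live
network connection.

```python
from ftplib import FTP


def anonymous_login(ftp, email_address):
    """Log in as `anonymous', sending an email address as the password."""
    return ftp.login(user="anonymous", passwd=email_address)


def connect_anonymously(hostname, email_address):
    """Open a connection to hostname and log in anonymously."""
    ftp = FTP(hostname)                  # corresponds to typing: ftp hostname
    anonymous_login(ftp, email_address)  # answers the Name and Password prompts
    return ftp


# Example call (needs network access, so it is left commented out):
# ftp = connect_anonymously("ftp.uu.net", "jm@south.america.org")
```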



3.2.2  dir


    At the `ftp>' prompt, you can type a number of commands to perform
various functions.  One example is `dir'; it will list the files in the current
directory. Continuing the example from above:

     ftp> dir

     200 PORT command successful.
     150 Opening ASCII mode data connection for /bin/ls.
     total 3116
     drwxr-xr-x  2 7         21             512 Nov 21  1988 .forward
     -rw-rw-r--  1 7         11               0 Jun 23  1988 .hushlogin
     drwxrwxr-x  2 0         21             512 Jun  4  1990 Census
     drwxrwxr-x  2 0         120            512 Jan  8 09:36 ClariNet
                                etc etc
     -rw-rw-r--  1 7         14           42390 May 20 02:24 newthisweek.Z
                                etc etc
     -rw-rw-r--  1 7         14         2018887 May 21 01:01 uumap.tar.Z
     drwxrwxr-x  2 7         6             1024 May 11 10:58 uunet-info

     226 Transfer complete.
     5414 bytes received in 1.1 seconds (4.9 Kbytes/s)
     ftp>

The file `newthisweek.Z' was specifically included because we'll be using it
later.  Just for general information, it happens to be a listing of all of the
files added to UUNET's archives during the past week.

    The directory shown is on a machine running the Unix operating system;
the dir command will produce different results on other operating systems
(e.g. TOPS, VMS, et al.). Learning to recognize different formats will take
some time. After a few weeks of traversing the Internet, it proves easier to
see, for example, how large a file is on an operating system you're otherwise
not acquainted with.
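
    As an illustration of reading the Unix format, the sketch below pulls
the size and name out of one listing line.  It assumes the nine-column
layout of a Unix-style listing (permissions, link count, owner, group,
size, month, day, time or year, name); the function name is hypothetical,
and listings from other operating systems would need a different parser.

```python
def parse_unix_listing_line(line):
    """Split one line of Unix-style `dir' output into a few named fields.

    Assumes nine whitespace-separated columns: permissions, link count,
    owner, group, size, month, day, time-or-year, name.
    """
    fields = line.split(None, 8)
    if len(fields) < 9:
        raise ValueError("not a recognizable Unix listing line: %r" % line)
    return {
        "is_directory": fields[0].startswith("d"),
        "size": int(fields[4]),
        "name": fields[8].strip(),
    }


entry = parse_unix_listing_line(
    "-rw-rw-r--  1 7         14           42390 May 20 02:24 newthisweek.Z"
)
```

Lines such as `total 3116' do not fit the layout and are rejected with a
ValueError rather than guessed at.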

    With many FTP implementations, it's also possible to take the output
of dir and put it into a file on the local system with

     ftp> dir n* outfilename

the contents of which can then be read outside of the live FTP
connection; this is particularly useful for systems with very long
directories (like ftp.uu.net).  The above example would put the names of
every file that begins with an `n' into the local file outfilename.



3.2.3  cd


    At the beginning of an FTP session, the user is in a "top-level" directory.
Most things are in directories below it (e.g. `/pub'). To change the current
directory, one uses the cd command.  To change to the directory `pub', for
example, one would type

     ftp> cd pub

which would elicit the response

     250 CWD command successful.

meaning the "Change Working Directory" command (`cd') worked properly.
Moving "up" a directory is more system-specific: in Unix use the command
`cd ..', and in VMS, `cd [-]'.



3.2.4  get and put


    The actual transfer is performed with the get and put commands.  To
get a file from the remote computer to the local system, the command takes
the form

     ftp> get filename

where filename is the file on the remote system. Again using ftp.uu.net as
an example, the file `newthisweek.Z' can be retrieved with

     ftp> get newthisweek.Z
     200 PORT command successful.
     150 Opening ASCII mode data connection for newthisweek.Z (42390 bytes).
     226 Transfer complete.
     local: newthisweek.Z remote: newthisweek.Z
     42553 bytes received in 6.9 seconds (6 Kbytes/s)
     ftp>

The section below on using binary mode instead of ASCII will describe why
this particular choice will result in a corrupt and subsequently unusable file.

    If, for some reason, you want to save a file under a different name (e.g.
your system can only have 14-character filenames, or can only have one dot
in the name), you can specify what the local filename should be by providing
get with an additional argument

     ftp> get newthisweek.Z uunet-new

which will place the contents of the file `newthisweek.Z' in `uunet-new' on
the local system.

    The transfer works the other way, too.  The put command will transfer
a file from the local system to the remote system. If the permissions are set
up for an FTP session to write to a remote directory, a file can be sent with

     ftp> put filename

As with get, put will take a second argument, letting you specify a different
name for the file on the remote system.
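
    A scripted equivalent of get and put might look like the sketch below.
It assumes `ftp' is an already logged-in `ftplib.FTP' object; the helper
names `get_file' and `put_file' are hypothetical, while `retrbinary' and
`storbinary' are ftplib's own methods.  Both helpers transfer in binary
mode so that the file arrives unaltered (see the discussion of ASCII and
binary modes).

```python
def get_file(ftp, remote_name, local_name=None):
    """Copy remote_name from the server; local_name defaults to the same name."""
    if local_name is None:
        local_name = remote_name
    with open(local_name, "wb") as local_file:
        # RETR is the protocol command behind the client's `get'.
        ftp.retrbinary("RETR " + remote_name, local_file.write)
    return local_name


def put_file(ftp, local_name, remote_name=None):
    """Send local_name to the server; remote_name defaults to the same name."""
    if remote_name is None:
        remote_name = local_name
    with open(local_name, "rb") as local_file:
        # STOR is the protocol command behind the client's `put'.
        ftp.storbinary("STOR " + remote_name, local_file)
    return remote_name
```

With a connected session, `get_file(ftp, "newthisweek.Z", "uunet-new")'
would mirror the renaming example above.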



3.2.4.1  ASCII vs Binary


    In the example above, the file `newthisweek.Z' was transferred, but
supposedly not correctly.  The reason is this: in a normal ASCII transfer (the
default), certain characters are translated between systems, to help make
text files more readable. However, when binary files (those containing
non-ASCII characters) are transferred, this translation should not take place.
One example is a binary program; a few changed characters can render it
completely useless.

    To avoid this problem, it's possible to be in one of two modes: ASCII or
binary. In binary mode, the file isn't translated in any way. What's on the
remote system is precisely what's received.  The commands to go between
the two modes are

     ftp> ascii
     200 Type set to A.   (Note the A, which signifies ASCII mode.)


     ftp> binary
     200 Type set to I.   (Set to Image format, for pure binary transfers.)

Note that each command need only be done once to take effect; if the user
types binary, all transfers in that session are done in binary mode (that is,
unless ascii is typed later).
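
    To see why translation is harmful to binary data, consider the toy
example below.  It mimics just one kind of ASCII-mode translation, the
rewriting of line-feed bytes as carriage return plus line feed; which
characters are actually translated depends on the systems involved, so
treat this as an assumption chosen for illustration.  The data bytes are
made up.

```python
def to_network_ascii(data):
    """Mimic one common ASCII-mode translation: each LF becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


# A few bytes of made-up "binary" data that happen to include two LF bytes.
original = bytes([0x1F, 0x9D, 0x0A, 0x41, 0x0A, 0x00])
translated = to_network_ascii(original)

print(len(original), len(translated))   # the translated copy is longer
print(translated == original)           # and no longer identical
```

A text file survives this sort of change; a compressed file or a program
does not, because every byte in it is significant.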

    The transfer of `newthisweek.Z' will work if done as

     ftp> binary
     200 Type set to I.
     ftp> get newthisweek.Z
     200 PORT command successful.
     150 Opening BINARY mode data connection for newthisweek.Z (42390 bytes).
     226 Transfer complete.
     local: newthisweek.Z remote: newthisweek.Z
     42390 bytes received in 7.2 seconds (5.8 Kbytes/s)

          Note: The file size (42390 bytes) differs from that of the
          ASCII-mode transfer (42553 bytes), and the number 42390 matches
          the one in the listing of UUNET's top directory.  We can be
          relatively sure that we've received the file without any problems.



3.2.4.2  mget and mput


    The commands mget and mput allow for multiple file transfers using
wildcards to get several files, or a whole set of files at once, rather than
having to do it manually one by one.  For example, to get all files that
begin with the letter `f', one would type

     ftp> mget f*

Similarly, to put all of the local files that end with `.c':

     ftp> mput *.c

    Rather than reiterate what's been written a hundred times before,
consult a local manual for more information on wildcard matching (every DOS
manual, for example, has a section on it).

    Normally, FTP assumes a user wants to be prompted for every file in an
mget or mput operation. You'll often need to get a whole set of files and not
have each of them confirmed, since you know they're all right. In that case,
use the prompt command to turn the queries off.

     ftp> prompt
     Interactive mode off.

Likewise, to turn it back on, the prompt command should simply be issued
again.
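
    For a feel of how the two wildcard patterns above select files, here
is a small sketch using Python's standard `fnmatch' module.  The file
names are invented, and a real FTP client or server applies its own
matching rules, which may differ in detail from the shell-style rules
used here.

```python
from fnmatch import fnmatchcase


def select_files(names, pattern):
    """Return the names matching a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical directory contents, for illustration only.
names = ["faq.txt", "foo.c", "main.c", "README", "finger.tar.Z"]

starts_with_f = select_files(names, "f*")    # what `mget f*' would ask for
c_sources = select_files(names, "*.c")       # what `mput *.c' would send
```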



3.3  The archie Server


    A group of people at McGill University in Canada got together and
created a query system called archie. It was originally formed to be a quick and
easy way to scan the offerings of the many anonymous FTP sites that are
maintained around the world.  As time progressed, archie grew to include
other valuable services as well.

    The archie service is accessible through an interactive telnet session,
email queries, and command-line and X-window clients. The email responses can
be used along with FTPmail servers for those not on the Internet.  (See
[FTP-by-Mail Servers], page 77, for info on using FTPmail servers.)



3.3.1  Using archie Today


    Currently, archie tracks the contents of over 800 anonymous FTP archive
sites containing over a million files stored across the Internet.
Collectively, these files represent well over 50 gigabytes of information,
with new entries being added daily.

    The archie server automatically updates the listing information from each
site about once a month.  This avoids constantly updating the databases,
which could waste network resources, yet ensures that the information on
each site's holdings is reasonably up to date.

    To access archie interactively, telnet to one of the existing servers.
(See Chapter 5 [Telnet], page 45, for notes on using the telnet program.)
They include

       archie.ans.net (New York, USA)
       archie.rutgers.edu (New Jersey, USA)
       archie.sura.net (Maryland, USA)
       archie.unl.edu (Nebraska, USA)
       archie.mcgill.ca (the first Archie server, in Canada)
       archie.funet.fi (Finland)
       archie.au (Australia)
       archie.doc.ic.ac.uk (Great Britain)

At the login prompt of one of the servers, enter `archie' to log in.  A
greeting will be displayed, detailing information about ongoing work in the
archie project; the user will be left at an `archie>' prompt, at which he may
enter commands.  Using `help' will yield instructions on using the `prog'
command to make queries, `set' to control various aspects of the server's
operation, et al.  Type `quit' at the prompt to leave archie.  Typing the
query `prog vine.tar.Z' will yield a list of the systems that offer the source
to the X-windows program vine; a piece of the information returned looks
like

     Host ftp.uu.net   (137.39.1.9)
     Last updated 10:30  7 Jan 1992

         Location: /packages/X/contrib
           FILE       rw-r--r--     15548  Oct  8 20:29   vine.tar.Z

     Host nic.funet.fi   (128.214.6.100)
     Last updated 05:07  4 Jan 1992

         Location: /pub/X11/contrib
           FILE       rw-rw-r--     15548  Nov  8 03:25   vine.tar.Z



3.3.2  archie Clients


    There are two mainstream archie clients, one called (naturally enough)
`archie', the other `xarchie' (for X-Windows).  They query the archie
databases and yield a list of systems that have the requested file(s)
available for anonymous FTP, without requiring an interactive session to the
server. For example, to find the same information you tried with the server
command `prog', you could type

     % archie vine.tar.Z
     Host athene.uni-paderborn.de
         Location: /local/X11/more_contrib
                 FILE -rw-r--r--       18854  Nov 15 1990  vine.tar.Z

     Host emx.utexas.edu
         Location: /pub/mnt/source/games
                 FILE -rw-r--r--       12019  May  7 1988  vine.tar.Z

     Host export.lcs.mit.edu
         Location: /contrib
                 FILE -rw-r--r--       15548  Oct  9 00:29  vine.tar.Z
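
    Output in this shape is regular enough to be picked apart by a short
program, for instance to feed the results to a later FTP session.  The
sketch below assumes the Host / Location / FILE layout shown above and
nothing more; the function name is hypothetical, and other archie output
settings may need different handling.

```python
def parse_archie_output(text):
    """Collect (host, location, filename) triples from archie-style output.

    Assumes the Host / Location / FILE layout; a colon after the keyword
    is accepted but not required.
    """
    results = []
    host = location = None
    for line in text.splitlines():
        words = line.split()
        if len(words) < 2:
            continue                      # blank or unrecognized line
        keyword = words[0].rstrip(":")
        if keyword == "Host":
            host = words[1]
        elif keyword == "Location":
            location = words[1]
        elif keyword == "FILE":
            results.append((host, location, words[-1]))
    return results


sample = """\
Host export.lcs.mit.edu
    Location /contrib
            FILE -rw-r--r--       15548  Oct  9 00:29  vine.tar.Z
"""
hits = parse_archie_output(sample)
```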

    Note that your system administrator may not have installed the archie
clients yet; the source is available on each of the archie servers, in the
directory `archie/clients'.

    Using the X-windows client is much more intuitive; if it's installed, just
read its man page and give it a whirl. It's essential for the networked
desktop.



3.3.3  Mailing archie


    Users limited to email connectivity to the Internet should send a message
to the address `archie@archie.mcgill.ca' with the single word help in the
body of the message. An email message will be returned explaining how to
use the email archie server, along with the details of using FTPmail. Most
of the commands offered by the telnet interface can be used with the mail
server.



3.3.4  The whatis database


    In addition to offering access to anonymous FTP listings, archie also
permits access to the whatis description database.  It includes the names
and brief synopses for over 3,500 public domain software packages, datasets
and informational documents located on the Internet.

    Additional whatis databases are scheduled to be added in the future.
Planned offerings include listings for the names and locations of online
library catalog programs, the names of publicly accessible electronic mailing
lists, compilations of Frequently Asked Questions lists, and archive sites for
the most popular Usenet newsgroups. Suggestions for additional descriptions or
locations databases are welcomed and should be sent to the archie developers
at `archie-l@cs.mcgill.ca'.



                                                         "Was für plündern!"
                                                 ("What a place to plunder!")
                                                    Gebhard Leberecht Blücher



4  Usenet News



    The first thing to understand about Usenet is that it is widely
misunderstood. Every day on Usenet the "blind men and the elephant" phenomenon
appears, in spades.  In the opinion of the author, more flame wars (rabid
arguments) arise because of a lack of understanding of the nature of Usenet
than from any other source.  And consider that such flame wars arise, of
necessity, among people who are on Usenet.  Imagine, then, how poorly
understood Usenet must be by those outside!

    No essay on the nature of Usenet can ignore the erroneous impressions
held by many Usenet users. Therefore, this section will treat falsehoods first.
Keep reading for truth. (Beauty, alas, is not relevant to Usenet.)



4.1  What Usenet Is


    Usenet is the set of machines that exchange articles tagged with one
or more universally-recognized labels, called newsgroups (or "groups" for
short).  (Note that the term `newsgroup' is correct, while `area', `base',
`board', `bboard', `conference', `round table', `SIG', etc. are incorrect. If
you want to be understood, be accurate.)



4.2  The Diversity of Usenet


    If the above definition of Usenet sounds vague, that's because it is. It is
almost impossible to generalize over all Usenet sites in any non-trivial way.
Usenet encompasses government agencies, large universities, high schools,
businesses of all sizes, home computers of all descriptions, etc.

    Every administrator controls his own site.  No one has any real control
over any site but his own. The administrator gets his power from the owner
of the system he administers. As long as the owner is happy with the job the
administrator is doing, he can do whatever he pleases, up to and including
cutting off Usenet entirely. C'est la vie.



4.3  What Usenet Is Not


Usenet is not an organization.

             Usenet has no central authority.  In fact, it has no central
             anything. There is a vague notion of "upstream" and "downstream"
             related to the direction of high-volume news flow. It follows
             that, to the extent that "upstream" sites decide what traffic
             they will carry for their "downstream" neighbors, "upstream"
             sites have some influence on their neighbors. But such influence
             is usually easy to circumvent, and heavy-handed manipulation
             typically results in a backlash of resentment.

Usenet is not a democracy.

             A democracy can be loosely defined as "government of the
             people, by the people, for the people." However, as explained
             above, Usenet is not an organization, and only an organization
             can be run as a democracy.  Even a democracy must be organized,
             for if it lacks a means of enforcing the people's wishes, then
             it may as well not exist.

             Some people wish that Usenet were a democracy. Many people
             pretend that it is. Both groups are sadly deluded.

Usenet is not fair.

             After all, who shall decide what's fair? For that matter, if
             someone is behaving unfairly, who's going to stop him? Neither
             you nor I, that's certain.

Usenet is not a right.

             Some people misunderstand their local right of "freedom of
             speech" to mean that they have a legal right to use others'
             computers to say what they wish in whatever way they wish, and
             the owners of said computers have no right to stop them.

             Those people are wrong. Freedom of speech also means freedom
             not to speak; if I choose not to use my computer to aid your
             speech, that is my right. Freedom of the press belongs to those
             who own one.

Usenet is not a public utility.

             Some Usenet sites are publicly funded or subsidized.  Most of
             them, by plain count, are not. There is no government monopoly
             on Usenet, and little or no control.

Usenet is not a commercial network.

             Many Usenet sites are academic or government organizations; in
             fact, Usenet originated in academia. Therefore, there is a Usenet
             custom of keeping commercial traffic to a minimum.  If such
             commercial traffic is generally considered worth carrying, then
             it may be grudgingly tolerated. Even so, it is usually separated
             somehow from non-commercial traffic; see comp.newprod.

Usenet is not the Internet.

             The Internet is a wide-ranging network, parts of which are
             subsidized by various governments. The Internet carries many kinds
             of traffic; Usenet is only one of them.  And the Internet is only
             one of the various networks carrying Usenet traffic.

Usenet is not a Unix network, nor even an ASCII network.

             Don't assume that everyone is using "rn" on a Unix machine.
             There are Vaxen running VMS, IBM mainframes, Amigas, and
             MS-DOS PCs reading and posting to Usenet.  And, yes, some
             of them use (shudder) EBCDIC. Ignore them if you like, but
             they're out there.

Usenet is not software.

             There are dozens of software packages used at various sites to
             transport and read Usenet articles. So no one program or
             package can be called "the Usenet software."

             Software designed to support Usenet traffic can be (and is) used
             for other kinds of communication, usually without risk of mixing
             the two.  Such private communication networks are typically
             kept distinct from Usenet by the invention of newsgroup names
             different from the universally-recognized ones.

Usenet is not a UUCP network.

             UUCP is a protocol (some might say protocol suite, but that's
             a technical point) for sending data over point-to-point
             connections, typically using dialup modems. Usenet is only one of the
             various kinds of traffic carried via UUCP, and UUCP is only one
             of the various transports carrying Usenet traffic.

    Well, enough negativity.



4.4  Propagation of News


    In the old days,  when UUCP over long-distance dialup lines was the
dominant means of article transmission, a few well-connected sites had real
influence in determining which newsgroups would be carried where.  Those
sites called themselves "the backbone."

    But things have changed. Nowadays, even the smallest Internet site has
connectivity the likes of which the backbone admin of yesteryear could only
dream.  In addition, in the U.S., the advent of cheaper long-distance calls
and high-speed modems has made long-distance Usenet feeds thinkable for
smaller companies.  There is only one pre-eminent UUCP transport site
today in the U.S., namely UUNET. But UUNET isn't a player in the
propagation wars, because it never refuses any traffic. It gets paid by the
minute, after all; to refuse based on content would jeopardize its legal
status as an enhanced service provider.

    All of the above applies to the U.S. In Europe, different cost structures
favored the creation of strictly controlled hierarchical organizations with
central registries. This is all very unlike the traditional mode of U.S. sites
(pick a name, get the software, get a feed, you're on).  Europe's "benign
monopolies", long uncontested, now face competition from looser organizations
patterned after the U.S. model.



4.5  Group Creation


    As discussed above, Usenet is not a democracy.  Nevertheless, currently
the most popular way to create a new newsgroup involves a "vote" to
determine popular support for (and opposition to) a proposed newsgroup. See
Appendix C [Newsgroup Creation], page 79, for detailed instructions and
guidelines on the process involved in making a newsgroup.

    If you follow the guidelines, it is probable that your group will be
created and will be widely propagated. However, due to the nature of Usenet,
there is no way for any user to enforce the results of a newsgroup vote (or
any other decision, for that matter).  Therefore, for your new newsgroup to be
propagated widely, you must not only follow the letter of the guidelines; you
must also follow their spirit.  And you must not allow even a whiff of shady
dealings or dirty tricks to mar the vote.

    So, you may ask: How is a new user supposed to know anything about the
"spirit" of the guidelines?  Obviously, she can't.  This fact leads inexorably
to the following recommendation:

     If you're a new user, don't try to create a new newsgroup alone.

If you have a good newsgroup idea, then read the news.groups newsgroup for
a while (six months, at least) to find out how things work. If you're too
impatient to wait six months, then you really need to learn; read news.groups
for a year instead. If you just can't wait, find a Usenet old hand to run the
vote for you.

    Readers may think this advice unnecessarily strict.  Ignore it at your
peril. It is embarrassing to speak before learning. It is foolish to jump into
a society you don't understand with your mouth open. And it is futile to try
to force your will on people who can tune you out with the press of a key.



4.6  If You're Unhappy


    Property rights being what they are,  there is no higher authority on
Usenet than the people who own the machines on which Usenet traffic is
carried. If the owner of the machine you use says, "We will not carry alt.sex
on this machine," and you are not happy with that order, you have no Usenet
recourse. What can we outsiders do, after all?

    That doesn't mean you are without options.  Depending on the nature
of your site, you may have some internal political recourse.  Or you might
find external pressure helpful. Or, with a minimal investment, you can get a
feed of your own from somewhere else. Computers capable of taking Usenet
feeds are down in the $500 range now, Unix-capable boxes are going for
under $2000, and there are at least two Unix lookalikes in the $100 price
range.

    No matter what, appealing to "Usenet" won't help.  Even if those who
read such an appeal regarding system administration are sympathetic to
your cause, they will almost certainly have even less influence at your site
than you do.

    By the same token, if you don't like what some user at another site is
doing, only the administrator and/or owner of that site have any authority to
do anything about it. Persuade them that the user in question is a problem
for them, and they might do something (if they feel like it).  If the user in
question is the administrator or owner of the site from which he or she posts,
forget it; you can't win.  Arrange for your newsreading software to ignore
articles from him or her if you can, and chalk one up to experience.



4.7  The History of Usenet (The ABCs)


    In the beginning, there were conversations, and they were good.  Then
came Usenet in 1979, shortly after the release of V7 Unix with UUCP; and
it was better.  Two Duke University grad students in North Carolina, Tom
Truscott and Jim Ellis, thought of hooking computers together to exchange
information with the Unix community.  Steve Bellovin, a grad student at
the University of North Carolina, put together the first version of the news
software using shell scripts and installed it on the first two sites  unc and
duke.  At the beginning of 1980 the network consisted of those two sites
and phs (another machine at Duke), and was described at the January 1980
Usenix conference in Boulder, CO.1 Steve Bellovin later rewrote the scripts
into C programs, but they were never released beyond unc and duke. Shortly
_________________________________
34                                            Zen and the Art of the Internet





thereafter, Steve Daniel did another implementation in the C programming
language for public distribution. Tom Truscott made further modifications,
and this became the "A" news release.

    In 1981 at the University of California at Berkeley, grad student Mark
Horton and high school student Matt Glickman rewrote the news software to
add functionality and to cope with the ever increasing volume of news_"A"
news was intended for only a few articles per group per day.  This rewrite
was the "B" news version. The first public release was version 2.1 in 1982;
all versions before 2.1 were considered in beta test.  As The Net grew, the
news software was expanded and modified. The last version maintained and
released primarily by Mark was 2.10.1.

    Rick Adams, then at the Center for Seismic Studies, took over
coordination of the maintenance and enhancement of the news software with
the 2.10.2 release in 1984.  By this time, the increasing volume of news was
becoming a concern, and the mechanism for moderated groups was added to
the software at 2.10.2.  Moderated groups were inspired by ARPA mailing
lists and experience with other bulletin board systems.  In late 1986,
version 2.11 of news was released, including a number of changes to support a
new naming structure for newsgroups, enhanced batching and compression,
enhanced ihave/sendme control messages, and other features.  The current
release of news is 2.11, patchlevel 19.

    A new version of news, becoming known as "C" news, has been developed at
the University of Toronto by Geoff Collyer and Henry Spencer.  This version is
a rewrite of the lowest levels of news to increase article processing speed,
decrease article expiration processing, and improve the reliability of the
news system through better locking, etc.  The package was released to The Net
in the autumn of 1987.  For more information, see the paper "News Need Not Be
Slow", published in the Winter 1987 Usenix Technical Conference proceedings.

    Usenet software has also been ported to a number of platforms, from the
Amiga and IBM PCs all the way to minicomputers and mainframes.



_________________________________
 [1]  The Usenix conferences are semi-annual meetings where members of the
      Usenix Association, a group of Unix enthusiasts, meet and trade notes.



4.8  Hierarchies


    Newsgroups are organized according to their specific areas of
concentration.  Since the groups are in a tree structure, the various areas
are called hierarchies.  There are seven major categories:

`comp'       Topics of interest to both computer professionals and hobbyists,
             including topics in computer science, software sources, and
             information on hardware and software systems.

`misc'       Groups addressing themes not easily classified into any of the
             other headings or which incorporate themes from multiple
             categories.  Subjects include fitness, job-hunting, law, and
             investments.

`sci'        Discussions marked by special knowledge relating to research in
             or application of the established sciences.

`soc'        Groups primarily addressing social issues and socializing.
             Included are discussions related to many different world cultures.

`talk'       Groups largely debate-oriented and tending to feature long
             discussions without resolution and without appreciable amounts of
             generally useful information.

`news'       Groups concerned with the news network, group maintenance,
             and software.

`rec'        Groups oriented towards hobbies and recreational activities.

    These "world" newsgroups are (usually) circulated around the entire
Usenet; this implies world-wide distribution.  Not all groups actually
enjoy such wide distribution, however.  The European Usenet and Eunet sites
take only a selected subset of the more "technical" groups, and controversial
"noise" groups are often not carried by many sites in the U.S. and Canada
(these groups are primarily under the `talk' and `soc' classifications).  Many
sites do not carry some or all of the comp.binaries groups because of the
typically large size of the posts in them (being actual executable programs).

    Also available are a number of "alternative" hierarchies:

`alt'        True anarchy; anything and everything can and does appear;
             subjects include sex, the Simpsons, and privacy.

`gnu'        Groups concentrating on interests and software with the GNU
             Project of the Free Software Foundation.  For further info on
             what the FSF is, see Section 8.3.4 [FSF], page 68.

`biz'        Business-related groups.
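
    The tree structure described in this section can be illustrated with a
short Python sketch.  It is only an illustration: the function names are
hypothetical and are not part of any news software, and the two sets simply
repeat the hierarchy names listed above.

```python
# Hypothetical helpers; the hierarchy names are the ones listed in the text.
WORLD_HIERARCHIES = {"comp", "misc", "sci", "soc", "talk", "news", "rec"}
ALTERNATIVE_HIERARCHIES = {"alt", "gnu", "biz"}


def split_group(name):
    """Return the dot-separated components of a newsgroup name."""
    return name.split(".")


def classify(name):
    """Label a group by the top-level hierarchy it belongs to."""
    top = split_group(name)[0]
    if top in WORLD_HIERARCHIES:
        return "world"
    if top in ALTERNATIVE_HIERARCHIES:
        return "alternative"
    return "other"  # for example, a regional hierarchy


def build_tree(names):
    """Nest group names into a dict-of-dicts tree keyed by component."""
    tree = {}
    for name in names:
        node = tree
        for part in split_group(name):
            node = node.setdefault(part, {})
    return tree
```

    Feeding build_tree the names comp.risks and comp.dcom.telecom gives a
single `comp' branch holding both groups, which is the sense in which the
categories form hierarchies.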



4.9  Moderated vs Unmoderated


    Some newsgroups insist that the discussion remain focused and on-target;
to serve this need, moderated groups came to be.  All articles posted to a
moderated group get mailed to the group's moderator. He or she periodically
(hopefully sooner rather than later) reviews the posts, and then either posts them
individually to Usenet, or posts a composite digest of the articles for the
past day or two. This is how many mailing list gateways work (for example,
the Risks Digest).



4.10  news.groups & news.announce.newgroups


    Being a good net.citizen includes being involved in the continuing growth
and evolution of the Usenet system.  One part of this involvement
includes following the discussion in the groups news.groups and the notes
in news.announce.newgroups.  It is there that discussion goes on about the
creation of new groups and destruction of inactive ones.  Every person on
Usenet is allowed and encouraged to vote on the creation of a newsgroup.



4.11  How Usenet Works


    The transmission of Usenet news is entirely cooperative.  Feeds are
generally provided out of good will and the desire to distribute news
everywhere.  There are places which provide feeds for a fee (e.g. UUNET), but
for the most part no exchange of money is involved.

    There are two major transport methods, UUCP and NNTP. The first is
mainly modem-based and involves the normal charges for telephone calls.
The second, NNTP, is the primary method for distributing news over the
Internet.

    With UUCP, news is stored in batches on a site until the neighbor calls
to receive the articles, or the feed site happens to call.  A list of groups
which the neighbor wishes to receive is maintained on the feed site.  The
Cnews system compresses its batches, which can dramatically reduce the
transmission time necessary for a relatively heavy newsfeed.
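
    As a rough illustration of why compressed batches help, the sketch
below gathers several articles into one batch and compresses it.  Several
things here are assumptions made for the example: the function names are
hypothetical, zlib stands in for whatever compressor a real site uses, and
the `#! rnews' framing line is only modeled loosely on the batch format
mentioned in RFC-1036.

```python
import zlib


def make_batch(articles):
    """Concatenate articles, each preceded by a '#! rnews <bytes>' line."""
    parts = []
    for article in articles:
        body = article.encode("ascii")
        parts.append(b"#! rnews %d\n" % len(body))
        parts.append(body)
    return b"".join(parts)


def split_batch(batch):
    """Recover the individual articles from an uncompressed batch."""
    articles = []
    pos = 0
    while pos < len(batch):
        end = batch.index(b"\n", pos)          # end of the framing line
        size = int(batch[pos:end].split()[2])  # byte count of the article
        start = end + 1
        articles.append(batch[start:start + size].decode("ascii"))
        pos = start + size
    return articles


def compress_batch(articles):
    """Build a batch and compress it (zlib is an assumption, see above)."""
    return zlib.compress(make_batch(articles))


def expand_batch(packed):
    """Undo compress_batch on the receiving side."""
    return split_batch(zlib.decompress(packed))
```

    Because news articles are mostly plain text, a batch of them usually
compresses well, which is what shortens the telephone call.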

    NNTP, on the other hand, offers a little more latitude with how news
is sent.  The traditional store-and-forward method is, of course, available.
Given the "real-time" nature of the Internet, though, other methods have
been devised.  Programs now keep constant connections with their news
neighbors, sending news nearly instantaneously, and can handle dozens of
simultaneous feeds, both incoming and outgoing.

    The transmission of a Usenet article is centered around the unique
`Message-ID' header.  When an NNTP site offers an article to a neighbor,
it says it has that specific Message-ID.  If the neighbor finds it hasn't
received the article yet, it tells the feed to send it through; this is repeated
for each and every article that's waiting for the neighbor.  Using unique IDs
helps prevent a system from receiving five copies of an article from each of
its five news neighbors, for example.
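
    The offer-and-accept exchange just described can be modeled in a few
lines of Python.  This is a toy model under stated assumptions, not the NNTP
wire protocol: the class and function names are hypothetical, and each site
is reduced to a table of the Message-IDs it already holds.

```python
class NewsSite:
    """Toy news site that remembers which Message-IDs it already holds."""

    def __init__(self, name):
        self.name = name
        self.articles = {}  # Message-ID -> article text

    def wants(self, message_id):
        """True if this site has not yet received the article."""
        return message_id not in self.articles

    def receive(self, message_id, text):
        """Store an article under its Message-ID."""
        self.articles[message_id] = text


def feed(sender, neighbor):
    """Offer every article the sender holds; transfer only those wanted.

    Returns the list of Message-IDs actually transferred.
    """
    sent = []
    for message_id, text in sender.articles.items():
        if neighbor.wants(message_id):
            neighbor.receive(message_id, text)
            sent.append(message_id)
    return sent
```

    If two feeds hold the same article, the first call to feed transfers it
and the second transfers nothing, so the neighbor ends up with one copy.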

    Further information on how Usenet works with relation to the various
transports is available in the documentation for the Cnews and NNTP
packages, as well as in RFC-1036, the Standard for Interchange of USENET
Messages, and RFC-977, Network News Transfer Protocol: A Proposed Standard
for the Stream-Based Transmission of News.  The RFCs do tend to be
rather dry reading, particularly to the new user.  See [RFCs], page 73 for
information on retrieving RFCs.



4.12  Mail Gateways


    A natural progression is for Usenet news and electronic mailing lists to
somehow become merged, which they have, in the form of news gateways.
Many mailing lists are set up to "reflect" messages not only to the readership
of the list, but also into a newsgroup.  Likewise, posts to a newsgroup can
be sent to the moderator of the mailing list, or to the entire mailing list.
Some examples of this in action are comp.risks (the Risks Digest) and
comp.dcom.telecom (the Telecom Digest).

    This method of propagating mailing list traffic has helped solve the
problem of a single message being delivered to a number of people at the same
site; instead, anyone can just subscribe to the group.  Also, mailing list
maintenance is reduced substantially, since the moderators don't have to be
constantly adding and removing users.  Instead, people can read or ignore
the newsgroup at their leisure.



4.13  Usenet "Netiquette"


    There are many traditions with Usenet, not the least of which is dubbed
netiquette: being polite and considerate of others.  If you follow a few basic
guidelines, you, and everyone who reads your posts, will be much happier in
the long run.



4.13.1  Signatures


    At the end of most articles is a small blurb called a person's signature.
In Unix it is kept in a file named `.signature' in the person's login
directory; the name will vary for other operating systems.  It exists to
provide information about how to get in touch with the person posting the
article, including
their email address, phone number, address, or where they're located. Even
so, signatures have become the graffiti of computers. People put song lyrics,
pictures, philosophical quotes, even advertisements in their ".sigs". (Note,
however, that advertising in your signature will more often than not get you
flamed until you take it out.)

    Four lines will suffice; more is just extra garbage for Usenet sites to
carry along with your article, which is supposed to be the intended focus of
the reader.  Netiquette dictates limiting oneself to this "quota" of four.
Some people make signatures that are ten lines or even more, including
elaborate ASCII drawings of their hand-written signature or faces or even the
space shuttle.  This is not cute, and will bother people to no end.

    Similarly, it's not necessary to include your signature; if you forget to
append it to an article, don't worry about it.  The article's just as good as
it ever would be, and contains everything you should want to say.  Don't
re-post the article just to include the signature.
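
    For those who want to check a signature against the four-line "quota"
before posting, a small Python sketch follows.  The function names are
hypothetical, and ignoring trailing blank lines when counting is an
assumption of this example rather than an established rule.

```python
MAX_SIGNATURE_LINES = 4  # the "quota" suggested in the text


def signature_length(text):
    """Count the lines of a signature, ignoring trailing blank lines."""
    return len(text.rstrip().splitlines())


def within_quota(text, limit=MAX_SIGNATURE_LINES):
    """True if the signature is no longer than the suggested limit."""
    return signature_length(text) <= limit
```

    The text passed in would typically be the contents of the `.signature'
file, read by whatever means the local system provides.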



4.13.2  Posting Personal Messages


    If mail to a person doesn't make it through, avoid posting the message to
a newsgroup. Even if the likelihood of that person reading the group is very
high, all of the other people reading the articles don't give a whit what you
have to say to Jim Morrison. Simply wait for the person to post again and
double-check the address, or get in touch with your system administrator
and see if it's a problem with local email delivery. It may also turn out that
their site is down or is having problems, in which case it's just necessary to
wait until things return to normal before contacting Jim.



4.13.3  Posting Mail


    In the interests of privacy, it's considered extremely bad taste to post any
email that someone may have sent, unless they explicitly give you
permission to redistribute it.  While the legal issues can be heavily debated,
most everyone agrees that email should be treated as anything one would
receive via normal snailmail,[2] with all of the assumed rights that are
carried with it.

_________________________________
 [2]  The slang for the normal land and air postal service.



4.13.4  Test Messages


    Many people, particularly new users, want to try out posting before
actually taking part in discussions.  Often the mechanics of getting messages
out are the most difficult part of Usenet.  To this end, many, many users find
it necessary to post their tests to "normal" groups (for example, news.admin
or comp.mail.misc).  This is considered a major netiquette faux pas in the
Usenet world.  There are a number of groups available, called test groups,
that exist solely for the purpose of trying out a news system, reader, or even
new signature.  They include:

     alt.test
     gnu.gnusenet.test
     misc.test

some of which will generate auto-magic replies to your posts to let you know
they made it through. There are certain denizens of Usenet that frequent the
test groups to help new users out. They respond to the posts, often including
the article so the poster can see how it got to the person's site. Also, many
regional hierarchies have test groups, like phl.test in Philadelphia.

    By all means, experiment and test; just do it in its proper place.



4.13.5  Famous People Appearing


    Every once in a while, someone says that a celebrity is accessible through
"The Net"; or, even more entertaining, an article is forged to appear to
be coming from that celebrity.  One example is Steven Spielberg: the
rec.arts.movies readership was in an uproar for two weeks following a
couple of posts supposedly made by Mr. Spielberg.  (Some detective work
revealed it to be a hoax.)

    There are a few well-known people who are acquainted with Usenet and
computers in general, but the overwhelming majority are just normal
people.  One should act with skepticism whenever a notable personality is
"seen" in a newsgroup.



4.13.6  Summaries


    Authors of articles occasionally say that readers should reply by mail and
they'll summarize.  Accordingly, readers should do just that: reply via mail.
Responding with a followup article to such an article defeats the intention of
the author.  She, in a few days, will post one article containing the highlights
of the responses she received.  If you follow up to the whole group, the author
may not read what you have to say.

    When creating a summary of the replies to a post, try to make it as
reader-friendly as possible.  Avoid just putting all of the messages received
into one big file. Rather, take some time and edit the messages into a form
that contains the essential information that other readers would be interested
in.

    Also, sometimes people will respond but request to remain anonymous
(one example is the employees of a corporation that feel the information's not
proprietary, but at the same time want to protect themselves from political
backlash).  Summaries should honor this request accordingly by listing the
`From' address as `anonymous' or `(Address withheld by request)'.



4.13.7  Quoting


    When following up to an article, many newsreaders provide the facility
to quote the original article with each line prefixed by `> ', as in:

     In article <1232@foo.bar.com>, sharon@foo.bar.com wrote
     > I agree, I think that basketweaving's really catching on,
     > particularly in Pennsylvania.  Here's a list of every person
     > in PA that currently engages in it publicly


     etc

    This is a severe example (potentially a horribly long article), but proves
a point.  When you quote another person, edit out whatever isn't directly
applicable to your reply.[3]  This gives the reader of the new article a better
idea of what points you were addressing.  By including the entire article,
you'll only annoy those reading it.  Also, signatures in the original aren't
necessary; the readers already know who wrote it (by the attribution).
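
    The quoting convention shown above is simple enough to sketch in
Python.  This is an illustration only: the function name and its parameters
are hypothetical, and the attribution line merely imitates the example in
this section rather than the output of any particular newsreader.

```python
def quote(article_text, message_id, author, prefix="> "):
    """Build an attribution line followed by the quoted article lines."""
    attribution = "In article %s, %s wrote" % (message_id, author)
    quoted = [prefix + line for line in article_text.splitlines()]
    return "\n".join([attribution] + quoted)
```

    A followup would pass in only the lines being answered, since trimming
the quoted text remains the job of the person replying.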

    Avoid being tedious with responses; rather than pick apart an article,
address it in parts or as a whole. Addressing practically each and every word
in an article only proves that the person responding has absolutely nothing
better to do with his time.

    If a "war" starts (insults and personal comments get thrown back and
forth), take it into email; exchange email with