Add gopher-extension.md. - gopher-protocol - Gopher Protocol Extension Project
git clone git://bitreich.org/gopher-protocol git://enlrupgkhuxnvlhsf6lc3fziv5h2hhfrinws65d7roiv6bfj7d652fid.onion/gopher-protocol
---
commit 7c5d9f434a37a64f1fd5b64991df8e154f1dd84d
parent f7bb7959972ce0d459cb2a7b478250f908b3728d
Author: Christoph Lohmann <20h@r-36.net>
Date:   Mon,  8 Aug 2022 18:27:05 +0200

Add gopher-extension.md.

Thanks Evil_Bob for noting.

Diffstat:
  A gopher-extension.md                 |     364 +++++++++++++++++++++++++++++++

1 file changed, 364 insertions(+), 0 deletions(-)
---
diff --git a/gopher-extension.md b/gopher-extension.md
@@ -0,0 +1,364 @@
+Gopher Extension
+=================
+
+# Goals of this document
+
+The intention is not to make radical changes to the RFC1436 standard.
+
+This document also describes the commonly used extensions to the
+Gopher RFC and some clarifications to the wording of the RFC.
+
+Since the publication of the RFC1436 standard around March 1993 there
+have been developments, such as the adoption of the UTF-8
+text-encoding and the use of SSL and later TLS encryption.
+
+The recommendations can therefore be seen as guidelines or
+"SHOULD".
+
+
+# Added Types
+
+Adding types does not violate the RFC specification. Section 3.8 only
+states: "Characters '0' through 'Z' are reserved.", so lowercase letters
+such as 'h' and 'i' fall outside the reserved range.
+
+These are types that are commonly used.
+
+* The 'h' type: HTML file. With the "URL:" prefix in the selector it points to
+  a URL, see historical mail conversation (embedded below).
+* The 'i' type, Informational message: display as text.
+  i Some message TAB empty selector TAB server TAB port CR LF
+  The server and port should be included for compatibility.
+* As mentioned in the original Gopher RFC, for other types:
+  Anything that is primarily a text file? Use the 0 type.
+  Anything unknown or binary? Use the 9 type and a file extension.
+* Use the image (I) type for png, jpg etc. Make sure to set the file extension.
+
+
+# Using the proper type for Text file or binary
+
+* Sometimes the question comes up whether PDF or XML is binary. If the
+  file is readable as text it is a text file, otherwise it is binary.
+
+  For example PDF would be using the binary 9 type and a .pdf file
+  extension.  XML would be a text 0 type.
+
+  Type 0 is for files which are pure text and can be displayed in a text
+  editor.
+
+
+# Text Encoding
+
+* The Notes section in the Gopher RFC mentions Latin1 encoding.
+
+  Recommendation: Use UTF-8 or ASCII-only for the Gopher
+  username/title field.  A client may want to display the other fields,
+  so be polite and use UTF-8 or ASCII there as well if possible.
+
+  Reason: UTF-8 is a simple text-encoding and commonly used these days.
+
+  People who use Latin1 eat children.
+
+
+# Accessibility
+
+* From the Gopher RFC standard:
+  "It is *highly* recommended that the User_Name field contain only
+   printable characters, since many different clients will be using
+   it.  However if eight bit characters are used, the characters
+   should conform with the ISO Latin1 Character Set.  The length of
+   the user-displayable line should be less than 70 characters; longer
+   lines may not fit across some screens."
+
+  New Recommendation: Don't use more than 79 columns of displayed UTF-8
+  encoded "username" text.
+  New Recommendation: Try to reduce the amount of ASCII art which can contain
+  non-printable characters. Think of the blind or tools used to parse actual
+  textual content.
+  Reason: A clarification of the term characters is needed.
+
+* "The selector string should be no longer than 255 characters."
+
+  Recommendation: Use no more than 255 bytes.
+  Reasons:
+  * A clarification of the term characters is needed. "characters" could be
+    interpreted as columns.
+  * Clients can simply use a static buffer to fit 255 bytes.
+  * Although Gopher does not have to map to a filesystem, filesystems typically
+    have a limit of around 255 bytes also.
+
+* From section 3.5:
+  "If a client does not understand what a, say, type 'B' item (not a core
+   item) is, then it may simply ignore the item in the directory
+   listing; the user never even has to see it.  Alternatively, the item
+   could be displayed as an unknown type."
+
+  Recommendation: For clients, do not silently ignore an item, but display it
+  as an unknown type.
+  Reason: Define a recommendation for consistent behaviour in clients.
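The two length recommendations above measure different things: bytes for the selector and display columns for the "username" text. A rough Python sketch of both checks follows. The function names are hypothetical, and the column count is only an approximation based on `unicodedata` (combining marks count as zero columns, wide East Asian characters as two); real terminals may differ.

```python
import unicodedata

MAX_SELECTOR_BYTES = 255   # recommended selector limit, in bytes
MAX_DISPLAY_COLUMNS = 79   # recommended display limit, in columns


def selector_ok(selector: str) -> bool:
    """Check the selector length in UTF-8 encoded bytes, not characters."""
    return len(selector.encode("utf-8")) <= MAX_SELECTOR_BYTES


def display_columns(text: str) -> int:
    """Approximate the number of terminal columns the text occupies."""
    cols = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks attach to the previous character
        wide = unicodedata.east_asian_width(ch) in ("W", "F")
        cols += 2 if wide else 1
    return cols


def display_ok(text: str) -> bool:
    return display_columns(text) <= MAX_DISPLAY_COLUMNS
```

A selector of 128 two-byte characters already exceeds the limit, which is why the recommendation speaks of bytes rather than characters.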
+
+
+# Server and client handling of text file types
+  
+The RFC defines:
+  
+  "Textfile Entity
+  
+  TextFile  ::= {TextBlock} Lastline"
+
+  "Note:  Lines beginning with periods must be prepended with an extra
+       period to ensure that the transmission is not terminated early.
+       The client should strip extra periods at the beginning of the line."
+  
+  "Note:  The client should be prepared for the server closing the
+         connection without sending the Lastline.  This allows the
+         client to use fingerd servers."
+  
+  From section 4:
+  
+        "(b) The well-tempered server ought to send "text" (unless a file
+        must be transferred as raw binary).  Should this text include
+        tabs, formfeeds, frufru?  Probably not, but rude servers will
+        probably send them anyway.  Publishers of documents should be
+        given simple tools (filters) that will alert them if there are any
+        funny characters in the documents they wish to publish, and give
+        them the opportunity to strip the questionable characters out; the
+        publisher may well refuse.
+  
+        (c) The well-tempered client should do something reasonable with
+        funny characters received in text; filter them out, leave them in,
+        whatever."
+  
+We think the above description is too vague and can be simplified.
+  
+Recommendation: Handle retrieving text file types the same as binary types.
+For clients, the Lastline pattern (".\r\n") is not handled specially in this
+case; it is part of the data.
+For servers, no preprocessing is done on the TextFile data.
+  
+Reason: Simplify the implementation of handling text types. Make the behaviour
+of text output consistent for clients.
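Under this recommendation a client needs only one retrieval path. The following Python sketch reads until the server closes the connection and returns the bytes untouched, so a trailing ".\r\n" stays in the data and leading periods are not stripped. The function names, the 10 second timeout and the 4096 byte read size are assumptions chosen for illustration.

```python
import socket


def read_until_close(sock: socket.socket) -> bytes:
    """Collect everything the peer sends until it closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:          # empty read means the peer closed
            break
        chunks.append(chunk)
    return b"".join(chunks)


def fetch(host: str, selector: str, port: int = 70,
          timeout: float = 10.0) -> bytes:
    """Send a selector and return the raw reply, for text and binary alike."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        # No Lastline detection and no period unstuffing: the reply is
        # treated as opaque data, as recommended above.
        return read_until_close(sock)
```

Decoding to text for display, for example as UTF-8, can then happen after the transfer, separate from the protocol handling.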
+  
+  
+# The 'h' type: h_type.txt
+  
+Below is an archived conversation about the Gopher 'h' type:
+  
+  "Received: with LISTAR (v1.0.0; list gopher); Tue, 12 Feb 2002 14:19:47 -0500 (EST)
+  Return-Path: 
+  Delivered-To: gopher@complete.org
+  To: gopher@complete.org
+  Subject: [gopher] Links to URL
+  From: John Goerzen 
+  Date: 12 Feb 2002 14:19:46 -0500
+  Content-type: text/plain; charset=us-ascii
+  Content-Transfer-Encoding: 8bit
+  
+  I think it is best to start small with modifications to the protocol.
+  Therefore, I propose the following:
+  
+  Method to link to URLs from Gopherspace
+  ---------------------------------------
+  
+  1. Protocol issues
+  
+  Links to URLs from a gopher directory shall be defined as follows:
+  
+   Type -- the appropriate character corresponding to the type of the
+   document on the remote end; h if HTML.
+  
+   Path -- the full URL, preceeded by "URL:".  For instance:
+           URL:http://www.complete.org/
+  
+   Host, Port -- pointing back to the gopher server that provided
+   the directory for compatibility reasons.
+  
+   Name -- as usual for a Gopher directory entry.
+  
+  2. Conforming client requirements
+  
+  A client adhering to this specification will, when it sees a Gopher
+  selector with a path starting with URL:, interpret the path as a URL.
+  It will ignore the host and port components of the Gopher selector,
+  using those components from the URL instead (if applicable).
+  
+  3. Conforming server requirements
+  
+  A server with Gopher URL support will not, in most cases, need to take
+  extra steps to provide this support beyond those outlined in
+  Compatibility below.  Servers not implementing those steps outlined in
+  Compatibility will be deemed to be not in compliance.
+  
+  4. Authoring compliance
+  
+  The use of URL: selectors should be avoided wherever possible.  In
+  particular, it should be avoided when pre-existing gopher facilities
+  exist for the type of content linked.  The following URL types are
+  explicitly prohibited by this specification:
+  
+    gopher
+    telnet
+    tn3270
+  
+  Authors should avoid links to any document not of HTML type whenever
+  possible.  Linking to non-HTML documents will break compatibility with
+  Gopher browsers that do not implement this specification.  The ranks
+  of these browsers include most Web browsers, so that is a significant
+  audience.
+  
+  5. Compatibility
+  
+  Links to HTML pages may be accomodated even for non-comforming
+  browsers by providing additional capabilities in the server.
+  
+  When a non-conforming browser is instructed to follow a link to a URL,
+  it will contact the Gopher server that provided the menu (since these
+  are specified per section 1).
+  
+  When a conforming Gopher server receives a request whose path begins
+  with URL:, it will write out a HTML document that will send the
+  non-compliant browser to the appropriate place.  One such conforming
+  document is:
+  
+    
+    
+    
+    
+    
+    You are following a link from gopher to a web site.  You will be
+    automatically taken to the web site shortly.  If you do not get sent
+    there, please click
+    here to go to the web site.
+    

+    The URL linked is:
+
+    http://www.acm.org/classics/
+
+    Thanks for using gopher!
+
+
+
+  This document may be any desired by the server authors, but must
+  adhere to these requirements:
+    * It must provide a refresh of a duration of 10 seconds or less
+    * It must not use IMG tags, frames, or have any reference whatsoever
+      to content outside that particular file -- other than the link
+      to the real destination.
+    * It must not use JavaScript.
+    * It must adhere to the W3C HTML 3.2 standard.
+
+  When a non-conforming Gopher client finds a reference to a HTML file
+  (type h), it will open up the file via Gopher (getting the redirect
+  document) but using a web browser. The web browser will then be
+  redirected to the actual link destination. Conforming clients will
+  follow the link directly.
+
+  END
+
+
+  Comments?
+
+/h_type
+
+
+# TLS support
+
+From: 2020-06-07 Gopher TLS prototype in geomyidae by 20h at
+gophers://bitreich.org/0/usr/20h/phlog/2020-06-07T18-28-23-863932.md
+
+  # 2020-06-07 18:28:23.863932 UTC (+0000)
+
+  Gopher TLS prototype in geomyidae
+
+  We are happy and proud to announce, that there is now a prototype of
+  gopher tls in geomyidae
+
+    git://bitreich.org/geomyidae
+
+  How does it work?
+
+  When a client tries to connect via TLS, the first byte of the packet
+  will be 0x16 or 22 decimal, which is forbidden as a selector in Gopher.
+  This gives the server a hint to start TLS. Old servers will simply
+  reject such a connection attempt.
+
+  For now clic supports TLS. We are working on hurl TLS support. And for
+  sacc it is on its way.
+
+    git://bitreich.org/clic
+    git://bitreich.org/sacc
+    git://codemadness.org/hurl
+
+  Hopefully further support will come to other clients.
+
+  If you do not have anything at hand, here are some commandline clients:
+
+  Plain old Gopher:
+
+    printf "/\r\n" | nc bitreich.org 70
+
+  And with TLS:
+
+    printf "/\r\n" | socat openssl-connect:bitreich.org:70,verify=0 -
+
+  Have fun using TLS on gopher!
+
+
+  All patches and recommendations are welcome.
+
+
+  Sincerely yours,
+
+  20h
+  Senior Security Manager (SSM)
+
+/post by 20h
+
+# Gopher TLS URI
+
+A gopher TLS URI is the same as the Gopher URI described in RFC4266,
+except the protocol scheme is gophers://.
+
+When the client using the Gopher protocol does not support TLS it can
+simply use a plain gopher:// connection.
+
+
+# Gopher TLS downgrades
+
+A client COULD implement the following logic:
+
+When a user uses gophers:// then it should use TLS and not downgrade
+automatically to a plain connection. The client COULD also show a
+_clear_ message if the TLS connection is not accepted and offer a
+manual downgrade option to plain-text.
+
+When further selectors of the same host and port are accessed it should use
+TLS automatically as well.
+
+
+# Gopher+ compatibility
+
+Gopher+ allows adding more TAB-separated fields to the output. For
+Gopher, to be compatible with Gopher+ clients, it can simply accept the
+line, but ignore these additional fields.
+
+
+# Other references:
+
+* RFC1436 - The Internet Gopher Protocol
+  https://www.rfc-editor.org/rfc/rfc1436.txt
+  or references/rfc1436.txt
+
+* RFC4266 - The gopher URI Scheme
+  https://www.rfc-editor.org/rfc/rfc4266.txt
+  or references/rfc4266.txt
+
+* Gopher+:
+  https://github.com/gopher-protocol/gopher-plus/blob/main/gopherplus.txt
+  or references/gopherplus.txt
+
+* geomyidae Gopher server:
+  git://bitreich.org/geomyidae
+
+* Helper tool to validate gopher and DirEntities:
+  git://bitreich.org/gopher-validator
+
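The TLS detection and the gophers:// scheme described in the TLS sections can be illustrated with a small Python sketch. It is not how geomyidae or any listed client implements it. The names `wants_tls` and `connection_plan` are hypothetical, and the default port 70 for gophers:// is an assumption taken from the socat example, which connects to port 70.

```python
from typing import Tuple
from urllib.parse import urlsplit

TLS_HANDSHAKE = 0x16  # first byte of a TLS handshake record (22 decimal)


def wants_tls(first_bytes: bytes) -> bool:
    """Server side: decide from the first received byte whether to start TLS."""
    return first_bytes[:1] == bytes([TLS_HANDSHAKE])


def connection_plan(uri: str) -> Tuple[str, int, bool]:
    """Client side: return (host, port, use_tls) for a gopher(s) URI.

    use_tls is True only for gophers://, and a caller following the
    downgrade guidance would not silently fall back to plain text then.
    """
    parts = urlsplit(uri)
    if parts.scheme not in ("gopher", "gophers"):
        raise ValueError("not a gopher URI: " + uri)
    if not parts.hostname:
        raise ValueError("missing host in URI: " + uri)
    return parts.hostname, parts.port or 70, parts.scheme == "gophers"
```

A server could obtain the first byte without consuming it, for instance by peeking at the socket, and then either wrap the connection in TLS or handle the bytes as a plain selector.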