links [options] Unix-name-of-served-directory
This is an ordinary Unix program that uses read access to a directory tree served by a web server. It recursively examines each HTML file in the named directory tree. It reports internal links to nonexistent files, and it reports files to which there are no internal links. The program ignores symbolic links in the Unix file system. It does not reprocess a file that it reaches a second time through a hard link.
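The two file-system rules above (skip symbolic links, visit a hard-linked file only once) can be sketched as follows. This is an illustration under stated assumptions, not the code in tree.c: the names already_visited, should_process and MAX_SEEN are hypothetical, and the fixed-size table is just the simplest thing that shows the idea. A file's identity is taken to be its (device, inode) pair as reported by lstat.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

#define MAX_SEEN 4096              /* hypothetical fixed capacity for the sketch */

struct file_id { dev_t dev; ino_t ino; };
static struct file_id seen[MAX_SEEN];
static size_t seen_count = 0;

/* Returns true if (dev, ino) was recorded earlier;
   otherwise records the pair (if there is room) and returns false. */
bool already_visited(dev_t dev, ino_t ino)
{
    for (size_t i = 0; i < seen_count; i++)
        if (seen[i].dev == dev && seen[i].ino == ino)
            return true;
    if (seen_count < MAX_SEEN) {
        seen[seen_count].dev = dev;
        seen[seen_count].ino = ino;
        seen_count++;
    }
    return false;
}

/* Decides whether a directory entry should be examined.
   lstat (not stat) is used so that a symbolic link is seen as a link. */
bool should_process(const char *path)
{
    struct stat sb;
    if (lstat(path, &sb) != 0)
        return false;                          /* unreadable or missing */
    if (S_ISLNK(sb.st_mode))
        return false;                          /* symbolic links are ignored */
    if (S_ISREG(sb.st_mode))
        return !already_visited(sb.st_dev, sb.st_ino);
    return S_ISDIR(sb.st_mode);                /* descend into real directories */
}
```

Two names that are hard links to one file share the same device and inode numbers, so the second name is rejected by already_visited.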
Links of the form <a … href="…" … >, <embed … src="…" … > and <img … src="…" … > are recognized. A link is considered internal if its href or src field does not contain a colon. For example, <a href="link.html"> is an internal link but <a href="http://www.google.com"> is external.
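The colon rule is small enough to state as code. The function name is_internal_link is hypothetical, and the sketch assumes the href or src value has already been extracted from the tag.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true when the href/src value contains no colon,
   which is the rule described above for an internal link. */
bool is_internal_link(const char *target)
{
    return strchr(target, ':') == NULL;
}
```

With this rule is_internal_link("link.html") is true and is_internal_link("http://www.google.com") is false. Any other value containing a colon, such as a mailto: address, is also treated as external.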
There is as yet no “transitive closure” logic in the program. A file that refers to itself is considered referenced even if no other files refer to it.
Problems found in a file are reported in this form:
In file /home/norm/cap-lore.com/CapTheory/Rees/figs.html :
on line 11, Bad keyword
on line 14, No such file as domain-figure.png
The unreferenced files are listed at the end, one file name per line. When a whole directory and its contents are unreferenced, the report looks like:
Priv/obscure/ …
When a file is unreferenced but is in a directory without an index file, and some other file in that directory is referenced, the unreferenced file name is listed, followed by “except trunc”, as in
annotes/88314F2.JPG except trunc
A surfer may find that file by truncating the URL of the referenced file.
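The reporting rule for a single file can be summarized as a three-way decision. This is only a sketch of the rule as described above. The enum, the function name classify and its three boolean inputs are hypothetical, and how the program decides that a directory has an index file is not covered here.

```c
#include <stdbool.h>

enum report {
    REPORT_NONE,          /* file is referenced: not listed */
    REPORT_PLAIN,         /* listed by name only */
    REPORT_EXCEPT_TRUNC   /* listed, followed by "except trunc" */
};

/* Applies the rule from the text: an unreferenced file gets the
   "except trunc" note when its directory has no index file and
   some other file in that directory is referenced. */
enum report classify(bool file_referenced,
                     bool dir_has_index,
                     bool sibling_referenced)
{
    if (file_referenced)
        return REPORT_NONE;
    if (!dir_has_index && sibling_referenced)
        return REPORT_EXCEPT_TRUNC;
    return REPORT_PLAIN;
}
```

For the example above, annotes/88314F2.JPG would correspond to classify(false, false, true).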
These names are relative to the root of the server tree. Funny characters in reported file names are escaped in a form compatible with http conventions, I think.
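The text does not say which escaping scheme is used. As an illustration only, the sketch below assumes percent-encoding: every byte other than an ASCII letter, digit, '-', '.', '_', '~' or '/' is written as '%' followed by two hexadecimal digits. The function name escape_path and the choice of which characters to leave alone are assumptions, not taken from tree.c.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Writes an escaped copy of src into dst, which has room for cap bytes
   including the terminating NUL. Returns 0 on success, -1 if dst is
   too small. */
int escape_path(const char *src, char *dst, size_t cap)
{
    size_t n = 0;
    if (cap == 0)
        return -1;
    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        if (isalnum(c) || c == '-' || c == '.' || c == '_' ||
            c == '~' || c == '/') {
            if (n + 1 >= cap)
                return -1;                 /* no room for char + NUL */
            dst[n++] = (char)c;
        } else {
            if (n + 3 >= cap)
                return -1;                 /* no room for %XX + NUL */
            snprintf(dst + n, 4, "%%%02X", c);
            n += 3;
        }
    }
    dst[n] = '\0';
    return 0;
}
```

Under these assumptions a name such as "a b/c#d.html" would be reported as "a%20b/c%23d.html".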
A few miscellaneous statistics are reported and explained at the very end.
An option beginning “-o” selects one or more of a few debugging options, one per character following the “-o”. The possible selection characters are “nAHF”.
“-mmax” makes the program refrain from reading more than max bytes from files, where “max” is a decimal number. This voluntarily limits the impact on the system, especially while debugging the program. The default is 50000000.
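The two option forms described above could be parsed along the following lines. This is a sketch, not the parsing code in tree.c: the struct, its field names and the function parse_option are hypothetical, and rejecting malformed options by returning 0 is a design choice made for the example. The meaning of each debugging character is not modeled.

```c
#include <stdlib.h>
#include <string.h>

struct options {
    long max_bytes;      /* limit from -mmax; caller sets the default */
    const char *debug;   /* characters following -o, or "" if none */
};

/* Parses one command-line argument.
   Returns 1 if it was a well-formed -m or -o option, 0 otherwise. */
int parse_option(const char *arg, struct options *opt)
{
    if (strncmp(arg, "-m", 2) == 0) {
        char *end;
        long v = strtol(arg + 2, &end, 10);    /* "max" is decimal */
        if (end == arg + 2 || *end != '\0' || v < 0)
            return 0;                          /* no digits, junk, or negative */
        opt->max_bytes = v;
        return 1;
    }
    if (strncmp(arg, "-o", 2) == 0) {
        /* every character after -o must be one of n, A, H, F */
        if (strspn(arg + 2, "nAHF") != strlen(arg + 2))
            return 0;
        opt->debug = arg + 2;
        return 1;
    }
    return 0;                                  /* not an option, e.g. the directory */
}
```

A caller would start from struct options o = { 50000000L, "" }; so that the documented default applies when no “-m” option is given, and would treat the first argument for which parse_option returns 0 as the directory name.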
On Mac OS X (2011 Apr) I get an executable with:
gcc -O3 -arch i386 -fnested-functions tree.c
Some guidelines for portable html