For most people, collecting textfiles is an incidental process, a by-product of having seen something funny on a BBS (or a Usenet newsgroup, or FTP site as time went on...) and buffering it to disk or doing a screen capture; in other words, tucking it away for "later". Of course, "later" turns out to be anything from a week to over a decade, and at that time the user is faced with probably a couple dozen weirdly-named files, some of which still produce a chuckle and others of which have lost any contextual meaning altogether.
Now, change "couple dozen" to "over 15,000" and you start to see the problem.
My personal collection of textfiles numbered about 3,000, but once I started doing some serious looking around and calling in favors, I suddenly had a rough collection of five times that many. I make it a point not to set any limits on what I'm looking for and what I'll accept, but as I go through the textfile directories, there are some basic types of files that are more likely to disappear:
Source files with no contextual explanation or meaning will not stick around, unless it's something historical (the source code to the original "adventure" game) or importantly contextual in itself (the code to the 1989 internet worm got in that way). I don't consider it within the scope of this site to be a shareware archive and many times I couldn't tell if I had all the proper components anyway.
Binary files would seem to be an obvious type that gets deleted right away, but there, I said it outright. In one or two cases there are image files here, but this is so rare as to be lost in statistical variation. (For example, one of the 480+ UXU files was a .jpeg file and I felt like deleting the charming staff photo would just be mean.)
Press releases and news stories that do not strike me as entertaining or otherwise distinct are usually lost. There is a news directory, so it's probably possible to track my bias(es).
An important topic to discuss is that of doubled files, files that show up a couple of times in textfiles.com. If several files are almost exact duplicates of each other, with maybe the BBS tagline being different, then I will choose one arbitrarily to go into the archive. However, the nature of textfiles is such that some unusually popular ones will be reformatted, augmented, spell checked, and, my personal favorite, re-credited. This "morphing" happens most often with "underground" files and what I call "office humor". I believe the record goes to the Gold Box Plans by Sir William, of which I've seen no fewer than 30 incarnations! Popular file, that.
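For anyone facing the same pile, the "almost exact duplicate" case can be roughed out in code. What follows is only a sketch under stated assumptions, not a description of how this archive was actually sorted (that was done by eye): it assumes two files count as near-duplicates when at least 90% of their non-blank lines match, and the function names, the threshold, and the file names in the usage note are all made up for illustration. Heavily morphed versions would fall below the threshold and so stay in as separate files.

```python
from difflib import SequenceMatcher


def normalize(text):
    # Drop blank lines and collapse runs of whitespace so that
    # re-wrapped spacing does not count as a difference.
    return [" ".join(line.split()) for line in text.splitlines() if line.strip()]


def similarity(a, b):
    # Fraction (0.0 to 1.0) of matching lines between two texts.
    matcher = SequenceMatcher(None, normalize(a), normalize(b), autojunk=False)
    return matcher.ratio()


def pick_representatives(files, threshold=0.9):
    """Keep one file name per group of near-duplicates.

    files maps a name to its text. The 0.9 threshold is an assumed
    cutoff; the choice of which duplicate survives is arbitrary
    (here, simply the first name in sorted order).
    """
    kept = []
    for name in sorted(files):
        if not any(similarity(files[name], files[k]) >= threshold for k in kept):
            kept.append(name)
    return kept
```

Given a hypothetical dictionary holding two copies of the same file that differ only in a closing BBS tagline, plus one unrelated file, pick_representatives would return the first copy and the unrelated file. Comparing every file against every kept file is slow for thousands of files, so this is better suited to one directory at a time.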
Finally, the last files that I'll send to the Null hereafter are those nebulous "hanger-on" files that always seem to get underfoot: data files, lists of files from packages long split up or deleted, README files and their ilk, or any file that is uninformative or completely useless without the accompanying "stuff".