A lesson I have learned about large scale preservation projects from the Geocities situation is that one person cannot do it all.  Two people cannot do it all.  Three people cannot do it all.  Four people cannot do it all, even if they are dedicated, well meaning, have certain skills and are willing to put the time into saving and preserving the history of an online community.  I’m ecstatic about what we accomplished but I’m also disappointed.  I feel like we could have done more if there had been more than a core team of a few of us at Fan History.

In doing a preservation project, one person trying to save things means that things will be missed.  The process can be greatly aided by institutional help in determining the size and scope of what needs to be saved, and in developing a list of resources that need to be saved.  The following are institutions that it would have been useful for us to have had relationships with:

  • Yahoo/Geocities.  ArchiveTeam tried to reach out to them and were rebuffed.
  • Open Directory Project.  We didn’t really need them because they helpfully provide a list of all the sites listed on DMOZ.  We were able to utilize this.  For that, we are happy.
  • WebRing.  Lots of Geocities sites were listed there.  We couldn’t easily scrape them to pull a url list.  They didn’t respond to our contact requests.  We couldn’t find anyone associated with them on Twitter to help us get that.  We probably should have called and called and nagged. :/
  • ArchiveTeam.  We used one of the lists they provided to get urls to scrape but most of what they provided publicly wasn’t that useful for our needs.  I left a comment on their blog.  We should have reached out more to them.  They had the technical knowledge to do things, had the computers, had the dedication to do this.  We have the structure to provide a historical framework for some of what they were doing.
  • The Internet Archive. They archive old copies of pages across the Internet.  
  • Organization for Transformative Works.   They are dedicated to preserving fandom history.  Repeatedly reached out to them on Twitter, on LiveJournal, via e-mail and elsewhere.  Never responded.  EXTREMELY FRUSTRATING. They have manpower and a shared purpose that would have been helpful.

We should have gotten institutional support.  They could have provided us with three things: 1] A (structured) list of fandom related urls, 2] Technical assistance, 3] Community support and manpower.  Some of that we got because these institutions provide some of this as part of their own efforts and mission.

After institutional support, we needed technical assistance.  We were lucky in that we got some assistance from Lewis Collard and illyism.  We just didn’t think to ask some people until too late, about two weeks before Geocities closed.  Asking two individuals for all our technical assistance and implementation of technical solutions isn’t fair and won’t result in a timely solution.  It doesn’t allow enough time for integration with institutional support.  Lewis scraped about 5,000 pages and screencapped them.  He downloaded about 2,000 files.  In doing all this, he blew through his monthly bandwidth allotment.  Illyism developed a Firefox plugin.  There are several resources we could have, and probably should have, tapped in our own community, the wiki community and the fandom community.  They include:

  • #mediawiki, #wiki, #yourwiki, #recentchangescamp, #wikia, #pywikipediabot, #wikihow on irc.freenode.net.  These chat rooms have awesome people in them.  They do a lot of open source projects. 
  • Organization for Transformative Works.  They are training female programmers for their projects. 
  • ArchiveTeam and Internet Archive.  They already have technical people. 
  • Idealist.Org.  Listing requests for volunteers might have gotten some assistance.
  • Wikimedia related mailing lists.
  • Tech and programming people we met at RecentChangesCamp.
  • Relevant LinkedIn communities.
  • wget developer community.
  • identi.ca.  They have a lot of developers on the site who use it instead of Twitter.  It is smaller and more intimate.  It has a lot of open source advocates who like doing things for the greater good.

There are probably a few more places where we could have asked for help.  We could have used help creating a list of urls, screencapping sites, creating tools to automatically input the attained data into a usable format, and creating tools to make it easier for non-tech people to contribute.

The last component of “we cannot do it by ourselves and succeed” is community support.  In our case, that community involves fandom.  We needed the community to help edit relevant articles and to do deeper documentation on stub articles created by our core team with assistance from our tech team.  Sometimes in fandom, it is easy to get locked into a box of thinking of your corner as representative of everyone.  The more people we had, the smaller that box would have been.  We should have been more aggressive with our outreach to that community and to internet news and mainstream news sites that would have covered our project and helped us get greater exposure with our core audience.  Places where we should have been more aggressive include:

  • wikiFur, Transformers Wiki, Futurama Wiki, Battlestar Galactica Wiki, Wikia, other fandom related wikis.  The content we created on Fan History that preserved the history of those fandoms could have been shared with them so everyone could have won, if we had reached out.
  • Twitter.  We should have been tweeting damned near every fan person who mentioned Geocities on Twitter to beg for assistance.
  • Facebook, MySpace, Yahoo!Groups, Ning.  We didn’t reach out there at all.  Large fan communities exist on them and are ones that probably used Geocities.
  • Fanac Fan History Project and other science fiction historical projects.  They stood/stand to lose a lot of their own history too.  This includes convention reports, early histories of scifi fandom on Geocities, pictures, etc.
  • Fansites.  Lots of domain level fansites will plug meaningful projects.  They could have helped us get the word out and helped us recruit people for this project.

We mostly tried to mobilize through LiveJournal, through blogging outreach, on IRC, through contacts in the wiki community (Thank you ProjectOregon and AboutUs!) and through our own personal network of acquaintances.  This wasn’t as successful a method as it could have been.

Repeating: One person (or a small team) cannot do it all.  For this type of project to be successful, you need three things: Institutional support, technical support and community support.  If we hear that other sites are at risk in the future, we now know what needs to be done and better understand the consequences of failing.  We’ve learned an important lesson.

In the meantime, we’ve still got a lot of data that needs to be processed into a usable format for the wiki.  We could use some technical and community assistance in accomplishing that.  How do we get 3 gigs of screencaps uploaded?  What is the best way to upload 1,000 text files to a wiki?  Is there a way to scan those text files to make sure that they are what we think they are?  As a community member, can you help us improve articles we did create?  Can you help by improving descriptions on the screencaps we do have uploaded?  We still need help.
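
On the question of scanning the text files: this is only a rough sketch of one way it might be done, not something we have actually run on our data.  It assumes the files sit in one folder with a .txt extension.  The folder name, the list of suspect markers, the 50 character minimum and the function names are all made up for illustration.  It flags files that are empty, look like binary data, look like a saved error page or raw HTML, or are too short to be a real page.

```python
from pathlib import Path

# Hypothetical markers suggesting the scraper saved an error page or raw HTML
# instead of page text. Adjust these to whatever the scrape really produced.
SUSPECT_MARKERS = ("404 not found", "page not found", "<html", "<!doctype")


def check_text(data: bytes, min_chars: int = 50) -> str:
    """Return "ok", or a short reason why the content looks wrong."""
    if not data.strip():
        return "empty"
    if b"\x00" in data:
        # Null bytes almost never appear in real text files.
        return "binary"
    # Decode leniently so odd encodings do not stop the scan.
    text = data.decode("utf-8", errors="replace").lower()
    for marker in SUSPECT_MARKERS:
        if marker in text:
            return "suspect: " + marker
    if len(text.strip()) < min_chars:
        return "too short"
    return "ok"


def scan_folder(folder: str) -> dict[str, str]:
    """Check every .txt file in the folder and map file name to verdict."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        results[path.name] = check_text(path.read_bytes())
    return results


if __name__ == "__main__":
    # "scraped_text" is an assumed folder name.
    for name, verdict in scan_folder("scraped_text").items():
        if verdict != "ok":
            print(name, verdict)
```

Anything it prints would still need a person to look at it.  The point would just be to cut 1,000 files down to a short list worth checking by hand before anything goes on the wiki.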

Now, I’m going [maybe] to take a bit of a wiki break, data mining break, data processing break because I’ve pretty much been doing that straight through for the past week.