I’ve been bouncing off the walls for a bit now as we’re supposed to be getting a new bot for Fan History that is similar to Fan Fiction Stat Bot. (It probably won’t be ready for another three weeks to a month. I’m not in that much of a hurry and I’d rather the developer do it right.) The major difference is that this one will look at LiveJournal and its clones, and measure the growth and size of fandom on them by monitoring the number of new posts and total comments to a selection of communities. Those communities will then be sorted by fandom so that we can compare the sizes of different fandom groups.
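To make the idea concrete, here is a rough sketch of the kind of roll-up I have in mind: take per-community counts of new posts and total comments, then add them up by fandom so the fandoms can be compared. This is not the developer’s code. The community names, the numbers and the function names are all made up for illustration, and the real bot may work quite differently.

```python
from collections import defaultdict

# Hypothetical sample rows: (community, fandom, new_posts, total_comments).
# None of these names or numbers come from the real community list.
SAMPLE_COUNTS = [
    ("example_hp_fic", "Harry Potter", 12, 340),
    ("example_hp_art", "Harry Potter", 5, 80),
    ("example_twilight", "Twilight", 9, 150),
]


def totals_by_fandom(rows):
    """Sum community-level counts into one record per fandom."""
    totals = defaultdict(
        lambda: {"communities": 0, "new_posts": 0, "total_comments": 0}
    )
    for _community, fandom, new_posts, total_comments in rows:
        entry = totals[fandom]
        entry["communities"] += 1
        entry["new_posts"] += new_posts
        entry["total_comments"] += total_comments
    return dict(totals)


def rank_by_posts(totals):
    """Return (fandom, record) pairs, busiest fandom first."""
    return sorted(
        totals.items(), key=lambda item: item[1]["new_posts"], reverse=True
    )


if __name__ == "__main__":
    for fandom, record in rank_by_posts(totals_by_fandom(SAMPLE_COUNTS)):
        print(fandom, record)
```

Keeping the number of communities next to the post and comment totals matters, because a fandom with 20 communities in the sample will look busier than one with 2 even if the second is really the larger fandom.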
The sample community list is about 2,500 different communities. It represents probably about 750 different fandoms. The list isn’t 100% comprehensive because you can’t find every LiveJournal community based on a fandom and you can’t list every fandom. And that’s what leaves me flummoxed. How much time should be spent building a more comprehensive list of LiveJournal communities? And InsaneJournal communities?
The thing with the sample is that I know going in that it won’t cover everything. It isn’t possible. It isn’t feasible. The communities need to be manually vetted to make sure that when a community lists, say, Twilight as a fandom, it is actually about Twilight. (And not, say, a community of pictures of sunsets.) This list takes a lot of time to compile. I’ve probably spent in the neighborhood of 24 hours compiling the list that was used for the LiveJournal bot. I’ve probably spent an equal amount of time compiling the updated list that will be used for this bot, as I’ve needed to develop lists for InsaneJournal, JournalFen, Inksome, Scribbled, Blurty, DeadJournal and ivanovo.ru. I could easily spend another week adding to the list beyond that, bringing the LiveJournal list to 5,000 communities and the InsaneJournal list to 1,000. The other services have much less activity and fandom communities are much harder to find. To a degree, it takes much more time to find fandom communities there, and all for a much smaller list. Two or three of those services are lucky to have five communities on them.
There are other issues. A lot of communities are long abandoned, not having been updated in years in some cases. (This feels like it is the case for smaller fandoms.) They are never going to appear on any list of active fandoms as a result. Including them feels necessary but also counterproductive because of the sparse amount of activity related to them. Still, if we don’t have them in our list, how good of a sample do we really have? And what is the cutoff point? I know for Fan History, we’ve posted to communities which haven’t had activity in more than a year… so new posts, new members, new comments are always possible. Except probably in the case of role playing communities. Those pretty much feel dead once the players have quit the game.
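One way to dodge the cutoff question is to keep every community in the list and just label the quiet ones, so nothing gets thrown away and the cutoff can be changed later. Here is a minimal sketch of that, assuming we know the date of each community’s last post. The one-year default, the function name and the idea of labelling rather than dropping are my own assumptions, not something the bot is confirmed to do.

```python
from datetime import date, timedelta


def split_by_activity(last_post_dates, today, cutoff_days=365):
    """Split communities into (active, dormant) name lists.

    last_post_dates maps a community name to the date of its latest post.
    A community is "active" if it posted within cutoff_days of today.
    The 365-day default is an arbitrary, hypothetical choice.
    """
    cutoff = today - timedelta(days=cutoff_days)
    active, dormant = [], []
    for community, last_post in last_post_dates.items():
        if last_post >= cutoff:
            active.append(community)
        else:
            dormant.append(community)
    return sorted(active), sorted(dormant)
```

Dormant communities stay in the sample this way, so if one of them wakes up after a year of silence it simply moves back into the active group on the next run.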
Another issue is sample size. How many communities are enough? When do you stop the list? Is it better to have fewer fandoms represented but to get more communities represented for each fandom? Or should we find one or two communities which we can have represent the whole of the fandom? More fandoms or more communities per fandom? I look at the Harry Potter and Twilight fandom lists and go ZOMG! Those huge fandoms only have about 20 communities in that sample! They are HUGE! They should have at least 100! Naruto, Inuyasha! Same deal! But we are still missing a whole slew of actors and television shows and anime and manga!
The dataset we are going to develop is going to be really, really interesting, and really, really useful. It will help provide some data which can give a quantitative picture of exactly what is happening in parts of LiveJournal fandom. It will help give a picture of the size, and comparative size, of various fandoms on LiveJournal. You’ll also be able to examine the effect of certain events in the fandom on the size and amount of activity in fandom. For instance, does a community being featured on Fandom Wank lead to posting and membership spikes or a membership drop? Does the release of canon cause an increase in posting volume, create a membership spike or both? I’m really excited about getting this bot developed and up and operating.
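As a rough idea of how the event questions could be checked once the data exists, here is a small sketch that compares the average number of posts per day in the week before an event with the week starting on the event day. It assumes the bot’s output can be turned into a simple date-to-post-count mapping for a community. The function names and the seven-day window are hypothetical, and a before/after average is only a first look, not proof that the event caused the change.

```python
from datetime import date, timedelta


def mean_daily_posts(daily_posts, start, end):
    """Average posts per day from start up to (not including) end.

    daily_posts maps a date to that day's post count; days with no
    entry are counted as zero posts.
    """
    days = (end - start).days
    if days <= 0:
        raise ValueError("end must be after start")
    total = sum(
        daily_posts.get(start + timedelta(days=offset), 0)
        for offset in range(days)
    )
    return total / days


def activity_around_event(daily_posts, event_day, window_days=7):
    """Return (mean before, mean after) for windows around event_day.

    The "after" window starts on the event day itself.
    """
    window = timedelta(days=window_days)
    before = mean_daily_posts(daily_posts, event_day - window, event_day)
    after = mean_daily_posts(daily_posts, event_day, event_day + window)
    return before, after
```

The same comparison would work for membership numbers or comment counts, as long as they are recorded per day.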
In the meantime, you’ll see me over there busy working on adding to that list…