Mindmap
Livejournal MindMap appears from its description to be a graph drawing system for visualizing connections between livejournals. The interesting part, to me, is not so much how the graphs are actually drawn (likely some kind of force-based system) but how they pick out which information to make more and less prominent in the graph via coloring, sizing, or even by choosing to include or exclude some vertices. Livejournal has about 8 million blogs, too much for a single drawing. Even if it's just restricted to other journals one person connects to, it may be difficult to fit them all as labels within a single drawing — the user mcfnord who wrote the entry I'm linking to has nearly 300 individual journals he or she lists as friends, and nearly 200 who list him or her as friends, not even counting community journals etc. So obviously some amount of selectivity is necessary in order to make any useful information apparent in a drawing.
Following some links, it appears the system looks only at livejournals connected directly to yours, and does some sort of clique listing algorithm to find clusters among the graph induced by those vertices. Vertices are displayed as bigger, brighter, and closer to the center when they belong to larger cliques. Maybe the colors indicate different cliques? Apparently it's very slow if you have large maximal cliques, because they list all cliques and not just the maximal ones; perhaps they should look into less naive maximal clique listing algorithms... (actually, not the algorithm of mine linked here, that's intended for independent sets in sparse graphs or cliques in dense ones, but some of its references might be relevant).
Unfortunately the email-return user interface makes it cumbersome to experiment, and anyway this journal hasn't been around long enough to acquire an interesting neighborhood of connections.
ETA 10/8/05: mindmap for this journal. As I said above, not a very interesting graph yet.
Comments:
2005-10-08T08:27:07Z
At some point I would like to translate this from Python to C#: http://tucc.us/clique.py.txt
This will go a long way to speeding things up. But all that's an optimization, and relatively speaking, the pipeline chugs along. Speed is not my primary goal, so the old code is allowed to clunk along.
You get at a central question: what's so big about cliques that I show them big and closer to the center?
Vertices are a purely practical reason why. The vertex density is highest in the largest cliques.
My design has been criticized for lacking "proper" forced-based design. Just as cliques should pull together, they should repel from each other. But it is not a generalized map. It is based on a single individual. What is sensible about a cluster "repelled" from the individual without whom there would be no structure here at all?
Early in my implementation, I realized that there are so many different LJ scenarios, such as mine w/ ~200 people, and yours with 2 two-way connections. Font sizing cascades with the "top two" clique sizes at the largest font size... and if someone only has one clique of them and their buddy, then of course large fonts are in order. I didn't want the product to suggest you're "smaller" than anyone else... if I'm doing little more than showing you and your buddy's name, let's get some shine on that. I haven't thought about this detail in some time but it was a major consideration in the initial design.
While my major motivation for each modality (size, placement, color) is of course a "best guess" at cleaving out some connectivity patterns and inferences, another motive was to overwhelm the viewer so people would stop seeking strict literal explanations. My dream modality is time.
I have wished to add in other factors, such as available information from server logs (which can give clues about activity, not just friend bit linkage), and of course comment analysis. But you can get a lot from the clean and simple who-friends-who. I do think the clique approach is pretty good but other data sources should operate as 'hinting and highlights' to draw out patterns when some vantage provides them.
I've also considered expanding visualization beyond the "first hop"... I wasn't sure what this would try to accomplish. Showing the "frontier" just beyond the people you know? For context or for discovery? I settled on the new Radar feature which calculates who is up-and-coming among people i read. I now use the radar feature regularly to find cool people. It really works! In a way the whole project just feeds my junkie relationship with livejournal, and then I wired it up so everyone can play.
2005-10-08T20:52:04Z
If you want to speed it up, I'd expect to get more mileage out of tweaking the algorithm first (easier to do while you're still in Python), and only then if you still need it moving to a faster language. For instance, have you tried timing it with the computation of "with" before the one of "without" instead of after? I'd imagine the "with" cliques are likely to be larger, therefore provide a more likely early cutoff in the "without" computation.
Re the force-based design, who cares if it's "proper" as long as it produces the visual results you want. But I think you might find it interesting to look at the work of Noack that I mention in this entry. At least in the examples he showed, it did an amazing job of making hidden graph structure visible.
BTW, I was interested to see the "radar" in the map your software made for me, listing other journals I might be interested in. It's just going from the friend graph structure, and my own lj's friend structure is sparse, so it's obviously noisy, but two or three of them (out of 80 or so) were very on-target.
2005-10-08T22:12:03Z
The Python code is already fast! I just have to get it into C# and I'm on my way! I'm going to outsource this some day. Another user wrote this code for his "cliques" meme. I think it's darn good but I only understand 85% of it. There's recursion... with a twist.
What can I say about repelling force? Is there truly any data here that I can describe as representing repelling force between anyone?
The "first run" of the radar feature is about half as cool as subsequent runs. The first run was the hardest to figure out. I had no historical context, which the "core" approach relies on. I had to simply find another way, and until that was there, I couldn't really provide "radar" without a three week wait! Another user worked closely with me to think up an acceptable "first run". If your network grows very slowly, then radar might "throw you back in for another three weeks"... when it sees a threshhold of change, it updates the associated iamge and publishes the results.
I am adding a lot of my radar hits. I did some tests exposing results to "really popular people" and when they started drooling I knew that special sauce was ready. But it's not visual and it's not sensibly a meme... I could make it a meme, sorta... thankfully i already have one.
2005-10-08T22:23:43Z
The repelling force is just to keep the labels spread out instead of landing on top of each other. So it exists between every pair of nodes. Think of it as being like "personal space" in real life — if someone steps to within a few inches of you, you'll probably step back, or at least feel annoyed about the violation of your space. It's the attracting force that depends more strongly on the graph structure.