Rexa, a U.Mass. project for indexing online research in CS and related fields, just became publicly available. It's similar to CiteSeer, Google Scholar, and Microsoft Academic Search, but with first-class objects for people and grants as well as papers, and it uses machine learning techniques for duplicate elimination. Free registration is required for use (maybe to prevent spammers from harvesting their data?).

Searching for myself (isn't that the first thing you'd do?), I found their data to be pretty clean and comprehensive. E.g. they have my correct academic affiliation, home page, email, photo, a list of publications up to 2004 (which I didn't check thoroughly), and a list of co-authors. Checking that list, I found that they spelled Alper Üngör's name wrong (dropped the Ü), misspelled Jiří Matoušek's first name as "Jir", confused Scott Mitchell with someone named Sandra, and have doubles for Paul Chew (once under his real first name Leslie), David Fernández-Baca (once with an apostrophe instead of an á), Erich Friedman (once under Eric), Raffaele Giancarlo (once under "aele"), and Shang-Hua Teng (once under "S.-H."). But that's out of a bit over 100 people, and who knows how many different spellings of their names, so it seems like a pretty good level of accuracy to me. I didn't check whether there was anyone missing, but it's about the right number of names.
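To give a feel for why these name variants are hard to reconcile, here is a toy Python sketch of the crudest possible approach: strip accents and punctuation, then compare names by surname plus first initial. This is only an illustration under my own assumptions, not a description of Rexa's method (which is described only as using machine learning); `strip_accents` and `name_key` are made-up helper names.

```python
import unicodedata


def strip_accents(text):
    """Remove combining accent marks, e.g. 'á' becomes 'a'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def name_key(full_name):
    """Hypothetical matching key: (surname, first initial), accent-free."""
    cleaned = strip_accents(full_name).replace("'", "").lower()
    parts = cleaned.replace(".", " ").split()
    return (parts[-1], parts[0][0])


# Variants that this crude key does merge:
print(name_key("Shang-Hua Teng") == name_key("S.-H. Teng"))                  # True
print(name_key("David Fernández-Baca") == name_key("David Fern'andez-Baca"))  # True

# A duplicate it misses, because the first names share no initial:
print(name_key("Paul Chew") == name_key("Leslie Chew"))                      # False

# And a false merge of two different people:
print(name_key("Scott Mitchell") == name_key("Sandra Mitchell"))             # True
```

Even this simple key reproduces both kinds of error from the list above: it cannot join Paul and Leslie Chew, and it would happily merge Scott Mitchell with a Sandra. That is presumably why a real system needs more evidence than the name string alone, such as co-authors, affiliations, or venues.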

The 2004 limit on pubs is a little worrisome, since I'd want a service like this to be more up-to-date, but I imagine that's something they can work more on as they become more polished.

They also have a blog with project announcements.





Comments:

phenyx:
2006-04-18T06:17:59Z

Well, it can't be any worse than Citeseer. The only advantage to Citeseer is that it keeps PDF/PS archives of the papers - and sometimes, but only sometimes, it has proper BibTeX records.

Most of the time it's something like

@misc{ noack03energy,
  author = "A. Noack",
  title = "An energy model for visual graph clustering",
  text = "Andreas Noack. An energy model for visual graph clustering. In G. Liotta,
    editor, Proceedings of the 11th International Symposium on Graph Drawing
    (GD 2003), LNCS 2912, pages 425--436, Berlin, 2004. Springer-Verlag.",
  year = "2003",
  url = "citeseer.ist.psu.edu/article/noack03energy.html" }

vs. a much more correct record from GDEA:

@inproceedings {GDEA-472, 
author = {Noack, Andreas}, 
year = {2004}, 
title = {An Energy Model for Visual Graph Clustering}, 
editor = {Liotta, Giuseppe}, 
pages =  {pp. 425-436}, 
booktitle =  {Graph Drawing, Perugia, 2003}, 
publisher = {Springer}, 
}

If only U of L's library would set things up so that I could actually jump into the ACM Digital Library and the other subscriptions they have, directly from Google Scholar.

11011110:
2006-04-18T06:30:51Z

Google Scholar is the one I almost always use these days. I liked Citeseer when it was newer — and put quite a bit of effort into cleaning up its records for my own papers — but eventually became frustrated with all the inconsistencies, and with the infrequency with which it ran web crawls. Web of Science isn't bad, but I tend to avoid it because of the need to set up a VNC first when I'm connecting from home, and because its coverage is too journal-centric. And I don't often need specialized databases such as MathSciNet, although that one is good for finding old math papers.

helger:
2006-04-19T09:38:21Z

According to WOS I have 8-9 times fewer citations than according to Scholar - this shows something about the coverage of WOS (no IEEE, no ACM, no anything). It is also privately owned and not free; you need registration etc. I also hate their interface. => WOS out.

CiteSeer was nice, and I really loved it. However, a) it crawled too infrequently which made its entries less than complete, and b) it depended too much on somebody submitting his own papers which made its entries less than complete. However, it has good statistics (citings per year, non-self citings separately, expected influence of your paper, possibility to enter bibtex/abstract etc) that Scholar lacks.

Scholar has by far the best coverage (well, it's owned by Google) but it lacks all the bells of CiteSeer: you only get to see papers and who cites them. It also does not update very frequently: the last time was almost three months ago.

Rexa seems to have most of the bells that made CiteSeer so nice, plus a better user interface. But their library is still really tiny (they say Phil Rogaway has 109 citations?!)

I hope Scholar/Rexa/CiteSeer will be able to learn from each other!

11011110:
2006-04-19T16:47:32Z

According to WOS I have 8-9 times fewer citations than according to Scholar - this shows something about the coverage of WOS (no IEEE, no ACM, no anything). It is also privately owned and not free; you need registration etc. I also hate their interface. => WOS out.

The good part about WOS is that their data is hand-entered. So it's expensive and limited coverage, but what data they do have is quite clean.