Formatting a couple hundred references for a proposal led me to wonder: If you find yourself wanting to look up the BibTeX data for a paper, where do you go? And how much do you have to edit it yourself afterwards?

The three most obvious choices for me are DBLP, ACM Digital Library, or MathSciNet.

There used to be a project to maintain a collective file "geom.bib" with all the references that any computational geometer would ever use. I still have about 18 copies of it on my computer (presumably not all in sync with each other) from various papers that used it, but it became unwieldy (too big to use as one file) and seems to have fallen by the wayside. Additionally, many publishers supply citation files for their own publications, so you could use those, or even take the time to write your own. But my experience is that most of the publishers are not good at generating clean data (e.g. they use hyphens instead of en-dashes for page ranges, or permute conference title words into a different order than what you'd want to use in a citation), although at least they're better at it than Google scholar.

The big three above all have their quirks, but they generate pretty clean data (especially if you tell DBLP not to use crossref). Copying from them can be a lot easier and less error-prone than typing it all in yourself, and picking one source and sticking to it could also help achieve greater consistency. DBLP has the best coverage for Computer Science, I think. I recently looked at a five-year window of my papers (for the prior work section of that proposal) and it missed only three (two in non-computer science journals about topology and mathematical psychology, and the third in an edited volume about cellular automata).

My own idiosyncratic preference is for MathSciNet, though. Their coverage is almost as good for my purposes (sometimes better) but what ends up making the difference for me is their care about the capitalization of title words and formatting of math in titles. DBLP and ACM leave lots of words capitalized and let the bibtex style lowercase them later, which mostly works, but fails when some words are proper nouns that should stay capitalized. MathSciNet takes care to lowercase everything to how it should appear in a citation (my preference) and to protect the letters that should remain uppercase. And for titles that contain formulas, MathSciNet gets it right and the other two don't.

Example: ACM: "The h-Index of a Graph and Its Application to Dynamic Subgraph Statistics".
DBLP: "The h-Index of a Graph and Its Application to Dynamic Subgraph Statistics" (journal version); "The \emph{h}-Index of a Graph and Its Application to Dynamic Subgraph Statistics" (conference version).
MathSciNet: "The {$h$}-index of a graph and its application to dynamic subgraph statistics". One of these is correct and the others aren't.

But maybe there's some new tool or database that beats all of these that I haven't yet found out about. One of my co-authors uses Zotero, but I haven't tried that myself. Are systems like it based on shared libraries rather than comprehensive databases still useful?