Duplicate Detection in Social Bookmarking Sites
Duplicate entries are one of the evil residues of sites where editing involves many people. There are ways of preventing duplicates (dupes), but none is perfect.
I personally find Digg’s dupe detector somewhat flawed because, by the time the user is shown matches based on similarity, much of the entry has already been written and the effort already spent. The user is thus tempted to push the submission through rather than withdraw it. Netscape, on the other hand, checks the title and URL in-line, but only for identity (exact matches).
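To make the 'identity only' kind of check concrete, here is a minimal sketch in Python. I do not know how either site implements this; the function names and the normalization rules (lowercase host, drop 'www.', drop trailing slash and fragment) are my own assumptions for illustration.

```python
from urllib.parse import urlsplit

def normalize_url(url):
    """Reduce a URL to a comparable form (assumed rules, not any site's actual ones)."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")          # treat /story and /story/ as the same
    query = "?" + parts.query if parts.query else ""
    return host + path + query             # scheme and #fragment are ignored

def normalize_title(title):
    """Lowercase the title and collapse runs of whitespace."""
    return " ".join(title.lower().split())

def is_exact_dupe(url, title, existing):
    """Return True if the URL or the title is identical to an existing entry.

    existing is an iterable of (url, title) pairs already on the site.
    """
    key_url, key_title = normalize_url(url), normalize_title(title)
    return any(
        normalize_url(u) == key_url or normalize_title(t) == key_title
        for u, t in existing
    )
```

Such a check is cheap enough to run while the user is still typing, but it misses any submission whose title is merely reworded.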
Wishlist items:
- Have matches that are more ‘fluid’ appear on the side while input is entered (not just exact matches)
- Permit the user to preview without entering a channel and without tags. This enables the submitter to check for dupes before giving some supplemental information. I am aware that it requires some parsing of the text, which is harder than using tag-based similarity.
- It would be nice to have an option for supplemental items, URLs, and follow-up news. Maybe have a hierarchical connector between related items, or at least some linkage that connects an item with a correction, clarification, op/ed, etc.
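The first wishlist item, 'fluid' matches shown while the title is being typed, could be approximated along these lines. This is only a sketch: it uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure, and the function name, the 0.6 threshold, and the limit of five suggestions are assumptions of mine, not anything the sites are known to use.

```python
from difflib import SequenceMatcher

def similar_titles(draft, existing_titles, threshold=0.6, limit=5):
    """Return existing titles that resemble the draft, most similar first."""
    draft_norm = " ".join(draft.lower().split())
    scored = []
    for title in existing_titles:
        title_norm = " ".join(title.lower().split())
        # ratio() is 1.0 for identical strings and 0.0 for nothing in common
        score = SequenceMatcher(None, draft_norm, title_norm).ratio()
        if score >= threshold:
            scored.append((score, title))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:limit]]

# Example: call this on each keystroke (or after a short pause) and
# show the result next to the submission form.
existing = [
    "Google buys YouTube for $1.65 billion",
    "New Linux kernel released",
]
suggestions = similar_titles("Google buys YouTube for 1.65 billion dollars", existing)
```

Comparing the draft against every stored title would not scale to a large site, so a real deployment would presumably narrow the candidates first (for example by shared words) before scoring them this way.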