Friday, November 3rd, 2006, 6:55 pm
Duplicates Detection in Social Bookmarking Sites
Duplicate entries are one of the evil residues of sites whose editorial process involves many people. There are ways of preventing duplicates (dupes), but none is perfect.
I personally find Digg’s dupe detector somewhat flawed: by the time the user is shown matches based on similarity, much of the entry (and effort) has already gone into it, so the submitter is tempted never to retract and concede the submission. Netscape, on the other hand, checks for title and URL similarity (identity only) in-line.
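For an identity-only check like Netscape’s, the usual trick is to canonicalize the URL before comparing, so trivial variants of the same address collide. A minimal sketch in Python (the function name and the exact normalization rules are my own assumptions, not how either site actually works):

```python
from urllib.parse import urlsplit, urlunsplit

def canonicalize(url: str) -> str:
    """Normalize a URL so trivial variants map to the same key.

    Lowercases the scheme and host, strips a leading "www.",
    a trailing slash, and the fragment (which never changes
    the target document).
    """
    scheme, netloc, path, query, _fragment = urlsplit(url)
    netloc = netloc.lower().removeprefix("www.")
    path = path.rstrip("/") or "/"
    return urlunsplit((scheme.lower(), netloc, path, query, ""))

# Hypothetical usage: has this story been submitted already?
seen = {canonicalize("http://www.example.com/story/")}
print(canonicalize("http://EXAMPLE.com/story") in seen)  # True -> dupe
```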
Wishlist items:
- Have fuzzier, approximate matches appear on the side while input is entered, not just exact matches (see the sketch after this list)
- Permit the user to preview without entering a channel and without tags, so the submitter can check for dupes before supplying that supplemental information. I am aware this requires some parsing of the text, which is harder than tag-based similarity.
- What would also be nice is an option for supplemental items, URLs, and follow-up news. Maybe have a hierarchical connector between related items, or at least some linkage that ties an item to a correction, clarification, op-ed, etc.
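As a rough illustration of the first wishlist item, here is a minimal sketch of surfacing fuzzy title matches as the user types. It uses Python’s standard-library difflib; the function name, the threshold, and the sample titles are invented for illustration, not anyone’s actual implementation:

```python
from difflib import SequenceMatcher

def similar_titles(draft: str, existing: list[str],
                   threshold: float = 0.6) -> list[str]:
    """Return existing titles that look like near-dupes of the draft,
    ranked by similarity ratio (1.0 = identical)."""
    draft = draft.lower().strip()
    scored = [
        (SequenceMatcher(None, draft, title.lower()).ratio(), title)
        for title in existing
    ]
    return [t for score, t in sorted(scored, reverse=True)
            if score >= threshold]

# Hypothetical usage: re-run on every keystroke and show the hits
# beside the submission form.
submitted = [
    "Duplicates Detection in Social Bookmarking Sites",
    "Ten Tips for Better Bookmarking",
]
print(similar_titles("Detecting duplicates on social bookmarking sites",
                     submitted))
```

A real site would want something cheaper than pairwise comparison against every past submission (an n-gram index, say), but the point is that the check can be approximate and run while the user is still typing.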
November 5th, 2006 at 8:26 am
Good thoughts. I don’t know if I’m the only one who feels this way, but there is really so much spammy and duplicate stuff on the Internet, and it sometimes gets irritating.
For example, I can have a Google Alert on “Social Bookmarking” and then find that all three links point to exactly the same content. Really irks me!