Monday, January 2nd, 2006, 3:29 pm

Blog Plagiarism

Laundry machines
Help the search engines clean up the Web.
Report duplicates.

I recently mentioned site scrapers in the context of Internet plagiarism. Nowadays I hear more and more often about blogs being copied systematically.

Blog plagiarism is a growing phenomenon, or so it seems on the surface. It even happens to me sometimes, but I refuse to spend time or lose sleep over it, because the process needed to get stolen content removed is unnecessarily cumbersome. As an example, Podz and Mike Little, who are both WordPress developers, had people copy their entire sites, post by post. This can ultimately lead to mirror/duplicate-content penalties in search engines. As far as I know, they had to engage in lengthy correspondence before any action was taken. The best one can do is keep an eye on the dodgy sites and report abuse when things get out of hand. As long as a site is public, it is susceptible to copyright infringement and can, in due time, become a victim.

One example of stolen content is RSS Site Map, which was once copied verbatim and in full. If I recall correctly, a Blogger member was the culprit. There was at least a subtle link back, but no real attribution was made.

Other content thieves scrape random bits and stitch them together to form ‘doorway pages’. These pages exist to hog search engine referrals. This is one of many popular black-hat SEO practices, which are a form of spam by any definition.

Frequently-Asked Questions (or Useful Facts)

  • Q: How does one copy content systematically?
    A: RSSBlog [rel="nofollow"] and the like. Magpie can do this via RSS when misused.
  • Q: How does one detect plagiarism?
    A: Tools such as Copyscape appear to do the trick. I imagine that they run a series of Web searches using long sentences taken from the page, then attempt to identify excessive overlap across sites on the Internet. These Web-based tools simplify and automate, at a high level at least, an old-style method for detecting duplicates. I can still recall this type of technique from my days as an undergraduate.
  • Q: How does one report plagiarism?
    A: Probably the most suitable response is to contact the host of the offending site. Examples are needed to support the complaint(s).
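
To make the detection answer above more concrete, here is a minimal sketch of overlap detection between two pieces of text. It is only an illustration and not how Copyscape or any other tool actually works (that is not known to me). It assumes both texts are already available as plain strings, and the function names `shingles` and `containment` are hypothetical names chosen for this example. The idea is to break each text into overlapping runs of words and measure what share of the original's runs reappear in the suspect page.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    # Lowercase and keep only word-like tokens so punctuation and
    # capitalisation changes do not hide a copy.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat it as a single unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)


if __name__ == "__main__":
    post = "Blog plagiarism is a growing phenomenon, or so it seems on the surface."
    copy = "Welcome to my site! " + post + " Click here for more."
    print(containment(post, copy))
```

A score close to 1.0 suggests that most of the original post reappears in the suspect page, while a score near 0.0 suggests little shared wording. The shingle size of 5 words is an arbitrary choice for this sketch; larger values reduce accidental matches on common phrases.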

2 Responses to “Blog Plagiarism”

  1. Podz Says:

    Thanks for the elevation, but I’m no dev :)

    I just hang around the forums and make noises in all the wrong places…

  2. Roy Schestowitz Says:

    Hi Podz,

    I pondered the use of the word “developers”, but other words did not sit right. Very glad to find you visiting my site…
