Archive for the ‘Technology’ Category

Moderate the Moderators

CCTV

Can the men/women behind CCTV be trusted?

REMEMBER the popular phrase “Who is watching the watchers?”

Well, it appears as though, under the umbrella of Web 2.0, where visitors’ involvement is perpetually encouraged, we face yet another challenge: how can comments be moderated, articles ranked and statistics assembled reliably?

Can you truly trust a moderator? Need you ever moderate the moderator? If so, are you not getting into a cyclic moderation trap here? Slashdot have introduced the idea of meta moderation, where moderators can be penalised for unfair treatment of comments. Most of us are aware, for example, of the effect a terrible day can have on one’s judgement. Even in a peer review process, people are more likely to punish others harshly because of their own personal problems.

Why has this idea sprung to mind? WordPress has recently seen some collaborative comment spam filtering, currently known as Akismet. I was involved in testing the plug-in (see my entry on ending comment spam) and I can finally give more details about it.

Comments are marked as “genuine” or “spam” and their status is shipped to a central repository where filtering is administered. At present, API keys are given only to trusted bloggers. This is a limiting property of the service, which may exclude many. It may even be interpreted as insulting by some. The API keys are intended to deny spammers any ability to flag comments maliciously, which would collapse the system and break everybody’s long-trained filters. Essentially, there is the possibility of filling the engine with noise, which would make it utterly unusable.
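To make the role of the keys more concrete, here is a minimal sketch of a central repository that only accepts flags from trusted key holders. It is not Akismet's actual API; the class name `FlagRepository`, its methods and the majority-vote rule are all assumptions made purely for illustration.

```python
from collections import defaultdict


class FlagRepository:
    """Hypothetical central store of genuine/spam votes, gated by API keys."""

    def __init__(self, trusted_keys):
        self.trusted_keys = set(trusted_keys)
        # comment fingerprint -> [genuine_votes, spam_votes]
        self.votes = defaultdict(lambda: [0, 0])

    def flag(self, api_key, fingerprint, is_spam):
        """Record one vote; callers without a trusted key are rejected."""
        if api_key not in self.trusted_keys:
            return False
        self.votes[fingerprint][1 if is_spam else 0] += 1
        return True

    def looks_like_spam(self, fingerprint):
        """Assumed rule: spam if spam votes outnumber genuine votes."""
        genuine, spam = self.votes.get(fingerprint, (0, 0))
        return spam > genuine
```

A flagger without a key cannot add noise here, but nothing in this sketch checks whether a key holder flags fairly, which is exactly the open question.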

Finally, the scenario above raises the question: how can the moderators (the test set which flags spam) ever be moderated if there is no trust or moderation atop the moderators? We are yet to discover where it will all end up. I believe there may be an exclusive ‘army’ of flaggers while the remainder will be just clients of the Akismet filtering engine.

AdSense Frauds

ADSENSE is Google’s program for site-hosted advertisement, which pays per click rather than per page impression (view). In other words, whenever a site visitor clicks on such ads, money percolates from the advertiser and winds up in the hands of the publisher (site owner), as well as Google. I am beginning to hear more and more about misuse of that program. On Usenet, I hear about automated tools for ad clicks and computer centres in poorer countries where staff cycle around sites pressing ads. I have just come across yet another such complaint, which may illustrate how truly severe the problem has become.

When I activate my AdSense campaign, not much more than 5 minutes go by before they are all over it.. Multiple clicks from the same Internet IP’s in Malaysia, Poland, Hongkong etc. (I tried to exclude certain countries in my AdSense account, but they seem to go through proxies, so its not much use)..

Tried just now and within 2 minutes I had around 20 clicks, which were clearly fraudulent (they seem to use some kind of tool – no pictures on the site were loaded according to my log). I guess that was around €20, which went up in smoke there. The super-duper top secret internal Google clickfraud prevention system, which is supposed to deduct the invalid clicks at the end of the month, only seems to catch an extremely small fraction of the clicks, but not nearly enough. I can’t see which clicks I actually pay for in the invoice from Google, so it’s a bit hard to say.
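The two symptoms in the complaint above (repeated clicks from the same IP address, and no pictures being loaded) can be checked against a server log. What follows is only a rough sketch under stated assumptions: the record layout `(ip, kind, path)`, the function name `suspicious_ips` and the click threshold are all made up for illustration, and this has nothing to do with Google's own detection.

```python
from collections import Counter

# Assumed list of file suffixes that count as "pictures".
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")


def suspicious_ips(records, max_clicks=3):
    """Return IPs that clicked more than max_clicks times or never loaded an image.

    records: iterable of (ip, kind, path) tuples, where kind is either
    "click" (an ad click) or "request" (an ordinary page or file request).
    """
    clicks = Counter()
    loaded_images = set()
    for ip, kind, path in records:
        if kind == "click":
            clicks[ip] += 1
        elif kind == "request" and path.lower().endswith(IMAGE_SUFFIXES):
            loaded_images.add(ip)
    return {
        ip
        for ip, count in clicks.items()
        if count > max_clicks or ip not in loaded_images
    }
```

Genuine visitors with images disabled would be flagged as well, so a list like this is a starting point for a manual look rather than proof of fraud.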

Selecting or Manipulating Ad Content

THERE are a variety of techniques for summarising page content. Excerpts may be considered one of them; metadata in the (X)HTML header might be another. There is also a sharp rise in the use of tags, which can easily convey the ‘theme’ of a page and can cohesively reflect trends across sites (confer tag clouds or see image below).

I am discovering more and more services that are beginning to rely on a succinct collection of keywords, much like tags in Technorati, del.icio.us or the new meta search service gada.be. To each page, a concise representation simply gets bound. Prepare for more of that tagging phenomenon to be seen in the future. Without it, pages become less desirable as they are more bandwidth-consuming.

Tag cloud

Contextual tag cloud in July 2005

Finally, and perhaps more interestingly, advertisements in a page can be made more relevant by using tags that have been manually embedded in the page. This prevents advertisements from appearing where they would be a contextual misfit. Thus far, however, I have only come across support for tag-guided ads in WordPress. As tags are often generated automatically, e.g. derived from the page using scripts/tools, I can envision the same ideas being extended and exposed to the entire World Wide Web. Google AdSense makes an attempt at finding out for itself what a page is primarily about. It does so off-line or ‘on the fly’. Why not involve the user and use his/her knowledge for assistance? That is where tagging, as in the example above, bears tremendous potential.
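As a rough illustration of tag-guided ads, the sketch below ranks advertisements by how many tags they share with the page and drops those that share none. The function name `pick_ads`, the ad names and the simple overlap count are assumptions for the sake of the example; this is not how AdSense or any WordPress plug-in is known to work.

```python
def pick_ads(page_tags, ads, limit=2):
    """Return up to `limit` ad names, best tag overlap first.

    page_tags: tags embedded in the page.
    ads: mapping of ad name -> list of tags describing the ad.
    Ads with no tag in common with the page are never shown.
    """
    page = {tag.lower() for tag in page_tags}
    scored = []
    for name, ad_tags in ads.items():
        overlap = len(page & {tag.lower() for tag in ad_tags})
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties are broken alphabetically by ad name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]
```

With a page tagged KDE, Linux and Desktop, an ad tagged only "pets" would be left out altogether, which is the contextual misfit the tags are meant to avoid.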

TV First, Then Science

I quite liked the critical spin that a Slashdot contributor put on an article about the move to digital TV.

After budgets cuts led to the layoff of engineers and scientists at NASA Jet Propulsion Laboratory, a US Senate committee has approved a $3 billion dollar subsidy to assist Americans in their difficult transition to digital television in 2009.

While we should all know that it is science that drives innovation, money gets spent where the long-term future is uncertain. Television and the advertisements that accompany it make up a tremendous industry. However, it is a well-established fact that an economy cannot be safely relied upon to carry us into the future (Wall Street and the ‘bubble effect’), whereas exploration and new discoveries are capable of putting the States at the forefront. This all comes at a very sensitive time when the White House issues budgetary cuts on science and research while creationism and defence (or contrariwise armament) are better catered for. I am truly concerned.

Firefox Fork

Firefox in the dock

I have just become aware of Flock, which is an interesting fork of Firefox 1.5. The much-anticipated version 1.5 has not been formally released yet, which makes this a somewhat controversial scenario. Flock is now being promoted by WordPress.com, where a download of Flock warrants a free WordPress blog (at least for the time being).

With all due respect, I am always slightly apprehensive when it comes to adopting, and thus relying on, forks. I am also aware of the problems associated with forking one’s own application. Once you lag behind, the long-invested dedication can wind up being disposed of. Flock appears to me like the conspicuous rationale behind Mozilla becoming a foundation and going by the identity of Mozilla.com.

As regards WordPress.com and the Flock relationship, I might give it a try, but I would certainly seek plenty of convincing arguments before I do so. My past experiences with Firefox 1.5 betas (AKA Deer Park) have been fairly disappointing and led to regrets. I have made two such attempts to migrate to a version that was not finalised.

Having said that, a certain other fact is worrying me slightly more. According to ZDNet, the lead developer of Flock said:

“Please note that this is a developer preview and that there are still plenty of bugs, many of which we are aware of.”

To me that sounds as if the application is not quite ready for “prime time” so WordPress.com are possibly getting users on a dangerous wagon.

It was also said, however:

“In architecting our software, build systems and engineering processes, we have given considerable thought to how our code will be able to evolve alongside the Mozilla code, without forking it”

This sounds rather reassuring and I sure hope these folks will practise what they preach.

Related item: Best Technology Products of 2005

BoxTrapper Problems

Dog scooping
Let a BoxTrapper handle the ‘poop’

BOXTRAPPERS are a mechanism for stopping large volumes of E-mail spam. The key idea is relatively simple, as the following paragraph explains.

For each E-mail that comes in, require the sender to post a quick confirmation of his/her existence. The server sends the unknown sender a stub, which is then replied to as-is to complete verification. Once this is done for the first time, the sender is whitelisted and need never verify his/her identity again. Under this type of framework, unknown senders must be verified in order for their messages to be viewed immediately and not considered to be spam. And guess what? It works! BoxTrapper queues can be viewed periodically, just in case genuine senders did not bother to get themselves whitelisted by replying to the verification request.
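The verification cycle described above can be sketched in a few lines. This is a toy model, not the real BoxTrapper code: the class `ChallengeResponseFilter`, its method names and the use of a random token as the "stub" are assumptions for illustration only.

```python
import secrets


class ChallengeResponseFilter:
    """Toy challenge-response mail filter with a whitelist and a held queue."""

    def __init__(self):
        self.whitelist = set()
        self.pending = {}  # challenge token -> sender awaiting verification
        self.queue = []    # (sender, message) pairs held back
        self.inbox = []    # (sender, message) pairs delivered

    def receive(self, sender, message):
        """Deliver mail from whitelisted senders; otherwise hold it.

        Returns None when delivered, or the challenge token that would be
        sent back to the unknown sender.
        """
        sender = sender.lower()
        if sender in self.whitelist:
            self.inbox.append((sender, message))
            return None
        token = secrets.token_hex(8)
        self.pending[token] = sender
        self.queue.append((sender, message))
        return token

    def confirm(self, token):
        """Handle a reply to a challenge: whitelist the sender, release held mail."""
        sender = self.pending.pop(token, None)
        if sender is None:
            return False
        self.whitelist.add(sender)
        self.inbox.extend(item for item in self.queue if item[0] == sender)
        self.queue = [item for item in self.queue if item[0] != sender]
        return True
```

Anything left in `queue` corresponds to the mail that should still be reviewed periodically by hand.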

I have 3 BoxTrappers on this domain, but they are sometimes misused as spammers attempt to break them, much as they destroy anything where scams, links, and on-line shopping are involved. Spammers will often identify themselves using E-mail addresses of real people other than themselves, thereby causing traffic from the BoxTrappers (if not abusive mail from the spam recipient) to be sent to genuinely innocent people and businesses. Moreover, I have recently noticed a trend where the spammers identify themselves as people coming from my own domain. They get whitelisted automatically in this way, so I guess they found a BoxTrapper weakness or loophole. Nonetheless, it remains easy to filter or identify such spam. It is only a shame that it can become visible by escaping the queue and thus be time-consuming.

These days, as I continue to edit this item, the spammers still manage to get past the BoxTrapper. Again, they do so by intentionally picking E-mail addresses with my domain name, e.g. register@schestowitz.com. These come with message bodies like “Please change you password, go to URL…” and with username variations other than register, e.g. webmaster, admin, etc.

As explained before, the domain name gets them automatically whitelisted, which is the core and very source of the trouble. These repeat almost on a daily basis (several times a day in fact) and I wonder how many Webmasters are gullible enough to fall for these scams, which I am convinced have become a widespread plague by now.
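The loophole comes down to trusting whatever address the sender claims to have. The sketch below contrasts that with a stricter check against an explicit list of local mailboxes. Both functions are hypothetical and are not taken from BoxTrapper; note also that the stricter check still relies on the claimed address, which can be forged just as easily.

```python
def domain_of(address):
    """Return the lower-cased part of an address after the last '@'."""
    return address.rsplit("@", 1)[-1].lower()


def naive_is_trusted(claimed_sender, own_domain):
    """The weakness: any claimed address on our own domain is trusted."""
    return domain_of(claimed_sender) == own_domain.lower()


def stricter_is_trusted(claimed_sender, known_local_senders):
    """Assumed alternative: trust only explicitly listed local mailboxes."""
    known = {address.lower() for address in known_local_senders}
    return claimed_sender.lower() in known
```

Under the naive rule, a made-up sender such as register@example.org passes for a site hosted at example.org; under the stricter rule it is held in the queue unless that mailbox was deliberately listed.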

Desktop Environment Freedom

Desktop with previews

PDFs, text files, HTML files and directories in the KDE Desktop with previews

CERTAIN issues arise when a user’s habits and orientation in his/her desktop are interfered with. Desktop environments, installers, filesystem structures, or even platforms in general are often more workable and thus successful if they comply with the expectations of new users. What if these are made too stringent by the developers, however? What if decisions and conventions are voted for without involvement of the end-user?

Many desktop environments, free ones in particular, are not made uniform. The resulting diversity and flexibility lead to difficulties in using somebody else’s settings, i.e. working in conjunction under the same session. Some GNU/Linux distributions include more or less any desktop environment which exists while settings are kept apart and well-separable for each user of the system and each desktop environment. So, where does the problem lie? Give users more freedom, I suggest, and make that the norm. It is, after all, their own computer and they understand their needs better than the developers of the operating system.

What appears worst of all is the scenario where decisions are arrogantly made by the vendor and then enforced. This is often the path that operating systems such as Mac OS (in its varieties) and Windows take. Give the user some more choice, I say, by adding options for endless customisation that has very few boundaries, if any (Open Source). Give all users the freedom they deserve and allow them to express individuality and adapt/tailor their desktop environment to suit their needs. Menu entries and other widgets will remain unaffected, so documentation, for instance, will not suffer as a consequence.

People work differently on a variety of applications, for a wide variety of purposes. The domains in which they work differ as well. A person working with many open windows at any given time might prefer to have “focus follows mouse cursor”. Contrariwise, to some, this “focus follows mouse cursor” behaviour is highly adverse to habits. To list yet another example, a Web server needs to have a light desktop environment that is less susceptible to breakage and consumes small amounts of RAM.

These considerations are all highly defensible. If Linux were to be deployed in more public clusters, for example, choice could (and should) be given as to which desktop environment should be used. Different strokes for different folks, but all can be catered for provided that disk space is made available. The pertinent settings would reside in the home directory of each user. Thumbs up to GNOME, KDE, and the rest of the self-motivated window manager teams around the world.

Related item: KDE Versus GNOME

Original styles created by Ian Main (all acknowledgements) • PHP scripts and styles later modified by Roy Schestowitz • Help yourself to a GPL'd copy