schestowitz.com » Technology

Archive for the ‘Technology’ Category

Google-IBM Desktop Search

Filed under:

Google Desktop

AFTER teaming up with NASA¹ and forming a pact with Sun Microsystems, Google now sidle nearer to IBM. This move goes to show that Google are recognised as a market leader that is here to stay and prosper further.

SAN FRANCISCO (Reuters) – IBM and Google Inc. are collaborating to make it easier for office workers not only to search for local documents and personal e-mail but to delve deep into corporate databases, the companies said on Friday.

¹ Side note: Brin’s mother works (or worked) at NASA

Related item: Google Desktop Released


Moderate the Moderators

Filed under:

Technology

by Roy Schestowitz at 3:27 am

CCTV

Can the men/women behind CCTV be trusted?

REMEMBER the popular phrase “Who is watching the watchers?”

Well, it appears as though, under the umbrella of Web 2.0, where visitors’ involvement is perpetually encouraged, we face yet another challenge: how can comments be moderated, articles ranked and statistics assembled reliably?

Can you truly trust a moderator? Need you ever moderate the moderator? If so, are you not getting into a cyclic moderation trap here? Slashdot have introduced the idea of meta moderation, where moderators can be penalised for unfair treatment of comments. For example, some would be aware of the effects a terrible day has on personality. Even in a peer review process, people are more likely to punish others passionately due to their own personal problems.

Why has this idea sprung to my mind and awoken in my consciousness? WordPress has recently seen some collaborative comment spam filtering, currently known as Akismet. I was involved in testing the plug-in (see my entry on ending comment spam) and I can finally give more details about it.

Comments are marked as “genuine” or “spam” and their status is shipped to a central repository where filtering is administered. At present, API keys are given only to trusted bloggers. This is a limiting property of the service, which may exclude many. It may even be interpreted as insulting by some. The API keys are intended to deny spammers the ability to flag comments wrongly, which would collapse the system and break everybody’s long-trained filters. Essentially, there is the possibility of filling the engine with noise, which would make it utterly unusable.
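The gatekeeping idea can be sketched in a few lines. This is only a toy model under my own assumptions, not Akismet's actual API or algorithm: the class name, the method names and the majority-vote rule are all hypothetical, chosen to show why flags from unknown keys are dropped before they can add noise.

```python
from collections import defaultdict


class FlagRepository:
    """Toy central store of spam/genuine verdicts, gated by API key.

    Hypothetical names throughout; this is not Akismet's real interface.
    """

    def __init__(self, trusted_keys):
        self.trusted_keys = set(trusted_keys)
        # comment fingerprint -> count of each verdict received so far
        self.votes = defaultdict(lambda: {"spam": 0, "genuine": 0})

    def flag(self, api_key, comment_text, verdict):
        """Record a verdict; flags from unknown keys are ignored."""
        if api_key not in self.trusted_keys:
            return False
        if verdict not in ("spam", "genuine"):
            raise ValueError("verdict must be 'spam' or 'genuine'")
        self.votes[self._fingerprint(comment_text)][verdict] += 1
        return True

    def is_spam(self, comment_text):
        """Majority vote (an assumed rule); unseen comments pass as genuine."""
        counts = self.votes.get(self._fingerprint(comment_text))
        if counts is None:
            return False
        return counts["spam"] > counts["genuine"]

    @staticmethod
    def _fingerprint(text):
        # Normalise case and whitespace so trivial variations collide.
        return " ".join(text.lower().split())


repo = FlagRepository({"key-alice"})
repo.flag("key-alice", "Buy cheap pills", "spam")       # accepted
repo.flag("key-spammer", "Buy cheap pills", "genuine")  # ignored
```

With the second call ignored, the spammer's attempt to vote the comment back to “genuine” never reaches the shared filter, which is the whole point of restricting the keys.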

Finally, the scenario above raises the question: how can the moderators (the test set which flags spam) ever be moderated if there is no trust or moderation atop the moderators? We are yet to discover where it will all end up. I believe there may be an exclusive ‘army’ of flaggers while the remainder will be just clients of the Akismet filtering engine.


AdSense Frauds

Filed under:

Technology

by Roy Schestowitz at 3:15 am

ADSENSE is Google’s program for site-hosted advertisement, which pays per click rather than per page impression (view). In other words, whenever a site visitor clicks on such ads, money percolates from the advertiser and winds up in the hands of the publisher (site owner), as well as Google. I am beginning to hear more and more about misuse of that program. In Usenet, I hear about automated tools for ad clicks and computer centres in poorer countries where staff cycle around sites clicking ads. I have just come across yet another such complaint, which may illustrate how truly severe the problem has become.

When I activate my AdSense campaign, not much more than 5 minutes go by before they are all over it.. Multiple clicks from the same Internet IP’s in Malaysia, Poland, Hongkong etc. (I tried to exclude certain countries in my AdSense account, but they seem to go through proxies, so its not much use)..

Tried just now and within 2 minutes I had around 20 clicks, which were clearly fraudulent (they seem to use some kind of tool – no pictures on the site were loaded according to my log). I guess that was around €20, which went up in smoke there. The super-duper top secret internal Google clickfraud prevention system, which is supposed to deduct the invalid clicks at the end of the month, only seems to catch an extremely small fraction of the clicks, but not nearly enough. I can’t see which clicks I actually pay for in the invoice from Google, so it’s a bit hard to say.
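The two tell-tale signs in that complaint (repeated clicks from one IP address, and no images fetched by it) lend themselves to a simple log check. What follows is a rough sketch under assumptions of my own: the log is already parsed into (IP, path) pairs, ad clicks land on a hypothetical "/ad-click" path, and the threshold is arbitrary. It says nothing about how Google's own detection works.

```python
from collections import Counter

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif")


def suspicious_ips(log, click_path="/ad-click", max_clicks=3):
    """Return the sorted IPs that clicked often yet never loaded an image.

    `log` is an iterable of (ip, requested_path) pairs. The click path and
    the threshold are illustrative assumptions, not real AdSense values.
    """
    clicks = Counter()
    loaded_image = set()
    for ip, path in log:
        if path == click_path:
            clicks[ip] += 1
        elif path.lower().endswith(IMAGE_SUFFIXES):
            loaded_image.add(ip)
    return sorted(
        ip for ip, count in clicks.items()
        if count > max_clicks and ip not in loaded_image
    )


sample_log = (
    [("203.0.113.5", "/ad-click")] * 5          # many clicks, no images
    + [("198.51.100.7", "/logo.png"), ("198.51.100.7", "/ad-click")]
)
flagged = suspicious_ips(sample_log)
```

A real browser normally fetches the page's images before anyone clicks anything, so an address that only ever requests the click URL looks much more like a tool than a person. Proxies, as the complaint notes, would spread the clicks over many addresses and weaken this check.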


Selecting or Manipulating Ad Content

Filed under:

by Roy Schestowitz at 12:28 pm

THERE are a variety of techniques for summarising page content. Excerpts may be considered one of them; metadata in the (X)HTML header might be another. There is also a sharp rise in the use of tags, which can easily convey the ‘theme’ of a page and can cohesively reflect trends across sites (confer tag clouds or see image below).

I am discovering more and more services that are beginning to rely on a succinct collection of keywords, much like tags in Technorati, del.icio.us or the new meta search service gada.be. To each page, a concise representation simply gets bound. Prepare for more of that tagging phenomenon to be seen in the future. Pages that lack such a representation become less desirable, as a service must fetch and process the full page, which consumes more bandwidth.

Contextual tags cloud in July 2005

Finally, and perhaps more interestingly, advertisements in a page can be made more relevant by using tags, having manually embedded them in the page. This prevents advertisements from appearing where they would become a contextual misfit. Thus far, however, I have only come across support for tag-guided ads in WordPress. As tags are often generated automatically, e.g. derived from the page using scripts/tools, I can envision the same ideas being extended and exposed to the entire World Wide Web. Google AdSense makes an attempt at finding out for itself what a page is primarily about. It does so off-line or ‘on the fly’. Why not involve the user and use his/her knowledge for assistance? That is where tagging, as in the example above, bears tremendous potential.
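To make the idea concrete, here is a minimal sketch of deriving tags from page text by word frequency and then keeping only the ads that share a keyword with those tags. Everything in it is an assumption for illustration: the stopword list, the function names and the ad inventory are made up, and neither WordPress nor AdSense is claimed to work this way.

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real tool would use a fuller one.
STOPWORDS = {
    "the", "and", "for", "that", "this", "with", "are", "was", "from",
}


def derive_tags(page_text, limit=5):
    """Return the `limit` most frequent non-stopword terms, commonest first."""
    words = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]


def pick_ads(page_tags, ads):
    """Keep the ads that share at least one keyword with the page tags."""
    tags = set(page_tags)
    return [name for name, keywords in ads.items() if tags & set(keywords)]


text = "Firefox is a browser. Flock is a fork of Firefox, and the fork tracks Firefox."
tags = derive_tags(text, limit=2)

# Hypothetical ad inventory: ad name -> keywords chosen by the advertiser.
ads = {"browser-addons": ["firefox", "extensions"], "dog-food": ["pets"]}
relevant = pick_ads(tags, ads)
```

Tags embedded by hand would simply replace the output of `derive_tags`, which is where the author's own knowledge of the page would come in.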


TV First, Then Science

Filed under:

by Roy Schestowitz at 6:16 am

I quite liked the critical spin that a Slashdot contributor put on an article about the move to digital TV.

After budgets cuts led to the layoff of engineers and scientists at NASA Jet Propulsion Laboratory, a US Senate committee has approved a $3 billion dollar subsidy to assist Americans in their difficult transition to digital television in 2009.

While we should all know that it is science that drives innovation, money gets spent where the long-term future is uncertain. Television and the advertisements that accompany its existence make up a tremendous industry. However, it is well established that such an economy cannot be relied upon to carry safely into the future (Wall Street and the ‘bubble effect’), whereas exploration and new discoveries are capable of putting the States at the forefront. This all comes at a very sensitive time when the White House issues budgetary cuts on science and research while creationism and defence (or, contrariwise, armament) are better catered for. I am truly concerned.


Firefox Fork

Filed under:

by Roy Schestowitz at 3:36 pm

Firefox in the dock

I have just become aware of Flock, which is an interesting fork of Firefox 1.5. The much-anticipated version 1.5 has not been formally released yet, which makes this a somewhat controversial scenario. Flock is now being promoted by WordPress.com, where a download of Flock warrants a free WordPress blog (at least for the time being).

With all due respect, I am always slightly apprehensive when it comes to adopting, and thus relying on, forks. I am also aware of the problem associated with forking one’s own application. Once you lag behind, the long-invested dedication can wind up being disposed of. Flock appears to me like the conspicuous rationale behind the Mozilla Foundation setting up a corporation and going by the identity of Mozilla.com.

As regards WordPress.com and the Flock relationship, I might give it a try, but I would certainly seek plenty of convincing arguments before I do so. My past experiences with Firefox 1.5 betas (AKA Deer Park) have been fairly disappointing and led to regrets. I have made two such attempts to migrate to a version that was not finalised.

Having said that, a certain other fact is worrying me slightly more. According to ZDNet, the lead developer of Flock said:

“Please note that this is a developer preview and that there are still plenty of bugs, many of which we are aware of.”

To me that sounds as if the application is not quite ready for “prime time” so WordPress.com are possibly getting users on a dangerous wagon.

It was also said, however:

“In architecting our software, build systems and engineering processes, we have given considerable thought to how our code will be able to evolve alongside the Mozilla code, without forking it”

This sounds rather reassuring and I sure hope these folks will practise what they preach.

Related item: Best Technology Products of 2005


BoxTrapper Problems

Filed under:

Technology

by Roy Schestowitz at 4:06 pm

Dog scooping
Let a BoxTrapper handle the ‘poop’

BOXTRAPPERS are a mechanism for stopping large volumes of E-mail spam. The key idea is relatively simple, as the following paragraph explains.

For each E-mail that comes in, require the sender to post a quick confirmation of his/her existence. The server sends the unknown sender a stub, which is then replied to as-is to complete verification. Once this is done for the first time, the sender is whitelisted and need never verify his/her identity again. Under this type of framework, an unknown sender must first be accepted (whitelisted) for his/her messages to be viewed immediately rather than held as suspected spam. And guess what? It works! BoxTrapper queues can be viewed periodically, just in case genuine senders did not bother to get themselves whitelisted by replying to the verification request.
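The flow just described can be sketched as a small class. It is a simplified model of the idea only, not the actual BoxTrapper code: the class and method names are hypothetical, and the random token stands in for whatever the real verification stub carries.

```python
import secrets


class BoxTrapper:
    """Toy challenge-response filter: unknown senders wait in a queue.

    Hypothetical names; a sketch of the idea, not the real implementation.
    """

    def __init__(self):
        self.whitelist = set()
        self.queue = {}   # verification token -> (sender, message)
        self.inbox = []

    def receive(self, sender, message):
        """Deliver mail from whitelisted senders; challenge everyone else.

        Returns the token sent out with the verification stub, or None
        when the message went straight to the inbox.
        """
        if sender in self.whitelist:
            self.inbox.append((sender, message))
            return None
        token = secrets.token_hex(8)
        self.queue[token] = (sender, message)
        return token

    def confirm(self, token):
        """Handle a reply to the stub: whitelist the sender, release the mail."""
        entry = self.queue.pop(token, None)
        if entry is None:
            return False
        sender, message = entry
        self.whitelist.add(sender)
        self.inbox.append((sender, message))
        return True


trap = BoxTrapper()
token = trap.receive("alice@example.org", "Hello")   # held in the queue
trap.confirm(token)                                  # Alice replies to the stub
trap.receive("alice@example.org", "Hello again")     # delivered immediately
```

Anything still sitting in `trap.queue` corresponds to the queue that has to be checked by hand from time to time.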

I have 3 BoxTrappers on this domain, but they are sometimes misused as spammers attempt to break them, much as they spoil anything where scams, links, and on-line shopping are involved. Spammers will often identify themselves using the E-mail addresses of real people other than themselves, thereby causing traffic from the BoxTrappers (if not abusive mail from the spam recipient) to be sent to innocent people and businesses. Moreover, I have recently come across a trend where the spammers identify themselves as people coming from my own domain. They get whitelisted automatically in this way, so I guess they found a BoxTrapper weakness or loophole. Nonetheless, it remains easy to filter or identify such spam. It is only a shame that it escapes the queue, becomes visible and thus wastes time.

These days, as I continue to edit this item, the spammers still manage to get past the BoxTrapper. Again, they do so by intentionally picking E-mail addresses with my domain name, e.g. register@schestowitz.com. These come with message bodies like “Please change you password, go to URL…” and with other usernames in place of register, e.g. webmaster, admin, etc.

As explained before, the domain name gets them automatically whitelisted, which is the core and very source of the trouble. These repeat almost on a daily basis (several times a day in fact) and I wonder how many Webmasters are gullible enough to fall for these scams, which I am convinced have become a widespread plague by now.
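To illustrate why such a rule is weak, here is a sketch of a check that trusts any sender claiming the local domain. That BoxTrapper applies a rule like this is my guess from the behaviour described above, not something I have confirmed, and the function name is hypothetical.

```python
from email.utils import parseaddr


def is_auto_whitelisted(from_header, own_domain):
    """Mimic an assumed rule that trusts any sender claiming the local domain."""
    _, address = parseaddr(from_header)
    return address.lower().endswith("@" + own_domain.lower())


# The From header is whatever the sender chose to type, forged or not.
forged = is_auto_whitelisted("Admin <register@schestowitz.com>", "schestowitz.com")
outsider = is_auto_whitelisted("someone@example.org", "schestowitz.com")
```

Because the From header is set by whoever sends the message, a check like this says nothing about where the mail really came from, which would explain how these messages skip the queue.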
