Archive for the ‘Technology’ Category

Thoughts on Privacy on the Web

Cookies and cross-site connections help track Internet users in ways far worse than most people realise. People assume that when they visit a particular site then it is this site alone which knows about them. Moreover, they assume that they are logged off and thus offer no identifying details. In reality, things are vastly different and it is much worse when public service sites act as “traps” that jeopardise privacy. A site that I recently looked at (as part of my job) does seem to comply with some of the basic rules, but new advisories are quite strict. To quote: “The UK government has revised the Privacy and Electronic Communications Regulations, which came into force in the UK on 26 May, to address new EU requirements. The Regulations make clear that UK businesses and organisations running websites in the UK need to get consent from visitors to their websites in order to store cookies on users’ computers.”

The BBC coverage of this indicates that “[t]he law says that sites must provide “clear and comprehensive” information about the use of cookies…”

Regulating cookies is not enough. ISPs too can store data about the Web surfer and, as Phorm taught us, they sometimes do. They sell information about people.

In more and more public sites, HTTPS/SSL is supported and cookies remain within the domain that is “root” in the sense that the visitors intended to visit only this one domain (despite some external bits like Twitter timelines in the sidebars/front page; loading these, even via an API, might help a third party track identities). Shown in the following image is the large number of cookies used when one accesses pages from Google/GMail (even without having a GMail account).


Although SSL is now an integral part of this service (since the security breaches that Windows caused), privacy is not assured here. Although they don’t swap cookies across domains, Google’s folks do track the user a great deal and they have many cookies in place (with distant expiry dates) to work with.
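To make the point about distant expiry dates concrete, here is a minimal sketch that reads `Set-Cookie` header values and reports how long each cookie is meant to live. The function name `cookie_lifetimes` and the sample cookie names are assumptions made for illustration; the headers would have to be captured from a real response, and nothing here reflects what any particular site actually sends.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from http.cookies import SimpleCookie


def cookie_lifetimes(set_cookie_headers, now):
    """Map each cookie name to its lifetime in days (None for session cookies).

    `set_cookie_headers` is a list of raw Set-Cookie header values and
    `now` is a timezone-aware datetime used as the reference point.
    """
    lifetimes = {}
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)
        for name, morsel in jar.items():
            if morsel["max-age"]:
                # Max-Age is expressed in seconds counted from now.
                lifetimes[name] = int(morsel["max-age"]) / 86400
            elif morsel["expires"]:
                # Expires is an absolute HTTP date.
                expiry = parsedate_to_datetime(morsel["expires"])
                lifetimes[name] = (expiry - now).total_seconds() / 86400
            else:
                # No expiry attribute: the cookie goes when the browser closes.
                lifetimes[name] = None
    return lifetimes
```

Calling `cookie_lifetimes(headers, datetime.now(timezone.utc))` on the headers of a single page load gives a rough idea of which cookies are short-lived and which are set to persist for years.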

Information on how Google will use cookies is hard to obtain, and the problem is of course not unique to Google cookies. Most web browsers automatically accept cookies, so it is safe to assume that about 99% of people (or more) will just accept this situation by default. If a site provides visitors with information about cookies, permits secure connections (secure against a man in the middle) and does not share information about its visitors, contrary to the EU Commission which foolishly wanted to put spyware (Google Analytics) in pages, then there is at least an indication of a desire to adhere to best practices.

Cookies are not malicious by design as they are necessary for particular features, but to keep people in the dark about the impact of cookies on privacy is to merely assume that visitors don’t care and won’t care about the matter. And that would be arrogant.

To make some further recommendations, privacy should be preserved by limiting the number of direct connections to other sites. Recently, I have been checking the source of some pages to see if there’s any HotLinking that’s unnecessary in public sites, which would be a privacy offence in the sense that it leaves visitors’ footprints on another site. Outbound links can help tracking, but only upon clicking. The bigger issues are things like embedded objects that invoke other sites like YouTube. HotLinking, unlike Adobe Trash, cannot result in quite the same degree of spying (Google knows about IP address and individual people). If all files can be copied locally, then the problem is resolved. Who operates linked sites anyway? If it’s a partner or a sister site, then storing files remotely might be fine, but with AWS growing in popularity, Amazon now tracks a lot of sites, e.g. through image hosting.
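The kind of source check described above can be roughly automated. The following is a sketch only, using Python's standard library: it lists the external hosts that a browser would contact just by loading a page, ignoring ordinary outbound links, which only matter upon clicking. The names `EmbeddedHostCollector` and `third_party_hosts` are made up for this example, and the list of tags is a simplification (for instance, not every `<link>` element is actually fetched).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class EmbeddedHostCollector(HTMLParser):
    """Collect hosts of resources that are fetched automatically on page load."""

    # Tag -> attribute holding the URL. <a href> is left out on purpose,
    # because outbound links are only followed when the visitor clicks.
    AUTO_LOADED = {"img": "src", "script": "src", "iframe": "src",
                   "embed": "src", "link": "href"}

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        wanted = self.AUTO_LOADED.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                host = urlparse(value).hostname
                if host:  # relative URLs have no host and stay on the visited site
                    self.hosts.add(host)


def third_party_hosts(html, own_host):
    """Return the sorted external hosts embedded in `html`, excluding `own_host`
    and its subdomains."""
    collector = EmbeddedHostCollector()
    collector.feed(html)
    return sorted(h for h in collector.hosts
                  if h != own_host and not h.endswith("." + own_host))
```

Feeding it the saved source of a page together with the site's own host name returns the hosts that learn about every visitor; an empty list means all embedded files are served locally.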

Sites like Google, Facebook (FB) and Twitter, if linked or embedded onto a Web page, can end up taking a look at who’s online at the site. All it takes from the visitor is the loading of a page, any page for that matter. FB is often criticised for the “like” button too (spyware). JavaScript (JS) has made the spying harder to keep track of; it would be best practice to perhaps offer JS-free pages by default, which limits viewing by a third party assuming those scripts invoke something external. Magpie RSS can help cache copies of remote data locally and then deliver that to the visitor without the visitor having to contact another server when loading up the primary target site. Some sites these days have you contact over 10 different domains per pageload. It’s the downside of mashups, and it extends to particular browser components too (those which “phone home”, but the user usually has more control over them than over unknown and unpredictable page source). Google and Microsoft use their cookies to track people at both levels: browser and in-page (sometimes under the guise of “security”, babysitting and warning about “bad” sites you visit). Facebook and Twitter only do the latter and a lot of people don’t welcome that. Facebook, notoriously, profiles people (e.g. are they closeted gay? Is there fertility/erectile dysfunction? Any illnesses the person obsesses over?) and then sells this data to marketing firms and partners, reportedly Microsoft too.
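The local caching idea mentioned above (in the spirit of what Magpie RSS does for feeds, though this is not Magpie's API) can be sketched as follows. The server fetches the remote data once, keeps a copy, and serves that copy to visitors, so the visitor's browser never contacts the remote host. The class name `LocalCache`, the `fetch` callable and the one-hour default are all assumptions for illustration.

```python
import time


class LocalCache:
    """Serve server-side copies of remote content and refresh them when stale."""

    def __init__(self, fetch, max_age_seconds=3600, clock=time.time):
        self._fetch = fetch            # hypothetical callable: url -> content
        self._max_age = max_age_seconds
        self._clock = clock            # injectable so the behaviour can be tested
        self._store = {}               # url -> (fetched_at, content)

    def get(self, url):
        """Return cached content for `url`, fetching it only if missing or stale."""
        now = self._clock()
        cached = self._store.get(url)
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]
        content = self._fetch(url)
        self._store[url] = (now, content)
        return content
```

In practice `fetch` would wrap whatever HTTP client the site already uses. The remote server then sees only the site's own periodic requests, not the addresses of individual visitors.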

Public sites have different regulations applied to them because many people are required to visit them (e.g. for paying taxes); it is not a choice. Then there are the sovereignty principles (e.g. should Google know who, when and how European citizens access their government sites, which they themselves paid for?).

In society there is a lot of ransom going on, much of it ransom that people do not recognise or that will never be known or reported. This relies primarily on information, unless there is a physical hostage situation (where the prisoner is in danger of mortal harm). But the bottom line is, those who have the potential to embarrass others possess a lot of power, so there is a fundamental issue of civil liberties at stake. This is why, among several reasons, the TSA agents stripping people (literally or figuratively, or in a scanner) is a way of dehumanising and thus weakening the population, normalising indecency and maybe returning us to memories of some human tragedies. The privacy people have is tied to their dignity, worth, and sense of self/mutual respect. Privacy is not a luxury; it is an important tenet of society. Society will suffer if privacy is altogether lost.

GIF Animations in LaTeX

LaTeX helps render for a variety of output types including posters and Web pages, not just A4 sheets. As a typesetting language it is very powerful, but for advanced functionality it requires additional packages, included in the preamble. It appears as though GIF animations are not supported in LaTeX despite the fact that, if exported as Web pages for instance, the notion of animation makes sense. This is a shame really and if someone knows of a workaround, please leave a comment. I am currently writing a 400-page report which is a comprehensive summary of what I am doing, and without animations it might be hard to express what is going on. For example, compare the following triplet of static and dynamic images (which HTML is happy with):

New Interview With Me

Head over to Muktware where there is this new interview.

YouTube Versus Television

TECHNOLOGY moves on and one must adapt to it. “Luddite” is a word that is often used to discourage those who deviate from the norm, such as those who refuse to carry a mobile phone everywhere they go. The term Luddite in this case refers not necessarily to rejection of progress but to conformity or lack thereof.

Television is a generic term which refers to a device for remote viewing of something. Conventionally, however, we think of television as a set although some large bits of furniture or even projectors might nowadays qualify as televisions. What is common in almost all of them is that, with the exception of streaming or on-demand viewing, television is controlled by broadcasters, who have a lot of control over the viewer’s mind many hours of each day. The viewer can typically select the least undesirable channel among a finite number. Just because there are many channels these days does not mean one can watch a lot of them at once (simultaneously), so this limitation remains. The choice is elusive.

YouTube is different for several reasons as by its nature it allows anyone to broadcast and it also gives the viewer a lot more control over what is being watched. This is why I stopped watching television and eventually gave my set to a friend. The set was of no use anymore. It felt more like a device for passing commercials and clips that I did not wish to see. Sure, there were exceptions, but those were very rare. To choose a channel is still an illusion of choice as that hardly leaves much selection in the hands of the viewer. The choices are preselected by other people.

Recently I started to get more actively involved in YouTube not as a mere viewer. Back when YouTube presented statistics on how many videos a given account has watched the number 22,000 came up and since then I have watched probably about 50,000 videos on YouTube. So I pine to become part of those who contribute. In the coming weeks I will convert some older material and upload it to YouTube. It may be an interesting experience. Can a viewer engage in a two-way exchange of information? That certainly would be beneficial to society as it can weaken the power of media empires over people’s minds. It can also help promote the TechBytes show to people who never heard about it. At the very least as an experiment I shall see how it goes. This might be rethought.

Legacy Pages

CLEARLY, when one writes/maintains a Web site, keeping pages up to date is a tough task and the bigger the site is, the harder it gets. Updates can be made to the appearance of pages, as well as the content. Unlike newspapers, for example, sites can be accessed 10 years down the line and not carry a timestamp to indicate that the information in them may no longer be accurate. This is fine in the case of this site because most pages contain some sort of timestamp (most pages here are about 6 or 7 years old). Even when pages get updated it makes sense to keep the old content intact, at least as a form of legacy. That’s what I did over the weekend with the introduction page, which someone complained about as it was about 7 years old and needed a refresh. The bottom line is, for certain types of sites, keeping them up to date is a monumental task. Webmasters do not deserve hassle for it.

Free Software More Than a Hobby

Throughout my career I’ve always had many eggs in the basket. I’ve usually had multiple jobs and I was never fired; I always succeeded in job interviews (since 2003), except the ones with Google, which came to me three times (I never approached them regarding a job). One thing I’ve learned over the years is that one must choose a job one enjoys, otherwise it’s a chore. I never accepted a job that I disliked. I have been working in two jobs simultaneously several times (simultaneously as in overlapping months/years), sometimes on top of already being a full-time Ph.D. candidate/student. I still work two jobs and I very much enjoy both; it’s like leisure as there is a sense of achievement. Besides all of this, as a hobby I maintain some sites that promote freedom; I was never paid for this. This is part of my reading of material; it’s like a learning experience which also proved beneficial to many others, namely those who share my interests. Being enthusiastic about freedom comes very naturally.

After many years wanting to be running an independent business on the side I’ve decided to start creating a professional site. The original idea was to come up with a new name (and domain), but after much consideration I came to the conclusion that giving visibility to a new name and new site would be a lot of work. As this new blog post from Forbes correctly indicates, reputation matters a lot when seeking business. That’s why I decided to stick with my surname and in the coming days/weeks there will be a formal announcement regarding my third job, in which the work capacity cannot be guaranteed (depends on clients). The focus is affordable scientific computing solutions that put the client in control. In essence, it is about spreading free/open source software and charging for the scarcity, which is skill and (wo)man hours. There is nothing unethical about it.

Together with some friends (I shall add people to the appropriate pages), a new logo, CMS theme, and a soon-to-be redirection (a dupe of index.htm will ensure all the older pages remain accessible), the site will soon have a sort of relaunch. The site no longer attracts about 3,000 visitors per day like it used to (back in the days when it was regularly updated), but we shall see if it takes off not just as a personal workspace with a lot of informal pages. I remain very much committed to all my jobs; starting something as my own ‘boss’ will just be something on the side.

With ‘Cloud Computing’ You Can’t Keep Your Data Under Your Control

Propeller in 2008 (I was ranked higher at some stages)

THINK before you touch Cloud Computing. The term “Cloud Computing” is vague and broad. It refers to all sorts of things and it’s malicious in the sense that it tends to take both data and control away from the user. That’s why I call it “Fog Computing”, avoiding marketing euphemisms.

Many people gradually take their computer activities online (e.g. photo sharing, news reading) and there is always risk when there is a mediating party either between peers or between a producer and a peer. This mediator offers a so-called ‘cloud’ or a Web platform under which people engage in some activities. This gives the mediator/intermediary enormous control and makes all parties dependent upon this mediator, e.g. for advertising, lifeline, costs, features, and data.

Yesterday I received another reminder of why I must not ever trust so-called ‘clouds’ or Web platforms that store my data in some mysterious proprietary form and give me no access to this data (except data slices that are presented as Web pages, not raw data).

So, what’s it all about?

AOL has money to spare in order to buy the Microsoft-funded Arrington with his rag while at the very same time AOL betrays a vast community of existing users at Propeller (good treatment for a few bloggers, but not for a site with something like a million members). Well, AOL has just killed Propeller with no prior warning. I have been on this platform for over 4 years and submitted about 24,000 stories there. It all vanished overnight without warning (none that I saw), just an apology. The whole site was shut down. The Webmaster appears to have also blocked the Web Archive a couple of years ago.

Propeller shows why social networks and Fog Computing are a risk. One day you just can’t access your messages, submissions, etc. It’s just like that and it’s not a violation of the terms of service. The mediator (AOL in this case) is allowed to do this.

So yesterday I asked, “Can #identica and #twitter guarantee that they won’t just suddenly announce shutdown one day? What about #reddit #digg #facebook etc.?” I wrote this as part of my Fog Computing cautionary tale. “Has #identica yet implemented a feature for exporting one’s entire user history in a way that makes it displayable/usable? And #twitter,” I asked.

“I’m not a tech person,” replied a peer, “but would assume it should be possible to transfer into own status net app (I believe it is free/libre)”

My reply was that the “first thing I did when I joined #identica was check I could export just in case. At the time there was no such option.”

As far as I know, none of the Web platforms I’m on allows me access to my own data in a form that I can interpret without access to a server I neither own nor control. If that does not scare you, wait a few years. No Web site lives forever and the life of a Web site is often just a matter of money; it doesn’t need to make sense to keep it alive, it needs to make money to keep it alive.

Original styles created by Ian Main (all acknowledgements) • PHP scripts and styles later modified by Roy Schestowitz • Help yourself to a GPL'd copy