__/ [ John Bokma ] on Saturday 02 September 2006 19:42 \__
> Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:
>
>> __/ [ Borek ] on Saturday 02 September 2006 09:19 \__
>>
>>> On Sat, 02 Sep 2006 07:25:54 +0200, John Bokma <john@xxxxxxxxxxxxxxx>
>>> wrote:
>>>
>>>> second note: 4 944 vs 3 902 sounds like quite a saving. The problem
>>>> is that nowadays quite a few sites send out their HTML compressed
>>>> (gzip), and it might very well be the case that the former is
>>>> smaller than the latter.
>>>>
>>>> But Google *should* have a serious look at their HTML, I agree on
>>>> that point.
>>>
>>> Google is a bunch of morons when it comes to HTML. Look at their page
>>> - for ages (two years at least, IIRC) they have used one-letter ids to
>>> make the code shorter and save on bandwidth, but they can't understand
>>> that they could save far more by using CSS properly. That's old news
>>> to some.
>>
>> Very sad news, too. To elaborate on my other post, this sets
>> a terrible example for Webmasters (think along the lines
>
> I am indeed thinking about the lines. Fully justified means there are no
> easy anchors anymore. Especially with monospaced fonts, fully justified
> text is a pain in the ass to read.
Okay, okay. *smile* It's just CTRL+ALT+F7 away, so I'm tempted to give it a
go every now and then...
>> of: "well, even Google don't make it valid, so why should
>> /I/?").
>
> An important question that more people should ask themselves: wtf is the
> W3C? Their ideas are not always the best ones, sadly. They are
> "everywhere", but I sometimes have the idea that they should focus on
> HTML and CSS, and not come up with wild ideas like XML for mobile
> vibrators that everybody will soon call a standard no matter how poorly
> it's thought out :-)
XML and standards facilitate modularity. Where would OSS be without
specifications? Look at the mess Windows has reached (and Apple's Mac OS
before it took on Darwin). Windows still requires a 60% rewrite of its code
(Jim Allchin) because it's utterly unmaintainable (all the planned features
have been conceded because they can't be implemented). The need to
accommodate many implementations makes it easier to maintain a system and to
replace weaker components with superior ones. That's why OSS is winning.
>> What's more, how are newer and less mature browsers
>> supposed to cope with attributes that intentionally neglect
>> quotes/apostrophes? Isn't that what
>> specifications/standards/recommendations are for? Equality
>> and independence from any single product?
>
> For HTML, the recommendation(s) specify when attributes must be
> quoted and when it's OK to leave them out. No idea if Google follows
> this. Also, a lot of people forget that HTML 4.01 makes a lot of things
> optional; a page can sometimes be made much shorter by leaving out all
> the implied stuff.
This should not be done. If you code something quick-and-dirty, then fine.
If you build a system that delivers billions of pages a day and you spew out
junk, then it's just irresponsible and selfish. WordPress, for example, was
built as a standards-compliant and accessible CMS from the start. And look
where it stands today. People should stop programming browsers to render
site X correctly, just as Web developers should stop wasting their time on
hacks. Standards resolve it /all/.
> On the other hand, read my gzip story, a lot of webservers now serve
> compressed content to browsers that tell them they can handle it. Why
> gzip was chosen is a bit beyond me, because there are better compression
> algorithms, and maybe even better results can be obtained with a
> dedicated one for HTML.
Gzip is a /de facto/ standard.
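To sketch the compression point (the markup here is made up, and repetitive
HTML like this compresses especially well):

```python
import gzip

# A server compressing a response for a client that announced
# "Accept-Encoding: gzip"; highly repetitive markup shrinks dramatically.
html = ('<div class="result"><a href="#">link</a></div>\n' * 200).encode()
compressed = gzip.compress(html)
print(len(html), '->', len(compressed))
```

This is also why minifying before gzip often saves less than the raw byte
counts suggest.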
>> That which doesn't involve
>> hacks, workarounds and undocumented exception handling? What
>> about OpenDocument? I am glad that Google don't have a go at
>> making /that/ 'efficient'... I am worried that Google is
>> beginning to adopt Microsoft's habits of 'extending'
>> standards to suit their own convenience and agenda
>
> You think that w3c's agenda is different? Or any OS project for that
> matter? Most OS projects are a bunch of followers with one ego at the
> wheel. Sometimes a minor change isn't accepted because the head didn't
> think of it first, so instead of "wow, nifty!" you get 10,001 arguments
> (most of them wrong) for why it shouldn't be added.
>
> Maybe I bumped into the wrong people, but with the OS projects I
> contacted I always got:
>
> - deny
> - make it sound insignificant
> - argue why it shouldn't be added no matter what
I can attest to the same experience. Probably self-centred programmers who
are possessive.
> A very common mistake, which you also seem to make, is that you think
> that OS is different from a company with a bunch of people all
> developing. The only difference is that you can take the source, and
> modify it to your own needs. Things like support and speed of fixes all
> depend on the bunch of people.
>
>> (compromising for speed in that case). Microsoft Office
>> formats, for example, use binary because it's quicker than
>> XML or a well-structured and easily interpretable
>> (backward-'engineerable') form, among other reasons.
>
> XML is often mistaken for a better solution, especially compared to
> binary, because it's human-readable. For most people, a hex dump and an
> XML dump are equally readable: not at all.
I strongly disagree. Many people don't document their hex dumps. Trust me,
they don't. And sometimes, human-readable has its merits. My Palm archives
are utterly useless as a mishmash of binary and ASCII. If they were
XML, I could at least migrate my data manually, understanding what I was
doing. I have also done mass alteration of configurations in programs
using search and replace in XML settings files. Why? Because it's quicker.
You don't get this flexibility with 'binary blobs'.
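For instance, a mass edit like that takes only a few lines; the settings
file below is entirely made up:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML settings: being plain text, the file can be inspected
# and rewritten without the original application.
settings = """<settings>
  <account name="work"><server>old.example.com</server></account>
  <account name="home"><server>old.example.com</server></account>
</settings>"""

root = ET.fromstring(settings)
for server in root.iter('server'):
    if server.text == 'old.example.com':
        server.text = 'new.example.com'  # one pass fixes every account

print(ET.tostring(root, encoding='unicode'))
```

Try doing that to an undocumented binary blob.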
> One thing you see a lot in IT is that suddenly a "new technique" pops up
> and a lot of people jump on it and claim that they are better than the
> competition because of their use of that technique. XML is as good an
> example as any.
XML is a concept. Let's just replace "XML" with the term "structured data".
> A well documented binary format is as good as a well documented XML
> format for implementing it. For the majority of users it doesn't really
> matter. And yes, the binary format can be made 10-50 times smaller
> compared to XML, which is noticeable in speed.
Computers are fast nowadays. Some people still work with bloatware. There are
also /ad hoc/ methods for making things quicker, e.g. cumulative read/write.
Speaking of which, OpenDocument speeds will improve. Let the implementation
mature. And disregard the Microsoft FUD. They are just afraid because their
cash cow is in jeopardy as many countries are putting ODF policies in place.
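That said, the size gap is real and easy to demonstrate; this sketch uses a
made-up record layout:

```python
import struct

# The same records (id, x, y) serialised as XML text and as packed
# binary; in the binary form the field names live only in the docs.
records = [(1, 3.5, 7.25), (2, 0.5, 9.0)]

xml_form = ''.join('<point id="%d" x="%f" y="%f"/>' % r
                   for r in records).encode()
bin_form = b''.join(struct.pack('<Iff', i, x, y) for i, x, y in records)

print(len(xml_form), 'vs', len(bin_form))  # binary is several times smaller
```

Gzip narrows the gap, but the binary form still wins on raw size.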
> Also often forgotten is that XML is a very bare-bones specification. It's
> no more complicated than: an int is always 64 bits, a character always
> 16 bits. An implementation is nothing more than describing the structure
> (records), and one can do that just as well with XML as with a binary
> format, since nothing is stopping you from replacing each XML element
> with, for example, an 8-bit value, and packing attributes similarly. Or:
> hexdumping XML just shows a binary format which uses a lot of space to
> store a little information, nothing more, nothing less.
Binary is serial. XML is by nature hierarchical, and explicitly so. That's
why folks like Tim Bray proposed it in the first place, I assume.
Best wishes,
Roy
--
Roy S. Schestowitz | "Oops. My brain just hit a bad sector"
http://Schestowitz.com | Free as in Free Beer ¦ PGP-Key: 0x74572E8E
Load average (/proc/loadavg): 1.21 1.00 0.96 4/140 13374
http://iuron.com - semantic search engine project initiative