
Re: Google and validation

Roy Schestowitz <newsgroups@xxxxxxxxxxxxxxx> wrote:

> __/ [ John Bokma ] on Saturday 02 September 2006 19:42 \__

Keeping this short: it's late, and I really like you as a person, and I 
hope that reality one day manages to slap some sense into you, since I 
seem to fail at it :-)

> XML and standards facilitate modularity. Where would OSS be without
> specifications?

Where it is now? Most OS projects evolve organically: someone has an 
idea, writes code, decides to open the source, and it grows and grows, 
until the team discovers that many people are not happy with the lack of 
documentation and specifications ("use the source, Luke" is not funny).

Examples: PHP and Perl (Perl 6 is now specified, which I am happy 
about).

CS, OTOH, is often developed on a tight budget, and hurray, some 
companies understand that writing specs before starting to program 
offers at least some chance of making things work at c * budget (with c 
hopefully less than or equal to 2).

Of course there are exceptions on both sides, but most OS projects I am 
aware of suck at 2 of the following 3: code, specification, 
documentation. And many suck at all 3 :-D.

> Look at the mess Windows has reached (and Apple's Mac
> OS before it took Darwin). Windows still requires a 60% rewrite of the
> code (Jim Allchin) because it's utterly unmaintainable (all the
> planned features are conceded because they can't be implemented).

The thing with any large project is that there comes a time when writing 
new specs and entirely *rewriting* the core code is the best thing to 
do. Netscape did it with their rendering engine (hurray, now we have 
Gecko), and Perl is doing it with Perl 6.

Firefox is replacing its bookmarks system (because bookmarks *do* suck), 
and its history format (because Mork does suck donkey ass).

> The
> need to accommodate many implementation gives power in maintaining a
> system and replacing weaker components with superior ones. That's why
> OSS is winning. 

I am not going to hold my breath. Personally, the winning team for me is 
a mix of CS and OSS (which is probably what a lot of people are using 
atm). Whichever one does the best job is the winner, and I don't care if 
it's CS or OSS. I don't have the time to manually patch OSS to my 
requirements, so if there is a better CS solution I go for it, even if 
that means vendor lock-in (which is a joke anyway, since if the OSS 
project goes in a different direction I am fcked as well).

>> For HTML it's specified in the recommendation(s) when attributes must
>> be quoted and when it's ok to leave them out. No idea if Google
>> follows this. Also, a lot of people forget that HTML 4.01 has a lot
>> of optional stuff, a page can sometimes be made way shorter by
>> leaving out all implied stuff.
> 
> 
> This should not be done. If you code a quick-and-dirty, then fine.

Read it again, Roy; I guess you missed something. From the HTML 4.01 
Specification (not standard):

"7.4.1 The HEAD element

 [... brevity ... ]

 Start tag: *optional*, End tag: *optional*
"

How can coding according to the standard^H^H^H^H^H^Hspecification be 
quick and dirty?
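
To illustrate, here is a minimal sketch (my own example, not anything 
Google serves); as far as I can tell it should pass the W3C validator as 
HTML 4.01 Strict:

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
      "http://www.w3.org/TR/html4/strict.dtd">
  <title>Short but valid</title>
  <p class=intro>No html, head, or body tags, no quotes around the class
  value, no closing p tag, and the validator should still be happy.

Leaving out the implied stuff like this saves bytes on every single page 
view.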

> If
> you build a system which delivers billions of pages a day and you spew
> out junk, then it's just irresponsible and selfish.

See above.


> WordPress, for
> example, was built as a standards-compliant and accessible CMS from
> the start. And look where it stands today. People should stop
> programming browsers to render site X correctly just as Web developers
> should stop wasting their times on hacks. Standards resolve it /all/.

You must really be kidding yourself. Anyway, WP uses XHTML, a bad choice 
IMNSHO. As soon as that XHTML is actually parsed as XML, I am afraid 
quite a few bloggers might end up with "Parsing error 131313: unclosed 
element foo at line 1232" on their page...

>> On the other hand, read my gzip story, a lot of webservers now serve
>> compressed content to browsers that tell them they can handle it. Why
>> gzip was chosen is a bit beyond me, because there are better
>> compression algorithms, and maybe even better results can be obtained
>> with a dedicated one for HTML.
> 
> Gzip is a /de facto/ standard.

Your point is? Why does that stop the W3C from creating a dedicated 
compression algorithm for HTML and XML? (There is already an application 
that "optimizes" XML so gzip and friends work better; I forgot the name, 
will look it up later.)

Anyway, http://en.wikipedia.org/wiki/Bzip2

I see no reason why gzip was picked over bzip2, unless someone did a 
test with 10,000 HTML pages and gzip was the clear winner (which I 
doubt, but I can be wrong sometimes :-D).
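
Such a test is easy enough to run, by the way. A rough sketch in Python 
(the file names are whatever HTML pages you have lying around):

  import bz2, gzip, sys

  # Compare total gzip vs bzip2 output size for the HTML files given
  # on the command line.
  raw = gz = bz = 0
  for name in sys.argv[1:]:
      with open(name, "rb") as fh:
          data = fh.read()
      raw += len(data)
      gz += len(gzip.compress(data))
      bz += len(bz2.compress(data))
  print("raw: %d  gzip: %d  bzip2: %d" % (raw, gz, bz))

My guess is that bzip2 wins on size but loses on CPU time, and that the 
CPU cost is the real reason gzip ended up as the usual HTTP content 
coding.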

[ OS hackers ]
>> - deny
>> - make it sound insignificant
>> - argue why it shouldn't be added no matter what
> 
> I can attest to the same experience. Probably self-centred programmers
> who are possessive.

When people create something, it's their baby :-D. I have used both 
non-OS and OS libraries, and so far I have had better and faster support 
in the former case. Paying does have its advantages now and then :-)

>> XML is often mistaken for a better solution, especially compared to
>> binary, because it's human readable. For most people a hex dump and
>> an XML dump is equally readable: not.
> 
> I strongly disagree. Many people don't document their hex dump. Trust
> me, they don't.

And they document their XML dumps? 

Can you say what the legal values for size are in:

<font>
    <size>15</size>
</font>

Have a look at machine-generated XML, and wonder :-) To most people it 
doesn't differ from a hex dump. Of course, if you are familiar with the 
format it *is* readable, and yes, more readable than a hex dump. But I 
am afraid that to most people "look for the foo element inside the bar 
element around line 14 and change it into baz" is as much magic as "fire 
up a hex editor, go to offset 3efad, and type deadbeef".

> And sometimes, human-readable has its merits. My Palm
> archives are utterly useless if they are a mishmash of binary and
> ASCII. If it was XML, I could at least migrate my data manually,
> understanding what I'm doing.

Yes, for some people it *is* more readable. Like I wrote: "For most 
people a hex dump and an XML dump is equally readable: not."

> I have also done some mass alteration of
> configurations in programs using search and replace in XML settings
> files. Why? Because it's quicker. You don't get this flexibility with
> 'binary blobs'.

My point was that to most people the tasks are equal: beyond their 
reach.

>> A well documented binary format is as good as a well documented XML
>> format for implementing it. For the majority of users it doesn't
>> really matter. And yes, the binary format can be made 10-50 times
>> smaller compared to XML, which is noticeable in speed.
> 
> Computers are fast nowadays. Some people still work with bloatware.

Firefox and OOo? Why shouldn't they? It works OK :-D.

> There are also /ad hoc/ methods for making things quicker, e.g.
> cumulative read/write. Speaking of which, OpenDocument speeds will
> improve. Let the implementation mature. And disregard the Microsoft
> FUD. They are just afraid because their cash cow is in jeopardy as
> many countries are putting ODF policies in place.

I do my best to ignore MS FUD as well as GNU/Linux, Firefox, and general 
OS FUD. 

> Binary is serial. XML is by nature hierarchical and explicitly so.
> That's why folks like Tim Bray had it proposed in the first place, I
> assume. 

XML *is* binary. Like I said, make a hex dump of an XML file; it's 
educational.
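
For instance, this is what the first sixteen bytes of a typical XML file 
look like in a hex dump (made-up file, but the ASCII values are the real 
ones):

  00000000  3c 3f 78 6d 6c 20 76 65  72 73 69 6f 6e 3d 22 31  |<?xml version="1|

Bytes, like any other file format; the "human readable" part only starts 
once you already know what you are looking at.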

-- 
John    Need help with SEO? Get started with a SEO report of your site:

    --> http://johnbokma.com/websitedesign/seo-expert-help.html
