[RFC] Formats

Sat 12 Feb 2005 16:36:48 CET

Marco d'Itri pointed out:

>>Too bad it ignores the fact that the MIME type can be specified among 
>>the <meta> elements with the dedicated "content-type" value.
> 
> It doesn't ignore it; you are the one ignoring that the HTTP header
> takes precedence
> (see http://apache.lexa.ru/english/meta-http-eng.html). Indeed, Firefox
> correctly reports "text/html".

Oops. The fact remains that the renderer stays in strict mode, which in 
the end is what matters for display.

>>so it requires no further intervention. If you want, I can correctly 
>>specify the content-type sent by the server, but I don't think it
>>changes anything.
> 
> No, you can't (unless you destroy caching by doing browser
> sniffing); the document I cited explains why.

If you kindly set up Apache so that it generates HTML4 on the fly, 
that's fine by me. That would be the correct solution, ridiculous as it is.
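
For reference, the MIME-type half of such a server-side switch could be
sketched roughly as follows. This is only an illustration in Python, not
an Apache configuration, and it does not convert the markup to HTML4. The
function name choose_content_type is made up, and the Accept parsing is
deliberately simplified (it honours q=0 but does not rank q-values). The
"Vary: Accept" header is what lets caches keep the two variants apart.

```python
def choose_content_type(accept_header):
    """Pick a MIME type for an XHTML page from the client's Accept header.

    Hypothetical helper: returns (content_type, extra_headers).
    """
    xhtml = "application/xhtml+xml"
    accepts_xhtml = False
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if media_type != xhtml:
            continue
        # Look for an explicit q-value; q=0 means "not acceptable".
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        accepts_xhtml = q > 0
    content_type = xhtml if accepts_xhtml else "text/html"
    # Vary tells caches that the response depends on the Accept header.
    return content_type, {"Vary": "Accept"}
```

A client that does not list application/xhtml+xml in its Accept header
(IE6, as far as I know) falls through to text/html here.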

I would also point out the non-trivial fact that the situation, at the 
moment, works perfectly.

Unfortunately, supporting IE limits adherence to the standards, but this 
affects *all* sites that use XHTML. Since (more or less) all new sites 
are XHTML, browsers are not at all picky about it.

> SPECIFIC PROBLEMS
> 
> These are the issues that affect documents when they are switched from
> text/html to application/xhtml+xml:
> 
>  * <script> and <style> elements in XHTML sent as text/html have to be
>    escaped using ridiculously complicated strings.

This causes us no problems. The CSS is external and we don't use JavaScript.

>  * A CSS stylesheet written for an HTML4 document is interpreted
>    slightly differently in an XHTML context (e.g. the <body> element
>    is not magical in XHTML, tag names must be written in lowercase in
>    XHTML). Thus documents change rendering when parsed as XHTML.

This causes us no problems either.

>  * A DOM-based script written for an HTML4 document has subtly
>    different semantics in an XHTML context (e.g. element names are
>    case insensitive and returned in uppercase in HTML4, case sensitive
>    and always lowercase in XHTML; you have to use the namespace-aware
>    methods in XHTML, but not in HTML4). BUT, if you send your
>    documents as text/html, then they will use the HTML4 semantics
>    DESPITE being XHTML! Thus, scripts are highly likely to break when
>    the document is parsed as XHTML.

Ditto.

>  * Scripts that use document.write() will not work in XHTML contexts.
>    (You have to use DOM Core methods.)

Ditto.

>  * Current UAs are, for text/html content, HTML4 user agents (at best)
>    and certainly not XHTML user agents. Therefore if you send them
>    XHTML you are sending them content in a language which is not
>    native to them, and instead relying on their error handling. Since
>    this is not defined in any specification, it may vary from one user
>    agent to the other.

Having tested the pages on most of the browsers in use, I don't think 
this can cause serious display problems. Besides, the situation is, as I 
already said, a trouble everyone shares.

>  * XHTML documents that use the "/>" notation, as in "<link />" have
>    very different semantics when parsed as HTML4. So if there was to
>    be a fully compliant HTML4 UA, it would be quite correct to show
>    ">" characters all over the page.

Right. But they don't. If we want to talk about ideal worlds different 
from the present one, that's fine by me, as long as I don't have to 
rewrite anything to accommodate them.

>  * Documents sent as text/html are handled as tag soup [1] by most UAs.
> 
>    This is the key. If you send XHTML as text/html, as far as browsers
>    are concerned, you are just sending them Tag Soup. It doesn't
>    matter if it validates, they are just going to be treating it the
>    same way as plain old HTML 3.2 or random HTML garbage.

[...]

>    Therefore the main advantage of using XHTML, that errors are caught
>    early because it _has_ to be valid, is lost if the document is then
>    sent as text/html. (Yes, I said _most_ authors. If you are one of
>    the few authors who understands how to avoid the issues raised in
>    this document and does validate all their markup, then this
>    document probably does not apply to you -- see Appendix B.)

The documents are validated with the W3C site.

>  * If you ever switch your documents that claim to be XHTML from
>    text/html to application/xhtml+xml, then you will in all likelihood
>    end up with a considerable number of XML errors, meaning your
>    content won't be readable by users. (See above: most of these
>    documents do not validate.)

They are valid. And when opened locally they have that MIME type. And 
they are rendered the same way.
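
A quick local check along these lines can be sketched in Python with the
standard xml.etree.ElementTree module. Note the assumptions: it only tests
XML well-formedness, which is weaker than the DTD validation the W3C site
performs, and the helper name is_well_formed and both sample strings are
invented for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# XHTML-style markup: every element is closed, so an XML parser accepts it.
good = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>Hello <br /> World</p></body></html>"
)

# HTML4-style markup: the unclosed <br> makes an XML parser reject it.
bad = "<html><body><p>Hello <br> World</p></body></html>"

print(is_well_formed(good), is_well_formed(bad))
```

A page that passes this would at least not produce an XML parse error when
served or opened as application/xhtml+xml.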

>  * If a user saves such a text/html document to disk and later
>    reopens it locally, triggering the content type sniffing code since
>    filesystems typically do not include file type information, the
>    document could be reopened as XML, potentially resulting in
>    validation errors, parsing differences, or styling differences.
>    (The same differences as if you start sending the file with an XML
>    MIME type.)

Obviously I work locally when I make changes. So it would be hard for me 
not to notice the death and destruction that might occur in that case.

>  * The only real advantage to using XHTML rather than HTML4 is that it
>    is then possible to use XML tools with it. However, if tools are
>    being used, then the same tools might as well produce HTML4 for you.
>    Alternatively, the tools could take SGML as input instead of XML.
>    (SGML is over a decade older than XML and the tools have existed
>    for years.)

My advantages are that:

1) I validate them on the W3C site, so I don't write crap
2) browsers stay in strict mode and are therefore more reliable
3) someday I won't have to rewrite everything

>  * HTML 4.01 contains everything that XHTML 1.0 contains, so there is
>    little reason to use XHTML in the real world. It appears the main
>    reason is simply "jumping on the bandwagon" of using the latest and
>    (perceived) greatest thing.

The fad passed a few years ago. Now, instead, wanting HTML4 looks to me 
like wanting to act like a grumpy old man.

>  * The "/>" empty tag syntax actually has totally different meaning in
>    HTML4. (It's the SHORTTAG minimisation feature known as NET, if I
>    recall the name correctly.) Specifically, the XHTML
> 
>      <p> Hello <br /> World </p>
> 
>    ...is, if interpreted as HTML4, exactly equivalent to:
> 
>      <p> Hello <br>&gt; World </p>
> 
>    ...and should really be rendered as:
> 
>      Hello
>      > World

This must seem terrible to him. Repeating it does not make it any more 
serious, and no browser has started doing it.

>  * Script and style elements cannot have their contents hidden from
>    legacy UAs. The following XHTML:

[...]

That is not our case.

>  * The "xmlns" attribute is invalid HTML4.

>  * The XHTML DOCTYPEs are not valid HTML4 DOCTYPEs.

OK. That is why I believe browsers keep treating it as XHTML, regardless 
of what Apache says, by applying a simple heuristic.

>  * Documents sent as text/html are handled as tag soup by most UAs.
>    This means that authors are not checking for validity, and thus
>    most XHTML documents on the web now are invalid. A conforming XML
>    UA would thus be unable to show as many documents as current UAs,
>    and would therefore never get enough marketshare to be relevant.

I do check validity.

>  * It is impossible to reliably autodetect XHTML when sent as
>    text/html. This is why UAs could not ever treat text/html documents
>    as XML, even if they did not care about not being usable (see the
>    first point in this section).

It's not optimal, but it seems to me it remains the only option as long 
as IE6 is in use. Report it to Microsoft.

>  * Even if you could detect XHTML, what do you do with a document that
>    is not well formed (such as the example above)? If you fall back on
>    HTML4, then there is no advantage to using an XML processor, and you
>    might as well always treat it as HTML4.
> 
>  * The HTML working group said that UAs should not do this:
>       http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html

Nevertheless, things work. I hate fixing things that work.

Sorry for the length.

-- 
Good morning.
Congratulations on your excellent choice.