Not feeling very talkative today. Took me a while to recognize what I was feeling as early warning signs of another migraine and medicate myself against it -- hoping it works that way. (One thing I liked about Midrin is that it seemed to provide a few days of protection in addition to taking care of the migraine I took it for. AC&C doesn't do that, but it's what I've got. OTOH, it also doesn't give me that frustrating, dopey "drugged feeling" that Midrin does.) Anyhow, in addition to an "I'm alive" ping, I felt like tossing out a thought to get other folks' perspectives on...
What do y'all think about using proper typographic non-ASCII punctuation in LiveJournal entries or on other web pages? I'm thinking mostly about the correct uses of hyphens, en-dashes, and em-dashes instead of just using an ASCII dash ('-'/45/0x2D) for both the first two, and a pair ("--") in place of the em-dash ("—"), but this would also apply to opening and closing quotation marks (“ ”) instead of the symmetrical ASCII quote ('"'/34/0x22), and so on.
On the one hand, it's not a Big Deal: people are used to reading typewriter/email conventions on computer screens and know what is meant, and coding the right codes as though typesetting instead of merely typing is more effort for little gain.
On the other hand, with Unicode and HTML &-codes we are no longer limited to only those characters that could've been produced on a manual typewriter and displayed on an ADM3a or VT52 computer terminal. We do not have to make do with the same conventions on the web as make sense in email, and we can make our sites and our blogs look more professional, more like books and magazines, by going back to the typographical conventions that predate the typewriter and putting the nuances of punctuation into how we type what we write for the web.
On the gripping hand, those non-ASCII characters can be a pain in the arse when using copy-and-paste, especially when one wishes to paste text into some place where the full range of Unicode characters isn't supported or if one's operating system pastes its own special Extended-ASCII code or the Unicode value where an HTML &#; code is what is actually needed. (For example: although the Telnet client I usually use seems to be able to deal with a certain amount of Unicode, the vim editor displays numeric codes for non-ASCII characters, and if I fail to notice such characters when pasting text into email, at least one friend's mail server will reject the message. I also have to clean up non-ASCII characters when posting quoted text here in my journal.)
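For the case where a pasted Unicode character needs to become an HTML &#; code, here is a minimal sketch in Python. The function name and sample text are made up for illustration; the conversion itself leans on Python's built-in "xmlcharrefreplace" error handler, which swaps each non-ASCII character for its decimal numeric reference.

```python
def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with an HTML numeric reference.

    ASCII characters pass through untouched, so the result is safe to
    paste anywhere that only accepts 7-bit text.
    """
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Hypothetical sample: curly quotes (U+201C, U+201D) and an em-dash (U+2014).
    sample = "\u201cquoted\u201d text \u2014 with a dash"
    print(to_numeric_refs(sample))
```

This only handles the "clean it up before pasting" half of the problem; it does nothing about software that rejects the numeric codes themselves.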
So ... pointless affectation, useful or aesthetically appealing liberation from the typewriter mindset, or annoying menace? I do not expect consensus, but I'm interested in how people feel about this and what they can tell me about why they feel the way they do. (Obviously, using Unicode characters so as to be able to correctly spell names and other words from languages that have letters (or accents) that English does not would be a different question -- the existence of Unicode is a Good Thing; I'm specifically asking about the use of it for punctuation that can be approximated in ASCII.)
And now to see what the second paragraph of this entry looks like in Lynx.
ETA: Lynx translated the opening and closing quotation marks to ASCII quotes and translated the em-dash to a pair of ASCII dashes. A similar browser, Links, handled the quotation marks the same way but translated the em-dash to a single ASCII dash. And I just realized this should've been a start-of-the-workday entry since I want lots of people to comment, but I'm not feeling sufficiently motivated to delete it and hand it to the 'at' daemon for reposting eleven hours hence, now that I've already posted it.
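The kind of fallback described above (curly quotes to ASCII quotes, em-dash to a pair of dashes) can be imitated with a small translation table. This is only a sketch that reproduces the observed result, not how Lynx or Links actually does it; the table contents beyond the double quotes and em-dash (single quotes, en-dash) are my own assumptions, and the names are hypothetical.

```python
# Typographic character -> ASCII approximation.
# The double quotes and em-dash follow the behaviour reported for Lynx;
# the single-quote and en-dash entries are assumed for completeness.
FALLBACKS = {
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2018": "'",   # left single quotation mark (assumed)
    "\u2019": "'",   # right single quotation mark (assumed)
    "\u2013": "-",   # en-dash (assumed)
    "\u2014": "--",  # em-dash
}

_TABLE = str.maketrans(FALLBACKS)


def to_ascii_punctuation(text: str) -> str:
    """Swap typographic punctuation for typewriter-style ASCII stand-ins."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(to_ascii_punctuation("\u201chello\u201d \u2014 world"))
```

Changing the em-dash entry to a single "-" would give the Links-style result instead.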
(no subject)
As I read my e-mail in plain text, I am fairly sure my e-mail would display non-characters for anything too interesting.
(no subject)
Actually, if I were going to start using the punctuation that HTML doesn't have named codes for in LJ, I would just invent my own mnemonics for them and write a shell script that translated them into the numeric codes with sed before invoking clive (the LJ client I use). Similarly for my web site, since I'm already running pages through PHP before uploading for other reasons, I'd just add some macro definitions for convenient mnemonics.
But yeah, while starting out it would be a PITA. I'm a bit surprised that HTML doesn't use named entities for those marks.
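As a rough illustration of the mnemonic idea, here is a sketch in Python rather than the sed script mentioned above. Everything specific in it is an assumption made up for the example: the {{name}} syntax, the mnemonic names, and the choice of characters. It simply rewrites each known mnemonic as a decimal &#...; code and leaves anything it does not recognize alone.

```python
import re

# Hypothetical mnemonics mapped to Unicode code points.
MNEMONICS = {
    "interrobang": 0x203D,
    "figdash": 0x2012,
    "snowman": 0x2603,
}

# Assumed syntax: a mnemonic wrapped in double braces, e.g. {{snowman}}.
_PATTERN = re.compile(r"\{\{(\w+)\}\}")


def expand_mnemonics(text: str) -> str:
    """Rewrite {{name}} mnemonics as HTML decimal numeric references."""

    def replace(match: re.Match) -> str:
        code_point = MNEMONICS.get(match.group(1))
        if code_point is None:
            # Unknown mnemonic: leave the original text untouched.
            return match.group(0)
        return f"&#{code_point};"

    return _PATTERN.sub(replace, text)


if __name__ == "__main__":
    print(expand_mnemonics("Frosty the {{snowman}}, really{{interrobang}}"))
```

A filter like this could sit in front of whatever posting client is in use, reading the draft and handing on the expanded text.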
Could be...
Incidentally, go read my tech rant. You're not the only one who has interoperability/accessibility problems at times... :D
(no subject)
How set in my ways am I? I still put two spaces after a sentence-ending punctuation mark, that's how set in my ways I am.
But, just for the sake of experimentation: can I type 日本語? Can I talk about Frosty the ☃? How messed up will this comment look, since I've been typing (er, generating anyway) raw Unicode into it? Will my mailed copy be usable?
(no subject)
As for two spaces after a period, yah, me too.
Ee, koko dewa nihongo ga imasu...
(no subject)
I'd rather be sure as many people as possible can see the words and the meaning, rather than find them pretty :)
(no subject)
No, we haven't yet left plain text behind, and until we do (if ever), it's no skin off my nose to continue to exercise the courtesy and protocol that enable those reading in that way to do so easily.
Which is pretty much what
(no subject)
bother with chiral quotes and so forth. I'll also use a few special characters like
hearts and a trademark symbol. And I don't think I've ever bothered with
ligatures for LJ. I've also started opting for using asterisks for *emphasis*
instead of <em> tags, as the ASCII version I get emailed when a posting
or comment gets a comment is really ugly (I don't do HTML email).