100% Valid XHTML Database Output with Markdown and HTMLEntities()
Published Wednesday, November 07, 2007 in (X)HTML, PHP, Semantics
Editors who don't know HTML can easily mess up your markup, which is a big problem if you care about validation.
There are numerous WYSIWYG editors out there, but most of them produce absolute crap code, and if you want valid and semantic XHTML they simply will not work.
Markdown is an absolutely brilliant text-to-HTML converter that everybody can learn in minutes.
In fact, you don't have to do anything special at all to produce semantically correct, valid XHTML 1.0 Strict (or HTML if you prefer).
The syntax for adding basic HTML-elements such as headings, lists and images takes about 2 minutes to learn, and is very similar to the way you add meaning to your emails and IM messages, using underscore for emphasis etc.
The problem with Markdown if you use it for user-input (as well as your own) is that it allows any type of HTML-code to be inserted. Of course this is not optimal for comments and other data anybody can submit.
Together with the PHP-function htmlentities(), however, you can always be sure to get 100% valid output.
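To make the idea concrete, here is a minimal sketch of the escaping step on its own, not the actual ns() code. The helper name escape_user_input() is hypothetical, and the ENT_QUOTES flag and UTF-8 charset are assumptions you should match to your own data.

```php
<?php
// Hypothetical helper: neutralise any HTML in user input before further processing.
function escape_user_input($str)
{
    // ENT_QUOTES converts both double and single quotes (an assumption, not a requirement).
    // The charset argument should match the encoding of your database output.
    return htmlentities($str, ENT_QUOTES, 'UTF-8');
}

$comment = 'Nice post! <script>alert("x")</script> *really*';
echo escape_user_input($comment);
// prints: Nice post! &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt; *really*
```

The Markdown emphasis markers survive untouched, while the script element is reduced to harmless text.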
Download NiceString and demo-files
Below is the function I use for almost all types of database output in aframework (the "framework" exscale.se runs on).
If I don't want the output to contain any HTML at all (like the title of an article for example) I use nothing but htmlentities(), but for the article itself, comments, pages and other things that are more than just a line of text I always use my ns()-function.
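<?php
// Signature: ns($str, $cutMore = false, $substr = false, $markdownHeadingLevel = false)
// All parameters after $str are optional; this call passes the defaults explicitly.
$str = ns($str, false, false, false);
?>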
Let's break down what each parameter means.
Obviously $str is the string containing the database output.
$cutMore is a boolean indicating whether to cut the string at a [ more ]-tag or not. If it is false, all [ more ]-tags are simply removed from the string.
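The $cutMore behaviour can be sketched roughly like this. It is an illustration under assumptions, not the real ns() code: the helper name cut_at_more() is hypothetical, and the tag is written without spaces, as it would be in actual use.

```php
<?php
// Hypothetical helper illustrating the $cutMore idea.
function cut_at_more($str, $cutMore = false)
{
    $tag = '[more]';
    if ($cutMore) {
        $pos = strpos($str, $tag);
        if ($pos !== false) {
            // Keep only the text before the first [more] tag.
            return rtrim(substr($str, 0, $pos));
        }
        // No tag present: nothing to cut.
        return $str;
    }
    // Not cutting: remove every [more] tag and keep the full text.
    return str_replace($tag, '', $str);
}
```

With $cutMore set to true you get the teaser for a list page, and with false you get the full article without the marker showing up.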
If $substr has a numeric value other than 0, the string will be cut to that length and "..." will be added to the end of it (unless the string was already shorter than requested).
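A rough sketch of the $substr behaviour, again not the actual ns() code. The helper name truncate_string() is hypothetical, and treating negative values like 0 is an assumption. It also counts bytes rather than characters, so multibyte text would need the mbstring functions instead.

```php
<?php
// Hypothetical helper illustrating the $substr idea.
function truncate_string($str, $substr = false)
{
    // false, 0, negative and non-numeric values all mean "do not truncate" (assumption).
    if (!is_numeric($substr) || (int) $substr <= 0) {
        return $str;
    }
    $length = (int) $substr;
    // Already short enough: return it unchanged, without "...".
    if (strlen($str) <= $length) {
        return $str;
    }
    // Cut to the requested length and mark the cut with "...".
    return rtrim(substr($str, 0, $length)) . '...';
}
```

Cutting is safest on the raw string, before any escaping or Markdown conversion, so you never slice an entity or a tag in half.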
Using $markdownHeadingLevel you can specify the highest allowed Markdown heading in the string. This is the same thing I wrote about in Markdown Heading Level PHP-Function.
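One way to read "highest allowed heading" is to shift every heading down so that a top-level # becomes the given level. The sketch below assumes that reading. It is not the function from the other article, the name limit_heading_level() is hypothetical, and it only handles #-style headings, not underlined ones.

```php
<?php
// Hypothetical helper: demote #-style Markdown headings so "#" maps to $level.
function limit_heading_level($str, $level = false)
{
    // false or 1 means headings are left alone (assumption).
    if (!is_numeric($level) || (int) $level <= 1) {
        return $str;
    }
    $shift = (int) $level - 1;
    return preg_replace_callback(
        // One to six # characters at the start of a line, followed by whitespace.
        '/^(#{1,6})(?=\s)/m',
        function ($m) use ($shift) {
            // HTML only has six heading levels, so cap the result there.
            return str_repeat('#', min(6, strlen($m[1]) + $shift));
        },
        $str
    );
}
```

Called with a level of 3, a comment author's "# Title" comes out as an h3 and cannot compete with the page's own h1 and h2.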
One downside of using htmlentities() together with markdown() is that the Markdown syntax for > blockquotes and <autolinks> stops working, since both rely on the < and > characters. I've taken care of blockquotes using a simple regular expression, but you'll have to do without autolinks if you're gonna use ns().
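The blockquote fix could look something like the sketch below. The regular expression here is a guess at the approach, not the one ns() actually uses, and the helper name escape_keep_blockquotes() is hypothetical. It escapes everything first and then turns the escaped &gt; back into > only where it leads a line.

```php
<?php
// Hypothetical helper: escape HTML but keep Markdown blockquote markers working.
function escape_keep_blockquotes($str)
{
    $escaped = htmlentities($str, ENT_QUOTES, 'UTF-8');
    return preg_replace_callback(
        // A run of "&gt;" markers at the start of a line, allowing nested quotes.
        '/^(?:[ ]{0,3}&gt;[ ]?)+/m',
        function ($m) {
            // Restore only the leading markers; any other "&gt;" stays escaped.
            return str_replace('&gt;', '>', $m[0]);
        },
        $escaped
    );
}
```

A > in the middle of a line, as in "2 > 1", stays escaped, so the output is still safe to hand to Markdown.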
ns() supports a few things other than markdown. Anything inserted between [ code ] and [ /code ]-tags will be syntax highlighted using PHP's highlight_string() function.
ns() replaces highlight_string's inline styles with semantic class-names as well, so you can apply your own styling to them.
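Here is a minimal sketch of the [code] handling, under a few assumptions: the helper name highlight_code_blocks() and the hl- class names are hypothetical, and it is meant to run on the raw string, since highlight_string() does its own escaping. It reads the configured highlight colours with ini_get() and swaps each inline style for a class.

```php
<?php
// Hypothetical helper: highlight [code]...[/code] blocks and use classes instead of inline styles.
function highlight_code_blocks($str)
{
    // Map each configured highlight colour to a class name (names are made up for this sketch).
    $classes = array();
    foreach (array('comment', 'default', 'html', 'keyword', 'string') as $type) {
        $style = 'style="color: ' . ini_get('highlight.' . $type) . '"';
        $classes[$style] = 'class="hl-' . $type . '"';
    }
    return preg_replace_callback(
        // The s modifier lets a code block span several lines.
        '/\[code\](.*?)\[\/code\]/s',
        function ($m) use ($classes) {
            // Passing true makes highlight_string() return the HTML instead of printing it.
            $html = highlight_string(trim($m[1]), true);
            return str_replace(array_keys($classes), array_values($classes), $html);
        },
        $str
    );
}
```

The exact wrapper elements highlight_string() emits differ between PHP versions, which is one more reason to style the classes rather than rely on the surrounding markup.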
You can use a [ youtube=youtubecode ]-tag to insert a youtube-clip using valid XHTML as described in Alistapart's Flash Satay-article.
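A sketch of what the [youtube] replacement might produce, following the Flash Satay pattern of a single object element with type and data attributes plus a movie param. The helper name youtube_embed(), the embed URL format and the default dimensions are all assumptions, not taken from ns().

```php
<?php
// Hypothetical helper: turn [youtube=CODE] into Flash Satay-style object markup.
function youtube_embed($str, $width = 425, $height = 350)
{
    return preg_replace_callback(
        // Only accept characters that are safe to drop into an attribute value.
        '/\[youtube=([A-Za-z0-9_-]+)\]/',
        function ($m) use ($width, $height) {
            // Assumed embed URL format.
            $url = 'http://www.youtube.com/v/' . $m[1];
            return '<object type="application/x-shockwave-flash" data="' . $url . '"'
                . ' width="' . (int) $width . '" height="' . (int) $height . '">'
                . '<param name="movie" value="' . $url . '" />'
                . '</object>';
        },
        $str
    );
}
```

Restricting the clip code to letters, digits, underscores and hyphens means a malformed tag is simply left as text instead of ending up inside the markup.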
I've also added support for Textile's way of inserting abbreviations, ABBR(Abbreviation), and for <del>-elements using -del-. These are unfortunately not supported by PHP Markdown alone.
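Both additions can be approximated with two regular expressions. This is a sketch with made-up patterns, not the ones ns() uses, and the helper name textile_extras() is hypothetical. It assumes the string has already been escaped, so the abbreviation text is safe to place in a title attribute.

```php
<?php
// Hypothetical helper: Textile-style abbreviations and -deleted- text.
function textile_extras($str)
{
    // -text- becomes <del>text</del>. Hyphens inside words and runs like --- are left alone.
    $str = preg_replace(
        '/(?<![\w-])-([^\s-](?:.*?[^\s-])?)-(?![\w-])/',
        '<del>$1</del>',
        $str
    );
    // ABBR(Expansion) becomes <abbr title="Expansion">ABBR</abbr>.
    // Two or more capitals is an assumed minimum for this sketch.
    return preg_replace(
        '/\b([A-Z]{2,})\(([^)]+)\)/',
        '<abbr title="$2">$1</abbr>',
        $str
    );
}
```

Requiring a non-space character right after the opening hyphen keeps ordinary Markdown list items ("- item") from being struck through.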
If you want to hide part of a string you can put it inside [ del ] and [ /del ]-tags.
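Assuming "hide" means the text is dropped from the output altogether, the tag could be handled with a single replacement. The helper name strip_hidden() is hypothetical, and the tags are written without spaces, as they would be in real use.

```php
<?php
// Hypothetical helper: remove everything between [del] and [/del], tags included.
function strip_hidden($str)
{
    // Non-greedy match so several hidden parts in one string are handled separately.
    // The s modifier lets a hidden part span several lines.
    return preg_replace('/\[del\].*?\[\/del\]/s', '', $str);
}
```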
Please note that I've added spaces to all the tags explained here. Remove the spaces when you use the function.
Using ns() (or "nice string") you can always be sure to have 100% valid code.
ns() requires PHP Markdown. It also uses plenty of other functions, but they are all included below.
Perhaps it would make sense to turn this into a class?
Well look at that, I did turn it into a class: Download NiceString and demo-files
Now I'm not actually a back-end coder, and I'm sure there are ways to improve this code so please feel free to do so in the comments.
I regularly expand and improve this function and I'll try to keep this post up to date so you may want to check back every now and then to get the latest version.
The NiceString Class is released under the GNU General Public License v3. I know nothing about licenses so I just picked one. I hope it allows you to do what you wish except make money off it that I never see :)






Comments
Thursday, November 08, 2007 | Turbo
This article is so interesting that I will read it 4 times today and then maybe one more time tomorrow. /Da Turb