[PHP-INTERNALS] short_open_tag

Stanislav Malyshev
Mar 7, 2008 at 8:46 pm
Hi!

I wonder - is there a reason why short_open_tag config value is per-dir
and not PHP_INI_ALL? After all, as I understand, it is private for each
compilation. So suppose you preferred it generally off (you do XML,
etc.) but you have some files in your app where you want it on - would
there be any problem anywhere if it were INI_ALL?
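
For readers who want to check how a given directive is currently classified, here is a minimal sketch. The helper name ini_is_user_changeable() is hypothetical and not part of PHP or of this thread; it only reads the 'access' bitmask that ini_get_all() reports and tests the INI_USER bit, which is the bit that ini_set() from a script depends on.

```php
<?php
// Hypothetical helper (not from the thread): report whether an ini directive
// can be changed from a running script, i.e. whether its INI_USER bit is set.
function ini_is_user_changeable(string $name): ?bool
{
    // With details enabled, each entry has global_value, local_value and access.
    $all = ini_get_all(null, true);
    if (!isset($all[$name])) {
        return null; // directive is not registered in this build
    }
    return ($all[$name]['access'] & INI_USER) !== 0;
}

foreach (['short_open_tag', 'include_path', 'memory_limit'] as $directive) {
    var_dump($directive, ini_is_user_changeable($directive));
}
```

A per-dir directive should report false here, while an INI_ALL directive such as include_path should report true.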
--
Stanislav Malyshev, Zend Software Architect
stas@zend.com http://www.zend.com/
(408)253-8829 MSN: stas@zend.com

107 responses

  • Lars Strojny at Mar 7, 2008 at 10:14 pm
    Hi Stas,

    On Friday, 07.03.2008, at 12:45 -0800, Stanislav Malyshev wrote:
    [...]
    I wonder - is there a reason why short_open_tag config value is per-dir
    and not PHP_INI_ALL? After all, as I understand, it is private for each
    compilation. So suppose you preferred it generally off (you do XML,
    etc.) but you have some files in your app where you want it on - would
    there be any problem anywhere if it were INI_ALL?
    This is a great idea as some frameworks use plain PHP scripts for
    templating nowadays. So the view could enable short tags for the time
    the template script is evaluated and disable it afterwards.

    cu, Lars
  • Stanislav Malyshev at Mar 21, 2008 at 5:57 pm
    Hi!

    Forwarding this mail again since apparently many people missed it
    previously. Please discuss.

    -------- Original Message --------
    Subject: short_open_tag
    Date: Fri, 07 Mar 2008 12:45:59 -0800
    From: Stanislav Malyshev <stas@zend.com>

    Hi!

    I wonder - is there a reason why short_open_tag config value is per-dir
    and not PHP_INI_ALL? After all, as I understand, it is private for each
    compilation. So suppose you preferred it generally off (you do XML,
    etc.) but you have some files in your app where you want it on - would
    there be any problem anywhere if it were INI_ALL?


    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Marcus Boerger at Mar 21, 2008 at 6:32 pm
    Hello Stanislav,

    thanks for rewriting this.

    Friday, March 21, 2008, 6:57:40 PM, you wrote:
    Hi!
    Forwarding this mail again since apparently many people missed it
    previously. Please discuss.
    -------- Original Message --------
    Subject: short_open_tag
    Date: Fri, 07 Mar 2008 12:45:59 -0800
    From: Stanislav Malyshev <stas@zend.com>
    Hi!
    I wonder - is there a reason why short_open_tag config value is per-dir
    and not PHP_INI_ALL? After all, as I understand, it is private for each
    compilation. So suppose you preferred it generally off (you do XML,
    etc.) but you have some files in your app where you want it on - would
    there be any problem anywhere if it were INI_ALL?
    For me the largest issue is in fact late enabling of short tags at run time.
    The issue comes down to the case where people might use code that enables
    short open tags but forgets to disable them. Now why might I rely on short
    open tags being disabled is another question of course. Years ago when we
    last discussed whether we should discourage them the conclusion was that
    not many people rely on them being off. Today many people have php code
    interact with XML code which sees short open tags as invalid PIs. Thus we
    should try to discourage, maybe even finally deprecate, short open tags. This
    would also be in line with the rest of PHP where we block and discourage
    short syntactical sugar when all it does is save keystrokes. With short
    open tags the argument usually simply is that '<?=$bla?>' is shorter than
    writing '<?php echo $bla;?>'. But then again the former is much harder to
    spot in manually written code. And for generated code it doesn't matter at
    all. That said, I am against short open tags. And given my first point, I do
    not want to deal with code that does 'ini_set("short_open_tag", 0);'
    after every single include or require statement.

    Best regards,
    Marcus
  • Stanislav Malyshev at Mar 21, 2008 at 7:11 pm
    Hi!
    For me the largest issue is in fact late enabling of short tags at run time.
    The issue comes down to the case where people might use code that enables
    short open tags but forgets to disable them. Now why might I rely on short
    I think this case is very unlikely. The use case for this feature is
    template system, written in long-tags style, but using short-tags
    notation for PHP templates. To compare:

    My name is <?= $name ?> and I am <?= $age ?> years old.

    My name is <?php echo $name; ?> and I am <?php echo $age; ?> years old.

    I think there's little doubt people - especially non-programmer people
    like designers - would have much less trouble understanding and writing
    first notation than second notation. If you compare larger, more complex
    templates, the difference in readability is even bigger. And having code
    easy to work with is one of the reasons people do PHP.

    Now, in a template system, it is really hard to imagine that template
    system creator would be so sloppy as to intend to write code like:

    setShortTags();
    include $user_template;
    resetShortTags();

    and somehow "forget" to write the last function. That would require
    extreme absent-mindedness on developer's part and you definitely should
    steer clear of template systems written by such people. However, for
    real template systems I know - they are written by very smart people,
    and actually these people support this capability, as it allows them to
    use nice syntax in templates without requiring any system configuration
    (which may be unavailable or incompatible with other code).
    open tags being disabled is another question of course. Years ago when we
    This is very important question, since the only known case of why it
    might be important is when you use XML as template by including it
    directly through PHP parser. I don't think I would be mistaken if I say
    this is extremely rare use case. Actually, I'm not sure there's even one
    of common applications - like known CMSes, frameworks, blog platforms,
    e-commerce platforms, etc. - that can not work with short tags. Can you
    name which ones can't?
    Again, I consider the concept of "accidentally enabling" short tags very
    improbable, but even if it somehow happened - IMO it would not be a
    problem except in some very rare use cases.
    last discussed whether we should discourage them the conclusion was that
    not many people rely on them being off. Today many people have php code
    And if you look at the discussion, there were opinions - including
    Zeev's - that there's nothing wrong with short tags in general, only in
    some rare use cases.
    short syntactical sugar when all it does is save keystrokes. With short
    open tags the argument usually simply is that '<?=$bla?>' is shorter than
    writing '<?php echo $bla;?>'. But then again the former is much harder to
    spot in manually written code. And for generated code it doesn't matter at
    I have hard time figuring out a use case when you need to "spot" this in
    your code - and, indeed, have one in your code at all, unless it is a
    template. In a template, <?= is much better and with any decent editor,
    very easy to spot.
    all. That said, I am against short open tags. And given my first point, I do
    not want to deal with code that does 'ini_set("short_open_tag", 0);'
    after every single include or require statement.
    You do not need to deal with this code, and there's absolutely no reason
    to do it. The only case where you may need to do it is if you include *hostile*
    code - i.e. code that can intentionally try and screw up your
    environment. In this case, this code might do much worse things than
    screw with your short tags setting, which in 99.99% of cases wouldn't do
    anything - that code might drop your include path, unset your variables,
    close your files and DB connections, reset your memory limit and
    execution time to very low values, rewire charsets on input and output,
    install any kinds of stream filters, turn on magic quotes, and do a ton
    of other very bad things like messing with your file system and what
    not. Still all those variables and settings are user-accessible. And you
    are not worried about restoring your include path or resetting your
    magic quotes or memory limits after each include.

    And on top of that - if you are still concerned, you always could do
    php_admin_value which IIRC blocks setting values by user.
    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Jani Taskinen at Mar 21, 2008 at 7:28 pm

    Stanislav Malyshev wrote:
    I think this case is very unlikely. The use case for this feature is
    template system, written in long-tags style, but using short-tags
    notation for PHP templates. To compare:

    My name is <?= $name ?> and I am <?= $age ?> years old.

    My name is <?php echo $name; ?> and I am <?php echo $age; ?> years old.
    I'd rather see <?php= than having this whole "short_open_tag" thing at all.
    I'd even use it myself. But I will not EVER enable the damn short tags again.
    And won't allow anyone else doing it either. And speaking of hostile code: ALL
    code is hostile unless you wrote it yourself. Have you ever heard of a group of
    developers working on same code base? Like, say, in a company that develops
    apps? If you allow changing this thing in runtime, it's adding another potential
    pitfall to check for when all hell breaks loose and something starts misbehaving
    and you have no idea why. Time spent on finding the cause would be better spent
    coding new stuff.

    And as you yourself instructed to check for "short_open_tag" in the archive
    search: Count how many hits it gives which talk about _problems_ with it.
    (and no, none of those people ask about "why can't I enable it in runtime..")

    --Jani
  • Stanislav Malyshev at Mar 21, 2008 at 7:37 pm
    I'd rather see <?php= than having this whole "short_open_tag" thing at all.
    Does <?php= work? I thought the echo shortcut works only with short tags.
    <?php= is not much worse than <?= so it'd be OK with me. Downside would
    be template systems couldn't use it until 5.3 is widely deployed - which
    means no template system can use it as standard for about 2-3 years at
    least. Unless we put <?php= in 5.2, which would make me a happy camper,
    but might be a trouble for others.
    I'd even use it myself. But I will not EVER enable the damn short tags
    again. And won't allow anyone else doing it either. And speaking of
    What's wrong with short tags, can anybody explain to me?
    hostile code: ALL code is hostile unless you wrote it yourself. Have you
    Not true. You probably use a ton of libraries, never verifying they
    don't screw up your include path, memory limits, etc.? Why short tags
    are so different?
    And as you yourself instructed to check for "short_open_tag" in the
    archive search: Count how many hits it gives which talk about _problems_
    with it.
    Can you show which exactly search query you used, so we'd be sure we are
    talking about the same thing?

    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Jani Taskinen at Mar 21, 2008 at 7:47 pm

    Stanislav Malyshev wrote:
    I'd rather see <?php= than having this whole "short_open_tag" thing at
    all.
    Does <?php= work? I thought the echo shortcut works only with short tags.
    No, someone decided it shouldn't be added.
    least. Unless we put <?php= in 5.2, which would make me a happy camper,
    but might be a trouble for others.
    I wouldn't mind. Nor would couple of thousand other people. :)
    I'd even use it myself. But I will not EVER enable the damn short tags
    again. And won't allow anyone else doing it either. And speaking of
    What's wrong with short tags, can anybody explain to me?
    [see below where you ask about the archive search..]
    hostile code: ALL code is hostile unless you wrote it yourself. Have you
    Not true. You probably use a ton of libraries, never verifying they
    don't screw up your include path, memory limits, etc.? Why short tags
    are so different?
    Nobody can set memory_limit in a script during runtime. AFAICT.
    Short tags are language SYNTAX issue. That's why it's different.
    You don't get any plain error if they're "on" and something doesn't work.
    It just doesn't work or misbehaves.
    And as you yourself instructed to check for "short_open_tag" in the
    archive search: Count how many hits it gives which talk about
    _problems_ with it.
    Can you show which exactly search query you used, so we'd be sure we are
    talking about the same thing.
    Just plain "short_open_tag" (without the quotes of course :).
    Here's the longish url:
    http://www.mail-archive.com/find.php?domains=www.mail-archive.com&q=short_open_tag&sa=Search+mailing+lists&sitesearch=www.mail-archive.com&client=pub-7266757337600734&forid=1&channel=2703820358&ie=ISO-8859-1&oe=ISO-8859-1&cof=GALT%3A%23C8C8C8%3BGL%3A1%3BDIV%3A%23CD9685%3BVLC%3A000000%3BAH%3Acenter%3BBGC%3AFFFFFF%3BLBGC%3AFFFFFF%3BALC%3A006792%3BLC%3A006792%3BT%3A000000%3BGFNT%3A006792%3BGIMP%3A006792%3BFORID%3A11&hl=en

    The first hits explain quite well why short_open_tag is bad, mmkay.

    --Jani
  • Stanislav Malyshev at Mar 21, 2008 at 8:03 pm
    The first hits explain quite well why short_open_tag is bad, mmkay.
    OK, let's see what we have there:

    0. Support for my email, skipping.

    1. "The web is a rapidly changing market and standards are being
    activley evolved. <?php is more compatable with standards on the web
    than <? ... and its not about XML document headers."
    Then goes for "include XML as template" case. And Rasmus objecting to
    short tags being bad code practice.

    2. Proposal for PHP to not parse <?xml, no explanation why short tags bad.

    3. Support for short tags

    4. "We should have warned people not to use short tags years ago." no
    explanation why. Explaining why <?xml from 2. is not a good idea.

    5. Bug report on short_open_tag

    6. "As the XML community expands and more and more scripting languages
    (server and client side) are being designed to interoperate,
    cross-language compatability (or at least handling) is required." - same
    person as 1. Then:
    "The short tag <? is bad form, and shows lazyness. <?php fits the
    standard, and its my recommendation that (in version 5 of the product)
    the sort tag option is removed."

    No explanation why it "shows lazyness" or why it's bad except for
    hinting it's somehow bad for handling XML (which it isn't).

    7. "I'm -1 on removing short tags, whether now or for PHP5." from Chris.

    8. "Not going to happen, please leave this issue alone." from Zeev.
    If you have any doubts, "not going to happen" is removal of the short tags.

    9. "1. Removing (and even disabling by default) short tags is not
    necessary for all PHP-community." from Antony.

    Should I go deeper? Did we use the same search engine? I'm still missing
    explanation why short tags are bad. Since you have obviously found it,
    could you do me a favor and quickly summarize at least couple of reasons
    - omitting the one being "every application must include XML through
    parser" as obviously invalid.

    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Pierre Joye at Mar 21, 2008 at 8:13 pm

    On Fri, Mar 21, 2008 at 9:02 PM, Stanislav Malyshev wrote:

    No explanation why it "shows lazyness" or why it's bad except for
    hinting it's somehow bad for handling XML (which it isn't).
    See below.
    Should I go deeper? Did we use the same search engine? I'm still missing
    explanation why short tags are bad.
    I gave you the link to one main explanation, the XML specs. Or what
    else do you need to explain the problem in the XML context?
  • Stanislav Malyshev at Mar 21, 2008 at 8:15 pm

    I gave you the link to one main explanation, the XML specs. Or what
    else do you need to explain the problem in the XML context?
    I need to explain why XML specs have any relevance to PHP syntax and why
    PHP sources should conform to them. Are we coding in XML now? Is
    everybody using an XML parser to process PHP code? How XML spec is one
    to dictate what PHP syntax is?
    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Marcus Boerger at Mar 21, 2008 at 8:26 pm
    Hello Stanislav,

    lemme think, PHP is used to generate HTML and XHTML. And often people
    have the headers outside of the PHP tags. And some people like to use tools.
    But maybe I am wrong. Either way. It appears that nearly every single
    person replying is against this. So can we please stop arguing? I don't
    think there are more arguments coming. And it doesn't look like you are
    convincing anyone here.

    Maybe instead work on a proposal for something like <?echo, given the
    feedback on this thread it looks like people would jump for it.

    marcus

    Friday, March 21, 2008, 9:15:33 PM, you wrote:
    I gave you the link to one main explanation, the XML specs. Or what
    else do you need to explain the problem in the XML context?
    I need to explain why XML specs have any relevance to PHP syntax and why
    PHP sources should conform to them. Are we coding in XML now? Is
    everybody using an XML parser to process PHP code? How XML spec is one
    to dictate what PHP syntax is?


    Best regards,
    Marcus
  • Stanislav Malyshev at Mar 21, 2008 at 8:37 pm

    lemme think, PHP is used to generate HTML and XHTML. And often people
    Neither of which require <?. HTML in fact doesn't support it even.
    have the headers outside of the PHP tags. And some people like to use tools.
    But maybe I am wrong. Either way. It appears that nearly every single
    person replying is against this. So can we please stop arguing? I don't
    You mean like 3 people that can't even explain why they hate short tags,
    but just like me to shut up because their opinion can't be wrong? Yeah,
    right.
    think there are more arguments coming. And it doesn't look like you are
    convincing anyone here.
    Maybe because these "someone" just can't be convinced by rational
    argument. This is really sad that people are willing to sacrifice
    interests of the users and the language development for... well, I don't
    even know for what. For not admitting they were wrong for once?
    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Marcus Boerger at Mar 21, 2008 at 9:41 pm
    Hello Stanislav,

    Friday, March 21, 2008, 9:37:51 PM, you wrote:
    lemme think, PHP is used to generate HTML and XHTML. And often people
    Neither of which require <?. HTML in fact doesn't support it even.
    have the headers outside of the PHP tags. And some people like to use tools.
    But maybe I am wrong. Either way. It appears that nearly every single
    person replying is against this. So can we please stop arguing? I don't
    You mean like 3 people that can't even explain
    I think you should take a step back dude.
    think there are more arguments coming. And it doesn't look like you are
    convincing anyone here.
    Maybe because these "someone" just can't be convinced by rational
    argument. This is really sad that people are willing to sacrifice
    interests of the users and the language development for... well, I don't
    even know for what. For not admitting they were wrong for once?
    I haven't seen a single technical argument from your side.

    Best regards,
    Marcus
  • Stanislav Malyshev at Mar 21, 2008 at 9:44 pm
    I haven't seen a single technical argument from your side.
    That's just hilarious. I spent an entire half-day repeating arguments about
    XML and short tags and templates and users and what not - but why bother
    if Marcus doesn't even read it? Well, I hope at least somebody reads it.
    As for trying to convince people that couldn't care less - I'm close to
    giving up talking to those.
    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Elizabeth M Smith at Mar 21, 2008 at 9:53 pm
    Wow, noisy...

    I've been in the situation where I use php for templating and the short
    syntax is much nicer on the eyes. The ability to "flick the switch" for
    short tags would be nice.

    However, like Steph, I've also been bitten by having a simple xml
    declaration in a file with short tags on that completely breaks things.
    Parse errors are NOT a good thing. This is why I'd personally prefer
    short tags just go poof - having to check all your code so any
    appearance of <? is echo'd gets really annoying.
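
    The workaround described above, echoing the XML declaration rather than
    writing it literally, can be sketched as follows. The helper name
    xml_declaration() is an assumption made for illustration and is not
    something from the thread.

```php
<?php
// With short_open_tag on, a literal XML declaration line in a template is
// handed to the PHP parser as code. Building it as a string avoids that.
// xml_declaration() is a hypothetical helper, not a PHP built-in.
function xml_declaration(string $version = '1.0', string $encoding = 'UTF-8'): string
{
    return '<?xml version="' . $version . '" encoding="' . $encoding . '"?>' . "\n";
}

echo xml_declaration();
```

    This works regardless of the short_open_tag setting, at the price of
    having to remember it in every file that starts with a declaration.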

    I'd argue that a <?php= shortcut or something similar would help "split
    the difference" between the ugliness of the long version and the need to
    not break php every time an xml declaration pops up in a file. Even
    gettext has a nice _() function shortcut which is less typing than echo
    $blah; in every php tag set, and then you wouldn't be fighting with the
    potential breakage. The argument that if some new syntax only goes into
    5.3, people can't use it doesn't really hold water here because you
    wouldn't be able to rely on flipping the short_tags switch before 5.3
    either.

    I can see both sides of the story, and really don't have a preference -
    I'm curious as to the opinions of someone OTHER than Marcus, Stas,
    Pierre and Jani ;)

    Thanks,
    Elizabeth M Smith
  • Edward Z. Yang at Mar 21, 2008 at 9:55 pm

    Elizabeth M Smith wrote:
    I'd argue that a <?php= shortcut or something similar would help "split
    the difference" between the ugliness of the long version and the need to
    not break php every time an xml declaration pops up in a file.
    +1

    --
    Edward Z. Yang GnuPG: 0x869C48DA
    HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
    [[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]
  • Richard Quadling at Mar 21, 2008 at 10:06 pm

    On 21/03/2008, Elizabeth M Smith wrote:
    Wow, noisy...
    And having made the commit to the dox before the revert, I'm still
    reeling/reading to try and see which way I would go...
    I'd argue that a <?php= shortcut or something similar would help "split
    the difference" between the ugliness of the long version and the need to
    not break php every time an xml declaration pops up in a file. Even
    gettext has a nice _() function shortcut which is less typing than echo
    $blah; in every php tag set, and then you wouldn't be fighting with the
    potential breakage. The argument that if some new syntax only goes into
    5.3, people can't use it doesn't really hold water here because you
    wouldn't be able to rely on flipping the short_tags switch before 5.3
    either.

    I can see both sides of the story, and really don't have a preference -
    I'm curious as to the opinions of someone OTHER than Marcus, Stas,
    Pierre and Jani ;)
    If you saw ...

    <?php $varname; ?>
    or
    <?php $varname ?>

    what would you assume this meant?

    From this, I would say it isn't a function call as I would need to add () to it.
    It is not an assignment or declaration.
    If it was documented that a PHP statement consisting of just a
    variable name would echo a string, then I think this would solve all
    the problems of readability.

    What would you assume a non programmer thought of it? If they were
    told "this is how you put a PHP variable into a template" would they
    just go with it?

    Ok, again, I'm no internals expert.

    Maybe the _$varname; would be more pleasing.

    On the plus side there is only 1 PHP tag. No matter what happens <?php
    will always be the PHP way. I see no need for <?= (and WTF the ASP
    ones? I'm late to the PHP - only 5 years - but ...!)



    --
    -----
    Richard Quadling
    Zend Certified Engineer : http://zend.com/zce.php?c=ZEND002498&r=213474731
    "Standing on the shoulders of some very clever giants!"
  • Rasmus Lerdorf at Mar 22, 2008 at 12:20 am

    Elizabeth M Smith wrote:
    Wow, noisy...

    I've been in the situation where I use php for templating and the short
    syntax is much nicer on the eyes. The ability to "flick the switch" for
    short tags would be nice.

    However, like Steph, I've also been bitten by having a simple xml
    declaration in a file with short tags on that completely breaks things.
    Parse errors are NOT a good thing. This is why I'd personally prefer
    short tags just go poof - having to check all your code so any
    appearance of <? is echo'd gets really annoying.

    I'd argue that a <?php= shortcut or something similar would help "split
    the difference" between the ugliness of the long version and the need to
    not break php every time an xml declaration pops up in a file. Even
    gettext has a nice _() function shortcut which is less typing than echo
    $blah; in every php tag set, and then you wouldn't be fighting with the
    potential breakage. The argument that if some new syntax only goes into
    5.3, people can't use it doesn't really hold water here because you
    wouldn't be able to rely on flipping the short_tags switch before 5.3
    either.

    I can see both sides of the story, and really don't have a preference -
    I'm curious as to the opinions of someone OTHER than Marcus, Stas,
    Pierre and Jani ;)
    There are a bunch of factors here. In the end it comes down to the
    purists vs. the pragmatists. You all know where I fall on that one.
    <?php is for the purists and <? and <?= still exists for the pragmatists.

    Now, someone mentioned <?php= which I am completely against as it breaks
    the purist side. A PI tag is defined as <?<label><whitespace> and I am
    pretty sure the PI label names can't contain '='. <?php was added and
    adopted in order to be correct, let's not break that correctness.

    Most of the arguments I have seen are basically saying <? is evil and it
    shouldn't even exist, but that isn't the current question. It does
    exist, and we aren't removing it, so the only real argument here is the
    WTF factor introduced by code that is able to enable or disable these
    tags on the fly. That's the one and only valid argument I have seen.
    Whether or not PHP code can be validated with xmllint and whether or not
    <? is valid xml, which it obviously isn't, is completely beside the
    point. We all know that when you use <? you are not XML-compliant. And
    for the vast majority that's ok. XHTML is dead because IE, which is
    unfortunately the dominant browser, has never supported and never will
    support XHTML. Yes, you can hack it and serve up XHTML with an HTML mime type
    and apply various hacks to sort of almost maybe sometimes get it to work
    in IE, but nobody who does any serious web development uses XHTML for
    sites that have wide audiences.

    So, we are down to a very simple decision. Does the added WTF factor of
    dynamically changing short_open_tags outweigh the benefits to the folks
    using <?-based templates?

    My view is that people want templating. As much as I hate the concept,
    and have always been very vocal about that, people want simpler
    templating tags. They will even go as far as parsing files and
    generating PHP code on every single request in order to use {blah}
    instead of <?php blah() ?>. The fact that people are willing to take an
    order of magnitude performance hit for syntactic sugar is baffling to
    me, but just look at all the templating systems out there. And yes, I
    know there are other reasons to use templating such as restricting the
    feature set for untrusted template writers, etc. but you'd be surprised
    how many people just want to type less and have their tags be prettier.
    Getting these folks to switch to <?blah()?> is a win for performance
    and sanity in my book. Yes, it isn't a full victory, but it is still a win.

    In order for a templating system to use <? they have to have
    short_open_tag on for the entire system. By allowing them to only apply
    the short_open_tags to certain parts of their code it means that they
    will write correct <?php business logic and only use the short_open_tags
    for the actual included template files. Again, not a full victory, but
    a win for us in the sense that the actual PHP code in their application
    will be using <?php everywhere. They can't get lazy and use short_tags
    in their business logic because it won't work due to limiting the
    short_open_tags to just the templates.

    I recognize the WTF factor of dynamically changing the setting, but
    frankly since it can already be changed per-dir from one request to the
    next on the same server, I really don't see the incremental WTF factor
    as being very high.

    Consider the fact that:

    <?php
    virtual('templates/main.php');
    ?>

    and

    <?php
    ini_set('short_open_tag',true);
    include 'templates/main.php';
    ini_set('short_open_tag',false);
    ?>

    Will actually do about the same thing in the sense that the top-level
    script can run with short_open_tag turned off and the main.php script
    can run with short_open_tag enabled. The first version requires that
    you configure your Apache to enable short_open_tag for the templates/
    directory, while the second lets you do it from the PHP level. The
    first suffers from being extremely slow and it isn't obvious that
    scripts in templates/ operate under different rules. The second is much
    faster and it is more obvious what is happening.
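
    The concern raised earlier in the thread, that code might switch a
    setting on and forget to switch it back, can be addressed with a small
    wrapper. The sketch below is only an illustration: with_ini() is a
    hypothetical helper, try/finally needs PHP 5.5 or later, and the demo
    uses 'precision' because that directive is changeable at runtime. Where
    short_open_tag is still per-dir, as this thread describes, ini_set()
    fails for it and the helper throws.

```php
<?php
// Hypothetical helper: run $fn with an ini directive temporarily overridden,
// restoring the previous value even if $fn throws.
function with_ini(string $name, string $value, callable $fn)
{
    $previous = ini_set($name, $value); // old value as a string, or false on failure
    if ($previous === false) {
        throw new RuntimeException("Cannot change $name at runtime");
    }
    try {
        return $fn();
    } finally {
        ini_set($name, $previous); // always restore the old value
    }
}

// Demo with 'precision', an assumption chosen because it is runtime-changeable.
$rendered = with_ini('precision', '3', function () {
    return (string) (1 / 3);
});
```

    In a template system the callback would do the include of the template
    file, so enabling and restoring the setting cannot drift apart.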

    -Rasmus
  • Pierre Joye at Mar 22, 2008 at 12:29 am
    Hi Rasmus,
    On Sat, Mar 22, 2008 at 1:20 AM, Rasmus Lerdorf wrote:

    Will actually do about the same thing in the sense that the top-level
    script can run with short_open_tag turned off and the main.php script
    can run with short_open_tag enabled. The first version requires that
    you configure your Apache to enable short_open_tag for the templates/
    directory, while the second lets you do it from the PHP level. The
    first suffers from being extremely slow and it isn't obvious that
    scripts in templates/ operate under different rules. The second is much
    faster and it is more obvious what is happening.
    as a conclusion from my point of view, I don't think it's possible to
    bring anything new to this discussion:

    -1 for the patch (revert)
    +1 to actually deprecate short tags
    +1 to remove them in HEAD

    That was not asked, but let's clear this problem up once and for all.

    Cheers,
  • Andrés Robinet at Mar 22, 2008 at 2:18 am

    Hi,

    I'm new to the internals, but I've been reading you for months... now, let me
    ask,

    Are there any security issues with short tags?
    Is it really harder for the interpreter to have them enabled?
    Is the short tags parsing code too hard to maintain?
    Do they create a performance hit?
    Do they create bad habits, why?
    Does the patch for ini_set create a performance issue?
    What ACTUAL situation makes you hate short tags? What code are you using to do
    what such that you find short tags being evil?
    Isn't it that you just don't like them, or don't like reading them in PHP code?

    I personally hate <? ?>, but I like and use <?= ?>, it's just much more readable
    and concise than <?php echo ... ?>. What if the parser would ignore <?xml, or
    <?whatever except <?php and <?=. I don't know you guys, but every templating
    system I've had to deal with is using them and every MVC framework out there is
    using short tags for the views. And they only represent an issue if you have
    some XML code you send through the PHP parser, and you can always use a
    global/per-dir setting if you just don't like them.

    +1 for the patch (if it doesn't create a performance or maintainability issue)
    +1 to keep short tags into PHP (deprecate them if you want, but only remove them
    in PHP 7 and provide a suitable alternative)

    Regards,

    Rob

    Andrés Robinet | Lead Developer | BESTPLACE CORPORATION
    Email: info@bestplace.net  | MSN Chat: best@bestplace.net  |  SKYPE: bestplace |
    Web: bestplace.biz  | Web: seo-diy.com
  • Steph Fox at Mar 22, 2008 at 2:27 am
    Hi Andrés/Rob,

    as usual my >> are playing up so I'll use ==

    =====================
    I'm new to the internals, but I've been reading you for months... now, let
    me ask,

    Are there any security issues with short tags?
    Is it really harder for the interpreter to have them enabled?
    Is the short tags parsing code too hard to maintain?
    Do they create a performance hit?
    Do they create bad habits, why?
    Does the patch for ini_set create a performance issue?
    What ACTUAL situation makes you hate short tags? What code are you using to
    do what such that you find short tags being evil?
    Isn't it that you just don't like them, or don't like reading them in PHP
    code?

    I personally hate <? ?>, but I like and use <?= ?>, it's just much more
    readable and concise than <?php echo ... ?>. What if the parser would ignore
    <?xml, or <?whatever except <?php and <?=. I don't know you guys, but every
    templating system I've had to deal with is using them and every MVC
    framework out there is using short tags for the views. And they only
    represent an issue if you have some XML code you send through the PHP
    parser, and you can always use a global/per-dir setting if you just don't
    like them.

    +1 for the patch (if it doesn't create a performance or maintainability
    issue)
    +1 to keep short tags into PHP (deprecate them if you want, but only remove
    them in PHP 7 and provide a suitable alternative)
    =======================

    The problem is that if you have more than one application to deal with and
    one uses short_tags and the next does not, the second can be screwed up by
    it. Hence the widespread hysteria at the idea of making short_tags even
    easier to switch on.

    And as I write that I find 28 new questions of my own :)

    - Steph


  • Steph Fox at Mar 22, 2008 at 1:46 am

    Elizabeth M Smith wrote:
    Wow, noisy...

    I've been in the situation where I use php for templating and the short
    syntax is much nicer on the eyes. The ability to "flick the switch" for
    short tags would be nice.

    However, like Steph, I've also been bitten by having a simple xml
    declaration in a file with short tags on that completely breaks things.
    Parse errors are NOT a good thing. This is why I'd personally prefer
    short tags just go poof - having to check all your code so any
    appearance of <? is echo'd gets really annoying.

    I'd argue that a <?php= shortcut or something similar would help "split
    the difference" between the ugliness of the long version and the need to
    not break php every time an xml declaration pops up in a file. Even
    gettext has a nice _() function shortcut which is less typing than echo
    $blah; in every php tag set, and then you wouldn't be fighting with the
    potential breakage. The argument that if some new syntax only goes into
    5.3, people can't use it doesn't really hold water here because you
    wouldn't be able to rely on flipping the short_tags switch before 5.3
    either.

    I can see both sides of the story, and really don't have a preference -
    I'm curious as to the opinions of someone OTHER than Marcus, Stas,
    Pierre and Jani ;)
    There are a bunch of factors here. In the end it comes down to the
    purists vs. the pragmatists. You all know where I fall on that one. <?php
    is for the purists and <? and <?= still exists for the pragmatists.

    Now, someone mentioned <?php= which I am completely against as it breaks
    the purist side. A PI tag is defined as <?<label><whitespace> and I am
    pretty sure the PI label names can't contain '='. <?php was added and
    adopted in order to be correct, let's not break that correctness.

    Most of the arguments I have seen are basically saying <? is evil and it
    shouldn't even exist, but that isn't the current question. It does exist,
    and we aren't removing it, so the only real argument here is the WTF
    factor introduced by code that is able to enable or disable these tags on
    the fly. That's the one and only valid argument I have seen. Whether or
    not PHP code can be validated with xmllint and whether or not <? is valid
    xml, which it obviously isn't, is completely beside the point. We all
    know that when you use <? you are not XML-compliant. And for the vast
    majority that's ok. XHTML is dead because IE, which is unfortunately the
    dominant browser has never and never will support XHTML. Yes, you can
    hack it and serve up XHTML with an HTML mime type and apply various hacks
    to sort of almost maybe sometimes get it to work in IE, but nobody who
    does any serious web development uses XHTML for sites that have wide
    audiences.

    So, we are down to a very simple decision. Does the added WTF factor of
    dynamically changing short_open_tags outweigh the benefits to the folks
    using <?-based templates?

    My view is that people want templating. As much as I hate the concept,
    and have always been very vocal about that, people want simpler templating
    tags. They will even go as far as parsing files and generating PHP code
    on every single request in order to use {blah} instead of <?php blah() ?>.
    The fact that people are willing to take an order of magnitude performance
    hit for syntactic sugar is baffling to me, but just look at all the
    templating systems out there. And yes, I know there are other reasons to
    use templating such as restricting the feature set for untrusted template
    writers, etc. but you'd be surprised how many people just want to type
    less and have their tags be prettier. Getting these folks to switch to
    <?blah()?> is a win for performance and sanity in my book. Yes, it isn't
    a full victory, but it is still a win.

    In order for a templating system to use <? they have to have
    short_open_tag on for the entire system. By allowing them to only apply
    the short_open_tags to certain parts of their code it means that they will
    write correct <?php business logic and only use the short_open_tags for
    the actual included template files. Again, not a full victory, but a win
    for us in the sense that the actual PHP code in their application will be
    using <?php everywhere. They can't get lazy and use short_tags in their
    business logic because it won't work due to limiting the short_open_tags
    to just the templates.

    I recognize the WTF factor of dynamically changing the setting, but
    frankly since it can already be changed per-dir from one request to the
    next on the same server, I really don't see the incremental WTF factor as
    being very high.

    Consider the fact that:

    <?php
    virtual('templates/main.php');
    ?>

    and

    <?php
    ini_set('short_open_tag',true);
    include 'templates/main.php';
    ini_set('short_open_tag',false);
    ?>

    Will actually do about the same thing in the sense that the top-level
    script can run with short_open_tag turned off and the main.php script can
    run with short_open_tag enabled. The first version requires that you
    configure your Apache to enable short_open_tag for the templates/
    directory, while the second lets you do it from the PHP level. The first
    suffers from being extremely slow and it isn't obvious that scripts in
    templates/ operate under different rules. The second is much faster and
    it is more obvious what is happening.
    Ok here's a stupid suggestion.

    Is it at all possible to turn off (for all time) short_tags in php.ini BUT
    allow scripts that want to use short-tags to explicitly use them via
    ini_set()?

    It might mean a lot of re-thinking... and yes I do know it's not currently
    an option :)

    - Steph
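
    A minimal sketch of the ini_set() variant quoted above, wrapped in a helper
    so the previous value is always restored. The name include_with_short_tags
    is hypothetical. The switch only takes effect on a build where
    short_open_tag can be changed at runtime (the patch under discussion); where
    the setting is per-directory only, ini_set() returns false and the template
    is parsed under whatever setting is already active.

```php
<?php
// Hypothetical helper, not part of PHP: render a template with short tags
// switched on for the duration of the include, then restore the old value.
// Assumption: short_open_tag can be changed at runtime. Where it cannot,
// ini_set() returns false and nothing is changed.
function include_with_short_tags(string $file): string
{
    $previous = ini_get('short_open_tag');
    $changed = ini_set('short_open_tag', '1') !== false;

    ob_start();
    try {
        include $file;
    } finally {
        // Always collect the buffer and put the old setting back,
        // even if the template throws.
        $output = ob_get_clean();
        if ($changed) {
            ini_set('short_open_tag', (string) $previous);
        }
    }
    return $output;
}
```

    Keeping the set/include/restore sequence in one place means a template
    cannot leave the setting switched on for the rest of the request.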
  • Lester Caine at Mar 22, 2008 at 8:13 am

    Rasmus Lerdorf wrote:
    There are a bunch of factors here. In the end it comes down to the
    purists vs. the pragmatists. You all know where I fall on that one.
    <?php is for the purists and <? and <?= still exists for the pragmatists.

    Now, someone mentioned <?php= which I am completely against as it breaks
    the purist side. A PI tag is defined as <?<label><whitespace> and I am
    pretty sure the PI label names can't contain '='. <?php was added and
    adopted in order to be correct, let's not break that correctness.

    Most of the arguments I have seen are basically saying <? is evil and it
    shouldn't even exist, but that isn't the current question. It does
    exist, and we aren't removing it, so the only real argument here is the
    WTF factor introduced by code that is able to enable or disable these
    tags on the fly. That's the one and only valid argument I have seen.
    I think I'm a purist pragmatist ;)

    I HAVE been moving to <?php but I've still got a lot of code that is <?.
    Having just taken a close look at the differences, the main thing is that all
    the old code has lots of in line html and adding 'php' everywhere ( actually
    4500 hits ) will make it even more difficult to read. However *I* am the only
    person using that code, and simply remembering to switch back on short tags on
    a new installation allows the old stuff to work.

    All my new work is going via bitweaver and I was a little surprised to find
    only 10 <? entries in the whole of the code base, since it's using smarty to
    pump out the html. And most of those seem to be either mistakes - '<? php' or
    are actually in SQL - and wrong :)

    An hour ago I would have said I needed <? but now I'm more than happy that
    since *I* am being dragged the right way by the other bitweaver members - <?
    will become a thing of the past for me. HOWEVER quick and dirty pages will no
    doubt come out of the shop here simply because there are so many quite complex
    pages in the archive and I'm not sure a global convert - while easy to do - is
    appropriate.

    Bottom line. Global does not work for me - yet, directory level does and I
    can't see any logical reason now for switching on and off at will. If it's in
    a short tag directory that will do for me - now I just need to actually do that :)

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://home.lsces.co.uk/lsces/wiki/?page=contact
    L.S.Caine Electronic Services - http://home.lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk//
    Firebird - http://www.firebirdsql.org/index.php
  • Stanislav Malyshev at Mar 21, 2008 at 8:14 pm

    Nobody can set memory_limit in a script during runtime. AFAICT.
    Why? It's INI_ALL. So is, for example, include_path.
    Short tags are language SYNTAX issue. That's why it's different.
    You don't get any plain error if they're "on" and something doesn't work.
    It just doesn't work or misbehaves.
    In 99.999% of cases you'd get parse error on first <?xml, i.e. very
    first line of the file. I have yet to see any other context where short
    tags wouldn't work. So I don't see - what exactly would "not work or
    misbehave" without giving an error? Can you produce any example of
    application or other real code that would silently misbehave with short
    tags on but behave OK with short tags off?
    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Steph Fox at Mar 21, 2008 at 9:02 pm
    Hi Stas,
    In 99.999% of cases you'd get parse error on first <?xml, i.e. very first
    line of the file. I have yet to see any other context where short tags
    wouldn't work. So I don't see - what exactly would "not work or misbehave"
    without giving an error? Can you produce any example of application or
    other real code that would silently misbehave with short tags on but
    behave OK with short tags off?
    I had at least five PHP/XML applications that did just that during the PHP 5
    coding contest at Zend back in 5.0 days... Zeev asked me at the time to
    explain it more fully, and I sincerely regret never having made the time to
    look into it properly.

    And no I don't still have those applications on my laptop. Sorry. But it's
    as a result of working on that contest that I'm against short-tags full
    stop - I still think they should've been deprecated from 5.0 on.

    Not good evidence I know, but that's my tuppence-worth on the subject.

    - Steph
  • Igor Feghali at Mar 24, 2008 at 2:09 am

    On Fri, Mar 21, 2008 at 5:14 PM, Stanislav Malyshev wrote:
    Can you produce any example of
    application or other real code that would silently misbehave with short
    tags on but behave OK with short tags off?
    Embedding PHP in a SVG (XML) file to generate a batch of images with
    small differences. In my case web buttons with different labels and
    icons. This way I have only one SVG file with some PHP code inside (a
    few echos) that is included by an external PHP file. The PHP script
    loops through an array of strings (labels and absolute paths to icons)
    and saves to the filesystem as many SVGs as necessary. That saves me a
    lot of work when I have to create some more new buttons.

    I'm not saying whether short_open_tag should be banned or not, but that's
    one possible example for your request.

    best regards,
    iGor.
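
    A rough sketch of the setup described above, assuming a single SVG template
    with a few echos and a driver script that loops over labels. The template
    body, the labels and the output file names are all made up for illustration
    (icons are left out). The XML declaration is echoed from a PHP string, so
    the template parses the same way with short_open_tag on or off.

```php
<?php
// Hypothetical template body; in the setup described above it would live in
// its own file. Written to a temp file here so the sketch is self-contained.
$template = <<<'SVG'
<?php echo '<?xml version="1.0" encoding="UTF-8"?>', "\n" ?>
<svg xmlns="http://www.w3.org/2000/svg" width="120" height="32">
  <text x="10" y="21"><?php echo htmlspecialchars($label) ?></text>
</svg>
SVG;

// Render the template once for a given label and return the SVG source.
function render_svg(string $templateFile, string $label): string
{
    ob_start();
    include $templateFile; // $label is visible inside the template
    return ob_get_clean();
}

$templateFile = tempnam(sys_get_temp_dir(), 'btn');
file_put_contents($templateFile, $template);

// One SVG file per label (illustrative labels and paths).
$written = [];
foreach (['Save', 'Cancel', 'Help & FAQ'] as $i => $label) {
    $path = sys_get_temp_dir() . "/button_{$i}.svg";
    file_put_contents($path, render_svg($templateFile, $label));
    $written[] = $path;
}
unlink($templateFile);
```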
  • Marcus Boerger at Mar 21, 2008 at 7:48 pm
    Hello Stanislav,

    Friday, March 21, 2008, 8:37:18 PM, you wrote:
    I'd rather see <?php= than having this whole "short_open_tag" thing at all.
    Does <?php= work? I thought the echo shortcut works only with short tags.
    <?php= is not much worse than <?= so it'd be OK with me. Downside would
    be template systems couldn't use it until 5.3 is widely deployed - which
    means no template system can use it as standard for about 2-3 years at
    least. Unless we put <?php= in 5.2, which would make me a happy camper,
    but might be a trouble for others.
    The problem with '<?php=' is that it still doesn't work with XML tools.
    However the '<?echo' I mentioned would work. We could also go for something
    like '<?phpecho'. I for one would really appreciate it. And I would not
    mind adding that to 5.2 if that makes you happy. But I care too little about
    it to need it available tomorrow.

    Best regards,
    Marcus
  • Stanislav Malyshev at Mar 21, 2008 at 8:08 pm

    However the '<?echo' I mentioned would work. We could also go for something
    like '<?phpecho'. I for one would really appreciate it. And I would not
    <?phpecho is too long. Really, saving one space here isn't worth a
    trouble. If we had something short and nice like <?= that'd be good and
    would make PHP templates look clean, but doing <?phpecho instead of
    <?php echo - how does it make any real difference?
    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Marcus Boerger at Mar 21, 2008 at 8:16 pm
    Hello Stanislav,

    Friday, March 21, 2008, 9:08:02 PM, you wrote:
    However the '<?echo' I mentioned would work. We could also go for something
    like '<?phpecho'. I for one would really appreciate it. And I would not
    <?phpecho is too long. Really, saving one space here isn't worth a
    trouble. If we had something short and nice like <?= that'd be good and
    would make PHP templates look clean, but doing <?phpecho instead of
    <?php echo - how does it make any real difference?
    The thing is that <?php echo would require a ; while <?phpecho wouldn't.
    And if your only argument is saving a few keystrokes then we should really
    get rid of short open tags completely. And definitely not make their
    use easier.


    Best regards,
    Marcus
  • Stanislav Malyshev at Mar 21, 2008 at 8:30 pm

    The thing is that <?php echo would require a ; while <?phpecho wouldn't.
    And if your only argument is saving a few keystrokes then we should really
    get rid of short open tags completely. And definitely not make their
    use easier.
    It is easier to use templates with <?= ?> than with full PHP syntax,
    they look cleaner. Just look how many other templates use "tag" syntax -
    either {SOMETHING} or <%= SOMETHING %> or something like that. Do you think
    that's because they are too stupid or too lazy - or because they listen to
    the users? ASP, JSP and Ruby all have <%=, does it teach us anything about
    people liking short tags?

    Look how many examples use short syntax - why is that? Because it looks
    better and is easier to read. And the only objection to that is - it's not
    valid XML. OK, it is not valid XML - why should it be? Why should PHP
    programmer work harder and PHP code look worse than Ruby, JSP or ASP -
    just because of arbitrary requirement of XML compliance?
    --
    Stanislav Malyshev, Zend Software Architect
    stas@zend.com http://www.zend.com/
    (408)253-8829 MSN: stas@zend.com
  • Johannes Schlüter at Mar 22, 2008 at 10:22 pm
    Hi,
    On Fri, 2008-03-21 at 21:13 +0100, Marcus Boerger wrote:
    Hello Stanislav,

    Friday, March 21, 2008, 9:08:02 PM, you wrote:
    However the '<?echo' I mentioned would work. We could also go for something
    like '<?phpecho'. I for one would really appreciate it. And I would not
    <?phpecho is too long. Really, saving one space here isn't worth a
    trouble. If we had something short and nice like <?= that'd be good and
    would make PHP templates look clean, but doing <?phpecho instead of
    <?php echo - how does it make any real difference?
    The thing is that <?php echo would require a ; while <?phpecho wouldn't.
    And if your only argument is saving a few keystrokes then we should really
    get rid of short open tags completely. And definitely not make their
    use easier.
    You don't need a ; in front of the ?> therefore <?php echo $foo ?>
    works, so the only thing "<?phpecho" saves in fact is one character.

    The point about <?= is that it is _very_ short. <?=$foo?> can work quite
    nice in simple templating stuff ("PHP is a templating language")

    I also think that there aren't many people who preprocess "PHP Hypertext
    preprocessor" files with XML-Tools, and if they do they won't use <?.
    The XML issue I see there is the conflict with XML-PIs like <?xml which
    starts PHP's processing. See [1] as a starting point for that. Already
    the second result there [2] shows a real life example where the patch
    Stas committed might be useful. In a file called "header.php" - which
    certainly isn't the main script - the developer made sure that his
    script works with both settings, so on the one hand he has to print the
    "<?xml" PI using PHP, so there's no problem with short open tags, but
    for outputting variables on the other hand he uses the long tags since
    one can't rely on the fact that short tags are working. Using the patch
    Stas proposed (and committed) this might be made nicer. I could look for
    my own code where I had the same problem inside more complex stuff, but I
    guess that random example, which took me <5 minutes to find shows the
    possible benefit.

    So as long as we have open tags I'd like Stas' patch. The only downside
    I see are confused users ("Why doesn't that work for the script I have
    that in") but well, we have such users for all stuff we do ;-)

    Now we have the big issue: Do we want to have short open tags forever?
    Well, without too much thinking, my idea would be to drop "<?" but keep
    "<?=". "<?=" shouldn't conflict with <?xml tags in the same file, and it
    makes templating with PHP simple. On the other hand, when you are not
    echoing stuff you already have to write more, so the four additional
    characters ("php ") don't matter that much, especially as every decent
    editor/IDE should be able to give you a completion for that, if you want.
    But such a change in the language should go with other breaks to 6.0 or
    so.

    johannes

    [1] <http://google.com/codesearch?hl=en&lr=&q=lang%3Aphp+%3C%5C%3Fphp
    +echo+%5B%22%27%5D%3C%5C%3Fxml&btnG=Search>
    [2] <http://google.com/codesearch?hl=en&q=+lang:php+%3C%5C%3Fphp+echo+%
    5B%22%27%5D%3C%5C%3Fxml%22
    +show:r_a97k72yKQ:0ofXUwlCDds:HkOUVl3hPgM&sa=N&cd=2&ct=rc&cs_p=http://managedtasks.com/wpthemes/gespaa_v2.zip&cs_f=gespaa_v2/header.php#first>
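
    A sketch of the header.php pattern described above, assuming a single
    hypothetical template variable $title and a made-up wrapper function
    render_header. The XML declaration is printed from a PHP string and output
    uses the long open tag, so the file behaves the same under either
    short_open_tag setting. It also shows that no semicolon is needed before
    the closing tag.

```php
<?php
// Illustrative only: render_header() and $title are not from the thread.
function render_header(string $title): string
{
    ob_start();
    // The XML declaration comes from a string literal, so the parser never
    // sees a raw "<?xml" open tag in the file.
    echo '<?xml version="1.0" encoding="UTF-8"?>', "\n";
    // Long open tag for output; the semicolon before the close tag is optional.
    ?>
<title><?php echo htmlspecialchars($title) ?></title>
<?php
    return ob_get_clean();
}
```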
  • Stefan Walk at Mar 22, 2008 at 10:51 pm

    Johannes Schlüter schrieb:
    Now we have the big issue: Do we want to have short open tags forever?
    Well, without tooo much thinking my idea would be to drop "<?" but keep
    "<?=", "<?=" shouldn't conflict with <?xml tags in the same file, but
    make it simple to do templating using PHP, on the other hand when not
    echo'ing stuff you already have to write more soo the four additional
    characters ("php ") don't matter that much - especially every decent
    editor/ide should be able to give you a completion on that, if you want.
    <ul>
    <? foreach ($items as $item): ?>
    <li><?=$item?></li>
    <? endforeach ?>
    </ul>

    you can have short stuff without outputting stuff too.

    Regards,
    Stefan

    P.S.
    This patch will be the most useful addition to PHP that has been added
    for a long time, provided the purists and
    i-don't-know-how-to-write-testcases-the-right-way folks don't succeed in
    shooting it down.
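
    For comparison, a sketch of the same loop written with long tags only,
    which runs regardless of the short_open_tag setting. The $items data and
    the output buffering around the markup are assumptions added to make the
    fragment self-contained.

```php
<?php
$items = ['one', 'two']; // hypothetical data
ob_start();
?>
<ul>
<?php foreach ($items as $item): ?>
<li><?php echo $item ?></li>
<?php endforeach ?>
</ul>
<?php
// Collect the rendered markup.
$html = ob_get_clean();
```

    The cost relative to the short form is four extra characters per control
    tag and nine per echo, which is the trade-off being argued over here.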
  • Gregory Beaver at Mar 22, 2008 at 11:13 pm

    Stefan Walk wrote:
    Johannes Schlüter schrieb:
    Now we have the big issue: Do we want to have short open tags forever?
    Well, without tooo much thinking my idea would be to drop "<?" but keep
    "<?=", "<?=" shouldn't conflict with <?xml tags in the same file, but
    make it simple to do templating using PHP, on the other hand when not
    echo'ing stuff you already have to write more soo the four additional
    characters ("php ") don't matter that much - especially every decent
    editor/ide should be able to give you a completion on that, if you want.
    <ul>
    <? foreach ($items as $item): ?>
    <li><?=$item?></li>
    <? endforeach ?>
    </ul>

    you can have short stuff without outputting stuff too.
    I see many good reasons to disable short open tags. However, there is a
    compromise that is better from all vantage points:

    <ul>
    <?p foreach ($items as $item): ?>
    <li><?=$item ?></li>
    <?p endforeach ?>
    </ul>

    <?p is a valid PI and would prevent <?xml from being parsed as PHP.
    Also legal is <?: as PI's can start with or contain : or _. Honestly
    though, this is not so important to me, my primary concern is the
    conflict with <?xml. I mention it out of deference for those who
    actually do care about writing scripts that are xml-compliant for some
    strange reason :). Also possible and relatively simple to implement
    would be to allow an = at the start of an expression to alias to T_ECHO,
    so that <?p =$item ?> would work like <?=$item?>. This is, however,
    very perl-ish, so I mention it only as a possible way to preserve that
    aspect of short tags for template usage. God forbid we start seeing
    regular scripts using "=" to mean "echo" :).

    As a note, I use exclusively <?php in my templates and also use <?xml to
    generate xhtml, so I am very much against per-script enabling of short
    tag <? for the annoyance it would introduce of forcing an ini_set() at
    the top of each template and the bottom as well to be a good citizen and
    restore the old value.

    Greg
  • Marcus Boerger at Mar 22, 2008 at 11:59 pm
    Hello Gregory,

    Sunday, March 23, 2008, 12:13:20 AM, you wrote:
    Stefan Walk wrote:
    Johannes Schlüter schrieb:
    Now we have the big issue: Do we want to have short open tags forever?
    Well, without tooo much thinking my idea would be to drop "<?" but keep
    "<?=", "<?=" shouldn't conflict with <?xml tags in the same file, but
    make it simple to do templating using PHP, on the other hand when not
    echo'ing stuff you already have to write more soo the four additional
    characters ("php ") don't matter that much - especially every decent
    editor/ide should be able to give you a completion on that, if you want.
    <ul>
    <? foreach ($items as $item): ?>
    <li><?=$item?></li>
    <? endforeach ?>
    </ul>

    you can have short stuff without outputting stuff too.
    I see many good reasons to disable short open tags. However, there is a
    compromise that is better from all vantage points:
    <ul>
    <?p foreach ($items as $item): ?>
    <li><?=$item ?></li>
    <?p endforeach ?>
    </ul>
    <?p is a valid PI and would prevent <?xml from being parsed as PHP.
    Also legal is <?: as PI's can start with or contain : or _. Honestly
    though, this is not so important to me, my primary concern is the
    conflict with <?xml. I mention it out of deference for those who
    actually do care about writing scripts that are xml-compliant for some
    strange reason :). Also possible and relatively simple to implement
    would be to allow an = at the start of an expression to alias to T_ECHO,
    so that <?p =$item ?> would work like <?=$item?>. This is, however,
    very perl-ish, so I mention it only as a possible way to preserve that
    aspect of short tags for template usage. God forbid we start seeing
    regular scripts using "=" to mean "echo" :).
    As a note, I use exclusively <?php in my templates and also use <?xml to
    generate xhtml, so I am very much against per-script enabling of short
    tag <? for the annoyance it would introduce of forcing an ini_set() at
    the top of each template and the bottom as well to be a good citizen and
    restore the old value.
    To me this sounds more like we are heading towards '<?p' as short for
    '<?php' and '<?:' as a working, erm, conflict-free form of '<?='.

    Best regards,
    Marcus
  • Jared Williams at Mar 23, 2008 at 1:21 am

    -----Original Message-----
    From: Stefan Walk
    Sent: 22 March 2008 22:52
    To: 'PHP Internals'
    Subject: Re: [PHP-DEV] short_open_tag

    Johannes Schlüter schrieb:
    Now we have the big issue: Do we want to have short open
    tags forever?
    Well, without tooo much thinking my idea would be to drop "<?" but
    keep "<?=", "<?=" shouldn't conflict with <?xml tags in the
    same file,
    but make it simple to do templating using PHP, on the other hand when
    not echo'ing stuff you already have to write more soo the four
    additional characters ("php ") don't matter that much - especially
    every decent editor/ide should be able to give you a
    completion on that, if you want.

    <ul>
    <? foreach ($items as $item): ?>
    <li><?=$item?></li>
    <? endforeach ?>
    </ul>
    The problem I have with <?= is it doesn’t really help simplify templating.
    You end up still having to explicitly encode the output with
    htmlspecialchars() or whatever.

    <ul>
    <? foreach ($items as $item): ?>
    <li><?=htmlspecialchars($item)?></li>
    <? endforeach ?>
    </ul>

    So <?= ends up not really that much of an improvement over <?php echo
    htmlspecialchars($item); ?> imo.

    J
  • Stefan Walk at Mar 23, 2008 at 11:08 am

    Jared Williams schrieb:
    <ul>
    <? foreach ($items as $item): ?>
    <li><?=htmlspecialchars($item)?></li>
    <? endforeach ?>
    </ul>
    Well, it's the same as with the "but I can't validate my PHP source with
    xmllint" folks: you're doing it at the wrong point. Escaping should
    happen at the point where you assign the var as a template var (in my
    small template class: $tpl->assign('items', $some_data) will escape all
    "leaves" in the data $some_data). This way you don't have to type it
    every time, you don't have to read it every time and - best of all - you
    can't forget to do it, so introducing an XSS vulnerability is much less
    likely.

    Regards,
    Stefan
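
    A minimal sketch of the escape-on-assign idea described above. This is not
    Stefan's class: the Template name, the render() method and the use of
    htmlspecialchars() with ENT_QUOTES are assumptions. Only assign() escaping
    every "leaf" of the data comes from the message.

```php
<?php
// Hypothetical minimal template class, for illustration only.
class Template
{
    private $vars = [];

    // Escape every string leaf once, at assignment time, so templates can
    // echo values directly.
    public function assign(string $name, $value): void
    {
        if (is_array($value)) {
            array_walk_recursive($value, function (&$leaf) {
                if (is_string($leaf)) {
                    $leaf = htmlspecialchars($leaf, ENT_QUOTES, 'UTF-8');
                }
            });
        } elseif (is_string($value)) {
            $value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
        }
        $this->vars[$name] = $value;
    }

    // Include the template file with the assigned variables in scope.
    public function render(string $file): string
    {
        extract($this->vars, EXTR_SKIP); // never overwrite $file or $this
        ob_start();
        include $file;
        return ob_get_clean();
    }
}
```

    The template then echoes $items entries as they are. The price is that the
    escaped copy is held alongside the original data, and that values are only
    safe for an HTML context.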
  • Jared Williams at Mar 23, 2008 at 12:57 pm

    A lot of people don't use templates, just raw PHP. So having a short tag
    escaping would decrease XSS vulnerabilities.

    I don't understand why we need to essentially duplicate all the variables just
    to provide proper escaping.

    Jared
  • Marcus Boerger at Mar 23, 2008 at 1:25 pm
    Hello Jared,

    Sunday, March 23, 2008, 1:57:20 PM, you wrote:

    -----Original Message-----
    From: Stefan Walk
    Sent: 23 March 2008 11:08
    To: Jared Williams
    Cc: 'PHP Internals'
    Subject: Re: [PHP-DEV] short_open_tag

    Jared Williams schrieb:
    <ul>
    <? foreach ($items as $item): ?>
    <li><?=htmlspecialchars($item)?></li>
    <? endforeach ?>
    </ul>
    Well, it's the same as the "but i can't validate my php
    source with xmllint" folks: You're doing it at the wrong
    point. Escaping should happen at the point where you assign
    the var as a template var (in my small template class:
    $tpl->assign('items', $some_data) will escape all "leaves" in
    the data $some_data). This way you don't have to type it
    every time, you don't have to read it every time and - best of
    all - you can't forget to do it, so introducing an XSS
    vulnerability is much less likely.

    Regards,
    Stefan
    A lot of people don't use templates, just raw PHP. So having a short tag
    that escapes its output would decrease XSS vulnerabilities.
    I don't understand why we need to essentially duplicate all the variables just
    to provide proper escaping.
    Same here. PHP itself is the templating system, and we should focus on
    that, because that is what the vast majority of users rely on. However, we
    shouldn't make other stuff harder than necessary. That said, I more and more
    think we need to revisit our tags. XML allows ':' and '_' in names. And as
    Jared just wrote, one of the things very often done in short output is HTML
    escaping. So I'd like to see the following:
    <?php    just as now, of course
    <?:      just like <?= but XML compliant (I have seen other people mention <?p)
    <?phtml  just like <?php echo or <?= but doing HTML escaping


    Best regards,
    Marcus
  • Rasmus Lerdorf at Mar 23, 2008 at 2:33 pm

    Jared Williams wrote:

    -----Original Message-----
    From: Stefan Walk
    Sent: 23 March 2008 11:08
    To: Jared Williams
    Cc: 'PHP Internals'
    Subject: Re: [PHP-DEV] short_open_tag

    Jared Williams schrieb:
    <ul>
    <? foreach ($items as $item): ?>
    <li><?=htmlspecialchars($item)?></li>
    <? endforeach ?>
    </ul>
    Well, it's the same as the "but i can't validate my php
    source with xmllint" folks: You're doing it at the wrong
    point. Escaping should happen at the point where you assign
    the var as a template var (in my small template class:
    $tpl->assign('items', $some_data) will escape all "leaves" in
    the data $some_data). This way you don't have to type it
    every time, you don't have to read it every time and - best of
    all - you can't forget to do it, so introducing an XSS
    vulnerability is much less likely.

    Regards,
    Stefan
    A lot of people don't use templates, just raw PHP. So having a short tag
    that escapes its output would decrease XSS vulnerabilities.

    I don't understand why we need to essentially duplicate all the variables just
    to provide proper escaping.
    This is what the filter extension is for. You should be working with
    escaped data by default and only poke a hole in your data firewall in
    the few places where you need to work with the raw data. Doing it the
    other way around is going to lead to all sorts of security issues.

    -Rasmus
  • Soenke Ruempler at Mar 23, 2008 at 3:00 pm
    Hi Rasmus,
    On 03/23/2008 03:32 PM, Rasmus Lerdorf wrote:

    This is what the filter extension is for. You should be working with
    escaped data by default and only poke a hole in your data firewall in
    the few places where you need to work with the raw data. Doing it the
    other way around is going to lead to all sorts of security issues.
    Mhm. Isn't the right paradigm to prepare variables at the time they
    are passed into subsystems (SQL, shell, HTML etc.)? So what do you mean
    by "escaped data" here? HTML/XML escaped, SQL escaped (for which SQL
    system and which encoding)? Sounds a bit like magic_quotes reloaded *hides*

    IMHO a short syntax for echoing data html-escaped would be very helpful
    and a great extension to the language. If people (esp. newbies) are
    pointed to this one instead of echo/print for echo'ing html parts it
    could lead to a big win for php application security.

    -soenke
  • Rasmus Lerdorf at Mar 23, 2008 at 3:14 pm

    Soenke Ruempler wrote:
    Hi Rasmus,
    On 03/23/2008 03:32 PM, Rasmus Lerdorf wrote:

    This is what the filter extension is for. You should be working with
    escaped data by default and only poke a hole in your data firewall in
    the few places where you need to work with the raw data. Doing it the
    other way around is going to lead to all sorts of security issues.
    Mhm. Isn't the right paradigm to prepare variables at the time they
    are passed into subsystems (sql, shell, html etc.)? So what do you mean
    with "escaped data" here? html/xml escaped, sql escaped (which sql
    system and which encoding?). Sounds a bit like magic_quotes reloaded
    *hides*
    It is, but it is magic_quotes done right. You apply a really strict
    filter that makes your data safe for display and your backend by
    default. The only place you can reliably do this is at the point
    the data enters your system. Once it is in, having to remember to apply
    a filter before you use the data will never work. You might remember to
    do it 99.99% of the time, but that doesn't help you and you might as
    well not do it at all. A bit like a condom with just one little hole.

    -Rasmus
  • Soenke Ruempler at Mar 23, 2008 at 3:26 pm
    Hi Rasmus,
    On 03/23/2008 04:14 PM, Rasmus Lerdorf wrote:

    It is, but it is magic_quotes done right. You apply a really strict
    filter that makes your data safe for display and your backend by
    default. The only place you can reliably do this is at the point
    the data enters your system. Once it is in, having to remember to apply
    a filter before you use the data will never work. You might remember to
    do it 99.99% of the time, but that doesn't help you and you might as
    well not do it at all. A bit like a condom with just one little hole.
    Well, my point is: at the stage where user-generated data enters your
    program you don't know which subsystem to prepare it for. Maybe for one,
    maybe for several (it's common for user input to be written to the SQL
    backend first and then displayed again).

    So if everything is HTML escaped by the filter extension and I want to
    put it into SQL, I have to remember "ah, all my input is escaped for HTML,
    so I have to DECODE it and then prepare it to go into SQL". Now the
    question is: which is easier, more intuitive and less of a headache?

    I guess the real challenge is not to have PHP do as much magic as
    possible but

    a) to give users a simple way of escaping their data for the particular
    subsystem
    b) to point them to the right solution in the manual (addslashes is
    bad for SQL, parameter binding / prepared statements are nice; echo is
    bad for HTML output, htmlspecialchars or a newly introduced short tag is
    nice).

    -soenke
  • Rasmus Lerdorf at Mar 23, 2008 at 3:49 pm

    Soenke Ruempler wrote:
    Hi Rasmus,
    On 03/23/2008 04:14 PM, Rasmus Lerdorf wrote:

    It is, but it is magic_quotes done right. You apply a really strict
    filter that makes your data safe for display and your backend by
    default. The only place you can reliably do this is at the point
    the data enters your system. Once it is in, having to remember to
    apply a filter before you use the data will never work. You might
    remember to do it 99.99% of the time, but that doesn't help you and
    you might as well not do it at all. A bit like a condom with just one
    little hole.
    Well, my point is: at the stage where user-generated data enters your
    program you don't know for which subsystem to prepare it. Maybe for one,
    maybe for more of them (it's a common case that user input is first
    written to SQL backend and then displayed again).

    So if everything is html escaped with the filter extension and I wanna
    put it into SQL I have to remember "ah all my input is escaped for html
    so I have to DECODE it and then prepare it to go into SQL". Now the
    question is: What's easier, more intuitive and less of a headache?
    No, that's the point. You never ever decode data. If you are using any
    sort of decode function, chances are your application is insecure. The
    filter extension keeps a copy of the raw data internally. The default
    filter you apply will filter for all the backends you use.
    htmlspecialchars with all single and double quotes converted as well,
    takes care of most commonly used stuff. When you need the raw data, or
    the data filtered in a different way, you ask the filter extension to
    re-filter from the stored raw data, you don't decode.

    -Rasmus
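
A rough model of the "data firewall" described in this post, written as plain PHP so it runs outside a web request. The DataFirewall class and its method names are hypothetical; the real filter extension exposes this behaviour through filter_input() and the filter.default ini setting rather than through a class like this. Only filter_var() and the FILTER_* constants are real.

```php
<?php
// Hypothetical stand-in for the behaviour described in the thread:
// keep the raw input privately, hand out strictly filtered values by default.
final class DataFirewall
{
    private $raw;

    public function __construct(array $input)
    {
        $this->raw = $input;
    }

    // Default view: quotes, <, > and & are encoded as numeric entities.
    public function get(string $key): ?string
    {
        return $this->refilter($key, FILTER_SANITIZE_SPECIAL_CHARS);
    }

    // "Poking a hole": re-filter from the stored raw value, never decode.
    public function refilter(string $key, int $filter): ?string
    {
        if (!isset($this->raw[$key])) {
            return null;
        }
        $value = filter_var($this->raw[$key], $filter);
        return $value === false ? null : $value;
    }
}

$in = new DataFirewall(['name' => "O'Reilly <b>"]);
```

Here $in->get('name') yields the strictly filtered form, while $in->refilter('name', FILTER_UNSAFE_RAW) is the explicit, greppable hole that returns the original bytes.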
  • Wietse Venema at Mar 23, 2008 at 3:41 pm

    Rasmus Lerdorf:
    Soenke Ruempler wrote:
    Hi Rasmus,
    On 03/23/2008 03:32 PM, Rasmus Lerdorf wrote:

    This is what the filter extension is for. You should be working with
    escaped data by default and only poke a hole in your data firewall in
    the few places where you need to work with the raw data. Doing it the
    other way around is going to lead to all sorts of security issues.
    Mhm. Isn't the right paradigm to prepare variables at the time they
    are passed into subsystems (sql, shell, html etc.)? So what do you mean
    with "escaped data" here? html/xml escaped, sql escaped (which sql
    system and which encoding?). Sounds a bit like magic_quotes reloaded
    *hides*
    It is, but it is magic_quotes done right. You apply a really strict
    filter that makes your data safe for display and your backend by
    default. The only place you can reliably do this is at the point
    the data enters your system.
    Input filtering has valid uses, but protecting html/sql/shell/etc.
    is not among them. Legitimate input like O'Reilly requires different
    treatments depending on html/sql/shell/etc. context. It would be
    incorrect to always insert a \, it would be incorrect to always
    remove the ', and it would be incorrect to always reject the input.
    Once it is in, having to remember to apply
    a filter before you use the data will never work. You might remember to
    do it 99.99% of the time, but that doesn't help you and you might as
    well not do it at all. A bit like a condom with just one little hole.
    Data flow control (a.k.a. taint support) can detect when output
    isn't converted with the proper conversion function. This can be
    done in reporting mode (my approach) or it can be done in "automatic
    fixing" mode (other people). These different approaches make
    different trade-offs between programmer effort and system overhead,
    and avoid the data corruption that input filtering would introduce.

    Wietse
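
The point that one legitimate input needs different treatment per context can be seen with two built-in functions. This is only an illustrative sketch; the variable names are made up, and SQL is left out because it is better handled by parameter binding than by string escaping.

```php
<?php
// One legitimate input, two output contexts, two different encodings.
$name = "O'Reilly";

// HTML text or attribute value: the quote becomes a character reference.
$forHtml = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// Single shell argument: the value is wrapped in platform-specific quoting.
$forShell = escapeshellarg($name);

// Neither result is valid in the other context, and neither equals the raw input.
$allDifferent = $forHtml !== $forShell
    && $forHtml !== $name
    && $forShell !== $name;
```

No single transformation applied at input time can produce both results, which is why the conversion has to be chosen at the point of output.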
  • Rasmus Lerdorf at Mar 23, 2008 at 3:59 pm

    Wietse Venema wrote:
    Rasmus Lerdorf:
    Soenke Ruempler wrote:
    Hi Rasmus,
    On 03/23/2008 03:32 PM, Rasmus Lerdorf wrote:

    This is what the filter extension is for. You should be working with
    escaped data by default and only poke a hole in your data firewall in
    the few places where you need to work with the raw data. Doing it the
    other way around is going to lead to all sorts of security issues.
    Mhm. Isn't the right paradigm to prepare variables at the time they
    are passed into subsystems (sql, shell, html etc.)? So what do you mean
    with "escaped data" here? html/xml escaped, sql escaped (which sql
    system and which encoding?). Sounds a bit like magic_quotes reloaded
    *hides*
    It is, but it is magic_quotes done right. You apply a really strict
    filter that makes your data safe for display and your backend by
    default. The only place you can reliably do this is at the point
    the data enters your system.
    Input filtering has valid uses, but protecting html/sql/shell/etc.
    is not among them. Legitimate input like O'Reilly requires different
    treatments depending on html/sql/shell/etc. context. It would be
    incorrect to always insert a \, it would be incorrect to always
    remove the ', and it would be incorrect to always reject the input.
    You can also choose to never store the raw single quote and always work
    with encoded data. Or, as I suggest, always filter it by default and in
    the places where you want the raw quote back or you want it filtered for
    a specific use, specify explicitly which filter you want to apply. It
    is the data firewall approach. Filter everything by default with an
    extremely strict filter and poke holes in your data firewall as
    necessary. It also makes it easy to audit your code because you only
    have to look at the places where you have poked a hole.
    Data flow control (a.k.a. taint support) can detect when output
    isn't converted with the proper conversion function. This can be
    done in reporting mode (my approach) or it can be done in "automatic
    fixing" mode (other people). These different approaches make
    different trade-offs between programmer effort and system overhead,
    and avoid the data corruption that input filtering would introduce.
    Having to do active checks on each use is extremely expensive. You said
    yourself you suggest only enabling this during development. The data
    firewall approach isn't actually all that different from the taint
    approach. The big win is that there is no runtime checking necessary
    and thus no performance hit.

    -Rasmus
  • Wietse Venema at Apr 1, 2008 at 9:29 pm

    Rasmus Lerdorf:
    You can also choose to never store the raw single quote and always work
    with encoded data. Or, as I suggest, always filter it by default and in
    the places where you want the raw quote back or you want it filtered for
    a specific use, specify explicitly which filter you want to apply. It
    is the data firewall approach. Filter everything by default with an
    extremely strict filter and poke holes in your data firewall as
    necessary. It also makes it easy to audit your code because you only
    have to look at the places where you have poked a hole.
    I think I have calmed down enough that I can respond to this thread.

    Unfortunately, this data firewall does not protect all interfaces.
    For example, replacing characters by &foo; does nothing for shell
    commands where & and ; are command separators. Instead of poking
    small holes, you already have one big gaping security hole.
    Data flow control (a.k.a. taint support) can detect when output
    isn't converted with the proper conversion function. This can be
    done in reporting mode (my approach) or it can be done in "automatic
    fixing" mode (other people). These different approaches make
    different trade-offs between programmer effort and system overhead,
    and avoid the data corruption that input filtering would introduce.
    Having to do active checks on each use is extremely expensive. You said
    yourself you suggest only enabling this during development. The data
    firewall approach isn't actually all that different from the taint
    approach. The big win is that there is no runtime checking necessary
    and thus no performance hit.
    The "reporting mode" (my approach) overhead is down to the 1% level,
    as the result of some very careful design and implementation choices
    that I could not anticipate when I discussed my initial proposal.

    This level of overhead is low enough that continuous run-time
    deployment becomes a realistic option. Thus, technical objections
    can now make way for religious objections...

    Wietse
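
The shell objection can be checked directly. The sketch below assumes FILTER_SANITIZE_SPECIAL_CHARS as the strict default filter, which is an assumption for illustration and not necessarily the filter used in the deployments discussed in the thread.

```php
<?php
// Entity encoding protects HTML, but the result still contains shell metacharacters.
$filtered = filter_var("O'Reilly", FILTER_SANITIZE_SPECIAL_CHARS);

// '&' and ';' are both command separators in a POSIX shell.
$stillDangerous = strpbrk($filtered, '&;') !== false;

// Quoting for the shell context has to happen separately, at the point of use.
$command = 'echo ' . escapeshellarg($filtered);
```

The filtered value is safe to print into HTML, yet pasting it unquoted into a command line would hand the shell a separator; only the escapeshellarg() call at the point of use addresses that.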
  • Rasmus Lerdorf at Mar 23, 2008 at 4:18 pm

    Stefan Walk wrote:
    Rasmus Lerdorf schrieb:
    It is, but it is magic_quotes done right. You apply a really strict
    filter that makes your data safe for display and your backend by
    default. The only place you can reliably do this is at the point
    the data enters your system. Once it is in, having to remember to
    apply a filter before you use the data will never work. You might
    remember to do it 99.99% of the time, but that doesn't help you and
    you might as well not do it at all. A bit like a condom with just one
    little hole.

    -Rasmus
    No, it's not "done right". To work for all cases, your "default filter"
    would basically have to return an empty string all the time (if you say
    "nonsense": does the default filter strip "From" from the start of a
    line so you can put it into an mbox?). And you don't need to do that to
    be safe, because you don't have to remember to apply a filter: you let
    the subsystem that needs the escaped data escape the data itself. So,
    when passing an arg to a MySQL query, it gets escaped the right way (by
    using prepared statements, formatted query strings ... hundreds of
    possibilities). If you pass data to the "HTML output subsection", it
    gets escaped for use in HTML. Because this is done implicitly, you never
    ever call an escaping function yourself, so there is no way to forget it.
    But you know which backends you use. I am not suggesting that PHP can
    supply a default filter that works for everyone. But I am suggesting
    that you can supply a default filter that works for the backends you
    use. The vast majority of people need data to be safe for HTML and
    MySQL/PostgreSQL. So take a default filter that makes data safe for
    these uses, throw in shell, CSS and JavaScript as well, and you have
    a really powerful default filter. Yes, there will always be other
    subsystems out there that need other filtering, in which case you
    extend your default filter to cover those, or you wall off those
    subsystems and have a secondary filter layer.

    The alternative of relying on the developer remembering to filter simply
    doesn't work. Wietse's taint mode is another approach, but it has
    performance implications.

    The data firewall approach is what I put in place at Yahoo 5+ years ago
    now. We have hundreds of applications written by thousands of
    developers and it works. Yes, there are still security issues from time
    to time, but they end up being logical flow issues that no amount of
    filtering would fix, or they stem from people applying the wrong filters
    in the wrong situations which again would happen under any system. What
    we don't see are security problems caused by developers forgetting to
    filter a specific bit of user data.

    The other thing this gives us is the ability to run 3rd-party untrusted
    apps. You only need to find the 2 or 3 places where the app needs
    something other than the default filtered data and even the most
    insecure app can be run with some semblance of security.

    -Rasmus
  • Stefan Walk at Mar 23, 2008 at 6:14 pm

    Rasmus Lerdorf schrieb:
    The alternative of relying on the developer remembering to filter simply
    doesn't work. Wietse's taint mode is another approach, but it has
    performance implications.
    As I said, when the backend does the escaping, you don't have to
    remember it.
    filtering would fix, or they stem from people applying the wrong filters
    in the wrong situations which again would happen under any system. What
    If the backend picked the escaping mechanism, *that* wouldn't happen
    (unless the backend is buggy, but that can happen either way).
    The other thing this gives us is the ability to run 3rd-party untrusted
    apps. You only need to find the 2 or 3 places where the app needs
    something other than the default filtered data and even the most
    insecure app can be run with some semblance of security.
    "Some" is the right word here. That insecure app could leak information
    from your server, write or read data to/from locations it shouldn't,
    etc. Also, I don't think it would be just 2 or 3 places. It'll be more
    like every point where it's real user input (and not form ids, hidden
    values etc), because then you have to expect almost any char that your
    filter has to strip to be safe - Mr. O'Reilly won't be amused if he's
    called OReilly, O''Reilly, O&apos;Reilly or O\'Reilly.

    Regards,
    Stefan
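
"The backend does the escaping" is what parameter binding gives you for SQL. The sketch below uses PDO with an in-memory SQLite database purely so it is self-contained; it assumes the pdo_sqlite extension is available, and the function name save_and_find_author() and the authors table are hypothetical.

```php
<?php
// The database layer receives the raw value as a bound parameter,
// so the caller never calls an escaping function.
function save_and_find_author(PDO $db, string $name): ?string
{
    $db->exec('CREATE TABLE IF NOT EXISTS authors (name TEXT)');

    $insert = $db->prepare('INSERT INTO authors (name) VALUES (:name)');
    $insert->execute([':name' => $name]);

    $select = $db->prepare('SELECT name FROM authors WHERE name = :name');
    $select->execute([':name' => $name]);
    $found = $select->fetchColumn();

    return $found === false ? null : $found;
}

// Assumption: pdo_sqlite is loaded; skip the demonstration otherwise.
if (extension_loaded('pdo_sqlite')) {
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $stored = save_and_find_author($db, "O'Reilly");
}
```

The name goes in and comes back with its apostrophe intact, and no escaping call appears in application code.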
  • Rasmus Lerdorf at Mar 23, 2008 at 6:27 pm

    Stefan Walk wrote:
    Rasmus Lerdorf schrieb:
    The alternative of relying on the developer remembering to filter
    simply doesn't work. Wietse's taint mode is another approach, but it
    has performance implications.
    As I said, when the backend does the escaping, you don't have to
    remember it.
    filtering would fix, or they stem from people applying the wrong
    filters in the wrong situations which again would happen under any
    system. What
    If the backend picked the escaping mechanism, *that* wouldn't happen
    (unless the backend is buggy, but that can happen either way).
    The other thing this gives us is the ability to run 3rd-party
    untrusted apps. You only need to find the 2 or 3 places where the app
    needs something other than the default filtered data and even the most
    insecure app can be run with some semblance of security.
    "Some" is the right word here. That insecure app could leak information
    from your server, write or read data to/from locations it shouldn't,
    etc. Also, I don't think it would be just 2 or 3 places. It'll be more
    like every point where it's real user input (and not form ids, hidden
    values etc), because then you have to expect almost any char that your
    filter has to strip to be safe - Mr. O'Reilly won't be amused if he's
    called OReilly, O''Reilly, O&apos;Reilly or O\'Reilly.
    Well, I actually have years of experience taking apps and making them
    run under my strict default filter. And it tends to not be very many
    changes, if any at all. In the O'Reilly case it gets changed to
    O&#39;Reilly which for a pure web app is fine. If all input
    consistently gets changed the same way then you can store O&#39;Reilly
    in the backend and a search will still find it since the search query
    itself will be encoded the same way. If you have non web tools working
    with the same backend data, then you may have a requirement to store it
    raw, in which case you'd need to poke a hole in your data firewall.

    -Rasmus
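
The search argument in this post, and its caveat about non-web tools, can be sketched with an array standing in for the backend. The default_filter() function is a hypothetical name, and FILTER_SANITIZE_SPECIAL_CHARS is assumed as the default filter because it produces the &#39; form mentioned above.

```php
<?php
// Hypothetical default filter applied to every piece of input, stored or searched.
function default_filter(string $value): string
{
    return filter_var($value, FILTER_SANITIZE_SPECIAL_CHARS);
}

// Stored rows were filtered on the way in...
$stored = array_map('default_filter', ["O'Reilly", 'Smith & Sons']);

// ...and so is the search term, so an exact-match lookup still succeeds.
$hit = array_search(default_filter("O'Reilly"), $stored, true);

// A tool that bypasses the filter finds no match for the raw spelling.
$miss = array_search("O'Reilly", $stored, true);
```

The second lookup is the case where, as the post says, a requirement to store raw data would mean poking a hole in the firewall.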
  • Edward Z. Yang at Mar 23, 2008 at 6:37 pm

    Rasmus Lerdorf wrote:
    Well, I actually have years of experience taking apps and making them
    run under my strict default filter. And it tends to not be very many
    changes, if any at all. In the O'Reilly case it gets changed to
    O&#39;Reilly which for a pure web app is fine. If all input
    consistently gets changed the same way then you can store O&#39;Reilly
    in the backend and a search will still find it since the search query
    itself will be encoded the same way. If you have non web tools working
    with the same backend data, then you may have a requirement to store it
    raw, in which case you'd need to poke a hole in your data firewall.
    Rasmus, I'm sure these techniques work very well in practice. However,
    it's important to note that it's still an optimization, a step down from
    an "ideal" standard which would involve keeping raw data in the
    database. In theory, the data in its purest form, with no extraneous
    escaping, would be stored. In practice, most data will be used in a web
    context and thus, as you note, escaping it as &#39; is perfectly acceptable.

    I've always advocated storing both the pure data and the escaped version
    (in a kind of cache) in the database, because if you store just the
    escaped version you don't have any easy way (besides decoding) to get
    the raw version back. Of course, this doubles the storage requirement.

    --
    Edward Z. Yang GnuPG: 0x869C48DA
    HTML Purifier <http://htmlpurifier.org> Anti-XSS Filter
    [[ 3FA8 E9A9 7385 B691 A6FC B3CB A933 BE7D 869C 48DA ]]
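
Storing both forms, as suggested in this post, might look like the following sketch. The make_record() function and the field names are hypothetical; the idea is only that the raw value stays authoritative and the escaped value is a rebuildable cache.

```php
<?php
// Hypothetical record shape: the raw value is authoritative, the escaped copy is a cache.
function make_record(string $raw): array
{
    return [
        'raw'          => $raw,
        'html_escaped' => htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'),
    ];
}

$row = make_record("O'Reilly & Associates");
```

Because the escaped column can always be regenerated from the raw one, a change of escaping rules never requires decoding, at the price of the doubled storage noted above.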
