I use a multithreaded MPM of Apache called "peruser" to host a bunch
of sites, each with its own virtual host. Each virtual host needs to
be isolated, so I'm using chroot and other measures to keep them
apart. APC was a must, as getting good throughput was important.
However, APC would make all sites share the memory containing cached
PHP files and user data, which is a major security breach. So I looked
at the source code of APC and saw that it uses mmap() to map shared
memory. If mmap() used file-backed shared memory instead, you could
a) get a persistent cache, so the child processes could be terminated
without dropping it, b) inherit the file system permissions, and
c) let chroot define where the cache lives, essentially giving every
virtual host its own APC cache. You could also use shmget(), however
System V IPC was not enabled on my kernel.
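
To illustrate the idea, here is a minimal sketch of a file-backed
shared mapping in C. It is not taken from the patch: the helper name
map_cache_file() and its parameters are made up for this example, and
error handling is reduced to returning NULL.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of APC: map `size` bytes of the file at
 * `path` as shared memory, creating the file if needed. Returns NULL on
 * failure. */
void *map_cache_file(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Make sure the file is large enough to back the whole mapping. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED: every process mapping this file sees the same pages. */
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return mem == MAP_FAILED ? NULL : mem;
}
```

Two processes (or two mappings in one process) that call this with the
same path and size see each other's writes, and the path is resolved
inside whatever chroot the caller is running in.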

I then proceeded to write a hack/proof of concept for APC 3.1.7 which
accomplishes this. This is a breakdown of the differences between
vanilla APC and my hack:

- Vanilla APC will allocate all shared memory when Apache starts
(before chroot etc)
+ Hacked APC will delay shared memory initialization until the first
request of the child thread (after chroot etc)

+ Hacked APC will reserve a memory region when Apache starts, so the
cache ends up at the same address in every child.
+ Hacked APC only supports exactly one memory segment.
+ Hacked APC must be built with --enable-mmap
+ Hacked APC is recommended to be built with --enable-apc-pthreadrwlocks
+ Hacked APC adds a line "Hacks: Hacked for shared MMAP support" to
phpinfo() to indicate that it has been loaded.

- Vanilla APC just uses mmap() as a way to allocate shared memory
+ Hacked APC utilizes the file-backed feature of mmap. You must set
apc_mmap_file_mask to a place to store the APC cache/swap, and this
file will have the exact size of apc.shm_size.

- Vanilla APC allocates shared memory once; it never loads previously
stored memory.
+ Hacked APC will always load existing memory from the cache. Every
time you start Apache the start time is recorded. If the start time
stored in the cache differs, the cache will be zeroed; otherwise the
cache will be loaded and the already initialized cache structures in
it will be used. Note: This could lead to a race-like condition if two
PHP instances use the same cache and were started at different times.
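
The start time check could look roughly like the following sketch.
The struct layout and the name attach_cache() are assumptions for
illustration, and locking is left out entirely.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical header at the start of the mapped cache file. */
struct cache_header {
    time_t start_time;   /* Apache start time that initialized this cache */
    /* ... the cache structures would follow ... */
};

/* Returns 1 if the existing contents were kept, 0 if the cache was reset. */
int attach_cache(void *mem, size_t size, time_t server_start)
{
    struct cache_header *hdr = mem;

    if (hdr->start_time == server_start)
        return 1; /* same Apache run: reuse the initialized structures */

    /* Different run (or a fresh, all-zero file): wipe and stamp it. */
    memset(mem, 0, size);
    hdr->start_time = server_start;
    return 0;
}
```

The race mentioned above is visible here: two instances with different
start times would keep wiping each other's cache.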

- Vanilla APC uses apc_mmap_file_mask as a temporary file and requires
XXXXXX as a placeholder for a unique hash.
+ Hacked APC uses apc_mmap_file_mask as an exact file name and
requires it to be specified.

- Vanilla APC has no environment separation. The cache is initialized
once and shared between all Apache children.
+ Hacked APC will load the cache from the file specified by
apc_mmap_file_mask. In addition, it will create the file if it doesn't
exist, with the permissions 0600 and with the uid/gid of the child
process. The path is also affected by chroot. This is a double layer
of security.
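
A create-if-missing step with 0600 permissions might be sketched as
below. open_cache_file() is a hypothetical name, and the real patch
may well do this differently.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open the cache file, creating it with mode 0600 if
 * it does not exist. A new file is owned by the effective uid of the
 * calling process, and the path is resolved inside the chroot the process
 * is running in. Returns a descriptor or -1. */
int open_cache_file(const char *path, off_t size, int *created)
{
    /* O_EXCL makes creation atomic: only one process can win the race. */
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        *created = 1;
        /* open() applies the umask; force the intended mode explicitly. */
        if (fchmod(fd, 0600) != 0 || ftruncate(fd, size) != 0) {
            close(fd);
            unlink(path);
            return -1;
        }
        return fd;
    }
    if (errno != EEXIST)
        return -1;

    *created = 0;
    return open(path, O_RDWR); /* fails if we lack permission on the file */
}
```

With mode 0600, a child running as a different user cannot open another
virtual host's cache even if it could somehow reach the path.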

I'm using this to get different APC caches per virtual host and to
limit memory. However, it should also, theoretically, be possible to
use it to enable a shared cache for PHP instances running under
FastCGI. This could save a lot of memory. You would then have to make
some changes, though. For example, the memory is currently pre-mapped
when Apache is started so that all children share the same addresses,
which unrelated FastCGI processes cannot rely on. One way to solve
this would be to implement the cache so that no direct memory
addresses are used, only offsets. Also, the hack currently stores the
PHP start time and reinitializes the cache file when Apache is
restarted; if you remove the reinitialization you would have
persistent data storage. A really crazy idea would then be to use APC
as a persistent database instead of MySQL. Just gotta make sure APC is
ACID compliant first! ;)
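
As a rough illustration of the "offsets instead of addresses" idea,
here is a small sketch of a linked list whose links are byte offsets
from the start of the mapping. The struct and function names are made
up for this example and have nothing to do with APC's real structures.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical entry layout: links are byte offsets from the start of the
 * mapping instead of raw pointers; offset 0 means "no next entry". */
struct entry {
    uint32_t next;   /* offset of the next entry, 0 terminates the list */
    int32_t  value;
};

/* Translate an offset into a pointer valid in *this* process. */
static struct entry *entry_at(void *base, uint32_t off)
{
    return off ? (struct entry *)((char *)base + off) : NULL;
}

/* Sum all values in the list starting at offset `head`. */
long sum_list(void *base, uint32_t head)
{
    long total = 0;
    for (struct entry *e = entry_at(base, head); e; e = entry_at(base, e->next))
        total += e->value;
    return total;
}
```

Since nothing in the region depends on where it is mapped, each process
can map the file wherever the kernel places it and only needs its own
base pointer.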

Some notes on performance: I load tested this with a WordPress site
and did not detect any reduction in performance. I got 2x the
throughput with both vanilla and hacked APC. This could however be
slower in some cases, since the cache is backed by a file on disk.
However, the file is mapped in shared mode, so no disk reads should
actually take place if it's already open and mapped. Also, simply
reading/writing the memory does not guarantee that the changes are
written to disk until msync() is called or the memory is actually
unmapped.

I should also not have to mention that this is an experimental hack.
If you use this for anything important you're insane.

The patch: http://pastebin.com/4GS83nKs


Hannes Landeholm

Posted to php-internals on Jan 18, '11 at 9:16p
