For those who are interested, I did some load testing of Put and Get
speeds using PHP -> Thrift Server -> HBase, and Java API Client -> HBase.
Writing and reading 5-10 byte cells (from cache) is 30 times faster
using the Java API client. So I am going to assume that near-realtime
applications like search will be better served by the Java API, since
it takes a while for PHP to serialize the data, send it out over the
socket, and then for the Thrift server to talk to HBase.
Average read time per row was 0.5 ms with Java, and 15 ms (still fast!)
with the PHP client.
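
For anyone who wants to reproduce this kind of per-row average, here is a minimal timing sketch. It is not the harness used for the numbers above: the class name LatencySketch and both helper methods are hypothetical, and the operation being timed is a placeholder where a real HBase Get (or a Thrift call) would go.

```java
import java.util.function.IntConsumer;

// Hypothetical helper for illustration only; not part of HBase or Thrift.
final class LatencySketch {

    /** Runs op once per iteration and returns the mean wall-clock time per call in ms. */
    static double averageMillis(int iterations, IntConsumer op) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            op.accept(i); // assumption: a real test would issue one Get per row key here
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / iterations;
    }

    /** Ratio of the slower average latency to the faster one. */
    static double speedup(double slowMs, double fastMs) {
        return slowMs / fastMs;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for a client call.
        double avg = averageMillis(10_000, i -> Integer.toString(i).hashCode());
        System.out.println("average ms per call: " + avg);

        // The two averages reported in this post: 15 ms (PHP) vs 0.5 ms (Java).
        System.out.println("speedup: " + speedup(15.0, 0.5));
    }
}
```

Timing a whole loop and dividing by the iteration count smooths out timer granularity, which matters when individual calls take well under a millisecond. A warm-up pass before measuring would be a sensible addition on the JVM.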
I am thinking that Tomcat with a Java servlet that does most of the
work on the backend is the way to go. When we set it up, I will follow
up with results. It should be just about as fast, since the HTTP
wrapper should not add significant latency: we are not doing multiple
GETs, as most of the logic will be done on the backend.
[HBase-user] php to thrift vs java api
- Jeff Whiting: Those are interesting results. Are you using the PHP Thrift extension? It is significantly faster at (de)serialization. You may want to grab the latest nightly build of Thrift, as it has quite a few bug fixes in the PHP Thrift extension. ~Jeff -- Jeff Whiting Qualtrics Senior Software Engineer email@example.com
- Jack Levin: Yes, we are using the latest .so, but unfortunately it does not make any difference. I think this is just a matter of the language: PHP is stateless, whereas Java runs as a servlet inside the JVM with hot JARs. With PHP, even if I/O to Thrift is not an issue in itself, a task such as a merge join of two arrays of 10000 elements each will take much, much longer than in Java, simply because of how PHP stores and accesses data structures in RAM. -Jack
- Jeff Whiting: I agree. PHP is a slow language, especially when it has to create objects. PHP appears to be fast because so much of its code is actually in C extensions. ~Jeff
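
To make the merge join example from Jack's reply concrete, here is a rough Java sketch of joining two arrays of 10000 elements each. It assumes both arrays are already sorted ascending and contain no duplicate keys; the class name MergeJoinSketch and the test data (multiples of 2 and of 3) are made up for illustration and do not come from the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example for illustration; assumes sorted inputs with unique keys.
final class MergeJoinSketch {

    /** Returns the keys present in both sorted arrays, in ascending order. */
    static List<Integer> mergeJoin(int[] left, int[] right) {
        List<Integer> matches = new ArrayList<>();
        int i = 0;
        int j = 0;
        // Advance whichever side holds the smaller key; emit when the keys are equal.
        while (i < left.length && j < right.length) {
            if (left[i] < right[j]) {
                i++;
            } else if (left[i] > right[j]) {
                j++;
            } else {
                matches.add(left[i]);
                i++;
                j++;
            }
        }
        return matches;
    }

    /** Builds the sorted array {0, step, 2*step, ...} with n elements. */
    static int[] multiples(int step, int n) {
        int[] values = new int[n];
        for (int k = 0; k < n; k++) {
            values[k] = k * step;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] left = multiples(2, 10_000);
        int[] right = multiples(3, 10_000);
        long start = System.nanoTime();
        List<Integer> joined = mergeJoin(left, right);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println(joined.size() + " matching keys in " + micros + " us");
    }
}
```

The join makes a single pass over both inputs, so its cost grows with the combined length of the arrays. The primitive int[] arrays keep the keys in contiguous memory, which is the kind of data structure difference Jack is pointing at.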