Hi All

I have made a tool for load testing of my company's web-server product. The tool
is written in Python 3.1.


The tool does an HTTP or HTTPS POST, receives the response, parses it, validates
it against the expected response, and keeps statistics on the average response
time.
The tool can open several such parallel connections, send different requests,
and receive and validate the different responses. This continues either for the
length of time I specify or for the number of requests per parallel connection I
specify in the tool.
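
Roughly, each transaction does something like this (a simplified sketch, not the
tool's actual code; the host, port and payload below are placeholders):

    import time
    import http.client

    HOST = "192.168.1.10"      # placeholder web-server address
    HTTPS_PORT = 443
    PAYLOAD = b"field=value"   # placeholder request body

    def one_transaction():
        conn = http.client.HTTPSConnection(HOST, HTTPS_PORT)
        start = time.time()
        conn.request("POST", "/", PAYLOAD)
        resp = conn.getresponse()
        body = resp.read()
        elapsed = time.time() - start
        conn.close()
        # validation of `body` against the expected response happens here,
        # and `elapsed` feeds the average-response-time statistics
        return elapsed, resp.status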

I am seeing some strange behavior with this tool.

When I send requests over HTTP, I can easily reach 1 transaction (request sent,
response received and validated) per second from each of 20 parallel connections.
The average response time shown is about 0.15 seconds.
However, when I send requests over HTTPS, the response time reported by the tool
goes up to 1.1 seconds for the same 20 parallel connections, each attempting 1
transaction per second.

Another observation: with 10 parallel HTTPS connections each attempting 1
transaction per second from each of 2 different machines (effectively the same
load on the server), the response time drops back to 0.17 seconds.
However, if I run two instances of the tool, each with 10 parallel HTTPS
connections attempting 1 transaction per second, from the same machine, the
response time again shoots up to 1.1 seconds.

So I feel HTTPS is limiting my test if I want to achieve more than 10*1 TPS
(transactions per second). [I cannot send the next request until I get the
response to the previous one, and since the response time is more than 1 second,
I can never reach 1 TPS per connection.] With HTTP I can easily reach 20 TPS.
Also, if I use multiple machines, I can reach 20 TPS over HTTPS as well.


So the question is: does anyone here have any idea, or some data, about
performance limitations of the HTTPS implementation in Python 3.1?

Regards
Ashish Vyas






  • Antoine Pitrou at Oct 12, 2010 at 1:33 pm

    On Tue, 12 Oct 2010 05:40:43 -0700 (PDT) Ashish Vyas wrote:
    > Another observation that I have made is with 10 parallel HTTPS connections each
    > trying 1 transaction per second from 2 different machines (effectively same load
    > on server), the response time is again reducing to .17 secs.
    > However if I run two instances of the tool with 10 parallel HTTPS connections
    > each trying 1 transaction per second from the same machine, the response time
    > is again shooting up to 1.1 seconds.

    Is the client machine at 100% CPU when you do that?

    > So the question is does anyone here have any idea or some data about performance
    > limitation of HTTPS implementation in Python 3.1?

    Which API are you using? urlopen()?

    The HTTPS implementation is basically the same as the HTTP
    implementation, except for the additional SSL layer. So if indeed
    Python is responsible for the slowdown, it may be because of excessive
    overhead brought by the SSL layer.
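
    For reference, HTTPSConnection in http.client differs from HTTPConnection
    essentially only in its connect() method, roughly along these lines (a
    simplified sketch, not the exact stdlib source):

        import socket
        import ssl
        from http import client

        class SketchHTTPSConnection(client.HTTPConnection):
            """Roughly what http.client.HTTPSConnection does in 3.1."""
            default_port = 443

            def __init__(self, host, port=None, key_file=None, cert_file=None,
                         timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
                client.HTTPConnection.__init__(self, host, port, timeout=timeout)
                self.key_file = key_file
                self.cert_file = cert_file

            def connect(self):
                sock = socket.create_connection((self.host, self.port), self.timeout)
                # the extra SSL layer: the TLS handshake happens here, blocking,
                # before any HTTP data is exchanged
                self.sock = ssl.wrap_socket(sock, self.key_file, self.cert_file)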

    It would be nice if you tried the just-released Python 3.2 alpha,
    because some changes have been made to the SSL wrapper:
    http://python.org/download/releases/3.2/

    Also, there's a feature request to reduce overhead of SSL
    connections, but it needs implementing:
    http://bugs.python.org/issue8106

    Regards

    Antoine.
  • Antoine Pitrou at Oct 12, 2010 at 5:14 pm

    On Tue, 12 Oct 2010 05:40:43 -0700 (PDT) Ashish Vyas wrote:

    > I have made a tool for load testing of my company's web-server product. The tool
    > is written using Python 3.1. [...]
    > So I feel HTTPS is blocking my test if I want to achieve higher TPS
    > (transactions per second.) than 10*1 TPS.
    For the record, a quick test on my home machine suggests that Python
    2.6, 3.1, 3.2 all take 3ms per connection+request on a dumb local HTTPS
    server (launched with "openssl s_server -www").
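
    Something along these lines is enough to reproduce that kind of measurement
    (a sketch, not the exact script used; note that on 3.1/3.2 the client does
    not verify the server certificate by default, so s_server's self-signed
    certificate is accepted):

        import time
        import http.client

        N = 100
        start = time.time()
        for _ in range(N):
            conn = http.client.HTTPSConnection("localhost", 4433)  # s_server default port
            conn.request("GET", "/")
            conn.getresponse().read()
            conn.close()
        print("%.1f ms per connection+request" % ((time.time() - start) * 1000 / N))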

    Regards

    Antoine.
  • Ashish at Oct 13, 2010 at 6:11 am

    On Oct 12, 6:33 pm, Antoine Pitrou wrote:
    > On Tue, 12 Oct 2010 05:40:43 -0700 (PDT) Ashish Vyas wrote:
    > > Another observation that I have made is with 10 parallel HTTPS connections each
    > > trying 1 transaction per second from 2 different machines (effectively same load
    > > on server), the response time is again reducing to .17 secs.
    > > However if I run two instances of the tool with 10 parallel HTTPS connections
    > > each trying 1 transaction per second from the same machine, the response time
    > > is again shooting up to 1.1 seconds.
    > Is the client machine at 100% CPU when you do that?

    With HTTP, I see client CPU at appx. 97%. However, with HTTPS it stays
    at 53-55%.

    > > So the question is does anyone here have any idea or some data about performance
    > > limitation of HTTPS implementation in Python 3.1?
    > Which API are you using? urlopen()?
    > The HTTPS implementation is basically the same as the HTTP
    > implementation, except for the additional SSL layer. So if indeed
    > Python is responsible for the slowdown, it may be because of excessive
    > overhead brought by the SSL layer.

    I am doing something like this:

    self.conn = AsyncHTTPSConnection(self.URL, HTTPS_PORT)

    self.conn.putrequest('POST', WEBSERVER_IP)
    self.conn.putheader('Cookie', cookie)
    self.conn.putheader('Content-Length', reqLen)
    ..
    self.conn.endheaders()
    self.conn.send(str.encode(request))


    and the AsyncHTTPSConnection class is something like this:

    class AsyncHTTPSConnection(client.HTTPConnection):
        default_port = HTTPS_PORT

        def __init__(self, host, port=HTTPS_PORT, key_file=None,
                     cert_file=None, strict=None,
                     timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
            """Init has the same parameters as HTTPSConnection."""
            client.HTTPConnection.__init__(self, host, port, strict, timeout)
            self.key_file = key_file
            self.cert_file = cert_file

        def connect(self):
            try:
                log.mjLog.LogReporter("Model", "info",
                    "AsyncHTTPSConnection::connect trying to connect... " +
                    str(self.host) + ":" + str(self.port))
                sock = socket.create_connection((self.host, self.port),
                                                self.timeout)
                sock2 = ssl.wrap_socket(sock, self.key_file, self.cert_file)
                self.sock = CBSocket(sock2)
            except:
                log.mjLog.LogReporter("Model", "critical",
                    "AsyncHTTPSConnection::connect Failed to connect to the GWS")

    > It would be nice if you tried the just-released Python 3.2 alpha,
    > because some changes have been made to the SSL wrapper:
    > http://python.org/download/releases/3.2/

    Let me try to use this; I will come back with my observations.

    > Also, there's a feature request to reduce overhead of SSL
    > connections, but it needs implementing:
    > http://bugs.python.org/issue8106

    Good to know. Do we have any date when this will be available? I feel
    like contributing to this, but I am rather over-occupied with several
    activities right now.

    > Regards
    >
    > Antoine.
    Thanks a lot,
    Ashish
  • Ashish at Oct 13, 2010 at 9:12 am

    On Oct 13, 11:11 am, Ashish wrote:
    > On Oct 12, 6:33 pm, Antoine Pitrou wrote:
    > > On Tue, 12 Oct 2010 05:40:43 -0700 (PDT) Ashish Vyas wrote:
    > > > Another observation that I have made is with 10 parallel HTTPS connections each
    > > > trying 1 transaction per second from 2 different machines (effectively same load
    > > > on server), the response time is again reducing to .17 secs.
    > > > However if I run two instances of the tool with 10 parallel HTTPS connections
    > > > each trying 1 transaction per second from the same machine, the response time
    > > > is again shooting up to 1.1 seconds.
    > > Is the client machine at 100% CPU when you do that?
    > With HTTP, I see client CPU at appx. 97%. However, with HTTPS it stays
    > at 53-55%.
    > > > So the question is does anyone here have any idea or some data about performance
    > > > limitation of HTTPS implementation in Python 3.1?
    > > Which API are you using? urlopen()?
    > > The HTTPS implementation is basically the same as the HTTP
    > > implementation, except for the additional SSL layer. So if indeed
    > > Python is responsible for the slowdown, it may be because of excessive
    > > overhead brought by the SSL layer.
    > I am doing something like this:
    > [... AsyncHTTPSConnection code as quoted in the previous message ...]
    > > It would be nice if you tried the just-released Python 3.2 alpha,
    > > because some changes have been made to the SSL wrapper:
    > > http://python.org/download/releases/3.2/
    > Let me try to use this; I will come back with my observations.

    Well, I tried Python 3.2a2, and the average response time for 20 HTTPS
    TPS reduced from about 1.1 seconds to about 0.97 seconds.
    This is a noticeable change, but I feel it is not enough.
    Also, when I ran the same test client on a Xeon machine, the average
    response time was appx. 0.23 seconds.
    > > Also, there's a feature request to reduce overhead of SSL
    > > connections, but it needs implementing:
    > > http://bugs.python.org/issue8106
    > Good to know. Do we have any date when this will be available? I feel
    > like contributing to this, but I am rather over-occupied with several
    > activities right now.


    > > Regards
    > > Antoine.
    Thanks a lot,
    Ashish
  • Antoine Pitrou at Oct 13, 2010 at 10:19 am

    On Wed, 13 Oct 2010 02:12:21 -0700 (PDT) Ashish wrote:
    > > Is the client machine at 100% CPU when you do that?
    > With HTTP, I see client CPU at appx. 97%. However, with HTTPS it stays
    > at 53-55%.

    And is the server at 100% CPU then?
    If the client is not at 100% CPU, it shouldn't be the bottleneck,
    unless you have something wrong in the client implementation.

    >             sock = socket.create_connection((self.host, self.port),
    >                                             self.timeout)
    >             sock2 = ssl.wrap_socket(sock, self.key_file, self.cert_file)
    >             self.sock = CBSocket(sock2)

    What is CBSocket? What happens if you just write:

        self.sock = sock2

    > > Also, there's a feature request to reduce overhead of SSL
    > > connections, but it needs implementing:
    > > http://bugs.python.org/issue8106
    > Good to know. Do we have any date when this will be available? I feel
    > like contributing to this, but I am rather over-occupied with several
    > activities right now.

    Probably not in Python 3.2 anyway. But given your client isn't at 100%
    CPU when you launch your HTTPS test, it might not make a lot of
    difference.

    Regards

    Antoine.
  • Ashish at Oct 13, 2010 at 12:27 pm

    On Oct 13, 3:19 pm, Antoine Pitrou wrote:
    > On Wed, 13 Oct 2010 02:12:21 -0700 (PDT) Ashish wrote:
    > > > Is the client machine at 100% CPU when you do that?
    > > With HTTP, I see client CPU at appx. 97%. However, with HTTPS it stays
    > > at 53-55%.
    > And is the server at 100% CPU then?
    > If the client is not at 100% CPU, it shouldn't be the bottleneck,
    > unless you have something wrong in the client implementation.
    > >             sock = socket.create_connection((self.host, self.port),
    > >                                             self.timeout)
    > >             sock2 = ssl.wrap_socket(sock, self.key_file, self.cert_file)
    > >             self.sock = CBSocket(sock2)
    > What is CBSocket? What happens if you just write:
    >     self.sock = sock2

    The server's Java process is taking 15% CPU.

    Well, CBSocket is a socket implementation that calls my callback when
    it has data.
    Both my classes, AsyncHTTPSConnection and AsyncHTTPConnection, use it
    in the same way ( self.sock = CBSocket(sock2) ).
    The implementation of AsyncHTTPConnection differs from
    AsyncHTTPSConnection only in the connect method: sock2 =
    ssl.wrap_socket(sock, self.key_file, self.cert_file)

    class CBSocket(asynchat.async_chat):
        """This is a class that calls the callback when it has data and has
        read it."""

        def __init__(self, socket):
            asynchat.async_chat.__init__(self, socket)
            self._in_buffer = io.BytesIO()
            self._closed = False
            self._cb = None

        def handle_read(self):
            try:
                read = self.socket.recv(65536)
            except:
                read = 0
                raise
            if not read and not self._closed:
                self.handle_close()
                self.close()
                self._closed = True
                return

            self._in_buffer.write(read)

        def sendall(self, data):
            self.send(data)

        def makefile(self, mode, buffsize=8192):
            self._in_buffer.seek(0)
            return self._in_buffer

        def set_cb(self, cb):
            self._cb = cb
            if self._closed:
                self._cb()
            else:
                pass

        def handle_close(self):
            if self._cb:
                self._cb()
            self._closed = True
            self.close()
            del self._in_buffer
    > > Also, there's a feature request to reduce overhead of SSL
    > > connections, but it needs implementing:
    > > http://bugs.python.org/issue8106
    > > Good to know. Do we have any date when this will be available? I feel
    > > like contributing to this, but I am rather over-occupied with several
    > > activities right now.
    > Probably not in Python 3.2 anyway. But given your client isn't at 100%
    > CPU when you launch your HTTPS test, it might not make a lot of
    > difference.
    >
    > Regards
    >
    > Antoine.
  • Antoine Pitrou at Oct 13, 2010 at 1:12 pm

    On Wed, 13 Oct 2010 05:27:29 -0700 (PDT) Ashish wrote:

    > Well, CBSocket is a socket implementation that calls my callback when
    > it has data.
    > Both my classes, AsyncHTTPSConnection and AsyncHTTPConnection, use it
    > in the same way ( self.sock = CBSocket(sock2) ).
    > The implementation of AsyncHTTPConnection differs from
    > AsyncHTTPSConnection only in the connect method: sock2 =
    > ssl.wrap_socket(sock, self.key_file, self.cert_file)
    >
    > class CBSocket(asynchat.async_chat):
    > [...]

    OK, this won't work as expected. The first issue is that
    ssl.wrap_socket() is a blocking operation: your client will send
    data and wait for the server's reply (it's the SSL handshake)
    *before* the socket has been set to non-blocking mode by asyncore. This
    means that your client will remain idle a lot of the time, and it
    explains why neither the client nor the server reaches 100% CPU
    utilization.

    The second issue is that combining SSL and asyncore is more complicated
    than that; there are various situations to consider which your code
    doesn't address. The stdlib right now doesn't provide SSL support for
    asyncore (see http://bugs.python.org/issue10084 ), so you would have to
    do it yourself. I don't think it's worth the trouble, and would
    recommend switching your client to a simple thread-based approach,
    where you handle each HTTP(S) connection in a separate thread and stick
    to blocking I/O.
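
    A minimal sketch of that approach (one thread per connection, blocking
    I/O, standard library only; the host, port and request body below are
    placeholders, not your tool's actual values):

        import threading
        import time
        import http.client

        HOST = "192.168.1.10"   # placeholder server address
        PORT = 443
        BODY = b"field=value"   # placeholder request body
        results = []            # (elapsed, status) tuples
        lock = threading.Lock()

        def worker(transactions, interval=1.0):
            # each worker drives one connection's worth of load with blocking
            # I/O, roughly one transaction per `interval` seconds
            for _ in range(transactions):
                start = time.time()
                conn = http.client.HTTPSConnection(HOST, PORT)
                conn.request("POST", "/", BODY)
                resp = conn.getresponse()
                resp.read()
                elapsed = time.time() - start
                conn.close()
                with lock:
                    results.append((elapsed, resp.status))
                time.sleep(max(0.0, interval - elapsed))

        threads = [threading.Thread(target=worker, args=(10,)) for _ in range(20)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print("avg response time: %.3f s" %
              (sum(r[0] for r in results) / len(results)))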

    Regards

    Antoine.
  • Ashish at Oct 14, 2010 at 12:06 pm

    On Oct 13, 6:12 pm, Antoine Pitrou wrote:
    > On Wed, 13 Oct 2010 05:27:29 -0700 (PDT) Ashish wrote:
    > > Well, CBSocket is a socket implementation that calls my callback when
    > > it has data.
    > > Both my classes, AsyncHTTPSConnection and AsyncHTTPConnection, use it
    > > in the same way ( self.sock = CBSocket(sock2) ).
    > > The implementation of AsyncHTTPConnection differs from
    > > AsyncHTTPSConnection only in the connect method: sock2 =
    > > ssl.wrap_socket(sock, self.key_file, self.cert_file)
    > > class CBSocket(asynchat.async_chat):
    > > [...]

    > OK, this won't work as expected. The first issue is that
    > ssl.wrap_socket() is a blocking operation: your client will send
    > data and wait for the server's reply (it's the SSL handshake)
    > *before* the socket has been set to non-blocking mode by asyncore. This
    > means that your client will remain idle a lot of the time, and it
    > explains why neither the client nor the server reaches 100% CPU
    > utilization.
    >
    > The second issue is that combining SSL and asyncore is more complicated
    > than that; there are various situations to consider which your code
    > doesn't address. The stdlib right now doesn't provide SSL support for
    > asyncore (see http://bugs.python.org/issue10084 ), so you would have to
    > do it yourself. I don't think it's worth the trouble, and would
    > recommend switching your client to a simple thread-based approach,
    > where you handle each HTTP(S) connection in a separate thread and stick
    > to blocking I/O.
    >
    > Regards
    >
    > Antoine.
    I am impressed by your knowledge and thankful to you for helping
    me out.

    I thought threads would be costly to use, and that if I go for, say,
    200 parallel connections with 200 threads in total (plus a few more
    that my tool already has), it might not be efficient either. Let me try
    changing the implementation to threads + blocking I/O and get back with
    the results.

    One more question: if I run the tool on a multicore machine, will
    Python 3.1 or 3.2 actually be able to use multiple cores, or will it
    run on only one core?

    Thanks
    Ashish.
  • Antoine Pitrou at Oct 14, 2010 at 12:32 pm

    On Thu, 14 Oct 2010 05:06:30 -0700 (PDT) Ashish wrote:

    > One more question: if I run the tool on a multicore machine, will
    > Python 3.1 or 3.2 actually be able to use multiple cores, or will it
    > run on only one core?

    Only partly. Pure Python code is serialized (by the Global Interpreter
    Lock), but some internal C code, such as SSL and socket routines, can
    run in parallel with other code.
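
    If the client itself becomes CPU-bound, one way around the GIL is to
    spread the connections over several worker processes instead of threads.
    A sketch with the standard multiprocessing module, assuming a hypothetical
    run_connections(n) function that drives n blocking connections and returns
    their response times:

        import multiprocessing

        def run_connections(n):
            # placeholder: drive n blocking HTTPS connections (e.g. with the
            # threaded worker shown earlier) and return a list of response times
            return []

        if __name__ == "__main__":
            # 4 worker processes, 5 connections each -> 20 connections in total,
            # spread across cores and not serialized by the GIL
            pool = multiprocessing.Pool(processes=4)
            times = pool.map(run_connections, [5, 5, 5, 5])
            pool.close()
            pool.join()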

    Regards

    Antoine.
  • Stefan Behnel at Oct 13, 2010 at 9:36 am

    Ashish Vyas, 12.10.2010 14:40:
    > When I send request using HTTP, I am able to reach 1 transaction (request sent,
    > response rcvd and validated.) per second from 20 parallel connections easily.
    > Average response time shown is about 0.15 seconds.
    > However, when I send request using HTTPS, I am seeing that the response time
    > shown by tool goes to 1.1 seconds for same 20 parallel connection each trying 1
    > transaction per second.

    You shouldn't overestimate the performance requirements for SSL/TLS support
    inside the server application itself, simply because it's not used that
    much in real-world deployments.

    It's actually very common to use a proxy to handle HTTPS traffic, and to
    forward the requests as plain HTTP to the "real" server. Separating the two
    gives you more freedom in your deployment (e.g. you can deploy the HTTPS
    proxy locally or on an entirely different machine at a suitable place in
    the network architecture), and makes your server generally more scalable.
    You can additionally use the HTTPS proxy machine to distribute the normal
    HTTP load over multiple server instances. There's even dedicated networking
    hardware for SSL/TLS proxying if you need it.

    Stefan
  • Ashish at Oct 13, 2010 at 10:21 am

    On Oct 13, 2:36 pm, Stefan Behnel wrote:
    > Ashish Vyas, 12.10.2010 14:40:
    > > When I send request using HTTP, I am able to reach 1 transaction (request sent,
    > > response rcvd and validated.) per second from 20 parallel connections easily.
    > > Average response time shown is about 0.15 seconds.
    > > However, when I send request using HTTPS, I am seeing that the response time
    > > shown by tool goes to 1.1 seconds for same 20 parallel connection each trying 1
    > > transaction per second.
    > You shouldn't overestimate the performance requirements for SSL/TLS support
    > inside the server application itself, simply because it's not used that
    > much in real-world deployments.
    >
    > It's actually very common to use a proxy to handle HTTPS traffic, and to
    > forward the requests as plain HTTP to the "real" server. Separating the two
    > gives you more freedom in your deployment (e.g. you can deploy the HTTPS
    > proxy locally or on an entirely different machine at a suitable place in
    > the network architecture), and makes your server generally more scalable.
    > You can additionally use the HTTPS proxy machine to distribute the normal
    > HTTP load over multiple server instances. There's even dedicated networking
    > hardware for SSL/TLS proxying if you need it.
    >
    > Stefan

    Yes, I agree that the server will also have a similar overhead when
    HTTPS is used in place of HTTP. Thanks for suggesting the HTTPS proxy
    box.

    However, my problem here is:

    client on Xeon machine sending requests over HTTPS: average response
    time ~= 0.2 secs
    client on P4 machine sending requests over HTTP: average response
    time ~= 0.15 secs
    client on P4 machine sending requests over HTTPS: average response
    time ~= 1.1 secs

    And I understand that until we have the feature from issue8106
    implemented, as pointed out by Antoine, we may see no further
    improvement on the 1.1 secs (or 0.97 secs with 3.2a2) that I see.

    Kindly confirm whether my conclusion above is correct.

    Thanks
    Ashish.
