[aiohttp] Tweaks #9783


Merged
26 commits merged into TechEmpower:master on Apr 8, 2025

Conversation

@Dreamsorcerer (Contributor) commented Apr 5, 2025

Minor optimisation and rename the test runs to better match other frameworks (aiohttp -> aiohttp-orm, aiohttp-pg-raw -> aiohttp).

@Dreamsorcerer Dreamsorcerer marked this pull request as ready for review April 5, 2025 23:00
@Reskov (Contributor) commented Apr 8, 2025

@Dreamsorcerer Great work! Sorry, I didn't see your pull request prior to updating mine (#9767).

1. I've also found a small tweak: we can use bindparam and lambda_stmt (https://docs.sqlalchemy.org/en/21/core/connections.html#engine-lambda-caching). It looks like it should be:

READ_SELECT_ORM = lambda_stmt(lambda: select(World.randomnumber).where(World.id == bindparam("id")))

Update: I've re-read the documentation and now I'm not sure the example above gives any benefit compared to:

READ_SELECT_ORM = select(World.randomnumber).where(World.id == bindparam("id"))

1.1) Also, for some reason READ_SELECT_ORM is not used in single_database_query_orm.
1.2) select(Fortune.id, Fortune.message) could also be cached globally.

Would you like to pick this up as well? Or would it be better to create a separate PR for that?

  2. It looks like deploying over gunicorn is not the most performant way to deploy aiohttp. Do you have any plans to test it against nginx? https://docs.aiohttp.org/en/stable/deployment.html#nginx-supervisord
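On the bindparam point, the win is that the statement text is built once and the driver/database can reuse the prepared plan, with only the parameter value changing per call. A stdlib sqlite3 analogue of the same idea (a sketch only; the table contents here are made up):

```python
import sqlite3

# In-memory database with a tiny stand-in for the benchmark's "world" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (id INTEGER PRIMARY KEY, randomnumber INTEGER)")
conn.executemany("INSERT INTO world VALUES (?, ?)", [(i, i * 7) for i in range(1, 11)])

# The query is a constant with a placeholder: constructed once, reused for
# every lookup, so the statement can be cached by the driver.
SELECT_BY_ID = "SELECT randomnumber FROM world WHERE id = ?"

for i in (1, 5, 10):
    (value,) = conn.execute(SELECT_BY_ID, (i,)).fetchone()
    print(value)  # prints 7, then 35, then 70
```

This mirrors what the bindparam version of READ_SELECT_ORM does at the SQLAlchemy level: the expression tree is built at import time instead of per request.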

@Reskov (Contributor) commented Apr 8, 2025

Following up on my previous comment: lambda_stmt will not provide benefits over bindparam here. It is useful when the query is built up conditionally, for example, which is not the case here. Meanwhile, bindparam performs much better than the original code.

ORM Select: 2.12 seconds
Bind Param: 1.26 seconds
Lambda Bind: 1.31 seconds

Benchmark script:
import os
import time
import asyncio
from typing import Callable, Awaitable

from sqlalchemy import URL, select, bindparam, lambda_stmt
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker, AsyncSession
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class World(Base):
    __tablename__ = 'world'
    id: Mapped[int] = mapped_column(primary_key=True)
    randomnumber: Mapped[int]


def pg_dsn() -> str:
    return URL.create(
        drivername='postgresql+asyncpg',
        username=os.getenv('PGUSER', 'postgres'),
        password=os.getenv('PGPASS', 'postgres'),
        host='localhost',
        port=5432,
        database='segment',
    ).render_as_string(hide_password=False)


READ_SELECT_ORM = select(World)
READ_SELECT_BIND_PARAM = select(World).where(World.id == bindparam("id"))
READ_SELECT_LAMBDA_BIND_PARAM = lambda_stmt(lambda: select(World).where(World.id == bindparam("id")))

def get_session():
    engine = create_async_engine(pg_dsn(), pool_size=10)
    return async_sessionmaker(engine)()

async def run_orm_select(session: AsyncSession, i: int) -> None:
    await session.scalar(READ_SELECT_ORM.where(World.id == i))


async def run_bind_param(session: AsyncSession, i: int) -> None:
    await session.scalar(READ_SELECT_BIND_PARAM, {"id": i})


async def run_lambda_stmt(session: AsyncSession, i: int) -> None:
    await session.scalar(READ_SELECT_LAMBDA_BIND_PARAM, {"id": i})


async def benchmark(name: str, query_fn: Callable[[AsyncSession, int], Awaitable[None]], count: int) -> None:
    start = time.time()
    async with get_session() as session:
        for i in range(count):
            await query_fn(session, i)
    duration = time.time() - start
    print(f"{name}: {duration:.2f} seconds")


if __name__ == '__main__':
    COUNT = 10000
    asyncio.run(benchmark("ORM Select", run_orm_select, COUNT))
    asyncio.run(benchmark("Bind Param", run_bind_param, COUNT))
    asyncio.run(benchmark("Lambda Bind", run_lambda_stmt, COUNT))

@Dreamsorcerer (Contributor, Author) commented Apr 8, 2025

Meanwhile bindparam performs much better than original code

Looks good. Do you want to put that in another PR? If you want to figure out deployment with nginx (or exposed directly without nginx?) as well, that'd be great. I'd be interested to see what the performance difference is (if I'd started from scratch, I wouldn't have used gunicorn myself).

@Reskov (Contributor) commented Apr 8, 2025

Nginx is a large change, so I think it should be in a separate PR; I will try to add it this week. As for bindparam: as you wish. The change is quite small and fits the description of this PR quite well, so I'd prefer to have it here 😀

@Dreamsorcerer (Contributor, Author)

Nginx is a large change, so I think it should be in a separate PR. I will try to add it this week.

Sounds good. Note that nginx is just a proxy used for security, so if possible, it probably makes sense to deploy raw without nginx and just expose the Python app directly, thus avoiding any proxy overhead.

As for bindparam: as you wish. The change is quite small and fits the description of this PR quite well, so I'd prefer to have it here 😀

To make sure I've got your changes right, can you just PR into my fork?

@Reskov (Contributor) commented Apr 8, 2025

Yes, sure. I've prepared PR Dreamsorcerer#1; please take a look.

Note that nginx is just a proxy used for security, so if possible, it probably makes sense to deploy raw without nginx and just expose the Python app directly, thus avoiding any proxy overhead.

That's a great idea too, but I'm not sure how to use all the CPU cores inside a single Docker container with that deployment configuration; that's why I thought some kind of proxy server was required. Do you have any thoughts?

@Dreamsorcerer (Contributor, Author)

That's a great idea too, but I'm not sure how to use all the CPU cores inside a single Docker container with that deployment configuration; that's why I thought some kind of proxy server was required

Ah, yes, of course. nginx makes the most sense then.
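For context on the proxy-free option that was considered: the kernel mechanism that lets several worker processes share one port inside a container is SO_REUSEPORT (Linux 3.9+; aiohttp exposes it as the reuse_port argument to web.run_app). A minimal stdlib sketch of the socket-level behaviour, an assumption-level illustration rather than a full deployment:

```python
import socket


def reuseport_listener(port: int) -> socket.socket:
    # With SO_REUSEPORT, multiple processes can each bind their own listening
    # socket to the same port, and the kernel load-balances incoming
    # connections across them, giving one worker per CPU core with no proxy.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock


first = reuseport_listener(0)        # kernel assigns a free port
port = first.getsockname()[1]
second = reuseport_listener(port)    # same port: succeeds only with SO_REUSEPORT
same_port = second.getsockname()[1] == port
print(same_port)  # prints True
first.close()
second.close()
```

In a real container you would fork one worker per core and have each call web.run_app with reuse_port=True; nginx still adds buffering and slow-client protection that this approach does not.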

@msmith-techempower msmith-techempower merged commit 22a6a52 into TechEmpower:master Apr 8, 2025
3 checks passed