Re: Memory - Mailing list psycopg
From | Adrian Klaver |
---|---|
Subject | Re: Memory |
Date | |
Msg-id | d108fa8a-1c42-46bb-a6ef-32c3e276f4d6@aklaver.com Whole thread Raw |
In response to | Memory (Vladimir Ryabtsev <greatvovan@gmail.com>) |
Responses |
Re: Memory
|
List | psycopg |
On 12/21/24 02:45, Vladimir Ryabtsev wrote: > Hi community, > > I am reading a big dataset using code similar to this: > > query = ''' > SELECT timestamp, data_source, tag, agg_value > FROM my_table > '''I > batch_size = 10_000_000 > > with psycopg.connect(cs, cursor_factory=psycopg.ClientrCursor) as conn: FYI, ClientCursor. > > I looked the documentation, but did not find specifics related to > performance differences between Server and Client cursors. > > I am fine with ServerCursor, but I need to ask, is it by design that > with ClientCursor the result set is copied into memory despite > fetchmany() limit? ClientCursor is the default class, so may be worth > documenting the difference (sorry, if I missed that). Client side cursor https://www.psycopg.org/psycopg3/docs/advanced/cursors.html#client-side-cursors "In such querying pattern, after a cursor sends a query to the server (usually calling execute()), the server replies transferring to the client the whole set of results requested, which is stored in the state of the same cursor and from where it can be read from Python code (using methods such as fetchone() and siblings)." https://www.psycopg.org/psycopg3/docs/api/cursors.html#psycopg.Cursor.fetchmany "fetchmany(size: int = 0) → list[+Row] Return the next size records from the current recordset. size default to self.arraysize if not specified. Return type: Sequence[Row], with Row defined by row_factory " Server side cursor https://www.psycopg.org/psycopg3/docs/advanced/cursors.html#server-side-cursors "PostgreSQL has its own concept of cursor too (sometimes also called portal). When a database cursor is created, the query is not necessarily completely processed: the server might be able to produce results only as they are needed. Only the results requested are transmitted to the client: if the query result is very large but the client only needs the first few records it is possible to transmit only them. The downside is that the server needs to keep track of the partially processed results, so it uses more memory and resources on the server." > > Thank you. > -- Adrian Klaver adrian.klaver@aklaver.com