Description
Bug report
Bug description:
Hello,
I was doing some benchmarking of python and package installation.
That got me down a rabbit hole of buffering optimizations between pip, requests, urllib and the CPython interpreter.
TL;DR: I would like to discuss updating the value of io.DEFAULT_BUFFER_SIZE. It has been set to 8192 for 16 years.
original commit: https://github.com/python/cpython/blame/main/Lib/_pyio.py#L27
It was a reasonable size given the hardware and operating systems of the time. It's far from optimal today.
Remember, in 2008 you'd run a 32-bit operating system with less than 2 GB of memory available, shared between all running applications.
Buffers had to be small, a few kB; buffers measured in whole MB were not conceivable.
I will attach benchmarks in the next messages showing 3 to 5 times write performance improvement when adjusting the buffer size.
I think the Python interpreter can adopt a default buffer size somewhere between 64k and 256k.
I think 64k is the minimum for Python and it should be safe to adjust to.
Higher is generally better for performance, though there are cases where it's unwanted
(seeks mixed with small reads/writes, unwanted triggering of write-ahead, slow devices with throughput measured in kB/s where you don't want to block for long).
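As a teaser before the full benchmark script, here is a minimal sketch of the kind of write benchmark involved; `bench_write` and the sizes below are illustrative choices of mine, not the exact methodology of the results I'll attach:

```python
import os
import tempfile
import time

def bench_write(buffer_size: int, total: int = 8 * 1024 * 1024,
                chunk: bytes = b"x" * 4096) -> float:
    """Write `total` bytes in small chunks through a buffer of
    `buffer_size` bytes and return the elapsed time in seconds."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        start = time.perf_counter()
        with open(path, "wb", buffering=buffer_size) as f:
            for _ in range(total // len(chunk)):
                f.write(chunk)
        return time.perf_counter() - start
    finally:
        os.unlink(path)

for size in (8 * 1024, 64 * 1024, 256 * 1024):
    print(f"{size // 1024:>4} KiB buffer: {bench_write(size):.3f} s")
```

The gap between sizes depends heavily on the device and filesystem, which is exactly why the numbers in the next message matter more than this sketch.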
In addition, I think there is a bug in open() on Linux.
open() sets the buffer size to the device block size when available (st_blksize, 4k on most disks), instead of io.DEFAULT_BUFFER_SIZE=8k.
I believe this is unwanted behavior: the block size is the minimal size for I/O operations on the device, not the optimal size, and it should not be preferred.
I think open() on Linux should be corrected to use a default buffer size of max(st_blksize, io.DEFAULT_BUFFER_SIZE) instead of st_blksize.
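To make the proposed policy concrete, here is a sketch; `default_buffer_size` is a hypothetical helper of mine, not existing CPython code:

```python
import io
import os

def default_buffer_size(path: str) -> int:
    """Proposed policy sketch: never go below io.DEFAULT_BUFFER_SIZE,
    even when the filesystem reports a smaller st_blksize
    (hypothetical helper, not CPython code)."""
    try:
        blksize = os.stat(path).st_blksize
    except (OSError, AttributeError):  # st_blksize is POSIX-only
        blksize = io.DEFAULT_BUFFER_SIZE
    return max(blksize, io.DEFAULT_BUFFER_SIZE)

print(default_buffer_size("."))
```

Until open() changes, passing an explicit `buffering=` argument to open() sidesteps the issue entirely.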
Relatedly, the doc might be misleading in saying st_blksize is the preferred size for efficient I/O. https://github.com/python/cpython/blob/main/Doc/library/os.rst#L3181
The GNU doc was updated to clarify: "This is not guaranteed to give optimum performance" https://www.gnu.org/software/gnulib/manual/html_node/stat_002dsize.html
Thoughts?
Annex: some historical context and technical considerations around buffering.
On the hardware side:
- HDDs had 512-byte blocks historically, then moved to 4096-byte blocks in the 2010s.
- SSDs have 4096-byte blocks as far as I know.
On filesystems:
- the buffer size should never be smaller than the device and filesystem block size
- I think ext3, ext4, XFS, NTFS, etc. follow the device block size of 4k as the default, though they can be configured for other block sizes.
- NTFS is capped to a 16 TB maximum disk size with 4k blocks.
- Microsoft recommends a 64k block size for Windows Server 2019+ and larger disks https://learn.microsoft.com/en-us/windows-server/storage/file-server/ntfs-overview
- RAID setups and similar with zfs/btrfs/xfs can have custom block sizes, I think anywhere from 4 kB to 1 MB. I don't know if there is any consensus; anything from 16k/32k/64k/128k can be seen in the wild.
On network filesystems:
- shared network home directories are common on Linux (NFS share) and Windows (SMB share).
- enterprise storage vendors like Pure/VAST/NetApp recommend 524288 or 1048576 bytes for I/O.
- see rsize wsize in mount settings:
host:path on path type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,acregmin=60,acdirmin=60,hard,proto=tcp,nconnect=8,mountproto=tcp, ...)
- for Windows I cannot find documentation for network clients, though the Windows server should have an NTFS filesystem with at least a 64k block size as per the Microsoft recommendation above.
On pipes:
- buffering is used by pipes and for interprocess communication; see subprocess.py
- POSIX guarantees that writes to pipes are atomic up to PIPE_BUF: 4096 bytes on the Linux kernel, guaranteed to be at least 512 bytes by POSIX.
- Python has had a default of io.DEFAULT_BUFFER_SIZE=8192, so it never benefited from that atomicity property :D
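The mismatch is easy to check from Python itself; the fallback to the POSIX minimum of 512 here is my assumption for platforms where select.PIPE_BUF is unavailable:

```python
import io
import select

# PIPE_BUF: the largest write POSIX guarantees to be atomic on a pipe
# (at least 512 bytes; 4096 on Linux). The attribute is Unix-only, so
# fall back to the POSIX minimum elsewhere (assumption for this sketch).
pipe_buf = getattr(select, "PIPE_BUF", 512)

print("PIPE_BUF:", pipe_buf)
print("io.DEFAULT_BUFFER_SIZE:", io.DEFAULT_BUFFER_SIZE)

# The default buffer has always been larger than PIPE_BUF, so a full
# buffer flush into a pipe was never covered by the atomicity guarantee.
assert io.DEFAULT_BUFFER_SIZE > pipe_buf
```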
On compression code (they probably all need to be adjusted):
- the buffer size is used by compression code in cpython: gzip.py lzma.py bz2.py
- I think lzma and bz2 are using the default size.
- gzip now uses a 128k read buffer: somebody realized reads were very slow 2 years ago and rewrote the buffering to 128k.
- then somebody else realized last year that writes were still very slow and added an arbitrary write buffer of 4*io.DEFAULT_BUFFER_SIZE.
- eae7dad
- GzipFile.write should be buffered #89550
- base64 is reading in chunks of 76 characters???
- https://github.com/python/cpython/blob/main/Lib/base64.py#L532
On network IO:
- On Linux, TCP read and write buffers had a 16k minimum historically. The read buffer minimum was increased to 64k in kernel v4.20, year 2018.
- the buffers are resized dynamically with the TCP window, up to 4 MB write and 6 MB read; let's not get into TCP. See sysctl_tcp_rmem and sysctl_tcp_wmem.
- linux code: https://github.com/torvalds/linux/blame/master/net/ipv4/tcp.c#L4775
- commit Sep 2018: torvalds/linux@a337531
- I think socket buffers are managed separately by the kernel; io.DEFAULT_BUFFER_SIZE matters when you read a file and write to the network, or read from the network and write to a file.
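That file-to-network copy path usually goes through shutil.copyfileobj, whose chunk size can be overridden per call today; the 256k below is an illustrative choice, and BytesIO stands in for the file and socket ends:

```python
import io
import shutil

# shutil.copyfileobj copies in fixed-size chunks (the default comes from
# an internal constant in recent versions); passing length= explicitly is
# how a file-to-socket copy loop can use a larger chunk right now.
src = io.BytesIO(b"x" * (1024 * 1024))  # stand-in for an open file
dst = io.BytesIO()                      # stand-in for a socket's makefile()
shutil.copyfileobj(src, dst, length=256 * 1024)
assert dst.getvalue() == b"x" * (1024 * 1024)
```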
On HTTP, a large subset of networking:
- HTTP is mostly large file transfer and would benefit from a much larger buffer, but that's probably more of a concern for urllib/requests.
- requests.content reads in 10k chunks by default.
- requests iter_lines(chunk_size=512, decode_unicode=False, delimiter=None) defaults to 512-byte chunks.
- requests iter_content(chunk_size=1, decode_unicode=False) defaults to 1-byte chunks.
- source: set in 2012 https://github.com/psf/requests/blame/8dd3b26bf59808de24fd654699f592abf6de581e/src/requests/models.py#L80
CPython versions tested on:
3.11
Operating systems tested on:
Other
Linked PRs
- gh-117151: optimize BufferedWriter(), do not buffer writes that are the buffer size #118037
- gh-117151: IO performance improvement, increase io.DEFAULT_BUFFER_SIZE to 128k #118144
- gh-117151: increase default buffer size of shutil.copyfileobj() to 256k. #119783
- gh-117151: optimize algorithm to grow the buffer size for readall() on files #131052