Description
Narwhals started off with the objective of being a lightweight compatibility layer. As we've been adding features and supported backends, the package size has been growing.
There's a lot of essential dataframe functionality, and a lot of libraries that people want to support, so some increase in size since the earliest days is expected. But we do need to monitor it, it does need to stay under control, and Narwhals does need to stay lightweight.
Commitment: I'd like to suggest a hard commitment that:
- The Narwhals wheel size will never go above 500 kB. It's currently 305 kB
- Narwhals' size on disk will never go above 5000 kB (i.e.: when you make a virtual environment, the difference in size before vs after doing `pip install narwhals`. This includes some cached files which Python generates, but still, I think it's good to monitor the overall size). It's currently 3789 kB.
- Never introduce any required dependencies nor compiled code
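
To make the two size limits concrete, here is a minimal sketch of how they could be checked in a script. The helper names (`size_kb`, `within_budget`) are hypothetical and not part of Narwhals, and it assumes 1 kB = 1000 bytes, which may not match how the figures above were measured.

```python
from pathlib import Path

# Limits proposed in this issue, in kB (assumption: 1 kB = 1000 bytes).
WHEEL_LIMIT_KB = 500
ON_DISK_LIMIT_KB = 5000


def size_kb(path: Path) -> float:
    """Size of a single file, or of all files under a directory, in kB."""
    if path.is_file():
        return path.stat().st_size / 1000
    # Walk the tree recursively and add up regular files only.
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file()) / 1000


def within_budget(path: Path, limit_kb: float) -> bool:
    """True if the file or directory at `path` is at or under `limit_kb`."""
    return size_kb(path) <= limit_kb
```

For the wheel limit, `path` would be the built `.whl` file. For the on-disk limit, one option is to call `size_kb` on a fresh virtual environment before and after `pip install narwhals` and compare the difference against `ON_DISK_LIMIT_KB`.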
#1886 will probably increase our size a bit more. I think that's OK, as Ibis is a library that a few maintainers have said that they want to support. But it does bring us closer to the limits.
Some strategies to reduce size are:
- Reduce overly-long docstrings. Some examples of how to do this are in docs: Make DataFrame and LazyFrame docstrings shorter and more concise #1939 and docs: shorten docstring examples in narwhals/expr.py #1915
- More code-sharing. chore: refactor `name` namespaces to lower code duplication #1876 is a nice example, and I think there's more opportunities to do this
- Directly implement some methods at the Narwhals level, instead of at the compliant level. For example, is `is_duplicated` just the negation of `is_unique`?
- Freeze new features which don't have a use case. `Series.hist` is fine because it's been requested by Marimo, so it has a clear use case. Anything without a clear use-case, I think we may need to put the brakes on, at least until Request for contributions: Ibis support #1886 is resolved
- See if there's any linter configurations that would reduce the size. I really don't want to minify Narwhals - legibility is important - but maybe there's some simple settings we can tweak, like line length / grouping imports / commas, that can reduce size a bit "for free"
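
As a rough illustration of the "implement at the Narwhals level" idea, here is a toy sketch in plain Python. `ToyCompliantSeries` and `ToySeries` are hypothetical stand-ins, not Narwhals' actual classes, and the sketch assumes that `is_duplicated` really is the element-wise negation of `is_unique`, which is exactly the open question above (null handling across backends, for instance, is not checked here).

```python
from collections import Counter


class ToyCompliantSeries:
    """Hypothetical stand-in for a backend-specific ("compliant") series."""

    def __init__(self, values):
        self._values = list(values)

    def is_unique(self):
        # True where the value occurs exactly once in the series.
        counts = Counter(self._values)
        return ToyCompliantSeries(counts[v] == 1 for v in self._values)

    def __invert__(self):
        # Element-wise boolean negation.
        return ToyCompliantSeries(not v for v in self._values)

    def to_list(self):
        return list(self._values)


class ToySeries:
    """Hypothetical stand-in for the user-facing, Narwhals-level series."""

    def __init__(self, compliant):
        self._compliant = compliant

    def is_unique(self):
        return ToySeries(self._compliant.is_unique())

    def is_duplicated(self):
        # Derived once here from is_unique, so each backend only needs
        # is_unique and negation rather than its own is_duplicated.
        return ToySeries(~self._compliant.is_unique())

    def to_list(self):
        return self._compliant.to_list()
```

With this layout, adding a backend means writing one method fewer per derived operation, which is where the size saving would come from.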
Any help towards this goal would be appreciated - thank you, and thank you to everyone who has contributed in any way to Narwhals 🙏