Commit 81fbdce

Merge branch 'main' into groupby_cat_unobserved
2 parents 513c322 + b858de0 commit 81fbdce

67 files changed: 520 additions, 600 deletions

doc/source/user_guide/options.rst

Lines changed: 1 addition & 1 deletion
@@ -249,7 +249,7 @@ displayed when calling :meth:`~pandas.DataFrame.info`.
 ``display.max_info_rows``: :meth:`~pandas.DataFrame.info` will usually show null-counts for each column.
 For a large :class:`DataFrame`, this can be quite slow. ``max_info_rows`` and ``max_info_cols``
 limit this null check to the specified rows and columns respectively. The :meth:`~pandas.DataFrame.info`
-keyword argument ``null_counts=True`` will override this.
+keyword argument ``show_counts=True`` will override this.
 
 .. ipython:: python
 
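A minimal sketch of the renamed keyword, assuming pandas 1.2 or later (where ``show_counts`` is available). The frame contents and the ``info_text`` helper are illustrative and not part of the commit.

```python
import io

import pandas as pd

# Small illustrative frame with one missing value per column.
df = pd.DataFrame({"a": [1.0, None, 3.0], "b": ["x", "y", None]})


def info_text(frame, **kwargs):
    """Return the text that DataFrame.info would otherwise print."""
    buf = io.StringIO()
    frame.info(buf=buf, **kwargs)
    return buf.getvalue()


# show_counts=True forces the per-column non-null counts to be shown,
# regardless of the display.max_info_rows / max_info_columns limits.
with_counts = info_text(df, show_counts=True)

# show_counts=False suppresses them.
without_counts = info_text(df, show_counts=False)

print(with_counts)
```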

doc/source/whatsnew/v0.13.0.rst

Lines changed: 1 addition & 1 deletion
@@ -733,7 +733,7 @@ Enhancements
 
 .. _scipy: http://www.scipy.org
 .. _documentation: http://docs.scipy.org/doc/scipy/reference/interpolate.html#univariate-interpolation
-.. _guide: http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html
+.. _guide: https://docs.scipy.org/doc/scipy/tutorial/interpolate.html
 
 - ``to_csv`` now takes a ``date_format`` keyword argument that specifies how
   output datetime objects should be formatted. Datetimes encountered in the

doc/source/whatsnew/v1.5.2.rst

Lines changed: 3 additions & 1 deletion
@@ -14,14 +14,16 @@ including other versions of pandas.
 Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - Fixed regression in :meth:`Series.replace` raising ``RecursionError`` with numeric dtype and when specifying ``value=None`` (:issue:`45725`)
+- Fixed regression in :meth:`DataFrame.plot` preventing :class:`~matplotlib.colors.Colormap` instance
+  from being passed using the ``colormap`` argument if Matplotlib 3.6+ is used (:issue:`49374`)
 -
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_152.bug_fixes:
 
 Bug fixes
 ~~~~~~~~~
--
+- Bug in the Copy-on-Write implementation losing track of views in certain chained indexing cases (:issue:`48996`)
 -
 
 .. ---------------------------------------------------------------------------

doc/source/whatsnew/v2.0.0.rst

Lines changed: 7 additions & 0 deletions
@@ -181,6 +181,7 @@ Removal of prior version deprecations/changes
 - Removed deprecated :meth:`.Styler.where` (:issue:`49397`)
 - Removed deprecated :meth:`.Styler.render` (:issue:`49397`)
 - Removed deprecated argument ``null_color`` in :meth:`.Styler.highlight_null` (:issue:`49397`)
+- Removed deprecated ``null_counts`` argument in :meth:`DataFrame.info`. Use ``show_counts`` instead (:issue:`37999`)
 - Enforced deprecation disallowing passing a timezone-aware :class:`Timestamp` and ``dtype="datetime64[ns]"`` to :class:`Series` or :class:`DataFrame` constructors (:issue:`41555`)
 - Enforced deprecation disallowing passing a sequence of timezone-aware values and ``dtype="datetime64[ns]"`` to to :class:`Series` or :class:`DataFrame` constructors (:issue:`41555`)
 - Enforced deprecation disallowing unit-less "datetime64" dtype in :meth:`Series.astype` and :meth:`DataFrame.astype` (:issue:`47844`)
@@ -207,7 +208,9 @@ Removal of prior version deprecations/changes
 - Removed argument ``inplace`` from :meth:`Categorical.remove_unused_categories` (:issue:`37918`)
 - Disallow passing non-round floats to :class:`Timestamp` with ``unit="M"`` or ``unit="Y"`` (:issue:`47266`)
 - Remove keywords ``convert_float`` and ``mangle_dupe_cols`` from :func:`read_excel` (:issue:`41176`)
+- Removed ``errors`` keyword from :meth:`DataFrame.where`, :meth:`Series.where`, :meth:`DataFrame.mask` and :meth:`Series.mask` (:issue:`47728`)
 - Disallow passing non-keyword arguments to :func:`read_excel` except ``io`` and ``sheet_name`` (:issue:`34418`)
+- Disallow passing non-keyword arguments to :meth:`StringMethods.split` and :meth:`StringMethods.rsplit` except for ``pat`` (:issue:`47448`)
 - Disallow passing non-keyword arguments to :meth:`DataFrame.set_index` except ``keys`` (:issue:`41495`)
 - Disallow passing non-keyword arguments to :meth:`Resampler.interpolate` except ``method`` (:issue:`41699`)
 - Disallow passing non-keyword arguments to :meth:`DataFrame.reset_index` and :meth:`Series.reset_index` except ``level`` (:issue:`41496`)
@@ -224,6 +227,7 @@ Removal of prior version deprecations/changes
 - Disallow passing non-keyword arguments to :func:`concat` except for ``objs`` (:issue:`41485`)
 - Disallow passing non-keyword arguments to :func:`pivot` except for ``data`` (:issue:`48301`)
 - Disallow passing non-keyword arguments to :meth:`DataFrame.pivot` (:issue:`48301`)
+- Disallow passing non-keyword arguments to :func:`read_html` except for ``io`` (:issue:`27573`)
 - Disallow passing non-keyword arguments to :func:`read_json` except for ``path_or_buf`` (:issue:`27573`)
 - Disallow passing non-keyword arguments to :func:`read_sas` except for ``filepath_or_buffer`` (:issue:`47154`)
 - Disallow passing non-keyword arguments to :func:`read_stata` except for ``filepath_or_buffer`` (:issue:`48128`)
@@ -279,13 +283,16 @@ Removal of prior version deprecations/changes
 - Removed the ``display.column_space`` option in favor of ``df.to_string(col_space=...)`` (:issue:`47280`)
 - Removed the deprecated method ``mad`` from pandas classes (:issue:`11787`)
 - Removed the deprecated method ``tshift`` from pandas classes (:issue:`11631`)
+- Changed behavior of empty data passed into :class:`Series`; the default dtype will be ``object`` instead of ``float64`` (:issue:`29405`)
 - Changed the behavior of :func:`to_datetime` with argument "now" with ``utc=False`` to match ``Timestamp("now")`` (:issue:`18705`)
+- Changed behavior of :meth:`SparseArray.astype` when given a dtype that is not explicitly ``SparseDtype``, cast to the exact requested dtype rather than silently using a ``SparseDtype`` instead (:issue:`34457`)
 - Changed behavior of :class:`DataFrame` constructor given floating-point ``data`` and an integer ``dtype``, when the data cannot be cast losslessly, the floating point dtype is retained, matching :class:`Series` behavior (:issue:`41170`)
 - Changed behavior of :class:`DataFrame` constructor when passed a ``dtype`` (other than int) that the data cannot be cast to; it now raises instead of silently ignoring the dtype (:issue:`41733`)
 - Changed the behavior of :class:`Series` constructor, it will no longer infer a datetime64 or timedelta64 dtype from string entries (:issue:`41731`)
 - Changed behavior of :class:`Index` constructor when passed a ``SparseArray`` or ``SparseDtype`` to retain that dtype instead of casting to ``numpy.ndarray`` (:issue:`43930`)
 - Removed the deprecated ``base`` and ``loffset`` arguments from :meth:`pandas.DataFrame.resample`, :meth:`pandas.Series.resample` and :class:`pandas.Grouper`. Use ``offset`` or ``origin`` instead (:issue:`31809`)
 - Changed behavior of :meth:`DataFrame.any` and :meth:`DataFrame.all` with ``bool_only=True``; object-dtype columns with all-bool values will no longer be included, manually cast to ``bool`` dtype first (:issue:`46188`)
+- Changed behavior of comparison of a :class:`Timestamp` with a ``datetime.date`` object; these now compare as un-equal and raise on inequality comparisons, matching the ``datetime.datetime`` behavior (:issue:`36131`)
 - Enforced deprecation of silently dropping columns that raised a ``TypeError`` in :class:`Series.transform` and :class:`DataFrame.transform` when used with a list or dictionary (:issue:`43740`)
 -
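As a rough illustration of the empty-``Series`` entry above (:issue:`29405`), the sketch below assumes pandas 2.0 or later for the default-dtype behavior. Passing an explicit ``dtype`` is the version-independent way to keep a ``float64`` result; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# An explicit dtype gives the same result on every pandas version.
empty_float = pd.Series([], dtype="float64")

# Without a dtype, pandas 2.0+ is expected to fall back to object
# (older versions used float64 and emitted a warning).
empty_default = pd.Series([])

print(empty_float.dtype, empty_default.dtype)
```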

pandas/_libs/internals.pyx

Lines changed: 9 additions & 3 deletions
@@ -676,8 +676,9 @@ cdef class BlockManager:
         public bint _known_consolidated, _is_consolidated
         public ndarray _blknos, _blklocs
         public list refs
+        public object parent
 
-    def __cinit__(self, blocks=None, axes=None, refs=None, verify_integrity=True):
+    def __cinit__(self, blocks=None, axes=None, refs=None, parent=None, verify_integrity=True):
         # None as defaults for unpickling GH#42345
         if blocks is None:
             # This adds 1-2 microseconds to DataFrame(np.array([]))
@@ -690,6 +691,7 @@ cdef class BlockManager:
         self.blocks = blocks
         self.axes = axes.copy()  # copy to make sure we are not remotely-mutable
         self.refs = refs
+        self.parent = parent
 
         # Populate known_consolidate, blknos, and blklocs lazily
         self._known_consolidated = False
@@ -805,7 +807,9 @@ cdef class BlockManager:
                 nrefs.append(weakref.ref(blk))
 
             new_axes = [self.axes[0], self.axes[1]._getitem_slice(slobj)]
-            mgr = type(self)(tuple(nbs), new_axes, nrefs, verify_integrity=False)
+            mgr = type(self)(
+                tuple(nbs), new_axes, nrefs, parent=self, verify_integrity=False
+            )
 
             # We can avoid having to rebuild blklocs/blknos
             blklocs = self._blklocs
@@ -827,4 +831,6 @@ cdef class BlockManager:
         new_axes = list(self.axes)
         new_axes[axis] = new_axes[axis]._getitem_slice(slobj)
 
-        return type(self)(tuple(new_blocks), new_axes, new_refs, verify_integrity=False)
+        return type(self)(
+            tuple(new_blocks), new_axes, new_refs, parent=self, verify_integrity=False
+        )
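One plausible reading of the new ``parent`` attribute is that it keeps the sliced-from manager alive, so the weak references in ``refs`` do not go dead when the intermediate object of a chained operation is discarded. The pure-Python sketch below illustrates that idea only; ``Block``, ``TinyManager`` and ``chained_slice`` are hypothetical stand-ins, not pandas internals.

```python
import gc
import weakref


class Block:
    """Hypothetical stand-in for a block of values."""

    def __init__(self, values):
        self.values = values


class TinyManager:
    """Hypothetical manager that tracks parent blocks via weak references."""

    def __init__(self, blocks, refs=None, parent=None):
        self.blocks = blocks
        self.refs = refs      # weak references to the parent's blocks
        self.parent = parent  # optional strong reference to the parent manager

    def get_slice(self, keep_parent):
        # New blocks share the same underlying values object (a "view").
        new_blocks = [Block(blk.values) for blk in self.blocks]
        refs = [weakref.ref(blk) for blk in self.blocks]
        parent = self if keep_parent else None
        return TinyManager(new_blocks, refs, parent=parent)


def chained_slice(keep_parent):
    base = TinyManager([Block([1, 2, 3])])
    # The intermediate manager is a temporary, as in chained indexing.
    child = base.get_slice(keep_parent).get_slice(keep_parent)
    gc.collect()
    return child


lost = chained_slice(keep_parent=False)
kept = chained_slice(keep_parent=True)

# Without the strong reference the weakref is dead and the view is untracked.
print(lost.refs[0]() is None, kept.refs[0]() is not None)
```

In the sketch, holding ``parent`` trades a little extra memory retention for reliable reference tracking.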

pandas/_libs/tslibs/timestamps.pyx

Lines changed: 6 additions & 8 deletions
@@ -364,15 +364,13 @@ cdef class _Timestamp(ABCTimestamp):
             #  which incorrectly drops tz and normalizes to midnight
             #  before comparing
             # We follow the stdlib datetime behavior of never being equal
-            warnings.warn(
-                "Comparison of Timestamp with datetime.date is deprecated in "
-                "order to match the standard library behavior. "
-                "In a future version these will be considered non-comparable. "
-                "Use 'ts == pd.Timestamp(date)' or 'ts.date() == date' instead.",
-                FutureWarning,
-                stacklevel=find_stack_level(),
+            if op == Py_EQ:
+                return False
+            elif op == Py_NE:
+                return True
+            raise TypeError("Cannot compare Timestamp with datetime.date. "
+                "Use ts == pd.Timestamp(date) or ts.date() == date instead."
             )
-            return NotImplemented
         else:
             return NotImplemented
 
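The standard-library behavior that this branch mirrors can be checked without pandas. The sketch below compares a ``datetime.datetime`` with a ``datetime.date``; ``timestamp_vs_date`` is a hypothetical pure-Python restatement of the new branch, not the Cython code itself.

```python
import operator
from datetime import date, datetime

dt = datetime(2020, 1, 1)
d = date(2020, 1, 1)

# Equality never holds between a datetime and a plain date.
equal = dt == d
not_equal = dt != d

# Ordering comparisons raise instead of silently normalizing to midnight.
try:
    dt < d
    ordering_raised = False
except TypeError:
    ordering_raised = True


def timestamp_vs_date(op):
    """Hypothetical mirror of the new Timestamp-versus-date branch."""
    if op is operator.eq:
        return False
    if op is operator.ne:
        return True
    raise TypeError(
        "Cannot compare Timestamp with datetime.date. "
        "Use ts == pd.Timestamp(date) or ts.date() == date instead."
    )


print(equal, not_equal, ordering_raised)
```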

pandas/core/apply.py

Lines changed: 4 additions & 9 deletions
@@ -58,10 +58,7 @@
 from pandas.core.algorithms import safe_sort
 from pandas.core.base import SelectionMixin
 import pandas.core.common as com
-from pandas.core.construction import (
-    create_series_with_explicit_dtype,
-    ensure_wrapped_if_datetimelike,
-)
+from pandas.core.construction import ensure_wrapped_if_datetimelike
 
 if TYPE_CHECKING:
     from pandas import (
@@ -881,14 +878,12 @@ def wrap_results(self, results: ResType, res_index: Index) -> DataFrame | Series
 
         # dict of scalars
 
-        # the default dtype of an empty Series will be `object`, but this
+        # the default dtype of an empty Series is `object`, but this
         # code can be hit by df.mean() where the result should have dtype
         # float64 even if it's an empty Series.
         constructor_sliced = self.obj._constructor_sliced
-        if constructor_sliced is Series:
-            result = create_series_with_explicit_dtype(
-                results, dtype_if_empty=np.float64
-            )
+        if len(results) == 0 and constructor_sliced is Series:
+            result = constructor_sliced(results, dtype=np.float64)
         else:
             result = constructor_sliced(results)
         result.index = res_index
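The replacement pattern in this hunk can be sketched in isolation: only the empty case gets an explicit ``float64`` dtype, and everything else is left to normal inference. ``wrap_scalar_results`` is a hypothetical helper written for illustration, not a pandas function.

```python
import numpy as np
import pandas as pd


def wrap_scalar_results(results):
    """Hypothetical stand-in for the dict-of-scalars branch in wrap_results."""
    if len(results) == 0:
        # An empty result (e.g. a reduction over zero columns) stays float64.
        return pd.Series(results, dtype=np.float64)
    # Non-empty results keep whatever dtype pandas infers.
    return pd.Series(results)


empty = wrap_scalar_results({})
filled = wrap_scalar_results({0: 1.5, 1: 2.5})

print(empty.dtype, filled.dtype)
```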

pandas/core/arrays/sparse/array.py

Lines changed: 9 additions & 8 deletions
@@ -120,6 +120,8 @@ class ellipsis(Enum):
 
     SparseIndexKind = Literal["integer", "block"]
 
+    from pandas.core.dtypes.dtypes import ExtensionDtype
+
     from pandas import Series
 
 else:
@@ -1328,14 +1330,13 @@ def astype(self, dtype: AstypeArg | None = None, copy: bool = True):
         future_dtype = pandas_dtype(dtype)
         if not isinstance(future_dtype, SparseDtype):
             # GH#34457
-            warnings.warn(
-                "The behavior of .astype from SparseDtype to a non-sparse dtype "
-                "is deprecated. In a future version, this will return a non-sparse "
-                "array with the requested dtype. To retain the old behavior, use "
-                "`obj.astype(SparseDtype(dtype))`",
-                FutureWarning,
-                stacklevel=find_stack_level(),
-            )
+            if isinstance(future_dtype, np.dtype):
+                values = np.array(self)
+                return astype_nansafe(values, dtype=future_dtype)
+            else:
+                dtype = cast(ExtensionDtype, dtype)
+                cls = dtype.construct_array_type()
+                return cls._from_sequence(self, dtype=dtype, copy=copy)
 
         dtype = self.dtype.update_dtype(dtype)
         subtype = pandas_dtype(dtype._subtype_with_str)
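From the caller's side, the change means a non-sparse target dtype is now honored exactly. The sketch below assumes pandas 2.0 or later for the dense result; wrapping the target in ``SparseDtype``, as the removed warning suggested, keeps the data sparse on any version. The values are illustrative.

```python
import numpy as np
import pandas as pd

sparse = pd.arrays.SparseArray([0, 0, 1, 2])

# A plain NumPy dtype: pandas 2.0+ is expected to return a dense result
# with exactly the requested dtype.
dense = sparse.astype("float64")

# An explicit SparseDtype keeps the result sparse.
still_sparse = sparse.astype(pd.SparseDtype("float64", fill_value=0.0))

print(type(dense).__name__, type(still_sparse).__name__)
```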

pandas/core/base.py

Lines changed: 6 additions & 4 deletions
@@ -71,7 +71,6 @@
 from pandas.core.arraylike import OpsMixin
 from pandas.core.arrays import ExtensionArray
 from pandas.core.construction import (
-    create_series_with_explicit_dtype,
     ensure_wrapped_if_datetimelike,
     extract_array,
 )
@@ -842,9 +841,12 @@ def _map_values(self, mapper, na_action=None):
                 # expected to be pd.Series(np.nan, ...). As np.nan is
                 # of dtype float64 the return value of this method should
                 # be float64 as well
-                mapper = create_series_with_explicit_dtype(
-                    mapper, dtype_if_empty=np.float64
-                )
+                from pandas import Series
+
+                if len(mapper) == 0:
+                    mapper = Series(mapper, dtype=np.float64)
+                else:
+                    mapper = Series(mapper)
 
         if isinstance(mapper, ABCSeries):
             if na_action not in (None, "ignore"):
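The user-visible effect of the empty-mapper branch can be observed through ``Series.map``: mapping with an empty dict yields an all-NaN ``float64`` result, as the code comment in the hunk describes. The input values below are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3])

# An empty dict mapper matches nothing, so every value becomes NaN (float64).
mapped_empty = s.map({})

# A non-empty dict maps matching keys and leaves the rest missing.
mapped_partial = s.map({1: "one"})

print(mapped_empty.dtype)
```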

pandas/core/computation/engines.py

Lines changed: 0 additions & 1 deletion
@@ -102,7 +102,6 @@ def _evaluate(self):
         -----
         Must be implemented by subclasses.
         """
-        pass
 
 
 class NumExprEngine(AbstractEngine):
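Removing ``pass`` here is behavior-preserving because a docstring alone is a valid function body in Python. A small sketch with hypothetical class names (``SketchEngine`` and ``ConstantEngine`` are not pandas classes):

```python
from abc import ABC, abstractmethod


class SketchEngine(ABC):
    """Hypothetical abstract engine used only for illustration."""

    @abstractmethod
    def _evaluate(self):
        """
        Return an evaluated expression.

        Must be implemented by subclasses.
        """
        # No `pass` needed: the docstring is already a complete body.


class ConstantEngine(SketchEngine):
    def _evaluate(self):
        return 42


result = ConstantEngine()._evaluate()
print(result)
```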

pandas/core/construction.py

Lines changed: 0 additions & 60 deletions
@@ -8,7 +8,6 @@
 
 from typing import (
     TYPE_CHECKING,
-    Any,
     Optional,
     Sequence,
     Union,
@@ -830,62 +829,3 @@ def _try_cast(
         subarr = np.array(arr, dtype=dtype, copy=copy)
 
     return subarr
-
-
-def is_empty_data(data: Any) -> bool:
-    """
-    Utility to check if a Series is instantiated with empty data,
-    which does not contain dtype information.
-
-    Parameters
-    ----------
-    data : array-like, Iterable, dict, or scalar value
-        Contains data stored in Series.
-
-    Returns
-    -------
-    bool
-    """
-    is_none = data is None
-    is_list_like_without_dtype = is_list_like(data) and not hasattr(data, "dtype")
-    is_simple_empty = is_list_like_without_dtype and not data
-    return is_none or is_simple_empty
-
-
-def create_series_with_explicit_dtype(
-    data: Any = None,
-    index: ArrayLike | Index | None = None,
-    dtype: Dtype | None = None,
-    name: str | None = None,
-    copy: bool = False,
-    fastpath: bool = False,
-    dtype_if_empty: Dtype = object,
-) -> Series:
-    """
-    Helper to pass an explicit dtype when instantiating an empty Series.
-
-    This silences a DeprecationWarning described in GitHub-17261.
-
-    Parameters
-    ----------
-    data : Mirrored from Series.__init__
-    index : Mirrored from Series.__init__
-    dtype : Mirrored from Series.__init__
-    name : Mirrored from Series.__init__
-    copy : Mirrored from Series.__init__
-    fastpath : Mirrored from Series.__init__
-    dtype_if_empty : str, numpy.dtype, or ExtensionDtype
-        This dtype will be passed explicitly if an empty Series will
-        be instantiated.
-
-    Returns
-    -------
-    Series
-    """
-    from pandas.core.series import Series
-
-    if is_empty_data(data) and dtype is None:
-        dtype = dtype_if_empty
-    return Series(
-        data=data, index=index, dtype=dtype, name=name, copy=copy, fastpath=fastpath
-    )

pandas/core/dtypes/dtypes.py

Lines changed: 0 additions & 2 deletions
@@ -116,8 +116,6 @@ class CategoricalDtypeType(type):
     the type of CategoricalDtype, this metaclass determines subclass ability
     """
 
-    pass
-
 
 @register_extension_dtype
 class CategoricalDtype(PandasExtensionDtype, ExtensionDtype):
