Speed up results serialization

## Describe a requested feature

I was running some performance tests and noticed that checking whether an object is picklable (https://github.com/tunib-ai/parallelformers/blob/ccaea515ee2e4d7540f2a275f6cdb0c33a7780f0/parallelformers/parallel/process.py#L209) takes a long time when the output is large (e.g., when a model returns a large logits tensor), because the whole object is serialized into memory and then deserialized. In which cases does `check_pickable` actually help? Dataclasses and `ModelOutput` instances should be as picklable as their dictionary representations.
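
To make the cost concrete, here is a minimal sketch of a round-trip check like the one described above, plus a small timing helper. `round_trip_check` and `time_check` are hypothetical names for illustration, not the parallelformers implementation, and a plain list of floats stands in for a logits tensor.

```python
import pickle
import time


def round_trip_check(obj):
    """Return True if obj survives pickle.dumps followed by pickle.loads."""
    try:
        # The full payload is built in memory and then parsed again.
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        return False


def time_check(obj, repeats=5):
    """Best wall-clock time (seconds) of round_trip_check over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        round_trip_check(obj)
        best = min(best, time.perf_counter() - start)
    return best


small = [0.0] * 1_000
large = [0.0] * 1_000_000  # stand-in for a large logits tensor

if __name__ == "__main__":
    print(f"small: {time_check(small):.6f}s")
    print(f"large: {time_check(large):.6f}s")
```

The work done by the check grows with the size of the payload, and all of it is thrown away when the object turns out to be picklable.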

If the check is still needed, the code could still be sped up by modifying an object only when pickling fails. That would require some workarounds (perhaps overriding https://github.com/python/cpython/blob/9dc787ea96916552695e79397588fdfa68f22024/Lib/multiprocessing/queues.py#L275), so I want to make sure the check is still necessary before giving it a shot. Another option is to always check for https://github.com/tunib-ai/parallelformers/blob/ccaea515ee2e4d7540f2a275f6cdb0c33a7780f0/parallelformers/parallel/process.py#L236-L239 and modify the object even if it is picklable, but that would drop custom fields added outside the class definition.

	if _is_dataclass_instance(obj) or isinstance(obj, ModelOutput):
	    _obj = asdict(obj)
	    _obj["orig_dataclass_type"] = obj.__class__
	    obj = _obj

Speed up results serialization #46

Describe a requested feature

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Speed up results serialization #46

Description

Describe a requested feature

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions