Skip to content

Proto serialization v25 #4333

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Aug 4, 2021
Merged

Conversation

dstrain115
Copy link
Collaborator

This PR creates a new class called CircuitSerializer that can serialize Circuits into protocol buffers more efficiently than existing mechanisms.

This class is similar to SerializableGateSet and op_(de)serializer, in that it serializes and deserializes cirq.Circuits into Program protocol buffers, but it is about 60% more space efficient and has significant time savings as well.

This serializer works by changing the Gate field and Args dict to a specific message for each allowed gate type. It changes Qubits from a repeated Qubit message, to an index in the constant message.

This PR also adds a FloatArg message which is specific to Arguments where it is known the value will be a float (or a sympy expression).

@dstrain115 dstrain115 requested review from cduck, vtomole, wcourtney and a team as code owners July 19, 2021 17:49
@google-cla google-cla bot added the cla: yes Makes googlebot stop complaining. label Jul 19, 2021
Copy link
Collaborator

@MichaelBroughton MichaelBroughton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. A few minor nits. My biggest concern before merging is that the tests are totally comprehensive. Are we including all the cases with sympy.Symbols that we can in circuit_serializer_test.py ?

@@ -34,6 +34,7 @@ message Constant {
oneof const_value {
string string_value = 1;
Circuit circuit_value = 2;
Qubit qubit = 3;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add comments here and in the rest of the file describing these new fields (ideally mentioning that these are new for the v25 and won't nescessarily be populated by the old serializer) ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

self,
gate_set_name: str,
):
"""Construct the gate set.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This isn't a gate set ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 56 to 63
serializers: The OpSerializers to use for serialization.
Multiple serializers for a given gate type are allowed and
will be checked for a given type in the order specified here.
This allows for a given gate type to be serialized into
different serialized form depending on the parameters of the
gate.
deserializers: The OpDeserializers to convert serialized
forms of gates or circuits into Operations.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Args not present.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 105 to 106
else:
raise NotImplementedError(f'Unrecognized program type: {type(program)}')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Convert this to a less nested check.

if not isinstance(program, cirq.Circuit):
   raise NotImplementedError

rest of code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

msg: Union[None, v2.program_pb2.Operation, v2.program_pb2.CircuitOperation] = None,
**kwargs,
) -> Union[v2.program_pb2.Operation, v2.program_pb2.CircuitOperation]:
"""Disambiguation for operation serialization."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no args list or return types.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed this function.

],
**kwargs,
) -> cirq.Operation:
"""Disambiguation for operation deserialization."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing args and return type

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REmoved this function.

)


OPERATIONS = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to include symbol cases here and in the CircuitOperations tests ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. CircuitOperations with symbols still pending.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified circuit operation test to include symbols.

FloatArg duration_nanos=1;
}

enum GateType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be able to remove GateType enum since it is replaced by gate_value oneof field.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed. This was for side-by-side performance testing and I forgot to remove it.


// Map from the argument name to the Argument needed to fully specify
// the gate.
map<string, Arg> args = 2;
repeated Arg arg_list = 7;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can also remove arg_list field since gate_value submessages have their own gate-specific args.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

msg = v2.program_pb2.FloatArg() if out is None else out

if isinstance(
value, (float, int, sympy.Integer, sympy.Float, sympy.Rational, sympy.NumberSymbol)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to put this type tuple into a module-level constant since it is also used on line 147 below.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

which = arg_proto.WhichOneof('arg')
if which == 'float_value':
result = float(arg_proto.float_value)
if math.ceil(result) == math.floor(result):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do if math.round(result) == result:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if required_arg_name is not None:
raise ValueError(
f'{required_arg_name} is missing or has an unrecognized '
f'argument type (WhichOneof("arg")={which!r}).'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest separating these two cases to give more specific error messages:

if which == 'float_value':
    ...
elif which == 'symbol':
    ...
elif which == 'func':
    ...
elif which is None:
    if required_arg_name is not None:
        raise ValueError(f'{required_arg_name} is missing')
    return None
else:
    raise ValueError(f'unrecognized argument type: WhichOneof("arg")={which!r}')

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, done.

arg_function_language: The `arg_function_language` field from
`Program.Language`.
use_constants: Whether to use constants in serialization. This is
required to be True for serializing CircuitOperations.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would be the point of allowing this to be false? Could we remove this option and always use the constants table for v2.5?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed (for now). My main concern was keeping the function signatures similar to SerializableGateSet, since the next change would be to add an interface that both of them follow, so that I can use both as a "gate_set". I think I can get around this (or deprecate the use_constants in the other version? In any case, I will deal with this in the follow-up PR when I unify the interface.

raw_constants=raw_constants,
)
if constants is not None:
msg.constants.extend(constants)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we instead pass msg.constants into _serialize_circuit so that the function can add constants to it directly? Perhaps this could be combined with raw_constants into a helper object that deals with the details of storing constants, since constants and raw_constants must both be present or not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passed msg.constants into the file now. Decided not to make a helper object given that raw_constants is the helper object now, and, with all the refactoring, everything that uses these is now private.

return self.serialize_circuit_op(op, msg, **kwargs)
raise ValueError(f'Operation is of an unrecognized type: {op!r}')

if isinstance(msg, v2.program_pb2.Operation):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit awkward to have to check at runtime to match types between the proto message and the operation itself. I'd suggest having _serialize_circuit call serialize_gate_op or serialize_circuit_op directly since it already checks whether or not a given op is a CircuitOperation. I suppose this method could stay if you want it to be available to call directly (e.g. for tests), but I think it'd probably be fine to remove.

Also, I'd suggest moving _serialize_circuit up to be immediately after the serialize method; it's a bit hard to follow the flow when that internal method is located down after all the deserialization code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

arg_function_language=arg_function_language,
)
return msg
raise ValueError(f'Cannot serialize op {op!r} of type {type(gate)}')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest changing this to a chain of elif with a final else that raises the exception, then having a single return at the end. Could also delay adding qubits and tags until after this check of gate types, since it doesn't make sense to do either of those if the gate type itself is not supported.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Returns:
The cirq.google.api.v2.CircuitOperation proto.
"""
circuit = getattr(op.untagged, 'circuit', None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should fail if circuit is None, so could remove the default here. But I would suggest instead doing an explicit type check:

if not isinstance(op.untagged, cirq.CircuitOperation):
   raise ValueError(...)
circuit = op.untagged.circuit

This should also make the cast below unnecessary since we've already checked the type of op.untagged.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@maffoo maffoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of nits, but otherwise LGTM!


// Representation of cirq.YPowGate
message YPowGate {
FloatArg exponent=1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add whitespace around = for consistency, in several places in this file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


SUPPORTED_SYMPY_OPS = (sympy.Symbol, sympy.Add, sympy.Mul, sympy.Pow)

# Argument types for gates.
ARG_LIKE = Union[int, float, List[bool], str, sympy.Symbol, sympy.Add, sympy.Mul]
FLOAT_ARG_LIKE = Union[float, sympy.Symbol, sympy.Add, sympy.Mul]
FLOAT_LIKE = (float, int, sympy.Integer, sympy.Float, sympy.Rational, sympy.NumberSymbol)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe name this differently to distinguish it from the Unions above, e.g. FLOAT_LIKE_TYPES. Might also be nice to add a comment to indicate why a tuple is needed here, instead of a union.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@dstrain115
Copy link
Collaborator Author

Let me know if there are additional comments. I will be submitting this week.

@MichaelBroughton MichaelBroughton added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label Aug 4, 2021
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Aug 4, 2021
@CirqBot CirqBot merged commit b1dc973 into quantumlib:master Aug 4, 2021
@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels Aug 4, 2021
@tonybruguier
Copy link
Contributor

Hello,
Does this PR required a bump the the version of protobuf in requirements.txt? It seems that the file cirq-google/cirq_google/api/v1/program_pb2.py contains code referring to _descriptor._internal_create_key which may be from a newer version?

I'm not certain, but I did some rudimentary tests and it seems that version protobuf==3.17.3 might be compatible.

@tonybruguier
Copy link
Contributor

I took the liberty of sending #4473 but I am happy to withdraw if misguided.

CirqBot pushed a commit that referenced this pull request Sep 10, 2021
It seems PR #4333 used a newer version of the proto syntax, but the requirements have not been updated, so this PR bumps them,

As a check, I decided to execute the follow commands:
```
python3 -m pip install protobuf~=3.17.3
python3 -m pip install grpcio-tools~=1.34.1
python3 -m pip install grpcio~=1.34.1
python3 -m pip install mypy-protobuf==1.10

./dev_tools/build-protos.sh
```

And the generated protos are unchanged, so maybe it's correct?
rht pushed a commit to rht/Cirq that referenced this pull request May 1, 2023
This PR creates a new class called CircuitSerializer that can serialize Circuits into protocol buffers more efficiently than existing mechanisms.

This class is similar to SerializableGateSet and op_(de)serializer, in that it serializes and deserializes cirq.Circuits into Program protocol buffers, but it is about 60% more space efficient and has significant time savings as well.

This serializer works by changing the Gate field and Args dict to a specific message for each allowed gate type.  It changes Qubits from a repeated Qubit message, to an index in the constant message.

This PR also adds a FloatArg message which is specific to Arguments where it is known the value will be a float (or a sympy expression).
rht pushed a commit to rht/Cirq that referenced this pull request May 1, 2023
…4473)

It seems PR quantumlib#4333 used a newer version of the proto syntax, but the requirements have not been updated, so this PR bumps them,

As a check, I decided to execute the follow commands:
```
python3 -m pip install protobuf~=3.17.3
python3 -m pip install grpcio-tools~=1.34.1
python3 -m pip install grpcio~=1.34.1
python3 -m pip install mypy-protobuf==1.10

./dev_tools/build-protos.sh
```

And the generated protos are unchanged, so maybe it's correct?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Makes googlebot stop complaining.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants