Timeseries example not reproducible #2018

Open
@twoody2007

Description

Issue Type: Documentation Bug

Source: binary

Keras Version: 3.7.0

Custom Code: No

OS Platform and Distribution: Ubuntu 22.04

Python version: 3.12.8

GPU model and memory: RTX 5000 Ada

Current Behavior?

Running the code from this time series example does not produce the same number of parameters as the example output in the documentation, and the model does not achieve the stated accuracy.

The linked Colab notebook has the same issue.

What is strange is that the doc's final dense layer has ~64K params, while running the code produces a dense layer whose param count is just 2x the MLP units value (which is 128, i.e. 256 params). I tried increasing the units to see if that fixed the problem, but it seems that something is structurally different between how this code runs on Keras 2.4 vs 3.7.0.
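For illustration, here is a minimal shape probe (my own sketch, not code from the tutorial) showing where I suspect the two summaries diverge, assuming the `GlobalAveragePooling1D` data_format handling is what changed between versions:

```python
import keras
from keras import layers

# Hypothetical probe: compare how GlobalAveragePooling1D treats the
# example's (500, 1) input under each data_format.
inputs = keras.Input(shape=(500, 1))
for fmt in ("channels_first", "channels_last"):
    pooled = layers.GlobalAveragePooling1D(data_format=fmt)(inputs)
    print(fmt, pooled.shape)

# channels_first pools the size-1 last axis, keeping 500 features, so the
# MLP head Dense(128) has 500 * 128 + 128 = 64,128 params -- the ~64K in
# the doc. channels_last pools the 500 timesteps down to (None, 1), so
# Dense(128) has only 1 * 128 + 128 = 256 params -- exactly 2x the MLP
# units, which matches the summary in my run.
```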

I expected close to the same output as what is documented on the page.

Standalone code to reproduce the issue or tutorial link

You can run the colab example:
* https://colab.research.google.com/github/keras-team/keras-io/blob/master/examples/timeseries/ipynb/timeseries_classification_transformer.ipynb

or run the code located here:
* https://github.com/keras-team/keras-io/blob/master/examples/timeseries/timeseries_classification_transformer.py

Relevant log output

(tcap) travis@travis-p1-g6:~/projects/tetra_capital$ python scripts/example_classifications.py 
Model: "functional"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                  ┃ Output Shape              ┃         Param # ┃ Connected to               ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ input_layer (InputLayer)      │ (None, 500, 1)            │               0 │ -                          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention          │ (None, 500, 1)            │           7,169 │ input_layer[0][0],         │
│ (MultiHeadAttention)          │                           │                 │ input_layer[0][0]          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_1 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention[0][0] │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization           │ (None, 500, 1)            │               2 │ dropout_1[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add (Add)                     │ (None, 500, 1)            │               0 │ layer_normalization[0][0], │
│                               │                           │                 │ input_layer[0][0]          │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d (Conv1D)               │ (None, 500, 4)            │               8 │ add[0][0]                  │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_2 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d[0][0]               │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_1 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_2[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_1         │ (None, 500, 1)            │               2 │ conv1d_1[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_1 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_1[0][… │
│                               │                           │                 │ add[0][0]                  │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_1        │ (None, 500, 1)            │           7,169 │ add_1[0][0], add_1[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_4 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention_1[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_2         │ (None, 500, 1)            │               2 │ dropout_4[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_2 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_2[0][… │
│                               │                           │                 │ add_1[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_2 (Conv1D)             │ (None, 500, 4)            │               8 │ add_2[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_5 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d_2[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_3 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_5[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_3         │ (None, 500, 1)            │               2 │ conv1d_3[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_3 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_3[0][… │
│                               │                           │                 │ add_2[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_2        │ (None, 500, 1)            │           7,169 │ add_3[0][0], add_3[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_7 (Dropout)           │ (None, 500, 1)            │               0 │ multi_head_attention_2[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_4         │ (None, 500, 1)            │               2 │ dropout_7[0][0]            │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_4 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_4[0][… │
│                               │                           │                 │ add_3[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_4 (Conv1D)             │ (None, 500, 4)            │               8 │ add_4[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_8 (Dropout)           │ (None, 500, 4)            │               0 │ conv1d_4[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_5 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_8[0][0]            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_5         │ (None, 500, 1)            │               2 │ conv1d_5[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_5 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_5[0][… │
│                               │                           │                 │ add_4[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ multi_head_attention_3        │ (None, 500, 1)            │           7,169 │ add_5[0][0], add_5[0][0]   │
│ (MultiHeadAttention)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_10 (Dropout)          │ (None, 500, 1)            │               0 │ multi_head_attention_3[0]… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_6         │ (None, 500, 1)            │               2 │ dropout_10[0][0]           │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_6 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_6[0][… │
│                               │                           │                 │ add_5[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_6 (Conv1D)             │ (None, 500, 4)            │               8 │ add_6[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_11 (Dropout)          │ (None, 500, 4)            │               0 │ conv1d_6[0][0]             │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ conv1d_7 (Conv1D)             │ (None, 500, 1)            │               5 │ dropout_11[0][0]           │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ layer_normalization_7         │ (None, 500, 1)            │               2 │ conv1d_7[0][0]             │
│ (LayerNormalization)          │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ add_7 (Add)                   │ (None, 500, 1)            │               0 │ layer_normalization_7[0][… │
│                               │                           │                 │ add_6[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ global_average_pooling1d      │ (None, 1)                 │               0 │ add_7[0][0]                │
│ (GlobalAveragePooling1D)      │                           │                 │                            │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dense (Dense)                 │ (None, 2048)              │           4,096 │ global_average_pooling1d[… │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dropout_12 (Dropout)          │ (None, 2048)              │               0 │ dense[0][0]                │
├───────────────────────────────┼───────────────────────────┼─────────────────┼────────────────────────────┤
│ dense_1 (Dense)               │ (None, 2)                 │           4,098 │ dropout_12[0][0]           │
└───────────────────────────────┴───────────────────────────┴─────────────────┴────────────────────────────┘
 Total params: 36,938 (144.29 KB)
 Trainable params: 36,938 (144.29 KB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 22s 284ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6927 - val_sparse_categorical_accuracy: 0.5354
Epoch 2/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 9s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5024 - val_loss: 0.6925 - val_sparse_categorical_accuracy: 0.5354
Epoch 3/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 84ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5005 - val_loss: 0.6926 - val_sparse_categorical_accuracy: 0.5354
Epoch 4/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5031 - val_loss: 0.6925 - val_sparse_categorical_accuracy: 0.5354
Epoch 5/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5155 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 6/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6933 - sparse_categorical_accuracy: 0.5004 - val_loss: 0.6924 - val_sparse_categorical_accuracy: 0.5354
Epoch 7/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6931 - sparse_categorical_accuracy: 0.5078 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 8/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5096 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 9/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5131 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 10/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6928 - sparse_categorical_accuracy: 0.5196 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 11/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5021 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 12/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6934 - sparse_categorical_accuracy: 0.4936 - val_loss: 0.6924 - val_sparse_categorical_accuracy: 0.5354
Epoch 13/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6928 - sparse_categorical_accuracy: 0.5176 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 14/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6933 - sparse_categorical_accuracy: 0.4975 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 15/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5098 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 16/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5078 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 17/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6927 - sparse_categorical_accuracy: 0.5171 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 18/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5118 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 19/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 20/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 84ms/step - loss: 0.6932 - sparse_categorical_accuracy: 0.5029 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 21/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5075 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 22/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6929 - sparse_categorical_accuracy: 0.5145 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 23/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5101 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
Epoch 24/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5090 - val_loss: 0.6923 - val_sparse_categorical_accuracy: 0.5354
Epoch 25/150
45/45 ━━━━━━━━━━━━━━━━━━━━ 4s 85ms/step - loss: 0.6930 - sparse_categorical_accuracy: 0.5079 - val_loss: 0.6922 - val_sparse_categorical_accuracy: 0.5354
42/42 ━━━━━━━━━━━━━━━━━━━━ 5s 58ms/step - loss: 0.6925 - sparse_categorical_accuracy: 0.5264
