Components push is super slow on single-file mode #211

Open
edvinasjurele opened this issue May 30, 2025 · 6 comments
Labels
pending-triage [Issue] Ticket is pending to be prioritised

Comments


edvinasjurele commented May 30, 2025

I have a space with 221 blocks (including 17 folders and extensive whitelisting). When running components push using the new beta CLI in single-file mode, I encountered several issues:

  1. [performance] The command seems super slow in general. It iterates through all components in the list and makes lots of requests, each taking ~1.5-2s. The current CLI was way faster!

[screenshot]

I was 10 minutes ⚠ ❗ in and I still got an error:

[screenshot]

  2. [validation] The validation error (Component ... not found) only occurs at runtime (during iteration), but this check could be performed statically upfront.

I haven't checked the code yet, but the new CLI seems smarter in traversing whitelist groups and blocks. Errors only surface during runtime, though, forcing a full re-run of the components push process, which adds ~10 minutes of waiting after an error is surfaced and manually fixed. These errors typically stem from outdated JSON or misconfigurations (e.g. whitelists referencing deleted blocks). Since removing a block doesn't update associated whitelists in the CMS, this leads to mismatches that could be caught earlier with static validation.

I believe the whitelist issue is likely to affect many users—not just during migration from v3 to v4, but also in regular workflows. For example, if someone deletes a block, its technical name might still remain in other components’ whitelist arrays.

Is there any cleanup happening automatically when a block is removed? If not, I strongly recommend introducing a static validation step before running components push. This would help catch broken references early and avoid wasting time on lengthy runtime validations.

This could either be part of components push or introduced as a separate components validate command. Otherwise, if static validation isn’t feasible, perhaps whitelist references shouldn’t be strictly validated at all (just throw warnings?)

  3. [DX] When pushing components in single-file mode, why is the whitelist iterated for each component? It seems some blocks might be pushed multiple times, leading to excessive outgoing requests. Maybe there should be an in-memory set so each component is pushed ONLY once. If this is a full space configuration push, a one-time static validation should be enough. Once validated, the push could proceed as efficiently as in the old CLI, without redundant operations or iterations.

The old CLI typically takes 800~1000ms per push request, hence ~4 minutes to push all of my 221 blocks and all those 17 folders.
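The "push each component only once" idea above could be sketched as a seen-set guarding the push loop. This is only a sketch of the suggestion, not the CLI's actual implementation; `pushComponent` is a hypothetical stand-in for the real API call:

```typescript
// Sketch: push each component at most once per run, even when it is
// reachable through several whitelists. `pushComponent` is a hypothetical
// stand-in for the real Management API call.

type PushFn = (name: string) => Promise<void>;

async function pushAllOnce(
  componentNames: string[],
  whitelists: Map<string, string[]>, // component -> whitelisted components
  pushComponent: PushFn,
): Promise<string[]> {
  const pushed = new Set<string>(); // in-memory state for this run
  const order: string[] = [];

  async function pushWithDeps(name: string): Promise<void> {
    if (pushed.has(name)) return; // already handled this run
    pushed.add(name); // mark before recursing, so cycles can't loop forever
    for (const dep of whitelists.get(name) ?? []) {
      await pushWithDeps(dep); // whitelisted dependencies first
    }
    await pushComponent(name);
    order.push(name);
  }

  for (const name of componentNames) {
    await pushWithDeps(name);
  }
  return order;
}
```

With this guard, a space of 221 blocks would trigger 221 pushes regardless of how many whitelists reference each block.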


edvinasjurele commented May 30, 2025

I am having a super hard time understanding and manually fixing the Circular dependency detected ... error :/ I can give you a full JSON dump to take a closer look.

Therefore I cannot calculate how long it takes to push all data with the new CLI to compare speeds.

@edvinasjurele

As mentioned above, it's very easy for component_whitelist to become outdated or invalid. Here’s a simple example:

  1. Create component-a, component-b, and component-c.
  2. In component-c, set "component_whitelist": ["component-a", "component-b"].
  3. Later, delete component-a when it's no longer needed.
  4. Observe ❗ At this point, component-c still references component-a in its component_whitelist. This stale reference remains as junk, and the only way to clean it up is to manually edit and re-save component-c in the CMS GUI.

This kind of leftover config can easily accumulate and break static validations or newer CLI behavior, so a cleanup mechanism or automated validation would be highly beneficial.
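The stale-reference scenario above is exactly what a static pre-push pass could catch. A minimal sketch of such a check; the component shape here is an assumption modeled on typical Storyblok component JSON exports, not the CLI's actual types:

```typescript
// Sketch of an upfront whitelist validation pass. The component shape
// (name + schema fields carrying an optional component_whitelist array)
// is assumed from typical Storyblok component JSON exports.

interface SchemaField {
  type: string;
  component_whitelist?: string[];
}

interface Component {
  name: string;
  schema: Record<string, SchemaField>;
}

/** Return every whitelist entry that names a component missing from the file. */
function findDanglingWhitelistRefs(components: Component[]): string[] {
  const known = new Set(components.map((c) => c.name));
  const dangling: string[] = [];
  for (const component of components) {
    for (const [field, def] of Object.entries(component.schema)) {
      for (const ref of def.component_whitelist ?? []) {
        if (!known.has(ref)) {
          dangling.push(`${component.name}.${field} -> ${ref}`);
        }
      }
    }
  }
  return dangling;
}

// Example: component-a was deleted, but component-c still whitelists it.
const components: Component[] = [
  { name: "component-b", schema: {} },
  {
    name: "component-c",
    schema: {
      body: { type: "bloks", component_whitelist: ["component-a", "component-b"] },
    },
  },
];

console.log(findDanglingWhitelistRefs(components));
// -> ["component-c.body -> component-a"]
```

A pass like this runs entirely offline, so broken references would surface in milliseconds instead of minutes into a push.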


alvarosabu commented Jun 2, 2025

Hi @edvinasjurele as usual, thanks for providing such detailed feedback 💚

push speed

The command seems super slow in general. It iterates through all components in the list and makes lots of requests, each taking ~1.5-2s. The current CLI was way faster!

The old CLI typically takes 800~1000ms per push request, hence ~4 minutes to push all of my 221 blocks and all those 17 folders.

It would be unfair to compare the speed of this new implementation with the current CLI: v3 didn't cover component whitelist dependencies and didn't upsert resources (like folders, etc.) if they already existed in the target space. So it's not only pushing those 221 blocks and 17 folders; if you have multiple component whitelists, the number adds up, please consider that.

That being said, I'm sure performance is definitely improvable, and I will look with the team at how we can make it faster. We are limited by the API not having bulk operations or a proper UPSERT, which could improve both speed and rate-limit issues.

Errors and circular dependencies

forcing a full re-run of the components push process

The way the CLI push command is implemented, if a specific component fails to be pushed, the rest of the operations that were successful are not affected. If component-a failed, I would suggest just running storyblok component push component-a instead of trying to push all components again; the CLI is granular in that way.

I am having a super hard time understanding and manually fixing the Circular dependency detected ... error :/ I can give you a full JSON dump to take a closer look.

No worries, pass me the full JSON dump. Maybe we should also talk with support to see if they can make a copy of your space, so we can debug there without breaking anything.

Therefore I cannot calculate how long it takes to push all data with the new CLI to compare speeds.

Similar answer as before: don't try to compare speeds, because the old CLI didn't cover half of the dependencies the new one does. I would suggest we improve the current numbers and focus on functionality first.

Deleting resources and broken relations

Later, delete component-a when it's no longer needed.

If you do this operation via the CMS UI, I'm afraid there is no mechanism to automatically delete all broken references to component-a, as you mention here: "and the only way to clean it up is to manually edit and re-save component-c in the CMS GUI."

When we implement a command for deleting resources via the CLI, we can certainly take this into consideration.

This kind of leftover config can easily accumulate and break static validations or newer CLI behavior, so a cleanup mechanism or automated validation would be highly beneficial.

We can create a support ticket for a cleanup mechanism or automated validation when operations are done in the CMS, but just to clarify, that isn't something we can control from the CLI.


edvinasjurele commented Jun 2, 2025

Hey there! 🙏

I'm afraid I have to disagree with the calculation being labeled as incorrect. In theory, we only need to push each block (≈221), folder (≈17), and tag (if any?) ONCE each, and that should be it.

In the JSON structure, we deal with these components directly. Whitelisted components or groups are essentially references or duplicates, so pushing the same schema block multiple times due to whitelist inclusion feels redundant ❗. Ideally, the script should maintain some internal context or state to avoid re-pushing already-handled components within the same run.

Based on that, I still estimate the number of API calls to be no more than ~250. At an average of ~1.5s per request, the process should complete in roughly 6 minutes (250 × 1.5s ≈ 375s). Of course, actual performance depends on network conditions, but this roughly aligns with the speed of the older CLI.

Note: my calculation assumes sequential execution. Now imagine the performance boost if the pushes ran through a cascading queue 🚀. However, for that to be reliable, the system should first prepare the request queue by running validation as a separate, upfront step. This way, any JSON issues could be surfaced to the user early as errors or warnings, before executing the push.
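The queue idea above can be sketched as a small concurrency-limited worker pool draining an already-validated list. This is only an illustration of the suggestion; the concurrency value is made up and a real one would have to respect the Management API's rate limits:

```typescript
// Sketch: run validated pushes through a small worker pool instead of
// strictly sequentially. The concurrency limit of 5 is illustrative only;
// a real value would have to respect API rate limits.

async function pushWithPool<T>(
  items: T[],
  worker: (item: T) => Promise<void>,
  concurrency = 5,
): Promise<void> {
  let next = 0; // shared cursor; safe because JS is single-threaded
  async function runWorker(): Promise<void> {
    while (next < items.length) {
      const item = items[next++];
      await worker(item);
    }
  }
  // Start `concurrency` workers that each pull the next pending item.
  await Promise.all(Array.from({ length: concurrency }, runWorker));
}
```

With ~250 calls at ~1.5s each and 5 workers, the same push would take well under two minutes in the sequential-cost model above.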

[screenshot]

Alternatively, there could be an endpoint that accepts an array of components. This would be even better! 🙌 I'm sure pushing a whole space could then happen in under 10 seconds.


Regarding validation: I’m really glad the new CLI introduces this improvement. However, it would be more efficient if the validation step could run as a separate, static check, ideally before any pushing starts. Currently, the CLI seems to iterate through the full list, only to fail partway through due to a misconfiguration. After that, I’m forced to fix the issue and re-run the entire operation. This makes for bad DX.

While it's technically possible to rerun only a subset of components, manually determining that subset introduces the potential for human error. Running the full push again is simply more reliable. If the failure affects only a single component, fine, but with more than one this becomes a real headache.

@alvarosabu alvarosabu added the pending-triage [Issue] Ticket is pending to be prioritised label Jun 2, 2025
@alvarosabu

Hi @edvinasjurele, thanks for your reply.

I'm afraid I have to disagree with the calculation being labeled as incorrect.

Apologies if my previous message gave the impression that the calculation itself was incorrect or if I phrased it poorly.

What I intended to highlight is that comparing the performance of the new implementation with the current CLI might not be productive, as the v3 version didn't include several key features, such as handling component whitelist dependencies or upserting resources (e.g., folders, tags, whitelist elements) if they already existed in the target space. That was an issue raised by users in the past.

In that context, benchmarking two commands that differ significantly in functionality may not provide a fair comparison. I believe benchmarks are most useful when the conditions and feature parity are similar.

Alternatively, there could be an endpoint that accepts an array of components. This would be even better! 🙌 I'm sure pushing a whole space could then happen in under 10 seconds.

That would be ideal, yes. Also, consider that currently, to perform the upsert, there is no PATCH method available on the API, so we need to do a GET call to the target space first, and then PUT or POST depending on whether the resource exists. I already raised this internally.
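The GET-then-PUT/POST pattern described here is why each upsert costs two round-trips. A rough sketch of that pattern; the endpoint paths, payload shapes, and the `HttpClient` interface are illustrative assumptions, not the actual Management API contract:

```typescript
// Sketch of the two-call upsert the comment describes: no PATCH on the
// API, so existence is checked with a GET, then PUT or POST is chosen.
// Paths and payload shapes are illustrative assumptions only.

interface HttpClient {
  get(path: string): Promise<{ status: number; body?: unknown }>;
  post(path: string, body: unknown): Promise<void>;
  put(path: string, body: unknown): Promise<void>;
}

async function upsertComponent(
  http: HttpClient,
  spaceId: string,
  component: { name: string; [key: string]: unknown },
): Promise<"created" | "updated"> {
  // Request 1: does the component already exist in the target space?
  const existing = await http.get(
    `/spaces/${spaceId}/components/${component.name}`,
  );
  if (existing.status === 404) {
    // Request 2a: it doesn't, so create it.
    await http.post(`/spaces/${spaceId}/components`, component);
    return "created";
  }
  // Request 2b: it does, so update it in place.
  await http.put(`/spaces/${spaceId}/components/${component.name}`, component);
  return "updated";
}
```

Two sequential requests at ~800ms each lines up with the ~1600ms per component observed in the thread.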

With these two limitations in mind, I believe there is a lot to improve in the way the algorithm looks up dependencies, so we will work on that to ensure the CLI command is as performant as it can be.

I am having a super hard time understanding and manually fixing the Circular dependency detected ... error :/ I can give you a full JSON dump to take a closer look.

Regarding this, I created a separate ticket here: #215. If you could add the JSON of the specific component structure with the whitelist so we can try to reproduce it, that would be really helpful 🙏🏻


edvinasjurele commented Jun 3, 2025

... there is no PATCH method available on the API, so we need to do a GET call first to the target space ...

Oh, that explains the ~1600ms speed: it's two requests per component 🙂. I'm wondering if it might be possible to fetch all space data upfront and use that for subsequent lookups to determine whether a resource exists. Right now, the back-and-forth to Storyblok for every check adds latency per request. That strategy also leaves no room for a static check; with an upfront fetch, a dedicated storyblok components validate command would even become possible.

Also, considering the local JSON is the single source of truth, it's debatable whether we need to check against the space at all, as long as the JSON is internally consistent (e.g., whitelists only reference blocks that exist in the same file). Maybe I'm missing some edge case, but I feel fairly confident that offline validation, or at worst a single pre-fetch of all space data, could streamline the process significantly.
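The single pre-fetch suggested above could replace the per-component GET entirely: fetch the target space's component list once, then answer every existence check from memory. A minimal sketch, where the remote component shape is an assumption for illustration:

```typescript
// Sketch: one upfront fetch of the target space's components, then every
// "does it exist?" lookup is answered from an in-memory map instead of a
// network round-trip. The remote shape (id + name) is an assumption.

interface RemoteComponent {
  id: number;
  name: string;
}

function buildExistenceIndex(remote: RemoteComponent[]): Map<string, number> {
  // name -> id, so a later PUT knows which resource to target
  return new Map(remote.map((c) => [c.name, c.id]));
}

// Pretend this array came from a single list call to the target space.
const remote: RemoteComponent[] = [
  { id: 1, name: "component-b" },
  { id: 2, name: "component-c" },
];
const index = buildExistenceIndex(remote);

console.log(index.has("component-b")); // true  -> update (PUT) using id 1
console.log(index.has("component-a")); // false -> create (POST)
```

With an index like this, the per-component cost drops from two requests to one, and the same map doubles as the data source for a fully offline validate step.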
