Data replication model, and fast takedown of nasty servers #50
Replies: 1 comment
Thanks for the question! It's a good topic.
This is correct.
Our thinking (perhaps naive / insufficient - very open to feedback and poking holes) is that because we are not hosting source code, the responsibility for pulling/deleting malicious code actually falls to npm/pypi/GHCR/etc. To that end, we do need some feature that removes That said... I can think of a few vectors we might need to think more on:
So we probably do need some way to manage this centrally, even if npm/pypi/etc help with part of the issue. Because we have no long-term operational resourcing in place for this project, it would be good to find a way to do this that is automated or community-driven. For example, perhaps there is a way for community members to report these problems, or some vendor the project could lean on to help. Open to suggestions on how to power this mechanism.
No design has been suggested for this yet, but I think it'd be a reasonable addition to the roadmap.
Your Question
Hey folks! I am on the NuGet.org and VS Marketplace team. I'm really excited to see this project and read about your new registry. I have a bunch of thoughts swirling in my mind, but I'll start with just one for now.
It looks like your registry is going to be mainly a "data upstream", and end users will interact directly with consumers of your registry rather than with the registry itself (except perhaps server authors who publish to you). Apologies if I misunderstand that.
In the event of an unsafe/malicious/illegal MCP entry in your registry, I presume you will delete it on the backend, and subsequent calls to GET /v0/servers will then not show the deleted server, signaling to the "middle layer" that a delete occurred. It seems like the "middle layer" needs to poll pretty frequently to keep the "time-to-mitigate" as small as possible.
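To make the polling model concrete, here is a rough sketch of how a downstream might detect deletions by diffing two consecutive snapshots of the server list. The response shape (a list of entries with an "id" field) and the function names are assumptions for illustration only, not the registry's actual API.

```python
from typing import Iterable, Set


def detect_deletions(previous_ids: Iterable[str], current_ids: Iterable[str]) -> Set[str]:
    """Return IDs present in the previous snapshot but missing from the current one."""
    return set(previous_ids) - set(current_ids)


def apply_snapshot(local_cache: dict, snapshot: list) -> Set[str]:
    """Sync a local cache to the latest snapshot and return the IDs that were removed.

    `snapshot` is assumed to be a list of dicts with an "id" key; the real
    response shape of GET /v0/servers may differ.
    """
    current = {entry["id"]: entry for entry in snapshot}
    # The set of removed IDs is materialized before the cache is mutated.
    removed = detect_deletions(local_cache.keys(), current.keys())
    for server_id in removed:
        del local_cache[server_id]
    local_cache.update(current)
    return removed
```

With this approach a deleted entry can stay visible downstream for up to one polling interval (plus the time to fetch the full list), which is why the interval dominates the time-to-mitigate.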
It seems like being a "data upstream" puts you in a different position than other public registries with respect to your downstreams. It's like you and your downstreams are collectively responsible for keeping the ecosystem clean, instead of that responsibility resting solely on you, the central registry.
Am I understanding this right? Or do you envision takedowns happening in some other way?
The reason I ask is that the trustworthiness of a public registry is one of the primary "value adds" it can provide to a new or even established ecosystem. I hope that this data replication model does not cause problems when server deletions (rather than adds/updates) need to happen fast but perhaps don't, due to something outside of your control (a slow downstream).
As an aside, I wonder if you plan on providing any transparency on the deletions that have occurred, such as an event log that can be followed without polling the entire server list each time.
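As a rough illustration of what I mean by a followable event log, a consumer could keep a cursor and ask only for deletion events after it. Everything here is hypothetical (the `DeletionEvent` fields, the sequence numbering, the `events_since` helper); it is a sketch of the idea, not a proposal for the actual schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DeletionEvent:
    sequence: int    # assumed: monotonically increasing position in the log
    server_id: str   # assumed: identifier of the removed entry


def events_since(log: List[DeletionEvent], cursor: int) -> Tuple[List[DeletionEvent], int]:
    """Return events with sequence > cursor, plus the new cursor to persist."""
    new_events = [e for e in log if e.sequence > cursor]
    # If nothing is new, the cursor stays where it was.
    new_cursor = max((e.sequence for e in new_events), default=cursor)
    return new_events, new_cursor
```

A downstream would store the returned cursor and pass it back on the next call, so each poll transfers only the deletions it has not yet seen rather than the whole server list.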