Welcome to the Automated Actions project! This system provides a toolset for predefined actions that can be self-serviced by tenants or automatically triggered by events.
- Table of Contents
- Problem Statement
- Goals
- Key Features
- Architecture Overview
- Key Concepts
- Packages Overview
- Tech Stack
- Action Overview
- Development Setup
- Usage Examples
- Deployment
- Contributing
- License
AppSRE tenants regularly require manual intervention from AppSRE for various operational tasks. This project aims to establish a toolset allowing a set of predefined actions to be either self-serviced by tenants or automatically triggered by events (e.g., alerts). This reduces manual workload and improves response times.
See
- Extensibility: Easily add new automated actions.
- Discoverability: Actions should be stored and discoverable by users.
- Security: Implement robust authentication and authorization.
- Flexible Triggers: Support manual (user-initiated) and automated (event-driven, e.g., AlertManager, cron) triggers.
- Output Accessibility: Action outputs (logs, dumps) must be accessible to the requester.
- Throttling: Provide mechanisms to prevent system abuse or overload.
- Flexible Action Initiation: Supports both manual self-service by users and automated triggering by system events (e.g., alerts).
- Centralized API Control: A dedicated API server for managing, tracking, and orchestrating all automated actions.
- Scalable Asynchronous Processing: Employs a robust task queue system for efficient and reliable execution of actions in the background.
- Robust State Management & Throttling: Persistently stores action status and enforces operational limits to ensure system stability.
- Enterprise-Grade Security: Implements strong authentication (e.g., OIDC via Red Hat SSO) and fine-grained, configuration-driven authorization.
- Declarative Action & Policy Definition: System behavior, including available actions, permissions, and operational parameters, is defined via externalized configuration (app-interface).
- Dedicated User Interface: Offers a command-line tool (CLI) for tenants to easily trigger and monitor actions.
The system is centered around a FastAPI server that orchestrates actions triggered by users or automated systems.
flowchart TD
CLI["CLI, AlertManager, Qontract-Reconcile"] --> FastAPIServer["automated-actions API Server"]
FastAPIServer -- "authentication" <--> RHSSO["Red Hat SSO (OIDC)"]
CLI -- "authentication" <--> RHSSO["Red Hat SSO (OIDC)"]
FastAPIServer -- "authorization" <--> OPA["Open Policy Agent (OPA)"] <--> AppInterfaceAuthZ["app-interface <br/> (Action Permissions)"]
FastAPIServer -- "Stores Action Details" --> DynamoDB["DynamoDB (Action Store)"]
FastAPIServer -- "Enqueues Task" --> TaskQueue["Task Queue (Celery/SQS)"]
TaskQueue -- "Worker Consumes Task" --> CeleryWorkers["Celery Workers <br/> (Action Logic)"]
CeleryWorkers -- "Reads Config" --> AppInterface["app-interface <br/> Application Data"]
CeleryWorkers -- "Fetches Secrets" --> Vault["Vault"]
CeleryWorkers -- "execute actions" --> TargetSystems["Target Systems (e.g., AWS, OpenShift)"]
CeleryWorkers -- "Updates Status" --> DynamoDB
- FastAPI Server (automated-actions):
  - The central component.
  - Receives requests from users via the CLI or from AlertManager (webhooks).
  - Handles authentication via Red Hat SSO (OIDC).
  - Performs authorization based on configurations in app-interface.
  - Validates requests and checks throttling limits (using DynamoDB).
  - Assigns a unique ID, stores the action details in DynamoDB, and enqueues the task in Celery via AWS SQS.
  - Provides endpoints to query action status.
- CLI (Command Line Interface):
  - The primary tool for users/tenants to interact with the FastAPI server to trigger actions.
- Task Queue (Celery & SQS):
  - Manages asynchronous execution of actions. FastAPI sends tasks to SQS, and Celery workers pick them up.
- DynamoDB:
  - Used by FastAPI and Celery to store action requests, their status, and timestamps for throttling.
- Celery Workers:
  - Execute the logic for each predefined action.
  - Fetch necessary configuration from app-interface and secrets/credentials from Vault.
  - Interact with target systems to perform actions (e.g., AWS RDS reboot, OpenShift Pod restart).
  - Update action status in DynamoDB.
- Configuration (app-interface):
  - Defines action permissions (authorization rules) and throttling parameters.
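The request lifecycle described above (assign an ID, store the record, enqueue the task, let a worker run it and update the status) can be illustrated with a minimal sketch. It is not the project's implementation: a plain dict stands in for the DynamoDB action store, a deque stands in for the Celery/SQS queue, and the names ActionRecord, submit_action, run_next_task, and get_status are hypothetical.

```python
import uuid
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(str, Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"


@dataclass
class ActionRecord:
    action_id: str
    name: str
    owner: str
    params: dict
    status: Status = Status.PENDING
    result: Optional[str] = None


# In-memory stand-ins (assumptions): a dict instead of the DynamoDB action
# store and a deque instead of the SQS-backed Celery queue.
action_store = {}
task_queue = deque()


def submit_action(name, owner, params):
    """API side: assign a unique ID, persist the record, enqueue the task."""
    action_id = str(uuid.uuid4())
    action_store[action_id] = ActionRecord(action_id, name, owner, params)
    task_queue.append(action_id)
    return action_id


def run_next_task(handlers):
    """Worker side: consume one task, run its handler, update the status."""
    action_id = task_queue.popleft()
    record = action_store[action_id]
    record.status = Status.RUNNING
    try:
        record.result = handlers[record.name](**record.params)
        record.status = Status.SUCCESS
    except Exception as exc:  # an unknown action or handler error fails the task
        record.result = str(exc)
        record.status = Status.FAILURE


def get_status(action_id):
    """API side: the lookup a status endpoint would perform."""
    return action_store[action_id].status
```

Keeping the status in a shared store is what lets the API answer status queries without talking to the workers directly.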
- Automated Actions: Triggered by end-users (typically AppSRE tenants) via the CLI.
  - Example: Restarting a specific deployment.
- Automatic Remediations (not implemented yet): Triggered by alerting systems (e.g., AlertManager webhooks).
  - Example: Automatically restarting a stuck integration based on an alert.
This project is structured into several key packages, each with a distinct role:
The heart of the system! This package contains the FastAPI server application and the Celery task queue. It exposes the API endpoints, handles incoming requests, orchestrates task queuing, and manages the state of automated actions.
Your friendly Python helper for talking to the API! This is an auto-generated Python HTTP client, created directly from the project's OpenAPI (Swagger) documentation. It simplifies programmatic interaction with the automated_actions server.
The command center for users! This package provides the Command Line Interface (CLI). Tenants and SREs use this tool to trigger actions, check their status, and interact with the automated actions system from their terminals.
Guardian of the gates! This directory houses the Open Policy Agent (OPA) Rego files. These files define the authorization and throttling policies and rules that determine who can perform which actions under what conditions, ensuring secure and controlled operations.
Putting it all together! This package contains the integration tests. These tests verify that the different components of the system (API, workers, database, etc.) work correctly in concert, ensuring end-to-end functionality.
The shared toolbox! This package provides common utility functions and API implementations used across other packages. This includes convenient wrappers for interacting with services like HashiCorp Vault (for secrets) and AWS APIs (e.g., for SQS, DynamoDB), promoting code reuse and consistency.
- Backend API: Python, FastAPI
- Task Queue: Celery
- Message Broker: AWS SQS
- Database: AWS DynamoDB
- Authentication/Authorization: Red Hat SSO (OIDC), Open Policy Agent (OPA)
- Linting/Formatting: Ruff
- Package Management: uv
Action permissions and throttling limits are defined in the app-interface. This declarative approach allows for centralized management and easy auditing of system behavior. The qontract-reconcile automated-actions-config integration transforms these configurations into the OPA policy files.
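To make the idea of declarative permissions and throttling limits concrete, here is a minimal Python sketch of the kind of decision such a policy encodes. It is only an illustration: in the real system the rules are evaluated by OPA from generated Rego files, and the POLICY layout, the max_calls and window_seconds keys, and the is_allowed function are all assumptions made for this example.

```python
import time

# Hypothetical policy data. The real schema lives in app-interface and is
# compiled into OPA policy files, which this sketch does not reproduce.
POLICY = {
    "alice": {
        "openshift-workload-restart": {"max_calls": 2, "window_seconds": 60},
    },
}


def is_allowed(user, action, history, now=None, policy=POLICY):
    """Return (allowed, reason) for `user` requesting `action`.

    `history` holds timestamps (in seconds) of the user's earlier requests
    for this action, for example as read from the action store.
    """
    now = time.time() if now is None else now
    rule = policy.get(user, {}).get(action)
    if rule is None:
        # No rule for this user/action pair means no permission.
        return False, "not authorized"
    # Count only requests that fall inside the throttling window.
    recent = [t for t in history if now - t < rule["window_seconds"]]
    if len(recent) >= rule["max_calls"]:
        return False, "throttled"
    return True, "ok"
```

With this data, a third request by alice inside the same 60-second window is rejected as throttled, while a user without any rule is rejected as not authorized.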
This project provides a set of predefined actions that can be triggered by users or automatically by the system. Each action is designed to perform specific tasks on target systems, such as restarting workloads in OpenShift or rebooting AWS RDS instances.
Refer to actions.md for a detailed list of available actions, their parameters, and usage examples.
- Getting application configuration and logs.
- Obtaining heap dumps.
- Automatically restarting stuck integrations.
- Recycling stuck Jenkins workers.
- Python (see .python-version or pyproject.toml for specific version)
- uv (for Python environment and package management)
- make
- AWS CLI configured (for local DynamoDB/SQS interaction if not using mocks, and for deployment)
- Access to relevant qontract-server and Vault instances (for action execution logic)
- Docker (for running local dependencies like LocalStack)
- Clone the repository:
  git clone https://github.com/chassing/automated-actions.git
  cd automated-actions
- Set up the development environment. This command creates or updates a virtual environment using uv and installs dependencies:
  make dev-env
- Activate the virtual environment:
  source .venv/bin/activate
Configure all required settings in a local settings.conf file in the project root directory (ignored by git).
Please find and use predefined setting files in Vault:
- AppSRE: Use app-sre-only-settings
- Other Red Hat employees: TODO: Create a similar settings file in Vault for other Red Hat employees.
Refer to the settings documentation for details on all available automated-actions settings.
Use the provided docker-compose.yml file to run the application and its dependencies (like LocalStack for AWS services) locally.
docker-compose up automated-actions # Or 'docker-compose up -d' for detached mode
This will typically start the FastAPI server, Celery workers, OPA, and mock AWS services. Check the docker-compose.yml for specific service names and configurations.
Run the main test suite from the root directory:
make test
Each package may also have its own test suite. To run tests for a specific package:
cd packages/automated_actions
make test
This project uses Ruff for linting and formatting. To check for issues:
make format
automated-actions-client and automated-actions-cli are released to PyPI. To release a new version:
- Recreate the automated-actions-client package after changes to the OpenAPI spec:
  make generate-client
- Update the version numbers in automated_actions_client/pyproject.toml and automated_actions_cli/pyproject.toml.
- Create and merge a PR.
Feel free to explore our development guides to enhance your understanding and contribution to the Automated Actions project!
Action: Restart an OpenShift Deployment.
(Assuming automated-actions-cli is installed or you entered your local development virtual environment)
automated-actions openshift-workload-restart --cluster <CLUSTER_NAME> --namespace <NAMESPACE_NAME> --kind Pod --name <POD_NAME>
Refer to the automated_actions_cli package README.md for detailed usage instructions.
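For scripted use, the CLI call shown above can be wrapped in a small Python helper. This is a sketch under assumptions: the function names are hypothetical, the flags are copied from the example command, and the helper defaults to a dry run that only prints the command, so it does not depend on the CLI being installed.

```python
import shlex
import subprocess


def build_restart_command(cluster, namespace, kind, name):
    """Build the argv for the openshift-workload-restart call shown above."""
    return [
        "automated-actions",
        "openshift-workload-restart",
        "--cluster", cluster,
        "--namespace", namespace,
        "--kind", kind,
        "--name", name,
    ]


def restart_workload(cluster, namespace, kind, name, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    argv = build_restart_command(cluster, namespace, kind, name)
    print(shlex.join(argv))
    if dry_run:
        return None
    # check=True raises CalledProcessError on a non-zero exit code.
    return subprocess.run(argv, check=True, capture_output=True, text=True)
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual cluster or workload names.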
Deployment is managed using OpenShift templates. Refer to the openshift directory in the root of this repository for specific templates and deployment instructions.
Contributions are welcome! Please follow standard practices:
- Fork the repository.
- Create a feature branch (git checkout -b feature/your-amazing-feature).
- Make your changes.
- Ensure tests pass (make test).
- Ensure code is linted and formatted (make format).
- Commit your changes (git commit -m 'feat: Add some amazing feature').
- Push to the branch (git push origin feature/your-amazing-feature).
- Open a Pull Request.
This project is licensed under the terms of the Apache 2.0 license.