CLARES - Compliance License & Asset Reminder Engine System Overview
Ever missed a certificate renewal and had a production outage? Or forgot to renew a software license until it was too late? I built CLARES to make sure that never happens again.
CLARES stands for Compliance License & Asset Reminder Engine System. It's a full-stack web application that tracks expiry dates of SSL certificates, software licenses, compliance certificates, and any custom asset type your organization manages — and sends email reminders before things expire.
I built it because every team I've worked with has the same problem: critical renewals tracked in spreadsheets, emails, or someone's memory. CLARES replaces all of that with a single, centralized dashboard.
In any enterprise environment, you're juggling dozens (or hundreds) of:
The cost of missing even one renewal can be significant — from service outages to compliance violations. CLARES gives you a single pane of glass with urgency-based grouping and automated email reminders.
Step 1: Login — Users authenticate with username/password. The server validates credentials, checks the account is active, and issues a JWT token (8-hour expiry). Deactivated accounts get a clear error message — no cryptic "session expired" nonsense.
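The token-issuing step can be sketched with nothing but the standard library. This is illustrative Python (CLARES itself is Node.js), hand-rolling HS256 to show the mechanics; `SECRET` is a hypothetical value that a real deployment would load from configuration.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical secret for illustration only; never hard-code this in a real app.
SECRET = b"demo-secret"

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def issue_token(username: str, hours: int = 8) -> str:
    """Issue an HS256 JWT that expires `hours` from now."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(
        {"sub": username, "exp": int(time.time()) + hours * 3600}
    ).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"

token = issue_token("admin")
```

In practice you would reach for a maintained JWT library rather than rolling your own, but the three-segment header.payload.signature structure is the same.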
Step 2: Dashboard — The home page auto-fetches all renewal items and groups them into four urgency buckets: Expired, Critical (≤14 days), Warning (≤30 days), and Upcoming (≤90 days). Summary cards show counts per catalog type.
Step 3: Manage Catalogs — The sidebar lists all catalog types. Click any catalog to view, add, edit, or delete entries. Upload up to 500 rows at once via CSV bulk import. Add custom catalog types beyond the built-in ones.
Step 4: Granular Permissions — Global admins see everything. Other users get per-catalog roles: No Access, View (read-only), or Admin (full CRUD). A user can be a viewer globally but an admin for specific catalogs. Permissions are enforced on both frontend and backend.
Step 5: Email Reminders — Configure SMTP settings from the admin page, test the connection, send a test email, then trigger reminders. Each item has its own reminder config — how many days before expiry and how many times to repeat. The system calculates exact send dates by evenly spacing repeats within the window (e.g. 30 days / 3 repeats → reminders at 30, 20, and 10 days before expiry).
Step 6: Automatic Scheduler — Enable the daily auto-reminder from Admin Settings and pick an hour (server time). A background scheduler checks every 60 seconds and fires once per day. A reminder_logs table tracks which reminder number has been sent per item — no duplicates, and missed reminders are caught up automatically.
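The once-per-day decision inside that 60-second tick can be sketched as follows (illustrative Python, not the actual Node.js implementation; the `>=` comparison is what lets a missed hour be caught up on the next tick):

```python
from datetime import datetime, date
from typing import Optional

def should_fire(now: datetime, scheduled_hour: int,
                last_run: Optional[date]) -> bool:
    """One scheduler tick: fire when the configured hour has been reached
    and the job has not already run today. Using `>=` (not `==`) means a
    tick that arrives late still triggers the day's run."""
    return now.hour >= scheduled_hour and last_run != now.date()
```

A background loop would call this every 60 seconds and record `now.date()` as `last_run` after a successful send.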
| Layer | Technology |
|---|---|
| Frontend | React 18, Vite 5, React Router v6 |
| Backend | Node.js 20, Express 4 |
| Database | PostgreSQL |
| Auth | JWT + bcrypt |
| Email | Nodemailer (configurable SMTP) |
| Deployment | Docker (multi-arch), Helm, Kubernetes |
The frontend is a React SPA built by Vite into static files. Express serves both the static files and the REST API on the same port. Authentication is stateless via JWT — no server-side sessions. The whole thing is packaged into a single Docker image using a multi-stage build.
Items are automatically grouped by urgency. No more scanning through spreadsheets — you instantly see what needs attention. Summary cards give you counts per catalog type at a glance.
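The grouping logic amounts to a days-until-expiry lookup. A minimal sketch (illustrative Python; CLARES is Node.js) using the thresholds described above:

```python
from datetime import date

def urgency_bucket(expiry: date, today: date) -> str:
    """Map an expiry date to one of the dashboard's urgency buckets."""
    days_left = (expiry - today).days
    if days_left < 0:
        return "Expired"
    if days_left <= 14:
        return "Critical"
    if days_left <= 30:
        return "Warning"
    if days_left <= 90:
        return "Upcoming"
    return "Later"  # beyond the 90-day window; label assumed for this sketch
```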
Three built-in catalog types (Certificates, Licenses, SSL Certs) plus unlimited custom types. Each catalog tracks items with name, environment, expiry date, owner, notes, and per-item reminder settings.
Download a CSV template, fill it in, and upload up to 500 rows at once. Perfect for initial data migration or when you inherit a spreadsheet full of renewal dates.
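Server-side, the bulk import boils down to parsing the upload and enforcing the row cap. A sketch under assumed column names (illustrative Python; the real endpoint is Node.js and its exact schema may differ):

```python
import csv
import io

MAX_ROWS = 500  # bulk-import limit described above

def parse_bulk_csv(text: str):
    """Parse an uploaded CSV into row dicts, rejecting oversized files."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if len(rows) > MAX_ROWS:
        raise ValueError(f"{len(rows)} rows exceeds the {MAX_ROWS}-row limit")
    return rows

# Hypothetical template columns for illustration:
sample = "name,environment,expiry_date\napi-cert,prod,2025-06-10\n"
```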
This was one of the trickier features. The permission model has two layers:
A user can be a global Viewer but have Admin rights on specific catalogs. This means you can delegate management of SSL certificates to the infra team without giving them access to license data.
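The two-layer resolution order can be sketched in a few lines (illustrative Python with hypothetical field names, not CLARES's actual schema): global admin wins, then a per-catalog override, then the user's global role.

```python
def effective_role(user: dict, catalog_id: str) -> str:
    """Resolve a user's role for one catalog.
    Precedence: global admin > per-catalog override > global role."""
    if user.get("is_global_admin"):
        return "admin"
    overrides = user.get("catalog_roles", {})
    return overrides.get(catalog_id, user.get("global_role", "none"))

# A global viewer who administers only the SSL certificates catalog:
infra_user = {"global_role": "view", "catalog_roles": {"ssl-certs": "admin"}}
```

The same function would run on the backend for enforcement; the frontend uses the result only to hide controls the user cannot use.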
Configure any SMTP server (Exchange, Gmail, etc.) from the admin UI. Test the connection, send a test email, then trigger reminders. Each item can have its own reminder settings — how many days before expiry, and how many times to repeat.
The system calculates exact reminder dates by evenly spacing the repeat count within the reminder window. For example:
SSL cert expires June 10 · Remind 30 days before · Repeat 3 times
Emails include the reminder number (e.g. "reminder 2 of 3") and a color-coded status — red for ≤7 days, amber for ≤14, green for 14+.
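The date-spacing rule above is easy to make concrete (illustrative Python; CLARES is Node.js). The step size is the window divided by the repeat count, so 30 days with 3 repeats yields reminders 30, 20, and 10 days before expiry:

```python
from datetime import date, timedelta

def reminder_dates(expiry: date, window_days: int, repeats: int):
    """Evenly space `repeats` reminder dates within the reminder window,
    ending one step before expiry (30/3 -> 30, 20, 10 days out)."""
    step = window_days // repeats
    return [expiry - timedelta(days=window_days - i * step)
            for i in range(repeats)]
```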
No more relying on someone clicking "Send Reminders" manually. Enable the automatic reminder scheduler from Admin Settings, pick an hour (0–23, server time), and CLARES handles the rest:
A reminder_logs table tracks which reminder number has been sent per item, so there are no duplicate emails.

The app is containerized with a multi-stage Dockerfile and supports multi-architecture builds (amd64/arm64). For Kubernetes deployment, there's a complete Helm chart with:
Run the one-time database setup inside the cluster:

```shell
kubectl exec deployment/clares -n clares -- node server/setup.js
```

The setup script is idempotent — it creates tables only if they don't exist and seeds a default admin user when the users table is empty. Safe to run multiple times.
```shell
# Clone the repo
git clone https://github.com/DevOpsArts/clares.git
cd clares

# Option 1: Local development
npm install
cp .env.example .env     # Edit with your DB credentials
npm run setup-db
npm run dev              # Frontend → :5174, API → :3002

# Option 2: Kubernetes with Helm
helm install clares-postgres bitnami/postgresql \
  --set auth.database=clares --set auth.username=clares \
  --namespace clares --create-namespace
helm install clares ./helm/clares-engine \
  -f ./helm/clares-engine/values-minikube.yaml \
  --namespace clares
kubectl exec deployment/clares -n clares -- node server/setup.js

# Login with admin / admin
```
CLARES is open source on GitHub: github.com/DevOpsArts/clares
Check out the project page: devopsarts.github.io/clares
If you're managing renewal dates in spreadsheets, give CLARES a try. It takes under 5 minutes to deploy and the default admin account is ready out of the box.
Built with React 18, Node.js, Express, PostgreSQL, Docker, and Helm. Deployed on Kubernetes.
The only Kubernetes log agent with intelligent error context capture, rule-based alerting, and 9 pluggable storage backends.
Every SRE knows the pain: an alert fires at 3 AM, and you're digging through gigabytes of logs trying to understand what happened before the error. Traditional log solutions either capture everything (expensive) or miss crucial context (frustrating).
What if your log agent was smart enough to capture only what matters—the error AND the context around it—and alert you instantly?
Logsenta is an open-source Kubernetes log monitoring agent that solves this problem with intelligent error-aware context capture and rule-based alerting. Instead of blindly forwarding all logs, Logsenta:
Logsenta uses regex and string-based pattern matching to detect errors across multiple languages and frameworks:
```yaml
errorPatterns:
  - "ERROR"
  - "Exception"
  - "FATAL"
  - "panic:"
  - "Traceback"
  - "OOMKilled"
  - "CrashLoopBackOff"
```

These patterns are fully customizable.
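The matching step itself is simple to sketch (illustrative Python; Logsenta's own implementation is not shown here). Plain strings are escaped so they match literally, compiled into one alternation for a single pass per log line; the real agent also accepts regex patterns.

```python
import re

# Default patterns from the config above, matched as literal substrings.
PATTERNS = ["ERROR", "Exception", "FATAL", "panic:",
            "Traceback", "OOMKilled", "CrashLoopBackOff"]
error_re = re.compile("|".join(re.escape(p) for p in PATTERNS))

def is_error(line: str) -> bool:
    """True if any error pattern appears anywhere in the line."""
    return error_re.search(line) is not None
```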
Route different error patterns to different teams with customizable thresholds:
```yaml
alerting:
  enabled: true
  rules:
    # Critical errors → On-call team immediately
    - name: "critical-errors"
      patterns: ["CRITICAL", "FATAL", "OOMKilled", "panic:"]
      threshold:
        count: 1          # Alert on FIRST occurrence
        windowSeconds: 60
      email:
        enabled: true
        toAddresses: ["oncall@company.com"]

    # Java exceptions → Backend team (after 2 occurrences)
    - name: "java-exceptions"
      patterns: ["NullPointerException", "OutOfMemoryError"]
      threshold:
        count: 2          # Alert after 2 occurrences
        windowSeconds: 300
      email:
        enabled: true
        toAddresses: ["backend-team@company.com"]

    # Python errors → Data team
    - name: "python-errors"
      patterns: ["Traceback", "TypeError", "ValueError"]
      threshold:
        count: 1
      email:
        enabled: true
        toAddresses: ["data-team@company.com"]
```
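The `count`-within-`windowSeconds` semantics above can be sketched as a sliding window (illustrative Python, not Logsenta's actual implementation):

```python
from collections import deque

class ThresholdRule:
    """Sliding-window threshold: fire once `count` matches are seen
    within the last `window_seconds`."""

    def __init__(self, count: int, window_seconds: int):
        self.count = count
        self.window = window_seconds
        self.hits = deque()

    def record(self, ts: float) -> bool:
        """Record one matching log line at time `ts`; return True when
        the rule should fire an alert."""
        self.hits.append(ts)
        # Drop hits that have aged out of the window.
        while self.hits and ts - self.hits[0] > self.window:
            self.hits.popleft()
        return len(self.hits) >= self.count

rule = ThresholdRule(count=2, window_seconds=300)
```

With `count: 1` every match fires immediately; higher counts suppress one-off noise, which is exactly why the Java rule above waits for a second occurrence.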
Why rule-based alerting matters:
One agent, any storage destination:
| Backend | Use Case |
|---|---|
| PostgreSQL | Relational queries, SQL analysis |
| MongoDB | Flexible document storage |
| Elasticsearch | Full-text search, Kibana dashboards |
| Azure Log Analytics | Azure ecosystem, KQL queries |
| AWS CloudWatch | AWS ecosystem, CloudWatch Insights |
| GCP Cloud Logging | Google Cloud ecosystem |
Configure how much context to capture around errors:
```yaml
captureWindow:
  bufferDurationMinutes: 2   # Lines BEFORE error
  captureAfterMinutes: 2     # Lines AFTER error
```
This means when an error occurs, you get the full story—not just the error line.
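The mechanism behind this is a rolling buffer. A sketch of the idea (illustrative Python; sized in line counts rather than the minutes Logsenta actually uses, to keep it simple):

```python
from collections import deque

def capture_context(lines, is_error, before=2, after=1):
    """Keep a rolling buffer of recent lines; when an error appears,
    emit the buffered lines, the error line itself, and the next
    `after` lines."""
    buffer = deque(maxlen=before)  # discards oldest lines automatically
    captured = []
    trailing = 0
    for line in lines:
        if is_error(line):
            captured.extend(buffer)   # the context BEFORE the error
            buffer.clear()
            captured.append(line)     # the error line
            trailing = after          # start capturing context AFTER
        elif trailing > 0:
            captured.append(line)
            trailing -= 1
        else:
            buffer.append(line)
    return captured

logs = ["start", "connecting", "retrying", "ERROR: timeout", "closing", "idle"]
```

Everything outside the capture window is simply dropped, which is where the cost savings over capture-everything pipelines come from.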
Deploy Logsenta in under 2 minutes:
```shell
# Clone the repository
git clone https://github.com/DevOpsArts/logsenta.git
cd logsenta

# Install with Helm (PostgreSQL backend + alerting)
helm install logsenta-engine ./charts/logsenta-engine \
  --namespace logsenta \
  --create-namespace \
  --set storage.type=postgresql \
  --set connections.postgresql.host=your-db-host \
  --set connections.postgresql.username=logsenta \
  --set connections.postgresql.password=YOUR_PASSWORD \
  --set alerting.enabled=true \
  --set alerting.email.smtpHost=smtp.company.com
```
┌─────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌─────┐ ┌─────┐ ┌─────┐ │
│ │Pod A│ │Pod B│ │Pod C│ ← Monitored │
│ └──┬──┘ └──┬──┘ └──┬──┘ │
│ └────────┼────────┘ │
│ ▼ │
│ ┌────────────────────┐ │
│ │ Logsenta-Engine │ │
│ │ • Error Detection │ │
│ │ • Context Capture │ │
│  │ • Rule-Based Alert│ ──► Email          │
│  │ • Rolling Buffer  │ ──► Webhook        │
│ └─────────┬──────────┘ │
└───────────────┼─────────────────────────────┘
▼
┌────────────────┐
│ Storage Backend│
│ (Your Choice) │
└────────────────┘
Logsenta is open-source and free to use. Check out the resources below:
Have questions or feedback? Drop a comment below or open an issue on GitHub!
Tags: Kubernetes, DevOps, SRE, Logging, Monitoring, Alerting, Azure, AWS, GCP, Helm, Open Source
Go to Azure AKS, select "Diagnostic settings" in the side blade, and choose "Add diagnostic setting".
Then, in the new page, select which logs need to be sent to the Event Hub and choose "Stream to an Event Hub". Here, provide the newly created Event Hub namespace and Event Hub.
Step 10: Configure the Grafana Agent to scrape messages from Azure Event Hub
Next, we need to pull the data from Azure Event Hub and push it to Grafana Loki.
In our existing grafana-agent-values.yaml, add the lines below to pull messages from Azure Event Hub, then redeploy the Grafana Agent in AKS.
Here is the reference GitHub URL, and below is the YAML.
https://github.com/DevOpsArts/grafana_loki_agent/blob/main/grafana-agent-values-azure-aks.yaml
```
loki.source.azure_event_hubs "azure_aks" {
  fully_qualified_namespace = "==XXX Eventhub namespace hostname XX===:9093"
  event_hubs = ["aks"]
  forward_to = [loki.write.local.receiver]
  labels = {
    "job" = "azure_aks",
  }
  authentication {
    mechanism         = "connection_string"
    connection_string = "===XXX Eventhub connection String XX==="
  }
}
```
Replace the placeholder values above (the Event Hub namespace hostname and the connection string) with your own. We can add multiple Event Hubs to the Grafana Agent by providing a different job name for each Azure PaaS service.
Note: Make sure network connectivity is established between Azure AKS and Azure Event Hub on port 9093 so messages can be sent.
Redeploy the Grafana Agent in AKS using the command below (use `helm upgrade --install` so an existing release is updated in place rather than failing on a duplicate install):
helm upgrade --install grafana-agent grafana/grafana-agent --values grafana-agent-values-azure-aks.yaml -n observability
Check that all the Grafana Agent pods are up and running using the command below:
kubectl get all -n observability
Now the Grafana Agent will pull messages from Azure Event Hub and push them to Grafana Loki for any Azure AKS cluster configured to stream its logs via Diagnostic Settings.
We can verify the status of message processing from Azure Event Hub, including the status of incoming and outgoing messages.
Step 11: Access Azure AKS logs in Grafana dashboard,
Go to the Grafana dashboard: Home > Explore > select the Loki data source.
In the filter section, select "job" and set the value to the job name given in grafana-agent-values-azure-aks.yaml. In our case, the job name is "azure_aks".
That's all! We have successfully deployed centralized logging with Grafana Loki and the Grafana Agent for Kubernetes, VM applications, and Azure PaaS.
In Part 1, we covered how to set up Grafana Loki and the Grafana Agent to view Kubernetes pod logs.
Requirement:
Next, double-click the downloaded .exe and install it. By default, the Windows installation path is:
C:\Program Files\Grafana Agent
Once the installation is complete, we need to update the configuration based on our needs, such as which application logs to send to Grafana Loki.
In our case, we installed the Grafana dashboard on the Windows VM and configured the Grafana dashboard logs in the Grafana Agent.
Similarly, we can add multiple applications with different job names.
Copy the Grafana Agent config file from the repo below and update it according to your needs.
We can also start the agent manually from the command prompt, which helps surface any issues with the configuration.
In Command Prompt, go to C:\Program Files\Grafana Agent and execute the command below:
grafana-agent-windows-amd64.exe --config.file=agent-config.yaml
Note: The Grafana Loki distributed service endpoint (configured in agent-config.yaml) must be accessible from the Windows VM.
Step 7: Access VM application logs in Grafana Loki
Go to Grafana Dashboard > Home > Explore > Select Loki Datasource
In the filter section, select "job" and set the value to the job name given in agent-config.yaml. In our case, the job name is "devopsart-vm".
Now we are able to view the Grafana dashboard logs in Grafana Loki. You can create dashboards from here based on your preference.
In Part 2, we covered how to export Windows VM application logs to Grafana Loki and how to view them from the Grafana dashboard.
In Part 3, we will cover how to export Azure PaaS service logs to Grafana Loki.
Dealing with multiple tools for capturing application logs from different sources can be a hassle for anyone. In this blog post, we'll dive into the steps required to establish centralized logging with Grafana Loki and Grafana Agent. This solution will allow us to unify the collection of logs from Kubernetes pods, VM services, and Azure PAAS services.
Grafana Loki: a highly scalable log aggregation system designed for cloud-native environments.
Grafana Agent: an observability agent that collects metrics and logs from various applications for visualization and analysis in Grafana.
Requirement:
```yaml
schemaConfig:
  configs:
    - from: "2020-09-07"
      store: boltdb-shipper
      object_store: azure
      schema: v11
      index:
        prefix: index_
        period: 24h
storageConfig:
  boltdb_shipper:
    shared_store: azure
    active_index_directory: /var/loki/index
    cache_location: /var/loki/cache
    cache_ttl: 1h
  filesystem:
    directory: /var/loki/chunks
  azure:
    account_name: === Azure Storage name ===
    account_key: === Azure Storage access key ===
    container_name: === Container Name ===
    request_timeout: 0
```
In this blog, we will explore a new tool called "Rover", which helps visualize the Terraform plan.
Rover : This open-source tool is designed to visualize Terraform Plan output, offering insights into infrastructure and its dependencies.
We will use the Rover Docker image to do our setup and visualize the infrastructure.
Requirements:
1. Linux/Windows VM
2. Docker
Step 1: Generate the Terraform plan output
I have a sample Azure Terraform block in the devopsart folder; we will generate the Terraform plan output from there and store it locally.
```shell
cd devopsart
terraform plan -out tfplan.out
terraform show -json tfplan.out > tfplan.json
```
Now both the files are generated.
Step 2: Run the Rover tool locally
Execute the Docker command below from the same path as Step 1:
```shell
docker run --rm -it -p 9000:9000 \
  -v $(pwd)/tfplan.json:/src/tfplan.json \
  im2nguyen/rover:latest -planJSONPath=tfplan.json
```
It runs the web UI on port 9000.
Step 3: Access the Rover web UI
Let's access the web UI and check it: open a browser and go to http://localhost:9000
In the UI, the color codes on the left side help you understand the actions that will take place for each resource when running terraform apply.
When a specific resource is selected from the image, it will provide the name and parameter information.
Additionally, the image can be saved locally by clicking the "Save" option.
I hope this is helpful for someone who is genuinely confused by the Terraform plan output, especially when dealing with a large infrastructure.
Thanks for reading!! We have tried the Rover tool and experimented with examples.
Reference:
https://github.com/im2nguyen/rover
Infracost: It provides cloud cost estimates from Terraform. It enables engineers to view a detailed cost breakdown and understand expenses before implementation.
Requirements:
1. One Windows/Linux VM
2. Terraform
3. Terraform examples
Step 1: Infracost installation
For Mac, use the brew command below to do the installation:
# brew install infracost
For other Operating systems, follow below link,
https://www.infracost.io/docs/#quick-start
Step 2: Infracost configuration
We need to set up the Infracost API key by signing up here,
https://dashboard.infracost.io
Once logged in, visit the following URL to obtain the API key,
https://dashboard.infracost.io/org/praboosingh/settings/general
Next, open the terminal and set the key as an environment variable using the following command,
# export INFRACOST_API_KEY=XXXXXXXXXXXXX
Or you can log in to the Infracost UI and grant terminal access using the following command:
# infracost auth login
Note : Infracost will not send any cloud information to their server.
Step 3: Infracost validation
Next, we will do the validation. For validation purposes, I have cloned the GitHub repo below, which contains Terraform examples.
# git clone https://github.com/alfonsof/terraform-azure-examples.git
# cd terraform-azure-examples/code/01-hello-world
Try Infracost using the command below to get the estimated monthly cost:
# infracost breakdown --path .
To save the report in JSON format and upload it to the Infracost server, use the commands below:
# infracost breakdown --path . --format json --out-file infracost-demo.json
# infracost upload --path infracost-demo.json
In case we plan to upgrade the infrastructure and need to understand the new cost, execute the following command to compare it with the previously saved output from the Terraform code path.
# infracost diff --path . --compare-to infracost-demo.json
Thanks for reading!! We have installed Infracost and experimented with examples.
References:
https://github.com/infracost/infracost
https://www.infracost.io/docs/#quick-start
In this blog, we will install and examine a new tool called Trivy, which helps identify vulnerabilities, misconfigurations, licenses, secrets, and software dependencies in the following:
1. Container image
2. Kubernetes cluster
3. Virtual machine image
4. Filesystem
5. Git repo
6. AWS
Requirements:
1. One virtual machine
2. Any one of the targets mentioned above
Step 1 : Install Trivy
Execute the command below based on your OS.
For Mac :
brew install trivy
In this blog post, We will explore a new tool called "KOR" (Kubernetes Orphaned Resources), which assists in identifying unused resources within a Kubernetes(K8S) cluster. This tool will be beneficial for those who are managing Kubernetes clusters.
Requirements:
1. One machine (Linux/Windows/Mac)
2. K8s cluster
Step 1: Install kor on the machine
I'm using a Linux VM for this experiment; for other platforms, download the binaries from the link below:
https://github.com/yonahd/kor/releases
Download the Linux binary for the Linux VM:
wget https://github.com/yonahd/kor/releases/download/v0.1.8/kor_Linux_x86_64.tar.gz
tar -xvzf kor_Linux_x86_64.tar.gz
chmod 777 kor
cp -r kor /usr/bin
kor --help
Step 2: Nginx web server deployment in K8s
I have a K8s cluster; we will deploy the nginx web server in K8s and try out the kor tool.
Create a namespace called "nginxweb":
kubectl create namespace nginxweb
Using Helm, we will deploy the nginx web server with the command below:
helm install nginx bitnami/nginx --namespace nginxweb
kubectl get all -n nginxweb
Step 3: Validate with the kor tool
Let's check for unused resources with the kor tool in the nginx namespace. The command below lists all unused resources in the given namespace:
Syntax: kor all -n <namespace>
kor all -n nginxweb
Now let's delete the nginx deployment from the nginxweb namespace, which leaves its service orphaned, and try again.
kubectl delete deployments nginx -n nginxweb
Now check which resources are still available in the namespace:
kubectl get all -n nginxweb
The result shows one K8s service remaining in the nginxweb namespace.
Now try the kor tool again with the command below:
kor all -n nginxweb
kor now reports that the nginx service is not used anywhere in the namespace.
We can also check individual resource types (configmap, secret, services, serviceaccount, deployments, statefulsets, role, hpa), for example:
kor services -n nginxweb
kor serviceaccount -n nginxweb
kor secret -n nginxweb
That's all. We have installed the KOR tool and validated it by deleting one of the components of the nginx web server deployment.
References:
https://github.com/yonahd/kor
Similarly, we can add multiple application with different Job names.
Copy the grafana agent config file from below repo and update the required changes according on your needs.
We can start manually by below command in command prompt as well.
In command Prompt go to, C:\Program Files\Grafana Agent
Execute below command,
grafana-agent-windows-amd64.exe --config.file=agent-config.yaml
This will help to find any issue with the configuration.
Note : Here the Grafana loki distributed service endpoint(which is configured in the agent-config.yaml) should be accessible from the windows VM
Step 7 : Access VM application logs in Grafana Loki,
Go to Grafana Dashboard > Home > Explore > Select Loki Datasource
In the filter section, select "Job" and value as the job name which is given in the agent-config.yaml. In our case the job name is "devopsart-vm"
Now We are able to view the Grafana Dashboard logs in Grafana Loki. You can create the Dashboard from here based on your preference.
In Part 2, We covered how to export Windows VM application logs to Grafana Loki and how to view them from the Grafana Dashboard.
In Part 3, We will cover how to export Azure PAAS services logs to Grafana Loki
Dealing with multiple tools for capturing application logs from different sources can be a hassle for anyone. In this blog post, we'll dive into the steps required to establish centralized logging with Grafana Loki and Grafana Agent. This solution will allow us to unify the collection of logs from Kubernetes pods, VM services, and Azure PAAS services.
Grafana Loki : It is a highly scalable log aggregation system designed for cloud-native environments
Grafana Agent : It is an observability agent that collects metrics and logs from various application for visualization and analysis in Grafana
Requirement:
schemaConfig:
configs:
- from: "2020-09-07"
store: boltdb-shipper
object_store: azure
schema: v11
index:
prefix: index_
period: 24h
storageConfig:
boltdb_shipper:
shared_store: azure
active_index_directory: /var/loki/index
cache_location: /var/loki/cache
cache_ttl: 1h
filesystem:
directory: /var/loki/chunks
azure:
account_name: === Azure Storage name ===
account_key: === Azure Storage access key ===
container_name: === Container Name ===
request_timeout: 0
In this blog, we will explore a new tool called 'Rover,' which helps to visualize the Terraform plan
Rover : This open-source tool is designed to visualize Terraform Plan output, offering insights into infrastructure and its dependencies.
We will use the "Rover" docker image, to do our setup and visualize the infra.
Requirements:
1.Linux/Windows VM
2. Docker
Steps 1 : Generate terraform plan output
I have a sample Azure terraform block in devopsart folder, will generate terraform plan output from there and store is locally.
cd devopsart
terraform plan -out tfplan.out
terraform show -json tfplan.out > tfplan.json
Now both the files are generated.
Step 2 : Run Rover tool locally,
Execute below docker command to run rover from the same step 1 path,
docker run --rm -it -p 9000:9000 -v $(pwd)/tfplan.json:/src/tfplan.json im2nguyen/rover:latest -planJSONPath=tfplan.json
Its run the webUI in port number 9000.
Step 3 : Accessing Rover WebUI,
Lets access the WebUI and check it,
Go to browser, and enter http://localhost:9000
In the UI, color codes on the left side provide assistance in understanding the actions that will take place for the resources when running terraform apply.
When a specific resource is selected from the image, it will provide the name and parameter information.
Additionally, the image can be saved locally by clicking the 'Save' option
I hope this is helpful for someone who is genuinely confused by the Terraform plan output, especially when dealing with a large infrastructure.
Thanks for reading!! We have tried Rover tool and experimented with examples.
Reference:
https://github.com/im2nguyen/rover
Infracost : It provides cloud cost projections from Terraform. It enables engineers to view a detailed cost breakdown and comprehend expenses before implementions.
Requirement :
1. One window/Linux VM
2.Terraform
3.Terraform examples
Step 1 : infracost installation,
For Mac, use below brew command to do the installation,
# brew install infracost
For other Operating systems, follow below link,
https://www.infracost.io/docs/#quick-start
Step 2 : Infracost configuration,
We need to set up the Infracost API key by signing up here,
https://dashboard.infracost.io
Once logged in, visit the following URL to obtain the API key,
https://dashboard.infracost.io/org/praboosingh/settings/general
Next, open the terminal and set the key as an environment variable using the following command,
# export INFRACOST_API_KEY=XXXXXXXXXXXXX
or You can log in to the Infracost UI and grant terminal access by using the following command,
# infracost auth login
Note : Infracost will not send any cloud information to their server.
Step 3 : Infracost validation
Next, We will do the validation. For validation purpose i have cloned below github repo which contains terraform examples.
# git clone https://github.com/alfonsof/terraform-azure-examples.git
# cd terraform-azure-examples/code/01-hello-world
try infracost by using below command to get the estimated cost for a month,
# infracost breakdown --path .
To save the report in json format and upload to infracost server, use below command,
# infracost breakdown --path . --format json --out-file infracost-demo.json
# infracost upload --path infracost-demo.json
In case we plan to upgrade the infrastructure and need to understand the new cost, execute the following command to compare it with the previously saved output from the Terraform code path.
# infracost diff --path . --compare-to infracost-demo.json
Thanks for reading!! We have installed infracost and experimented with examples.
References:
https://github.com/infracost/infracost
https://www.infracost.io/docs/#quick-start
In this blog, we will install and examine a new tool called Trivy, which helps identify vulnerabilities, misconfigurations, licenses, secrets, and software dependencies in the following,
1.Container image
2.Kubernetes Cluster
3.Virtual machine image
4.FileSystem
5.Git Repo
6.AWS
Requirements,
1.One Virtual Machine
2.Above mentioned tools anyone
Step 1 : Install Trivy
Exceute below command based on your OS,
For Mac :
brew install trivy
In this blog post, We will explore a new tool called "KOR" (Kubernetes Orphaned Resources), which assists in identifying unused resources within a Kubernetes(K8S) cluster. This tool will be beneficial for those who are managing Kubernetes clusters.
Requirements:
1.One machine(Linux/Windows/Mac)
2.K8s cluster
Step 1 : Install kor in the machine.
Am using linux VM to do the experiment and for other flavours download the binaries from below link,
https://github.com/yonahd/kor/releases
Download the linux binary for linux VM,
wget https://github.com/yonahd/kor/releases/download/v0.1.8/kor_Linux_x86_64.tar.gz
tar -xvzf kor_Linux_x86_64.tar.gz
chmod 777 kor
cp -r kor /usr/bin
kor --help
Step 2 : Nginx Webserver deployment in K8s
I have a k8s cluster, We will deploy nginx webserver in K8s and try out "kor" tool
Create a namespace as "nginxweb"
kubectl create namespace nginxweb
Using helm, we will deploy nginx webserver by below command,
helm install nginx bitnami/nginx --namespace nginxweb
kubectl get all -n nginxweb
Step 3 : Validate with kor tool
lets check the unused resources with kor tool in the nginx namespace,
Below command will list all the unused resources available in the given namespace,
Syntax : kor all -n namespace
kor all -n nginxweb
lets delete one service from the nginxweb namespace and try it.
kubectl delete deployments nginx -n nginxweb
Now check what are the resources are available in the namespace,
kubectl get all -n nginxweb
it gives the result of one k8s service is available under the nginxweb namespace
And now try out with kor tool using below command,
kor all -n nginxweb
it gives the same result, that the nginx service is not used anywhere in the namespace.
We can check only configmap/secret/services/serviceaccount/deployments/statefulsets/role/hpa by,
kor services -n nginxweb
kor serviceaccount -n nginxweb
kor secret -n nginxweb
That's all. We have installed the KOR tool and validated it by deleting one of the component in the Nginx web server deployment.
References:
https://github.com/yonahd/kor
Ever missed a certificate renewal and had a production outage? Or forgot to renew a software license until it was too late? I built CLARES to make sure that never happens again.
CLARES stands for Compliance License & Asset Reminder Engine System. It's a full-stack web application that tracks expiry dates of SSL certificates, software licenses, compliance certificates, and any custom asset type your organization manages — and sends email reminders before things expire.
I built it because every team I've worked with has the same problem: critical renewals tracked in spreadsheets, emails, or someone's memory. CLARES replaces all of that with a single, centralized dashboard.
In any enterprise environment, you're juggling dozens (or hundreds) of:
The cost of missing even one renewal can be significant — from service outages to compliance violations. CLARES gives you a single pane of glass with urgency-based grouping and automated email reminders.
Step 1: Login — Users authenticate with username/password. The server validates credentials, checks the account is active, and issues a JWT token (8-hour expiry). Deactivated accounts get a clear error message — no cryptic "session expired" nonsense.
Step 2: Dashboard — The home page auto-fetches all renewal items and groups them into four urgency buckets: Expired, Critical (≤14 days), Warning (≤30 days), and Upcoming (≤90 days). Summary cards show counts per catalog type.
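The grouping logic can be sketched in a few lines (illustrative Python, not CLARES's actual code; the function name and the "OK" bucket for items beyond the 90-day horizon are assumptions based on the thresholds above):

```python
from datetime import date

# Bucket an item by days until expiry, using the dashboard thresholds
# described above: Expired, Critical (<=14d), Warning (<=30d), Upcoming (<=90d).
def urgency_bucket(expiry: date, today: date) -> str:
    days_left = (expiry - today).days
    if days_left < 0:
        return "Expired"
    if days_left <= 14:
        return "Critical"
    if days_left <= 30:
        return "Warning"
    if days_left <= 90:
        return "Upcoming"
    return "OK"  # beyond the dashboard's 90-day horizon (assumption)

today = date(2024, 6, 1)
print(urgency_bucket(date(2024, 5, 20), today))  # Expired
print(urgency_bucket(date(2024, 6, 10), today))  # Critical (9 days left)
print(urgency_bucket(date(2024, 6, 25), today))  # Warning (24 days left)
print(urgency_bucket(date(2024, 8, 15), today))  # Upcoming (75 days left)
```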
Step 3: Manage Catalogs — The sidebar lists all catalog types. Click any catalog to view, add, edit, or delete entries. Upload up to 500 rows at once via CSV bulk import. Add custom catalog types beyond the built-in ones.
Step 4: Granular Permissions — Global admins see everything. Other users get per-catalog roles: No Access, View (read-only), or Admin (full CRUD). A user can be a viewer globally but an admin for specific catalogs. Permissions are enforced on both frontend and backend.
Step 5: Email Reminders — Configure SMTP settings from the admin page, test the connection, send a test email, then trigger reminders. Each item has its own reminder config — how many days before expiry and how many times to repeat. The system calculates exact send dates by evenly spacing repeats within the window (e.g. 30 days / 3 repeats → reminders at 30, 20, and 10 days before expiry).
Step 6: Automatic Scheduler — Enable the daily auto-reminder from Admin Settings and pick an hour (server time). A background scheduler checks every 60 seconds and fires once per day. A reminder_logs table tracks which reminder number has been sent per item — no duplicates, and missed reminders are caught up automatically.
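As a rough sketch of the dedup and catch-up behavior described above (hypothetical Python, not the project's code; a plain set stands in for the reminder_logs table):

```python
# Hypothetical stand-in for the reminder_logs table: (item_id, reminder_number)
sent_log = set()

def due_reminders(item_id, send_offsets, days_left):
    """Return reminder numbers whose send offset has been reached and that
    have not been sent yet. Because sent numbers are skipped, a missed day
    is caught up on the next run instead of being re-sent."""
    due = []
    for n, offset in enumerate(send_offsets, start=1):
        if days_left <= offset and (item_id, n) not in sent_log:
            due.append(n)
    return due

def send(item_id, numbers):
    for n in numbers:
        sent_log.add((item_id, n))  # record in the "reminder_logs" stand-in

# 30-day window, 3 repeats -> send offsets 30, 20, 10 days before expiry
offsets = [30, 20, 10]
print(due_reminders("cert-1", offsets, 25))  # [1]  (30-day reminder is due)
send("cert-1", [1])
print(due_reminders("cert-1", offsets, 25))  # []   (no duplicate emails)
print(due_reminders("cert-1", offsets, 9))   # [2, 3] (missed 20-day reminder caught up)
```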
| Layer | Technology |
|---|---|
| Frontend | React 18, Vite 5, React Router v6 |
| Backend | Node.js 20, Express 4 |
| Database | PostgreSQL |
| Auth | JWT + bcrypt |
| Email | Nodemailer (configurable SMTP) |
| Deployment | Docker (multi-arch), Helm, Kubernetes |
The frontend is a React SPA built by Vite into static files. Express serves both the static files and the REST API on the same port. Authentication is stateless via JWT — no server-side sessions. The whole thing is packaged into a single Docker image using a multi-stage build.
Items are automatically grouped by urgency. No more scanning through spreadsheets — you instantly see what needs attention. Summary cards give you counts per catalog type at a glance.
Three built-in catalog types (Certificates, Licenses, SSL Certs) plus unlimited custom types. Each catalog tracks items with name, environment, expiry date, owner, notes, and per-item reminder settings.
Download a CSV template, fill it in, and upload up to 500 rows at once. Perfect for initial data migration or when you inherit a spreadsheet full of renewal dates.
This was one of the trickier features. The permission model has two layers: a global role and an optional per-catalog role.
A user can be a global Viewer but have Admin rights on specific catalogs. This means you can delegate management of SSL certificates to the infra team without giving them access to license data.
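A minimal sketch of how such two-layer resolution might work (hypothetical names and structures, not the actual CLARES code):

```python
# Role ranks: a per-catalog grant, when present, overrides the global role.
ROLE_RANK = {"no_access": 0, "view": 1, "admin": 2}

def effective_role(user: dict, catalog: str) -> str:
    # Global admins bypass per-catalog checks entirely.
    if user.get("global_role") == "admin":
        return "admin"
    # A catalog-specific grant overrides the global default.
    return user.get("catalog_roles", {}).get(
        catalog, user.get("global_role", "no_access"))

def can_edit(user: dict, catalog: str) -> bool:
    return ROLE_RANK[effective_role(user, catalog)] >= ROLE_RANK["admin"]

# Global viewer who administers only the SSL certificates catalog
infra_user = {"global_role": "view", "catalog_roles": {"ssl_certs": "admin"}}
print(effective_role(infra_user, "licenses"))  # view
print(can_edit(infra_user, "ssl_certs"))       # True
print(can_edit(infra_user, "licenses"))        # False
```

Enforcing this check in the API layer as well as the UI is what makes the permission model trustworthy: hiding a button is cosmetic, rejecting the request is security.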
Configure any SMTP server (Exchange, Gmail, etc.) from the admin UI. Test the connection, send a test email, then trigger reminders. Each item can have its own reminder settings — how many days before expiry, and how many times to repeat.
The system calculates exact reminder dates by evenly spacing the repeat count within the reminder window. For example:
SSL cert expires June 10 · Remind 30 days before · Repeat 3 times
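The even-spacing rule can be illustrated as follows (a sketch assuming step = window / repeats; not the project's actual implementation):

```python
from datetime import date, timedelta

# Evenly space `repeats` reminders inside the reminder window:
# reminders fire at window, window - step, window - 2*step, ... days before expiry.
def reminder_dates(expiry: date, window_days: int, repeats: int):
    step = window_days // repeats
    return [expiry - timedelta(days=window_days - i * step)
            for i in range(repeats)]

# SSL cert expires June 10, remind 30 days before, repeat 3 times
print(reminder_dates(date(2024, 6, 10), 30, 3))
# -> May 11, May 21, May 31 (30, 20, and 10 days before expiry)
```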
Emails include the reminder number (e.g. "reminder 2 of 3") and a color-coded status — red for ≤7 days, amber for ≤14, green for 14+.
No more relying on someone clicking "Send Reminders" manually. Enable the automatic reminder scheduler from Admin Settings, pick an hour (0–23, server time), and CLARES handles the rest:
A reminder_logs table tracks which reminder number has been sent per item — no duplicate emails.

The app is containerized with a multi-stage Dockerfile and supports multi-architecture builds (amd64/arm64). For Kubernetes deployment, there's a complete Helm chart with:
kubectl exec deployment/clares -n clares -- node server/setup.js

The setup script is idempotent — it creates tables only if they don't exist and seeds a default admin user when the users table is empty. Safe to run multiple times.
```shell
# Clone the repo
git clone https://github.com/DevOpsArts/clares.git
cd clares

# Option 1: Local development
npm install
cp .env.example .env   # Edit with your DB credentials
npm run setup-db
npm run dev            # Frontend → :5174, API → :3002

# Option 2: Kubernetes with Helm
helm install clares-postgres bitnami/postgresql \
  --set auth.database=clares --set auth.username=clares \
  --namespace clares --create-namespace
helm install clares ./helm/clares-engine \
  -f ./helm/clares-engine/values-minikube.yaml \
  --namespace clares
kubectl exec deployment/clares -n clares -- node server/setup.js

# Login with admin / admin
```
CLARES is open source on GitHub: github.com/DevOpsArts/clares
Check out the project page: devopsarts.github.io/clares
If you're managing renewal dates in spreadsheets, give CLARES a try. It takes under 5 minutes to deploy and the default admin account is ready out of the box.
Built with React 18, Node.js, Express, PostgreSQL, Docker, and Helm. Deployed on Kubernetes.
The only Kubernetes log agent with intelligent error context capture, rule-based alerting, and 9 pluggable storage backends.
Every SRE knows the pain: an alert fires at 3 AM, and you're digging through gigabytes of logs trying to understand what happened before the error. Traditional log solutions either capture everything (expensive) or miss crucial context (frustrating).
What if your log agent was smart enough to capture only what matters—the error AND the context around it—and alert you instantly?
Logsenta is an open-source Kubernetes log monitoring agent that solves this problem with intelligent error-aware context capture and rule-based alerting. Instead of blindly forwarding all logs, Logsenta:
Logsenta uses regex and string-based pattern matching to detect errors across multiple languages and frameworks:
```yaml
errorPatterns:
  - "ERROR"
  - "Exception"
  - "FATAL"
  - "panic:"
  - "Traceback"
  - "OOMKilled"
  - "CrashLoopBackOff"
```

These patterns are fully customizable.
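As an illustration of how such pattern matching might behave (a sketch, not Logsenta's actual code; the regex-with-substring-fallback detail is an assumption):

```python
import re

# A subset of the configurable error patterns shown above.
PATTERNS = ["ERROR", "Exception", "FATAL", "panic:", "Traceback", "OOMKilled"]

def is_error_line(line: str) -> bool:
    """Try each pattern as a regex; fall back to a plain substring
    match if the pattern is not valid regex (illustrative behavior)."""
    for pat in PATTERNS:
        try:
            if re.search(pat, line):
                return True
        except re.error:
            if pat in line:
                return True
    return False

print(is_error_line("2024-06-01 FATAL disk full"))            # True
print(is_error_line("java.lang.NullPointerException at ...")) # True
print(is_error_line("GET /healthz 200 OK"))                   # False
```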
Route different error patterns to different teams with customizable thresholds:
```yaml
alerting:
  enabled: true
  rules:
    # Critical errors → On-call team immediately
    - name: "critical-errors"
      patterns: ["CRITICAL", "FATAL", "OOMKilled", "panic:"]
      threshold:
        count: 1          # Alert on FIRST occurrence
        windowSeconds: 60
      email:
        enabled: true
        toAddresses: ["oncall@company.com"]
    # Java exceptions → Backend team (after 2 occurrences)
    - name: "java-exceptions"
      patterns: ["NullPointerException", "OutOfMemoryError"]
      threshold:
        count: 2          # Alert after 2 occurrences
        windowSeconds: 300
      email:
        enabled: true
        toAddresses: ["backend-team@company.com"]
    # Python errors → Data team
    - name: "python-errors"
      patterns: ["Traceback", "TypeError", "ValueError"]
      threshold:
        count: 1
      email:
        enabled: true
        toAddresses: ["data-team@company.com"]
```
Why rule-based alerting matters:
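One way to picture a count-within-window threshold like the rules above (a hypothetical sketch, not Logsenta's implementation):

```python
from collections import deque

class ThresholdRule:
    """Fire when `count` matching lines arrive within a sliding window of
    `window_seconds` — mirroring the threshold config shown above."""
    def __init__(self, count: int, window_seconds: int):
        self.count = count
        self.window = window_seconds
        self.hits = deque()  # timestamps of recent matches

    def record(self, ts: float) -> bool:
        self.hits.append(ts)
        # Drop hits that have slid out of the window.
        while self.hits and ts - self.hits[0] > self.window:
            self.hits.popleft()
        return len(self.hits) >= self.count

# java-exceptions rule: count=2 within 300 seconds
rule = ThresholdRule(count=2, window_seconds=300)
print(rule.record(0))     # False (1 hit)
print(rule.record(100))   # True  (2 hits within 300s)
print(rule.record(1000))  # False (older hits fell out of the window)
```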
One agent, any storage destination:
| Backend | Use Case |
|---|---|
| PostgreSQL | Relational queries, SQL analysis |
| MongoDB | Flexible document storage |
| Elasticsearch | Full-text search, Kibana dashboards |
| Azure Log Analytics | Azure ecosystem, KQL queries |
| AWS CloudWatch | AWS ecosystem, CloudWatch Insights |
| GCP Cloud Logging | Google Cloud ecosystem |
Configure how much context to capture around errors:
```yaml
captureWindow:
  bufferDurationMinutes: 2   # Lines BEFORE error
  captureAfterMinutes: 2     # Lines AFTER error
```
This means when an error occurs, you get the full story—not just the error line.
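The capture behavior can be sketched as a rolling pre-error buffer plus a post-error countdown (illustrative Python; line counts stand in for the time-based settings above):

```python
from collections import deque

class ContextCapture:
    """Keep a rolling buffer of recent lines; when an error arrives, emit
    the buffered context, the error itself, and the next `after` lines."""
    def __init__(self, before: int, after: int):
        self.before = deque(maxlen=before)  # rolling pre-error buffer
        self.after = after
        self.pending = 0                    # post-error lines still to capture
        self.captured = []

    def feed(self, line: str, is_error: bool):
        if is_error:
            self.captured.extend(self.before)  # flush context before the error
            self.captured.append(line)
            self.pending = self.after
            self.before.clear()
        elif self.pending > 0:
            self.captured.append(line)         # context after the error
            self.pending -= 1
        else:
            self.before.append(line)           # normal line: buffer and move on

cap = ContextCapture(before=2, after=1)
for line in ["a", "b", "c", "ERROR x", "d", "e"]:
    cap.feed(line, line.startswith("ERROR"))
print(cap.captured)  # ['b', 'c', 'ERROR x', 'd']
```

Lines "a" and "e" are never stored: everything outside the error's context window is discarded, which is where the cost savings come from.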
Deploy Logsenta in under 2 minutes:
```shell
# Clone the repository
git clone https://github.com/DevOpsArts/logsenta.git
cd logsenta

# Install with Helm (PostgreSQL backend + alerting)
helm install logsenta-engine ./charts/logsenta-engine \
  --namespace logsenta \
  --create-namespace \
  --set storage.type=postgresql \
  --set connections.postgresql.host=your-db-host \
  --set connections.postgresql.username=logsenta \
  --set connections.postgresql.password=YOUR_PASSWORD \
  --set alerting.enabled=true \
  --set alerting.email.smtpHost=smtp.company.com
```
┌─────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌─────┐ ┌─────┐ ┌─────┐ │
│ │Pod A│ │Pod B│ │Pod C│ ← Monitored │
│ └──┬──┘ └──┬──┘ └──┬──┘ │
│ └────────┼────────┘ │
│ ▼ │
│ ┌────────────────────┐ │
│ │ Logsenta-Engine │ │
│ │ • Error Detection │ │
│ │ • Context Capture │ │
│  │ • Rule-Based Alert│ ──► Email           │
│  │ • Rolling Buffer  │ ──► Webhook         │
│ └─────────┬──────────┘ │
└───────────────┼─────────────────────────────┘
▼
┌────────────────┐
│ Storage Backend│
│ (Your Choice) │
└────────────────┘
Logsenta is open-source and free to use. Check out the resources below:
Have questions or feedback? Drop a comment below or open an issue on GitHub!
Tags: Kubernetes, DevOps, SRE, Logging, Monitoring, Alerting, Azure, AWS, GCP, Helm, Open Source
Go to Azure AKS, in the side blade select "Diagnostic Settings", and choose "Add Diagnostic Setting".
Then, in the new page, select which logs need to be sent to the Event Hub and choose "Stream to an Event Hub". Here, provide the newly created Event Hub namespace and Event Hub.
Step 10: Configure Grafana Agent to scrape the messages from Azure Event Hub
Next, we need to pull the data from Azure Event Hub and push it to Grafana Loki.
In our existing grafana-agent-values.yaml, add the lines below to pull the messages from Azure Event Hub, then redeploy the Grafana Agent in AKS.
Here is the reference GitHub URL, and below is the YAML.
https://github.com/DevOpsArts/grafana_loki_agent/blob/main/grafana-agent-values-azure-aks.yaml
```
loki.source.azure_event_hubs "azure_aks" {
  fully_qualified_namespace = "<Event Hub namespace hostname>:9093"
  event_hubs                = ["aks"]
  forward_to                = [loki.write.local.receiver]
  labels                    = {
    "job" = "azure_aks",
  }
  authentication {
    mechanism         = "connection_string"
    connection_string = "<Event Hub connection string>"
  }
}
```
Replace the placeholder values above with your Event Hub namespace hostname and connection string. We can add multiple Event Hubs in the Grafana Agent by providing a different job name for each Azure PaaS service.
Note : Make sure communication is established between Azure AKS and Azure Event Hub on port 9093 so the messages can be sent.
Redeploy the Grafana Agent in AKS using the command below (using helm upgrade --install, since the release already exists):
helm upgrade --install --values grafana-agent-values-azure-aks.yaml grafana-agent grafana/grafana-agent -n observability
Check that all the Grafana Agent pods are up and running using the command below:
kubectl get all -n observability
Now the Grafana Agent will pull the messages from Azure Event Hub and push them to Grafana Loki for any Azure AKS cluster that is configured to send logs via Diagnostic Settings.
We can verify the status of message processing from Azure Event Hub, including the status of incoming and outgoing messages.
Step 11: Access Azure AKS logs in Grafana dashboard,
Go to the Grafana Dashboard: Home > Explore > select the Loki data source.
In the filter section, select "Job" and set the value to the job name given in grafana-agent-values-azure-aks.yaml. In our case the job name is "azure_aks".
That's all! We have successfully deployed centralized logging with Grafana Loki and Grafana Agent for Kubernetes, VM applications, and Azure PaaS.
In Part 1, we covered how to set up Grafana Loki and Grafana Agent to view Kubernetes pod logs.
Requirement:
Next, double-click the downloaded exe and install it. By default, on Windows the installation path is:
C:\Program Files\Grafana Agent
Once installation is complete, we need to update the configuration based on our needs, such as which application logs to send to Grafana Loki.
In our case, we installed the Grafana Dashboard on the Windows VM and configured its logs in the Grafana Agent.
Similarly, we can add multiple applications with different job names.
Copy the Grafana Agent config file from the repo below and update it according to your needs.
We can also start the agent manually from the command prompt. Go to C:\Program Files\Grafana Agent and execute the command below:
grafana-agent-windows-amd64.exe --config.file=agent-config.yaml
Running it in the foreground like this helps surface any issues with the configuration.
Note : The Grafana Loki distributed service endpoint (configured in agent-config.yaml) must be reachable from the Windows VM.
Step 7 : Access VM application logs in Grafana Loki,
Go to Grafana Dashboard > Home > Explore > Select Loki Datasource
In the filter section, select "Job" and set the value to the job name given in agent-config.yaml. In our case the job name is "devopsart-vm".
Now we are able to view the Grafana Dashboard logs in Grafana Loki. From here, you can create dashboards based on your preference.
In Part 2, We covered how to export Windows VM application logs to Grafana Loki and how to view them from the Grafana Dashboard.
In Part 3, we will cover how to export Azure PaaS service logs to Grafana Loki.
Dealing with multiple tools for capturing application logs from different sources can be a hassle for anyone. In this blog post, we'll dive into the steps required to establish centralized logging with Grafana Loki and Grafana Agent. This solution will allow us to unify the collection of logs from Kubernetes pods, VM services, and Azure PAAS services.
Grafana Loki : a highly scalable log aggregation system designed for cloud-native environments.
Grafana Agent : an observability agent that collects metrics and logs from various applications for visualization and analysis in Grafana.
Requirement:
```yaml
schemaConfig:
  configs:
    - from: "2020-09-07"
      store: boltdb-shipper
      object_store: azure
      schema: v11
      index:
        prefix: index_
        period: 24h
storageConfig:
  boltdb_shipper:
    shared_store: azure
    active_index_directory: /var/loki/index
    cache_location: /var/loki/cache
    cache_ttl: 1h
  filesystem:
    directory: /var/loki/chunks
  azure:
    account_name: <Azure Storage account name>
    account_key: <Azure Storage access key>
    container_name: <Container name>
    request_timeout: 0
```
In this blog, we will explore a new tool called 'Rover,' which helps to visualize the Terraform plan.
Rover : This open-source tool is designed to visualize Terraform Plan output, offering insights into infrastructure and its dependencies.
We will use the "Rover" Docker image to do our setup and visualize the infra.
Requirements:
1. Linux/Windows VM
2. Docker
Step 1 : Generate terraform plan output
I have a sample Azure Terraform block in the devopsart folder; we will generate the Terraform plan output from there and store it locally.
cd devopsart
terraform plan -out tfplan.out
terraform show -json tfplan.out > tfplan.json
Now both the files are generated.
Step 2 : Run Rover tool locally,
Execute below docker command to run rover from the same step 1 path,
docker run --rm -it -p 9000:9000 -v $(pwd)/tfplan.json:/src/tfplan.json im2nguyen/rover:latest -planJSONPath=tfplan.json
It runs the web UI on port 9000.
Step 3 : Accessing Rover WebUI,
Let's access the web UI and check it.
Go to browser, and enter http://localhost:9000
In the UI, color codes on the left side provide assistance in understanding the actions that will take place for the resources when running terraform apply.
When a specific resource is selected from the image, it will provide the name and parameter information.
Additionally, the image can be saved locally by clicking the 'Save' option.
I hope this is helpful for someone who is genuinely confused by the Terraform plan output, especially when dealing with a large infrastructure.
Thanks for reading!! We have tried the Rover tool and experimented with examples.
Reference:
https://github.com/im2nguyen/rover
Infracost : It provides cloud cost projections from Terraform. It enables engineers to view a detailed cost breakdown and understand expenses before implementing.
Requirement :
1. One Windows/Linux VM
2. Terraform
3. Terraform examples
Step 1 : Infracost installation
For macOS, use the brew command below to do the installation,
# brew install infracost
For other operating systems, follow the link below,
https://www.infracost.io/docs/#quick-start
Step 2 : Infracost configuration,
We need to set up the Infracost API key by signing up here,
https://dashboard.infracost.io
Once logged in, visit the following URL to obtain the API key,
https://dashboard.infracost.io/org/praboosingh/settings/general
Next, open the terminal and set the key as an environment variable using the following command,
# export INFRACOST_API_KEY=XXXXXXXXXXXXX
Alternatively, you can log in to the Infracost UI and grant terminal access using the following command,
# infracost auth login
Note : Infracost will not send any cloud information to their server.
Step 3 : Infracost validation
Next, we will do the validation. For validation purposes, I have cloned the GitHub repo below, which contains Terraform examples.
# git clone https://github.com/alfonsof/terraform-azure-examples.git
# cd terraform-azure-examples/code/01-hello-world
Try Infracost using the command below to get the estimated cost for a month,
# infracost breakdown --path .
To save the report in JSON format and upload it to the Infracost server, use the commands below,
# infracost breakdown --path . --format json --out-file infracost-demo.json
# infracost upload --path infracost-demo.json
In case we plan to upgrade the infrastructure and need to understand the new cost, execute the following command to compare it with the previously saved output from the Terraform code path.
# infracost diff --path . --compare-to infracost-demo.json
Thanks for reading!! We have installed infracost and experimented with examples.
References:
https://github.com/infracost/infracost
https://www.infracost.io/docs/#quick-start
In this blog, we will install and examine a new tool called Trivy, which helps identify vulnerabilities, misconfigurations, licenses, secrets, and software dependencies in the following,
1.Container image
2.Kubernetes Cluster
3.Virtual machine image
4.FileSystem
5.Git Repo
6.AWS
Requirements,
1.One Virtual Machine
2. Any one of the above-mentioned targets to scan
Step 1 : Install Trivy
Execute the command below based on your OS,
For Mac :
brew install trivy
In this blog post, we will explore a new tool called "KOR" (Kubernetes Orphaned Resources), which assists in identifying unused resources within a Kubernetes (K8s) cluster. This tool will be beneficial for those who are managing Kubernetes clusters.
Requirements:
1. One machine (Linux/Windows/Mac)
2. K8s cluster
Step 1 : Install kor in the machine.
I am using a Linux VM for this experiment; for other platforms, download the binaries from the link below,
https://github.com/yonahd/kor/releases
Download the linux binary for linux VM,
wget https://github.com/yonahd/kor/releases/download/v0.1.8/kor_Linux_x86_64.tar.gz
tar -xvzf kor_Linux_x86_64.tar.gz
chmod +x kor
cp kor /usr/bin/
kor --help
Step 2 : Nginx Webserver deployment in K8s
I have a K8s cluster; we will deploy the nginx webserver in K8s and try out the "kor" tool.
Create a namespace as "nginxweb"
kubectl create namespace nginxweb
Using Helm, we will deploy the nginx webserver with the command below,
helm install nginx bitnami/nginx --namespace nginxweb
kubectl get all -n nginxweb
Step 3 : Validate with kor tool
Let's check for unused resources with the kor tool in the nginx namespace. The command below lists all the unused resources in the given namespace,
Syntax : kor all -n <namespace>
kor all -n nginxweb
Let's delete the nginx deployment from the nginxweb namespace and try it again.
kubectl delete deployments nginx -n nginxweb
Now check which resources are available in the namespace,
kubectl get all -n nginxweb
The result shows that one K8s service is still present in the nginxweb namespace.
Now try the kor tool again using the command below,
kor all -n nginxweb
It reports the same: the nginx service is not used anywhere in the namespace.
We can also check only configmaps/secrets/services/serviceaccounts/deployments/statefulsets/roles/hpa, for example:
kor services -n nginxweb
kor serviceaccount -n nginxweb
kor secret -n nginxweb
That's all. We have installed the KOR tool and validated it by deleting one of the components of the nginx web server deployment.
References:
https://github.com/yonahd/kor