The definitive guide to best practices of observability. Or maybe not so much …
We live in a world where every job title is an acronym or abbreviation stitched together from all sorts of different words. I’m looking at you, SRE, DevOps, NetOps, NetSecOps, and all the other combinations of the words Site, Networking, Operations, Development, and Security.
Surely more combinations are generated daily and, to be fair, it’s already getting really hard to stay on top of them.
However, we are gathered here to talk about another phenomenon: observability (or o11y for those of you who like to nerd out with numeronyms). …
This post will be somewhat of a downer, but I see the subject as badly under-served in the whole startup world.
Mental health and mental illness are among the biggest factors we have to tackle as founders. I want to let all of you know that what you’re feeling is fine, and if some of the bad thoughts prevail on a deeper level, please seek professional help. …
The system that doesn’t call me in the middle of the night. No one wants to be woken up by a broken system! That’s why we have built our systems to break at just the right times, say when you have a free slot for an outage or are bored out of your mind. Just kidding: our systems are built so that, hopefully, they do not break at all. But let’s keep in mind that some things still go sideways, however much effort you put in.
I found out years ago that there isn’t a solution for SMBs, and that this is a fairly underserved market to begin with. My four co-founders and I saw that we wanted to build something for people who are not that tech-savvy but could still benefit from highly technological solutions. …
You can use the aforementioned sentence as a tl;dr; however, there is more to this story than plain old ignorance.
This issue is one of the most widespread of them all. Companies whose income comes from serving clients rely heavily on customers telling them when their service is unreachable. But that means the downtime shows up in their revenue numbers as well. It also upsets both the team behind the product and the consumer who was trying to buy the product in the first place.
Given that you would never want to lose money over something as simple as your service being unavailable, you would want to do something about it. …
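One "something" is to ask your service whether it is alive before a customer has to tell you. The snippet below is only a minimal sketch of that idea, not a replacement for a real monitoring setup. The function name `is_up`, the five-second timeout, and the rule that any 2xx or 3xx answer counts as "up" are my assumptions for illustration.

```python
import urllib.error
import urllib.request


def is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx or 3xx status.

    `opener` is injectable so the check can be exercised without a
    network; by default it is the standard library's urlopen.
    The timeout and the "2xx/3xx means up" rule are assumptions.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # urlopen raises HTTPError (a URLError) for 4xx/5xx answers,
        # and URLError or OSError for DNS failures, refused
        # connections, and timeouts. All of them mean "not reachable".
        return False
```

Run something like `is_up("https://example.com/health")` (a placeholder URL) from a scheduler every minute and notify yourself when it returns `False`, and you will usually hear about an outage before your customers do.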
For the sake of this article, let’s say that you already have a killer idea that you would like to put into play, and that you want to disrupt the vertical your idea lies in.
The majority of us have had some kick-ass shower thoughts, some of us have tried to put them to work, few of us have managed to succeed with them.
Let’s see which questions you should most certainly ask yourself before deep-diving into your new venture.
There are perks to going either way, and it really comes down to the idea that you have. Let’s say that you are a great product marketer and product visionary, but you lack technical coding skills or design skills. It would be highly beneficial to have someone by your side in a supportive role for the things that you lack.
Personally, I have done it both ways: I have gone solo, and I have worked both as a duo and with a larger group of co-founders (there are currently five of us in my startup).
There is no silver bullet here, but there is one key point you should follow: you have to get along with whomever you bring on board. Think of them as your second family, because when your product takes off and you become successful, you will share the same office and a seat at the table with them for many years to come. You need to get along! Keep in mind that getting along is just the tip of the iceberg. You and your co-founders need to click, and once you have it, you’ll know what I’m talking about. …
Today I want to spend time on a subject that is very dear to me: Prometheus and exporters, and ultimately monitoring altogether. First, I would like to spend some of the reading time explaining why I chose Prometheus as my weapon of choice when it comes to hosted monitoring software.
As per prometheus.io, Prometheus is …
… an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open source project and maintained independently of any company. …
For full disclosure, I have hired people into the role of Monitoring Engineering, and I have done it successfully: the guys who ended up in the position are still there, and they’re shining at what they do.
So here are my (and Rasmus Rüngenen’s) questions that you should ask any engineer you would hire to fill in the monitoring bits and pieces for your company.
I will formulate the questions and answers like this:
- What you are looking for in the answer
This time around I would like to tone down the enterprise-ish way of thinking and drill deeper into the needs of smaller businesses and their incident management process.
First of all, you will need a monitoring system of sorts to know where your issues are: whether the disks of your servers are filling up, the network has started flapping, or there is a wider outage with your service provider altogether.
In any case, we can assume that you have some sort of solution, either on-premises or SaaS, set up to monitor your whole stack. Once an alert is triggered and things seem bleak for your developers and admins, you need a process to guide them through the incident. …
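To make the "disks filling up" case concrete, here is a rough sketch of the kind of check a monitoring system runs for you. It only uses Python’s standard library. The function names and the 80% and 90% thresholds are assumptions I picked for illustration, so tune them to your own servers.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return how full the filesystem holding `path` is, in percent."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def classify(used_percent, warn_at=80.0, crit_at=90.0):
    """Map a usage percentage to a severity label.

    The default thresholds are illustrative assumptions, not a standard.
    """
    if used_percent >= crit_at:
        return "critical"
    if used_percent >= warn_at:
        return "warning"
    return "ok"
```

Calling `classify(disk_usage_percent("/"))` gives you one of three labels. Anything other than `"ok"` is the moment your alert fires and your incident process takes over.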
This piece of software is undoubtedly one of the most tedious around for sysadmins and developers alike. People have asked the question in my headline countless times on Stack Overflow and across the broader internet. Today I want to put a serious stop to it!
But first, let’s get to know what people think about vi/vim in general.
To start things off, I’d like to shed light on a problem that we had at the very beginning of our monitoring journey at Pipedrive. If you haven’t yet, please check out the post about how Pipedrive is “Fueling the Rocket for 500 deploys per week”, as it is central to the story here.
As you may know, developers usually want to get actionable data immediately after they hit the Deploy button. This is reasonable. Who wouldn’t want to know, as soon as possible, whether something they are implementing is behaving as it should?
That’s where we step in, “we” being the Monitoring Platform team at Pipedrive. To give you a little background: we used to use Zabbix as our monitoring platform, coupled with Graylog for logging and some homegrown scripts (which performed a few little magical things that I won’t go into now). Over the last two years, we improved our stack drastically, and the following will give you insight into how we managed to do this. …