PostgreSQL, the fourth most popular database and DBMS of the Year in 2017, has exploded in popularity amongst development and database communities across the world. PostgreSQL is stealing market share from leaders Oracle, MySQL, and Microsoft SQL Server, and PostgreSQL hosting is also heavily leveraged by new businesses in exciting spaces like IoT, e-commerce, SaaS, analytics, and more.
So What’s Trending in PostgreSQL Management?
We attended PostgresOpen in San Francisco last month to uncover the latest trends from the experts themselves.
Most Time-Consuming PostgreSQL Management Tasks
So, what’s eating up your time on the PostgreSQL management front? While there are thousands of tasks involved in managing your PostgreSQL production deployments, managing queries was the clear leader, cited by over 30% of respondents.
Managing space was a distant second, with 15% of PostgreSQL users finding it their most time-consuming task, followed by replication, upgrades, and monitoring. 23% of PostgreSQL users fell into the “All others” category, which consists of tasks like patching, recoveries, partitioning, and migrations.
Managing PostgreSQL Queries Breakdown
With managing PostgreSQL queries holding such a commanding lead, we dove deeper to see which specific tasks were consuming users’ time. The results spread across the entire query management process, from structuring queries at setup to optimizing them after analysis.
To explain this further, let’s start at the beginning of the query management process:
The smallest segment, managing query structures, accounted for 22% of responses from PostgreSQL users who selected queries as their most time-consuming management task.
Before getting started, it helps to understand how PostgreSQL plans your queries so you can match your query structure to your data’s properties. The planner builds each query plan as a tree of nodes: scan nodes at the bottom level return raw rows from a table (or from non-table sources, such as a VALUES clause), and the nodes above them join, aggregate, and sort those rows.
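To make the plan-tree structure concrete, here is a minimal sketch using EXPLAIN. The `users` table and its columns are hypothetical, used only for illustration; the exact plan and costs will vary with your schema and data.

```sql
-- Hypothetical table, for illustration only
CREATE TABLE users (id serial PRIMARY KEY, email text, created_at timestamptz);

-- EXPLAIN prints the plan tree; with no index on created_at,
-- the bottom-level node is a sequential scan node:
EXPLAIN SELECT email FROM users WHERE created_at > now() - interval '1 day';
-- Seq Scan on users
--   Filter: (created_at > (now() - '1 day'::interval))

-- Non-table row sources get their own scan node type:
EXPLAIN SELECT * FROM (VALUES (1), (2)) AS v(n);
-- Values Scan on "*VALUES*"
```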
Slow Query Analysis
After you’ve established your structure, the next step is analyzing your queries to identify slow-running ones that may be affecting your application’s performance. There is no universal definition of a “slow query,” but a common rule of thumb is any query that takes longer than 100ms.
Now that you’ve identified your slow queries, the real work begins: optimizing your PostgreSQL queries. Postgres performance tuning can be a grueling task, but with proper identification and analysis, you can home in on the bottlenecks, make the necessary query changes, and add indexes where needed to improve execution. Here’s a great article on performance tuning queries in PostgreSQL.
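A typical optimization loop can be sketched as follows. The `orders` table and `customer_id` column are hypothetical; the point is the workflow: measure the real plan first, then index the filtered column if the plan shows a sequential scan on a large table.

```sql
-- Step 1: measure the actual execution, not just the estimate
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- Step 2: if the plan shows "Seq Scan on orders" over many rows,
-- an index on the filtered column often helps. CONCURRENTLY avoids
-- blocking writes while the index builds (cannot run inside a transaction):
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);

-- Step 3: re-run EXPLAIN ANALYZE and confirm the planner now uses the index
```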
Most Important Metrics to Track for PostgreSQL Performance
Now that we’ve identified the most time-consuming PostgreSQL management task, let’s take a deeper look into the important metrics PostgreSQL users track to optimize their performance.
The most important PostgreSQL metrics results were significantly more even than the management tasks, resulting in a four-way tie between replication stats, CPU & RAM usage, transactions per second (TPS), and slow queries:
Monitoring your PostgreSQL replication status is a crucial task for ensuring your replicas stay in sync and your production deployments remain highly available. The replication process should be customized to best fit your application’s needs, and continuous endpoint monitoring is the best way to ensure your data is safe and ready for recovery.
It’s important to track metrics on both your primary and standby servers. Your standby servers should be monitored for incoming replication and recovery status, and your primary servers for outgoing replication and replication slots. If you’re using PostgreSQL streaming replication, replication slots are not always required. Streaming replication makes data available on your standby servers almost immediately and is ideal for low-TPS servers.
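These checks can be run with PostgreSQL’s built-in statistics views. A sketch, assuming PostgreSQL 10 or later (column and function names used the older `*_location`/`xlog` spellings before version 10):

```sql
-- On the primary: outgoing replication status per connected standby
SELECT client_addr, state, sent_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- On the primary: replication slots and whether each is in use
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;

-- On a standby: confirm recovery mode and see how far WAL has been
-- received and replayed
SELECT pg_is_in_recovery(),
       pg_last_wal_receive_lsn(),
       pg_last_wal_replay_lsn();
```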
CPU & RAM Usage
CPU and RAM (memory) usage are crucial metrics to monitor to ensure the health of your PostgreSQL servers. If your CPU usage is too high, your application will experience slowdowns that make your users suffer. This is often the result of poorly optimized queries, or even high query parallelism. Monitoring RAM is important both to ensure you have enough free memory and to understand exactly what your memory is being used for. It’s recommended to allocate approximately 25% of your memory to shared_buffers. PostgreSQL also defaults its per-operation working memory (work_mem) to 4MB, which is often too little and results in high execution times.
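The two memory settings above live in postgresql.conf. A sketch with illustrative values for a hypothetical machine with 16GB of RAM; tune to your own workload:

```
# postgresql.conf -- illustrative values for a 16GB machine
shared_buffers = 4GB    # ~25% of total memory, per the guideline above
work_mem = 64MB         # raised from the 4MB default; applies per sort/hash
                        # operation, so total usage multiplies across
                        # concurrent queries -- raise with care
```

Note that work_mem is allocated per sort or hash operation, not per connection, so a single complex query can use several multiples of it.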
Transactions Per Second
Monitoring the number of transactions per second allows you to determine the load on the system and its current throughput. By analyzing this metric, you can decide to scale the system accordingly to achieve the desired throughput, and determine how a change in configuration settings or system resources affects it.
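PostgreSQL exposes cumulative transaction counters in the pg_stat_database view, so TPS can be derived by sampling twice and dividing the delta by the interval. A minimal sketch:

```sql
-- Cumulative committed + rolled-back transactions for the current database
SELECT xact_commit + xact_rollback AS total_xacts
FROM pg_stat_database
WHERE datname = current_database();

-- Run this twice, N seconds apart:
--   TPS = (second_total_xacts - first_total_xacts) / N
```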
Slow Queries
Inefficient queries can slow down PostgreSQL performance even when the system is configured with adequate resources, so it’s always good practice to find and fix them. PostgreSQL provides a parameter called log_min_duration_statement: when set, it causes the duration of each completed statement to be logged if the statement ran for at least the specified number of milliseconds. Once the slow queries are identified, you can run EXPLAIN ANALYZE to understand the execution plan, which lets you track down the issue and optimize the query accordingly. Monitoring slow queries regularly helps you catch performance degradation before your users do.
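Enabling this logging can be sketched as follows; the 200ms threshold is illustrative, and the right value depends on your application’s latency budget:

```sql
-- Log every statement that runs for 200ms or longer
ALTER SYSTEM SET log_min_duration_statement = '200ms';
SELECT pg_reload_conf();  -- apply without a restart

-- Then take a statement from the log and inspect its real plan and timings:
EXPLAIN ANALYZE SELECT id FROM orders WHERE status = 'pending';
-- (orders/status are hypothetical; substitute your logged slow statement)
```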
Find us next week at the PostgresConf Silicon Valley 2018 event, where we hope to uncover more insights on what’s trending in the PostgreSQL management space. If you have any questions or comments, feel free to share them with us in the comments below or on Twitter at @scalegridio.