PROGRAMMING AND SOMETHING MORE...
PGConf Europe 2017 finished. It was great!
The only right direction when you have to struggle with MS SQL Server... :-)
PGConf Europe 2017 begins at last! The first presentation: "Database Horror Stories, Bad Code, and How Not to be a Statistic".
"36,321 tables, one row each, one database. A stored procedure with 1,752 parameters. A DBA convinced that storing everything in the entire database typed as BYTEA will definitely improve performance. All of these are true stories, and the sad fact is that many PostgreSQL professionals can relate to them -- even if they won't admit it in public."
But I noticed that all presented examples were written in T-SQL... LOL! Creative T-SQL developers strikes back! :-D SQL Server guys - so typical... ;-) This is what I expected from talented T-SQL developers. All these stupid things I always saw in every database created in MS SQL Server, never in Firebird, MySQL or PostgreSQL. Of course all databases, which I worked with, had bad things in design and in stored procedures, but there were negligible in comparison to "surprises" in SQL Server databases which I saw. Is it something wrong with SQL Server users? ;-)
PostgreSQL Europe is proud to announce the 9th Annual PostgreSQL Conference Europe which will be held at the Warsaw Marriott Hotel in Warsaw, Poland, on October 24-27 2017.See more: https://2017.pgconf.eu
For SQL Server/Oracle fanboys, open source haters and other ignorants
SQL Server not allowed - restricted area - authorized ANSI SQL only
"Developers are more likely to port SQL Server to Postgres because it is open source, supports more ANSI features, runs on more platforms, uses a different ACID model and has a great optimizer." Joe Celko
Something about reliability - another opinion same as above:
"I am DBA since 1992, working first with Ingres (the previous "release" of Postgres) and Oracle. What I can say is that with an Ingres cluster production server with 32 nodes and a customer base of 1,5 million customers, I was able to manage it alone during 2 years at TIM mobile company. The same customer base went to Oracle 8i paralel (so it means only 2 nodes) and 4 dbas spend some nights regularly in order to get it 24x7." Ingres/Oracle DBA
"SQL Server DBAs are the only ones that have to worry about tempdb (Oracle and Postgres DBAs get a break). But look at that tempdb contention limit. I like to point out to anyone who will listen that tempdb contention is Microsoft’s fault. Every minute spent on tempdb problems is time spent working around a defect in SQL Server. It’s frustrating. It’s already hard enough worrying about CPU and IO without worrying about logical contention caused by the database engine." Troubleshooting Tempdb, a Case StudyOracle has fine SQL implementation like PostgreSQL, but SQL Server not... T-SQL is EVIL!!! It is worst SQL dialect I've ever seen. It's like damned Internet Explorer in the databases world. Try this in SQL Server: "DROP SCHEMA IF EXISTS schema_name CASCADE" :-D You end up with custom recursive procedures and scanning system tables to do this operation, which needs only one fucking line in PostgreSQL! CASCADE is available since PostgreSQL 7.3 (2002) or even earlier. IF EXISTS is available since PostgreSQL 8.2 (2006). SQL Server doesn't support CASCADE clause, IF EXISTS is available since SQL Server 2016 (sic!). Date and time arithmetic in SQL Server - one big WTF?! DDL implementation - it's a joke! Sequences? Forget! Available since 2012! WTF?! No LIMIT/OFFSET support (TOP n is not the same), no ORDER BY clause in views, no correlated subqueries with aggregations, CASE statement can't use logical output values from conditional expressions in WHEN sections, default parameters in stored procedures/functions requires calling with DEFAULT keyword instead of omitting them like in pl/pgSQL and other languages with default parameters in subroutines (like C++, Java, PHP), WHERE 1 = 1 instead of WHERE TRUE, crazy "N" prefixed CHAR types and string literals, crappy non-standard compliant date and time types with insane functions, incomplete XML schema validator...
It's my favourite one fuck-up: triggers. Every RDBMS, which I used, offers virtual rows NEW and OLD at the trigger logic level. But SQL Server of course not! You have to run special SELECT queries on virtual read-only tables (inserted, deleted) to obtain primary key of the inserted/modified/deleted row and other columns. WTF?! What a stupid idea! Instead of overwrite some columns in NEW row or check columns in OLD/NEW virtual rows, before sending it to the table, you have to select rows from these fucking tables, get primary keys and run additional UPDATE query with selected primary key value to modify some columns in this new row after trigger activation (BEFORE UPDATE for example). Do you understand? You have to run extra UPDATE query from trigger logic to modify the same row in the table instead of modify NEW row associated with triggered BEFORE UPDATE action before writing it in the table. NEW/OLD rows should be available at trigger context like in other databases. WTF?! Unecessary additional, unintuitive SQL code and lower performance. USING clause in joins? Forget about it! Not implemented...
I found many more situations like this. Just a little example: you can't pass expressions directly as arguments to the stored procedures/functions - you have to assign them to the temporary variables and pass them to the called subroutines. Another one: XML validation schemas don't support ALTER expression - only CREATE and DROP. What a fucking inconsistency! It doesn't support full XSD specification for XML Schema Collections validation. Almost every single DDL operation requires stupid GO command which closes current session. Forget about transactions in DDL and batch execution of DDL commands, like in PostgreSQL you can do!
Yes! I knew it! Bad design from the ground - tempdb fuckups and pseudo-MVCC impossible to fix.
"Horse crap. Any major database that can't support simultaneous readers and writers without having to manually order is junk. Here's the real world example that prompted this post. I was asked to query another applications database for reporting. I couldn't alter the database by changing settings. Works fine except for the occasional deadlocks when the reader ran into a writer. Since I don't control the application or the database I can't change this situation. This is entirely common. And an example of what a piece of crap SQL Server is. Charles (Ted) Wise | Principal Architect | Health Diagnostic Laboratory, Inc. | 610-804-2964 (...) However, because MS added SNAPSHOT in response to losing marketing battles against Oracle, it was an afterthought. For performance reasons, it's horrible. It hammers SQL Server's common temp space (TEMPDB), which is another design issue altogether. The point is, unless you set the default transaction isolation level for SQL Server to SNAPSHOT, and take a huge performance hit, you aren't going to get the protection that Oracle's REDO log design gives you. It's a design limitation of SQL Server. This is not small potatoes, I've talked with a few SQL MVPs about this, they all know it's true. Please stop arguing that SQL Server's limitations are the developer's fault. (...) Ted is absolutely correct. People "work around" SQL Server's limited functionality by creating report databases of various kinds (mirrors, cubes, and so on). The fact that SELECTs block INSERTs, and break production ETL when someone runs a report or AdHoc query is truly pathetic. Sql Server invented Snapshotting, which is very inefficient and hammers SQL Server's TEMPDB. I work with SQL Server MVPs and we all know MS SQL Server is inferior to Oracle. But Sql Server costs less, depending on how you look at it. And, avoiding the technical argument by saying he's a small developer is ridiculous. It's simple: On Oracle (and even Postgres for example), SELECTs can NOT interrupt INSERTs, but on SQL Server, your production ETL is always at risk of failing due to someone deciding to ... gasp ... query a table. Select deadlocks in SQL Server - AKA SQL Server sucksSo yes - SQL Server sucks and you have to pay for it. Everywhere ugly hacks, workarounds, redundant procedural T-SQL spaghetti code with stupid NOLOCK hints unknown in other databases with full MVCC support. I know - SQL Server 2005 and higher supports READ COMMITTED SNAPSHOT mode with storing rows in tempdb which is default disabled... Is it a temporary, buggy, low performance ("Performance impact is minimal" - performance impact...), marketing solution for MVCC, because other databases have it from the beginning or at early development stage, like Oracle, PostgreSQL, InterBase/Firebird, MySQL (InnoDB)? "But it also has consequences, like tempdb growing and changed rows become wider by 14-byte pointer, possibly causing page splits if your fillfactor is 100%. So you probably want to test that before you enable it on existing systems." - my favourite fucking tempdb fuckups! LOL! Even SQLite suports MVCC since 2010 (project developed by smart three guys) - only 5 years later after READ COMMITTED SNAPSHOT introduction in MS SQL Server (product of one of the biggest IT corporations in the world). Why it is disabled by default? It's a bad joke... MVCC in PostgreSQL is available since 6.5 version (1999-06-09). SQL Server 6.0 (1995) - built-in replication with table level locking as a bonus (I saw databases based on SQL Server 2008 R2 still without replication) - easy configurable replication instead of MVCC. SQL Server 6.5 (1996) - table level locking. SQL Server 7.0 (1998) - row level locking, still without MVCC. SQL Server 2000 (2000-11) - still without MVCC. SQL Server 2005 (2006-01) - "MVCC" disabled by default even in 2008 and later. Wow! Compare COMMIT and ROLLBACK time of big UPDATE between PostgreSQL and SQL Server. In PostgreSQL it's almost instant, because it stores versioning data in table structure instead of throwing them away to the external tempdb database (if snapshot mode is enabled of course, if not - SELECT deadlocks FROM sql_server). I know that this mechanism requires auto vacuum, but it is a price for better concurrency and transactions management almost without deadlocks (no need to run super profiler tool to investigate another fucking deadlock in SS Agent job).
SQL Server is always faster than PostgreSQL? It's a bullshit. Both of them are comparable at speed, sometimes SQL Server is better, sometimes PostgreSQL is better, sometimes both engines do some tasks at the same performance. Because of MVCC architecture, COUNT(*) without WHERE clause is slower in PostgreSQL, but try to load metadata in SQL Server and compare loading time to the PostgreSQL time. LOL! What operation is more useful and done more frequently? Stupid COUNT without conditions whether loading database structure in a database administration tool? If you properly use PostgreSQL features unavailable in SQL Server, I bet it will be a lot faster than any SQL Server configuration for the same tasks. See above: transactions management and COMMIT/ROLLBACK time. Don't forget that it will be much cheaper and more friendly for database developers also.
Read this: Postgres looks to me like it's saving us like 5X in hardware costs as we continue to grow.
"This paper compares the performance and scaling of the BenchmarkSQL workload runningon Red Hat Enterprise Linux 5.4 with that of the same workload on Windows Server 2008 R2 Enterprise. The database servers used were HP ProLiant DL370 G6 servers equipped with 48 GB of RAM and comprised of dual sockets, each with a 3.2 GHz Intel Xeon W5580 Nehalem processor (totaling 8 cores).
The data presented in this paper establishes that a common OLTP workload on PostgreSQL can contend with SQL Server and with minimal tuning, is capable of outperforming SQL Server using the same load in an enterprise environment.
The number of actual users and throughput supported in any specific customer situation would naturally depend on the specifics of the application used and the degree of user activity." Comparing Benchmark SQL Performance on Red Hat® Enterprise Linux 5 to Windows Server Enterprise
"About a 5x-10x CPU and Disk I/O improvement migrating a pretty large project from [a major proprietary database mentioned in the article] to Postgres. CPU and Disk I/O Graphs below."
"I went through a similar switch from MSSQL to postgres. We moved 7 applications of various sizes over and the biggest hurdle was office politics. The migration never took more than a day for each app. ETL and reporting were not affected since we had previously integrated messaging (rabbitmq) pretty heavily into every system at the company and ETL gets their data from there. We also switched the DB servers to over to linux and got a huge performance boost from that. We did rewrite some stuff, but that was mostly to take advantage of postgres features like array types (instead of a article tags table, you'd have an tags array) which we used to speed up some applications. One we sped up by a factor of 10, just by using array types and offloading json generation to postgres: select array_to_json(array_agg(row_to_json(t))) from ( some query ) t" SQL Server 2014 Standard Edition Sucks
"Is running on linux, to it's cheaper. Even when we use SQL Server as Bizspark (for free now), the Azure Windows VM for it costs us much more, than VM with Linux. And of course when we want to cluster our DB, Postgres is even cheaper. - Great JSON support. There are parts of our project where it's helpfull. - Better configurability, like WAL, checkpoints etc. We have much better write performance on postgres than in sql server (probably just our case)." SQL Server and PostgreSQL developer/DBA
"Krzysztof Nienartowicz did a presentation on ESA Gaia Project at the European Space Agency. Gaia is a project to map all objects in the night sky over a period of years at much higher resolutions than was ever possible before. This uses a special new mirror-based satellite telescope developed for Gaia, and generates terabytes of data per day. Since they are already analyzing the data with PostgreSQL, they hope to use PostgresXC in order to build up a data platform to hold their huge amonts of sky object data to analyze it. He discussed some of the requirements they have for PostgresXC to start using it later this year." PostgresXC Developer Meeting pgCon 2014
"In this area Postgres - XL, a fork of PostgreSQL specifically designed to deal with huge amount of data, has been proved to be a very useful allied. Their cluster, based on this technology, will eventually store petabytes of data!" The Gaia Project and Postgres -XLSo what you said about poor PostgreSQL's performance? I can't hear it. PostgreSQL will be even better at IO operations if development team remove unnecessary system calls and replace pages caching via OS disk cache buffer with custom implementation at the RDBMS side. They know about this limitation and I think they improve this in the future.
PGConf Europe 2017: ESA - Gaia project (1 PB Postgres-XL database which still grows)
But why not SQL Server Enterprise Edition? It has cool features: compression, no RAM and CPU cores limit, SSIS, SSRS, SSAS, super ultimate ETL for BI, beautiful creators and wizards for monkeys, blah blah blah...
- European Space Agency: Gaia is an ambitious mission to chart a three-dimensional map of our Galaxy, the Milky Way
- European Space Agency and Gaia Data Processing and Analysis Consortium - over 1 PB of data and still grows
- Gaia Variability Studies: case for Postgres-XC
- Processing around 10 million events per second and generating 1.7 million aggregates
- Running 10 Million PostgreSQL Indexes In Production (And Counting)
- Plumbing the Internet with PostgreSQL: An Open Source Case Study
- PostgreSQL affiliates .ORG domain
- Size matters: Yahoo claims 2-petabyte database is world's biggest, busiest (based on a heavily modified PostgreSQL engine, the year-old database processes 24 billion events a day)
- Learnings from Five Years as a Skype Architect
- Building A High-Availability PostgreSQL Cluster At ARIN (closely replicated the Oracle RAC system)
- Yandex.Mail: Migration from Oracle to PostgreSQL (1 PB with redundancy, 100+ billion records, 250k TPS, faster deployment)
- Pentaho - data integration and business analytics platform
- Mondrian - Online Analytical Processing server (OLAP)
- BIRT - open source data visualizations and reports
- D3.js - Data-Driven Documents
- 2UDA - big data analytics with PostgreSQL
- Orange - open source machine learning and data visualization
- Weka 3: Data Mining Software in Java
- PgAdmin 3/PgAdmin 4 and PgAgent (I use it under Ubuntu and OS X)
- JetBrains DataGrip - multi-database with very good code completion and database navigation (I use it with MariaDB/PostgreSQL/SQL Server and with other JetBrains cool development tools under Ubuntu and OS X)
- Navicat for PostgreSQL with very good code completion and database navigation
- EMS SQL Manager for PostgreSQL with very good code completion and database navigation (I used it under Windows with PostgreSQL and damned SQL Server because MS SSMS sucks, except SSMS 2017)
- HeidiSQL for PostgreSQL (I used it with MySQL under OS X)
- and more...
- and more...
- SQL Feature Comparison
- PostgreSQL vs. MS SQL Server
- PostgreSQL vs. MS SQL Server at LinkedIn
- What makes PostgreSQL better than SQL Server?
- Ten Reasons PostgreSQL is Better Than SQL Server
- Top 10 Reasons I like Postgres Over SQL Server
- SQL Server 2008 R2 or PostgreSQL
- How Popular Databases Deal with DDL Commands in Transactions
- Is it possible to run multiple DDL statements inside a transaction (within SQL Server)?
- Migrating from MSSQL to PostgreSQL
- Why should you learn PostgreSQL?
- Skype Plans For PostgreSQL To Scale To 1 Billion Users
- Comparing Benchmark SQL Performance on Red Hat Enterprise Linux 5 to Windows Server Enterprise (Postgres Plus 8.3.8 vs SQL Server 2008 R2)
***Is it an idiot or stupid troll: I Hate Open Source Software - Why Open Source Sucks? Never mind. Hey, Grumpy Nerd, you corporate M$ scum! You can put all these MicroShit software like MS SQL Server, MS Internet Explorer and MS Outlook, deep in your ass! Except good products like: MS Office (without Outlook), .NET/C#, MS Visual Studio, TypeScript.
"After having to inherit and deal with a multi-terabyte mongo cluster in a production environment, I will never use it again. Especially with Postgres' composite types, jsonb querying and indexing, materialized views, plv8, and numerous intergrated transaction and locking capabilities.... It has everything I've needed in a database." Reddit user - "Why you should never, ever, ever use MongoDB"
"Similar aggregate queries deliver the same results. At 8Kdata, while developing ToroDB, we have performed dozens of similar queries. The results are consistently clear: a relational design almost always outperforms MongoDB aggregate queries by several factors, often orders of magnitude. Although we have seen even more pathological cases (i.e., even faster), it can safely be said that it is "100x faster". (...) While developing Stampede we have performed benchmarks where we have observed more than 2000x faster queries. Of course, this may be a degraded case for MongoDB and surely Stampede does not perform always as well on every single circumstance." The conundrum of BI/aggregate queries on MongoDB (PostgreSQL vs MongoDB - aggregation) - MongoDB to PostgreSQL BI Connector