Blog

  • Jui 27 2014

    PHP Tour 2014

    En début de semaine se tenait le PHP Tour 2014 à Lyon, et j'ai eu le privilège d'y être invité afin de présenter comment Utiliser PostgreSQL en 2014.

    À l'heure où le NoSQL passe de mode doucement, il est temps de se poser les bonnes questions vis à vis des technologies de bases de données à utiliser, comment et pourquoi. PostgreSQL entre de plein droit dans la case des SGBD relationnels classiques, aussi nous commencerons par étudier ce que de ces outils apportent. Puis nous ferons le tour des fonctionnalités avancées de PostgreSQL, qui le positionnent comme un élément clé de votre architecture d'application.

    J'ai ensuite eu le plaisir de présenter ce même contenu dans un contexte très différent grâce aux Brown Bag Lunch, dans les locaux d'une entreprise qui envisageait l'utilisation de MongoDB. Mais ça c'était avant.

    Voici une liste des articles contenant des exemples avancés d'utilisation de PostgreSQL, plusieurs d'entre eux ayant servi de support lors de la réalisation des slides utilisés dans la présentation :

     

    PHP Tour 2014

    June, 27 2014

    En début de semaine se tenait le PHP Tour 2014 à Lyon, et j'ai eu le privilège d'y être invité afin de présenter comment Utiliser PostgreSQL en 2014.

     

    Last week some PostgreSQL users, contributors and advocates have organized a really great conference in Stockholm, Sweden, where I had the please to give the following talk:

     

    In our previous article Aggregating NBA data, PostgreSQL vs MongoDB we spent time comparing the pretty new MongoDB Aggregation Framework with the decades old SQL aggregates. Today, let's showcase more of those SQL aggregates, producing a nice histogram right from our SQL console.

     

    When reading the article Crunching 30 Years of NBA Data with MongoDB Aggregation I coulnd't help but think that we've been enjoying aggregates in SQL for 3 or 4 decades already. When using PostgreSQL it's even easy to actually add your own aggregates given the SQL command create aggregate.

     

    Back from the FODESM 2014 Conference, here's the slides I've been using for the Advanced Extension Use Cases talk I gave, based on the ongoing work to be found under the Tour of Extensions index in this web site.

     

    Back From Dublin

    November, 05 2013

    Last week I had the pleasure to present two talks at the awesome PostgreSQL Conference Europe. The first one was actually a tutorial about Writing & using Postgres Extensions where we spent 3 hours on what are PostgreSQL Extensions, what you can expect from them, and how to develop a new one. Then I also had the opportunity to present the new version of pgloader in a talk about Migrating from MySQL to PostgreSQL.

     

    Denormalizing Tags

    October, 24 2013

    In our Tour of Extensions today's article is about advanced tag indexing. We have a great data collection to play with and our goal today is to be able to quickly find data matching a complex set of tags. So, let's find out those lastfm tracks that are tagged as blues and rhythm and blues, for instance.

     

    At the Open World Forum two weeks ago I had the pleasure to meet with Colin Charles. We had a nice talk about the current state of both MariaDB and PostgreSQL, and even were both interviewed by the Open World Forum Team. The interview is now available online. Dear French readers, it's in English.

     

    PostgreSQL is an all round impressive Relational DataBase Management System which implements the SQL standard (see the very useful reference page Comparison of different SQL implementations for details). PostgreSQL also provides with unique solutions in the database market and has been leading innovation for some years now. Still, there's no support for Autonomous Transactions within the server itself. Let's have a look at how to easily implement them with PL/Proxy.

     

    Let's get back to our Tour of Extensions that had to be kept aside for awhile with other concerns such as last chance PostgreSQL data recovery. Now that we have a data loading tool up to the task (read about it in the Loading Geolocation Data article) we're going to be able to play with the awesome ip4r extension from RhodiumToad.

     

    Last Friday I had the chance to be speaking at the Open World Forum in the NewSQL track, where we had lots of interest and excitement around the NoSQL offerings. Of course, my talk was about explaining how PostgreSQL is Web Scale with some historical background and technical examples about what this database engine is currently capable of.

     

    Using trigrams against typos

    September, 06 2013

    In our ongoing Tour of Extensions we played with earth distance in How far is the nearest pub? then with hstore in a series about trigger, first to generalize Trigger Parameters then to enable us to Auditing Changes with Hstore. Today we are going to work with pg_trgm which is the trigrams PostgreSQL extension: its usage got seriously enhanced in recent PostgreSQL releases and it's now a poor's man Full Text Search engine.

     

    In a previous article about Trigger Parameters we have been using the extension hstore in order to compute some extra field in our records, where the fields used both for the computation and for storing the results were passed in as dynamic parameters. Today we're going to see another trigger use case for hstore: we are going to record changes made to our tuples.

     

    Trigger Parameters

    August, 23 2013

    Sometimes you want to compute values automatically at INSERT time, like for example a duration column out of a start and an end column, both timestamptz. It's easy enough to do with a BEFORE TRIGGER on your table. What's more complex is to come up with a parametrized spelling of the trigger, where you can attach the same stored procedure to any table even when the column names are different from one another.

     

    There was SQL before window functions and SQL after window functions: that's how powerful this tool is. Being that of a deal breaker unfortunately means that it can be quite hard to grasp the feature. This article aims at making it crystal clear so that you can begin using it today and are able to reason about it and recognize cases where you want to be using window functions.

     

    In our recent article about The Most Popular Pub Names we did have a look at how to find the pubs nearby, but didn't compute the distance in between that pub and us. That's because how to compute a distance given a position on the earth expressed as longitude and latitude is not that easy. Today, we are going to solve that problem nonetheless, thanks to PostgreSQL Extensions.

     

    In his article titled The Most Popular Pub Names Ross Lawley did show us how to perform some quite interesting geographic queries against MongoDB, using some nice Open Data found at the Open Street Map project.

     

    In a recent article here we've been talking about how do do Batch Updates in a very efficient way, using the Writable CTE features available in PostgreSQL 9.1. I sometime read how Common Table Expressions changed the life of fellow DBAs and developers, and would say that Writable CTE are at least the same boost again.

     

    In a recent article Craig Kerstiens from Heroku did demo the really useful crosstab extension. That function allows you to pivot a table so that you can see the data from different categories in separate columns in the same row rather than in separate rows. The article from Craig is Pivoting in Postgres.

     

    Tonight I had the pleasure to present a talk at the Dublin PostgreSQL User Group using remote technologies. The talk is about how to make the most ouf of PostgreSQL when using SQL as a developer, and tries to convince you to dive into mastering SQL by showing how to solve an application example all in SQL, using window functions and common table expressions.

     

    Nearest Big City

    May, 02 2013

    In this article, we want to find the town with the greatest number of inhabitants near a given location.

     

    The Need For Speed

    March, 29 2013

    Hier se tenait la cinquième édition de la conférence organisée par dalibo, où des intervenants extérieurs sont régulièrement invités. Le thème hier était à la fois clair et très vaste : la performance.

     

    Batch Update

    March, 15 2013

    Performance consulting involves some tricks that you have to teach over and over again. One of them is that SQL tends to be so much better at dealing with plenty of rows in a single statement when compared to running as many statements, each one against a single row.

     

    HyperLogLog Unions

    February, 26 2013

    In the article from yesterday we talked about PostgreSQL HyperLogLog with some details. The real magic of that extension has been skimmed over though, and needs another very small article all by itself, in case you missed it.

     

    PostgreSQL HyperLogLog

    February, 25 2013

    If you've been following along at home the newer statistics developments, you might have heard about this new State of The Art Cardinality Estimation Algorithm called HyperLogLog. This technique is now available for PostgreSQL in the extension postgresql-hll available at https://github.com/aggregateknowledge/postgresql-hll and soon to be in debian.

     

    PostgreSQL for developers

    November, 02 2012

    As Guillaume says, we've been enjoying a great evening conference in Lyon 2 days ago, presenting PostgreSQL to developers. He did the first hour presenting the project and the main things you want to know to start using PostgreSQL in production, then I took the opportunity to be talking to developers to show off some SQL.

     

    Reset Counter

    October, 05 2012

    I've been given a nice puzzle that I think is a good blog article opportunity, as it involves some thinking and window functions.

     

    Let's say you need to ALTER TABLE foo ALTER COLUMN bar TYPE bigint;, and PostgreSQL is helpfully telling you that no you can't because such and such views depend on the column. The basic way to deal with that is to copy paste from the error message the names of the views involved, then prepare a script wherein you first DROP VIEW ...; then ALTER TABLE and finally CREATE VIEW again, all in the same transaction.

     

    Dynamic Triggers in PLpgSQL

    November, 24 2010

    You certainly know that implementing dynamic triggers in PLpgSQL is impossible. But I had a very bad night, being up from as soon as 3:30 am today, so that when a developer asked me about reusing the same trigger function code from more than one table and for a dynamic column name, I didn't remember about it being impossible.

     

    The drawback of hosting a static only website is, obviously, the lack of comments. What happens actually, though, is that I receive very few comments by direct mail. As I don't get another spam source to cleanup, I'm left unconvinced that's such a drawback. I still miss the low probability of seeing blog readers exchange directly, but I think a tapoueh.org mailing list would be my answer, here...

     

    Window Functions example

    September, 09 2010

    So, when 8.4 came out there was all those comments about how getting window functions was an awesome addition. Now, it seems that a lot of people seeking for help in #postgresql just don't know what kind of problem this feature helps solving. I've already been using them in some cases here in this blog, for getting some nice overview about Partitioning: relation size per “group”.

     

    Happy Numbers

    August, 30 2010

    After discovering the excellent Gwene service, which allows you to subscribe to newsgroups to read RSS content ( blogs, planets, commits, etc), I came to read this nice article about Happy Numbers. That's a little problem that fits well an interview style question, so I first solved it yesterday evening in Emacs Lisp as that's the language I use the most those days.

     

    Playing with bit strings

    August, 26 2010

    The idea of the day ain't directly from me, I'm just helping with a very thin subpart of the problem. The problem, I can't say much about, let's just assume you want to reduce the storage of MD5 in your database, so you want to abuse bit strings. A solution to use them works fine, but the datatype is still missing some facilities, for example going from and to hexadecimal representation in text.

     

    This time, we are trying to figure out where is the bulk of the data on disk. The trick is that we're using DDL partitioning, but we want a “nice” view of size per partition set. Meaning that if you have for example a parent table foo with partitions foo_201006 and foo_201007, you would want to see a single category foo containing the accumulated size of all the partitions underneath foo.

     

    This time we're having a database where sequences were used, but not systematically as a default value of a given column. It's mainly an historic bad idea, but you know the usual excuse with bad ideas and bad code: the first 6 months it's experimental, after that it's historic.

     

    The problem was raised this week on IRC and this time again I felt it would be a good occasion for a blog entry: how to load an XML file content into a single field?

     

  • Jui 23 2014

    PostgreSQL’s CTEs are optimisation fences

    When optimising queries in PostgreSQL (true at least in 9.4 and older), it’s worth keeping in mind...  

  • Jui 16 2014

    Conférences Françaises

    Ce mois-ci est particulièrement actif en terme de conférences françaises, puisqu'après un détour à Toulon pour les PGDay.fr nous avons un Meetup à Paris puis une conférence PHP Tour 2014 à Lyon.

    PHP Tour 2014

    À Lyon avec les développeurs PHP nous aborderons ce que signifie Utiliser PostgreSQL en 2014, et voici le résumé de la conférence :

    À l'heure où le NoSQL passe de mode doucement, il est temps de se poser les bonnes questions vis à vis des technologies de bases de données à utiliser, comment et pourquoi. PostgreSQL entre de plein droit dans la case des SGBD relationnels classiques, aussi nous commencerons par étudier ce que de ces outils apportent. Puis nous ferons le tour des fonctionnalités avancées de PostgreSQL, qui le positionnent comme un élément clé de votre architecture d'application.

    PostgreSQL Meetup Paris

    À Paris dès ce mercredi, nous nous retrouverons dans les locaux de Le Bon Coin pour un Meetup PostgreSQL très chaleureux où Jean-Louis nous parlera de leur utilisation de PostgreSQL, et en particulier du fait que 2TB of RAM ought to be enough for anybody!

     

  • Jui 16 2014

    Putting a PostgreSQL tablespace on a ramdisk risks ALL your data

    I periodically see people being advised to put their tablspaces on RAM disks or tempfs volumes. This...  

  • Mai 14 2014

    Why is pgloader so much faster?

    pgloader loads data into PostgreSQL. The new version is stable enough nowadays that it's soon to be released, the last piece of the 3.1.0 puzzle being full debian packaging of the tool.

    The pgloader logo is a loader truck, just because.

    As you might have noticed if you've read my blog before, I decided that pgloader needed a full rewrite in order for it to be able to enter the current decade as a relevant tool. pgloader used to be written in the python programming language, which is used by lots of people and generally quite appreciated by its users.

    Why changing

    Still, python is not without problems, the main ones I had to deal with being poor performances and lack of threading capabilities. Also, the pgloader setup design was pretty hard to maintain, and adding compatiblity to other loader products from competitors was harder than it should.

    As I said in my pgloader lightning talk at the 7th European Lisp Symposium last week, in searching for a modern programming language the best candidate I found was actually Common Lisp.

    After some basic performances checking as seen in my Common Lisp Sudoku Solver project where I did get up to ten times faster code when compared to python, it felt like the amazing set of features of the language could be put to good use here.

    So, what about performances after rewrite?

    The main reason why I'm now writing this blog post is receiving emails from pgloader users with strange feelings about the speedup. Let's see at the numbers one user gave me, for some data point:

     select rows, v2, v3,
            round((  extract(epoch from v2)
                   / extract(epoch from v3))::numeric, 2) as speedup
       from timing;
            
      rows   |        v2         |       v3        | speedup 
    ---------+-------------------+-----------------+---------
     4768765 | @ 37 mins 10.878  | @ 1 min 26.917  |   25.67
     3115880 | @ 36 mins 5.881   | @ 1 min 10.994  |   30.51
     3865750 | @ 33 mins 40.233  | @ 1 min 15.33   |   26.82
     3994483 | @ 29 mins 30.028  | @ 1 min 18.484  |   22.55
    (4 rows)
    
    The raw numbers have been loaded into a PostgreSQL table

    So what we see in this quite typical CSV Loading test case is a best case of 30 times faster import. Which brings some questions on the table, of course.

    Wait, you're still using COPY right?

    The PostgreSQL database system provides a really neat COPY command, which in turn is only exposing the COPY Streaming Protocol, that pgloader is using.

    So yes, pgloader is still using COPY. This time the protocol implementation is to be found in the Common Lisp Postmodern driver, which is really great. Before that, back when pgloader was python code, it was using the very good psycopg driver, which also exposes the COPY protocol.

    So, what did happen here?

    Well it happens that pgloader is now built using Common Lisp technologies, and those are really great, powerful and fast!

    Not only is Common Lisp code compiled to machine code when using most Common Lisp Implementations such as SBCL or Clozure Common Lisp; it's also possible to actually benfit from parallel computing and threads in Common Lisp.

    That's not how I did it!

    In the pgloader case I've been using the lparallel utilities, in particular its queuing facility to be able to implement asynchronous IOs where a thread reads the source data and preprocess it, fills up a batch at a time in a buffer that is then pushed down to the writer thread, that handles the COPY protocol and operations.

    So my current analysis is that the new thread based architecture used with a very powerful compiler for the Common Lisp high-level language are allowing pgloader to enter a whole new field of data loading performances.

    Conclusion

    Not only is pgloader so much faster now, it's also full of new capabilities and supports several sources of data such as dBase files, SQLite database files or even MySQL live connections.

    Rather than a configuration file, the way to use the new pgloader is using a command language that has been designed to look as much like SQL as possible in the pgloader context, to make it easy for its users. Implementation wise, it should now be trivial enough to implement compatibility with other data load software that some PostgreSQL competitor products do have.

    Also, the new code base and feature set seems to attract way more users than the previous implementation ever did, despite using a less popular programming language.

    You can already download pgloader binary packages for debian based distributions and centos based ones too, and you will even find a Mac OS X package file ( .pkg) that will make /usr/local/bin/pgloader available for you on the command line. If you need a windows binary, drop me an email.

    The first stable release of the new pgloader utility is scheduled to be named 3.1.0 and to happen quite soon. We are hard at work on packaging the dependencies for debian, and you can have a look at the Quicklisp to debian project if you want to help us get there!

     

  • Avr 17 2014

    New York!

    A couple of week ago I had the chance to participate into the PGConf NYC 2014 Conference, one of the biggest conferences about PostgreSQL worldwide.

    I presented one of my favourite talks over there, where the whole goal is to blow the mind of innocent developers and show them how much they can do in just SQL.

    PostgreSQL for developers, window functions galore!

    The basis for the talk is this detailed blog entry about the Reset Counter application and how to leverage SQL to write the code for it.

     

  • Avr 17 2014

    Meetup PostgreSQL à Paris

    Hier soir se déroulait le troisième Meetup PostgreSQL à Paris, et je crois pouvoir dire que tous les participants étaient ravis de cette édition.

    Malheureusement, quelques uns sont restés bloqués du mauvais côté de la porte hier soir, j'en suis navré. Nous essayerons d'étendre nos horaires d'accueil lors des prochaines rencontres, dans la mesure du possible.

    Les sponsors de la soirée

    Un grand merci à nos sponsors : hier soir eNovance nous accueillait dans une superbe salle de conférence et Hegoa s'est chargée de mettre à disposition des participants un buffet très apprécié.

    Si vous désirez intervenir auprès de notre organisation en tant que sponsor ou bien en tant que conférencier, n'hésitez pas à m'envoyer un email afin de nous organiser au mieux !

    Annonce Officielle du Ministère de l'Intérieur

    Vincent Laborie a eu l'opportunité hier soir de faire la première annonce officielle de l'utilisation de PostgreSQL en production au sein des Systèmes d'Information de la Sécurité Intérieure, c'est à dire dans les applications critiques déployées au sein du Ministère de l'Intérieur et dont les utilisateurs sont nos policiers et gendarmes.

    OpenStack et PostgreSQL

    Julien Danjou nous a ensuite présenté l'utilisation faite par OpenStack de PostgreSQL, et nous retiendrons qu'il reste beaucoup de progrès à faire en la matière, à commencer par une intégration véritable des tests unitaires.

    Une fois de plus, une analyse faite avec assez de recul montre que l'utilisation d'un ORM reste un problème car il impose un nivellement par le bas des fonctionnalités exploitées dans les applications. La portabilité du SGBD cible ne s'obtient qu'à un coût exhorbitant.

    Learn everything you need to build a successful Python project

    Si vous voulez approfondir le sujet, je détaille mon point de vue sur les problèmes d'intégration liés aux ORM dans l'excellent livre The Hacker's Guide to PythonJulien me fait l'honneur d'une interview.

    Tsung et PostgreSQL

    Rodolphe Quiédeville nous a ensuite présenté le merveilleux outil Tsung dans un cas d'utilisation convainquant. Ses slides sont déjà disponibles en ligne : Un Tsung vaut mieux que 2 "croisons les doigts".

    Tsung is an open-source multi-protocol distributed load testing tool

    Prochaine rencontre

    Les détails sont encore à valider, à priori la prochaine rencontre des utilisateurs de PostgreSQL à Paris se tiendra au moins de juin, un mois déjà chargé en activité PostgreSQL avec non seulement le PGDay France à Toulon les 5 et 6 juin, mais également une belle participation de notre communauté à la conférence PHP Tour qui se tiendra à Lyon les 23 et 24 Juin 2014.

    Note : c'est la rencontre qui est située à Paris, les utilisateurs de PostgreSQL sont les bienvenus d'où qu'ils viennent !

     

  • Avr 14 2014

    Announcing Barman 1.3.1

    Version 1.3.1 of Barman, Backup and Recovery Manager for PostgreSQL, has been released. Barman now supports concurrent...  

  • Avr 02 2014

    Indexando ando: Índices GIN

    PostgreSQL tiene varios tipos de índices: B-tree, Hash, GiST, Gin y SP-GiST. Obviamente cada uno de ellos...  

  • Mar 25 2014

    Nordic PostgreSQL Day 2014

    Last week some PostgreSQL users, contributors and advocates have organized a really great conference in Stockholm, Sweden, where I had the please to give the following talk:

    PostgreSQL is YeSQL!

    Nordic PostgreSQL Conference

    The conference was very well put together and the occasion to meet with plenty of well known PostgreSQL friends and newcomers to the community too. If you were there, I hope you had as much of a good time than I did!

    All the slides from this day of conferences have now been uploaded on the PostgreSQL Wiki Page for Nordic PGDAY, so that if you couldn't make it to the event you can still read through the material and send emails around to the authors with any and all questions about the contents!

    See you at the next PostgreSQL conference maybe, this time in New York City!

    The Hacker's Guide to Python

    In other news, my good friend Julien Danjou (you might know him for having hacked together the awesome window manager) just released today The Hacker's Guide to Python where you will Learn everything you need to build a successful Python project. I've had the pleasure to answer's Julien's questions about how to best benefit from PostgreSQL when coding in python, which makes it for an interview chapter within:

    In this book, we'll see how you can leverage Python to efficiently tackle your problems and build great Python applications.

    And while at it:

    Have a good read!

     

© 2001-2014 2ndQuadrant Ltd. All rights reserved. | Privacy Policy