DBD::mysql 4.026 released

Dear Perl and MySQL community,

I’m pleased to announce the release of DBD::mysql 4.026.

In this release:

2014-01-15 Patrick Galbraith, Michiel Beijen, DBI/DBD community (4.026)
* t/29warnings.t failed on some MySQL Server 5.1 releases - RT91202, reported by Gisle Aas. The test is now handled depending on the server version.
* README.pod misses a proper NAME heading - RT90101 - Damyan Ivanov, Debian Perl Group
* Added fix and test for RT91715 (ability to obtain the $dbh->{mysql_use_result} value)
* Added the mysql_skip_secure_auth feature from Jacob Gelbman (cPanel); these last two items are shown in the sketch below
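
Here is a minimal, hedged sketch of the two new items above in use; the connection values are placeholders:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# mysql_skip_secure_auth is passed as a DSN option (host/user/password are placeholders)
my $dbh = DBI->connect(
    "DBI:mysql:database=test;host=localhost;mysql_skip_secure_auth=1",
    "user", "password", { RaiseError => 1 }
);

# RT91715: mysql_use_result can now be read back from the handle, not just set
$dbh->{mysql_use_result} = 1;
print "mysql_use_result: $dbh->{mysql_use_result}\n";

$dbh->disconnect;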

Thanks to everyone who contributed!

For more information: http://search.cpan.org/~capttofu/DBD-mysql-4.026

DBD::mysql 4.023 and MySQLBind released

For the sake of getting things done and not letting projects succumb to bit-rot (in the case of MySQLBind), today I have released DBD::mysql 4.023 and MySQLBind with prepared-statement support.

DBD::mysql

DBD::mysql is the Perl driver for MySQL. I have been maintaining this project for almost 10 years now. I work on a lot of different projects in various languages, but Perl is very dear to me; so many things we take for granted in the technical world use it, and it Just Works (TM). MySQL is of course dear to me as well, and Perl talking to MySQL is of utmost importance, which is why I love maintaining this driver. I know the code in my sleep! This version includes various changes from contributors in the Open Source community, to whom I am very grateful. Please see the change log for details!

Also -- this is very important -- I want to thank Sveta Smirnova (Спасибо, Света! -- thank you, Sveta!), as she took painstaking effort to transfer all the DBD::mysql bugs from Oracle's bug system for MySQL to http://rt.cpan.org. I really appreciate this, even though there is a mountain of bugs to fix (some fixed already). I have had offers of help that I intend to accept! Thank you to MySQL AB, Sun, Oracle, et al., for having helped with bugs throughout the years.

The code is at:

https://github.com/CaptTofu/DBD-mysql

As well as:

http://search.cpan.org/~capttofu/DBD-mysql-4.023/ (allow time for mirrors to update)

MySQLBind

This is a project developed by Robert Prouse around 2004. It makes it possible to use MySQL as a store for DNS records for BIND 9 instead of zone files, hence only requiring a BIND reload (using rndc) when a zone is added or deleted, versus having to reload for any zone record change when using files.

Over the last several months, I have been working on DNSaaS at HP. For one of the early prototypes I was working on with Kiall, I created an agent to talk to MySQLBind, which I initially thought we'd use, though we went instead with PowerDNS (great choice!). The code for MySQLBind works quite well.

Additionally, I converted all database API calls to use the prepared-statement API, and I updated the driver to use a single database table, whereas before it used a table per zone, which I felt was untenable for management and scaling. I have provided SQL creation files for the single table, showing how you can use partitioning to make it possible to scale out -- if you have so many domains that you need to worry about this, that's a good problem to have! Also, I added a column, "tenant_id". This is very specific to Moniker (https://wiki.openstack.org/wiki/Moniker), but ultimately you can simply use some unique per-domain identifier, which is the key used for partitioning (sketched below). The code is at https://github.com/CaptTofu/mysql-bind.
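
To give an idea of the single-table approach, here is a hypothetical sketch; the authoritative schema lives in the SQL creation files in the repository, and the column names here are invented:

CREATE TABLE dns_records (
  id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
  tenant_id INT UNSIGNED NOT NULL,  -- or any unique per-domain identifier
  zone      VARCHAR(255) NOT NULL,
  host      VARCHAR(255) NOT NULL,
  type      VARCHAR(10)  NOT NULL,  -- A, AAAA, MX, NS, ...
  data      VARCHAR(255) NOT NULL,
  ttl       INT UNSIGNED NOT NULL DEFAULT 86400,
  PRIMARY KEY (id, tenant_id)
) PARTITION BY KEY (tenant_id) PARTITIONS 4;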

Percona Live Sessions: "Managing MySQL with Chef" and "RedDwarf"

I'm looking forward to presenting, along with Peter Boros, Jim Cooley, and Vipul Sabhaya, at the Percona Live Conference the week of April 22nd, where I will be giving two talks: one on managing MySQL with Chef (http://www.percona.com/live/mysql-conference-2013/sessions/managing-mysql-chef, 24 April 4:30pm - 5:20pm @ Ballroom A) and one on Red Dwarf, the OpenStack project that HPCS is using for DBaaS (http://www.percona.com/live/mysql-conference-2013/sessions/reddwarf-database-service-openstack-project, 25 April 11:00am - 11:50am @ Ballroom C).

I wanted to do a Chef talk, despite my on-again, off-again love/hate relationship with Chef (AKA the learning process), because for the past year or so I have had to become proficient with Chef to manage database resources, since this is how databases are provisioned at HPCS: first, filling in as a DBA while the Ops team was in the process of hiring full-time DBAs, and now with my HP Cloud DNS team, where we have used Chef to deploy our entire stack. I have grown to like Chef quite a bit, and despite being a Perl guy at heart, I like Ruby now as well. I wanted to share my experiences, help DBAs who will encounter Chef become better acquainted with it and learn more rapidly than I did, and encourage them to consider using Chef to build out database and other component infrastructure. I asked Peter Boros if he'd be kind enough to present with me; I have worked with him in developing MySQL cookbooks: one for Percona XtraDB Cluster (Galera) as well as his replication/Pacemaker cookbook. We'll start with the basics and then show the really useful components of our cookbooks that make setting up a cluster or replication much more automated, quick, and easy than doing so manually.

With the Red Dwarf talk, I wanted to give Vipul, the DBaaS team lead, as well as Jim Cooley, director of Data Services at HPCS, an opportunity to discuss Red Dwarf with me, the OpenStack project that is being used by HPCS and Rackspace to build out their DBaaS products. I have been involved to some degree, though they have more insight into recent developments and enhancements in the project. It makes for an interesting discussion of how the technology that provides DBaaS works.

DNSaaS application MySQL HA solution with Percona XtraDB Cluster (Galera)

I haven't written in a while. Part of the reason is that life has been incredibly busy with two young children. The other reason is that I've been busy at work with my team developing HP Cloud DNS, a service that runs in the cloud, as do the various other applications within the Platform as a Service group that I am part of. After a period of rapid development and deployment, Cloud DNS recently went into private beta. The ability to build a cloud application such as this came primarily from having a great team (Kiall Mac Innes, Simon McCartney, and a great product manager, Josh Barry) as well as from having chosen the right tools and software solutions. For the deployment side of things, Chef, despite the frustrations I had before getting my mind around it, was a solution that I knew would let us deploy anywhere, and it is what HPCS uses for deployments of its under-the-cloud infrastructure. This is a topic in and of itself, something that I wish to elaborate on in a future post. For the application, the core of Cloud DNS is based around Moniker, an OpenStack-like project created by Kiall.

Of course, an application needs a backing database, and the choice I pondered was between a MySQL dual-master setup and Galera Cluster, which I had been wanting to investigate and familiarize myself with for some time, based on what little I did know about it. Having synchronous replication and a multi-master setup where split brain is not much of an issue (if at least three nodes are used) was very appealing. Also appealing was avoiding a Pacemaker setup for failover -- not a disparagement of Pacemaker, but with a three-person team the goal was to keep the database solution simple.

After testing Percona XtraDB Cluster, which uses Galera, I was very pleased with the results, and the performance was excellent for what we needed. Since we use Chef, there was an existing cookbook for Percona XtraDB Cluster, although it was not really ready for what we needed -- it even attempted to build the server from source, something we definitely wanted to avoid adding to a deployment process -- so quite a few changes were needed to simplify it and use binary packages for installation instead of source compilation.

The functionality of the cookbook for a cluster is somewhat straightforward now that I reflect on how it was implemented and know more about the cluster setup, but at the time I wasn't sure of the approach I would take. The basic setup is that three nodes are needed in a Galera cluster, and the first node needs to be bootstrapped. In a cloud environment, a node is launched and then connected to a Chef server, which provisions the node according to its run list; at that point, the other cloud instances do not yet exist as far as this bootstrapped node, and Chef, are concerned. Why is this important to discuss? Because in setting up a Galera cluster, there is a MySQL parameter, wsrep_cluster_address, which has the format "gcomm://<IP address>,<IP address>,...". The value of this parameter is a string that the Chef recipe builds from the results of a Chef search for Percona nodes, excluding itself, and interpolates into the template for my.cnf. It's important to note that in the beginning, the first node is set up as a cluster of one, with solely "gcomm://" as the wsrep_cluster_address. So, at this point of bootstrapping, node1 has "gcomm://". This requires no particular logic: there are no other nodes besides itself, hence the list of nodes used to build the string is empty.
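
For reference, here is a minimal sketch of the wsrep-related portion of such a my.cnf template; the variable names are standard Galera settings, while the cluster name and library path are assumptions, and the IP list is what the recipe interpolates:

# illustrative my.cnf fragment; the real file is generated from the chef template
wsrep_provider           = /usr/lib/libgalera_smm.so
wsrep_cluster_name       = my_cluster
wsrep_cluster_address    = gcomm://        # node1 at bootstrap: empty node list
wsrep_sst_method         = xtrabackup
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2
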
After the first node is bootstrapped, the second node is launched and provisioned with Chef. At this point, through Chef, this node can be searched for and found to exist. Also, the first node (henceforth referred to as node1) that was bootstrapped can be found through a Chef search. With this search, the IP address of node1 is used in the wsrep_cluster_address for node2, such that "wsrep_cluster_address=gcomm://<node1 IP>". When MySQL is started on node2, it connects to node1 and state exchange is initiated, with the cluster state being given to node2. Again, node1 is a single-node cluster at this point, and the state of node2 is most likely different, hence node2 will request a snapshot of the state (a State Snapshot Transfer, SST -- in this case using XtraBackup). Once the SST is installed on node2, node2 joins the cluster, which now consists of node1 and node2.

Then the third node is launched and provisioned with Chef. Now there are two other nodes that can be found, hence "wsrep_cluster_address=gcomm://<node1 IP>,<node2 IP>" (the order depending on the results of the Chef search). The same process that occurred with node2 now begins for node3: state exchange, SST (with two nodes, the cluster will assign either node1 or node2 as the SST donor), then SST installation, after which node3 joins the cluster of node1 and node2. At this point, the wsrep_cluster_address values found in the Chef-generated my.cnf for the three nodes are as follows:


  • node1: "wsrep_cluster_address=gcomm://"

  • node2: "wsrep_cluster_address=gcomm://<node1 IP>"

  • node3: "wsrep_cluster_address=gcomm://<node1 IP>,<node2 IP>"


It is apparent that this is not complete! What remains to be done? Through the utility of Chef, a subsequent run of chef-client is required. The recipe in the cookbook has logic that, as stated before, performs a Chef search, which will now return a list of all three nodes; in building up the string for wsrep_cluster_address to be used in the my.cnf template, each node again excludes itself. At the end of the second chef-client run on each of the nodes, the three nodes will have the following in their my.cnf:


  • node1: "wsrep_cluster_address=gcomm://<node2 IP>,<node3 IP>"

  • node2: "wsrep_cluster_address=gcomm://<node1 IP>,<node3 IP>"

  • node3: "wsrep_cluster_address=gcomm://<node1 IP>,<node2 IP>"


At this point, there is a running cluster!
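
The search-and-interpolate logic itself is only a few lines of recipe code. This is not the actual cookbook, just a sketch of the idea, assuming a role named "percona-cluster" and the default ipaddress node attribute:

# illustrative chef recipe fragment, not the actual cookbook code
peers = search(:node, "role:percona-cluster").reject { |n| n.name == node.name }
cluster_ips = peers.map { |n| n["ipaddress"] }.sort

template "/etc/mysql/my.cnf" do
  source "my.cnf.erb"
  variables(:wsrep_cluster_address => "gcomm://#{cluster_ips.join(',')}")
end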

The other important part of this design is the HAProxy setup, which is pretty much described in Peter's excellent post. Essentially, each application node also has a running HAProxy that is configured to use the node in the cluster with the least connections, and it uses an HTTP check against an inetd service that provides the output of the clustercheck utility, which simply indicates whether the cluster has a problem. The application itself connects to localhost, and HAProxy takes care of the rest. And all of this is built using Chef!
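
A minimal sketch of what that HAProxy configuration can look like follows; the port 9200 HTTP check is the usual clustercheck-over-inetd convention, and the IPs are placeholders:

# illustrative haproxy.cfg fragment
listen mysql-cluster
    bind 127.0.0.1:3306
    mode tcp
    balance leastconn
    option httpchk
    server node1 <node1 IP>:3306 check port 9200 inter 2000 rise 2 fall 3
    server node2 <node2 IP>:3306 check port 9200 inter 2000 rise 2 fall 3
    server node3 <node3 IP>:3306 check port 9200 inter 2000 rise 2 fall 3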

Finally, the other important HA aspect of all of this is that each of these components -- every database node and every API server with HAProxy -- is in a separate availability zone. The functionality that Simon built into our deployment tool is cognizant of all three availability zones and is able to launch these components into each AZ seamlessly!

The following illustration shows the finished product:

[Illustration: three-node XtraDB cluster (xtradb_3node_cluster)]

It is worth noting that I worked with Peter Boros of Percona on this setup, and the rest of my team, of course, helped me get unblocked while developing this cookbook.

I will conclude this post by stating that I am very pleased with the architecture we have chosen for the database component of our DNSaaS, and I encourage anyone who needs a simple MySQL-based HA solution for their application to consider such a setup. I will make the basic Chef cookbooks for this available at a later date, after much cleanup and testing.

Useful modification to MySQL security feature in Percona Server

In cloud environments or other secure environments, you may want to lock your database down, allowing or disallowing certain grants or capabilities outside the database. One potential security issue is the use of LOAD DATA INFILE and SELECT INTO OUTFILE, depending on what files exist in directories the MySQL server has access to -- or even if you have concerns with any database user ever having any access to the file system outside of the database. A few months ago, with version 5.5.25a-27.1, Percona Server extended this security feature, the secure-file-priv option, so that you can disable LOAD DATA INFILE and SELECT INTO OUTFILE entirely. This feature is extremely easy to use -- simply specify it in your my.cnf. You can set it a number of ways:

For instance, if you wanted to limit LOAD DATA INFILE or SELECT INTO OUTFILE to /var/tmp:

secure-file-priv = /var/tmp

Or, if you want to disable these statements completely, specify it with no argument:

secure-file-priv

The result when a user tries to run either would be:

mysql> select * into outfile '/home/mydir/t1.txt' from t1;
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement
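
LOAD DATA INFILE fails the same way; for example (hypothetical path):

mysql> load data infile '/etc/passwd' into table t1;
ERROR 1290 (HY000): The MySQL server is running with the --secure-file-priv option so it cannot execute this statement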

I love my new job!

I just have to chime in about how happy I am with my new job. As of August 30th, I work for Blue Gecko as a Senior Database and Systems Administrator. Blue Gecko is based out of Seattle, though I'll be working out of my home in New Hampshire; with my frequent travels to Seattle for family reasons, this will work out quite well.

Already in the last week, I've engaged in several tasks, all of which have been very interesting problems to solve. Not only that, but I've spoken with several existing and potential customers, and I never realized how much I enjoy consulting with and acquiring customers -- hearing what problems they need to solve and being able to ascertain quickly how to solve them, making the customer look forward to engaging with us.

Who is Blue Gecko?

Based out of Seattle, their website (http://www.bluegecko.net/) states:

"We don't eat, sleep, or go on vacation. We live for three things:

* Smooth, uninterrupted database operation for your company
* Proactive monitoring for potential problems
* Rapid, expert response with no restrictions or delays"

We are a remote DBA (database administration) service. We provide a service that would otherwise require an organization to hire a number of DBAs and even sysadmins. We provide this for Oracle, MySQL, and, more recently, PostgreSQL and SQL Server.

Even just one week in, there are some top-notch people here that I'm already enjoying working with.

Some of the tasks I've worked on:

* Restoring data that a developer accidentally truncated from tables, using InnoDB Tools (http://code.google.com/p/innodb-tools/). I hadn't done this before, and it was fantastic to be able to restore that data in such a dire situation

* Optimizing a query -- and ultimately how the data is stored in the table -- for a query that used a filesort and would be extremely slow unless you forced use of the index that the ORDER BY was using. The teaser was that EXPLAIN would show a full-table scan when the index was forced. This was a good exercise in understanding the buffer pool as well as how InnoDB works with indexes (a sketch of the idea follows this list)

* Crawling through a schema with a bunch of tables and finding many optimizations

* Discussing deployment of The Sphinx Search Engine for a client who needs search functionality

* Fielding various Perl questions from one of my co-workers. I was able to send code that I wrote for my book to help them solve a problem. It felt great being able to help someone so soon after starting a new job
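
To illustrate the FORCE INDEX point from the second item above, here is a hedged, hypothetical example (table and index names are invented):

-- force the index matching the ORDER BY to avoid an expensive filesort
SELECT id, created, status
FROM events FORCE INDEX (idx_created)
ORDER BY created DESC
LIMIT 10;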

The thing that has really dawned on me is that I prefer working on components within the LAMP stack, especially MySQL, with an eye on where NoSQL fits in as well.

I just wanted to write about this realization I've had over the last several days!

Cloud systems vs. NoSQL email with tons of questions

After my webinar on NoSQL/SQL for the Oracle Development Tools User Group last week (http://www.odtug.com/apex/f?p=500:1:0), I received an email from an attendee that was chock-full of questions. I decided to answer them to clarify NoSQL and cloud systems for this fellow. I'm pretty happy with my answers, and I'd be glad to hear any thoughts about my replies.

Here are my responses (the fellow's name is also Patrick):

Patrick,

You are welcome! Thank you for attending. I put that together a bit hastily, but thought it was a good topic to cover, as there are so many organizations considering such an architecture.

Patrick Francois wrote:
> Hi,
>
> Thank you for the NoSQL Webinar!
> Not an easy theme... kind of "wide open".
> I have recently also tried getting more info on NoSQL and related systems and performance issues... and get confused, especially also with the "cloud systems".

Yeah, there is a lot of buzzword-BS and reinventing the wheel going on. The key to remember is that NoSQL is not going to replace SQL any time soon, but the combination of SQL and NoSQL can have its advantages.

>
> I checked for example this document regarding benchmarking "cloud systems":
> (maybe you know that document as well)
> -> "BenchmarkingCloudServingSystemswithYCSB"
> (YCSB -> Yahoo!CloudServingBenchmark)
> -> straight link: http://www.brianfrankcooper.net/pubs/ycsb.pdf
>
> There they speak about: the explosion of new systems for data storage and management "in the cloud"
> They mention there all these different NoSQL storage systems,
> and say for example also: Some systems are offered only as cloud services, either directly in the case of Amazon SimpleDB[1]


Yeah -- SimpleDB, like Amazon's S3, is a proprietary hosted system; you can't download it and run it on your own cluster or private cloud.


> ....
>
> That's why I'm also a bit confused about "cloud systems" and "NoSQL".

Cloud systems - usually virtual machines that can be easily spun up for elasticity: "I want to add more servers to my virtual network on the fly." This can be via either your own VMware Lab Manager setup or something like EC2. There are also real-hardware clouds using services such as what Rackspace offers.

By "cloud systems" and "NoSQL", it just means running these databases in an environment such as EC2, VMware, or Rackspace, etc... For instance, you can have relational databases in the cloud such as with MySQL, Drizzle, or etc...

You can also run NoSQL databases on your own systems. NoSQL and cloud systems are not mutually exclusive.

> Basically if you speak about cloud systems, you also speak about scaling out same as you are speaking about "scaling out" when speaking about NoSQL.


Scaling out can be done in the cloud as well as outside the cloud. It just means running applications and databases over several systems, versus using an ever-growing, huge server and scaling that single server up, as was done in the Good Old Days (TM).


>
> So, does it imply that NoSQL systems are cloud systems?

No -- they can be, but are not necessarily. NoSQL means a database system that doesn't use SQL to access data; it is non-relational.

> How would you see the relation "NoSQL" / "Cloud systems".


They complement each other. NoSQL works well in a cloud paradigm.


>
> --
> Even "cloud system" meaning as such is rather unclear.
> There was the question about "Cassandra server farm":
> If I create my own Cassandra server farm, can I then still speak about a "cloud system", or can I speak about a "cloud system" only when the underlying servers are also geographically distributed?

Geographical distribution isn't the determinant in the definition of a cloud system. I could have a cloud system down in my cellar if I wanted to, as long as I have multiple real or virtual servers.

>
> There is also the "security" issue when it comes to "cloud systems" and some say security is bigger because data is distributed. I understand that in that way, you cannot really get hand (or get hacked) on all the data at once, eventually just a part of the data.
> But if I think on an own Cassandra farm, where eventually all machines are in the same machine room and network, I can imagine if you can get into one machine, you can get into all of them.

This issue requires you to do some homework about how to secure your servers. Set up a good image with everything locked down and use something like puppet to have it come up with all the goodies you set up in that image. Define a list of must-haves before that box (virtual or real) is on the network.

>
> From the YCSB-document "some systems are offered only as cloud services",
> I understand that for using Amazon's SimpleDB, you could not create your own server farm, rather need to buy that service from Amazon?


Yeah, you're not going to run your own S3. You can run other NoSQL or distributed file systems - take your pick with a Google search.


>
> That is causing me confusion between "NoSQL" and "Cloud systems".


NoSQL - as I defined it in the presentation - schema-less, often non-relational, and doesn't use SQL as the lingua franca of data access.

Cloud systems - multiple boxes, real or virtual, elastic, as I mention above.

Two different but mutually complementary concepts.


> --
> Memcached and Membase sounded very interesting. I will check more on those.


Please do. And do join the mailing lists. I see Matt Ingenthron and Perry Krug answering emails every day!


> --
> Can you eventually also provide the slides you used?


Certainly - let me make sure they are clean and I'll send them to you!

Getting Ruby and Rake versions sorted out

I've been working on updating the various libmemcached wrapper drivers to the latest libmemcached and was having an issue trying to build the Ruby libmemcached client, Fauna (developed by Evan Weaver): my Ubuntu box originally had Ruby 1.8 installed, but gem would install the modules needed for Fauna - echoe and mocha - into the 1.9 library directory. I found a useful post here: http://michalf.me/blog:make-ruby-1-9-default-on-ubuntu-9-10-karmic-koala, which I modified; here is the script I ended up using that allows Fauna to build:

#!/bin/sh

aptitude -y install ruby1.9.1 ruby1.9.1-dev \
        libopenssl-ruby1.9.1 rubygems1.9.1 irb1.9.1 ri1.9.1 rdoc1.9.1 g++

# If you already have 1.8 installed
update-alternatives --install /usr/bin/ruby ruby /usr/bin/ruby1.8 500 \
        --slave   /usr/share/man/man1/ruby.1.gz ruby.1.gz \
            /usr/share/man/man1/ruby.1.8.gz \
        --slave   /usr/bin/ri ri /usr/bin/ri1.8 \
        --slave   /usr/bin/irb irb /usr/bin/irb1.8 \
        --slave   /usr/bin/rdoc rdoc /usr/bin/rdoc1.8

# Register ruby1.9.1 as an alternative
update-alternatives --install /usr/bin/ruby ruby /usr/bin/ruby1.9.1 400 \
        --slave   /usr/share/man/man1/ruby.1.gz ruby.1.gz \
            /usr/share/man/man1/ruby1.9.1.1.gz \
        --slave   /usr/bin/ri ri /usr/bin/ri1.9.1 \
        --slave   /usr/bin/irb irb /usr/bin/irb1.9.1 \
        --slave   /usr/bin/rdoc rdoc /usr/bin/rdoc1.9.1

# choose your interpreter
# changes symlinks for /usr/bin/ruby ,
# /usr/bin/irb, /usr/bin/ri and man (1) ruby
update-alternatives --config ruby
update-alternatives --config gem

gem update
gem install echoe
gem install mocha


And if you now run:

patg@patg-desktop:~/code_dev/ruby-memcached$ ruby --version
ruby 1.9.1p378 (2010-01-10 revision 26273) [x86_64-linux]
patg@patg-desktop:~/code_dev/ruby-memcached$ gem --version
1.3.5
patg@patg-desktop:~/code_dev/ruby-memcached$ rake --version
rake, version 0.8.7