ANALYSIS: ========= Server sometimes exited when multiple threads tried to acquire and release metadata locks simultaneously (for example, necessary to access a table). The same problem could have occurred when new objects were registered/ deregistered in Performance Schema.
The problem was caused by a bug in LF_HASH - our lock free hash implementation which is used by metadata locking subsystem in 5.7 branch. In 5.5 and 5.6 we only use LF_HASH in Performance Schema Instrumentation implementation. So for these versions, the problem was limited to P_S.
The problem was in my_lfind() function, which searches for the specific hash element by going through the elements list. During this search it loads information about element checked such as key pointer and hash value into local variables. Then it confirms that they are not corrupted by concurrent delete operation (which will set pointer to 0) by checking if element is still in the list. The latter check did not take into account that compiler (and processor) can reorder reads in such a way that load of key pointer will happen after it, making result of the check invalid.
FIX: ==== This patch fixes the problem by ensuring that no such reordering can take place. This is achieved by using my_atomic_loadptr() which contains compiler and processor memory barriers for the check mentioned above and other similar places.
The default (for non-Windows systems) implementation of my_atomic*() relies on old __sync intrisics and implements my_atomic_loadptr() as read-modify operation. To avoid scalability/performance penalty associated with addition of my_atomic_loadptr()'s we change the my_atomic*() to use newer __atomic intrisics when available. This new default implementation doesn't have such a drawback.
大体含义是:
当多个线程分别同时获取、释放 metadata locks 时,或者在 Performance Schema 中注册/撤销新的 object 时,可能会触发该问题,导致 mysql server crash。
diff --git a/mysys/lf_hash.c b/mysys/lf_hash.c index dc019b07bd9..3a3f665a4f1 100644 --- a/mysys/lf_hash.c +++ b/mysys/lf_hash.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2006, 2016, Oracle and/or its affiliates. All rights reserved. +/* Copyright (c) 2006, 2017, Oracle and/or its affiliates. All rights reserved. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by @@ -83,7 +83,8 @@ retry: do { /* PTR() isn't necessary below, head is a dummy node */ cursor->curr= (LF_SLIST *)(*cursor->prev); _lf_pin(pins, 1, cursor->curr); - } while (*cursor->prev != (intptr)cursor->curr && LF_BACKOFF); + } while (my_atomic_loadptr((void**)cursor->prev) != cursor->curr && + LF_BACKOFF); for (;;) { if (unlikely(!cursor->curr)) @@ -97,7 +98,7 @@ retry: cur_hashnr= cursor->curr->hashnr; cur_key= cursor->curr->key; cur_keylen= cursor->curr->keylen; - if (*cursor->prev != (intptr)cursor->curr) + if (my_atomic_loadptr((void**)cursor->prev) != cursor->curr) { (void)LF_BACKOFF; goto retry;
A server exit could result from simultaneous attempts by multiple threads to register and deregister metadata Performance Schema objects, or to acquire and release metadata locks. (Bug #26502135)