MariaDB crashed, Kopano Server can't connect itself anymore when MariaDB is back online

Beleggrodion

I had a strange issue since some days this week with MariaDB and Kopano. The MariaDB Database is on a different Server (on which are other database tables from other servers)

On monday early morning we had the problem that the backup job on the database server set the root disk to “full” , but not the disk on which the mariadb tables resides. The backup aborted itself of this and get the disk space free. So we moved the backup cache to a new disk, so that root don’t get full.

Now we had on Tuesday and Today a crash of the MariaDB Server. Below the crash dump from Tuesday, but the one from today looks similar.

Nov 10 03:46:41 sql01.intra mysqld: 201110  3:46:41 [ERROR] mysqld got signal 11 ;
Nov 10 03:46:41 sql01.intra mysqld: This could be because you hit a bug. It is also possible that this binary
Nov 10 03:46:41 sql01.intra mysqld: or one of the libraries it was linked against is corrupt, improperly built,
Nov 10 03:46:41 sql01.intra mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
Nov 10 03:46:41 sql01.intra mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Nov 10 03:46:41 sql01.intra mysqld: We will try our best to scrape up some info that will hopefully help
Nov 10 03:46:41 sql01.intra mysqld: diagnose the problem, but since we have already crashed,
Nov 10 03:46:41 sql01.intra mysqld: something is definitely wrong and this may fail.
Nov 10 03:46:41 sql01.intra mysqld: Server version: 10.4.16-MariaDB-log
Nov 10 03:46:41 sql01.intra mysqld: key_buffer_size=134217728
Nov 10 03:46:41 sql01.intra mysqld: read_buffer_size=131072
Nov 10 03:46:41 sql01.intra mysqld: max_used_connections=149
Nov 10 03:46:41 sql01.intra mysqld: max_threads=1026
Nov 10 03:46:41 sql01.intra mysqld: thread_count=117
Nov 10 03:46:41 sql01.intra mysqld: It is possible that mysqld could use up to
Nov 10 03:46:41 sql01.intra mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2413078 K  bytes of memory
Nov 10 03:46:41 sql01.intra mysqld: Hope that's ok; if not, decrease some variables in the equation.
Nov 10 03:46:41 sql01.intra mysqld: Thread pointer: 0x0
Nov 10 03:46:41 sql01.intra mysqld: Attempting backtrace. You can use the following information to find out
Nov 10 03:46:41 sql01.intra mysqld: where mysqld died. If you see no messages after this, something went
Nov 10 03:46:41 sql01.intra mysqld: terribly wrong...
Nov 10 03:46:41 sql01.intra mysqld: stack_bottom = 0x0 thread_stack 0x49000
Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55e841801f7e]
Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x55e84129397f]
Nov 10 03:46:41 sql01.intra mysqld: sigaction.c:0(__restore_rt)[0x7fdc1ace9630]
Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xbd1944)[0x55e84159a944]
Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xbd4a3c)[0x55e84159da3c]
Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xb1983d)[0x55e8414e283d]
Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xb1b25c)[0x55e8414e425c]
Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xb1bed9)[0x55e8414e4ed9]
Nov 10 03:46:41 sql01.intra mysqld: pthread_create.c:0(start_thread)[0x7fdc1ace1ea5]
Nov 10 03:46:41 sql01.intra mysqld: /lib64/libc.so.6(clone+0x6d)[0x7fdc190828dd]
Nov 10 03:46:41 sql01.intra mysqld: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
Nov 10 03:46:41 sql01.intra mysqld: information that should help you find out what is causing the crash.
Nov 10 03:46:41 sql01.intra mysqld: Writing a core file...
Nov 10 03:46:41 sql01.intra mysqld: Working directory at /data/mysql
Nov 10 03:46:41 sql01.intra mysqld: Resource Limits:
Nov 10 03:46:41 sql01.intra kernel: mysqld[32936]: segfault at 0 ip 000055e8417ff256 sp 00007fdb78cc1170 error 6 in mysqld[55e8409c9000+14b4000]

I see that yesterday their was a mariadb update, which looks like fix some memory issue, so i will update this later. But the more intresting part is, that all programs and services which use the mariadb server healed themselfs and worked this morning expect kopano. In the server log i see the following multiple times:

Thu Nov 12 03:30:40 2020: [warning] SQL [00006623] info: MySQL server has gone away. Reconnecting.
Thu Nov 12 03:30:40 2020: [error  ] mysql_real_connect: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused")
Thu Nov 12 03:30:40 2020: [error  ] SQL [00000000] Failed: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused"), Query Size: 76, Query: "SELECT change_id,sourcekey,sync_type FROM syncs WHERE id=16436310 FOR UPDATE"
Thu Nov 12 03:30:40 2020: [error  ] KDatabase::DoSelect(): query failed: "MYSQL not initialized", query: SELECT change_id,sourcekey,sync_type FROM syncs WHERE id=16436310 FOR UPDATE
Thu Nov 12 03:30:45 2020: [warning] SQL [00006617] info: MySQL server has gone away. Reconnecting

and later when the mariadb server is comming back online the logs change to:

Thu Nov 12 08:03:47 2020: [error  ] SQL [00000000] Failed: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused"), Query Size: 5, Query: "BEGIN"
Thu Nov 12 08:03:47 2020: [error  ] SQL [00000000] Failed: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused"), Query Size: 76, Query: "SELECT change_id,sourcekey,sync_type FROM syncs WHERE id=16436310 FOR UPDATE"

To fix this i need to do “service kopano-server restart”. After that the connection can be established again. So Kopano do something different than all other applications (like mattermost, different web applications in php like mediawiki, nextcloud, etc) so that mariadb don’t allow again a connection until kopano is restarted.

At the moment it looks like i need to write a watchdog which checks the logs of kopano and if their is the same issue again do a restart of kopano, but this shouldn’t be the solution. Perhaps it will be fixed with the mariadb update. But is still odd that only kopano had an issue after all when mariadb come’s back.

Which Kopano version you are using?
This should be fixed in version 8.7.16.

Beleggrodion

Currently i use 8.7.16:

root@kop01.intra:/var/log/kopano# kopano-admin --version
kopano-admin 8.7.16

root@kop01.intra:/var/log/kopano# dpkg -l | grep kopano-server
ii  kopano-server                              8.7.16.0-0+40.2                                 amd64        Server component for Kopano Core
ii  kopano-server-packages                     8.7.16.0-0+40.2                                 all          Metapackage to install the entire Kopano Core stack

klausade

That has happend to me as well, the fix for me was to move or delete the file /var/lib/mysql/tc.log (this is on Debian).