MariaDB crashed, Kopano Server can't connect itself anymore when MariaDB is back online


  • I had a strange issue since some days this week with MariaDB and Kopano. The MariaDB Database is on a different Server (on which are other database tables from other servers)

    On monday early morning we had the problem that the backup job on the database server set the root disk to “full” , but not the disk on which the mariadb tables resides. The backup aborted itself of this and get the disk space free. So we moved the backup cache to a new disk, so that root don’t get full.

    Now we had on Tuesday and Today a crash of the MariaDB Server. Below the crash dump from Tuesday, but the one from today looks similar.

    Nov 10 03:46:41 sql01.intra mysqld: 201110  3:46:41 [ERROR] mysqld got signal 11 ;
    Nov 10 03:46:41 sql01.intra mysqld: This could be because you hit a bug. It is also possible that this binary
    Nov 10 03:46:41 sql01.intra mysqld: or one of the libraries it was linked against is corrupt, improperly built,
    Nov 10 03:46:41 sql01.intra mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
    Nov 10 03:46:41 sql01.intra mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
    Nov 10 03:46:41 sql01.intra mysqld: We will try our best to scrape up some info that will hopefully help
    Nov 10 03:46:41 sql01.intra mysqld: diagnose the problem, but since we have already crashed,
    Nov 10 03:46:41 sql01.intra mysqld: something is definitely wrong and this may fail.
    Nov 10 03:46:41 sql01.intra mysqld: Server version: 10.4.16-MariaDB-log
    Nov 10 03:46:41 sql01.intra mysqld: key_buffer_size=134217728
    Nov 10 03:46:41 sql01.intra mysqld: read_buffer_size=131072
    Nov 10 03:46:41 sql01.intra mysqld: max_used_connections=149
    Nov 10 03:46:41 sql01.intra mysqld: max_threads=1026
    Nov 10 03:46:41 sql01.intra mysqld: thread_count=117
    Nov 10 03:46:41 sql01.intra mysqld: It is possible that mysqld could use up to
    Nov 10 03:46:41 sql01.intra mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2413078 K  bytes of memory
    Nov 10 03:46:41 sql01.intra mysqld: Hope that's ok; if not, decrease some variables in the equation.
    Nov 10 03:46:41 sql01.intra mysqld: Thread pointer: 0x0
    Nov 10 03:46:41 sql01.intra mysqld: Attempting backtrace. You can use the following information to find out
    Nov 10 03:46:41 sql01.intra mysqld: where mysqld died. If you see no messages after this, something went
    Nov 10 03:46:41 sql01.intra mysqld: terribly wrong...
    Nov 10 03:46:41 sql01.intra mysqld: stack_bottom = 0x0 thread_stack 0x49000
    Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55e841801f7e]
    Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x55e84129397f]
    Nov 10 03:46:41 sql01.intra mysqld: sigaction.c:0(__restore_rt)[0x7fdc1ace9630]
    Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xbd1944)[0x55e84159a944]
    Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xbd4a3c)[0x55e84159da3c]
    Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xb1983d)[0x55e8414e283d]
    Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xb1b25c)[0x55e8414e425c]
    Nov 10 03:46:41 sql01.intra mysqld: /usr/sbin/mysqld(+0xb1bed9)[0x55e8414e4ed9]
    Nov 10 03:46:41 sql01.intra mysqld: pthread_create.c:0(start_thread)[0x7fdc1ace1ea5]
    Nov 10 03:46:41 sql01.intra mysqld: /lib64/libc.so.6(clone+0x6d)[0x7fdc190828dd]
    Nov 10 03:46:41 sql01.intra mysqld: The manual page at https://mariadb.com/kb/en/how-to-produce-a-full-stack-trace-for-mysqld/ contains
    Nov 10 03:46:41 sql01.intra mysqld: information that should help you find out what is causing the crash.
    Nov 10 03:46:41 sql01.intra mysqld: Writing a core file...
    Nov 10 03:46:41 sql01.intra mysqld: Working directory at /data/mysql
    Nov 10 03:46:41 sql01.intra mysqld: Resource Limits:
    Nov 10 03:46:41 sql01.intra kernel: mysqld[32936]: segfault at 0 ip 000055e8417ff256 sp 00007fdb78cc1170 error 6 in mysqld[55e8409c9000+14b4000]
    

    I see that yesterday their was a mariadb update, which looks like fix some memory issue, so i will update this later. But the more intresting part is, that all programs and services which use the mariadb server healed themselfs and worked this morning expect kopano. In the server log i see the following multiple times:

    Thu Nov 12 03:30:40 2020: [warning] SQL [00006623] info: MySQL server has gone away. Reconnecting.
    Thu Nov 12 03:30:40 2020: [error  ] mysql_real_connect: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused")
    Thu Nov 12 03:30:40 2020: [error  ] SQL [00000000] Failed: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused"), Query Size: 76, Query: "SELECT change_id,sourcekey,sync_type FROM syncs WHERE id=16436310 FOR UPDATE"
    Thu Nov 12 03:30:40 2020: [error  ] KDatabase::DoSelect(): query failed: "MYSQL not initialized", query: SELECT change_id,sourcekey,sync_type FROM syncs WHERE id=16436310 FOR UPDATE
    Thu Nov 12 03:30:45 2020: [warning] SQL [00006617] info: MySQL server has gone away. Reconnecting
    

    and later when the mariadb server is comming back online the logs change to:

    Thu Nov 12 08:03:47 2020: [error  ] SQL [00000000] Failed: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused"), Query Size: 5, Query: "BEGIN"
    Thu Nov 12 08:03:47 2020: [error  ] SQL [00000000] Failed: Can't connect to MySQL server on 'sql01.intra' (111 "Connection refused"), Query Size: 76, Query: "SELECT change_id,sourcekey,sync_type FROM syncs WHERE id=16436310 FOR UPDATE"
    

    To fix this i need to do “service kopano-server restart”. After that the connection can be established again. So Kopano do something different than all other applications (like mattermost, different web applications in php like mediawiki, nextcloud, etc) so that mariadb don’t allow again a connection until kopano is restarted.

    At the moment it looks like i need to write a watchdog which checks the logs of kopano and if their is the same issue again do a restart of kopano, but this shouldn’t be the solution. Perhaps it will be fixed with the mariadb update. But is still odd that only kopano had an issue after all when mariadb come’s back.


  • Which Kopano version you are using?
    This should be fixed in version 8.7.16.


  • Currently i use 8.7.16:

    root@kop01.intra:/var/log/kopano# kopano-admin --version
    kopano-admin 8.7.16
    
    root@kop01.intra:/var/log/kopano# dpkg -l | grep kopano-server
    ii  kopano-server                              8.7.16.0-0+40.2                                 amd64        Server component for Kopano Core
    ii  kopano-server-packages                     8.7.16.0-0+40.2                                 all          Metapackage to install the entire Kopano Core stack
    

  • That has happend to me as well, the fix for me was to move or delete the file /var/lib/mysql/tc.log (this is on Debian).