[Solved] KC::DAGENT: Postfix LMTP to DAgent: lost connection with localhost while sending end of data

Hello folks,
since kc::git-master Tue 26.06.2018 (commit unknown, upstream/master ~13:05) were having an issue with Postfix and Dagent in LMTP-mode:
Postfix is unable to handover incoming traffic to dagent with the following error in maillog:

Jul  5 11:05:20 root postfix/lmtp[24632]: 6909D541990: to=<horschtl@kaschpalverein.by>, relay=::1[::1]:2003, delay=2332, delays=2332/0.01/0.07/0.13, dsn=4.4.2, status=deferred (lost connection with ::1[::1] while sending end of data -- message may be sent more than once)

The problem appears independently of configured:

  • lmtp-transport:
    mailbox_transport = lmtp:inet:[::1]:2003
    mailbox_transport = lmtp:inet:[127.0.0.1]:2003
    virtual_transport = lmtp:unix:/var/run/kopano/dagent.sock

  • dagent process owner/user (kopano or root)

Configuration as postfix-unix-service with piped handover is not concerned: A postfix-config similar to:

master.cf
kopano  unix - n n - 10 pipe flags= user=kopano argv=/usr/sbin/kopano-dagent -e -h file:///var/run/kopano/server.sock ${user}

still does work.

Dagent produces the following debugout:

Thu Jul  5 11:03:34 2018: [info   ] Coredump status left at system default.
Thu Jul  5 11:03:34 2018: [info   ] Maximum LMTP threads set to 20
Thu Jul  5 11:03:34 2018: [info   ] Listening on 127.0.0.1:2003 for LMTP
Thu Jul  5 11:03:34 2018: [info   ] Listening on [::1]:2003 for LMTP
Thu Jul  5 11:03:34 2018: [info   ] Logger process started on pid 8059
Thu Jul  5 11:03:34 2018: [debug  ] StatsClient binding socket
Thu Jul  5 11:03:34 2018: [debug  ] StatsClient bound socket to /tmp/.51f81cc765ffb862.sock
Thu Jul  5 11:03:34 2018: [debug  ] StatsClient thread started
Thu Jul  5 11:03:34 2018: [=======] Starting kopano-dagent version 8.6.80.1121 (pid 8058) (LMTP mode)
Thu Jul  5 11:03:34 2018: [debug  ] Submit thread started
Thu Jul  5 11:03:40 2018: [info   ] Accepted connection from [::1]:34086
Thu Jul  5 11:03:40 2018: [info   ] Starting worker for LMTP request pid 8130
Thu Jul  5 11:03:40 2018: [debug  ] Initializing provider "Kopano Directory Service"
Thu Jul  5 11:03:40 2018: [debug  ] Initializing provider "Private Folders"
Thu Jul  5 11:03:40 2018: [debug  ] Initializing provider "Public Folders"
Thu Jul  5 11:03:40 2018: [debug  ] < 220 2.1.5 LMTP server is ready
Thu Jul  5 11:03:40 2018: [debug  ] > LHLO smtp.kaschpalverein.by
Thu Jul  5 11:03:40 2018: [debug  ] LHLO ID: smtp.kaschpalverein.by
Thu Jul  5 11:03:40 2018: [debug  ] < 250-SERVER ready
Thu Jul  5 11:03:40 2018: [debug  ] < 250-PIPELINING
Thu Jul  5 11:03:40 2018: [debug  ] < 250-ENHANCEDSTATUSCODE
Thu Jul  5 11:03:40 2018: [debug  ] < 250-RSET
Thu Jul  5 11:03:40 2018: [debug  ] < 250 SMTPUTF8
Thu Jul  5 11:03:40 2018: [debug  ] > MAIL FROM:<horschtl@kaschpalverein.by>
Thu Jul  5 11:03:40 2018: [debug  ] < 250 2.1.0 Ok
Thu Jul  5 11:03:40 2018: [debug  ] > RCPT TO:<horschtl@kaschpalverein.by>
Thu Jul  5 11:03:40 2018: [debug  ] Resolved command "RCPT TO:<horschtl@kaschpalverein.by>" to recipient address "horschtl@kaschpalverein.by"
Thu Jul  5 11:03:40 2018: [notice ] Resolved recipient horschtl@kaschpalverein.by as user horschtl
Thu Jul  5 11:03:40 2018: [debug  ] < 250 2.1.5 Ok
Thu Jul  5 11:03:40 2018: [debug  ] > DATA
Thu Jul  5 11:03:40 2018: [debug  ] < 354 2.1.5 Start mail input; end with <CRLF>.<CRLF>
Thu Jul  5 11:03:40 2018: [debug  ] PYTHONPATH = /usr/share/kopano-dagent/python
Thu Jul  5 11:03:40 2018: [info   ] * Loading plugins started
Thu Jul  5 11:03:40 2018: [info   ] ** Checking plugins in /var/lib/kopano/dagent/plugins
Thu Jul  5 11:03:40 2018: [info   ] * Loading plugins done
Thu Jul  5 11:03:40 2018: [debug  ] Initializing provider "Kopano Directory Service"
Thu Jul  5 11:03:40 2018: [debug  ] Initializing provider "Private Folders"
Thu Jul  5 11:03:40 2018: [debug  ] Initializing provider "Public Folders"
[EOF]

best regards,
++umgfoin.

Hi @umgfoin ,

thanks for your report. one question.

@umgfoin said in [Problem] KC::DAGENT: Postfix LMTP to DAgent: lost connection with localhost while sending end of data:

kc::git-master Tue 26.06.2018

are you still on that build or are you already on a newer commit and the behaviour stays the same?

@fbartels
Hi Felix,
problem can be reproduced (here) with current kc::master as of 04.07.2018 (top commit 5cb6e378ffa):
Behaviour identical.
I already checked against the TCP_NODELAY-flag, which was recently removed.
This seems not to be the culprit.

++umgfoin.

Strange works just as expected here (with latest master 8.6.80.1169-0+164.1):

Thu Jul  5 14:31:19 2018: [info   ] Accepted connection from [::ffff:127.0.0.1]:59978
Thu Jul  5 14:31:19 2018: [info   ] Starting worker for LMTP request pid 7561
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Kopano Directory Service"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Private Folders"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Public Folders"
Thu Jul  5 14:31:19 2018: [debug  ] < 220 2.1.5 LMTP server is ready
Thu Jul  5 14:31:19 2018: [debug  ] > LHLO felix-ubuntu-xenial-master.lxd.kopano.lan
Thu Jul  5 14:31:19 2018: [debug  ] LHLO ID: felix-ubuntu-xenial-master.lxd.kopano.lan
Thu Jul  5 14:31:19 2018: [debug  ] < 250-SERVER ready
Thu Jul  5 14:31:19 2018: [debug  ] < 250-PIPELINING
Thu Jul  5 14:31:19 2018: [debug  ] < 250-ENHANCEDSTATUSCODE
Thu Jul  5 14:31:19 2018: [debug  ] < 250-RSET
Thu Jul  5 14:31:19 2018: [debug  ] < 250 SMTPUTF8
Thu Jul  5 14:31:19 2018: [debug  ] > MAIL FROM:<user1@farmer.lan>
Thu Jul  5 14:31:19 2018: [debug  ] < 250 2.1.0 Ok
Thu Jul  5 14:31:19 2018: [debug  ] > RCPT TO:<user1@farmer.lan>
Thu Jul  5 14:31:19 2018: [debug  ] Resolved command "RCPT TO:<user1@farmer.lan>" to recipient address "user1@farmer.lan"
Thu Jul  5 14:31:19 2018: [notice ] Resolved recipient user1@farmer.lan as user user1
Thu Jul  5 14:31:19 2018: [debug  ] < 250 2.1.5 Ok
Thu Jul  5 14:31:19 2018: [debug  ] > DATA
Thu Jul  5 14:31:19 2018: [debug  ] < 354 2.1.5 Start mail input; end with <CRLF>.<CRLF>
Thu Jul  5 14:31:19 2018: [debug  ] PYTHONPATH = /usr/share/kopano-dagent/python
Thu Jul  5 14:31:19 2018: [info   ] * Loading plugins started
Thu Jul  5 14:31:19 2018: [info   ] ** Checking plugins in /var/lib/kopano/dagent/plugins
Thu Jul  5 14:31:19 2018: [info   ] * Loading plugins done
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Kopano Directory Service"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Private Folders"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Public Folders"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Kopano Directory Service"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Private Folders"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Public Folders"
Thu Jul  5 14:31:19 2018: [info   ] Mail will be delivered in Inbox
Thu Jul  5 14:31:19 2018: [debug  ] Trying to parse alternative multipart 1 of mail body
Thu Jul  5 14:31:19 2018: [debug  ] HTML4 meta tag found: charset="text/html; charset=utf-8"
Thu Jul  5 14:31:19 2018: [debug  ] HTML charset adjusted to "utf-8"
Thu Jul  5 14:31:19 2018: [debug  ] renovate_encoding: reading data using charset "utf-8" succeeded.
Thu Jul  5 14:31:19 2018: [info   ] * PostConverting processing started
Thu Jul  5 14:31:19 2018: [info   ] * PostConverting processing done
Thu Jul  5 14:31:19 2018: [info   ] * PreDelivery processing started
Thu Jul  5 14:31:19 2018: [info   ] * PreDelivery processing done
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Kopano Directory Service"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Private Folders"
Thu Jul  5 14:31:19 2018: [debug  ] Initializing provider "Public Folders"
Thu Jul  5 14:31:19 2018: [info   ] * PreRuleProcess processing started
Thu Jul  5 14:31:19 2018: [info   ] * PreRuleProcess processing done
Thu Jul  5 14:31:19 2018: [debug  ] Target user has OOF inactive
Thu Jul  5 14:31:19 2018: [info   ] * PostDelivery processing started
Thu Jul  5 14:31:19 2018: [info   ] * PostDelivery processing done
Thu Jul  5 14:31:19 2018: [info   ] * SendNewMailNotify processing started
Thu Jul  5 14:31:19 2018: [info   ] * SendNewMailNotify processing done
Thu Jul  5 14:31:19 2018: [debug  ] Send 'New Mail' notification
Thu Jul  5 14:31:19 2018: [info   ] Delivered message to "user1", Subject: "RE: test to self", Message-Id: <kcis.7F01CC736FA24886AC3AEF4855AA24D4@felix-ubuntu-xenial-master>, size 3144
Thu Jul  5 14:31:19 2018: [info   ] Finished processing message
Thu Jul  5 14:31:19 2018: [debug  ] < 250 2.1.5 user1@farmer.lan Ok
Thu Jul  5 14:31:19 2018: [debug  ] > QUIT
Thu Jul  5 14:31:19 2018: [debug  ] < 221 2.0.0 Bye
Thu Jul  5 14:31:19 2018: [info   ] LMTP thread exiting

dagent.cfg is a stock one (only logging configured).

I have internally asked one of the devs to look into this.

lost connection: file descriptor got closed unexpectedly. crash?

@jengelh said

file descriptor got closed unexpectedly. crash?

2x no

Wireshark shows correct protocol-flow. Postfix closing with :
07a0 2e 0d 0a 51 55 49 54 0d 0a ...QUIT. .

Dagent responds with ACK followed by RST,ACK.

< 221 2.0.0 Bye is missing.
The latter seems to trigger Postfix’s error entry and, Dagent doesn’t process the mail.

This is the last packet, sent by Dagent:

No.     Time           Source                Destination           Protocol Length Info
   1990 37.986935568   ::1                   ::1                   TCP      86     brutus > 44782 [RST, ACK] Seq=195 Ack=1986 Win=130944 Len=0 TSval=251930008 TSecr=251929926

Frame 1990: 86 bytes on wire (688 bits), 86 bytes captured (688 bits) on interface 0
    Interface id: 0
    WTAP_ENCAP: 1
    Arrival Time: Jul  5, 2018 17:02:02.874623026 CEST
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1530802922.874623026 seconds
    [Time delta from previous captured frame: 0.000039616 seconds]
    [Time delta from previous displayed frame: 0.082351957 seconds]
    [Time since reference or first frame: 37.986935568 seconds]
    Frame Number: 1990
    Frame Length: 86 bytes (688 bits)
    Capture Length: 86 bytes (688 bits)
    [Frame is marked: True]
    [Frame is ignored: False]
    [Protocols in frame: eth:ipv6:tcp]
    [Coloring Rule Name: TCP RST]
    [Coloring Rule String: tcp.flags.reset eq 1]
Ethernet II, Src: 00:00:00_00:00:00 (00:00:00:00:00:00), Dst: 00:00:00_00:00:00 (00:00:00:00:00:00)
    Destination: 00:00:00_00:00:00 (00:00:00:00:00:00)
        Address: 00:00:00_00:00:00 (00:00:00:00:00:00)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Source: 00:00:00_00:00:00 (00:00:00:00:00:00)
        Address: 00:00:00_00:00:00 (00:00:00:00:00:00)
        .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
        .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
    Type: IPv6 (0x86dd)
Internet Protocol Version 6, Src: ::1 (::1), Dst: ::1 (::1)
    0110 .... = Version: 6
        [0110 .... = This field makes the filter "ip.version == 6" possible: 6]
    .... 0000 0000 .... .... .... .... .... = Traffic class: 0x00000000
        .... 0000 00.. .... .... .... .... .... = Differentiated Services Field: Default (0x00000000)
        .... .... ..0. .... .... .... .... .... = ECN-Capable Transport (ECT): Not set
        .... .... ...0 .... .... .... .... .... = ECN-CE: Not set
    .... .... .... 0000 0000 0000 0000 0000 = Flowlabel: 0x00000000
    Payload length: 32
    Next header: TCP (6)
    Hop limit: 64
    Source: ::1 (::1)
    Destination: ::1 (::1)
Transmission Control Protocol, Src Port: brutus (2003), Dst Port: 44782 (44782), Seq: 195, Ack: 1986, Len: 0
    Source port: brutus (2003)
    Destination port: 44782 (44782)
    [Stream index: 43]
    Sequence number: 195    (relative sequence number)
    Acknowledgment number: 1986    (relative ack number)
    Header length: 32 bytes
    Flags: 0x014 (RST, ACK)
        000. .... .... = Reserved: Not set
        ...0 .... .... = Nonce: Not set
        .... 0... .... = Congestion Window Reduced (CWR): Not set
        .... .0.. .... = ECN-Echo: Not set
        .... ..0. .... = Urgent: Not set
        .... ...1 .... = Acknowledgment: Set
        .... .... 0... = Push: Not set
        .... .... .1.. = Reset: Set
            [Expert Info (Chat/Sequence): Connection reset (RST)]
                [Message: Connection reset (RST)]
                [Severity level: Chat]
                [Group: Sequence]
        .... .... ..0. = Syn: Not set
        .... .... ...0 = Fin: Not set
    Window size value: 1023
    [Calculated window size: 130944]
    [Window size scaling factor: 128]
    Checksum: 0xf83b [validation disabled]
        [Good Checksum: False]
        [Bad Checksum: False]
    Options: (12 bytes), No-Operation (NOP), No-Operation (NOP), Timestamps
        No-Operation (NOP)
            Type: 1
                0... .... = Copy on fragmentation: No
                .00. .... = Class: Control (0)
                ...0 0001 = Number: No-Operation (NOP) (1)
        No-Operation (NOP)
            Type: 1
                0... .... = Copy on fragmentation: No
                .00. .... = Class: Control (0)
                ...0 0001 = Number: No-Operation (NOP) (1)
        Timestamps: TSval 251930008, TSecr 251929926
            Kind: Timestamp (8)
            Length: 10
            Timestamp value: 251930008
            Timestamp echo reply: 251929926

0000  00 00 00 00 00 00 00 00 00 00 00 00 86 dd 60 00   ..............`.
0010  00 00 00 20 06 40 00 00 00 00 00 00 00 00 00 00   ... .@..........
0020  00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00   ................
0030  00 00 00 00 00 01 07 d3 ae ee e2 87 ae 7f a0 82   ................
0040  29 4b 80 14 03 ff f8 3b 00 00 01 01 08 0a 0f 04   )K.....;........
0050  25 98 0f 04 25 46 

++umgfoin

Sent wireshark capture to feedback@kopano.com

@jengelh said in [Problem] KC::DAGENT: Postfix LMTP to DAgent: lost connection with localhost while sending end of data:

lost connection: file descriptor got closed unexpectedly. crash?

Crashes with process_model = thread.

Can’t provide coredumps - no priv to set fs.suid_dumpable on this machine.
Crash probably in HandlerLMTParound/inProcessDeliveryToList and SaveRawMessage.
(raw-msg-dumps don’t get written).

Problem solved:
init.d was starting an outdated dagent in /usr/local/sbin as daemon.
This explains, why delivery via postfix-unix-service worked ( in this case, the correct dagent version in /usr/sbin was called).

Sorry to everyone spending time on this.
++umgfoin.