Fix bug during call `tcp_done`. #2340

EvgeniiMekhanik · 2025-02-10T23:41:47Z

When Tempesta FW closes socket from ss_do_close
function, Tempesta FW sends TCP FIN using tcp_send_fin or TCP RST using tcp_send_active_reset. Both this functions use tcp_write_xmit to push FIN/RST to network. If error occurs in tcp_write_xmit->tcp_tfw_sk_write_xmit we call tcp_tfw_handle_error->tcp_done. In this case ss_do_close called recursively from tcp_done function, that leads to extra inet_csk_destroy_sock call and extra socket put. This patch fixes this behaviour - Tempesta FW set new special flag at the beginning of ss_do_close and if this flag is set we don't call sk_state_change / inet_csk_destroy_sock from tcp_done.

const-t · 2025-02-11T13:12:08Z

fw/sock.c

+	 * when `tcp_done` is called because of error in `tcp_send_fin` /
+	 * `tcp_send_active_reset`->  `tcp_write_xmit->tcp_tfw_handle_error`.
+	 */
+	sock_set_flag(sk, SOCK_TIMESTA_CLOSING);


I have proposal for this patch.

First of all I can't see cases how tcp_write_xmit() could be called from tcp_send_active_reset(), the last avoids using of write_queue and calls tcp_transmit_skb() directly. Therefore tcp_tfw_handle_error() can't be called from tcp_send_active_reset().

The second. Looking into ss_do_close() I see single call of tcp_send_fin() here, and only inside this function ss_do_close() might be called recursively. Let's breakdown this call chain:

(#1) ss_do_close() \/ tcp_send_fin() #Before the call set state to TCP_FIN_WAIT1 \/ tcp_write_xmit() #Try to send skb with FIN. \/ handle_error() #Error occurred during sending FIN. \/ tcp_done() #SET state to TCP_CLOSE. \/ (#2) ss_do_close() -> goto adjudge_to_death -> set socket as DEAD -> call inet_csk_destroy_sock() the first time. \/ return from tcp_done() to ss_do_close() \/ (#1) ss_do_close() -> fall to adjudge_to_death -> set socket as DEAD again -> call inet_csk_destroy_sock() second time, because state is TCP_CLOSE.

As we can see, we should not call tcp_done() in case of error during sending FIN, maybe instead of adding new flag we can rely on socket's state and do something like this:

tcp_tfw_handle_error(struct sock *sk, int error) { tcp_send_active_reset(sk, GFP_ATOMIC); sk->sk_err = error; sk->sk_error_report(sk); tcp_write_queue_purge(sk); if (sk->sk_state == TCP_FIN_WAIT1) tcp_set_state(sk, TCP_CLOSE); else tcp_done(sk); }

it seems that we can't rely on the state here. For example when we call tcp_shutdown->tcp_close_state->tcp_send_fin and if tcp_send_fin fails we should call tcp_done

Maybe we can check the state in tcp_shutdown as well?

static inline void ss_do_shutdown(struct sock *sk) { tcp_shutdown(sk, SEND_SHUTDOWN); SS_CONN_TYPE(sk) |= Conn_Shutdown; if (sk->state == TCP_CLOSE) ss_do_close(sk, 0); }

Yeah, it'd be wonderful if @const-t 's proposal can be implemented: it's not so beautiful to introduce a new flag in the kernel patch just for one condition.

When Tempesta FW closes socket from `ss_do_close` function, Tempesta FW sends TCP FIN using `tcp_send_fin` This function use `tcp_write_xmit` to push FIN/RST to network. If error occurs in `tcp_write_xmit->tcp_tfw_sk_write_xmit` we call `tcp_tfw_handle_error->tcp_done`. In this case `ss_do_close` called recursively from `tcp_done` function, that leads to extra `inet_csk_destroy_sock` call and extra socket put. This patch fixes this behaviour - now we check socket state in `tcp_tfw_handle_error` and don't call `tcp_done` if socket state is `TCP_FIN_WAIT1` or `TCP_LAST_ACK` (socket mowed to these states if TCP FIN is sent from `ss_do_close`).

EvgeniiMekhanik requested review from const-t and krizhanovsky and removed request for const-t February 10, 2025 23:41

const-t reviewed Feb 11, 2025

View reviewed changes

EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2328-6 branch from 3aee725 to 6129199 Compare February 17, 2025 21:04

EvgeniiMekhanik requested a review from const-t February 17, 2025 21:04

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix bug during call `tcp_done`. #2340

Fix bug during call `tcp_done`. #2340

EvgeniiMekhanik commented Feb 10, 2025

const-t Feb 11, 2025

EvgeniiMekhanik Feb 11, 2025

const-t Feb 12, 2025

krizhanovsky Feb 17, 2025

Fix bug during call tcp_done. #2340

Are you sure you want to change the base?

Fix bug during call tcp_done. #2340

Conversation

EvgeniiMekhanik commented Feb 10, 2025

const-t Feb 11, 2025

Choose a reason for hiding this comment

EvgeniiMekhanik Feb 11, 2025

Choose a reason for hiding this comment

const-t Feb 12, 2025

Choose a reason for hiding this comment

krizhanovsky Feb 17, 2025

Choose a reason for hiding this comment

Fix bug during call `tcp_done`. #2340

Fix bug during call `tcp_done`. #2340