Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix partial header receive by WatsonTCP. Fixes #75 #110

Merged
merged 2 commits into from
Jan 9, 2025

Conversation

sancheolz
Copy link
Contributor

No description provided.

@yallie
Copy link
Collaborator

yallie commented Jan 9, 2025

Hmm...

WatsonTcp Version=6.0.6 passes all unit tests.
WatsonTcp Version=6.0.8 freezes.

Looks like it's due to these changes:

image

_DataReceiver task is stuck on WaitingForActivation status and never gets to IsCompleted = true:

image

Not sure how it can happen given that _DataReceiver is created like this:

image

UPD. OK, looks like it means that the DataReceiver task is awaiting on something
and the continuation part is not yet scheduled by the task scheduler.

@yallie
Copy link
Collaborator

yallie commented Jan 9, 2025

By the way, WatsonTcp 6.0.8 freezes as well on before-async versions.
I've tried reverting to commit f1387ae and it hangs as well.
Apparently the freeze has nothing to do with the async rewrite of the transport channels.

Any ideas? :)

@yallie
Copy link
Collaborator

yallie commented Jan 9, 2025

Looks like commit fedfd5d introduced a deadlock 🤔

WatsonTcpClient.Disconnect waits for the DataReceive task to stop running.
But DataReceive task is waiting for the OnMessage event handler to complete, so Disconnect freezes.

image

I'm going to try reverting that commit.

@sancheolz could you please remember what issue it should have fixed?
Could we have a unit test for this issue?

…aReceive task is waiting for the OnMessage event handler to complete).
@yallie yallie merged commit f5106c8 into theRainbird:master Jan 9, 2025
3 checks passed
@sancheolz
Copy link
Contributor Author

sancheolz commented Jan 10, 2025

@yallie If remote object implements remote method for proxy service like this:

public void CloseSession()
{
    RemotingSession.Current.Close();
}

RPC for CloseSession() may be executed without exceptions, with Disconnect event after. Or sometimes this RPC will interrupted with exception. It happens when processing of "session_closed" occurs before "rpc_result". So we need to synchronized processing of messages in OnMessage for predictable behavior. Test should try to cause this unpredictability, but I don't know how to do it yet.

@sancheolz
Copy link
Contributor Author

@yallie I've brought back changes from fedfd5d in last revision and deadlock didn't occured. How often deadlock occurs?

@yallie
Copy link
Collaborator

yallie commented Jan 13, 2025

It happens when processing of "session_closed" occurs before "rpc_result". So we need to synchronized processing of messages in OnMessage for predictable behavior.

Ok, now I see! Will look into this case.

How often deadlock occurs?

It was failing pretty consistently on Github actions, so I couldn't make tests pass even once, see screenshot.
Then I reproduced it locally a couple of times, looked at parallel stacks and found the issue.

image

@yallie
Copy link
Collaborator

yallie commented Jan 13, 2025

@yallie If remote object implements remote method for proxy service like this:

public void CloseSession()
{
    RemotingSession.Current.Close();
}

RPC for CloseSession() may be executed without exceptions, with Disconnect event after. Or sometimes this RPC will interrupted with exception. It happens when processing of "session_closed" occurs before "rpc_result". So we need to synchronized processing of messages in OnMessage for predictable behavior. Test should try to cause this unpredictability, but I don't know how to do it yet.

@sancheolz, please have a look at #113

CloseSession should work fine without changes from commit fedfd5d.

@sancheolz
Copy link
Contributor Author

Looks good to me. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants