4 comments
burtonator Says:
OK... I read the bug report. Maybe I'm missing something.
It's a connect-related issue, and the connect timeout would control how long you can't move forward while blocking on IO.
Did you set connect_timeout before or after the connect attempt?
The fix for this bug just seems to be a way to kill the connect.
I should look at the patch .....
doughboy Says:
No burtonator, that is not it.
# mysqladmin var | fgrep connect_timeout
| connect_timeout | 5 |
I gave the slaves at least a minute.
If you follow the bug report I linked in my update, it is explained there in more detail. Also, the patch has been committed to fix this in future releases. Open Source scores a point there. My issue is confirmed, I see the patch, and I know the fix is coming.
burtonator Says:
This is a really sad and pathetic bug.
I've been working on a list of common distributed system bugs and this is going on the list.
You can fix it on your end though.
Look at the connect_timeout variable.
It defaults to 3600 seconds, but I reduce it to 30 seconds.
When you run SLAVE STOP, it won't interrupt the timeout, but the slave will stop once the 30 seconds are up.
Other bugs I've seen on this topic:
* infinite DNS caching
* DNS caching within the app
* infinite or LONG read timeouts
... etc.
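For the timeout items on that list, here is a minimal Python sketch of the general idea: put an explicit bound on both the connect and the read so neither can block forever. This is not MySQL code and not from the bug report; the names `read_with_timeout` and `start_silent_server` are made up for illustration, and the silent server is only a stand-in for a peer that has hung.

```python
import socket
import threading


def read_with_timeout(host, port, connect_timeout=5.0, read_timeout=2.0):
    """Connect and read once, bounding both steps so neither blocks forever."""
    # create_connection applies the timeout to the connect attempt and
    # leaves the same timeout set on the returned socket.
    with socket.create_connection((host, port), timeout=connect_timeout) as sock:
        # Give reads their own, explicit deadline.
        sock.settimeout(read_timeout)
        try:
            return sock.recv(1024)
        except socket.timeout:
            return None  # peer stayed silent past the read deadline


def start_silent_server():
    """Hypothetical hung peer: accepts one connection and never replies."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    held = []  # keep the accepted socket referenced so it stays open

    def accept_once():
        try:
            conn, _ = server.accept()
            held.append(conn)
        except OSError:
            pass  # server was closed before anyone connected

    threading.Thread(target=accept_once, daemon=True).start()
    return server, held
```

Pointing `read_with_timeout` at the silent server returns `None` after roughly `read_timeout` seconds instead of hanging. Without the `settimeout` call and the `timeout` argument, the same read would wait indefinitely.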
doughboy Says:
See this bug for the slave's connect timeout:
http://bugs.mysql.com/bug.php?id=30932