[FB-Tracker] Created: (CORE-5521) Race condition between event notification and event registration

JIRA tracker@firebirdsql.org
Race condition between event notification and event registration
----------------------------------------------------------------

                 Key: CORE-5521
                 URL: http://tracker.firebirdsql.org/browse/CORE-5521
             Project: Firebird Core
          Issue Type: Bug
          Components: Engine
    Affects Versions: 3.0.2, 2.5.7, 4.0 Initial
            Reporter: Mark Rotteveel


There is a race condition between event notification and event registration (queue), which reuses the event buffer. As a result, the notification is written with the event id of the just-queued event instead of the id of the event being notified. See also the fb-devel thread "Concurrency bugs in posting events?"
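
To make the suspected interleaving concrete, here is a minimal, deterministic Java sketch. It is a toy model and not Firebird's actual code: SharedEventBufferRace, SharedRequest and notifyEvent are hypothetical names, and the window in which a queue call can slip in is simulated with a callback instead of real threads. The ids and counts mirror the first log excerpt below (TEST_EVENT_B queued as 695, TEST_EVENT_A queued as 696).

```java
/** Toy model of the suspected race; hypothetical names, not Firebird code. */
final class SharedEventBufferRace {

    /** Assumed state that both the queue path and the notify path reuse. */
    static final class SharedRequest {
        int eventId;
    }

    /**
     * Builds the notification for one event. The id is stored in the shared
     * request first and read back only when the message is written, so
     * anything that runs in between can replace it.
     */
    static String notifyEvent(String name, int count, int registeredId,
                              SharedRequest shared, Runnable concurrentQueue) {
        shared.eventId = registeredId; // notifier stores the id it intends to send
        concurrentQueue.run();         // simulated window for a concurrent queue call
        return name + " count=" + count + " id=" + shared.eventId;
    }

    public static void main(String[] args) {
        SharedRequest shared = new SharedRequest();

        // No interleaving: the notification carries the id of TEST_EVENT_B.
        System.out.println(
                notifyEvent("TEST_EVENT_B", 451, 695, shared, () -> { }));

        // TEST_EVENT_A is queued with id 696 inside the window and
        // overwrites the shared id before the notification is written.
        System.out.println(
                notifyEvent("TEST_EVENT_B", 451, 695, shared,
                        () -> shared.eventId = 696));
    }
}
```

The first line printed is "TEST_EVENT_B count=451 id=695", the second is "TEST_EVENT_B count=451 id=696", which is the same kind of mismatch the log excerpts show. Whether the real engine loses the id in exactly this way is an assumption.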

Relevant excerpts:

2017-04-09 12:00:
I have built a sample application that reproduces it a little bit more
consistently (although it still occasionally succeeds without a
mismatch). Note that this doesn't include the logging I showed in this
conversation; let me know if you need that.

You can download it from
https://www.dropbox.com/s/6jxfcadxtojodf8/event-race-condition-1.0-SNAPSHOT.zip?dl=0

Start with ./bin/event-race-condition --help for instructions. It
requires Java 8.

Tuning --threadCount and --insertsPerThread can make the problem
reproduce more reliably. I used the same number of threads as my
machine has (HT) cores. Using more inserts per thread also increases
the chance of it eventually occurring.

Defaults are:
     private static String hostName = "localhost";
     private static int portNumber = 3050;
     private static String databasePath = "D:/data/db/fb3/eventrace.fdb";
     private static String user = "sysdba";
     private static String password = "masterkey";
     private static int threadCount = 8;
     private static int insertsPerThread = 200;

Full project: https://github.com/mrotteveel/event-race-condition

2017-04-02 13:59:
There seems to be a concurrency bug in the events that Firebird posts
to the client. It looks like it overwrites local event ids (shared
buffer, race condition?).

This is triggered by running the entire Jaybird test suite. Running the
specific test, TestFBEventManager.testLargeMultiLoad, in isolation
significantly reduces the chance of it occurring.

For example, a test run shows:

[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:694, eventId:0, internalCount:897,
previousInternalCount:897 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]:
000000340000000000000012010C544553545F4556454E545F42C201000000000000000000000000000002B5000000340000000000000012010C544553545F4556454E545F418503000000000000000000000000000002B6
[V10AsynchronousChannel]Received event id 693, eventCount 450
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_B, localId:695, eventId:0, internalCount:450,
previousInternalCount:450 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 694, eventCount 901
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:696, eventId:0, internalCount:901,
previousInternalCount:901 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=44 cap=2048]:
000000340000000000000012010C544553545F4556454E545F42C301000000000000000000000000000002B8
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 696, eventCount 451
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:697, eventId:0, internalCount:451,
previousInternalCount:451 }

In other words, Firebird posts event data for TEST_EVENT_B (count 450 ->
451) with the local event id of TEST_EVENT_A. On occasion I also see
it resend an earlier, already acknowledged local event id.
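
The mismatch can be read directly from the hex dump. The following Java sketch decodes such a dump under a layout inferred from the dumps shown here, so treat the field meanings as assumptions: a 4-byte operation code (0x34, which appears to be op_event), a 4-byte handle, a 4-byte block length, the event block (version byte, name length byte, name, 4-byte little-endian count), padding to a 4-byte boundary, 8 bytes that are zero in every dump, and finally the 4-byte event id. EventDumpDecoder and its members are hypothetical names, and SECOND_DUMP is the 44-byte dump from the excerpt above, split by field.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Decodes the hex dumps from the log; the layout is inferred, not official. */
final class EventDumpDecoder {

    /** The 44-byte dump from the excerpt, split by assumed field. */
    static final String SECOND_DUMP =
            "00000034" + "00000000" + "00000012"        // operation, handle, block length
            + "010C" + "544553545F4556454E545F42"       // version, name length, name
            + "C3010000" + "0000"                       // count (little-endian), padding
            + "0000000000000000"                        // 8 bytes, zero in all dumps
            + "000002B8";                               // event id

    static final class Event {
        final String name;
        final int count;
        final int eventId;

        Event(String name, int count, int eventId) {
            this.name = name;
            this.count = count;
            this.eventId = eventId;
        }

        @Override
        public String toString() {
            return name + " count=" + count + " id=" + eventId;
        }
    }

    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    static List<Event> decode(String hex) {
        ByteBuffer buf = ByteBuffer.wrap(fromHex(hex)); // big-endian by default
        List<Event> events = new ArrayList<>();
        while (buf.hasRemaining()) {
            int operation = buf.getInt();
            if (operation != 0x34) {
                throw new IllegalArgumentException("Unexpected operation " + operation);
            }
            buf.getInt();                                // handle, 0 in the dumps
            int length = buf.getInt();
            byte[] block = new byte[length];
            buf.get(block);
            buf.position(buf.position() + (4 - length % 4) % 4); // skip padding
            buf.getLong();                               // 8 bytes, meaning assumed
            int eventId = buf.getInt();

            int nameLength = block[1] & 0xFF;
            String name = new String(block, 2, nameLength, StandardCharsets.US_ASCII);
            int count = ByteBuffer.wrap(block, 2 + nameLength, 4)
                    .order(ByteOrder.LITTLE_ENDIAN).getInt();
            events.add(new Event(name, count, eventId));
        }
        return events;
    }

    public static void main(String[] args) {
        for (Event event : decode(SECOND_DUMP)) {
            System.out.println(event);
        }
    }
}
```

Run on that dump, it prints "TEST_EVENT_B count=451 id=696", while 696 is the local id that was assigned to TEST_EVENT_A when it was queued.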

As the event name is correct (although I am not 100% sure it always
is), I might be able to work around this in the pure Java
implementation by matching on the event name instead. That is hardly a
good workaround, though: the same event name can be registered multiple
times, and it won't solve the occurrence of the same bug with the
native client.
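
A small Java sketch of why matching on the event name is fragile. This is an illustration only: NameMatchingAmbiguity and candidates are hypothetical names and do not reflect how Jaybird tracks handles, and the second registration of TEST_EVENT_B with local id 700 is made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrates name-based matching; hypothetical, not Jaybird's implementation. */
final class NameMatchingAmbiguity {

    /** Returns the local ids of all queued handles registered under the notified name. */
    static List<Integer> candidates(Map<Integer, String> queuedByLocalId,
                                    String notifiedName) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : queuedByLocalId.entrySet()) {
            if (entry.getValue().equals(notifiedName)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<Integer, String> queued = new LinkedHashMap<>();
        queued.put(695, "TEST_EVENT_B");
        queued.put(696, "TEST_EVENT_A");

        // One handle per name: the name identifies the handle uniquely.
        System.out.println(candidates(queued, "TEST_EVENT_B"));

        // Hypothetical second registration of the same name: the name alone
        // can no longer tell which handle the notification belongs to.
        queued.put(700, "TEST_EVENT_B");
        System.out.println(candidates(queued, "TEST_EVENT_B"));
    }
}
```

The first lookup yields a single candidate (695); after the second registration the lookup yields two (695 and 700), so the name alone cannot replace a correct event id.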

This seems to suggest a race condition of some kind when the events are
posted/written to the aux connection.

I can reproduce this with Jaybird master and Firebird 3.0.2.32703 on
64-bit Windows 10, but I have also seen it with other Firebird versions
and with Jaybird 2.2 (which has a significantly different implementation
of event handling), with both the pure Java and the native client.

Any thoughts or ideas on this, or is it better if I just create a bug
report?

Another example: both A and B are acknowledged with the id of event B:

[V10AsynchronousChannel]Received event id 640, eventCount 843
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_A, localId:642, eventId:0, internalCount:843,
previousInternalCount:843 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[FBManagedConnection]End called: Xid[773794790]
[V10AsynchronousChannel]Received event id 641, eventCount 422
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_B, localId:643, eventId:0, internalCount:422,
previousInternalCount:422 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]java.nio.HeapByteBuffer[pos=0 lim=88 cap=2048]:
000000340000000000000012010C544553545F4556454E545F414D0300000000000000000000000000000283000000340000000000000012010C544553545F4556454E545F42A70100000000000000000000000000000283
[V10AsynchronousChannel]Received event id 643, eventCount 845
[V10AsynchronousChannel]Queue event: WireEventHandle:{
name:TEST_EVENT_B, localId:644, eventId:0, internalCount:845,
previousInternalCount:845 }
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[AbstractWireOperations]readStatusVector arg:isc_arg_gds int: 0
[V10AsynchronousChannel]Received event id 643, eventCount 423

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://tracker.firebirdsql.org/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

       

Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel