Opened 6 years ago

Closed 5 years ago

#2696 closed defect (fixed)

(PostgreSQL) missing rollback in db.scheduler

Reported by: gracinet Owned by: dustin
Priority: major Milestone: 0.9.0
Version: master Keywords:
Cc:

Description

Main symptom:

          File "/srv/buildbot/local/lib/python2.7/site-packages/sqlalchemy/engin
e/base.py", line 1690, in _execute_context
            context)
          File "/srv/buildbot/local/lib/python2.7/site-packages/sqlalchemy/engin
e/default.py", line 335, in do_execute
            cursor.execute(statement, parameters)
        sqlalchemy.exc.InternalError: (InternalError) current transaction is aborted, commands ignored until end of transaction block
         'UPDATE scheduler_changes SET important=%(important)s WHERE scheduler_changes.objectid = %(objectid_1)s AND scheduler_changes.changeid = %(wc_changeid)s' {'wc_changeid': 5096, 'important': 1, 'objectid_1': 15}

This excerpt from db.schedulers cannot work with PostgreSQL:

                except (sqlalchemy.exc.ProgrammingError,
                        sqlalchemy.exc.IntegrityError):
                    # insert failed, so try an update
                    conn.execute(upd_q,

Indeed, after either of these two exceptions, the transaction must be rollbacked.

Change History (12)

comment:1 Changed 6 years ago by gracinet

A quickfix is at https://github.com/gracinet/buildbot/commit/c9289275452555f96db2aa1edd8e2824f34e85dc

Contrary to what the commit message says, the transactional behaviour with the quickfix is suboptimal (the transaction is about several changeset, one failed insert should not rollback the others).

A better version would be to check for prior existence and update or insert accordingly, but I'll have to dig a bit further to know what this is about really.

savepoints would be an option, but I'm not sure all RDMS provide them (does SQLite ? ) IIRC, they're not even exposed in an abstract API by psycopg2.

comment:2 Changed 6 years ago by dustin

Checking and then inserting introduces a race condition - that's what we're trying to avoid with this logic.

It's OK to introduce DB-specific behavior in the DB modules, if there's a need to.

Is it automatic that when an insert fails, Postgres aborts the whole transaction? Or is that something automatic that's happening on the client? In my testing with postgres, this works fine (the buildbot.test.unit.test_db_schedulers.TestSchedulersConnectorComponent?.test_classifyChanges_again executes this particular except block)

comment:3 Changed 5 years ago by Ben

This was hurting us on 0.8.10.

The linked revision did solved it for us.

What's the way forward there ?

comment:4 Changed 5 years ago by dustin

Gracinet's comment has me concerned that this doesn't quite solve the issue, since it may roll back some inserts which won't be replicated by the subsequent update.

Can you make a failing test case that we can use to verify a fix?

comment:5 Changed 5 years ago by Ben

It might be quite complicated to reproduce.

In our case, this happened during 'reconfig', one or two of the schedulers, out of ~50, amount of scheduler depending on the system involved (test / production), did consistently threw that error.

Those schedulers had nothing special, and were created in a loop like some of the other ones. Not all schedulers from the same loop showed the trouble.

comment:6 Changed 5 years ago by dustin

Incidentally, I somehow included the mentioned fix in 3874e116549f77e38e71fb55bcb08e481f4b1aab so the patch is present in master.

comment:7 Changed 5 years ago by dustin

  • Milestone changed from undecided to 0.9.0
  • Owner set to dustin
  • Status changed from new to assigned

comment:8 Changed 5 years ago by Ben

You're right, looks like the fix is indeed in master !

Is it an option to backport it to eight ?

comment:9 Changed 5 years ago by dustin

You probably want to backport the improved fix in GH:1572. That's up to sa2ajj.

comment:10 Changed 5 years ago by sa2ajj

It looks like cherry pick does not apply cleanly :(

I'll see if I can properly get it, but help would be appreciated!

comment:11 Changed 5 years ago by dustin

backport in GH:1598

comment:12 Changed 5 years ago by dustin

  • Resolution set to fixed
  • Status changed from assigned to closed
Note: See TracTickets for help on using tickets.