python - How Pony (ORM) does its tricks?
Pony ORM does the nice trick of converting a generator expression into SQL. Example:

    >>> select(p for p in Person if p.name.startswith('Paul')).order_by(Person.name)[:2]

    SELECT "p"."id", "p"."name", "p"."age"
    FROM "Person" "p"
    WHERE "p"."name" LIKE "Paul%"
    ORDER BY "p"."name"
    LIMIT 2

    [Person[3], Person[1]]
    >>>
I know Python has wonderful introspection and metaprogramming built in, but how is this library able to translate the generator expression without preprocessing? It looks like magic.
[update]
Blender wrote:
Here is the file that you're after. It seems to reconstruct the generator using some introspection wizardry. I'm not sure if it supports 100% of Python's syntax, but this is pretty cool. – Blender
I was thinking they were exploring some feature of the generator expression protocol, but looking at this file and seeing the ast module involved... No, they are not inspecting the program source on the fly, are they? Mind-blowing...
@BrenBarn: If I try to call the generator outside the select function call, the result is:
    >>> x = (p for p in Person if p.age > 20)
    >>> x.next()
    Traceback (most recent call last):
      File "<interactive input>", line 1, in <module>
      File "<interactive input>", line 1, in <genexpr>
      File "c:\python27\lib\site-packages\pony\orm\core.py", line 1822, in next
        % self.entity.__name__)
      File "c:\python27\lib\site-packages\pony\utils.py", line 92, in throw
        raise exc
    TypeError: Use select(...) function or Person.select(...) method for iteration
    >>>
It seems they are doing more arcane incantations, like inspecting the select function call and processing the Python abstract syntax grammar tree on the fly.
I would still like to see someone explain it; the source is way beyond my wizardry level.
Pony ORM author here.
Pony translates a Python generator into an SQL query in three steps:
- Decompiling the generator bytecode and rebuilding the generator AST (abstract syntax tree)
- Translating the Python AST into "abstract SQL", a universal list-based representation of an SQL query
- Converting the abstract SQL representation into a specific database-dependent SQL dialect
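A toy sketch of the second and third steps might look like the following. The node tags (`EQ`, `COLUMN`, `VALUE`) and both function names are made up for illustration and are not Pony's actual abstract SQL format. The sketch also assumes Python 3.8+ and starts from the standard `ast` module rather than from decompiled bytecode.

```python
import ast

def to_abstract_sql(node):
    """Translate a tiny subset of Python expressions into nested lists.

    The list shapes here are hypothetical, not Pony's real format.
    """
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Eq)):
        return ['EQ',
                to_abstract_sql(node.left),
                to_abstract_sql(node.comparators[0])]
    if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
        return ['COLUMN', node.value.id, node.attr]
    if isinstance(node, ast.Constant):
        return ['VALUE', node.value]
    raise NotImplementedError(ast.dump(node))

def to_sql(tree):
    """Render the nested-list form as SQL text for one imaginary dialect."""
    kind = tree[0]
    if kind == 'EQ':
        return '%s = %s' % (to_sql(tree[1]), to_sql(tree[2]))
    if kind == 'COLUMN':
        return '"%s"."%s"' % (tree[1], tree[2])
    if kind == 'VALUE':
        value = tree[1]
        if isinstance(value, str):
            # Double any single quotes so the literal stays well-formed
            return "'%s'" % value.replace("'", "''")
        return str(value)
    raise NotImplementedError(kind)

expr = ast.parse("c.country == 'USA'", mode='eval').body
abstract = to_abstract_sql(expr)
sql = to_sql(abstract)
print(abstract)
print(sql)
```

Keeping the intermediate form as plain nested lists means the last function could be swapped per database without touching the code that interprets the Python expression.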
The most complex part is the second step, where Pony must understand the "meaning" of Python expressions. It seems you are most interested in the first step, so let me explain how decompiling works.
Let's consider this query:
    >>> from pony.orm.examples.estore import *
    >>> select(c for c in Customer if c.country == 'USA').show()
Which will be translated into the following SQL:
    SELECT "c"."id", "c"."email", "c"."password", "c"."name", "c"."country", "c"."address"
    FROM "Customer" "c"
    WHERE "c"."country" = 'USA'
And below is the result of this query, which will be printed out:
    id|email              |password|name          |country|address
    --+-------------------+--------+--------------+-------+---------
    1 |john@example.com   |***     |John Smith    |USA    |address 1
    2 |matthew@example.com|***     |Matthew Reed  |USA    |address 2
    4 |rebecca@example.com|***     |Rebecca Lawson|USA    |address 4
The select() function accepts a Python generator as its argument and then analyzes its bytecode. We can get the bytecode instructions of this generator using the standard Python dis module:
    >>> gen = (c for c in Customer if c.country == 'USA')
    >>> import dis
    >>> dis.dis(gen.gi_frame.f_code)
      1           0 LOAD_FAST                0 (.0)
            >>    3 FOR_ITER                26 (to 32)
                  6 STORE_FAST               1 (c)
                  9 LOAD_FAST                1 (c)
                 12 LOAD_ATTR                0 (country)
                 15 LOAD_CONST               0 ('USA')
                 18 COMPARE_OP               2 (==)
                 21 POP_JUMP_IF_FALSE        3
                 24 LOAD_FAST                1 (c)
                 27 YIELD_VALUE
                 28 POP_TOP
                 29 JUMP_ABSOLUTE            3
            >>   32 LOAD_CONST               1 (None)
                 35 RETURN_VALUE
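The listing above appears to come from Python 2.7, and opcode names and offsets change between interpreter versions. A minimal Python 3 sketch of the same inspection follows. It assumes a plain list of hypothetical `Customer` objects in place of a Pony entity, and it makes no claim about the exact instruction sequence your interpreter will print.

```python
import dis

class Customer:
    """Hypothetical stand-in for a Pony entity, used only for this demo."""
    def __init__(self, country):
        self.country = country

customers = [Customer('USA'), Customer('Canada')]
gen = (c for c in customers if c.country == 'USA')

# gi_code holds the code object of the generator expression
opnames = [ins.opname for ins in dis.get_instructions(gen.gi_code)]
for name in opnames:
    print(name)
```

The attribute lookup and the comparison should still show up as `LOAD_ATTR` and `COMPARE_OP`, while the loop and jump instructions around them differ from version to version.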
Pony ORM has the function decompile() within the module pony.orm.decompiling, which can restore an AST from the bytecode:
    >>> from pony.orm.decompiling import decompile
    >>> ast, external_names = decompile(gen)
Here, we can see the textual representation of the AST nodes:
    >>> ast
    GenExpr(GenExprInner(Name('c'), [GenExprFor(AssName('c', 'OP_ASSIGN'), Name('.0'), [GenExprIf(Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]))])]))
Let's now see how the decompile() function works.
The decompile() function creates a Decompiler object, which implements the Visitor pattern. The decompiler instance gets the bytecode instructions one by one. For each instruction, the decompiler object calls its own method. The name of this method is equal to the name of the current bytecode instruction.
When Python calculates an expression, it uses a stack that stores the intermediate results of the calculation. The decompiler object also has its own stack, but this stack stores not the result of the expression calculation but the AST node for the expression.
When the decompiler method for the next bytecode instruction is called, it takes AST nodes from the stack, combines them into a new AST node, and then puts this node on top of the stack.
For example, let's see how the subexpression c.country == 'USA' is calculated. The corresponding bytecode fragment is:
     9 LOAD_FAST                1 (c)
    12 LOAD_ATTR                0 (country)
    15 LOAD_CONST               0 ('USA')
    18 COMPARE_OP               2 (==)
So, the decompiler object does the following:
- Calls decompiler.LOAD_FAST('c'). This method puts the Name('c') node on top of the decompiler stack.
- Calls decompiler.LOAD_ATTR('country'). This method takes the Name('c') node from the stack, creates the Getattr(Name('c'), 'country') node, and puts it on top of the stack.
- Calls decompiler.LOAD_CONST('USA'). This method puts the Const('USA') node on top of the stack.
- Calls decompiler.COMPARE_OP('=='). This method takes two nodes (Getattr and Const) from the stack, and then puts Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]) on top of the stack.
After all bytecode instructions are processed, the decompiler stack contains a single AST node that corresponds to the whole generator expression.
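The stack mechanics can be sketched in a few lines. This is a simplified illustration and not Pony's actual Decompiler class: the `MiniDecompiler` name, the namedtuple node types, and the hand-written instruction list are all assumptions made so the example runs on any Python version without reading real bytecode.

```python
from collections import namedtuple

# Hypothetical, simplified stand-ins for the AST node classes
Name = namedtuple('Name', 'name')
Const = namedtuple('Const', 'value')
Getattr = namedtuple('Getattr', 'expr attrname')
Compare = namedtuple('Compare', 'expr ops')

class MiniDecompiler:
    """Toy stack-based decompiler with one method per opcode name."""

    def __init__(self):
        self.stack = []

    def decompile(self, instructions):
        for opname, arg in instructions:
            # Dispatch to the method whose name equals the opcode name
            getattr(self, opname)(arg)
        return self.stack

    def LOAD_FAST(self, varname):
        self.stack.append(Name(varname))

    def LOAD_CONST(self, value):
        self.stack.append(Const(value))

    def LOAD_ATTR(self, attrname):
        obj = self.stack.pop()
        self.stack.append(Getattr(obj, attrname))

    def COMPARE_OP(self, op):
        # The right operand was pushed last, so it is popped first
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(Compare(left, [(op, right)]))

# Hand-written (opcode, argument) pairs for: c.country == 'USA'
instructions = [
    ('LOAD_FAST', 'c'),
    ('LOAD_ATTR', 'country'),
    ('LOAD_CONST', 'USA'),
    ('COMPARE_OP', '=='),
]

stack = MiniDecompiler().decompile(instructions)
result = stack[-1]
print(result)
```

After the four instructions the stack holds exactly one node, the comparison, with the attribute access and the constant nested inside it.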
Since Pony ORM needs to decompile generators and lambdas only, this is not that complex, because the instruction flow for a generator is relatively straightforward: it is just a bunch of nested loops.
Currently Pony ORM covers the whole generator instruction set except for two things:
- Inline if expressions: a if b else c
- Compound comparisons: a < b < c
If Pony encounters such an expression, it raises the NotImplementedError exception. But even in this case you can make it work by passing the generator expression as a string. When you pass a generator as a string, Pony doesn't use the decompiler module. Instead it gets the AST using the standard Python compiler.parse function.
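The compiler module only exists in Python 2; it was removed in Python 3. As a rough sketch of the same idea on Python 3, the standard `ast` module can parse a generator expression given as a string. This illustrates getting an AST from source text and is not a description of what any particular Pony version does internally; `Customer` here is just a name inside the string and is never looked up.

```python
import ast

source = "(c for c in Customer if c.country == 'USA')"

# mode='eval' parses a single expression; .body is the expression node
tree = ast.parse(source, mode='eval')
genexpr = tree.body
comprehension = genexpr.generators[0]

print(type(genexpr).__name__)
print(comprehension.target.id, comprehension.iter.id)
print(ast.dump(comprehension.ifs[0]))
```

Because the source text is available, no bytecode has to be decompiled, which is why constructs the decompiler does not handle can still be parsed this way.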
Hope this answers your question.