How to use the SQL parser subsystem

Started by Ravindra Jajuover 19 years ago6 messagesgeneral
Jump to latest
#1Ravindra Jaju
list.jaju@gmail.com

Hi.

Is it possible to use the SQL parser embedded inside the postgres code via
some public API?
I wish to get a parsed SQL query in any format which lends itself to easy
analysis (structural, for example) so as to be able to find similarities
between queries.

I tried peeking around a bit, and tried pulling out src/backend/parser - but
that depends on other parts of the code, making it a bit non-trivial. Plus,
any changes in the base code upstream would not be available to this
pulled-out application over time. Looking further, I noticed on my Fedora
distribution that the pg devel package has a parser/parser.h file which
exposes the function 'raw_parser' which is the function that seemed most
interesting to me while exploring the code too.

I have some questions:
1] is 'raw_parser' the right function to parse an SQL statement (for mainly
structural/syntactic analysis)?
2] If yes, I found that none of the shared library files expose this
function - I could not link a simple C program with this function
successfully. (I tried all the .so files related to postgres! :-()
3] If not, what would be a better mechanism to re-use all the work already
done in postgres for SQL analysis?

regards,
jaju

#2Martijn van Oosterhout
kleptog@svana.org
In reply to: Ravindra Jaju (#1)
Re: How to use the SQL parser subsystem

On Thu, Jan 04, 2007 at 08:01:17PM +0530, Ravindra Jaju wrote:

I tried peeking around a bit, and tried pulling out src/backend/parser - but
that depends on other parts of the code, making it a bit non-trivial. Plus,
any changes in the base code upstream would not be available to this
pulled-out application over time. Looking further, I noticed on my Fedora
distribution that the pg devel package has a parser/parser.h file which
exposes the function 'raw_parser' which is the function that seemed most
interesting to me while exploring the code too.

Note that that's a *backend* header file.

I have some questions:
1] is 'raw_parser' the right function to parse an SQL statement (for mainly
structural/syntactic analysis)?

I beleive it's the function used y postgres internally, not sure
though.

2] If yes, I found that none of the shared library files expose this
function - I could not link a simple C program with this function
successfully. (I tried all the .so files related to postgres! :-()

It's inside the server, it's not in a seperate library. To use it you
need to be inside the server.

3] If not, what would be a better mechanism to re-use all the work already
done in postgres for SQL analysis?

No idea.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

From each according to his ability. To each according to his ability to litigate.

#3Ravindra Jaju
list.jaju@gmail.com
In reply to: Martijn van Oosterhout (#2)
Re: How to use the SQL parser subsystem

Hi.

That was informative. Thanks.

The reason I started pursuing this path was that this file (parser/parser.h)
happens
to be distributed as part of some 'devel' package, making me think that it
might
be possible to re-use the SQL parsing functionality that exists within
postgres.

Even if it is in the backend, I guessed that there might be a .so related to
the backend
which I could link against. Not so...

If someone can confirm that there is no known utility which uses postgres
codebase to
play with SQL statements, then it would make sense for me to try and hack
something.

regards,
jaju

Show quoted text

On 1/4/07, Martijn van Oosterhout <kleptog@svana.org> wrote:

On Thu, Jan 04, 2007 at 08:01:17PM +0530, Ravindra Jaju wrote:

I tried peeking around a bit, and tried pulling out src/backend/parser -

but

that depends on other parts of the code, making it a bit non-trivial.

Plus,

any changes in the base code upstream would not be available to this
pulled-out application over time. Looking further, I noticed on my

Fedora

distribution that the pg devel package has a parser/parser.h file which
exposes the function 'raw_parser' which is the function that seemed most
interesting to me while exploring the code too.

Note that that's a *backend* header file.

I have some questions:
1] is 'raw_parser' the right function to parse an SQL statement (for

mainly

structural/syntactic analysis)?

I beleive it's the function used y postgres internally, not sure
though.

2] If yes, I found that none of the shared library files expose this
function - I could not link a simple C program with this function
successfully. (I tried all the .so files related to postgres! :-()

It's inside the server, it's not in a seperate library. To use it you
need to be inside the server.

3] If not, what would be a better mechanism to re-use all the work

already

done in postgres for SQL analysis?

No idea.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

From each according to his ability. To each according to his ability to

litigate.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)

iD8DBQFFnROLIB7bNG8LQkwRApb+AJ9z8V9p3krQ9zxGCB/yQCFPVNg2XgCeLZGl
qDi9lewzfJmPfM23lht/p7E=
=tk7B
-----END PGP SIGNATURE-----

#4Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Ravindra Jaju (#1)
Re: How to use the SQL parser subsystem

Hi.

Is it possible to use the SQL parser embedded inside the postgres code via
some public API?
I wish to get a parsed SQL query in any format which lends itself to easy
analysis (structural, for example) so as to be able to find similarities
between queries.

I tried peeking around a bit, and tried pulling out src/backend/parser - but
that depends on other parts of the code, making it a bit non-trivial. Plus,
any changes in the base code upstream would not be available to this
pulled-out application over time. Looking further, I noticed on my Fedora
distribution that the pg devel package has a parser/parser.h file which
exposes the function 'raw_parser' which is the function that seemed most
interesting to me while exploring the code too.

I have some questions:
1] is 'raw_parser' the right function to parse an SQL statement (for mainly
structural/syntactic analysis)?

Yes.

2] If yes, I found that none of the shared library files expose this
function - I could not link a simple C program with this function
successfully. (I tried all the .so files related to postgres! :-()

It's not exported.

3] If not, what would be a better mechanism to re-use all the work already
done in postgres for SQL analysis?

One idea would be writing a PostgreSQL function which calls
raw_parser() and returns the parse tree as a texual representaion.

Another idea would be pulling out raw_parser() along with related
functions from PostgreSQL. pgpool-II
(http://pgfoundry.org/projects/pgpool/) has alread done this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

#5Ravindra Jaju
list.jaju@gmail.com
In reply to: Tatsuo Ishii (#4)
Re: How to use the SQL parser subsystem

On 1/4/07, Tatsuo Ishii <ishii@postgresql.org> wrote:

3] If not, what would be a better mechanism to re-use all the work

already

done in postgres for SQL analysis?

One idea would be writing a PostgreSQL function which calls
raw_parser() and returns the parse tree as a texual representaion.

Another idea would be pulling out raw_parser() along with related
functions from PostgreSQL. pgpool-II
(http://pgfoundry.org/projects/pgpool/) has alread done this.

The first idea would mean having a postgres instance always accessible for
SQL analysis, which would be an overkill.

I shall check out pgpool-II.

Is there any specific reason why the SQL parsing portion is not exposed -
I expect a lot of people to do structure-level analysis of SQL statements,
hence
I do not really get why it should not be exposed.

Thanks a lot, and thanks for bearing with my questions! :-)

regards,
jaju

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#2)
Re: How to use the SQL parser subsystem

Martijn van Oosterhout <kleptog@svana.org> writes:

On Thu, Jan 04, 2007 at 08:01:17PM +0530, Ravindra Jaju wrote:

2] If yes, I found that none of the shared library files expose this
function - I could not link a simple C program with this function
successfully. (I tried all the .so files related to postgres! :-()

It's inside the server, it's not in a seperate library. To use it you
need to be inside the server.

By and large there is no part of the backend that is designed to be run
standalone --- almost everything relies on palloc and elog, for instance.
I concur with the suggestion to consider doing this as a backend
function rather than in a standalone program.

Note that what raw_parser gives you is the raw grammar output, which is
probably not really what you want. For almost any sort of interesting
analysis, I'd think you'd want to run the syntax tree through
parse_analyze() or one of its siblings, so that semantic interpretation
gets done. There is definitely no hope of pulling out parse_analyze(),
because it has to consult the catalogs ...

regards, tom lane