Postgresql parser

Started by andurkar over 14 years ago, 4 messages
#1 andurkar
andurkarad10.comp@coep.ac.in

Hello,
Currently I am working on PostgreSQL... I need to study the gram.y and
scan.l parser files, since I want to do some query modification. Can anyone
please help me to understand the files? What should I do? Is there any
documentation available?

Regards,
Aditi.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/Postgresql-parser-tp4844522p4844522.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

#2 Kerem Kat
keremkat@gmail.com
In reply to: andurkar (#1)
Re: Postgresql parser

On Tue, Sep 27, 2011 at 11:44, andurkar <andurkarad10.comp@coep.ac.in> wrote:

Hello,
Currently I am working on PostgreSQL... I need to study the gram.y and
scan.l parser files, since I want to do some query modification. Can anyone
please help me to understand the files? What should I do? Is there any
documentation available?

Regards,
Aditi.

What kind of modifications do you want to do?

regards,

Kerem KAT

#3 Florian Pflug
fgp@phlo.org
In reply to: andurkar (#1)
Re: Postgresql parser

On Sep27, 2011, at 10:44 , andurkar wrote:

Currently I am working on PostgreSQL... I need to study the gram.y and
scan.l parser files, since I want to do some query modification. Can anyone
please help me to understand the files? What should I do? Is there any
documentation available?

scan.l defines the lexer, i.e. the algorithm that splits a string (containing
an SQL statement) into a stream of tokens. A token is usually a single word
(i.e., doesn't contain spaces but is delimited by spaces), but can also be
a whole single or double-quoted string for example. The lexer is basically
defined in terms of regular expressions which describe the different token types.
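
Just as an illustration (a made-up toy, not an excerpt from PostgreSQL's
scan.l), a minimal flex input could look like this; each rule pairs a
regular expression with a C action that runs when that expression matches:

/*
 * toy.l -- illustrative sketch only, not PostgreSQL's scan.l.
 * Build and try:
 *   flex toy.l && cc lex.yy.c -o toylex
 *   echo "select 'abc', 42" | ./toylex
 */
%option noyywrap
%%
[0-9]+                  { printf("NUMBER  %s\n", yytext); }
'[^']*'                 { printf("STRING  %s\n", yytext); }
[A-Za-z_][A-Za-z0-9_]*  { printf("IDENT   %s\n", yytext); }
[ \t\n]+                { /* whitespace just separates tokens */ }
.                       { printf("OTHER   %s\n", yytext); }
%%
int main(void) { yylex(); return 0; }

flex turns that into a C file (lex.yy.c) which you compile like any other
C file; PostgreSQL's build does the same with scan.l, just with far more
rules (keywords, operators, the various quoting and escaping forms, etc.).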

gram.y defines the grammar (the syntactical structure) of SQL statements,
using the tokens generated by the lexer as basic building blocks. The grammar
is defined in BNF notation. BNF resembles regular expressions but works
on the level of tokens, not characters. Also, patterns (called rules or productions
in BNF) are named, and may be recursive, i.e. use themselves as sub-patterns.
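
Again purely as an illustration (not an excerpt from gram.y), a tiny bison
grammar could look like the sketch below; note how expr_list is defined in
terms of itself, which is how this kind of grammar expresses repetition:

/*
 * toy.y -- illustrative sketch only, not PostgreSQL's gram.y.
 * Build and run:
 *   bison toy.y && cc toy.tab.c -o toyparse
 */
%{
#include <stdio.h>
int yylex(void);
void yyerror(const char *msg) { fprintf(stderr, "parse error: %s\n", msg); }
%}
%token IDENT NUMBER
%%
/* A list is a single expression, or a shorter list followed by a comma
   and one more expression; the rule referring to itself is what makes
   lists of any length possible. */
expr_list:
      expr
    | expr_list ',' expr
    ;

expr:
      IDENT
    | NUMBER
    ;
%%
/* Stub lexer so the file links on its own; it reports end-of-input
   immediately. A real parser calls the flex-generated yylex() instead. */
int yylex(void) { return 0; }

int main(void) { return yyparse(); }

The generated yyparse() asks yylex() for tokens one at a time and matches
them against the rules; gram.y follows the same pattern, just with a few
hundred rules instead of two.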

The actual lexer is generated from scan.l by a tool called flex. You can find
the manual at http://flex.sourceforge.net/manual/

The actual parser is generated from gram.y by a tool called bison. You can find
the manual at http://www.gnu.org/s/bison/.

Beware, though, that you'll have a rather steep learning curve ahead of you
if you've never used flex or bison before.

best regards,
Florian Pflug

#4 Alvaro Herrera
alvherre@commandprompt.com
In reply to: Florian Pflug (#3)
Re: Postgresql parser

Excerpts from Florian Pflug's message of Tue Sep 27 08:28:00 -0300 2011:

On Sep27, 2011, at 10:44 , andurkar wrote:

Currently I am working on PostgreSQL... I need to study the gram.y and
scan.l parser files, since I want to do some query modification. Can anyone
please help me to understand the files? What should I do? Is there any
documentation available?

scan.l defines the lexer, i.e. the algorithm that splits a string (containing
an SQL statement) into a stream of tokens. A token is usually a single word
(i.e., doesn't contain spaces but is delimited by spaces), but can also be
a whole single or double-quoted string for example. The lexer is basically
defined in terms of regular expressions which describe the different token types.

Seemed like a good answer, so I added it to the developer's FAQ:
http://wiki.postgresql.org/wiki/Developer_FAQ#I_need_to_do_some_changes_to_query_parsing._Can_you_succintly_explain_the_parser_files.3F

Feel free to edit.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support