This repository provides tools for extracting SQL SELECT statements from the Posts.xml file and analyzing them. Posts.xml
is the Stack Overflow data dump file that contains all questions and answers in XML format.
Pass the path to the Posts.xml file to the postsxml_sql_extract.py program. The program generates a sqlcommands.txt file
with one SQL SELECT statement per line.
python postsxml_sql_extract.py <path_to_posts.xml_file>
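For orientation, the extraction step could look roughly like the sketch below. It is not the code in postsxml_sql_extract.py. It assumes that each post is a `row` element whose `Body` attribute holds the post HTML, and that SQL appears inside `<code>` blocks. The helper names `extract_selects` and `iter_selects` are hypothetical.

```python
import html
import re
import xml.etree.ElementTree as ET

# Assumption: SQL snippets live inside <code>...</code> in the post body HTML.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL | re.IGNORECASE)


def extract_selects(body):
    """Return the SELECT statements found in one post body, one line each."""
    statements = []
    for block in CODE_BLOCK.findall(body):
        # Decode HTML entities (&gt; and friends) and collapse all whitespace,
        # so that every statement fits on a single output line.
        text = " ".join(html.unescape(block).split())
        if text.upper().startswith("SELECT "):
            statements.append(text)
    return statements


def iter_selects(posts_xml_path):
    """Stream Posts.xml and yield SELECT statements without loading the whole file."""
    for _event, element in ET.iterparse(posts_xml_path, events=("end",)):
        if element.tag == "row":
            yield from extract_selects(element.get("Body", ""))
        # Release the element's attributes to keep memory use low on a large dump.
        element.clear()
```

Writing each value yielded by `iter_selects` on its own line would produce a file with the same shape as sqlcommands.txt. Streaming with `iterparse` is chosen here because Posts.xml is typically too large to parse into memory at once.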
Creating sqlcommands.txt is a mandatory step for any subsequent SQL SELECT analysis. Once sqlcommands.txt
is created, you can collect various statistics from it.
The following analysis computes two statistics:
- The number of window functions per chunk of one thousand queries.
- The number of window function types.
python sql_WF_analysis.py sqlcommands.txt
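The two statistics can be approximated with a sketch like the one below. It is not the code in sql_WF_analysis.py. It assumes that a window function can be recognized by a `name(...) OVER (` pattern, which is a regex heuristic and not a full SQL parser, so calls with nested parentheses in their arguments are missed. The function name `window_function_stats` is hypothetical.

```python
import re
from collections import Counter

# Heuristic: a function call with no nested parentheses, followed by OVER (.
WINDOW_CALL = re.compile(r"\b(\w+)\s*\([^()]*\)\s*OVER\s*\(", re.IGNORECASE)


def window_function_stats(queries, chunk_size=1000):
    """Return (window function count per chunk, count per function name)."""
    per_chunk = []
    per_type = Counter()
    for index, query in enumerate(queries):
        if index % chunk_size == 0:
            per_chunk.append(0)  # start a new chunk of chunk_size queries
        names = [name.upper() for name in WINDOW_CALL.findall(query)]
        per_chunk[-1] += len(names)
        per_type.update(names)
    return per_chunk, per_type
```

To try it on real data, open sqlcommands.txt and pass the file object as `queries`, since iterating over a text file yields one line, and therefore one query, at a time.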