Abstract syntax trees #43

Merged: 23 commits, Jan 3, 2024

johan-t (Contributor) commented on Jan 1, 2024

What issue did you fix?

I fixed two detection problems: we could not recognise complex nested functions or functions declared in unconventional ways. I also improved performance.

Fixes: #41
Fixes: #22
Fixes: #10

Quick description of your approach:

Instead of finding function names and full functions with regular expressions, I now use abstract syntax trees (ASTs). Unfortunately, the Python library for parsing JavaScript is outdated, so I had to create a JavaScript subprocess that uses Babel for parsing.
I also had to update the tests, because they did not contain valid JavaScript and therefore could not be parsed into an AST.
Finally, I improved performance considerably by no longer parsing the same commit multiple times.
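
The approach described above could be sketched in Python roughly as follows. This is only an illustration, not the code in this PR: the script name parse_functions.js, the helper names, and the JSON reply format are all assumptions. The Node side is assumed to read JavaScript source from stdin, parse it with Babel, and print a JSON list of the functions it finds.

```python
import json
import subprocess

# Hypothetical command: the PR does not name the Node script it launches.
DEFAULT_PARSER_CMD = ("node", "parse_functions.js")

# Assumed cache layout: commit hash -> parsed function records.
_parsed_commits = {}


def parse_functions(source, parser_cmd=DEFAULT_PARSER_CMD):
    """Send JavaScript source to the parser process and decode its JSON reply."""
    result = subprocess.run(
        list(parser_cmd),
        input=source,          # source is passed on stdin
        capture_output=True,   # the JSON reply is read from stdout
        text=True,
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return json.loads(result.stdout)


def functions_for_commit(commit_hash, source, parser_cmd=DEFAULT_PARSER_CMD):
    """Parse each commit at most once by caching the result under its hash."""
    if commit_hash not in _parsed_commits:
        _parsed_commits[commit_hash] = parse_functions(source, parser_cmd)
    return _parsed_commits[commit_hash]
```

Because the parser command is a parameter, the bridge can be exercised without Node by passing a stand-in command that prints JSON. A repeated call with the same commit hash returns the cached result without starting a new subprocess.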

How to run the code?

Clone the testRepo to get the updated tests. Run npm install to install the necessary dependencies, then run either python getFunctionData.py or python test_getFunctionData.py.

Checklist before requesting a review:

  • Remove all unnecessary debug print statements
  • Make sure all tests pass
  • Make sure there are enough comments to understand your code

johan-t self-assigned this on Jan 1, 2024
johan-t added the "Data Improvement" label (Improving the quality of our training data) on Jan 1, 2024
johan-t merged commit 0fc338b into main on Jan 3, 2024
1 check passed
Development

Successfully merging this pull request may close these issues.

  • Runtime performance
  • Avoid false positives
  • Replace regex with ATS
1 participant