Added a new parser for parsing BI500 files #1252

Draft
wants to merge 6 commits into
base: main


praneethratna
Contributor

Draft PR for BI500 parser code.

CC @leewujung @jmjech

@codecov-commenter

codecov-commenter commented Jan 2, 2024

Codecov Report

Attention: Patch coverage is 20.56075% with 85 lines in your changes missing coverage. Please review.

Project coverage is 46.77%. Comparing base (529fa60) to head (0cf806b).
Report is 170 commits behind head on main.

Files with missing lines | Patch % | Lines
echopype/convert/parse_bi500.py | 20.00% | 84 Missing ⚠️
echopype/core.py | 50.00% | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1252       +/-   ##
===========================================
- Coverage   83.52%   46.77%   -36.75%     
===========================================
  Files          64       63        -1     
  Lines        5686     5772       +86     
===========================================
- Hits         4749     2700     -2049     
- Misses        937     3072     +2135     
Flag Coverage Δ
unittests 46.77% <20.56%> (-36.75%) ⬇️


@jmjech

jmjech commented Jan 4, 2024

Hi @praneethratna & @leewujung:
Thanks for the initial start on reading BI500 data! Great start.

I copied your parse_bi500.py and core.py to my ~/echopype/convert/ and ~/echopype directories and ran my code:

import echopype as ep
from echopype import open_raw
from pathlib import Path

filename = Path('/home/mjech/NOAA_Gdrive/sonarpros/IDL_Programs/testdata/singlefile/N031-S445-S2000008-F011990-T01-D20000913-T103336-Data')
EKmodel = 'BI500' 
ed = open_raw(str(filename), sonar_model=EKmodel)

You'll notice that I use pathlib's Path for my files. I like it, but to get the filename into echopype I need to convert it to a string. I mention this because I ran into some errors related to the file name in parse_bi500.py and core.py. I think they came up because BI500 files do not have a "true" suffix, which seems to be required by some of the initial file name and file path checks done in parse_bi500 and maybe ParseBase.
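To illustrate the suffix issue, here is a minimal standard-library sketch that is independent of echopype. The helper name `split_bi500_name` is hypothetical, and it assumes the file kind ("Data", "Ping", and so on) is whatever follows the last hyphen in the name. It shows that pathlib reports an empty suffix for a BI500 file name, so any check based on the extension has nothing to work with.

```python
from pathlib import Path


def split_bi500_name(path):
    """Split a BI500-style file name into (prefix, kind).

    Hypothetical helper for illustration: assumes the kind is the text
    after the last hyphen in the file name.
    """
    prefix, _, kind = Path(path).name.rpartition("-")
    return prefix, kind


name = Path("N031-S445-S2000008-F011990-T01-D20000913-T103336-Data")

# There is no dot in the name, so pathlib finds no suffix at all.
print(repr(name.suffix))

# The kind has to come from the trailing "-Data" part instead.
print(split_bi500_name(name))
```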

Here is what I did to get around these errors:

  1. In core.py, I changed the BI500 entry in the sonar model dict to "validate_ext": validate_ext('') instead of the None that was there. (I use single quotes where you use double quotes, so yours would be validate_ext("").) That prevented an error saying that None types were not valid.
  2. In parse_bi500.py I needed to
    a. insert from pathlib import Path
    b. reorganize the self declarations in the init section to:
        self.file_types = FILE_TYPES
        self.timestamp_pattern = FILENAME_DATETIME_BI500
        self.file_type_map = defaultdict(None)

        self.parameters = defaultdict(list)
        self.ping_counts = defaultdict(list)
        self.vlog_counts = defaultdict(list)
        self.index_counts = defaultdict(list)
        self.unpacked_data = defaultdict(list)
        self.sonar_type = "BI500"

        self.fsmap = self._validate_folder_path(file)
        self.index_file = self._get_index_file(self.fsmap)

You'll notice that I had to set all of these attributes before the two "file" lines at the end (_validate_folder_path and _get_index_file), because those functions rely on the attributes and they had not yet been set when the functions ran.
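The ordering problem can be reproduced outside echopype. The sketch below uses two hypothetical stand-in classes (`BrokenParser` and `FixedParser` are not part of the PR) to show why a helper called from `__init__` fails when it reads an attribute that is assigned further down.

```python
from collections import defaultdict


class BrokenParser:
    """Hypothetical stand-in: calls a helper before its inputs are set."""

    def __init__(self, file):
        self.fsmap = self._validate(file)        # reads self.file_types too early
        self.file_types = ["-Data", "-Info"]

    def _validate(self, file):
        return [file + kind for kind in self.file_types]


class FixedParser(BrokenParser):
    """Same helper, but every attribute it needs is assigned first."""

    def __init__(self, file):
        self.file_types = ["-Data", "-Info"]
        self.parameters = defaultdict(list)
        self.fsmap = self._validate(file)        # safe: file_types already exists


try:
    BrokenParser("prefix")
except AttributeError as err:
    print("broken order:", err)

print(FixedParser("prefix").fsmap)
```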

In addition, in the _validate_folder_path(self, folder_path) function, I used Path to get the parent folder:

    def _validate_folder_path(self, folder_path):
        """Validate the folder path."""
        # A file path is passed in, so use its parent folder
        folder_path = str(Path(folder_path).parent)
        fsmap = fsspec.get_mapper(folder_path, **self.storage_options)
        try:
            all_files = fsmap.fs.ls(folder_path)
        except NotADirectoryError:
            raise ValueError(
                "Expecting a folder containing at least '-Data' and '-Info' files, "
                f"but got {folder_path}"
            )

It seems I needed to do this because the BI500 files don't have a suffix.
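For reference, the same idea can be sketched with the standard library alone (no fsspec). The helper `find_companion_files` and the shortened file names are made up for illustration. It assumes the companion files sit in the same folder as the given file and differ only in the text after the last hyphen; the required kinds are taken from the error message above.

```python
import tempfile
from pathlib import Path

REQUIRED_KINDS = ("-Data", "-Info")  # from the error message in the PR code


def find_companion_files(data_file):
    """Return {kind: path} for files sharing the prefix of `data_file`.

    Hypothetical helper: looks in the parent folder of the given file.
    """
    data_file = Path(data_file)
    prefix = data_file.name.rpartition("-")[0]
    found = {
        "-" + p.name.rpartition("-")[2]: p
        for p in data_file.parent.iterdir()
        if p.name.startswith(prefix + "-")
    }
    missing = [kind for kind in REQUIRED_KINDS if kind not in found]
    if missing:
        raise ValueError(f"Missing {missing} in {data_file.parent}")
    return found


# Build a throwaway folder with made-up file names to try it out.
with tempfile.TemporaryDirectory() as tmp:
    for kind in ("Data", "Info", "Ping"):
        (Path(tmp) / f"N031-T103336-{kind}").touch()
    files = find_companion_files(Path(tmp) / "N031-T103336-Data")
    print(sorted(files))
```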

I've gotten over the initial humps, but now I get some other errors that may be better for you to look into. Here are the error messages:

error Traceback (most recent call last)
~/NOAA_Gdrive/sonarpros/Python_Programs/EK_ES/test_BI500.py in
15 filename = Path('/home/mjech/NOAA_Gdrive/sonarpros/IDL_Programs/testdata/singlefile/N031-S445-S2000008-F011990-T01-D20000913-T103336-Data')
16 EKmodel = 'BI500'
---> 17 ed = open_raw(str(filename), sonar_model=EKmodel)
18
19 '''

~/.local/lib/python3.10/site-packages/echopype/utils/prov.py in inner(*args, **kwargs)
235 @functools.wraps(func)
236 def inner(*args, **kwargs):
--> 237 dataobj = func(*args, **kwargs)
238 if is_echodata:
239 ed = dataobj

~/.local/lib/python3.10/site-packages/echopype/convert/api.py in open_raw(raw_file, sonar_model, xml_path, convert_params, storage_options, use_swap, max_chunk_size)
421 )
422 # Actually parse the raw datagrams from source file
--> 423 parser.parse_raw()
424
425 # Direct offload to zarr and rectangularization only available for some sonar models

~/.local/lib/python3.10/site-packages/echopype/convert/parse_bi500.py in parse_raw(self)
240 self.unpacked_data["pelagic"].append(unpacked_data[:PELAGIC_COUNT])
241 self.unpacked_data["bottom"].append(
--> 242 unpacked_data[PELAGIC_COUNT : PELAGIC_COUNT + BOTTOM_COUNT]
243 )
244 for trace_num in range(TRACE_COUNT):

error: unpack requires a buffer of 1324 bytes

These errors seem to be associated with the actual reading and parsing of the data file.
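This kind of struct.error is raised whenever the buffer passed to struct.unpack is not exactly the size implied by the format string. The sketch below uses a made-up record layout (`RECORD_FORMAT` is hypothetical and is not the real BI500 format) to show how struct.calcsize can be used to check the slice length before unpacking, which gives a more informative message when an offset or count is wrong.

```python
import struct

# Hypothetical record layout for illustration only; not the real BI500 format.
RECORD_FORMAT = "<4h"                          # four little-endian 16-bit ints
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)   # number of bytes unpack expects


def unpack_record(buffer, offset):
    """Unpack one record, failing with a clear message on a short slice."""
    chunk = buffer[offset : offset + RECORD_SIZE]
    if len(chunk) != RECORD_SIZE:
        raise ValueError(
            f"Expected {RECORD_SIZE} bytes at offset {offset}, got {len(chunk)}"
        )
    return struct.unpack(RECORD_FORMAT, chunk)


data = struct.pack(RECORD_FORMAT, 1, 2, 3, 4)
print(unpack_record(data, 0))

try:
    struct.unpack(RECORD_FORMAT, data[:5])     # too short: raises struct.error
except struct.error as err:
    print("struct.error:", err)
```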

If this is unclear, I can send my revised parse_bi500.py and core.py code.

Thanks!
mike

@praneethratna
Contributor Author

praneethratna commented Jan 5, 2024

Hey @jmjech, thanks for testing the code locally on sample data and letting me know about the errors.

  1. I used None earlier because, as you mentioned, the BI500 files don't have a true suffix, so such a check isn't necessary. I have changed it to validate_ext("") as suggested, and it works fine now.
  2. I have also rearranged the lines in __init__ to fix the initialisation errors.
  3. The error that followed was caused by using the offset and count values from the -Vlog file instead of the -Ping file when unpacking the -Data file (the number of pings differs between the two files, as discussed), and also by a mistake in the START_FORMAT value, which I have corrected.

You can now pull the latest changes, and the parser part should work. Since set_groups_bi500.py is not set up yet, we cannot test the code through open_raw, but the parser itself can be tested as follows:

>>> from echopype.convert.parse_bi500 import ParseBI500
>>> parser = ParseBI500(str('/home/praneeth/echopype/test_ek500'), None)
>>> parser.parse_raw()

where test_ek500 is a folder containing the -Data, -Ping, -Vlog, -Info, -Work, and -Snap files for the prefix N031-S445-S2000008-F011990-T01-D20000913-T103336. The parsed contents of the -Data file are available in parser.unpacked_data, and those of the other files in parser.parameters.

@jmjech
Copy link

jmjech commented Jan 10, 2024

Hi @praneethratna. I successfully imported BI500 data and was able to produce a couple of echograms! One thing that still needs to be done is to convert the values read from the data file to dB. You do that by multiplying them by 10*log10(2)/256.
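As a quick sketch of that conversion (the function name `counts_to_db` is hypothetical and this is not code from the PR), the scale factor works out to roughly 0.0118 dB per raw count, so a raw value of 256 corresponds to 10*log10(2), or about 3.01 dB:

```python
import math

# Scale factor quoted in the comment above: dB per raw count.
DB_PER_COUNT = 10 * math.log10(2) / 256


def counts_to_db(raw_values):
    """Convert raw values read from the data file to dB (hypothetical helper)."""
    return [value * DB_PER_COUNT for value in raw_values]


print(counts_to_db([0, 256, 512]))
```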

@leewujung leewujung added this to the v0.9.0 milestone Mar 14, 2024
@ctuguinay ctuguinay modified the milestones: v0.9.0, v0.9.1 Apr 30, 2024
5 participants