Ensure that relative imports can be imported without requiring ./ in front of the import file name #350

Open: ialarmedalien wants to merge 1 commit into main

ialarmedalien

fixes linkml/linkml#2422

Ensures that relative imports work correctly without having ./ in front of the import file name.

@ialarmedalien
Author

ialarmedalien commented Jan 31, 2025

@sierra-moxon any chance of a review? should be pretty simple.

@ialarmedalien ialarmedalien force-pushed the bugfix/issue-2422 branch 2 times, most recently from 151aa7c to adb8ed1 Compare January 31, 2025 20:36
@sierra-moxon sierra-moxon self-requested a review February 4, 2025 17:54
@@ -303,7 +303,7 @@ def imports_closure(self, imports: bool = True, traverse: Optional[bool] = None,
 # - subdir/types.yaml
 # we should treat the two `types.yaml` as separate schemas from the POV of the
 # origin schema.
-if sn.startswith('.') and ':' not in i:
+if '/' in sn and ':' not in i:
Contributor

I think you want an additional check for something being an absolute path here:

Suggested change
-if '/' in sn and ':' not in i:
+if '/' in sn and ':' not in i and not PurePath(sn).is_absolute():

but otherwise this seems fine. the ':' check still protects against this breaking other protocol schemes.

only downside i can think of is that by allowing relative paths without a leading ./ or ../, they become less explicit, but that shouldn't affect the way imports are normally used now. maybe a tiny chance of some weird json <-> yaml escaped slash problem, but that would be a json parsing bug.
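
For readers following along, here is a minimal sketch of the kind of check being discussed, assuming the goal is to resolve an import relative to the schema that imports it. The helper name `resolve_import` and its parameters are hypothetical and are not the actual `imports_closure` code; `posixpath` is used only so the result does not depend on the host OS.

```python
import posixpath


def resolve_import(importing_schema: str, imported: str) -> str:
    """Hypothetical helper: resolve `imported` relative to `importing_schema`.

    Mirrors the shape of the condition in the diff: the importing schema
    name contains a path separator, and the import is not a CURIE or URL.
    """
    if "/" in importing_schema and ":" not in imported:
        parent = posixpath.dirname(importing_schema)
        # normpath collapses "./" and "../" segments
        return posixpath.normpath(posixpath.join(parent, imported))
    return imported


# No leading "./" needed on the importing schema's name
nested = resolve_import("subdir/main", "types")            # "subdir/types"
# Schema in the current directory: the import is left untouched
flat = resolve_import("main", "types")                     # "types"
# CURIE-style imports are protected by the ':' check
curie = resolve_import("subdir/main", "linkml:types")      # "linkml:types"
# Parent-directory references are normalised
parent_ref = resolve_import("./subdir/main", "../types")   # "types"
```

With the older `startswith('.')` form of the condition, the first case would have been skipped because `subdir/main` does not begin with a dot.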

Author

Can you suggest a test case to add where the importer will fail without the test for an absolute path? I tried to add as many variants of path representation as possible to the tests so I'd like to have at least one path that would fail if the suggested addition was not in place.

Contributor

sure - should be any second-level schema imported via absolute path. one second

Contributor

oh! pleasantly surprised to learn that an absolute path causes the thing it's being appended to to be ignored. learned something new :). Disregard this.
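
The stdlib behaviour referred to here can be seen in a short sketch. The directory and file names below are made up for illustration; the point is only that both `pathlib` and `os.path.join` drop the left-hand side when the right-hand side is absolute, which is why an explicit `is_absolute()` guard turns out to be unnecessary.

```python
import posixpath
from pathlib import PurePosixPath

# Illustrative paths only; not taken from the linkml-runtime test suite
base = PurePosixPath("schemas/subdir")

# Joining a relative path appends it to the base
relative_joined = str(base / "types.yaml")

# Joining an absolute path discards the base entirely
absolute_joined = str(base / "/opt/shared/types.yaml")

# posixpath.join (os.path.join on POSIX systems) behaves the same way
os_joined = posixpath.join("schemas/subdir", "/opt/shared/types.yaml")

print(relative_joined)  # schemas/subdir/types.yaml
print(absolute_joined)  # /opt/shared/types.yaml
print(os_joined)        # /opt/shared/types.yaml
```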

Author

Thanks for checking - glad I could provide a surprising learning experience for the day!

@dalito
Member

dalito commented Feb 10, 2025

It seems that it breaks Windows but you don't see it because #360 is not yet merged.

Also running your PR-branch locally on Win10 gives an error:

(.venv) λ pytest -k schemaview
======================================= test session starts =======================================
platform win32 -- Python 3.10.11, pytest-8.0.1, pluggy-1.4.0
rootdir: C:\Users\dlinke\MyProg_local\gh-dalito\linkml-runtime
collected 663 items / 630 deselected / 33 selected

tests\test_issues\test_linkml_runtime_issue_1317.py .s.                                      [  9%]
tests\test_utils\test_schemaview.py ............F.s...............                           [100%]

============================================ FAILURES =============================================
______________________________________ test_imports_relative ______________________________________

    def test_imports_relative():
        """Relative imports from relative imports should evaluate relative to the *importing* schema."""
        sv = SchemaView(SCHEMA_RELATIVE_IMPORT_TREE)
>       closure = sv.imports_closure(imports=True)

tests\test_utils\test_schemaview.py:524:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
linkml_runtime\utils\schemaview.py:281: in imports_closure
    imported_schema = self.load_import(sn)
linkml_runtime\utils\schemaview.py:227: in load_import
    schema = load_schema_wrap(sname + '.yaml', base_dir=base_dir)
linkml_runtime\utils\schemaview.py:84: in load_schema_wrap
    schema = yaml_loader.load(path, target_class=SchemaDefinition, **kwargs)
linkml_runtime\loaders\loader_root.py:76: in load
    results = self.load_any(*args, **kwargs)
linkml_runtime\loaders\yaml_loader.py:41: in load_any
    data_as_dict = self.load_as_dict(source, base_dir=base_dir, metadata=metadata)
linkml_runtime\loaders\yaml_loader.py:27: in load_as_dict
    data = self._read_source(source, base_dir=base_dir, metadata=metadata, accept_header="text/yaml, application/yaml;q=0.9")
linkml_runtime\loaders\loader_root.py:167: in _read_source
    data = hbread(source, metadata, base_dir, accept_header)
.venv\lib\site-packages\hbreader\__init__.py:260: in hbread
    with hbopen(source, open_info, base_path, accept_header, is_actual_data, read_codec) as f:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

source = './four.yaml'
open_info = FileInfo(source_file=None, source_file_date=None, source_file_size=None, base_path='C:\\Users\\dlinke\\MyProg_local\\gh-dalito\\linkml-runtime\\tests\\test_utils\\input\\imports_relative\\L0_0\\L1_0_0')
base_path = 'C:\\Users\\dlinke\\MyProg_local\\gh-dalito\\linkml-runtime\\tests\\test_utils\\input\\imports_relative\\L0_0\\L1_0_0'
accept_header = 'text/yaml, application/yaml;q=0.9'
is_actual_data = <function default_str_tester at 0x00000199BCE57760>, read_codec = None

    def hbopen(source: HB_TYPE,
               open_info: Optional[FileInfo] = None,
               base_path: Optional[str] = None,
               accept_header: Optional[str] = None,
               is_actual_data: Optional[Callable[[str], bool]] = default_str_tester,
               read_codec: str = None) -> TextIO:
        """
        Return an open IO representation of source
        :param source: anything that can be construed to be a string, a URL, a file name or an open file handle
        :param open_info: what we learned about source in the process of converting it
        :param base_path: Base to use if source is a relative URL or file name
        :param accept_header: Accept header to use if it turns out to be a URL
        :param is_actual_data: Function to differentiate plain text from URL or file name
        :param read_codec: Name of codec to use if bytes being read. (URL only)
        :return: TextIO representation of open file
        """
        source_type = detect_type(source, base_path, is_actual_data)
        if source_type is HBType.STRINGABLE:
            source_as_string = str(source)
        elif source_type is HBType.DECODABLE:
            # TODO: Tie this into the autodetect machinery
            source_as_string = source.decode()
        elif source_type is HBType.STRING:
            source_as_string = source
        else:
            source_as_string = None

        # source is a URL or a file name
        if source_as_string:
            if open_info:
                assert open_info.source_file is None, "source_file parameter not allowed if data is a file or URL"
                assert open_info.source_file_date is None, "source_file_date parameter not allowed if data is a file or URL"
                open_info.source_file_size = len(source_as_string)
            return StringIO(source_as_string)

        if source_type is HBType.URL:
            url = source if '://' in source else urljoin(base_path + ('' if base_path.endswith('/') else '/'),
                                                         source, allow_fragments=True)
            req = Request(quote(url, '/:'))
            if accept_header:
                req.add_header("Accept", accept_header)
            try:
                response = urlopen(req, context=ssl._create_unverified_context())
            except HTTPError as e:
                # This is here because the message out of urllib doesn't include the file name
                e.msg = f"{e.filename}"
                raise e
            if open_info:
                open_info.source_file = response.url
                open_info.source_file_date = response.headers['Last-Modified']
                if not open_info.source_file_date:
                    open_info.source_file_date = response.headers['Date']
                open_info.source_file_size = response.headers['Content-Length']
                parts = urlsplit(response.url)
                open_info.base_path = urlunsplit((parts.scheme, parts.netloc, os.path.dirname(parts.path),
                                                 parts.query, None))
            # Auto convert byte stream to
            return _to_textio(response, response.fp.mode, read_codec)

        if source_type is HBType.FILENAME:
            if not base_path:
                fname = os.path.abspath(source)
            else:
                fname = source if os.path.isabs(source) else os.path.abspath(os.path.join(base_path, source))
>           f = open(fname, encoding=read_codec if read_codec else 'utf-8')
E           FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\dlinke\\MyProg_local\\gh-dalito\\linkml-runtime\\tests\\test_utils\\input\\imports_relative\\L0_0\\L1_0_0\\four.yaml'

.venv\lib\site-packages\hbreader\__init__.py:206: FileNotFoundError

===================================== short test summary info =====================================
FAILED tests/test_utils/test_schemaview.py::test_imports_relative - FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\dlinke\\MyProg_local\\gh-da...
============== 1 failed, 30 passed, 2 skipped, 630 deselected, 37 warnings in 9.81s ===============
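
One plausible explanation for a Windows-only failure like this, not confirmed anywhere in this thread, is that path normalisation on Windows produces backslashes, so a check for `'/'` in the schema name no longer matches. The sketch below only demonstrates that stdlib behaviour; the schema name is hypothetical (the directory names are borrowed from the traceback), and `ntpath` is used so the example runs on any OS.

```python
import ntpath
from pathlib import PureWindowsPath

# Hypothetical schema name; only the directory names come from the traceback
schema_name = "./L0_0/L1_0_0/main"

# ntpath is the Windows flavour of os.path and can be imported on any platform
windows_normalized = ntpath.normpath(schema_name)

# After normalisation there are no forward slashes left to detect
has_forward_slash = "/" in windows_normalized

# Converting back to POSIX-style separators restores a form the check can match
as_posix = PureWindowsPath(windows_normalized).as_posix()

print(repr(windows_normalized))  # 'L0_0\\L1_0_0\\main'
print(has_forward_slash)         # False
print(as_posix)                  # L0_0/L1_0_0/main
```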

@ialarmedalien
Author

@dalito is #360 going to be merged? Not sure what the order of precedence of these PRs is.

@dalito
Member

dalito commented Feb 10, 2025

I have asked on dev-channel if #360 can be merged.

@cmungall
Member

#360 now merged! I think it's best if one of you two handle the resulting conflict...

@ialarmedalien
Author

@dalito it looks like the GA tests passed; would you be able to see if they pass locally on Win10? I don't have a windows box available.

@dalito
Member

dalito commented Feb 18, 2025

@ialarmedalien - The tests pass now on Win10. Great!

Contributor

@sneakers-the-rat sneakers-the-rat left a comment

thanks for fixing this. the ./ check always felt fragile to me. always love it when the fix is so simple because the stdlib tools are better than i knew they were!

no idea why upstream tests are failing, but doesn't look related to this. approving for the substance of the PR and hopefully that's uh just a temporary glitch

@ialarmedalien
Author

I don't have merge privs so can someone who does press the magic button (or make a mental note to press the magic button if there are other PRs that depend on this)?

@dalito @sneakers-the-rat

@sierra-moxon
Member

Before we merge, let's make sure we address the upstream failing tests (as an out here, we could post a note in this PR explaining why they are failing and get someone to fix them separately). (been burned so many times by merging without the green checks that I am def hesitant to do that - esp on Fridays! :P )

@sneakers-the-rat
Contributor

the failures are truly mysterious to me, it's swapping out owl:DataRange for rdfs:Datatype ... and i can't fathom how this PR would have affected that (except for on windows in python 3.12). can anyone with privs trigger a re-run of the failed actions to see if it was a weird fluke?

@dalito
Member

dalito commented Feb 21, 2025

I just triggered a re-run of failed jobs. Let's see.

@dalito
Member

dalito commented Feb 22, 2025

The mysterious errors are gone. The remaining errors are probably related to the changes made in #357.

Successfully merging this pull request may close these issues.

imports does not respect local directory
5 participants