
Does the scraper actually support basic auth? #78

Open
comatory opened this issue Jan 29, 2025 · 1 comment


comatory commented Jan 29, 2025

The documentation mentions two env variables:

  • DOCSEARCH_BASICAUTH_USERNAME
  • DOCSEARCH_BASICAUTH_PASSWORD

I was looking through the code to figure out how they are encoded into the Authorization header, so I can set up my private internal site correctly. However, the only mention I can find is in the documentation_spider.py file.

This file reads the environment variables but does not seem to do anything with them. I see they are assigned to the class properties http_user and http_pass, so I searched the codebase for those names, but did not find anything else.
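For reference, the wiring I'm describing looks roughly like this (my paraphrase of documentation_spider.py, not the exact code; the base class is a guess on my part):

```python
import os

from scrapy.spiders import CrawlSpider


class DocumentationSpider(CrawlSpider):
    # The two documented env variables end up as plain class attributes,
    # and nothing else in this repository appears to read them back.
    http_user = os.environ.get("DOCSEARCH_BASICAUTH_USERNAME", None)
    http_pass = os.environ.get("DOCSEARCH_BASICAUTH_PASSWORD", None)
```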

Am I right to assume that this does not actually work? Is the documentation lying or did I miss some important piece?

Any clarification would help. Meanwhile, I hope #76 gets merged; that way I could specify the Authorization header directly without relying on this implementation.


comatory commented Jan 29, 2025

Basically, I'd expect the scraper to:

  1. Dispatch every request with an Authorization header whenever these two variables are provided.
  2. Build that header as HTTP Basic auth, i.e. username:password encoded as base64.
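
Concretely, something along these lines (a minimal sketch of what I mean, not a reference to any actual scraper code):

```python
import base64
import os

# Build the header I'd expect the scraper to attach to every request,
# from the two documented environment variables.
username = os.environ["DOCSEARCH_BASICAUTH_USERNAME"]
password = os.environ["DOCSEARCH_BASICAUTH_PASSWORD"]

token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
headers = {"Authorization": f"Basic {token}"}
```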

I would also hope redirects are respected. For example, the domain I'm trying to reach internally redirects to an opaque domain (a Cloudflare Worker), so I'd need these headers to be sent there as well.
