New feature web parser with pattern #950

Open: alpgul wants to merge 4 commits into base: Omega

@alpgul alpgul commented Jan 31, 2025

Description

Implement WebStreamExtractor to support dynamic URL extraction from web sources in playlist loading. This includes:

  • New WebStreamExtractor utility class for extracting stream URLs
  • Updated PlaylistLoader to support '@' prefixed URLs with optional web pattern
  • Added CMakeLists.txt to include new source files
  • Added Windows build script for easier compilation

Missing Features

  • Header support for web requests is missing: the CFile.OpenFile function doesn't support custom request headers. What alternatives can be used?
  • In-depth web crawling is currently not supported.

Example Usage

  • usage with default pattern:
#EXTINF:-1 tvg-name="Test" , Test
@http://127.0.0.1:3000/index.html
  • usage with custom pattern:
#EXTINF:-1 tvg-name="Test1", Test1
#WEBPROP:web-regex="(https?://[^\"]+\.m3u8)"
@http://127.0.0.1:3000/index.html

The part of the pattern that matches the URL must be enclosed in parentheses (a capture group).
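
The two playlist forms above could be read line by line roughly as follows. This is only a sketch under assumptions, not the code in this PR: `WebEntry`, `Unquote` and `ParseWebLine` are hypothetical names, and the handling of surrounding quotes in `web-regex` is a guess based on the example.

```cpp
#include <string>
#include <string_view>

// Hypothetical holder for the web-related data of one playlist entry.
struct WebEntry
{
  std::string url;   // page URL with the leading '@' removed
  std::string regex; // value of web-regex; empty means "use the default pattern"
};

// Hypothetical helper: strips one pair of surrounding double quotes, if present.
std::string Unquote(std::string_view value)
{
  if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
    value = value.substr(1, value.size() - 2);
  return std::string(value);
}

// Feeds one playlist line into `entry`.
// Returns true only when the line is an '@' prefixed web URL.
bool ParseWebLine(std::string_view line, WebEntry& entry)
{
  constexpr std::string_view WEBPROP_MARKER{"#WEBPROP:"};
  constexpr std::string_view REGEX_KEY{"web-regex="};

  if (line.substr(0, WEBPROP_MARKER.size()) == WEBPROP_MARKER)
  {
    const std::string_view prop = line.substr(WEBPROP_MARKER.size());
    if (prop.substr(0, REGEX_KEY.size()) == REGEX_KEY)
      entry.regex = Unquote(prop.substr(REGEX_KEY.size()));
    return false;
  }

  if (!line.empty() && line.front() == '@')
  {
    entry.url = std::string(line.substr(1));
    return true;
  }
  return false; // any other line (e.g. #EXTINF) is not handled by this sketch
}
```

An entry without a `#WEBPROP:web-regex` line leaves `regex` empty, which a caller could treat as the default-pattern case.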

@phunkyfish
Member

phunkyfish commented Jan 31, 2025

Could you provide some examples? Preferably real world use cases.

The purpose of the feature is not clear to me from the PR description.

@alpgul
Author

alpgul commented Jan 31, 2025

It is used to search for media URLs in HTML pages using a regex pattern.
Example usage: Suppose an HTML page contains an m3u8 link. Define a regex that finds the media URL, then add the page URL and the regex to the playlist as shown in the example usage above.
Example HTML page:

<html>
...
url:"https://localhost:3000/index.m3u8"
...
</html>

Example regex: #WEBREGEX:url:"(https?://[^"]+\.m3u8)"
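
To illustrate how the pattern applies to the page, here is a minimal sketch using `std::regex` from the standard library. `ExtractFirstUrl` is a hypothetical name, and returning the first capture group of the first match is an assumption for illustration, not necessarily what WebStreamExtractor does.

```cpp
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: returns the first capture group of the first match,
// or std::nullopt when the pattern is invalid, has no capture group, or does not match.
std::optional<std::string> ExtractFirstUrl(const std::string& html, const std::string& pattern)
{
  try
  {
    const std::regex re(pattern); // ECMAScript grammar is the default
    std::smatch match;
    if (std::regex_search(html, match, re) && match.size() > 1 && match[1].matched)
      return match[1].str();
  }
  catch (const std::regex_error&)
  {
    // An invalid user-supplied pattern is treated like "no URL found".
  }
  return std::nullopt;
}
```

With the example page and the pattern `url:"(https?://[^"]+\.m3u8)"`, the capture group holds only the URL between the quotes, which is why the URL part has to sit inside parentheses.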

The current issue is that I can't send requests using custom headers. Which library should I use that works on all platforms? I'm considering using CURL, but does it work on all platforms?

@alpgul
Author

alpgul commented Feb 1, 2025

  • Added support for headers in web URL requests.

Example Usage

#EXTM3U
#EXTINF:-1 tvg-name="Test1", Test1
#WEBPROP:web-regex="([^"]+\.mp4)"
#WEBPROP:web-headers=user-agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/132.0.0.0 Safari/537.36&referer:https://google.com/
@https://test-videos.co.uk/bigbuckbunny/mp4-h264
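
Judging only from the example above, the `web-headers` value looks like a list of `name:value` items joined by `&`. A minimal sketch of splitting such a value follows; `HeaderList` and `ParseWebHeaders` are hypothetical names, and splitting each item at its first `:` is an assumption made so that values such as `https://google.com/` stay intact.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

using HeaderList = std::vector<std::pair<std::string, std::string>>;

// Hypothetical parser for a web-headers value of the form "name:value&name:value".
HeaderList ParseWebHeaders(std::string_view raw)
{
  HeaderList headers;
  while (!raw.empty())
  {
    // Cut off the next '&' separated item.
    const std::size_t amp = raw.find('&');
    const std::string_view item = raw.substr(0, amp);
    raw = (amp == std::string_view::npos) ? std::string_view{} : raw.substr(amp + 1);

    // Split at the first ':' only, so colons inside the value are preserved.
    const std::size_t colon = item.find(':');
    if (colon == std::string_view::npos || colon == 0)
      continue; // skip items that have no header name
    headers.emplace_back(std::string(item.substr(0, colon)),
                         std::string(item.substr(colon + 1)));
  }
  return headers;
}
```

One consequence of this format, if the assumption holds, is that a header value containing `&` cannot be expressed without some escaping rule.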

@phunkyfish
Member

How do you ensure only one URL is selected, and also ensure it is the correct one?

@phunkyfish
Member

Also note that new features must be tested and merged on the Piers branch before being considered for the Omega branch.

Member

@phunkyfish phunkyfish left a comment

Just a light first review. I'd suggest collapsing all commits into a single one, as many of the commits change the previous one, making it difficult to review.

Also, many of the files are missing a newline at the end of the file.

build-windows.bat (outdated)
src/iptvsimple/PlaylistLoader.cpp (outdated)
src/iptvsimple/PlaylistLoader.cpp
src/iptvsimple/utilities/WebStreamExtractor.cpp (outdated)
src/iptvsimple/utilities/WebStreamExtractor.h (outdated)
src/iptvsimple/PlaylistLoader.cpp (outdated)
src/iptvsimple/utilities/WebStreamExtractor.cpp (outdated)
src/iptvsimple/utilities/STRING.cpp (outdated)
@phunkyfish
Member

build-windows.bat should be a separate commit.

@alpgul alpgul force-pushed the newFeature-webParserWithPattern branch from 9d67352 to 6603db2 Compare February 7, 2025 11:57
@alpgul alpgul force-pushed the newFeature-webParserWithPattern branch from 6603db2 to 08a4ca9 Compare February 7, 2025 12:31
@alpgul
Author

alpgul commented Feb 7, 2025

> How do you ensure only one URL is selected, and also ensure it is the correct one?

Initially, the URL fetching process was in the playlist loader. However, now that it has been moved to "GetChannelStreamProperties" there should be no issues.

> build-windows.bat should be a separate commit.

It has been moved to a separate commit.

I merged all the commits.

Member

@phunkyfish phunkyfish left a comment

depends/common/zlib/01-build-static.patch (outdated)
depends/common/zlib/03-install-pkgconfig-in-lib.patch (outdated)
depends/common/zlib/zlib.sha256 (outdated)
src/IptvSimple.cpp (outdated)
src/IptvSimple.cpp (outdated)
src/iptvsimple/PlaylistLoader.cpp
Comment on lines 1 to 12
#include "Base64.h"

using namespace iptvsimple;
using namespace utilities;
namespace
{
constexpr char PADDING{'='};
constexpr std::string_view CHARACTERS{"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyz"
"0123456789+/"};
// clang-format off
constexpr unsigned char BASE64_TABLE[] = {
Member

What about using https://github.com/azawadzki/base-n as a dependency instead? It supports base64 out of the box, so we wouldn't need to carry this extra code.

As far as I can tell it's standard C++ so should be cross platform.

Author

@alpgul alpgul Feb 8, 2025

I pulled this code from inputstream.adaptive. I chose this to minimize the risk of errors. If base-n is more performant, can we first adapt the changes in inputstream.adaptive and then apply them here as well?
Reference Code Link

Member

I would probably take a different view on this. We should try the non-custom code approach in iptvsimple first and, if that proves to work correctly, then propose that dependency change for adaptive, not the other way around.

src/iptvsimple/utilities/CurlUtils.cpp (outdated)
src/iptvsimple/utilities/CurlUtils.cpp (outdated)
src/iptvsimple/utilities/CurlUtils.cpp (outdated)
@alpgul
Author

alpgul commented Feb 8, 2025

> Ok, review round 2.
>
> Please take note https://github.com/xbmc/xbmc/blob/master/docs/CODE_GUIDELINES.md
>
> https://github.com/xbmc/xbmc/blob/master/.clang-format

The .clang-format file is not working properly because when I apply automatic formatting, the content of the code changes completely. In other words, the formatting file is incompatible with the previous code style, and I have to manually fix the code every time. If you have one, could you share the .clang-format file you are using?


using namespace iptvsimple::utilities;

std::string WebStreamExtractor::ExtractStreamUrl(const std::string& webUrl,
Author

@alpgul alpgul Feb 8, 2025

This code is explained in this link

@phunkyfish
Member

> > Ok, review round 2.
> > Please take note https://github.com/xbmc/xbmc/blob/master/docs/CODE_GUIDELINES.md
> >
> > https://github.com/xbmc/xbmc/blob/master/.clang-format
>
> The .clang-format file is not working properly because when I apply automatic formatting, the content of the code changes completely. In other words, the formatting file is incompatible with the previous code style, and I have to manually fix the code every time. If you have one, could you share the .clang-format file you are using?

That .clang-format file is for xbmc. In addons we don't follow all of its conventions, just some of the more prominent ones that make code easier to read. I understand that doesn't make it black and white for contributors, but it's where we are currently.

@alpgul
Author

alpgul commented Feb 11, 2025

Removed Unused Methods in CurlUtils and Base64
