Filter out file extension - Printable Version

Filter out file extension - Kiwi_man82 - Jan-06-2017

Hi all,
I'm new here, so please be nice :). I'm also new to Python coding and still have a lot of learning to do.

I have written a simple Python script which takes an m3u playlist and converts it into an XML file that is playable with the livestreamer pro addon in Kodi. The thing is, I can't figure out how to pass only the .ts links through f4mproxy, and not any .mp4 / .mkv links etc.

import os, re

## Enter file location here ie: C:/temp/PythonTests
dir = ' '

## Enter name of playlist less .m3u extension
input_file = ' '
input_extn = '.m3u'
output_file = input_file+'.xml'

in_path = (dir)+'\\'+(input_file)+input_extn
out_path = (dir)+'\\'+(output_file)

in_file = open(in_path, 'r')
in_txt = in_file.read()

out_file = open(out_path, 'w')
m3u_regex = '#.+,(.+?)\n(.+?)\n'


link = in_txt
match = re.compile(m3u_regex).findall(link)
out_file.write ('<streamingInfos>\n\n')
for title, url in match:
    url = url.replace('&', '&amp;').replace('rtmp://$OPT:rtmp-raw=', '').strip()
    title = title.strip()
    out_file.write('<item>\n<title>' + title + '</title>\n<link>plugin://plugin.video.f4mTester/?streamtype=TSDOWNLOADER;name=' + title + '&amp;url=' + url + '</link>\n<thumbnail>' + ' ' + '</thumbnail>\n</item>\n\n')
    # print('<item>\n<title>' + title + '</title>\n<link>plugin://plugin.video.f4mTester/?streamtype=TSDOWNLOADER&amp;url=' + url + '</link>\n<thumbnail>' + ' ' + '</thumbnail>\n</item>\n\n')

out_file.close()
in_file.close()

RE: Filter out file extension - Mekire - Jan-06-2017

You could just use os.path.splitext to split off the extension and check it against an ignore set.

import os


files = ["some_file.png", "some_file.jpg", "some_file.bat",
         "some_file.exe", "some_file.dat"]

ignore = {".jpg", ".png"}


for filename in files:
    name, ext = os.path.splitext(filename)
    if ext not in ignore:
        print(filename)

RE: Filter out file extension - Kiwi_man82 - Jan-06-2017

(Jan-06-2017, 01:15 AM)Mekire Wrote: you could just use splitext to split off the extension and check against an ignore set.

How would this work with dots in the URL? For example: http://someurl.com:8000/username/password/file.ts

It would look for the extension after the first dot, yes? And not all URLs would be the same either. Some may use an IP address, which would have more dots in it, e.g. xxx.xxx.xxx.xxx:8080/username/password/file.ts
So I'd need to apply the filter only after the last / ?

RE: Filter out file extension - ichabod801 - Jan-06-2017

(Jan-06-2017, 02:45 AM)Kiwi_man82 Wrote: How would this work with dots in the URL? For example: http://someurl.com:8000/username/password/file.ts

It would look for the extension after the first dot, yes? And not all URLs would be the same either. Some may use an IP address, which would have more dots in it, e.g. xxx.xxx.xxx.xxx:8080/username/password/file.ts
So I'd need to apply the filter only after the last / ?

os.path.splitext splits on the last dot, not the first.
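
A quick way to check this is to run splitext on the two URL shapes mentioned in the question. This is only a small sketch; the `.mp4` URL and the `results` list are made up for illustration.

```python
import os

# Example URLs in the two shapes from the question (host name and IP style).
urls = [
    "Http://someurl.com:8000/username/password/file.ts",
    "xxx.xxx.xxx.xxx:8080/username/password/file.mp4",
]

results = []
for url in urls:
    # splitext only looks for a dot after the last "/", so the dots in
    # the host name or IP address stay in the first part of the tuple.
    root, ext = os.path.splitext(url)
    results.append((root, ext))
    print(repr(root), repr(ext))
```

The second element of each tuple is the extension including the leading dot, so it can be compared directly against a set such as `{".mp4", ".mkv"}`.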

RE: Filter out file extension - snippsat - Jan-06-2017

Or use os.path.basename to get just the file name as a string instead of a tuple.

>>> import os
>>> os.path.basename('Http://someurl.com:8000/username/password/file.ts')
'file.ts'
>>> os.path.basename('xxx.xxx.xxx.xxx:8080/username/password/file.ts')
'file.ts'
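
Two things the examples so far do not cover are extensions in mixed case (such as .Mp4) and URLs that carry a query string after the file name. A possible sketch that handles both is below. The names `DIRECT_EXTENSIONS`, `url_extension` and `is_direct_link` are hypothetical, and the `?token=abc` URL is an invented example, not something from the playlist in this thread.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical set of extensions that should NOT be sent through the proxy.
DIRECT_EXTENSIONS = {".mp4", ".mkv"}


def url_extension(url):
    """Return the lower-cased extension of the file a URL points to."""
    path = urlsplit(url).path            # leaves out any ?query or #fragment
    filename = posixpath.basename(path)  # text after the last "/"
    return posixpath.splitext(filename)[1].lower()


def is_direct_link(url):
    """True if the URL ends in one of the extensions in DIRECT_EXTENSIONS."""
    return url_extension(url) in DIRECT_EXTENSIONS


print(url_extension("Http://someurl.com:8000/username/password/file.ts"))
print(is_direct_link("http://someurl.com:8000/username/password/film.Mp4?token=abc"))
```

Lower-casing the extension before the set lookup means `.Mp4`, `.MP4` and `.mp4` are all treated the same way.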

RE: Filter out file extension - Kiwi_man82 - Jan-06-2017

Awesome, thanks guys. I'll give it a go and see how I get on. If I have any more questions, I know where to come. Thanks heaps!

RE: Filter out file extension - Kiwi_man82 - Jan-06-2017

Thanks guys, finally got it figured out.
Here is the final code

import os, re


## Enter file location here ie: C:/temp/PythonTests
dir = ' '

## Enter name of playlist less .m3u extension
input_file = ' '
input_extn = '.m3u'
output_file = input_file+'.xml'


in_path = (dir)+'\\'+(input_file)+input_extn
out_path = (dir)+'\\'+(output_file)

in_file = open(in_path, 'r')
in_txt = in_file.read()

out_file = open(out_path, 'w')
m3u_regex = '#.+,(.+?)\n(.+?)\n'

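## Extensions that should be written as direct links (not sent through f4mTester)
ignore = {'.mp4', '.mkv'}


link = in_txt
match = re.compile(m3u_regex).findall(link)
out_file.write('<streamingInfos>\n\n')
for title, url in match:
    url = url.replace('&', '&amp;').replace('rtmp://$OPT:rtmp-raw=', '').strip()
    title = title.strip()
    # splitext splits on the last dot; lower() so that .Mp4 / .MKV also match
    name, ext = os.path.splitext(url)
    if ext.lower() not in ignore:
        # everything except .mp4 / .mkv goes through f4mTester
        out_file.write('<item>\n<title>' + title + '</title>\n<link>plugin://plugin.video.f4mTester/?streamtype=TSDOWNLOADER&amp;url=' + url + '</link>\n<thumbnail>' + ' ' + '</thumbnail>\n</item>\n\n')
    else:
        # .mp4 / .mkv links are written as plain links
        out_file.write('<item>\n<title>' + title + '</title>\n<link>' + url + '</link>\n<thumbnail>' + ' ' + '</thumbnail>\n</item>\n\n')
# close the root element so the XML is well-formed
out_file.write('</streamingInfos>\n')
out_file.close()
in_file.close()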
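
For anyone reading later, the same conversion can also be sketched as a function that takes the playlist text and returns the XML text, which makes it easier to test without touching files. This is only one way to arrange it, not the script above: the names `m3u_to_xml`, `IGNORE` and `PROXY_PREFIX` are made up here, and using `xml.sax.saxutils.escape` on the title as well as the link is an added assumption about what the addon expects.

```python
import os
import re
from xml.sax.saxutils import escape

M3U_REGEX = re.compile(r'#.+,(.+?)\n(.+?)\n')
IGNORE = {'.mp4', '.mkv'}  # written as direct links
# escape() below turns the "&" into "&amp;", as in the script above.
PROXY_PREFIX = 'plugin://plugin.video.f4mTester/?streamtype=TSDOWNLOADER&url='


def m3u_to_xml(m3u_text):
    """Convert m3u playlist text into the <streamingInfos> XML text."""
    items = []
    for title, url in M3U_REGEX.findall(m3u_text):
        title = title.strip()
        url = url.replace('rtmp://$OPT:rtmp-raw=', '').strip()
        ext = os.path.splitext(url)[1].lower()
        # Direct link for ignored extensions, proxy link for everything else.
        link = url if ext in IGNORE else PROXY_PREFIX + url
        items.append('<item>\n<title>%s</title>\n<link>%s</link>\n'
                     '<thumbnail> </thumbnail>\n</item>\n\n'
                     % (escape(title), escape(link)))
    return '<streamingInfos>\n\n' + ''.join(items) + '</streamingInfos>\n'


sample = ('#EXTINF:-1,Channel One\nhttp://host:8000/u/p/1.ts\n'
          '#EXTINF:-1,Movie\nhttp://host:8000/u/p/film.Mp4\n')
print(m3u_to_xml(sample))
```

Reading and writing the files can then be done with `with open(in_path) as f:` and `with open(out_path, 'w') as f:` around a call to the function, so both files are closed even if something goes wrong halfway.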

RE: Filter out file extension - Kiwi_man82 - Jan-08-2017

Could I also use the same method to pull out, say, all the sports links and use them to create a sports XML that contains only sports links? I'm having fun learning Python and its capabilities. (I know it's not related to my original post, but I didn't want to start a new thread for a very similar question.)

RE: Filter out file extension - ichabod801 - Jan-08-2017

It's possible, but it depends on the links. Is there something in the links that identifies them as sports links? Is there something at the destination of the link that identifies it as a sports link? If there is, you could use that to gather only the sports links.

RE: Filter out file extension - Kiwi_man82 - Jan-08-2017

(Jan-08-2017, 04:09 PM)ichabod801 Wrote: It's possible, but it depends on the links. Is there something in the links that identifies them as sports links? Is there something at the destination of the link that identifies it as a sports link? If there is, you could use that to gather only the sports links.

I actually ended up playing with a regex and managed to do it that way :)
Thanks for the reply all the same.
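
The regex that was actually used is not shown in the thread. As a rough idea of how such a filter might look, here is a sketch that keeps only the (title, url) pairs whose title matches a pattern. The keywords in `SPORTS_PATTERN`, the function name `filter_entries` and the sample entries are all assumptions for illustration; it only works if the playlist titles really contain words like these.

```python
import re

# Hypothetical pattern: these keywords are guesses, adjust to the playlist.
SPORTS_PATTERN = re.compile(r'sport|football|rugby|cricket', re.IGNORECASE)


def filter_entries(entries, pattern=SPORTS_PATTERN):
    """Keep the (title, url) pairs whose title matches the pattern."""
    return [(title, url) for title, url in entries if pattern.search(title)]


# Made-up entries in the same (title, url) shape that findall() returns.
entries = [
    ("Sky Sport 1", "http://someurl.com:8000/username/password/1.ts"),
    ("News 24", "http://someurl.com:8000/username/password/2.ts"),
    ("Rugby Channel", "http://someurl.com:8000/username/password/3.ts"),
]

sports = filter_entries(entries)
for title, url in sports:
    print(title, url)
```

The filtered list can then be fed into the same loop that writes the `<item>` blocks, giving an XML file with only the matching links.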