removing portion of filename - file

I have done some searching but cannot see how to actually code this. I am new to Python and not really sure what method I should use to try to do this.
I have some files that I would like to rename. Unfortunately the portion towards the file extension is never the same and would like to just remove it.
File name is like AC_DC - Shot Down In Flames (Official Video)-UKwVvSleM6w.mp3
Any help would be appreciated.

Since this looks like the result from youtube-dl, the "random" substring is most likely the unique video id, which in my experience is always 11 characters long. It can, however, include dashes (-), so the regex-approach suggested by smitrp would not always work.
I use this "dirty" workaround:
>>> original_name="AC_DC - Shot Down In Flames (Official Video)-UKwVvSleM6w.mp3"
>>> new_name=original_name[:-16]+".mp3"
>>> new_name
'AC_DC - Shot Down In Flames (Official Video).mp3'
Edit:
If you really, REALLY want to find the "-XXXX"-portion, have a look at str.rfind(). This will help you to find the index of the last dash (-), which you can directly use for the slice notation of the string.
Disclaimer:
This will provide wrong results, if the video id contains a dash, e.g. here: https://www.youtube.com/watch?v=7WVBEB8-wa0
Then you will find the last dash, remove -wa0 and be left with -7WVBEB8 at the end of the filename.

Using idea of the above answer, one can also take into account that a normal word does not
contain more than one capital character.
def youtube_name_fix(folder):
import os
from pathlib import Path
import re
REGEX = re.compile(r'[A-Z]')
for name in os.listdir(folder):
basename = Path(name)
last_12 = basename.stem[-12:]
# check if the end string is not all uppercase (then it could be part of a valid name)
if not last_12.isupper():
# check if the last string has more than one uppercase letters
if len(REGEX.findall(last_12)) > 1:
# remove the end youtube string and create new full path
new_name = os.path.join(folder, basename.stem[:-12] + basename.suffix)
try:
os.rename(os.path.join(folder,name), new_name)
except Exception as e:
print(e)
> youtube_name_fix(p)
old name -> "4-Discrete and Continuous Probability Models-esHwigpYggU.mp4"
new name -> "4-Discrete and Continuous Probability Models.mp4"

Related

fetch a discord link from a long text

૮₍ • ᴥ • ₎ა・Raiden ▬▭⋱𓂅
ᘏ⑅ᘏ╭╯Welcome╰╮𓂃ᘏᗢ
・・・・・・・・・・・・・・・・・・・・・
https://discord.gg/rsCC8y7WC4
・・・・・・・・・・・・・・・・・・・・・
Join!
・・・・・・・・・・・・・・・・・・・・・
How can I pull the "discord.gg/rsCC8y7WC4" link from a text like this
console.log(invitelink) ==> discord.gg/rsCC8y7WC4
Use String.match() for this. String.match accepts a regex argument which looks like this:
let str = 'hey there this is just a random string';
let res = str.match(/random/);
//res is now ['random']
Now for your problem, you are probably looking for this:
if(msg.content.match(/discord\.gg\/.+/) || msg.content.match(/discordapp\.com\/invite\/.+/)) return msg.channel.send('Hey! You put an invite in your message!');
Now that regex may look a bit messy/complicated but the \s are to escape the character and make sure regex knows that it’s not the special character it uses, and is actually just a character part of the search. To clarify why the above example should work, here’s a little explanation:
match() returns either an array (if it gets a match) or null (if it gets no match). You are searching for the strings 'discord.gg/' followed by any characters and also checking for the string 'discordapp.com/invite/' also followed by any characters.
If this doesn’t work, please tell me.

VBSCRIPT REPLACE not removing spaces from Decrypted fields

Got quite a head-scratcher....
I'm using the VBScript function REPLACE to replace spaces in a decrypted field from a MSSQL DB with "/".
But the REPLACE function isn't "seeing" the spaces.
For example, if I run any one of the following, where the decrypted value of the field "ITF_U_ClientName_Denc" is "Johnny Carson":
REPLACE(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc")," ","/")
REPLACE(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc")," ","/")
REPLACE(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc"),"Chr(160)","/")
REPLACE(CSTR(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc"))," ","/")
REPLACE(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc")," ","/",1,-1,1)
REPLACE(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc")," ","/",1,-1,0)
The returned value is "Johnny Carson" (space not replaced with /)
The issue seems to be exclusively with spaces, because when I run this:
REPLACE(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc"),"a","/")
I get "Johnny C/rson".
Also, the issue seems to be exclusively with spaces in the decrypted value, because when I run this:
REPLACE("Johnny Carson"," ","/")
Of course, the returned value is "Johnny/Carson".
I have checked what is being written to the source of the page and it is simply "Johnny Carson" with no encoding or special characters.
I have also tried the SPLIT function to see if it would "see" the space, but it doesn't.
Finally, thanks to a helpful comment, I tried VBS REGEX searching for \s.
Set regExp = New RegExp
regExp.IgnoreCase = True
regExp.Global = True
regExp.Pattern = "\s" 'Add here every character you don't consider as special character
strProcessed = regExp.Replace(ITF_U_Ledger.Fields("ITF_U_ClientName_Denc"), "?")
Unfortunately, strProcessed retruns "Johnny Carson" (ie. spaces not detected/removed).
If I replace regExp.Pattern = "a", strProcessed returns "Johnny C?rson".
Many thanks for your help!!
As we found, the right character code is 160, and that did the trick:
replace(..., ChrW(160), "...")
This seems to be data specific and, additionally, as an alternative you can try to get same encoding of the source script (i.e. save with Save As with Encoding), or convert received database value into a different target encoding.

Import timeseries via loop (pot. generic)

Solved, now working example!
I have a set of time series that is populated in a folder structure as follows:
TimeSeriesData\Config_000000\seed_001\Aggregate_Quantity.txt
…
TimeSeriesData\Config_000058\seed_010\Aggregate_Quantity.txt
And in each file there is a first line containing a character, followed by lines containing the data. E.g.
share
-21.75
-20.75
…
Now, I would like to import all those files (at best, without specifying ex post the number of configs and seeds, at worst with supplying it) into single time series of the like: “AggQuant_config_seed” where config relates to the config ID and seed to the seed id.
I tried the following (using the non-preferred way), but “parsing” the “path” does not work / I do not know how to do it.
string base = "TimeSeriesData/Config_"
string middle = "/seed_"
string endd = "/Aggregate_Quantity.txt"
string path = ""
loop for (i=0;i<=58;i+=1)
loop for (j=1;j<=10;j+=1)
path = ""
sprintf path "%s%06d%s%03d%s",base,i,middle,j,endd
append #path #will be named share
rename share Agg_Q_$i_$j #rename
endloop
endloop
To sum up the problem, the following does not work:
string path="somwhere/a_file.txt" #string holding path
#wrong: append $path #use append on string
append #path #works!
And if possible, a way to search recursively through a set of folders, using the folder information together with file-information for the variable name, would be nice. Is that possible within gretl?
I realize that in many cases I would like to refer to a specific “help” section like those for functions and commands, but for operators instead (like “$”, which is obviously wrong here).

Locating a dynamic string in a text file

Problem:
Hello, I have been struggling recently in my programming endeavours. I have managed to receive the output below from Google Speech to Text, but I cannot figure out how draw data from this block.
Excerpt 1:
[VoiceMain]: Successfully initialized
{"result":[]}
{"result":[{"alternative":[{"transcript":"hello","confidence":0.46152416},{"transcript":"how low"},{"transcript":"how lo"},{"transcript":"how long"},{"transcript":"Polo"}],"final":true}],"result_index":0}
[VoiceMain]: Successfully initialized
{"result":[]}
{"result":[{"alternative":[{"transcript":"hello"},{"transcript":"how long"},{"transcript":"how low"},{"transcript":"howlong"}],"final":true}],"result_index":0}
Objective:
My goal is to extract the string "hello" (without the quotation marks) from the first transcript of each block and set it equal to a variable. The problem arises when I do not know what the phrase will be. Instead of "hello", the phrase may be a string of any length. Even if it is a different string, I would still like to set it to the same variable to which the phrase "hello" would have been set to.
Furthermore, I would like to extract the number after the word "confidence". In this case, it is 0.46152416. Data type does not matter for the confidence variable. The confidence variable appears to be more difficult to extract from the blocks because it may or may not be present. If it is not present, it must be ignored. If it is present however, it must be detected and stored as a variable.
Also please note that this text block is stored within a file named "CurlOutput.txt".
All help or advice related to solving this problem is greatly appreciated.
You could do this with regex, but then I am assuming you will want to use this as a dict later in your code. So here is a python approach to building this result as a dictionary.
import json
with open('CurlOutput.txt') as f:
lines = f.read().splitlines()
flag = '{"result":[]} '
for line in lines: # Loop through each lin in file
if flag in line: # check if this is a line with data on it
results = json.loads(line.replace(flag, ''))['result'] # Load data as a dict
# If you just want to change first index of alternative
# results[0]['alternative'][0]['transcript'] = 'myNewString'
# If you want to check all alternative for confidence and transcript
for result in results[0]['alternative']: # Loop over each alternative
transcript = result['transcript']
confidence = None
if 'confidence' in result:
confidence = result['confidence']
# now do whatever you want with confidence and transcript.

getting the file in a sub-folder # Isolated Storage - WP7

I would like to get all the files that a sub-folder holds in a string array.
So, I have tried something like the following:
var IOstore = IsolatedStorageFile.GetUserStoreForApplication();
string searchpath = System.IO.Path.Combine("product", ProductName);
string filesInSubDirs[] = IOstore.GetFileNames(searchpath);
But I got all the files in the "product" folder. I have also tried with "productname" only as the parameter.
Thanks for your help.
The search pattern for a sub-folder needs to include "*.*" at the end to pattern match any file, which would make your code something like the following:
var IOstore = IsolatedStorageFile.GetUserStoreForApplication();
string searchpath = System.IO.Path.Combine("product", ProductName);
searchpath = string.Format("{0}\\*.*", searchpath);
string filesInSubDirs[] = IOstore.GetFileNames(searchpath);
Something you might want to try. (this is sort of a left field answer, sorry). In my dropbox client http://sharpdropbox.codeplex.com/) I have a set of facades for System.IO.File, System.IO.FileInfo, System.IO.Directory, and System.IO.DirectoryInfo. They work pretty good and I have tested them.
Basically, you add a Using or Import for System.IO.IsolatedStorage and then PSFile, PSDirectory, PSFileInfo, or PSDirectoryInfo. It's saved me from having to remember all the nuances... for instance if you are querying a directory, it knows to add a slash, etc. BTW, the "PS" prefix stands for "Persisted Storage" which is what IsolatedStorage is sometimes called (starting them with an "I" implies they are interfaces.. and having no prefix makes things even more confusing).
Anyway, you can grab the code from source or I believe the last release had the DLLs for them (it's called something like "IsolatedStorageFacade-WP7")

Resources