I have written a Python library that has become somewhat complex. Today I noticed
that running my code several times in a row, without changing a single line of code or anything else, sometimes results in an exception and sometimes does not. The error occurs when parsing a JSON string using the json.loads method. (The string in question is valid JSON: '["1", "6", "12", "14", "36", "44"]'.) Because the error only appeared after working on the project for a long time, and I am not sure what causes it, I don't know how to provide a minimal working example. The code is simply of the form
result = json.loads(string) # where string is the aforementioned list
When executing the script several times in a row, it usually finishes without an exception, but sometimes it raises
File "myfile.py", line 800, in get
opt = json.loads(opt)
File "/usr/lib64/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib64/python3.10/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
(I have omitted the lines preceding the part which is causing the exception).
Unfortunately I am not sure how to investigate what is going on, since the traceback alone gives me no information about what happens differently from when the code runs fine. Are there any ways to compare what the interpreter does between two runs?
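One low-tech way to narrow this down is to log exactly what json.loads receives on every run: if the failure is intermittent, the input is almost certainly not always the same string. Notably, "Expecting property name enclosed in double quotes ... (char 1)" is exactly what you get when a Python dict repr with single quotes is fed to json.loads. A minimal sketch of such a wrapper (the name debug_loads is made up for illustration):

```python
import json
import unicodedata

def debug_loads(s):
    """json.loads with a diagnostic dump of the input on failure."""
    try:
        return json.loads(s)
    except json.JSONDecodeError:
        # Show the exact characters received, so invisible characters,
        # single quotes, or a stray prefix become visible in the log.
        print("failed input repr:", repr(s))
        for i, ch in enumerate(s[:40]):
            print(i, hex(ord(ch)), unicodedata.name(ch, "<unnamed>"))
        raise

# The valid list string parses fine; a dict repr such as "{'a': 1}"
# would fail with the exact error from the traceback above.
result = debug_loads('["1", "6", "12", "14", "36", "44"]')
```

Diffing the logged repr between a good run and a bad run should show immediately what changes.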
When using the PUT command in a threaded process, I often receive the following traceback:
File "/data/pydig/venv/lib64/python3.6/site-packages/snowflake/connector/cursor.py", line 657, in execute
sf_file_transfer_agent.execute()
File "/data/pydig/venv/lib64/python3.6/site-packages/snowflake/connector/file_transfer_agent.py", line 347, in execute
self._parse_command()
File "/data/pydig/venv/lib64/python3.6/site-packages/snowflake/connector/file_transfer_agent.py", line 1038, in _parse_command
self._command_type = self._ret["data"]["command"]
KeyError: 'command'
It seems to be fairly benign, yet occurs randomly. The command itself seems to run successfully when looking at the stage. To combat this, I simply catch KeyErrors when PUTs occur and retry several times. This allows processes to continue as expected, but leads to issues with subsequent COPY INTO statements. Mainly, because the initial PUT succeeds, I receive a LOAD_SKIPPED status from the COPY INTO. Effectively, the file is put and copied, but we lose information such as rows_parsed, rows_loaded, and errors_seen.
Please advise on workarounds for the initial traceback.
NOTE: An example output after running PUT/COPY INTO processes: SAMPLE OUTPUT
NOTE: I have found I can use the FORCE parameter with COPY INTO to bypass the LOAD_SKIPPED status; however, the initial error still persists, and this can cause duplication.
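Until the root cause is fixed, the retry described above can be factored into a small helper. This is only a sketch, under the assumption that the spurious failure always surfaces as KeyError: 'command'; execute_with_retry and its parameters are made-up names, not part of the Snowflake connector API:

```python
import time

def execute_with_retry(execute_fn, attempts=3, delay=2.0):
    """Call execute_fn, retrying only on the spurious KeyError: 'command'."""
    for attempt in range(1, attempts + 1):
        try:
            return execute_fn()
        except KeyError as exc:
            # Re-raise unrelated KeyErrors, and give up after the last attempt.
            if str(exc) != "'command'" or attempt == attempts:
                raise
            time.sleep(delay)

# Hypothetical usage with a connector cursor:
# execute_with_retry(lambda: cursor.execute(put_statement))
```

This keeps the retry policy in one place, but it does not solve the LOAD_SKIPPED side effect when the first PUT actually succeeded.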
I'm trying to split some datasets in two parts, running a loop over files like this:
cd C:\Users\Macrina\Documents\exports
qui fs *
foreach f in `r(files)' {
use `r(files)'
keep id adv*
save adv_spa*.dta
clear
use `r(files)'
drop adv*
save fin_spa*.dta
}
I don't know whether what is inside the loop is correctly written, but the point is that I get the error:
invalid '"e2.dta'
where e2.dta is the second file in the folder. Does this message refer to the loop itself or to what is inside it? Where is the mistake?
You want lines like
use "`f'"
not
use `r(files)'
given that fs (installed from SSC, as you should explain) returns r(files) as a list of all the files, whereas you want to use each one in turn, not all at once.
The error message was informative: use is puzzled by the second filename it sees, as only one filename makes sense to it. use fails as soon as something is evidently wrong, and the remaining filenames are ignored.
Incidentally, note that putting "" around filenames remains essential if any includes spaces.
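Putting those corrections together, the loop might look like this. This is only a sketch: the output names adv_`f' and fin_`f' are one guess at what was intended, and the `clear` and `replace` options are added so repeated runs do not stop on existing data or files.

```stata
cd C:\Users\Macrina\Documents\exports
qui fs *
foreach f in `r(files)' {
    use "`f'", clear
    keep id adv*
    save "adv_`f'", replace
    use "`f'", clear
    drop adv*
    save "fin_`f'", replace
}
```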
I am trying to print the print statements in Python behave step files, but I do not want to print the traceback errors. I used a behave.ini file with stdout_capture=False. That helps print the print statements in the step files, but it also prints the traceback errors. I want only the output of the print statements in the step files. Is there any way to do this?
I used the command below:
behave example.feature --logging-level=ERROR
Try to add this to behave.ini
stderr_capture = no
Refer to the documentation: https://github.com/behave/behave/blob/master/docs/behave.rst#configuration-files
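For reference, the combined behave.ini might then look like this (a config fragment; the logging_level line is an assumption matching the command-line flag used above):

```ini
[behave]
stdout_capture = no
stderr_capture = no
logging_level = ERROR
```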
Hi, I want to save the positions of all the lines that contain "CREATE TABLE" in a list.
a) Is there a better and correct way to do it? (I'm new to Python.)
b) Why does it matter that tell() is used during iteration? I thought it is a read method (or equivalent), so just telling the position shouldn't hurt the file iteration process.
So I have the following class:
class SQLParser(object):
    def __init__(self, filename):
        self.file = open(filename, 'r')
        self.createTablePositions = []
        self.insertIntoPositions = []

    def findCreateTable(self):
        for line in self.file:
            if line.find("CREATE TABLE") is 0:
                print(line)
                self.createTablePositions.append(self.file.tell())

sqlhandler = SQLParser("sql.sql")
sqlhandler.findCreateTable()
print(sqlhandler.createTablePositions)
That yields the following error:

Traceback (most recent call last):
  File "C:/Users/user/PycharmProjects/sqlparser/sqlparser.py", line 18, in <module>
    sqlhandler.findCreateTable()
  File "C:/Users/user/PycharmProjects/sqlparser/sqlparser.py", line 12, in findCreateTable
    curPos = self.file.tell()
OSError: telling position disabled by next() call
I've searched the net and Stack Overflow but didn't find a direct solution to my problem.
Solutions like rewriting the next() method are currently beyond my knowledge, and I doubt this exercise aims for that.
Your advice will be highly appreciated!
For starters, you are never closing the file, which is not good.
The error arises from the internal behaviour of the tell() method. By iterating the file via for line in file you are repeatedly calling next(), which uses an internal read-ahead buffer; while that iteration is active, tell() is disabled on text files. When you also need the file position, it is better to use the explicit read methods, readline() or readlines(), instead of iterating the file object directly.
Also, file.tell() returns the position of the cursor in the file, not the line number you are on. So if the first line had 20 characters, after file.readline() the method file.tell() would return 21 (the characters plus the newline character, or 22 with Windows \r\n line endings).
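A quick self-contained check of that behaviour, written to a temporary file so the byte counts are predictable:

```python
import os
import tempfile

# One 24-character line plus "\n": tell() reports 25 after reading it.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w", newline="\n") as f:
    f.write("CREATE TABLE t (id INT);\n")

with open(path, "r") as f:
    f.readline()
    pos = f.tell()
    print(pos)  # 25, a byte offset, not the line number 1

os.remove(path)
```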
Rather than what you are doing I'd suggest thinking about it a different way around.
class SQLParser(object):
    """
    Parses a SQL file.
    """
    def __init__(self, filename):
        self.createTablePositions = self.findCreateTable(filename)
        self.insertIntoPositions = []

    def findCreateTable(self, filename):
        temp = []
        with open(filename, 'r') as file:
            # the with block closes the file upon exit
            fileNum = 0
            for line in file.readlines():
                if "CREATE TABLE" in line:
                    print(line)
                    temp.append(fileNum)
                fileNum += 1
        return temp

sqlhandler = SQLParser("sql.sql")
print(sqlhandler.createTablePositions)
Now the file is parsed when the class object is initialised.
You can then do a similar thing for insertIntoPositions.
If your SQL file is too large, there are two solutions according to this answer:
Using file.readline() instead of next()
with open(path, mode) as file:
    while True:
        line = file.readline()
        if not line:
            break
        file.tell()
Using offset += len(line) instead of file.tell()
offset = 0
with open(path, mode) as file:
    for line in file:
        offset += len(line)
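Combining the second workaround with the original goal, here is a runnable sketch that records the byte offset at which each "CREATE TABLE" line starts, demonstrated against a small throwaway file:

```python
import os
import tempfile

def find_create_table_offsets(path):
    """Return the byte offsets at which 'CREATE TABLE' lines start."""
    offsets = []
    offset = 0
    # newline="" disables newline translation, so len(line) matches
    # the byte count for an ASCII file.
    with open(path, "r", newline="") as file:
        for line in file:
            if line.startswith("CREATE TABLE"):
                offsets.append(offset)
            offset += len(line)
    return offsets

# Throwaway demo file:
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w", newline="\n") as f:
    f.write("-- header\n")                  # 10 bytes
    f.write("CREATE TABLE a (x INT);\n")    # starts at offset 10
    f.write("INSERT INTO a VALUES (1);\n")
    f.write("CREATE TABLE b (y INT);\n")    # starts at offset 60

positions = find_create_table_offsets(path)
print(positions)  # [10, 60]
os.remove(path)
```

The offsets can later be passed to file.seek() to jump straight to each statement.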
I have a package that is giving me a very confusing "Text was truncated or one or more characters had no match in the target code page" error, but only when I run the full package in the control flow, not when I run the task by itself.
The first task takes CSV files and combines them into one file. The next task reads the output of the previous task and begins to process the records. What is really odd is that the truncation error is thrown by the flat file source in the second step, the exact same flat file that was the destination in the previous step.
If there was a truncation error wouldn't that be thrown by the previous step that tried to create the file? Since the 1st step created the file without truncation, why can't I just read that same file in the very next task?
Note: the only thing that makes this package different from the others I have worked on is that I am dealing with special characters and using code page 65001 (UTF-8) to capture the fields that contain them. My other packages all referenced flat file connection managers with code page 1252.
The problem was caused by the foreach loop and the ColumnNamesInFirstDataRow expression, where I have the formula "#[User::count_raw_input_rows] < 0". I initialise a variable to -1 and assign it to ColumnNamesInFirstDataRow for the flat file. Inside the loop I update the variable with a row counter on each read of a CSV file, which writes the header the first time (when the value is still -1) but avoids repeating it for the other CSV files. When I exit the loop and try to read the input file, it treats the header as data and blows up. I only avoided this in my last package because I hadn't tightened the column definitions for the flat file the way I did in this one. Thanks for the help.