I am using AWS SageMaker to invoke an endpoint:
payload = pd.read_csv('payload.csv', header=None)
>> payload
0 1 2 3 4
0 setosa 5.1 3.5 1.4 0.2
1 setosa 5.1 3.5 1.4 0.2
and I call the endpoint with this code:
response = runtime.invoke_endpoint(EndpointName=r_endpoint,
                                   ContentType='text/csv',
                                   Body=payload)
But I get this error:
ParamValidationError Traceback (most recent call last)
<ipython-input-304-f79f5cf7e0e0> in <module>()
1 response = runtime.invoke_endpoint(EndpointName=r_endpoint,
2 ContentType='text/csv',
----> 3 Body=payload)
4
5 result = json.loads(response['Body'].read().decode())
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
312 "%s() only accepts keyword arguments." % py_operation_name)
313 # The "self" in this scope is referring to the BaseClient.
--> 314 return self._make_api_call(operation_name, kwargs)
315
316 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
584 }
585 request_dict = self._convert_to_request_dict(
--> 586 api_params, operation_model, context=request_context)
587
588 handler, event_response = self.meta.events.emit_until_response(
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _convert_to_request_dict(self, api_params, operation_model, context)
619 api_params, operation_model, context)
620 request_dict = self._serializer.serialize_to_request(
--> 621 api_params, operation_model)
622 prepare_request_dict(request_dict, endpoint_url=self._endpoint.host,
623 user_agent=self._client_config.user_agent,
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/validate.py in serialize_to_request(self, parameters, operation_model)
289 operation_model.input_shape)
290 if report.has_errors():
--> 291 raise ParamValidationError(report=report.generate_report())
292 return self._serializer.serialize_to_request(parameters,
293 operation_model)
ParamValidationError: Parameter validation failed:
Invalid type for parameter Body, value: 0 1 2 3 4
0 setosa 5.1 3.5 1.4 0.2
1 setosa 5.1 3.5 1.4 0.2, type: <class 'pandas.core.frame.DataFrame'>, valid types: <class 'bytes'>, <class 'bytearray'>, file-like object
I am just following the same code/steps as in the AWS tutorial.
Can you help me resolve this problem, please?
Thank you.
The payload variable is a pandas DataFrame, while invoke_endpoint() expects Body to be bytes, a bytearray, or a file-like object.
Try something like this (coding blind):
response = runtime.invoke_endpoint(EndpointName=r_endpoint,
                                   ContentType='text/csv',
                                   Body=open('payload.csv', 'rb'))  # open in binary mode so a bytes stream is passed
More on the expected formats here.
Make sure the file doesn't include a header.
Alternatively, convert your DataFrame to bytes, like in this example, and pass those bytes instead of passing a DataFrame.
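For example, a minimal sketch (assuming, as in the tutorial, that the model expects headerless CSV rows; runtime and r_endpoint are the same objects as above):
import pandas as pd

payload = pd.read_csv('payload.csv', header=None)

# Serialize the DataFrame back to CSV text (no index, no header) and encode it to bytes
csv_bytes = payload.to_csv(index=False, header=False).encode('utf-8')

response = runtime.invoke_endpoint(EndpointName=r_endpoint,
                                   ContentType='text/csv',
                                   Body=csv_bytes)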
PROBLEM
I am using IterativeImputer from sklearn.impute with a BayesianRidge() regressor to impute missing data in the 'Frontage' variable.
After interative_imputer_fit = interative_imputer.fit(data_array) runs, the call to interative_imputer_fit.transform(X) inside imputer_bay_ridge(data) fails with a ValueError. I passed in two variables, Frontage and Area, but only Frontage ended up inside the numpy array.
PYTHON CODE USING SKLEARN
from sklearn.experimental import enable_iterative_imputer
from sklearn.impute import IterativeImputer
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import BayesianRidge

def imputer_bay_ridge(data):
    data_array = data.to_numpy()
    data_array.reshape(1, -1)
    interative_imputer = IterativeImputer(BayesianRidge())
    interative_imputer_fit = interative_imputer.fit(data_array)
    X = data['LotFrontage']
    data_imputed = interative_imputer_fit.transform(X)
train_data[['Frontage', 'Area']]
INVOKE FUNCTION
fit_tranformed_imputed = imputer_bay_ridge(train_data[['Frontage', 'Area']])
DATA EXAMPLE
train_data[['Frontage', 'Area']]
Frontage Area
0 65.0 8450
1 80.0 9600
2 68.0 11250
3 60.0 9550
4 84.0 14260
... ... ...
1455 62.0 7917
1456 85.0 13175
1457 66.0 9042
1458 68.0 9717
1459 75.0 9937
1460 rows × 2 columns
ERROR
ValueError Traceback (most recent call last)
Cell In[243], line 1
----> 1 fit_tranformed_imputed = imputer_bay_ridge(train_data[['LotFrontage', 'LotArea']])
Cell In[242], line 12, in imputer_bay_ridge(data)
10 interative_imputer_fit = interative_imputer.fit(data_array)
11 X = data['LotFrontage']
---> 12 data_imputed = interative_imputer_fit.transform(X)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/impute/_iterative.py:724, in IterativeImputer.transform(self, X)
707 """Impute all missing values in `X`.
708
709 Note that this is stochastic, and that if `random_state` is not fixed,
(...)
720 The imputed input data.
721 """
722 check_is_fitted(self)
--> 724 X, Xt, mask_missing_values, complete_mask = self._initial_imputation(X)
726 X_indicator = super()._transform_indicator(complete_mask)
728 if self.n_iter_ == 0 or np.all(mask_missing_values):
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/impute/_iterative.py:514, in IterativeImputer._initial_imputation(self, X, in_fit)
511 else:
512 force_all_finite = True
--> 514 X = self._validate_data(
515 X,
516 dtype=FLOAT_DTYPES,
517 order="F",
518 reset=in_fit,
519 force_all_finite=force_all_finite,
520 )
521 _check_inputs_dtype(X, self.missing_values)
523 X_missing_mask = _get_mask(X, self.missing_values)
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/base.py:566, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, **check_params)
564 raise ValueError("Validation should be done on X, y or both.")
565 elif not no_val_X and no_val_y:
--> 566 X = check_array(X, **check_params)
567 out = X
568 elif no_val_X and not no_val_y:
File ~/opt/anaconda3/lib/python3.9/site-packages/sklearn/utils/validation.py:769, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
767 # If input is 1D raise error
768 if array.ndim == 1:
--> 769 raise ValueError(
770 "Expected 2D array, got 1D array instead:\narray={}.\n"
771 "Reshape your data either using array.reshape(-1, 1) if "
772 "your data has a single feature or array.reshape(1, -1) "
773 "if it contains a single sample.".format(array)
774 )
776 # make sure we actually converted to numeric:
777 if dtype_numeric and array.dtype.kind in "OUSV":
ValueError: Expected 2D array, got 1D array instead:
array=[65. 80. 68. ... 66. 68. 75.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
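A hedged sketch of one way around this (assuming the intent is to impute both columns together): transform() must receive a 2-D input with the same features the imputer was fitted on, not the 1-D Series data['LotFrontage'].
from sklearn.experimental import enable_iterative_imputer  # enables the experimental IterativeImputer
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge

def imputer_bay_ridge(data):
    # data should be a 2-D DataFrame, e.g. train_data[['LotFrontage', 'LotArea']]
    iterative_imputer = IterativeImputer(BayesianRidge())
    # Fit and transform on the same 2-D input; passing a 1-D Series raises the error above
    return iterative_imputer.fit_transform(data)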
I have a table of price updates in the format (timestamp, price, amount).
The timestamp is a datetime, price categorical and amount float64. The timestamp column is set as an index.
My goal is to get the amount available at each price level at each point in time.
First, I use pivot_table to spread the prices into columns, and then forward fill.
pivot = price_table.pivot_table(index='timestamp',
                                columns='price', values='amount')
pivot_ffill = pivot.fillna(method='ffill')
I can call compute or head on pivot_ffill and it works fine.
Clearly, there are still NAs at the beginning of the table where there have been no updates yet.
When I apply
pivot_nullfill = pivot_ffill.fillna(0)
pivot_nullfill.head()
I get this error:
"The columns in the computed data do not match the columns in the provided metadata." I tried replacing the zero with 0.0 or float(0), but to no avail. Since the previous steps work, I strongly suspect it has something to do with the fillna, but because of the delayed (lazy) evaluation that does not have to be true.
Does someone know what causes this? Thank you!
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-180-f8ab344c7939> in <module>
----> 1 pivot_ffill.fillna(0).head()
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\dataframe\core.py in head(self, n, npartitions, compute)
896
897 if compute:
--> 898 result = result.compute()
899 return result
900
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\base.py in compute(self, **kwargs)
154 dask.base.compute
155 """
--> 156 (result,) = compute(self, traverse=False, **kwargs)
157 return result
158
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\base.py in compute(*args, **kwargs)
396 keys = [x.__dask_keys__() for x in collections]
397 postcomputes = [x.__dask_postcompute__() for x in collections]
--> 398 results = schedule(dsk, keys, **kwargs)
399 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
400
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
74 results = get_async(pool.apply_async, len(pool._pool), dsk, result,
75 cache=cache, get_id=_thread_get_id,
---> 76 pack_exception=pack_exception, **kwargs)
77
78 # Cleanup pools associated to dead threads
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\local.py in get_async(apply_async, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, **kwargs)
460 _execute_task(task, data) # Re-execute locally
461 else:
--> 462 raise_exception(exc, tb)
463 res, worker_id = loads(res_info)
464 state['cache'][key] = res
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\compatibility.py in reraise(exc, tb)
110 if exc.__traceback__ is not tb:
111 raise exc.with_traceback(tb)
--> 112 raise exc
113
114 import pickle as cPickle
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
228 try:
229 task, data = loads(task_info)
--> 230 result = _execute_task(task, data)
231 id = get_id()
232 result = dumps((result, id))
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\core.py in _execute_task(arg, cache, dsk)
116 elif istask(arg):
117 func, args = arg[0], arg[1:]
--> 118 args2 = [_execute_task(a, cache) for a in args]
119 return func(*args2)
120 elif not ishashable(arg):
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\core.py in <listcomp>(.0)
116 elif istask(arg):
117 func, args = arg[0], arg[1:]
--> 118 args2 = [_execute_task(a, cache) for a in args]
119 return func(*args2)
120 elif not ishashable(arg):
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\core.py in _execute_task(arg, cache, dsk)
117 func, args = arg[0], arg[1:]
118 args2 = [_execute_task(a, cache) for a in args]
--> 119 return func(*args2)
120 elif not ishashable(arg):
121 return arg
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\optimization.py in __call__(self, *args)
940 % (len(self.inkeys), len(args)))
941 return core.get(self.dsk, self.outkey,
--> 942 dict(zip(self.inkeys, args)))
943
944 def __reduce__(self):
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\core.py in get(dsk, out, cache)
147 for key in toposort(dsk):
148 task = dsk[key]
--> 149 result = _execute_task(task, cache)
150 cache[key] = result
151 result = _execute_task(out, cache)
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\core.py in _execute_task(arg, cache, dsk)
117 func, args = arg[0], arg[1:]
118 args2 = [_execute_task(a, cache) for a in args]
--> 119 return func(*args2)
120 elif not ishashable(arg):
121 return arg
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\compatibility.py in apply(func, args, kwargs)
91 def apply(func, args, kwargs=None):
92 if kwargs:
---> 93 return func(*args, **kwargs)
94 else:
95 return func(*args)
C:\ProgramData\Anaconda3\envs\python36\lib\site-packages\dask\dataframe\core.py in apply_and_enforce(*args, **kwargs)
3800 if not np.array_equal(np.nan_to_num(meta.columns),
3801 np.nan_to_num(df.columns)):
-> 3802 raise ValueError("The columns in the computed data do not match"
3803 " the columns in the provided metadata")
3804 else:
ValueError: The columns in the computed data do not match the columns in the provided metadata
The error message should have given you a suggestion of how to fix the situation. Assuming you are loading from CSV (the question doesn't say), you would probably end up with a line like
df = dd.read_csv(..., dtype={...})
which instructs the pandas reader on the dtypes you want to enforce, since you know more than pandas does. That ensures that all partitions have the same types for all columns - see the notes section of the docs.
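A hedged sketch of what that could look like for this data (the file name here is an assumption; the column names are taken from the question):
import dask.dataframe as dd

# Hypothetical load: enforce dtypes up front so every partition agrees with the metadata
price_table = dd.read_csv('price_updates.csv',
                          dtype={'price': 'category', 'amount': 'float64'},
                          parse_dates=['timestamp'])
price_table = price_table.set_index('timestamp')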
Why does this work:
In [17]: %%cython -f
...: from libc.string cimport memcpy
...:
...: DEF KLEN = 5
...: DEF TRP_KLEN = KLEN * 3
...:
...: cdef:
...: unsigned char k[KLEN]
...: unsigned char kk[TRP_KLEN]
...: kk = bytearray(b'12345abcde!##$%')
...: memcpy(&k, &kk[5], KLEN)
...: print(k)
b'abcde'
While this:
In [16]: %%cython -f
...: from libc.string cimport memcpy
...:
...: DEF KLEN = 5
...: DEF TRP_KLEN = KLEN * 3
...: ctypedef unsigned char SingleKey[KLEN]
...: ctypedef unsigned char TripleKey[TRP_KLEN]
...:
...: cdef:
...: SingleKey k
...: TripleKey kk
...: kk = bytearray(b'12345abcde!##$%')
...: memcpy(&k, &kk[5], KLEN)
...: print(k)
throws an obscure error, which doesn't directly mention my code:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-16-a4bb608248b0> in <module>()
----> 1 get_ipython().run_cell_magic('cython', '-f', "from libc.string cimport memcpy\n\nDEF KLEN = 5\nDEF TRP_KLEN = KLEN * 3\nctypedef unsigned char SingleKey[KLEN]\nctypedef unsigned char DoubleKey[TRP_KLEN]\n\ncdef:\n SingleKey k[KLEN]\n DoubleKey kk[TRP_KLEN]\nkk = bytearray(b'12345abcde!##$%')\nmemcpy(&k, &kk[5], KLEN)\nprint(k)")
/usr/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
2165 magic_arg_s = self.var_expand(line, stack_depth)
2166 with self.builtin_trap:
-> 2167 result = fn(magic_arg_s, cell)
2168 return result
2169
<decorator-gen-118> in cython(self, line, cell)
/usr/lib/python3.6/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Build/IpythonMagic.py in cython(self, line, cell)
318 extension = None
319 if need_cythonize:
--> 320 extensions = self._cythonize(module_name, code, lib_dir, args, quiet=args.quiet)
321 assert len(extensions) == 1
322 extension = extensions[0]
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Build/IpythonMagic.py in _cythonize(self, module_name, code, lib_dir, args, quiet)
426 elif sys.version_info[0] >= 3:
427 opts['language_level'] = 3
--> 428 return cythonize([extension], **opts)
429 except CompileError:
430 return None
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Build/Dependencies.py in cythonize(module_list, exclude, nthreads, aliases, quiet, force, language, exclude_failures, **options)
1024 if not nthreads:
1025 for args in to_compile:
-> 1026 cythonize_one(*args)
1027
1028 if exclude_failures:
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Build/Dependencies.py in cythonize_one(pyx_file, c_file, fingerprint, quiet, options, raise_on_failure, embedded_metadata, full_module_name, progress)
1127 any_failures = 0
1128 try:
-> 1129 result = compile_single(pyx_file, options, full_module_name=full_module_name)
1130 if result.num_errors > 0:
1131 any_failures = 1
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Main.py in compile_single(source, options, full_module_name)
647 recursion.
648 """
--> 649 return run_pipeline(source, options, full_module_name)
650
651
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Main.py in run_pipeline(source, options, full_module_name, context)
497
498 context.setup_errors(options, result)
--> 499 err, enddata = Pipeline.run_pipeline(pipeline, source)
500 context.teardown_errors(err, options, result)
501 return result
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Pipeline.py in run_pipeline(pipeline, source, printtree)
352 exec("def %s(phase, data): return phase(data)" % phase_name, exec_ns)
353 run = _pipeline_entry_points[phase_name] = exec_ns[phase_name]
--> 354 data = run(phase, data)
355 if DebugFlags.debug_verbose_pipeline:
356 print(" %.3f seconds" % (time() - t))
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Pipeline.py in run(phase, data)
332
333 def run(phase, data):
--> 334 return phase(data)
335
336 error = None
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Pipeline.py in generate_pyx_code_stage(module_node)
50 def generate_pyx_code_stage_factory(options, result):
51 def generate_pyx_code_stage(module_node):
---> 52 module_node.process_implementation(options, result)
53 result.compilation_source = module_node.compilation_source
54 return result
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/ModuleNode.py in process_implementation(self, options, result)
140 self.find_referenced_modules(env, self.referenced_modules, {})
141 self.sort_cdef_classes(env)
--> 142 self.generate_c_code(env, options, result)
143 self.generate_h_code(env, options, result)
144 self.generate_api_code(env, options, result)
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/ModuleNode.py in generate_c_code(self, env, options, result)
376 # generate normal variable and function definitions
377 self.generate_variable_definitions(env, code)
--> 378 self.body.generate_function_definitions(env, code)
379 code.mark_pos(None)
380 self.generate_typeobj_definitions(env, code)
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_function_definitions(self, env, code)
436 #print "StatListNode.generate_function_definitions" ###
437 for stat in self.stats:
--> 438 stat.generate_function_definitions(env, code)
439
440 def generate_execution_code(self, code):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_function_definitions(self, env, code)
9242 entry.cname = cname
9243
-> 9244 self.node.generate_function_definitions(env, code)
9245
9246 def generate_execution_code(self, code):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_function_definitions(self, env, code)
1974 # ----- Function body -----
1975 # -------------------------
-> 1976 self.generate_function_body(env, code)
1977
1978 code.mark_pos(self.pos, trace=False)
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_function_body(self, env, code)
1736
1737 def generate_function_body(self, env, code):
-> 1738 self.body.generate_execution_code(code)
1739
1740 def generate_function_definitions(self, env, code):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_execution_code(self, code)
442 for stat in self.stats:
443 code.mark_pos(stat.pos)
--> 444 stat.generate_execution_code(code)
445
446 def annotate(self, code):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_execution_code(self, code)
6185 for i, if_clause in enumerate(self.if_clauses):
6186 self._set_branch_hint(if_clause, if_clause.body)
-> 6187 if_clause.generate_execution_code(code, end_label, is_last=i == last)
6188 if self.else_clause:
6189 code.mark_pos(self.else_clause.pos)
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_execution_code(self, code, end_label, is_last)
6247 self.condition.generate_disposal_code(code)
6248 self.condition.free_temps(code)
-> 6249 self.body.generate_execution_code(code)
6250 code.mark_pos(self.pos, trace=False)
6251 if not (is_last or self.body.is_terminator):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_execution_code(self, code)
442 for stat in self.stats:
443 code.mark_pos(stat.pos)
--> 444 stat.generate_execution_code(code)
445
446 def annotate(self, code):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/UtilNodes.py in generate_execution_code(self, code)
324 def generate_execution_code(self, code):
325 self.setup_temp_expr(code)
--> 326 self.body.generate_execution_code(code)
327 self.teardown_temp_expr(code)
328
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_execution_code(self, code)
6605 self.item.generate_evaluation_code(code)
6606 self.target.generate_assignment_code(self.item, code)
-> 6607 self.body.generate_execution_code(code)
6608 code.mark_pos(self.pos)
6609 code.put_label(code.continue_label)
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_execution_code(self, code)
442 for stat in self.stats:
443 code.mark_pos(stat.pos)
--> 444 stat.generate_execution_code(code)
445
446 def annotate(self, code):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_execution_code(self, code)
5100 def generate_execution_code(self, code):
5101 code.mark_pos(self.pos)
-> 5102 self.generate_rhs_evaluation_code(code)
5103 self.generate_assignment_code(code)
5104
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/Nodes.py in generate_rhs_evaluation_code(self, code)
5387
5388 def generate_rhs_evaluation_code(self, code):
-> 5389 self.rhs.generate_evaluation_code(code)
5390
5391 def generate_assignment_code(self, code, overloaded_assignment=False):
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/ExprNodes.py in generate_evaluation_code(self, code)
718 self.allocate_temp_result(code)
719
--> 720 self.generate_result_code(code)
721 if self.is_temp and not (self.type.is_string or self.type.is_pyunicode_ptr):
722 # If we are temp we do not need to wait until this node is disposed
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/ExprNodes.py in generate_result_code(self, code)
13135
13136 code.putln(self.type.from_py_call_code(
> 13137 self.arg.py_result(), self.result(), self.pos, code, from_py_function=from_py_function))
13138 if self.type.is_pyobject:
13139 code.put_gotref(self.py_result())
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/PyrexTypes.py in from_py_call_code(self, source_code, result_code, error_pos, code, from_py_function, error_condition)
512 source_code, result_code, error_pos, code,
513 from_py_function or self.from_py_function,
--> 514 error_condition or self.error_condition(result_code)
515 )
516
~/code/lsup/virtualenv/lib/python3.6/site-packages/Cython/Compiler/PyrexTypes.py in from_py_call_code(self, source_code, result_code, error_pos, code, from_py_function, error_condition)
2481 def from_py_call_code(self, source_code, result_code, error_pos, code,
2482 from_py_function=None, error_condition=None):
-> 2483 assert not error_condition, '%s: %s' % (error_pos, error_condition)
2484 call_code = "%s(%s, %s, %s)" % (
2485 from_py_function or self.from_py_function,
AssertionError: (<StringSourceDescriptor:carray.from_py>, 87, 19): (!__pyx_t_11) && PyErr_Occurred()
This is especially problematic in my 1000+-line program, where I had to track down what was causing the problem. Removing all ctypedefs for char arrays that were being assigned to removed the error, but I'd like to know what is causing it.
Thanks.
UPDATE
Note: This is not a NUL-terminated string, but a byte sequence that may have NULs anywhere. That's why I am using memcpy rather than string functions or regular = assignments.
I am trying to write to a file in Cloud Storage from the remote API shell and am seeing the following:
s~appid> FILENAME = '/gs/test_bucket/test'
s~appid> writable_file = files.gs.create(FILENAME,
mime_type='application/octet-stream', acl='project-private')
s~appid> with files.open(writable_file, 'a') as f:
... f.write('[]')
...
---------------------------------------------------------------------------
FileNotOpenedError Traceback (most recent call last)
/Users/dhruvkaranmehta/Projects/getaround3/tools/g3/shell.pyc in <module>()
1 with files.open(writable_file, 'a') as f:
----> 2 f.write('[]')
3
/usr/local/google_appengine/google/appengine/api/files/file.pyc in
__exit__(self, atype, value, traceback)
288
289 def __exit__(self, atype, value, traceback):
--> 290 self.close()
291
292 def write(self, data, sequence_key=None):
/usr/local/google_appengine/google/appengine/api/files/file.pyc in
close(self, finalize)
282 request.set_filename(self._filename)
283 request.set_finalize(finalize)
--> 284 self._make_rpc_call_with_retry('Close', request, response)
285
286 def __enter__(self):
/usr/local/google_appengine/google/appengine/api/files/file.pyc in
_make_rpc_call_with_retry(self, method, request, response)
395 def _make_rpc_call_with_retry(self, method, request, response):
396 try:
--> 397 _make_call(method, request, response)
398 except (ApiTemporaryUnavailableError,
FileTemporaryUnavailableError):
399
/usr/local/google_appengine/google/appengine/api/files/file.pyc in
_make_call(method, request, response, deadline)
243 rpc.check_success()
244 except apiproxy_errors.ApplicationError, e:
--> 245 _raise_app_error(e)
246
247
/usr/local/google_appengine/google/appengine/api/files/file.pyc in
_raise_app_error(e)
186 elif (e.application_error ==
187 file_service_pb.FileServiceErrors.FILE_NOT_OPENED):
--> 188 raise FileNotOpenedError()
189 elif (e.application_error ==
190 file_service_pb.FileServiceErrors.READ_ONLY):
FileNotOpenedError:
This seems weird since the file was just opened. I have also seen another scenario where opening a file in 'a' mode leads to a FinalizationError.
Any additional information would be very helpful.
Thanks!
For the first part, there's a feature request to support the Files API from the remote API shell. Could you try the same thing using the interactive console? (See Is there an interactive console for public/uploaded app engine apps?)
Regarding your second error the documentation states that:
You cannot open and write to a file that has already been finalized.
I am new to R and I am trying to read in a data set. The data set is here:
http://petitlien.fr/myfiles
(The above link expands to a GMX File storage folder; click on Guest access to retrieve the file.)
The file named mydata.log has 32 entries with no header and it consists of 2 columns which are delimited by spaces.
I am trying the powerful scan command:
test.frame<-scan(file="mydata.log",sep= "", nlines=32,blank.lines.skip=TRUE)
The above just read the first 3 rows:
head(test.frame)
[1] 0.0000 0.0000 144.3210 0.3400 159.4070 0.8925
I have also tried read.table:
test.frame<-read.table(file="mydata.log",sep= "", nrows=32,blank.lines.skip=TRUE)
This one reads only the first 6 lines, as shown below:
names(test.frame)
[1] "V1" "V2"
> head(test.frame)
V1 V2
1 0.000 0.0000
2 144.321 0.3400
3 159.407 0.8925
4 198.413 0.9450
5 222.557 0.9975
6 235.464 1.0500
Does someone know how to read this data set properly?
A related question: Can I control the number of significant digits or perhaps decimal places in the data being read in?
Thanks a lot...
This line of your code works perfectly:
test.frame<-read.table(file="mydata.log",sep= "", nrows=32,blank.lines.skip=TRUE)
The reason you only get 6 lines in your output is that you are using head. To view all lines, just enter the name of your object:
> test.frame
V1 V2
1 0.000 0.0000
2 144.321 0.3400
3 159.407 0.8925
4 198.413 0.9450
5 222.557 0.9975
6 235.464 1.0500
7 296.918 1.1025
8 346.773 1.1550
9 442.955 1.2075
10 694.879 1.2600
11 892.436 1.3125
12 1492.970 1.3650
13 2916.960 1.4175
14 3596.060 1.4700
15 5278.950 1.5225
16 7480.730 1.5750
17 12259.800 1.6275
18 14032.600 1.6800
19 19565.600 1.7325
20 31427.700 1.7850
21 58221.400 1.8375
22 92283.900 1.9900
23 165601.000 1.9425
24 165703.000 1.9950
25 213925.000 2.8750
26 260381.000 2.1000
27 312701.000 2.1525
28 370853.000 2.2050
29 479303.000 2.2575
30 487265.000 2.3100
31 545225.000 2.3625
32 703186.000 2.4150
Here is an easy way to see how many rows you have (useful when you have many observations):
nrow(test.frame)
[1] 32
As for the number of digits, see the round command. To look at the documentation for a command, enter ? followed by the command, in this case a function: ?round
#note that you do not have to put "digits=2", you can just put "2", but this way is clearer
> rounded_test.frame <- round(test.frame, digits=2)
> rounded_test.frame
V1 V2
1 0.00 0.00
2 144.32 0.34
3 159.41 0.89
4 198.41 0.94
5 222.56 1.00
6 235.46 1.05
7 296.92 1.10
8 346.77 1.16
9 442.95 1.21
10 694.88 1.26
11 892.44 1.31
12 1492.97 1.36
13 2916.96 1.42
14 3596.06 1.47
15 5278.95 1.52
16 7480.73 1.57
17 12259.80 1.63
18 14032.60 1.68
19 19565.60 1.73
20 31427.70 1.78
21 58221.40 1.84
22 92283.90 1.99
23 165601.00 1.94
24 165703.00 2.00
25 213925.00 2.88
26 260381.00 2.10
27 312701.00 2.15
28 370853.00 2.21
29 479303.00 2.26
30 487265.00 2.31
31 545225.00 2.36
32 703186.00 2.42
Note in the above I created a new object instead of replacing the current one. If you want to replace the current one and lose the data forever (until you reload the dataset of course!), then you can use this line instead:
test.frame <- round(test.frame, digits=2)
If you don't really want to compress your numbers, you might just be interested in viewing the rounded numbers. You can do this with the following command:
print(test.frame,digits=2)
Instead of nrow() as suggested, I would recommend str() ("structure"), which gives you more useful information about your data set (class of variables, etc.). It's also a bit less cryptic... :)