Problem: I am trying to install Postgres-XL with PostGIS enabled on a 3-node cluster. I managed to install Postgres-XL on 3 nodes with 1 GTM, 1 coordinator and 1 datanode. The problem is that the PostGIS extension installs successfully on the datanode, but when I try to install it on the coordinator it gives me the following error:
[postgres#test_pg_coord postgis-2.3.2]$ psql -d test11 -q -c 'CREATE EXTENSION postgis;' -p 30001
ERROR: type "gidx" does not exist
CONTEXT: SQL statement "CREATE OPERATOR && (
LEFTARG = gidx,
RIGHTARG = geography,
PROCEDURE = overlaps_geog,
COMMUTATOR = &&
)"
PL/pgSQL function inline_code_block line 8 at SQL statement
Do I need to install PostGIS on all datanodes and the coordinator, or just the datanodes?
Any help with the error and/or the above question would be appreciated.
We managed to get it working with PostGIS 2.3.1, so the issue is specific to PostGIS 2.3.2.
I think the newer PostGIS 2.3.2 release is not yet compatible with Postgres-XL.
I am trying to connect to Snowflake using the Python pandas connector.
I use the Anaconda distribution on Windows; I uninstalled the existing connector and pyarrow and reinstalled them using the instructions on this page: https://docs.snowflake.com/en/user-guide/python-connector-pandas.html
I have the following versions
pandas 1.0.4 py37h47e9c7a_0
pip 20.1.1 py37_1
pyarrow 0.17.1 pypi_0 pypi
python 3.7.7 h81c818b_4
snowflake-connector-python 2.2.7 pypi_0 pypi
When running step 2 of this document: https://docs.snowflake.com/en/user-guide/python-connector-install.html, I get: 4.21.2
On attempting to use fetch_pandas_all() I get an error: NotSupportedError: Unknown error
The code I am using is as follows:
import snowflake.connector
import pandas as pd

SNOWFLAKE_DATA_SOURCE = '<DB>.<Schema>.<VIEW>'
query = '''
select *
from table(%s)
LIMIT 10;
'''

def create_snowflake_connection():
    conn = snowflake.connector.connect(
        user='MYUSERNAME',
        account='MYACCOUNT',
        authenticator='externalbrowser',
        warehouse='<WH>',
        database='<DB>',
        role='<ROLE>',
        schema='<SCHEMA>'
    )
    return conn

con = create_snowflake_connection()
cur = con.cursor()
# Note the trailing comma: query parameters must be a sequence, so (SNOWFLAKE_DATA_SOURCE,)
temp = cur.execute(query, (SNOWFLAKE_DATA_SOURCE,)).fetch_pandas_all()
cur.close()
I am wondering what else I need to install/upgrade/check in order to get fetch_pandas_all() to work?
Edit: After posting an answer below, I have realised that the issue is with SSO (single sign-on) via authenticator='externalbrowser'. When using a stand-alone account I can fetch.
I found a workaround that avoids the SSO error by relying on fetchall() instead of fetch_pandas_all():
try:
    cur.execute(sql)
    all_rows = cur.fetchall()
    num_fields = len(cur.description)
    field_names = [i[0] for i in cur.description]
finally:
    cur.close()
    con.close()

df = pd.DataFrame(all_rows)
df.columns = field_names
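As a side note, the two DataFrame steps above can be collapsed into one call, since pd.DataFrame accepts the column names directly (same behavior):

df = pd.DataFrame(all_rows, columns=field_names)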
The reason is that snowflake-connector-python does not install pyarrow, which you need to work with pandas.
Either install and import pyarrow yourself, or run:
pip install "snowflake-connector-python[pandas]"
and try again.
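To sanity-check that the pandas extra took effect, you can verify that pyarrow imports and that the cursor class exposes the pandas fetch methods. A minimal sketch (not from the original answer):

import pyarrow
from snowflake.connector.cursor import SnowflakeCursor

print('pyarrow', pyarrow.__version__)
# fetch_pandas_all() is only usable when the Arrow result format is available
print('has fetch_pandas_all:', hasattr(SnowflakeCursor, 'fetch_pandas_all'))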
What happens when you run this code?
from snowflake import connector
import time
import logging

for logger_name in ['snowflake.connector', 'botocore', 'boto3']:
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    ch = logging.FileHandler('test.log')
    ch.setLevel(logging.DEBUG)
    ch.setFormatter(logging.Formatter('%(asctime)s - %(threadName)s %(filename)s:%(lineno)d - %(funcName)s() - %(levelname)s - %(message)s'))
    logger.addHandler(ch)

from snowflake.connector.cursor import CAN_USE_ARROW_RESULT
import pyarrow
import pandas as pd

print('CAN_USE_ARROW_RESULT', CAN_USE_ARROW_RESULT)
This will print whether CAN_USE_ARROW_RESULT is true; if it is not, pandas support won't work. When you did the pip install, which of these did you run?
pip install snowflake-connector-python
pip install snowflake-connector-python[pandas]
Also, what OS are you running on?
I have this working now, but am not sure which part helped; the following steps were taken:
Based on a comment by @Kirby, I tried pip3 install --upgrade snowflake-connector-python (this is based on a historic screenshot; I should have had [pandas] in brackets, i.e. pip3 install --upgrade snowflake-connector-python[pandas]), but regardless, I got the following error message:
Error: Microsoft Visual C++ 14.0 is required. Get it with "Build Tools for Visual Studio": https://visualstudio.microsoft.com/downloads
I therefore downloaded (exact filename: vs_buildtools__121011638.1587963829.exe) and installed VS Build Tools.
This is the tricky part: I subsequently got admin access to my machine (so I'm hoping it was the Visual Studio Build Tools that helped, and not the admin access).
I then followed the Snowflake Documentation Python Connector API instructions originally referred to:
a. Anaconda Prompt (opened as admin): pip install snowflake-connector-python[pandas]
b. Python:
import snowflake.connector
import pandas as pd

ctx = snowflake.connector.connect(
    user=user,
    account=account,
    password='password',
    warehouse=warehouse,
    database=database,
    role=role,
    schema=schema)

# Create a cursor object.
cur = ctx.cursor()

# Execute a statement that will generate a result set.
sql = "select * from t"
cur.execute(sql)

# Fetch the result set from the cursor and deliver it as the Pandas DataFrame.
df = cur.fetch_pandas_all()
Edit: I have since realised that I still get the error when executing df = cur.fetch_pandas_all() with my Okta (single sign-on) account, i.e. when I use my username with authenticator = 'externalbrowser'. When I use a different account (with a password), I no longer get the error.
NOTE: I am still able to connect with externalbrowser (and I can see the query executed successfully in Snowflake history); I am just not able to fetch.
Using
python -m pip install "snowflake-connector-python[pandas]"
as in the docs did not fetch the correct version of pyarrow for me (the docs say you need 3.0.x).
With my conda environment (using Python 3.8) I had to manually update pyarrow to the specific version:
python -m pip install pyarrow==6.0.0
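Before connecting, you can confirm which versions actually ended up in the environment. A small sketch (importlib.metadata requires Python 3.8+):

import importlib.metadata as md
import pyarrow

print('pyarrow:', pyarrow.__version__)
print('snowflake-connector-python:', md.version('snowflake-connector-python'))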
I wanted to do a real-time deployment of my model on Azure, so I planned to create an image which first queries an ID in an Azure SQL DB to get the required features, then predicts using my model and returns the predictions. The error I get from the pyodbc library is that the drivers are not installed.
I tried to establish the connection from an Azure ML Jupyter notebook and found that no drivers are installed in the environment itself. After some research I found that I should create a Docker image and deploy it there, but I still got the same results.
import pyodbc

driver = '{ODBC Driver 13 for SQL Server}'
cnxn = pyodbc.connect('DRIVER=' + driver + ';SERVER=' + server + ';PORT=1433;DATABASE=' + database +
                      ';UID=' + username + ';PWD=' + password + ';Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;')
('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC
Driver 13 for SQL Server' : file not found (0) (SQLDriverConnect)")
I want a result from the query; instead I get this message.
Alternatively, you could use pymssql==2.1.1 if you add the following Docker steps to the deployment configuration (using either Environments or ContainerImages; Environments is preferred):
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies

conda_dep = CondaDependencies()
conda_dep.add_pip_package('pymssql==2.1.1')

myenv = Environment(name="mssqlenv")
myenv.python.conda_dependencies = conda_dep
myenv.docker.enabled = True
myenv.docker.base_dockerfile = 'FROM mcr.microsoft.com/azureml/base:latest\nRUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc'
myenv.docker.base_image = None
Or, if you're using the ContainerImage class, you could add these Docker Steps
from azureml.core.image import Image, ContainerImage
image_config = ContainerImage.image_configuration(runtime= "python", execution_script="score.py", conda_file="myenv.yml", docker_file="Dockerfile.steps")
# Assuming this :
# RUN apt-get update && apt-get -y install freetds-dev freetds-bin vim gcc
# is in a file called Dockerfile.steps, it should produce the same result.
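Either way, once the image has freetds installed, the scoring script can connect through pymssql instead of pyodbc. A minimal sketch; the server, database, and credentials below are placeholders, not values from the question:

import pymssql

# Placeholder connection details; substitute your own.
conn = pymssql.connect(
    server='yourserver.database.windows.net',
    user='youruser@yourserver',
    password='yourpassword',
    database='yourdb')
cur = conn.cursor()
cur.execute('SELECT TOP 1 * FROM your_table')
print(cur.fetchone())
conn.close()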
See this answer for more details on how I've done it using an Estimator Step and a custom docker container. You could use this Dockerfile to locally create a Docker container for that Estimator step (no need to do that if you're just using an Estimator run outside of a pipeline) :
FROM continuumio/miniconda3:4.4.10
RUN apt-get update && apt-get -y install freetds-dev freetds-bin gcc
RUN pip install Cython
For more details, see this posting: Using an Estimator in a pipeline with custom Docker images. Hope that helps!
In my experience, the comment from @DavidBrowne-Microsoft is right.
There is a similar SO thread, I am getting an error while connecting to an sql DB in Jupyter Notebook, answered by me, which I think will help you install the latest msodbcsql driver for Linux on an Azure Notebook or in Docker.
Meanwhile, there is a detail about the connection string for Azure SQL Database that you need to note carefully: you should use {ODBC Driver 17 for SQL Server} instead of {ODBC Driver 13 for SQL Server} if your Azure SQL Database was created recently (and ignore the connection string shown in the Azure portal).
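You can also let pyodbc tell you which drivers the environment actually has before building the connection string. A sketch with placeholder server and credentials:

import pyodbc

print(pyodbc.drivers())  # should list 'ODBC Driver 17 for SQL Server'

conn = pyodbc.connect(
    'DRIVER={ODBC Driver 17 for SQL Server};'
    'SERVER=yourserver.database.windows.net;PORT=1433;'
    'DATABASE=yourdb;UID=youruser;PWD=yourpassword;'
    'Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;')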
You can use the Azure ML built-in dataset solution to connect to your SQL server.
To do so, first create an azure_sql_database datastore (reference here).
Then create a dataset by passing in the datastore you created and the query you want to run (reference here).
Sample code:
from azureml.core import Dataset, Datastore, Workspace

workspace = Workspace.from_config()
sql_datastore = Datastore.register_azure_sql_database(
    workspace=workspace,
    datastore_name='sql_dstore',
    server_name='your SQL server name',
    database_name='your SQL database name',
    tenant_id='your directory ID/tenant ID of the service principal',
    client_id='the Client ID/Application ID of the service principal',
    client_secret='the secret of the service principal')
sql_dataset = Dataset.Tabular.from_sql_query((sql_datastore, 'SELECT * FROM my_table'))
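Once the dataset is created, pulling it into pandas is a single call (a sketch continuing the snippet above):

# Materialize the query result as a pandas DataFrame.
df = sql_dataset.to_pandas_dataframe()
print(df.head())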
You can also do it via the UI at ml.azure.com, where you can register an Azure SQL datastore using your username and password.
I had to brew reinstall some things that my existing project uses.
Now I'm getting this error when I'm running a SELECT statement:
Interactive Elixir (1.7.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)>
18:07:23.636 [debug] QUERY ERROR source="shops" db=5.4ms
SELECT s0."id", s0."name", s0."place_id", s0."point", s0."inserted_at", s0."updated_at",ST_Distance_Sphere(s0."point", ST_SetSRID(ST_MakePoint($1,$2), $3)) FROM "shops" AS s0 WHERE (ST_DWithin(s0."point"::geography, ST_SetSRID(ST_MakePoint($4, $5), $6), $7)) ORDER BY s0."point" <-> ST_SetSRID(ST_MakePoint($8,$9), $10) [176.1666197, -37.6741546, 4326, 176.1666197, -37.6741546, 4326, 2000, 176.1666197, -37.6741546, 4326]
18:07:23.666 [error] #PID<0.356.0> running Api.Router terminated
Server: 192.168.20.9:4000 (http)
Request: GET /products?categories[]=1&categories[]=2&categories[]=3&categories[]=4&categories[]=5&categories[]=6&categories[]=7&categories[]=8&categories[]=9&categories[]=10&categories[]=11&categories[]=12&categories[]=13&categories[]=14&categories[]=15&categories[]=16&categories[]=17&categories[]=18&categories[]=19&categories[]=20&categories[]=21&categories[]=22&categories[]=23&categories[]=24&categories[]=25&keyword=%22%22&latitude=-37.6741546&longitude=176.1666197&distanceFromLocationValue=2&distanceFromLocationUnit=%22kilometers%22
** (exit) an exception was raised:
** (Postgrex.Error) ERROR 58P01 (undefined_file): could not access file "$libdir/postgis-2.4": No such file or directory
(ecto) lib/ecto/adapters/sql.ex:436: Ecto.Adapters.SQL.execute_and_cache/7
(ecto) lib/ecto/repo/queryable.ex:130: Ecto.Repo.Queryable.execute/5
(ecto) lib/ecto/repo/queryable.ex:35: Ecto.Repo.Queryable.all/4
(api) lib/api/controllers/product/get_product.ex:46: Api.Controllers.GetProduct.get_products/1
(api) lib/api/router.ex:1: Api.Router.plug_builder_call/2
(api) lib/plug/debugger.ex:123: Api.Router.call/2
(plug) lib/plug/adapters/cowboy/handler.ex:15: Plug.Adapters.Cowboy.Handler.upgrade/4
(cowboy) /Users/Ben/Development/Projects/vepo/api/deps/cowboy/src/cowboy_protocol.erl:442: :cowboy_protocol.execute/4
It is complaining about PostGIS. I ran brew install postgis to install it again, but I'm still getting the error. Where is the $libdir directory on my MacBook so that I can view the files? How do I fix this error?
The issue was that my versions of PostGIS and PostgreSQL were incompatible (I had used brew install xxx to get them both).
So I just uninstalled all postgresql and postgis versions from my computer: brew uninstall postgresql and brew uninstall postgis
brew list still listed one version of postgresql:
postgresql9.4
So I uninstalled that version:
brew uninstall postgresql9.4
Then I used the Postgres.app downloads to download and install PostgreSQL and PostGIS, because it bundles compatible versions of both.
I specifically went for the "Legacy Postgres.app with PostgreSQL 10" download because that is what I was using all through my development on this app.
Follow the instructions at https://postgresapp.com/ to install it after it is downloaded.
I tried to make a backup with Postgres 11.1 in pgAdmin4, but it failed.
pgAdmin displayed a window with:
Status: Failed (exit code: 1).
pg_dump: server version: 11.1; pg_dump version: 10.5
pg_dump: aborting because of server mismatch
I don't really understand it. Does pgAdmin4 not know that I am using 11.1 and not 10.5?
PROBLEM SOLVED - IN MY CASE.
Go to
pgAdmin > Preferences > Paths > Binary Path
The PostgreSQL Binary Path was set automatically to $DIR/../runtime
I changed the path to my installed PostgreSQL version: C:\Program Files\PostgreSQL\11\bin
Your pgAdmin is using the PostgreSQL client v10, but your server is v11.
Since v10 cannot know how to correctly dump a v11 database, it refuses to try.
Use a more recent version of pgAdmin!
Hello, and excuse my English.
I'm trying to pull a remote database with Heroku, but it gives me this error:
pg_dump: server version: 9.4.5; pg_dump version: 9.3.10
pg_dump: aborting because of server version mismatch
pg_restore: [archiver] input file is too short (read 0, expected 5)
I'm guessing I just need to upgrade my current version 9.3.10 to the server version 9.4.5, but it's not clear to me how to do it.
PS: I don't mind losing the data in my current databases.
The only solution I found was to completely delete PostgreSQL 9.3 and then install PostgreSQL 9.4.