What permissions do we need to create a file_format in Snowflake?

I am trying to create a file_format in order to create a stage in Snowflake with a custom role. I have granted privileges to create a stage and to use the storage integration, schema, and database, but it is still showing me the error "SQL access control error: Insufficient privileges to operate on schema 'PUBLIC'".
It's able to create the stage without the file_format parameter, but the file_format is required for creating the table.
Thanks
The code I have tried so far:
grant create stage on schema public to role my_role2;
grant usage on integration s3_int to role my_role2;
GRANT USAGE, MONITOR ON ALL SCHEMAS IN DATABASE test TO ROLE my_role2;
grant create table on schema TEST.PUBLIC to role my_role2;
create or replace file format my_csv_format
type = csv field_delimiter = ',' skip_header = 1
field_optionally_enclosed_by = '"'
null_if = ('NULL', 'null')
empty_field_as_null = true;
create or replace stage demo_stage url=''
STORAGE_INTEGRATION="s3_int"
file_format = my_csv_format;
Creating the file_format gives the error "SQL access control error: Insufficient privileges to operate on schema 'PUBLIC'".

As documented here, your role needs the CREATE FILE FORMAT privilege on the schema
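For example, a minimal sketch of the missing grant, reusing the role and schema names from the question:
grant create file format on schema TEST.PUBLIC to role my_role2;
With that in place, the CREATE FILE FORMAT statement above should succeed under my_role2, and the stage can then reference it with file_format = my_csv_format.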

Related

Snowpark SQL compilation error: unexpected '-' in Role name

When trying to connect to Snowpark using the session method below with role, database, schema, and warehouse names, there is a SQL compilation error with the role name since it contains dashes.
dbname = "MY_DB"
schemaname = "MY_SCHEMA"
warehouse = "MY_WH"
read_session.sql(r"USE ROLE MY-SNOWFLAKE-ROLE").collect()
read_session.sql(f"USE WAREHOUSE {warehouse}").collect()
read_session.sql(f"USE DATABASE {dbname}").collect()
read_session.sql(f"USE SCHEMA {dbname}.{schemaname}").collect()
The role name has to be wrapped in double quotes, which means the Python string containing the USE ROLE statement needs to use single quotes (so the double quotes can sit inside it).
dbname = "MY_DB"
schemaname = "MY_SCHEMA"
warehouse = "MY_WH"
read_session.sql(r'USE ROLE "MY-SNOWFLAKE-ROLE"').collect()
read_session.sql(f"USE WAREHOUSE {warehouse}").collect()
read_session.sql(f"USE DATABASE {dbname}").collect()
read_session.sql(f"USE SCHEMA {dbname}.{schemaname}").collect()

Azure SQL: Adding from Blob Not Recognizing Storage

I am trying to load data from a CSV file to a table in my Azure Database following the steps in https://learn.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql?view=sql-server-ver15#f-importing-data-from-a-file-in-azure-blob-storage, using the Managed Identity option. When I run the query, I receive this error:
Failed to execute query. Error: Referenced external data source "adfst" not found.
This is the name of the container I created within my storage account. I have also tried using my storage account, with the same error. Reviewing https://learn.microsoft.com/en-us/sql/relational-databases/import-export/examples-of-bulk-access-to-data-in-azure-blob-storage?view=sql-server-ver15 does not provide any further insight as to what may be causing the issue. My storage account does not have public (anonymous) access configured.
I'm assuming that I'm missing a simple item that would resolve this issue, but I can't figure out what it is. My SQL query is below, modified to not include content that should not be required.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '**************';
GO
CREATE DATABASE SCOPED CREDENTIAL msi_cred WITH IDENTITY = '***********************';
CREATE EXTERNAL DATA SOURCE adfst
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://**********.blob.core.windows.net/adfst'
, CREDENTIAL= msi_cred
);
BULK INSERT [dbo].[Adventures]
FROM 'Startracker_scenarios.csv'
WITH (DATA_SOURCE = 'adfst');
If you want to use a Managed Identity to access Azure Blob storage when you run the BULK INSERT command, you need to enable Managed Identity for the SQL server. Otherwise, you will get the error Referenced external data source "***" not found. Besides that, you also need to assign the Storage Blob Data Contributor role to the MSI. If you do not do that, you cannot access the CSV file stored in Azure Blob storage.
For example
Enable Managed Identity for the SQL server
Connect-AzAccount
#Enable MSI for SQL Server
Set-AzSqlServer -ResourceGroupName your-database-server-resourceGroup -ServerName your-SQL-servername -AssignIdentity
Assign role via Azure Portal
Under your storage account, navigate to Access Control (IAM) and select Add role assignment. Assign the Storage Blob Data Contributor RBAC role to the server that you registered with Azure Active Directory (AAD).
Test
a. Data
1,James,Smith,19750101
2,Meggie,Smith,19790122
3,Robert,Smith,20071101
4,Alex,Smith,20040202
b. script
CREATE TABLE CSVTest
(ID INT,
FirstName VARCHAR(40),
LastName VARCHAR(40),
BirthDate SMALLDATETIME)
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'YourStrongPassword1';
GO
--> Change to using Managed Identity instead of SAS key
CREATE DATABASE SCOPED CREDENTIAL msi_cred WITH IDENTITY = 'Managed Identity';
GO
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH ( TYPE = BLOB_STORAGE,
LOCATION = 'https://jimtestdiag417.blob.core.windows.net/test'
, CREDENTIAL= msi_cred
);
GO
BULK INSERT CSVTest
FROM 'mydata.csv'
WITH (
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
DATA_SOURCE = 'MyAzureBlobStorage');
GO
select * from CSVTest;
GO

In the tutorial "Tutorial: Bulk Loading from a local file system using copy" what is the difference between my_stage and my_table permissions?

I started to go through the first tutorial for how to load data into Snowflake from a local file.
This is what I have set up so far:
CREATE WAREHOUSE mywh;
CREATE DATABASE Mydb;
Use Database mydb;
CREATE ROLE ANALYST;
grant usage on database mydb to role sysadmin;
grant usage on database mydb to role analyst;
grant usage, create file format, create stage, create table on schema mydb.public to role analyst;
grant operate, usage on warehouse mywh to role analyst;
//tutorial 1 loading data
CREATE FILE FORMAT mycsvformat
TYPE = "CSV"
FIELD_DELIMITER= ','
SKIP_HEADER = 1;
CREATE FILE FORMAT myjsonformat
TYPE="JSON"
STRIP_OUTER_ARRAY = true;
//create stage
CREATE OR REPLACE STAGE my_stage
FILE_FORMAT = mycsvformat;
//Use snowsql for this and make sure that the role, db, and warehouse are selected: put file:///data/data.csv @my_stage;
// put file on stage
PUT file://contacts.csv @my
List @~;
list @%mytable;
Then in my active Snowsql when I run:
Put file:///Users/<user>/Documents/data/data.csv @my_table;
I have confirmed I am in the correct role Accountadmin:
002003 (02000): SQL compilation error:
Stage 'MYDB.PUBLIC.MY_TABLE' does not exist or not authorized.
So then I try to create the table in Snowsql and am successful:
create or replace table my_table(id varchar, link varchar, stuff string);
I still run into this error after I run:
Put file:///Users/<>/Documents/data/data.csv @my_table;
002003 (02000): SQL compilation error:
Stage 'MYDB.PUBLIC.MY_TABLE' does not exist or not authorized.
What is the difference between putting a file to a my_table and a my_stage in this scenario? Thanks for your help!
EDIT:
CREATE OR REPLACE TABLE myjsontable(json variant);
COPY INTO myjsontable
FROM @my_stage/random.json.gz
FILE_FORMAT = (TYPE= 'JSON')
ON_ERROR = 'skip_file';
CREATE OR REPLACE TABLE save_copy_errors AS SELECT * FROM TABLE(VALIDATE(myjsontable, JOB_ID=>'enterid'));
SELECT * FROM SAVE_COPY_ERRORS;
//error for random: Error parsing JSON: invalid character outside of a string: '\\'
//no error for generated
SELECT * FROM Myjsontable;
REMOVE @My_stage pattern = '.*.csv.gz';
REMOVE @My_stage pattern = '.*.json.gz';
//yay you are done!
The put command copies the file from your local drive to the stage. You should do the put to the stage, not the table.
put file:///Users/<>/Documents/data/data.csv @my_stage;
The copy command loads it from the stage.
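For example, a sketch of that second step, reusing the my_table and my_stage objects created above (the stage already carries mycsvformat, so the file_format clause is optional):
copy into my_table
  from @my_stage
  file_format = (format_name = 'mycsvformat')
  on_error = 'skip_file';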
But the documentation mentions that a stage gets created by default for every table:
Each table has a Snowflake stage allocated to it by default for storing files. This stage is a convenient option if your files need to be accessible to multiple users and only need to be copied into a single table.
Table stages have the following characteristics and limitations:
Table stages have the same name as the table; e.g. a table named mytable has a stage referenced as @%mytable
so in this case, without creating a stage, it should load into the default stage that Snowflake allocates
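Following the documentation quoted above, the default table stage is referenced with the @% prefix rather than @, so a PUT that targets it would look like this (a sketch, reusing the path from the question):
put file:///Users/<user>/Documents/data/data.csv @%my_table;
The error in the question appears because @my_table (without the %) is parsed as a reference to a named stage called my_table, which was never created.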

Informix - Default permissions when creating table

So, I'm using the Informix DB engine to create my database. I have noticed something peculiar that I cannot find information about on the official IBM page.
If you check the definition of my table, there is a line at the end saying :
revoke all on "gabriel.barrios".proveedores from "public" as "gabriel.barrios";
I did not write that; I simply defined the table attributes. But it seems as if the engine itself is adding it.
Is this the case?
And if it is, how can I change this default behaviour?
Additionally, could someone clarify this line of the output:
{ TABLE "gabriel.barrios".proveedores row size = 110 number of columns = 4 index size = 9 }
[gabriel.barrios@informix1 ~]$ dbschema -d practico_matias_barrios -t Proveedores
DBSCHEMA Schema Utility INFORMIX-SQL Version 11.70.UC8W1
{ TABLE "gabriel.barrios".proveedores row size = 110 number of columns = 4 index size = 9 }
create table "gabriel.barrios".proveedores
(
id serial not null ,
nombre varchar(50) not null constraint "gabriel.barrios".proveedor_nombre_vacio,
situacion integer not null constraint "gabriel.barrios".proveedor_situacion_vacio,
ciudad varchar(50) not null constraint "gabriel.barrios".proveedor_ciudad_vacio,
primary key (id) constraint "gabriel.barrios".proveedor_clave_primaria
);
revoke all on "gabriel.barrios".proveedores from "public" as "gabriel.barrios";
Informix default behavior is to grant privileges to the PUBLIC role.
As per the documentation (Table-level privileges) :
In an ANSI-compliant database, only the table owner has any privileges. In other databases, the database server, as part of creating a table, automatically grants to PUBLIC all table privileges except Alter and References, unless the NODEFDAC environment variable has been set to 'yes' to withhold all table privileges from PUBLIC.
When you allow the database server to automatically grant all table privileges to PUBLIC, a newly created table is accessible to any user with the Connect privilege. If this is not what you want (if users exist with the Connect privilege who should not be able to access this table), you must revoke all privileges on the table from PUBLIC after you create the table.
What you are seeing is dbschema always revoking privileges from PUBLIC in the create table output and then adding them back in the privileges output.
$ dbschema -d mydatabase -t default_privileges
DBSCHEMA Schema Utility INFORMIX-SQL Version 12.10.FC12
{ TABLE "myuser".default_privileges row size = 4 number of columns = 1 index size = 0 }
create table "myuser".default_privileges
(
id integer
);
revoke all on "myuser".default_privileges from "public" as "myuser";
Using the dbschema privileges output and filtering by the default_privileges table:
$ dbschema -d mydatabase -p all | grep default_privileges
grant select on "myuser".default_privileges to "public" as "myuser";
grant update on "myuser".default_privileges to "public" as "myuser";
grant insert on "myuser".default_privileges to "public" as "myuser";
grant delete on "myuser".default_privileges to "public" as "myuser";
grant index on "myuser".default_privileges to "public" as "myuser";

What is the difference between user postgres and a superuser?

I created a new superuser just so that this user can run the COPY command.
Note that a non-superuser cannot run a copy command.
I need this user for a backup application, and that application requires running the COPY command.
But none of the restrictions that I specified take effect (see below).
What is the difference between user postgres and a superuser?
And is there a better way to achieve what I want? I looked into a function with security definer as postgres ... that seems like a lot of work for multiple tables.
DROP ROLE IF EXISTS mynewuser;
CREATE ROLE mynewuser PASSWORD 'somepassword' SUPERUSER NOCREATEDB NOCREATEROLE NOINHERIT LOGIN;
-- ISSUE: the user can still CREATEDB, CREATEROLE
REVOKE UPDATE,DELETE,TRUNCATE ON ALL TABLES IN SCHEMA public, schema1, schema2, schema3 FROM mynewuser;
-- ISSUE: the user can still UPDATE, DELETE, TRUNCATE
REVOKE CREATE ON DATABASE ip2_sync_master FROM mynewuser;
-- ISSUE: the user can still create table;
You are describing a situation where a user can write files to the server where the database runs but is not a superuser. While not impossible, it's definitely abnormal. I would be very selective about who I allow to access my DB server.
That said, if this is the situation, I'd create a function to load the table (using copy), owned by the postgres user and grant the user rights to execute the function. You can pass the filename as a parameter.
If you want to get fancy, you can create a table of users and tables to define what users can upload to what tables and have the table name as a parameter also.
It's pretty outside of the norm, but it's an idea.
Here's a basic example:
CREATE OR REPLACE FUNCTION load_table(TABLENAME text, FILENAME text)
RETURNS character varying AS
$BODY$
DECLARE
  can_upload integer;
BEGIN
  -- check the permissions table: is the calling user allowed to load this table?
  select count(*)
    into can_upload
    from upload_permissions p
   where p.user_name = current_user and p.table_name = TABLENAME;
  if can_upload = 0 then
    return 'Permission denied';
  end if;
  -- server-side COPY; the file path is resolved on the database server
  execute 'copy ' || TABLENAME ||
          ' from ''' || FILENAME || '''' ||
          ' csv';
  return '';
END;
$BODY$
LANGUAGE plpgsql VOLATILE
SECURITY DEFINER -- so COPY runs with the privileges of the owning (postgres) user
COST 100;
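A sketch of the remaining wiring the answer describes, with hypothetical names (upload_permissions matches the lookup inside the function; mynewuser comes from the question):
-- permissions table consulted by load_table: one row per user/table combination allowed to upload
CREATE TABLE upload_permissions (
  user_name text,
  table_name text
);
-- create load_table as the postgres user, then allow the restricted user to call it
GRANT EXECUTE ON FUNCTION load_table(text, text) TO mynewuser;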
COPY with an option other than writing to STDOUT or reading from STDIN is only allowed for database superusers, since it allows reading or writing any file that the server has privileges to access.
\copy is a psql client command which provides the same functionality as COPY but is not server-side, so only local files can be processed - it invokes COPY but ... FROM STDIN / ... TO STDOUT, so that files on the server are not "touched".
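For example, from a psql client (a sketch with hypothetical table and file names):
\copy mytable from 'data.csv' with (format csv)
This reads data.csv on the client machine and streams it through the connection as COPY ... FROM STDIN, so the calling user only needs ordinary INSERT privileges on the table, not superuser rights.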
You cannot revoke specific rights from a superuser. I'm quoting the docs on this one:
Docs: Access DB
Being a superuser means that you are not subject to access controls.
Docs: CREATE ROLE
"superuser", who can override all access restrictions within the database. Superuser status is dangerous and should be used only when really needed.
